Going Pro with Feedly

I’ve been a heavy user of RSS for years now.  I’ve tried and used everything from custom-built applications and scripts, to browser add-ons, to third-party services.  Even this very blog’s archives are full of migration and review articles from one tool to another.  Here are a few links, if you are interested:

For the last 3 years, I’ve been using Feedly, which I like a lot.  I’ve been thinking about going Pro for about a year now.  Last week, I made the switch.  Here’s why:

  1. I do love the service and want to support it!  After all, I’m spending at least an hour every day going through my feeds.  Sometimes even more.
  2. The Pro version removes the limit on the number of feeds and items in each feed.  Not that I don’t have enough to read, but I don’t like the idea that I might be missing something.
  3. The Pro version provides integrations and easier sharing to a variety of third-party services.  The one that is most important for me is WordPress integration.
  4. Their blog post about the upcoming changes to feed organization was the final push.  I WANT THAT!

Feedly constantly improves the user experience and brings new features.  It is very stable: I can only remember one or two outages in the last three years.  Their web interface is very handy and the mobile app works well too.  They have plenty of browser add-ons to make things even easier.

All in all, it’s well worth $5 per month for me.

Data Gravity

On the drive back home today I was listening to the DevOps Cafe podcast, episode 59.  I’ve recently subscribed to this show and I think this was the first episode of it I ever heard.  It’s one of many tech talk podcasts, where two or more people chat for a varied period of time on a selection of topics, mostly related to technology.

In this particular episode, the hosts, John and Damon, were interviewing Dave McCrory, the CTO of Basho.  I wasn’t familiar with either Basho or Dave prior to the episode.  Luckily, a somewhat lengthy introduction by Dave gave me a good idea of who he is.  What followed, though, was way more interesting: a discussion about data.

To be completely honest with you, I haven’t even finished the episode yet (got home right in the middle of it), but I feel like it’s one of those worth blogging about.  For one, I’ve learned a new term – “data lake”.  Apparently, that’s a new and fancy way of branding “data warehousing”.  Here is a bit from TechTarget, for example:

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed.

While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question.

The term data lake is often associated with Hadoop-oriented object storage.
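To make the description above a bit more concrete, here is a toy sketch of the idea as I understand it: a flat store where every raw item gets a unique identifier and a set of metadata tags, and is only filtered when a question comes up.  The `DataLake` class, its methods, and the tag names are all made up for illustration; real data lakes sit on top of something like Hadoop or object storage, not a Python dictionary.

```python
import uuid


class DataLake:
    """Toy, in-memory illustration of a flat store of raw, tagged items."""

    def __init__(self):
        # Flat architecture: one level, keyed by a unique identifier.
        self._objects = {}

    def put(self, raw, **tags):
        """Store raw data as-is, with metadata tags, and return its identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (raw, tags)
        return object_id

    def query(self, **wanted):
        """Return the raw items whose tags match every requested key/value."""
        return [
            raw
            for raw, tags in self._objects.values()
            if all(tags.get(key) == value for key, value in wanted.items())
        ]


lake = DataLake()
lake.put(b"GET /index.html 200", source="web", kind="log")
lake.put(b'{"order": 17}', source="shop", kind="event")
lake.put(b"GET /about.html 404", source="web", kind="log")

# Only when a question arises do we pull out the relevant subset.
web_logs = lake.query(source="web", kind="log")
```

Nothing is transformed on the way in; the tags are the only structure, and the interpretation of the raw bytes is left to whoever asks the question.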

But that was just the beginning.  What followed was a fascinating discussion on Data Gravity.  Obviously, this whole thing is too fresh in my mind and I can’t formulate it well yet, so I suggest you listen to the episode and read the intro on the Data Gravity site.  For the sake of brevity:

[…] it’s also a misleading term. Behind it all is the notion that data which is near other data is more useful, and the tendency of data to cling together comes from the usefulness of the resulting knowledge. […]

A lot of it seems obvious, but here it’s all put into a nice thought framework, with references to other, more established fields, like math and physics.  Easily one of the most interesting technology-related discussions I’ve heard in a while!

Extract, Transform, Load

I’ve been doing all kinds of data migrations and system integrations for years now.  But only yesterday did I learn that there is a very specific term for the process.

In computing, extract, transform, and load (ETL) refers to a process in database usage and especially in data warehousing that:

  • Extracts data from outside sources
  • Transforms it to fit operational needs, which can include quality levels
  • Loads it into the end target (database, more specifically, operational data store, data mart, or data warehouse)

ETL systems commonly integrate data from multiple applications, typically developed and supported by different vendors or hosted on separate computer hardware. The disparate systems containing the original data are frequently managed and operated by different employees. For example a cost accounting system may combine data from payroll, sales and purchasing.
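For the programmers out there, the three steps can be sketched in a few lines of Python.  This is only a minimal illustration loosely based on the cost accounting example above: the CSV contents, the `costs` table, and all function names are my own assumptions, and an in-memory SQLite database stands in for the real target.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two separate source systems.
PAYROLL_CSV = "department,amount\nEngineering,1200.50\nSales,800.00\n"
PURCHASING_CSV = "department,amount\nengineering, 300.25 \nsales,n/a\n"


def extract(raw_csv, source):
    """Extract: read rows from one source and remember where they came from."""
    return [dict(row, source=source) for row in csv.DictReader(io.StringIO(raw_csv))]


def transform(rows):
    """Transform: normalise names and types, drop rows that fail a quality check."""
    records = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # unparseable amount, e.g. "n/a"
        records.append((row["source"], row["department"].strip().lower(), amount))
    return records


def load(conn, records):
    """Load: write the cleaned records into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS costs (source TEXT, department TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO costs VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")
    rows = extract(PAYROLL_CSV, "payroll") + extract(PURCHASING_CSV, "purchasing")
    load(conn, transform(rows))
    return conn


def cost_by_department(conn):
    """Combined costs per department across all loaded sources."""
    query = "SELECT department, SUM(amount) FROM costs GROUP BY department"
    return dict(conn.execute(query))


conn = run_etl()
totals = cost_by_department(conn)
```

The transform step is where most of the real work usually hides: here it only lowercases department names and skips the one row with an unusable amount, but that is the place where the sources get reconciled before anything reaches the target.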