Extract, Transform, Load

I’ve been doing all kinds of data migrations and system integrations for years now. But only yesterday did I learn that there is a very specific term for the process.

In computing, extract, transform, and load (ETL) refers to a process in database usage and especially in data warehousing that:

  • Extracts data from outside sources
  • Transforms it to fit operational needs, which can include quality levels
  • Loads it into the end target (database, more specifically, operational data store, data mart, or data warehouse)

ETL systems commonly integrate data from multiple applications, typically developed and supported by different vendors or hosted on separate computer hardware. The disparate systems containing the original data are frequently managed and operated by different employees. For example, a cost accounting system may combine data from payroll, sales, and purchasing.
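To make the three steps concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the inline CSV strings stand in for separate payroll and sales systems, the `costs` table and the function names are hypothetical, and an in-memory SQLite database plays the role of the end target. The "quality level" is simply a rule that drops rows whose amount is not a number.

```python
import csv
import io
import sqlite3

# Hypothetical source data; real sources would be separate systems.
PAYROLL_CSV = "employee,amount\nalice,1000.50\nbob, 900 \n"
SALES_CSV = "order_id,amount\n1,250.00\n2,not-a-number\n"


def extract(raw_csv):
    """Extract: read rows from an outside source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows, source):
    """Transform: normalise amounts and drop rows failing a quality check."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # skip rows whose amount is not numeric
        cleaned.append((source, round(amount, 2)))
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS costs (source TEXT, amount REAL)")
    conn.executemany("INSERT INTO costs VALUES (?, ?)", records)
    conn.commit()


def run_etl():
    """Run all three steps for each source and return the target connection."""
    conn = sqlite3.connect(":memory:")
    for source, raw in (("payroll", PAYROLL_CSV), ("sales", SALES_CSV)):
        load(conn, transform(extract(raw), source))
    return conn


if __name__ == "__main__":
    conn = run_etl()
    query = "SELECT source, SUM(amount) FROM costs GROUP BY source ORDER BY source"
    for row in conn.execute(query):
        print(row)
```

Keeping extract, transform, and load as separate functions mirrors the definition above: each source can get its own extract step while sharing the same transform rules and target.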

Download your Gmail and Google Calendar data … soon or now

I am a well-known Google fan. But even those who call it an Evil Corporation and a Global Spy can’t argue with how awesome this news is:

Starting today we’re rolling out the ability to export a copy of your Gmail and Google Calendar data, making it easy to back up your data or move to another service.

You can download all of your mail and calendars or choose a subset of labels and calendars. You can also download a single archive file for multiple products with a copy of your Gmail, Calendar, Google+, YouTube, Drive, and other Google data.

Most of the 20 GB of data I store on Google Drive is actually my email archive. I’ve imported email into my Gmail from as early as 1998, long before Gmail even existed. Having a way to export it all in one go, without resorting to clunky POP or IMAP, is much appreciated.

How To Survive a Ground-Up Rewrite Without Losing Your Sanity

Developers tend to spectacularly underestimate the effort involved in such a rewrite (more on that below), and spectacularly overestimate the value generated (more on that below, as well).