Extract, Transform, Load

I’ve been doing all kinds of data migrations and system integration for years now. But only yesterday did I learn that there is a very specific term for the process.

In computing, extract, transform, and load (ETL) refers to a process in database usage and especially in data warehousing that:

  • Extracts data from outside sources
  • Transforms it to fit operational needs, which can include quality levels
  • Loads it into the end target (database, more specifically, operational data store, data mart, or data warehouse)

ETL systems commonly integrate data from multiple applications, typically developed and supported by different vendors or hosted on separate computer hardware. The disparate systems containing the original data are frequently managed and operated by different employees. For example a cost accounting system may combine data from payroll, sales and purchasing.
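To make the three steps a bit more concrete, here is a minimal sketch of my own in Python. The CSV snippets, the costs table and all function names are made up for illustration; a real ETL job would read from actual systems and apply far more elaborate quality rules.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for exports from two separate systems.
PAYROLL_CSV = "employee,amount\nalice,1000.50\nbob, 900 \n"
SALES_CSV = "employee,amount\nalice,250\ncarol,not-a-number\n"


def extract(text):
    """Extract: read raw rows from one CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, source):
    """Transform: normalise names and amounts, dropping rows that fail a basic quality check."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # assumed quality rule: skip rows whose amount is not numeric
        clean.append((source, row["employee"].strip().lower(), amount))
    return clean


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS costs (source TEXT, employee TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO costs VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Combine both sources into one in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    for source, text in (("payroll", PAYROLL_CSV), ("sales", SALES_CSV)):
        load(conn, transform(extract(text), source))
    return conn
```

Calling run_etl() returns a connection whose costs table holds the cleaned rows from both sources, with the malformed sales row left out.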

What the hell is a serial killer?

I’m watching the TV series “The Bridge”, and in the 5th episode of the 1st season there is this great scene where two Mexican thugs discuss what a serial killer is. The series is not a comedy by any means, and the whole scene is done with a straight face, but I think it’s one of the funniest things I’ve seen recently.

 

Water testing is not a term (for software testing)

I’ve been hearing the term “water testing” on one of the work projects that I am involved in. The term is used to describe the stage of the project when it’s available on the production servers with live data, but open only to a subset of the users. After searching around for a bit, I couldn’t find a reference to this term anywhere except in the water industry:

Water testing is a broad description for various procedures used to analyze water quality.

So that, of course, sent me down the path of finding the correct term. The closest analogy that I had heard of is “smoke testing”.

The plumbing industry started using the smoke test in 1875.

Later this usage seems to have been forgotten, leading some to believe the term originated in the electronics industry: “The phrase smoke test comes from [electronic] hardware testing. You plug in a new board and turn on the power. If you see smoke coming from the board, turn off the power. You don’t have to do any more testing.”

Specifically for software development and testing:

In computer programming and software testing, smoke testing is preliminary testing to reveal simple failures severe enough to reject a prospective software release. In this case, the smoke is metaphorical. A subset of test cases that cover the most important functionality of a component or system are selected and run, to ascertain if the most crucial functions of a program work correctly. For example, a smoke test may ask basic questions like “Does the program run?”, “Does it open a window?”, or “Does clicking the main button do anything?” The purpose is to determine whether the application is so badly broken that further testing is unnecessary. As the book “Lessons Learned in Software Testing”  puts it, “smoke tests broadly cover product features in a limited time … if key features don’t work or if key bugs haven’t yet been fixed, your team won’t waste further time installing or testing”.

Smoke testing performed on a particular build is also known as a build verification test.

A daily build and smoke test is among industry best practices.
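As a rough illustration of the idea, here is a small Python sketch of my own. The App class and its methods are hypothetical stand-ins for a real application; the runner simply executes a handful of basic checks and reports which ones failed, treating any exception as a failure.

```python
def run_smoke_tests(checks):
    """Run each (name, check) pair in order and return (passed, failed_names)."""
    failures = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a crash in a basic check counts as a failure
        if not ok:
            failures.append(name)
    return (not failures, failures)


class App:
    """Hypothetical application under test."""

    def __init__(self):
        self.started = False

    def start(self):
        self.started = True
        return True

    def open_window(self):
        # A window can only be opened once the program is running.
        return self.started

    def click_main_button(self):
        # Simulates a broken build: the main button has no handler.
        raise RuntimeError("button handler missing")


def smoke_test_build(app):
    """The 'basic questions' from the quote above, expressed as checks."""
    return run_smoke_tests([
        ("program runs", app.start),
        ("opens a window", app.open_window),
        ("main button responds", app.click_main_button),
    ])
```

If smoke_test_build() reports any failure, the build would be rejected before anyone spends time on deeper testing.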

This sounds very much like “sanity testing”:

A sanity test or sanity check is a basic test to quickly evaluate whether a claim or the result of a calculation can possibly be true. It is a simple check to see if the produced material is rational (that the material’s creator was thinking rationally, applying sanity). The point of a sanity test is to rule out certain classes of obviously false results, not to catch every possible error. A rule-of-thumb may be checked to perform the test. The advantage of a sanity test, over performing a complete or rigorous test, is speed.

[…]

In computer science, a sanity test is a very brief run-through of the functionality of a computer program, system, calculation, or other analysis, to assure that part of the system or methodology works roughly as expected. This is often prior to a more exhaustive round of testing.
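A tiny example of my own, in Python, of what such a check might look like for a calculation. The function names are made up; the rule of thumb is that the mean of a non-empty list can never fall outside the range of its values, so anything outside that range is obviously wrong.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def sanity_check_mean(values, result):
    """Quick plausibility check: a mean must lie between the min and the max."""
    return min(values) <= result <= max(values)
```

The check does not prove that the result is correct. It only rules out a class of results that cannot possibly be true, which is exactly why it is fast.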

After reviewing all sorts of testing types, I think the correct term for our scenario is actually “beta testing”:

Beta testing comes after alpha testing and can be considered a form of external user acceptance testing. Versions of the software, known as beta versions, are released to a limited audience outside of the programming team. The software is released to groups of people so that further testing can ensure the product has few faults or bugs. Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.
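For the scenario I described, where the system runs in production but is open only to a subset of users, the gate could be sketched in Python roughly like this. This is my own illustration, not how our project does it: the allow-list, the user names and the percentage parameter are all assumptions.

```python
import hashlib

# Hypothetical allow-list of users who were explicitly invited to the beta.
BETA_USERS = {"alice", "bob"}


def in_beta(user_id, allow_list=BETA_USERS, percent=0):
    """Return True if the user should see the beta version.

    Invited users always get in. Everyone else is placed in a stable
    bucket from 0 to 99 based on a hash of their id, and gets in only
    when that bucket is below the rollout percentage.
    """
    if user_id in allow_list:
        return True
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Hashing the user id keeps the decision stable, so the same person does not flip between the old and the new version from one request to the next.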

Metasyntactic variable

A “standard list of metasyntactic variables used in syntax examples” often used in the United States is: foo, bar, baz, qux, quux, corge, grault, garply, waldo, fred, plugh, xyzzy, thud. The word foo occurs in over 330 RFCs and bar occurs in over 290. […]

Due to English being the foundation-language, or lingua franca, of most computer programming languages these variables are also commonly seen even in programs and examples of programs written for other spoken-language audiences.

A duck

While reading through the insightful and funny “New programming jargon” post over at Coding Horror, I realized that I do a lot of “duck”, without even knowing it:

This started as a piece of Interplay corporate lore. It was well known that producers (a game industry position, roughly equivalent to PMs) had to make a change to everything that was done. The assumption was that subconsciously they felt that if they didn’t, they weren’t adding value.

The artist working on the queen animations for Battle Chess was aware of this tendency, and came up with an innovative solution. He did the animations for the queen the way that he felt would be best, with one addition: he gave the queen a pet duck. He animated this duck through all of the queen’s animations, had it flapping around the corners. He also took great care to make sure that it never overlapped the “actual” animation.

Eventually, it came time for the producer to review the animation set for the queen. The producer sat down and watched all of the animations. When they were done, he turned to the artist and said, “that looks great. Just one thing – get rid of the duck.”

One of the downsides of web design and development is that the results are so easy for an outsider to understand that people often think they are qualified to participate in the discussion, even when it’s rather technical. Suggestions, and even demands, are often made without knowing best practices or understanding the basic principles of web design, usability, color coordination, etc. Arguing against such suggestions (and especially demands) is usually counter-productive. The waste of time is horrendous. So “a duck” is usually a good solution. It comes in all shapes and forms: an ugly banner for marketing, a few typos for the language purists, an ultra-small font or a low-contrast color combination for the design-savvy, and so on. It’s different every time, and it heavily depends on whom it is built to defend against. I just didn’t know it was called “a duck”.

Meteor vs. Meteorite vs. Meteoroid

If you are not an astronaut or some other kind of space geek, chances are you have no idea what the difference is between a meteor, a meteorite and a meteoroid. If you are anything like me, you probably use meteor and meteorite interchangeably. Apparently, there is quite a specific difference. Here is an easy-to-understand description from Mental Floss:

Say you’re a bit of interplanetary dust or debris trucking through the vacuum of space, minding your own business. You’re not very big. Certainly not big enough to be called an asteroid. In fact, you might just be a speck of dust or even smaller. Congrats! You’re a meteoroid!

But say, for example, a bright blue planet suddenly gets in your way and sucks you in, and before you know it you’re streaking through an atmosphere so fast that you ablate (fancy way to say “vaporize”) and let off a bright streak of light. You are now officially a meteor.

Now, on the other hand, if you started out big enough, then enough of you will emerge from this furnace o’ friction to hit the ground in some farmer’s field, making you a meteorite.

What is a startup?

There is a lot of talk about startups on the web. But what exactly is a “startup”? Different people attach different meanings to the word, and sometimes it gets very confusing. Here is one example I came across recently:

 The reason I get asked this is that I left a perfectly good start up called Preemptive Solutions to come here. When I say “perfectly good” its one that I am a co-founder, is now 10 years old, and was President (which I later became VP as I decided I wanted to live away from the HQ).

Somehow, a 10-year-old company doesn’t fit my understanding of “startups”. After a quick definition check on Google, I like the one from Oakridge Venture Capital best of all:

 New business venture in its earliest stage of development.

This fits my understanding perfectly. And with this in mind, I think that most companies grow out of (or die in) the “startup” stage in their first year or so. If the company survives that period, it starts getting some routine into it (procedures, practices, paperwork, traditions, etc). The culture of the company takes shape. Most of the “what is good and what is not” issues are ironed out. And then it’s not a startup anymore.

What’s your understanding of startups? How can you tell whether a company is a startup or not?