Software Engineering Radio : CAP Theorem

On the way to work today I enjoyed an excellent episode of Software Engineering Radio featuring an interview with Eric Brewer, a VP of Infrastructure at Google, who is probably better known for his CAP Theorem.

In theoretical computer science, the CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it succeeded or failed)
  • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

The discussion around “2 out of 3” was very thought-provoking and, at first, a little counter-intuitive.  If you don’t want to listen to the show, read through this page, which covers the important bits.

The easiest way to understand CAP is to think of two nodes on opposite sides of a partition. Allowing at least one node to update state will cause the nodes to become inconsistent, thus forfeiting C. Likewise, if the choice is to preserve consistency, one side of the partition must act as if it is unavailable, thus forfeiting A. Only when nodes communicate is it possible to preserve both consistency and availability, thereby forfeiting P. The general belief is that for wide-area systems, designers cannot forfeit P and therefore have a difficult choice between C and A. In some sense, the NoSQL movement is about creating choices that focus on availability first and consistency second; databases that adhere to ACID properties (atomicity, consistency, isolation, and durability) do the opposite.

This puts some of the current trends into perspective.
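
The two-node picture from the quote above can be sketched in a few lines of Python. This is only a toy model, not how any real database works: the Node and Cluster classes, the "AP"/"CP" mode names and the partitioned flag are all made up for illustration.

```python
class Node:
    """A replica holding a single value (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.value = 0


class Cluster:
    """Two replicas, a and b, that may be separated by a partition.

    mode "AP": keep answering during a partition and accept divergence.
    mode "CP": refuse writes during a partition to keep replicas equal.
    """

    def __init__(self, mode):
        if mode not in ("AP", "CP"):
            raise ValueError("mode must be 'AP' or 'CP'")
        self.mode = mode
        self.a = Node("a")
        self.b = Node("b")
        self.partitioned = False

    def write(self, node, value):
        """Try to write through `node`; return True if the write is accepted."""
        peer = self.b if node is self.a else self.a
        if not self.partitioned:
            # Nodes can communicate: update both, so C and A both hold.
            node.value = value
            peer.value = value
            return True
        if self.mode == "AP":
            # Stay available: accept the write locally, forfeiting C.
            node.value = value
            return True
        # Preserve consistency: act as if unavailable, forfeiting A.
        return False

    def consistent(self):
        return self.a.value == self.b.value


ap = Cluster("AP")
ap.partitioned = True
print(ap.write(ap.a, 1), ap.consistent())

cp = Cluster("CP")
cp.partitioned = True
print(cp.write(cp.a, 1), cp.consistent())
```

During the partition the "AP" cluster accepts the write and its replicas diverge, while the "CP" cluster rejects the write and stays consistent. Real systems sit somewhere in between (quorums, conflict resolution after the partition heals), which this sketch deliberately ignores.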

Claim to fame : phinx LONGBLOB

My largest claim to fame in Open Source software just got merged in – a pull request to the phinx project, adding support for MySQL’s LONGBLOB (as well as TINYBLOB and MEDIUMBLOB).  Phinx is a PHP tool for database migrations.  It’s used, among others, by the CakePHP 3 framework.

The patch itself was rather simple and I was surprised that nobody had done it earlier (there had been an open issue requesting this for more than a year).  Phinx already had support for BLOB, and for TINYTEXT, MEDIUMTEXT, TEXT, and LONGTEXT.  So practically all I had to do was a bit of copy-paste and find-replace.  Thankfully, there were some unit tests in place already, preventing me from breaking a thing or two.

What I found interesting though, wasn’t the patch itself, but the support of the CakePHP community (thank you guys!).   Every few days someone (even core CakePHP developers) would “thumbs up” the pull request to draw the attention of the maintainer to it.  Some people pulled the branch and tested it.  Some wrote comments.  That was awesome and quite inspiring!

SchemaSpy – Graphical Database Schema Metadata Browser

SchemaSpy – Graphical Database Schema Metadata Browser.  This is a tool written in Java that generates database schema documentation.  Have a look at some sample pages.  Those familiar with Graphviz will immediately realize that the tool uses dot for graphing tables and their relationships.  Those familiar with SugarCRM documentation will immediately notice that SchemaSpy is used for the API documentation.
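
For those who haven't seen dot before, here is a rough Python sketch of how tables and their relationships can be described in Graphviz's DOT language. It is not SchemaSpy's actual output or code; the schema_to_dot function and the sample table names are invented for illustration.

```python
def schema_to_dot(tables, foreign_keys):
    """Build a DOT digraph: one box per table, one edge per foreign key.

    tables       -- list of table names
    foreign_keys -- list of (child_table, parent_table) pairs
    """
    lines = ["digraph schema {", "  node [shape=box];"]
    for table in tables:
        lines.append(f'  "{table}";')
    for child, parent in foreign_keys:
        # The edge points from the referencing table to the referenced one.
        lines.append(f'  "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical two-table schema: orders references users.
print(schema_to_dot(["users", "orders"], [("orders", "users")]))
```

Saving the printed text to a file and running it through Graphviz (for example, dot -Tpng schema.dot -o schema.png) should render the two tables as boxes joined by an arrow.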