Code is poetry. But not literature.

WordPress has been known, among other things, for coining the phrase “code is poetry”. It’s used in the footer of their website as well as quite a bit around the web. This goes along with Donald Knuth’s paradigm of Literate Programming.

Turns out, some people disagree. Most recently, Peter Seibel, the author of the “Coders at Work” book of interviews, wrote a blog post titled “Code is not literature”.

It was sometime after that presentation that I finally realized the obvious: code is not literature. We don’t read code, we decode it. We examine it. A piece of code is not literature; it is a specimen.

It’s an interesting read, filled with quotes and references to some of the smartest people in IT and Computer Science.

The Definitive Guide to Natural Language Processing

“The Definitive Guide to Natural Language Processing” is an easy-to-follow article on what a challenging task it is for machines to understand human language. There’s also this cool video of two bots talking to each other.

27 languages to improve your Python

Nick Coghlan writes:

One of the things we do as part of the Python core development process is to look at features we appreciate having available in other languages we have experience with, and see whether or not there is a way to adapt them to be useful in making Python code easier to both read and write. This means that learning another programming language that focuses more specifically on a given style of software development can help improve anyone’s understanding of that style of programming in the context of Python.

To aid in such efforts, I’ve provided a list below of some possible areas for exploration, and other languages which may provide additional insight into those areas.

The languages and areas are:

  • Procedural programming: C, Rust, Cython
  • Object-oriented data modelling: Java, C#, Eiffel
  • Object-oriented C derivatives: C++, D
  • Array-oriented data processing: MATLAB/Octave, Julia
  • Statistical data analysis: R
  • Computational pipeline modelling: Haskell, Scala, Clojure, F#
  • Event driven programming: JavaScript, Go, Erlang, Elixir
  • Gradual typing: TypeScript
  • Dynamic metaprogramming: Hy, Ruby
  • Pragmatic problem solving: Lua, PHP, Perl
  • Computational thinking: Scratch, Logo

Free Data Science Books

I came across a collection of free data science books:

Pulled from the web, here is a great collection of eBooks (most of which have a physical version that you can purchase on Amazon) written on the topics of Data Science, Business Analytics, Data Mining, Big Data, Machine Learning, Algorithms, Data Science Tools, and Programming Languages for Data Science.

Most notably, there are introductory books, handbooks, a Hadoop guide, SQL books, social media data mining stuff, and d3 tips and tricks. There’s also plenty on artificial intelligence and machine learning, but that’s too far out for me.

How does a relational database work

“How does a relational database work” is an excellent (lengthy, technical, but simply written and well explained) article on some of the most important bits inside a relational database. It’s somewhat of a middle ground between a theoretical database discussion in college and vendor-specific documentation of a database engine.

Though the title of this article is explicit, the aim of this article is NOT to understand how to use a database. Therefore, you should already know how to write a simple join query and basic CRUD queries; otherwise you might not understand this article. This is the only thing you need to know, I’ll explain everything else.

I’ll start with some computer science stuff like time complexity. I know that some of you hate this concept but, without it, you can’t understand the cleverness inside a database. Since it’s a huge topic, I’ll focus on what I think is essential: the way a database handles an SQL query. I’ll only present the basic concepts behind a database so that at the end of the article you’ll have a good idea of what’s happening under the hood.

Whether you are a young programmer or an experienced DBA, I think you’ll still find something in there that you either didn’t know or hadn’t thought about in this particular way. Even if you know all this stuff, it’s a good memory refresher.

Strongly recommended!

Software Engineering Radio : CAP Theorem

On the way to work today I enjoyed an excellent episode of Software Engineering Radio, which featured an interview with Eric Brewer, a VP of Infrastructure at Google, who is probably better known for his CAP theorem.

In theoretical computer science, the CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it succeeded or failed)
  • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

The discussion around “2 out of 3” was very thought-provoking and, at first, a little bit counter-intuitive. If you don’t want to listen to the show, read through this page, which covers the important bits.

The easiest way to understand CAP is to think of two nodes on opposite sides of a partition. Allowing at least one node to update state will cause the nodes to become inconsistent, thus forfeiting C. Likewise, if the choice is to preserve consistency, one side of the partition must act as if it is unavailable, thus forfeiting A. Only when nodes communicate is it possible to preserve both consistency and availability, thereby forfeiting P. The general belief is that for wide-area systems, designers cannot forfeit P and therefore have a difficult choice between C and A. In some sense, the NoSQL movement is about creating choices that focus on availability first and consistency second; databases that adhere to ACID properties (atomicity, consistency, isolation, and durability) do the opposite.

This puts some of the current trends into perspective.
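
To make the two-node scenario from the quote a bit more concrete, here is a toy Python sketch. It is only my own illustration, not anything from the episode or the article: the `Cluster` class, the node names "a" and "b", and the "AP"/"CP" modes are all made up for the example, and a real distributed store is far more involved.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to protect consistency."""


class Cluster:
    """Toy model (hypothetical): two replicas of one value and a partition switch."""

    def __init__(self, mode):
        if mode not in ("AP", "CP"):
            raise ValueError("mode must be 'AP' or 'CP'")
        self.mode = mode
        self.partitioned = False
        self.values = {"a": 0, "b": 0}  # one replica per node

    def _check_available(self, node):
        # In CP mode, side "b" acts as if it is down while the partition lasts.
        if self.partitioned and self.mode == "CP" and node == "b":
            raise Unavailable(f"node {node} refuses requests during a partition")

    def write(self, node, value):
        self._check_available(node)
        if self.partitioned:
            # The nodes cannot talk, so the write stays on this side only.
            self.values[node] = value
        else:
            # The nodes can talk, so the write is replicated to both.
            for name in self.values:
                self.values[name] = value

    def read(self, node):
        self._check_available(node)
        return self.values[node]


if __name__ == "__main__":
    # AP: both sides keep answering, and the replicas drift apart (C is lost).
    ap = Cluster("AP")
    ap.partitioned = True
    ap.write("a", 1)
    ap.write("b", 2)
    print("AP:", ap.read("a"), ap.read("b"))

    # CP: side "b" refuses to answer, so no stale value is ever served (A is lost).
    cp = Cluster("CP")
    cp.partitioned = True
    cp.write("a", 1)
    try:
        cp.read("b")
    except Unavailable as err:
        print("CP:", err)
```

With no partition, both modes behave the same: a write on either node shows up on the other, so consistency and availability hold together. The choice between the two only shows up once the nodes stop communicating.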