ORM: Active Record vs. Data Mapper

Anybody building a web application with a modern framework is probably already using an ORM (Object-Relational Mapping).  Most frameworks include one out of the box.  But dig deeper into the subject and it turns out that ORMs do vary from each other, in some cases very significantly.

Most of the variation comes from two main approaches: Active Record and Data Mapper.  I’ve heard the terms for a long time, but today I decided to look into their meaning and the actual difference.

The two approaches seem very similar.  The difference is described in a multitude of articles online.  I particularly liked this one.  In essence, Active Record is a better choice for simpler, CRUD-based applications.  Data Mapper, on the other hand, is better for domain-specific applications, as it provides another level of abstraction between the domain objects and the persistence layer.
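
A minimal sketch of the difference, in Python with the standard sqlite3 module.  The class and table names here (ActiveRecordUser, User, UserMapper, users) are made up for illustration and are not taken from any particular ORM; real implementations do a lot more.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# One in-memory database shared by both sketches (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


class ActiveRecordUser:
    """Active Record: the object holds its data AND knows how to persist itself."""

    def __init__(self, name: str, id: Optional[int] = None):
        self.id = id
        self.name = name

    def save(self) -> None:
        # Insert a new row, or update the existing one if we already have an id.
        if self.id is None:
            cur = conn.execute("INSERT INTO users (name) VALUES (?)", (self.name,))
            self.id = cur.lastrowid
        else:
            conn.execute("UPDATE users SET name = ? WHERE id = ?", (self.name, self.id))

    @classmethod
    def find(cls, user_id: int) -> Optional["ActiveRecordUser"]:
        row = conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return cls(name=row[1], id=row[0]) if row else None


@dataclass
class User:
    """Data Mapper: a plain domain object with no knowledge of the database."""
    name: str
    id: Optional[int] = None


class UserMapper:
    """The extra layer: moves User objects between memory and the users table."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def insert(self, user: User) -> None:
        cur = self.connection.execute(
            "INSERT INTO users (name) VALUES (?)", (user.name,)
        )
        user.id = cur.lastrowid

    def find(self, user_id: int) -> Optional[User]:
        row = self.connection.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return User(name=row[1], id=row[0]) if row else None


# Active Record: persistence is a method on the object itself.
alice = ActiveRecordUser("Alice")
alice.save()

# Data Mapper: the domain object stays plain, the mapper does the SQL.
bob = User("Bob")
mapper = UserMapper(conn)
mapper.insert(bob)
```

In the first half the class is tied to the table, which keeps simple CRUD code short.  In the second half User can be built and tested without a database at all, at the cost of one more class to maintain.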

Most of my work these days is done with the CakePHP framework, which until now I thought used the Active Record pattern.  But it turns out that the CakePHP ORM is so powerful because it’s more than just one of those:

The CakePHP ORM borrows ideas and concepts from both ActiveRecord and Datamapper patterns. It aims to create a hybrid implementation that combines aspects of both patterns to create a fast, simple to use ORM.

It looks like I need to do some learning and dig deeper into the subject.  Pointers are welcome.

MySQL 8.0 release

MySQL 8.0 has been released and it brings the following new features, enhancements, and more:

  1. SQL: Window functions, Common Table Expressions, NOWAIT and SKIP LOCKED, Descending Indexes, Grouping, Regular Expressions, Character Sets, Cost Model, and Histograms.
  2. JSON: Extended syntax, new functions, improved sorting, and partial updates. With JSON table functions you can use the SQL machinery for JSON data.
  3. GIS: Geography support. Spatial Reference Systems (SRS), as well as SRS aware spatial datatypes, spatial indexes, and spatial functions.
  4. Reliability: DDL statements have become atomic and crash safe, meta-data is stored in a single, transactional data dictionary. Powered by InnoDB!
  5. Observability: Significant enhancements to Performance Schema, Information Schema, Configuration Variables, and Error Logging.
  6. Manageability: Remote management, Undo tablespace management, and new instant DDL.
  7. Security: OpenSSL improvements, new default authentication, SQL Roles, breaking up the super privilege, password strength, and more.
  8. Performance: InnoDB is significantly better at Read/Write workloads, IO bound workloads, and high contention “hot spot” workloads. Added Resource Group feature to give users an option to optimize for specific workloads on specific hardware by mapping user threads to CPUs.
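
For a quick taste of the first item, here is a small sketch that combines a Common Table Expression with a window function.  It is an assumption-laden stand-in: it runs against SQLite through Python's sqlite3 module (SQLite 3.25 or newer is needed for window functions) so that it works without a MySQL server, and the payments table and its data are invented for the example.  The query itself uses only standard syntax that MySQL 8.0 also accepts.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL 8.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 20), ("bob", 5)],
)

query = """
WITH totals AS (                       -- Common Table Expression
    SELECT customer, SUM(amount) AS total
    FROM payments
    GROUP BY customer
)
SELECT customer,
       total,
       RANK() OVER (ORDER BY total DESC) AS total_rank   -- window function
FROM totals
ORDER BY total_rank
"""

rows = conn.execute(query).fetchall()
for customer, total, total_rank in rows:
    print(total_rank, customer, total)
```

The CTE names the intermediate per-customer totals, and the window function ranks those totals without collapsing the rows, which before 8.0 would have needed a subquery and user variables in MySQL.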

Distributed architecture concepts I learned while building a large payments system

Gergely Orosz, an engineer who worked at Uber on the company’s large-scale payments system, shares some of the distributed architecture concepts he had to learn in the blog post titled “Distributed architecture concepts I learned while building a large payments system”.

The article is very well written and easy to follow. But it’s also a goldmine of links to other resources on the subject.  Here’s a list of links and concepts for quick research and/or click-through later:

Registry of Open Data on AWS

AWS News Blog covers the Registry of Open Data on AWS:

Almost a decade ago, my colleague Deepak Singh introduced the AWS Public Datasets in his post Paging Researchers, Analysts, and Developers. I’m happy to report that Deepak is still an important part of the AWS team and that the Public Datasets program is still going strong!

Today we are announcing a new take on open and public data, the Registry of Open Data on AWS, or RODA. This registry includes existing Public Datasets and allows anyone to add their own datasets so that they can be accessed and analyzed on AWS.

Currently, there are 53 data sets in the registry.  Each provides a tonne of data.  Subjects vary from satellite imagery and weather monitoring to political and financial information.

Hopefully, this will grow and expand with time.

SQLBolt – Learn SQL with simple, interactive exercises

SQLBolt is by far the best SQL tutorial that I’ve ever seen!  Yes, I know, it’s a very bold statement.  But I promise that it’s true.

With hundreds of books, videos, and other tutorials around, the problem of delivering the understanding of data management, databases, and SQL to regular people still hasn’t been solved.  But SQLBolt provides a giant leap forward in this area.

The tutorial starts from the very basics and gets progressively more and more advanced.  But this progression is divided into small, very focused chapters.  Each chapter provides a brief description of the concept, an example query for the concept, and a set of exercises.  The exercises are all interactive, so that you don’t have to install a database or get access to a real one, and you don’t have to trust yourself on correctly solving the tasks.  The interactive exercises system marks the problem as solved the moment you type in the correct query.

If you get stuck at any point with any particular exercise, just click on the Solution link nearby, and the tutorial will show you the correct answer.  I found this to be a perfect balance: it pushes the reader to try things out, but without the annoying delays for those of us who like to skip ahead.

There is really no reason now for anybody at all not to learn SQL.  SQLBolt is brilliant!