databases

Why Uber Engineering Switched from Postgres to MySQL

“Why Uber Engineering Switched from Postgres to MySQL” is an interesting study with plenty of technical detail of how MySQL was a better choice than PostgreSQL for the very demanding growth of Uber. These kinds of issues are probably way out of scope for any “regular Joe” application, but the insight into the differences of MySQL and PostgreSQL architectures is still useful.

Main PostgreSQL limitations covered by the study are:

Inefficient architecture for writes
Inefficient data replication
Issues with table corruption
Poor replica MVCC support
Difficulty upgrading to newer releases

forget-db – a simple GDPR inspired tool to anonymise confidential database data

forget-db:

A simple(ish) command line tool written in PHP 7.1 using Laravel Zero and Faker to help you anonymise/pseudonymise data within your database to support protecting either sensitive information, or peoples right to be forgotten with GDPR compliance.

The tool allows you to connect to either mysql, postgres, sqlite or sqlserver and replace defined information with random data to allow you to keep statistics/relationships/audit of actions etc.

It uses a simple yaml configuration file to define the conditions for overwriting, which fields you want to overwrite, and what to overwrite them with.

When and where to determine the ID of an entity

It always amazes me when I randomly come across an article or a blog post precisely on the subject that I’m mulling over in my head – all without searching specifically for the solution or even researching the problem domain. It’s almost like the universe knows what I’m thinking and sends help my way.

“When and where to determine the ID of an entity” is an example of exactly that. Lately, I’ve been working with events in CakePHP a lot. And for one particular scenario, I was considering the beforeSave() event in the model layer, which would trigger some functionality that modifies data in other models. So, having a reference of the current ID would be useful for debugging and logging purposes. But since the current entity hasn’t been saved it, the ID is not there. And that’s where I started thinking about this whole thing and considering where is the right place to generate the ID.

One thing that kind of bothered me on top of the theoretical discussion, was the practical implementation, especially in different frameworks. If I remember correctly, the earlier version of CakePHP framwork, used the presence or absense of the ID in the entity to differentiate between insert and update operations. It might still be true now, but at least there is a way to work around it, as CakePHP now has isNew() method to check if the entity needs to be inserted or updated.

ipstack – IP geolocation and database service

ipstack looks like an excellent IP geolocation service with a beautiful API. If you haven’t used anything except for the MaxMind GeoIP, give it a try. Their pricing is quite good, with 10,000 lookups per month going for free.

Database Flow – modern, self-hosted web interface for SQL and GraphQL

Database Flow is a modern, Open Source, self-hosted, web-based tool for working with SQL databases and GraphQL APIs. It supports a variety of the database engines: IBM DB2, Oracle, H2, PostgreSQL, MySQL, SQLite, Informix, and Microsoft SQL Server. It features an advanced SQL editor, query plan analyzer, GraphQL client, schema explorer, charting, query history, and more.

The only visible downside so far is that it’s written in Java.