Searching DynamoDB: An indexer sidecar for Elasticsearch

One thing that I like about the modern world is that large technology companies are a lot more open than they were in the previous century. Many of them contribute to the Open Source ecosystem and frequently share their wisdom on how to use and not to use a particular technology.

Have a look at the recent post from Bitbucket blog: Searching DynamoDB: An indexer sidecar for Elasticsearch, for example.

It’s not your usual marketing nonsense about introducing a new needless service or self-praising review of a product. It’s a rather deep dive into a technical topic that has been getting a lot of attention for the last few years – NoSQL databases. Not only the blog post itself is interesting, but it provides plenty of useful links to other resources. Like this one, which covers database partitioning in depth. Or this one, which lists some of the best practices for designing and using partition keys effectively.

I wish more companies shared their technical insights like this.

Dgraph – fast, transactional, distributed graph database

Dgraph is a fast, transactional distributed graph database, written in Go. It’s Open Source too.

If you need a quick introduction to graph databases or if you are wondering whether you need to use one, here’s a good video to get you started.

For even more insight, read “Why Google Needed a Graph Serving System“. There are some interesting examples of problems, solutions, and data discovery. For example:

Cerebro would often reveal very interesting facts that one didn’t originally search for. When you’d run a query like [us presidents], Cerebro would understand that presidents are humans, and humans have height. Therefore, it would allow you to sort presidents by height and show that Abraham Lincoln is the tallest US president. It would also let people be filtered by nationality. In this case, it showed America and Britain in the list, because US had one British president, namely George Washington. (Disclaimer: Results based on the state of KG at the time; can’t vouch for the correctness of these results.)

UUIDs in MySQL are really not random

Jouke Waleson points out to an interesting fact about UUIDs in MySQL, which you might have missed in the documentation:

Warning: Although UUID() values are intended to be unique, they are not necessarily unguessable or unpredictable. If unpredictability is required, UUID values should be generated some other way.

Make a note!

MySQL High Availability at GitHub

Shlomi Noach, GitHub’s Senior Infrastructure Engineer, shares some details on both the current and future high availability setup of MySQL databases at GitHub.

This is probably way too far out for most people using MySQL for their web applications.  But it does highlight the technical complexity of running high load web applications, and how some of the issues can be solved or worked around.

Pretty fascinating stuff there …