Slimming down Docker images

It’s been a while since I posted anything about Docker.  That’s mostly because I still don’t really use it for anything – playing around locally, testing and learning doesn’t count yet.

But just to keep the ball rolling, here are a couple of handy links with ideas on how to improve your Docker images, so that they take up much less space, benefit more from caching, and bring up containers faster:

Both articles revolve around the same theme – choose your base image carefully, try to minimize the layers, use only what you need, and don’t forget to clean up the disk space with “docker system prune”.

Getting the best performance out of Amazon EFS

Jeff Geerling shares his tips for “Getting the best performance out of Amazon EFS”.  Given how new Amazon EFS (still) is and how limited the documentation of best practices is, this stuff is golden.

tl;dr: EFS is NFS. Networked file systems have inherent tradeoffs over local filesystem access—EFS doesn’t change that. Don’t expect the moon, benchmark and monitor it, and you’ll do fine.
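
For the “benchmark” part, here is a rough sketch in Python of the kind of quick test I would start with. It times small synchronous file writes in a given directory, so the same function can be pointed at a local disk and at an EFS mount for comparison. The function name, the file count and size, and the /mnt/efs path in the comment are my own assumptions for illustration, not something taken from Jeff’s article.

```python
import os
import tempfile
import time


def time_small_writes(directory: str, count: int = 100, size: int = 4096) -> float:
    """Return the average seconds per small, fsync'ed file write in `directory`."""
    payload = b"x" * size
    # Work inside a scratch subdirectory so nothing is left behind afterwards.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(scratch, f"bench_{i}.dat"), "wb") as handle:
                handle.write(payload)
                handle.flush()
                # Force the data out of the page cache, so the network
                # round trip is part of the measurement on NFS/EFS.
                os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    return elapsed / count


if __name__ == "__main__":
    # Local baseline; run it again with a hypothetical mount such as "/mnt/efs".
    local = time_small_writes(tempfile.gettempdir(), count=20)
    print(f"local: {local * 1000:.3f} ms per write")
```

A single run like this is only a sanity check. Real numbers depend on file sizes, concurrency and the EFS throughput mode, so treat it as a starting point and not as a proper benchmark.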

SQL Keys in Depth

SQL Keys in Depth is an excellent read if you want to brush up on your knowledge of database keys and how they affect the performance of your application.  For the laziest among you, here are the summary points, based on extensive research of 60+ articles, StackOverflow questions and IRC discussions:

For each table:

  1. Identify and declare all natural keys.
  2. Create a <table_name>_id surrogate key of type uuid with default value uuid_generate_v1(). You can even mark it as a primary key if you like. Including the table name in this id makes joins simpler. It’s JOIN foo USING (bar_id) vs JOIN foo ON (foo.bar_id = bar.id). Do not expose this key to clients or anywhere outside the database.
  3. For “join tables” declare all foreign key columns as a single composite primary key.
  4. Add an artificial key if desired for use in a URL or anywhere else you want to share a reference to a row. Use a Feistel cipher or pg_hashids to conceal auto-incrementing integers.
  5. Mark foreign keys to surrogate UUIDs as ON UPDATE RESTRICT and to external artificial keys as ON UPDATE CASCADE. Use your own judgement for natural keys.
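
The Feistel cipher from point 4 is easier to appreciate with a small example. Below is a minimal sketch in Python, assuming 32-bit ids split into two 16-bit halves and a hash-based round function. The function names and the round keys are made up for illustration, and this is not the implementation from the article or from pg_hashids. The point is only that the mapping is reversible and never produces collisions, so sequential ids turn into scattered-looking numbers that can be mapped back.

```python
import hashlib

MASK16 = 0xFFFF
# Hypothetical round keys; in real use they would be kept secret.
DEFAULT_KEYS = (0x1F2E, 0x3D4C, 0x5B6A, 0x7988)


def _round(half: int, key: int) -> int:
    """Mix a 16-bit half with a round key. It does not need to be invertible."""
    digest = hashlib.sha256(f"{key}:{half}".encode()).digest()
    return int.from_bytes(digest[:2], "big")


def _split(n: int) -> tuple:
    """Validate a 32-bit value and split it into (left, right) 16-bit halves."""
    if not 0 <= n < 2**32:
        raise ValueError("value must fit in 32 bits")
    return (n >> 16) & MASK16, n & MASK16


def feistel_encode(n: int, keys=DEFAULT_KEYS) -> int:
    """Map a 32-bit integer to another 32-bit integer, one-to-one."""
    left, right = _split(n)
    for key in keys:
        left, right = right, left ^ _round(right, key)
    return (left << 16) | right


def feistel_decode(n: int, keys=DEFAULT_KEYS) -> int:
    """Undo feistel_encode by running the rounds backwards."""
    left, right = _split(n)
    for key in reversed(keys):
        left, right = right ^ _round(left, key), left
    return (left << 16) | right


if __name__ == "__main__":
    for row_id in (1, 2, 3):
        public_id = feistel_encode(row_id)
        print(row_id, public_id, feistel_decode(public_id))
```

Keep in mind that this only hides the sequence from casual observers. It is obfuscation, not serious cryptography.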

How to Read Big Files with PHP (Without Killing Your Server)

Here’s an interesting article that has been hanging around in my “to blog” tabs for a while now: How to Read Big Files with PHP (Without Killing Your Server).  I found the title to be slightly misleading, expecting the good old advice of reading and processing files line by line rather than all at once.  But it’s not that.  It’s much better.  It covers some techniques that aren’t that well known to the majority of PHP developers – generators, streams, and filters.

Strongly recommended read!
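
The generator idea is not specific to PHP. Here is a minimal sketch of it in Python (my own illustration, not code from the article): the file is handed out in fixed-size chunks, so memory usage stays roughly constant no matter how big the file is. The function names and the default chunk size are assumptions.

```python
from typing import Iterator


def read_chunks(path: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Lazily yield the file's contents in chunks of at most `chunk_size` bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                # An empty read means end of file.
                break
            yield chunk


def count_lines(path: str, chunk_size: int = 64 * 1024) -> int:
    """Count newline characters without loading the whole file into memory."""
    return sum(chunk.count(b"\n") for chunk in read_chunks(path, chunk_size))
```

Calling count_lines() on a multi-gigabyte log only ever holds one chunk in memory at a time, which is the same trade-off the article demonstrates with PHP generators.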

MySQL optimize, repair, and analyze

Years ago I had the following script running as a cron job, but then I lost it somewhere.  It took me a few minutes to find it again, but just in case I need it in the future, I’m saving it here.

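#!/bin/bash
# Check all tables in all databases for errors.
mysqlcheck --all-databases
# Optimize all tables.
mysqlcheck --all-databases --optimize
# Check again and repair any tables found to be corrupted.
mysqlcheck --all-databases --auto-repair
# Analyze all tables.
mysqlcheck --all-databases --analyze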

Found it here this time.