JPEG Huffman Coding Tutorial

I came across this rather useful and practical tutorial on Huffman Coding in JPEG images.  It looks at a very small and basic black-and-white image, and how the size of the data and overhead changes between different image formats, and then in more detail, how the Huffman Coding helps make that happen.

Unless you are dealing with compression, image formats, and binary trees on a daily basis, this tutorial is a good memory refresher of those college days.

Boostnote – Open Source note-taking app for programmers

Boostnote is yet another alternative for taking notes.   This one is an Open Source and is built for developers.  Some of the features – Markdown support, search, cross-platform, works offline.

There is also Boostnote Team edition for, you know, teams.

BitBucket Pipelines improved support for Docker

Here are some exciting news from the BitBucket Pipelines blog: Bitbucket Pipelines now supports building Docker images, and service containers for database testing.

We developed Pipelines to enable teams to test and deploy software faster, using Docker containers to manage their build environment. Now we’re adding advanced Docker support – building Docker images, and Service containers for database testing.

Writing systemd Units

Vidar Hokstad explains what systemd units are and how to write them.  Very useful for that day when I will stop hating systemd and will try to embrace it.

Systemd has become the defacto new standard init for Linux-based systems. While not everyone has made the switch yet, pretty much all the major distros have made the decision to switch.

For most people this has not meant all that much yet, other than a lot of controversy. Systemd has built in SysV init system compatibility, and so it’s possible to avoid dealing with it quite well.

But there is much to be gained from picking up some basics. Systemd is very poweful.

I’m not going to deal with the basics of interacting with systemd as that’s well covered elsewhere. You can find a number of basic tips and tricks here.

Instead I want to talk about how to write systemd units.

Deprecated Linux networking commands and their replacements

Doug Vitale Tech Blog runs a post with a collection of the deprecated Linux networking commands and their replacements. Pretty handy if you want update some of your old bash scripts.

Deprecated command Replacement command(s)
arp ip n (ip neighbor)
ifconfig ip a (ip addr), ip link, ip -s (ip -stats)
iptunnel ip tunnel
iwconfig iw
nameif ip link, ifrename
netstat ss, ip route (for netstat-r), ip -s link (for netstat -i), ip maddr (for netstat-g)
route ip r (ip route)

History of Icons

History of Icons looks at the evolution of icons used for desktop, mobile, and web.  There are plenty of nostalgia triggering screenshots from a variety of systems.  Given that nobody could ever afford having all of those systems, I’m sure you’ll find interesting screens from computers you didn’t have or didn’t see.

Validating CSV schema

CSV, or comma-separated values, is a very common format for managing all kinds of configurations, as well data manipulation.  As the linked Wikipedia page mentions, there are a few RFCs that try to standardize the format.  However, I thought, there is still a lack of schema-type standard that would allow one to define a format for particular file.

Today I came across an effort that attempts to do just that – CSV Schema Language v1.1 – an unofficial draft of the language for defining and validating CSV data.  This is work in progress by the Digital Preservation team at The National Archives.

Apart from the unofficial draft of the language, there is also an Open Source CSV Validator v1.1 application, written in Scala.

Docker Image Vulnerability Research

Federacy has an interesting research in Docker image vulnerabilities.  The bottom line is:

24% of latest Docker images have significant vulnerabilities

This can and should be improved, especially given the whole hierarchical structure of Docker images.  It’s not like improving security of all those random GitHub repositories.

Why Configuration Management and Provisioning are Different

In “Why Configuration Management and Provisioning are Different” Carlos Nuñez advocates for the use of specialized infrastructure provisioning tools, like Terraform, Heat, and CloudFormation, instead of relying on the configuration management tools, like Ansible or Puppet.

I agree with his argument for the rollbacks, but not so much for the maintaining state and complexity.  However I’m not yet comfortable to word my disagreement – my head is all over the place with clouds, and I’m still weak on the terminology.

The article is nice regardless, and made me look at the provisioning tools once again.

Living Without Atomic Clocks

Living Without Atomic Clocks” is an interesting article that covers some design bits of distributed systems and CockroachDB (what a name!), especially those related to time precision.  This part in particular is the one I’m sure I’ll came back to at some point:

How does TrueTime provide linearizability?

OK, back to Spanner and TrueTime. It’s important to keep in mind that TrueTime does not guarantee perfectly synchronized clocks. Rather, TrueTime gives an upper bound for clock offsets between nodes in a cluster. Synchronization hardware helps minimize the upper bound. In Spanner’s case, Google mentions an upper bound of 7ms. That’s pretty tight; by contrast, using NTP for clock synchronization is likely to give somewhere between 100ms and 250ms.

So how does Spanner use TrueTime to provide linearizability given that there are still inaccuracies between clocks? It’s actually surprisingly simple. It waits. Before a node is allowed to report that a transaction has committed, it must wait 7ms. Because all clocks in the system are within 7ms of each other, waiting 7ms means that no subsequent transaction may commit at an earlier timestamp, even if the earlier transaction was committed on a node with a clock which was fast by the maximum 7ms. Pretty clever.