Reading postmortems

Once in a while a seemingly straightforward article turns into a goldmine of links and resources. This happened to me today with this one – “Reading postmortems“.

Not only this article itself is a very nice roundup of common sources for system failures, but it also links to a couple of awesome references:

  • Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems. This is both a talk and a paper.
  • danluu/post-mortems – a GitHub repository with a collection of publicly available postmortems from a variety of organizations, like Google, Amazon, Facebook, NASA, GitHub, and more.

If you still have no idea what postmortem is, Wikipedia explains.

Cloud Diagrams & Notes

Jerry Hargrove – Cloud Diagrams & Notes is an excellent resource for (mostly Amazon AWS) cloud diagrams and notes. I’m sure I’ve seen some of these around, but never thought to visit the original site. To some degree, these are similar to the Julia Evans’ drawings, but are more subject specific.

Serverless PHP on AWS Lambda

For all those of you who want to try out Amazon Lambda with PHP, here’s a quick and simple guide as to how to set it up: Serverless PHP on AWS Lambda.

This is some pretty exciting stuff!

SSH Examples, Tips & Tunnels

SSH Examples, Tips & Tunnels” is a nice collection of tips and examples for Secure Shell (ssh) users. It covers a variety of scenarios from simple remote connections, to file copying, to tunnels and jump hosts.

A Serverless Sequence Diagram

Paul Hammant has a quick and simple blog post illustrating the request-response sequence in the web application on a serverless infrastructure.  I find it quite useful as a reference, when explaining serverless to people who are considering it for the first time.

This stuff and the abandonment of names:ports in the config is a key advance for our industry. It is like a limiting Unix problem has been overcome. While it is still is impossible for two processes on one server to listen on the same port (say port 443), it is now not important as we have a mechanism for efficiently stitching together components (functions) of an application together via the simplest thing – a name. A name that is totally open for my naming creativity (ports were restricted if they were numbered below 1024 and also tied to specific purposes) . We’re also relieved of the problem of having to think of processes now, and whether they’ve crashed and won’t be receiving requests any more. Here’s a really great, but rambling, rant on a bunch of related topics by Smash Company that you should read too as it touches on some of the same things but goes much broader.