Fixing outdated Let’s Encrypt (zope.interface error)

I started using Let’s Encrypt for SSL certificates a while back.  I installed it on all the web servers, regardless of whether they needed SSL, just to have it there when I need it (thanks to this Ansible role).  One of those old web servers needed an SSL certificate recently, so I thought it’d be no problem to generate one.

But I was wrong. The letsencrypt-auto tool had become outdated and was failing to execute, throwing a Python exception about a missing zope.interface module.  A quick Google search brought up this StackOverflow discussion, with the exact issue I was having.

Traceback (most recent call last):
  File "/root/.local/share/letsencrypt/bin/letsencrypt", line 7, in <module>
    from certbot.main import main
  File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/certbot/main.py", line 12, in <module>
    import zope.component
  File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/zope/component/__init__.py", line 16, in <module>
    from zope.interface import Interface
ImportError: No module named interface

However, the solution suggested there didn’t fix the problem for me:

unset PYTHON_INSTALL_LAYOUT
/opt/letsencrypt/letsencrypt-auto -v

Even pulling the updated version from the GitHub repository didn’t solve it.

After poking around a while longer, I found this bug report from last year, which solved my problem.

I recommend:

  1. Running rm -rf /root/.local/share/letsencrypt. This removes your installation of letsencrypt, but keeps all configuration files, certificates, logs, etc.
  2. Make sure you have an up to date copy of letsencrypt-auto. It can be found here.
  3. Run letsencrypt-auto again.

If you get the same behavior, you can try installing zope.interface manually by running:

/root/.local/share/letsencrypt/bin/pip install zope.interface
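The recovery steps above could be scripted roughly as follows. This is only a sketch: the function names are made up for the example, and both paths are passed in explicitly rather than hard-coded, so that nothing gets deleted by accident.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, paths are parameters.

# Usage: rebuild_letsencrypt_env VENV_DIR LETSENCRYPT_AUTO [ARGS...]
rebuild_letsencrypt_env() {
    if [ "$#" -lt 2 ]; then
        echo "usage: rebuild_letsencrypt_env VENV_DIR LETSENCRYPT_AUTO [ARGS...]" >&2
        return 2
    fi
    local venv_dir="$1"
    local le_auto="$2"
    shift 2

    # Refuse to continue without a target directory or a runnable tool.
    if [ -z "$venv_dir" ] || [ ! -x "$le_auto" ]; then
        echo "missing directory or not executable: $le_auto" >&2
        return 1
    fi

    # Step 1: drop the bootstrapped Python environment only.
    rm -rf -- "$venv_dir"

    # Step 3: run letsencrypt-auto again; it rebuilds the environment.
    "$le_auto" "$@"
}

# Fallback from the bug report: install zope.interface with the pip
# that lives inside the rebuilt environment.
install_zope_interface() {
    local venv_dir="$1"
    "$venv_dir/bin/pip" install zope.interface
}
```

With the paths from this post, the call would look like `rebuild_letsencrypt_env /root/.local/share/letsencrypt /opt/letsencrypt/letsencrypt-auto -v`, assuming your copy of letsencrypt-auto lives in the same place.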

Hopefully, next time I’ll remember to search my blog’s archives …

Mcrouter: a memcached protocol router

Mcrouter is an open source tool developed by Facebook for scaling memcached deployments:

Mcrouter is a memcached protocol router for scaling memcached (http://memcached.org/) deployments. It’s a core component of cache infrastructure at Facebook and Instagram where mcrouter handles almost 5 billion requests per second at peak.

Here is a good overview of some of the scenarios where Mcrouter is useful.  There’s more than one.  Here are some of the features to get you started:

  • Memcached ASCII protocol
  • Connection pooling
  • Multiple hashing schemes
  • Prefix routing
  • Replicated pools
  • Production traffic shadowing
  • Online reconfiguration
  • Flexible routing
  • Destination health monitoring/automatic failover
  • Cold cache warm up
  • Broadcast operations
  • Reliable delete stream
  • Multi-cluster support
  • Rich stats and debug commands
  • Quality of service
  • Large values
  • Multi-level caches
  • IPv6 support
  • SSL support

Amazon AWS : MTU for EC2

I came across this handy Amazon AWS manual on maximum transmission unit (MTU) configuration for EC2 instances.  This is not something one needs every day, but I’m sure that when I do need it, I’d otherwise spend hours trying to find it.

The maximum transmission unit (MTU) of a network connection is the size, in bytes, of the largest permissible packet that can be passed over the connection. The larger the MTU of a connection, the more data that can be passed in a single packet. Ethernet packets consist of the frame, or the actual data you are sending, and the network overhead information that surrounds it.

Ethernet frames can come in different formats, and the most common format is the standard Ethernet v2 frame format. It supports 1500 MTU, which is the largest Ethernet packet size supported over most of the Internet. The maximum supported MTU for an instance depends on its instance type. All Amazon EC2 instance types support 1500 MTU, and many current instance sizes support 9001 MTU, or jumbo frames.

The document goes into detail on how to set, check and troubleshoot the MTU on EC2 instances, which instance types support jumbo frames, when you should and shouldn’t change the MTU, etc.
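To get a feel for what jumbo frames buy you, here is a small back-of-the-envelope sketch. It assumes plain IPv4 and TCP headers of 20 bytes each with no options, which is a simplification, and the function names are invented for this example. The sysfs path used by `iface_mtu` is Linux-specific.

```shell
#!/usr/bin/env bash
# Read the current MTU of an interface from sysfs (Linux only).
# Not called below, since interface names differ between machines.
iface_mtu() {
    local iface="$1"
    cat "/sys/class/net/${iface}/mtu"
}

# TCP payload bytes per packet for a given MTU, assuming 20 bytes of
# IPv4 header and 20 bytes of TCP header (no options).
tcp_payload_for_mtu() {
    local mtu="$1"
    echo $(( mtu - 20 - 20 ))
}

# Packets needed to move BYTES of data at a given MTU, rounded up.
packets_needed() {
    local bytes="$1"
    local mtu="$2"
    local payload
    payload=$(tcp_payload_for_mtu "$mtu")
    echo $(( (bytes + payload - 1) / payload ))
}

packets_needed 1000000 1500   # prints 685
packets_needed 1000000 9001   # prints 112
```

So under these assumptions, moving a megabyte takes roughly six times fewer packets with a 9001 MTU than with a 1500 MTU.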

The following instances support jumbo frames:

  • Compute optimized: C3, C4, CC2
  • General purpose: M3, M4, T2
  • Accelerated computing: CG1, G2, P2
  • Memory optimized: CR1, R3, R4, X1
  • Storage optimized: D2, HI1, HS1, I2

As always, Julia Evans has got you covered on the basics of networking and the MTU.

Things Every Hacker Once Knew

Eric Raymond goes over a few things every hacker once knew.

One fine day in January 2017 I was reminded of something I had half-noticed a few times over the previous decade. That is, younger hackers don’t know the bit structure of ASCII and the meaning of the odder control characters in it.

This is knowledge every fledgling hacker used to absorb through their pores. It’s nobody’s fault this changed; the obsolescence of hardware terminals and the near-obsolescence of the RS-232 protocol is what did it. Tools generate culture; sometimes, when a tool becomes obsolete, a bit of cultural commonality quietly evaporates. It can be difficult to notice that this has happened.

This document is a collection of facts about ASCII and related technologies, notably hardware terminals and RS-232 and modems. This is lore that was at one time near-universal and is no longer. It’s not likely to be directly useful today – until you trip over some piece of still-functioning technology where it’s relevant (like a GPS puck), or it makes sense of some old-fart war story. Even so, it’s good to know anyway, for cultural-literacy reasons.

The article goes over:

  • Hardware context
  • The strange afterlife of the outboard modem
  • 36-bit machines and the persistence of octal
  • RS232 and its discontents
  • UUCP, the forgotten pre-Internet
  • Terminal confusion
  • ASCII
  • Key dates
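As a taste of the “bit structure of ASCII” mentioned above, here is a small sketch showing three of the regularities: Ctrl plus a key keeps only the low five bits of the key’s code, upper and lower case letters differ in a single bit, and the low four bits of a digit are its value. The function names are made up for this example.

```shell
#!/usr/bin/env bash
# Decimal ASCII code of a single character. A leading quote in a printf
# numeric argument means "use the code of the next character".
ord() {
    printf '%d\n' "'$1"
}

# Ctrl-<key> keeps only the low five bits of the key's code.
ctrl_code() {
    local code
    code=$(ord "$1")
    echo $(( code & 0x1F ))
}

# For A-Z, setting bit 0x20 gives the code of the lowercase letter.
to_lower_code() {
    local code
    code=$(ord "$1")
    echo $(( code | 0x20 ))
}

# The low four bits of an ASCII digit are its numeric value.
digit_value() {
    local code
    code=$(ord "$1")
    echo $(( code & 0x0F ))
}

ctrl_code G      # prints 7  (BEL)
ctrl_code H      # prints 8  (BS, backspace)
ctrl_code M      # prints 13 (CR)
ctrl_code '['    # prints 27 (ESC)
to_lower_code A  # prints 97 (the code of 'a')
digit_value 7    # prints 7
```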

Found via a couple of other interesting bits –
What we still use ASCII CR for today (on Unix) and
How Unix erases things when you type a backspace while entering text.

Defensive BASH Programming

If you write any Bash code that lasts more than a day, you should definitely read “Defensive BASH Programming” and follow the advice, if you haven’t already.  It covers the following:

  • Immutable global variables
  • Everything is local
  • main()
  • Everything is a function
  • Debugging functions
  • Code clarity
  • Each line does just one thing
  • Printing usage
  • Command line arguments
  • Unit Testing

All of that comes with code examples and an explanation of why each point matters.
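To give a flavour of that style, here is a minimal sketch that applies a few of the points: read-only globals, local variables, small named functions, a usage function and a main(). It is not taken from the article; the script’s purpose (counting files in a directory) and all the names in it are invented for illustration.

```shell
#!/usr/bin/env bash
# Immutable globals: assigned once at the top, then read-only.
readonly PROGNAME="$(basename -- "$0")"
readonly DEFAULT_DIR="."          # hypothetical default for this example

# Printing usage lives in its own function.
usage() {
    echo "usage: $PROGNAME [DIR]"
}

# Small, named predicates make the call sites read like prose.
is_dir() {
    local path="$1"
    [ -d "$path" ]
}

# Everything is local: no variable leaks out of a function.
count_files() {
    local dir="$1"
    find "$dir" -type f | wc -l | tr -d ' '
}

main() {
    local dir="${1:-$DEFAULT_DIR}"
    local count

    if [ "$dir" = "-h" ]; then
        usage
        return 0
    fi

    if ! is_dir "$dir"; then
        echo "$PROGNAME: not a directory: $dir" >&2
        return 1
    fi

    count=$(count_files "$dir")
    echo "$count file(s) in $dir"
}

main "$@"
```

Run without arguments, it counts the files under the current directory; given a directory, it counts the files there instead.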

PagerDuty Incident Response Documentation

PagerDuty shares their Incident Response Documentation:

This documentation covers parts of the PagerDuty Incident Response process. It is a cut-down version of our internal documentation, used at PagerDuty for any major incidents, and to prepare new employees for on-call responsibilities. It provides information not only on preparing for an incident, but also what to do during and after. It is intended to be used by on-call practitioners and those involved in an operational incident response process (or those wishing to enact a formal incident response process).

I think this is a goldmine for anybody involved with incident response teams, operations, monitoring, technical support, network centers, and other similar setups.  Not only does it cover the specific steps and expectations during different situations, but it also defines the culture that the company is trying to build.

I wish I had this 15 years ago when I was involved in setting up the Network Operations Center (NOC).  I will definitely use it in the near future, when we set up the support department at work.

Dissecting an SSL certificate

Julia Evans does it again.  If you ever wanted to understand SSL certificates, her post “Dissecting an SSL certificate” is for you.   This part made me smile:

Picking the right settings for your SSL certificates and SSL configuration on your webserver is confusing. As far as I understand it there are about 3 billion settings. Here is an example of an SSL Labs result for mail.google.com. There is all this stuff like OLD_TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 on that page (for real, that is a real thing.). I’m happy there are tools like SSL Labs that help mortals make sense of all of it.

GitLab horror story : backup / restore failure

As I am reading this story – GitLab.com melts down after wrong directory deleted, backups fail and these details – every single hair I have moves …  I don’t (and didn’t) have any data on GitLab, so I haven’t lost anything.  But as somebody who worked as a system administrator (and backup administrator) for years, I can imagine the physical and psychological state of the team all too well.

Sure, things could have been done better.  But it’s easier said than done.  Modern technology is very complex.  And it changes fast.  And businesses want to move fast too.  And the proper resources (time, money, people) are not always allocated for mission critical tasks.  One thing is for sure: the responsibility lies with a whole bunch of people for a whole bunch of decisions.  But right now the hardest job falls to the tech people, who have to bring back whatever they can.  There’s no sleep.  Probably no food.  No fun.  And tremendous pressure all around.

I wish the guys and gals at GitLab the best of luck.  Hopefully they will find a snapshot to restore from and this whole thing will calm down and sort itself out.  Stay strong!

And I guess I’ll be doing test restores all night tonight, making sure that all my things are covered…

Update: you can now read the full post-mortem as well.

Choosing the “best software”

Julia Evans has a nice blog post about choosing the “best software”.  Here is my favorite part:

So, let’s talk about another way to think about making decisions than “what is the Best Thing in this situation”.

I run an event series called “lightning talks and pie”. At the most recent one, Ines Sombra gave a talk about capacity planning. In it, she said that there are 3 reasons you might want to change something about your system:

  1. It’s too expensive
  2. It’s too difficult to operate (humans spend a ton of time worrying about it)
  3. It’s not doing the job it’s supposed to

I find these 3 criteria a lot easier to reason about than the “Choose The Best Thing” framework.

She provides some examples of how to apply this thinking, as well as how to deal with tradeoffs and limitations.