HTTPS on Stack Overflow: The End of a Long Road

Way too often I hear rants from random people (unfortunately, many of them are also from the IT industry, with the deep understanding of the underlying issues) complaining about why company X or product Y doesn’t implement this or that feature.  As someone who has been involved a dozens, if not hundreds, of projects, I pretty much always can think of a number of reasons why even seemingly the simplest of features aren’t implemented for years.  These can vary from business side of things – insufficient budgets, strategic goals, and the like – to technical, such as architectural limitations, insufficient expertise, insufficient resources, etc.

One of the recent frequent rant that keeps coming up is “Why don’t they just enable HTTPS?”.  Again, as someone being involved in HTTPS setup for several different environments I can think of a number of reasons why.  SSL certificates used to cost money and were quite cumbersome to install until very recently.  Thanks to Let’s Encrypt effort, SSL certificates are now free and quite easy to issue and renew.  But that’s only part of the problem.  Enabling HTTPS requires infrastructural changes, and the more complex your infrastructure, the more changes are needed.  Just think of a few points here – web server configuration (especially when you have multiple web servers, with varied software (Apache, Nginx, IIS) and varied versions of that software), load balancers, web application firewalls, reverse proxies, caching servers, and so on.

Apart from the infrastructural changes, HTTPS often needs changes on the application level.  Caching, cookies, headers, making sure that all your resources are HTTPS-only, redirects, and the like.

All of the above issues are multiplied by a gadzillion, when your project is publicly available, used by tonnes of people, and provides embeddable content or APIs to third-party (hello, backward compatibility).

This is not to mention that HTTPS itself is a complex subject, not well understood by even the most experienced system administrators and developers.  There are different protocols and versions (SSL vs. TLS), cipher suites, handshakes, and protocol details.  Just have a look at the variety of checks and the report length done by Qualys’ SSL Labs Server Test.  Even giants like Google, who employ thousands of smart people, can’t get it all right.

But for some reason, people either don’t know or prefer to ignore all this complexities, and whine and cry anyway.

Recently, Stack Overflow – a well known collection of sites on a variety of technical subjects, has completed the migration to HTTPS everywhere.  These are also people with a lot of knowledge and expertise and with access to all the information.  Just have a look at their long way, which took not months, but years: HTTPS on Stack Overflow: The End of a Long Road.

Today, we deployed HTTPS by default on Stack Overflow. All traffic is now redirected to https:// and Google links will change over the next few weeks. The activation of this is quite literally flipping a switch (feature flag), but getting to that point has taken years of work. As of now, HTTPS is the default on all Q&A websites.

We’ve been rolling it out across the Stack Exchange network for the past 2 months. Stack Overflow is the last site, and by far the largest. This is a huge milestone for us, but by no means the end. There’s still more work to do, which we’ll get to. But the end is finally in sight, hooray!

So next time you are about to start crying about somebody not having feature X or Y, just give it a minute first.  Try to imagine what goes on on the other side.  You aren’t the only one with low budgets, pressing deadlines, insufficient knowledge, bad colleagues and horrible bosses…

Using the Strict-Transport-Security header

Julia Evans has an excellent write-up on “Using the Strict-Transport-Security header” – what it is, why you’d want to use it, and what are some of the consequences of using one.

As always with her blog posts, this one is very focused on one particular subject, easy to read, and explains things simply, so that the reader’s technical level is always irrelevant (OK, OK, you do need a basic understanding of how HTTP works, but not more than that).

HAProxy SNI

HAProxy SNI” is pure gold! If you want to have a load balancer for HTTPS traffic, without managing SSL certificates on the said load balancer, there is a way to do so.

The approach is utilizing the Server Name Indication (SNI) extension to the TLS protocol.  I knew about it and I was already using it on the web server side, but it didn’t occur to me that it’ll be utilized on the load balancer.  Here’s the configuration bit:

frontend https *:443
  description Incoming traffic to port 443
  mode tcp
  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }
  use_backend backend-ssl-foobar if { req_ssl_sni -i foobar.com }
  use_backend backend-ssl-example if { req_ssl_sni -i example.com }
  default_backend backend-ssl-default

The above will make HAProxy listen on port 443, and then send all traffic for foobar.com to one backend, all traffic for example.com to another backend, and the rest to the third, default backend.

Fixing outdated Let’s Encrypt (zope.interface error)

I’ve started using Let’s Encrypt for the SSL certificates a while back.  I installed it on all the web servers, irrelevant of the need for SSL, just to have it there, when I need it (thanks to this Ansible role).  One of those old web servers needed an SSL certificate recently, so I thought it’d be no problem to generate one.

But I was wrong. The letsencrypt-auto tool got outdated and was failing to execute, throwing some Python exception about missing zope.interface module.  A quick Google search brought this StackOverflow discussion, with the exact issue I was having.

Traceback (most recent call last):
  File "/root/.local/share/letsencrypt/bin/letsencrypt", line 7, in <module>
    from certbot.main import main
  File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/certbot/main.py", line 12, in <module>
    import zope.component
  File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/zope/component/__init__.py", line 16, in <module>
    from zope.interface import Interface
ImportError: No module named interface

However, the solution didn’t fix the problem for me:

unset PYTHON_INSTALL_LAYOUT
/opt/letsencrypt/letsencrypt-auto -v

Even pulling the updated version from the GitHub repository didn’t solve it.

After poking around for a while more, I found this bug report from the last year, which solved my problem.

I recommend:

  1. Running rm -rf /root/.local/share/letsencrypt. This removes your installation of letsencrypt, but keeps all configuration files, certificates, logs, etc.
  2. Make sure you have an up to date copy of letsencrypt-auto. It can be found here.
  3. Run letsencrypt-auto again.

If you get the same behavior, you can try installing zope.interface manually by running:

/root/.local/share/letsencrypt/bin/pip install zope.interface

Hopefully, next time I’ll remember to search my blog’s archives …

Update (May 31, 2017): check out my brother’s follow up post with even better way of fixing this issue.

Dissecting an SSL certificate

Julia Evans does it again.  If you ever wanted to understand SSL certificates, her post “Dissecting an SSL certificate” is for you.   This part made me smile:

Picking the right settings for your SSL certificates and SSL configuration on your webserver is confusing. As far as I understand it there are about 3 billion settings. Here is an example of an SSL Labs result for mail.google.com. There is all this stuff like OLD_TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 on that page (for real, that is a real thing.). I’m happy there are tools like SSL Labs that help mortals make sense of all of it.

Google and HTTPS

Here are some interesting news on the subject of Google and HTTPS:

In support of our work to implement HTTPS across all of our products (https://www.google.com/transparencyreport/https/) we have been operating our own subordinate Certificate Authority (GIAG2), issued by a third-party. This has been a key element enabling us to more rapidly handle the SSL/TLS certificate needs of Google products.

As we look forward to the evolution of both the web and our own products it is clear HTTPS will continue to be a foundational technology. This is why we have made the decision to expand our current Certificate Authority efforts to include the operation of our own Root Certificate Authority. To this end, we have established Google Trust Services (https://pki.goog/), the entity we will rely on to operate these Certificate Authorities on behalf of Google and Alphabet.

The process of embedding Root Certificates into products and waiting for the associated versions of those products to be broadly deployed can take time. For this reason we have also purchased two existing Root Certificate Authorities, GlobalSign R2 and R4. These Root Certificates will enable us to begin independent certificate issuance sooner rather than later.

We intend to continue the operation of our existing GIAG2 subordinate Certificate Authority.

If you need a bit of help putting this into perspective, this Hacker News thread has your back:

You can now have a website secured by a certificate issued by a Google CA, hosted on Google web infrastructure, with a domain registered using Google Domains, resolved using Google Public DNS, going over Google Fiber, in Google Chrome on a Google Chromebook. Google has officially vertically integrated the Internet.

Amazon Linux AMI : Let’s Encrypt : ImportError: No module named interface

Let’s Encrypt has only experimental support for the Amazon Linux AMI, so it’s kind of expected to have issues once in a while.   Here’s one I came across today:

# /opt/letsencrypt/certbot-auto renew
Creating virtual environment...
Installing Python packages...
Installation succeeded.
Traceback (most recent call last):
File "/root/.local/share/letsencrypt/bin/letsencrypt", line 7, in <module>
from certbot.main import main
File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/certbot/main.py", line 12, in <module>
import zope.component
File "/root/.local/share/letsencrypt/local/lib/python2.7/dist-packages/zope/component/__init__.py", line 16, in <module>
from zope.interface import Interface
ImportError: No module named interface

My first though was to install the system updates. It looks like something is off in the Python-land. But even after the “yum update” was done, the issue was still there. A quick Google search later, thanks to the this GitHub issue and this comment, the solution is the following:

pip install pip --upgrade
pip install virtualenv --upgrade
virtualenv -p /usr/bin/python27 venv27

Running the renewal of the certificates works as expected after this.

P.S.: I wish we had fewer package and dependency managers in the world…

Let’s Encrypt on CentOS 7 and Amazon AMI

The last few weeks were super busy at work, so I accidentally let a few SSL certificates expire.  Renewing them is always annoying and time consuming, so I was pushing it until the last minute, and then some.

Instead of going the usual way for the renewal, I decided to try to the Let’s Encrypt deal.  (I’ve covered Let’s Encrypt before here and here.)  Basically, Let’s Encrypt is a new Certification Authority, created by Electronic Frontier Foundation (EFF), with the backing of Google, Cisco, Mozilla Foundation, and the like.  This new CA is issuing well recognized SSL certificates, for free.  Which is good.  But the best part is that they’ve setup the process to be as automated as possible.  All you need is to run a shell command to get the certificate and then another shell command in the crontab to renew the certificate automatically.  Certificates are only issued for 3 months, so you’d really want to have them automatically updated.

It took me longer than I expected to figure out how this whole thing works, but that’s because I’m not well versed in SSL, and because they have so many different options, suited for different web servers, and different sysadmin experience levels.

Eventually I made it work, and here is the complete process, so that I don’t have to figure it out again later.

We are running a mix of CentOS 7 and Amazon AMI servers, using both Nginx and Apache.   Here’s what I had to do.

First things first.  Install the Let’s Encrypt client software.  Supposedly there are several options, but I went for the official one.  Manual way:

# Install requirements
yum install git bc
cd /opt
git clone https://github.com/certbot/certbot letsencrypt

Alternatively, you can use geerlingguy’s lets-encrypt-role for Ansible.

Secondly, we need to get a new certificate.  As I said before, there are multiple options here.  I decided to use the certonly way, so that I have better control over where things go, and so that I would minimize the web server downtime.

There are a few things that you need to specify for the new SSL certificate.  These are:

  • The list of domains, which the certificate should cover.  I’ll use example.com and www.example.com here.
  • The path to the web folder of the site.  I’ll use /var/www/vhosts/example.com/
  • The email address, which Let’s Encrypt will use to contact you in case there is something urgent.  I’ll use ssl@example.com here.

Now, the command to get the SSL certificate is:

/opt/letsencrypt/certbot-auto certonly --webroot --email ssl@example.com --agree-tos -w /var/www/vhosts/example.com/ -d example.com -d www.example.com

When you run this for the first time, you’ll see that a bunch of additional RPM packages will be installed, for the virtual environment to be created and used.  On CentOS 7 this is sufficient.  On Amazon AMI, the command will run, install things, and will fail with something like this:

WARNING: Amazon Linux support is very experimental at present...
if you would like to work on improving it, please ensure you have backups
and then run this script again with the --debug flag!

This is useful, but insufficient.  Before you can run successfully, you’ll also need to do the following:

yum install python26-virtualenv

Once that is done, run the certbot command with the –debug parameter, like so:

/opt/letsencrypt/certbot-auto certonly --webroot --email ssl@example.com --agree-tos -w /var/www/vhosts/example.com/ -d example.com -d www.example.com --debug

This should produce a success message, with “Congratulations!” and all that.  The path to your certificate (somewhere in /etc/letsencrypt/live/example.com/) and its expiration date will be mentioned too.

If you didn’t get the success message, make sure that:

  • the domain, for which you are requesting a certificate, resolves back to the server, where you are running the certbot command.  Let’s Encrypt will try to access the site for verification purposes.
  • that public access is allowed to the /.well-known/ folder.  This is where Let’s Encrypt will store temporary verification files.  Note that the folder starts with dot, which in UNIX means hidden folder, which are often denied access to by many web server configurations.

Just drop a simple hello.txt to the /.well-known/ folder and see if you can access it with the browser.  If you can, then Let’s Encrypt shouldn’t have any issues getting you a certification.  If all else fails, RTFM.

Now that you have the certificate generated, you’ll need to add it to the web server’s virtual host configuration.  How exactly to do this varies from web server to web server, and even between the different versions of the same web server.

For Apache version >= 2.4.8 you’ll need to do the following:

SSLEngine on
SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem

For Apache version < 2.4.8 you’ll need to do the following:

SSLEngine on
SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
SSLCertificateFile /etc/letsencrypt/live/example.com/cert.pem
SSLCertificateChainFile /etc/letsencrypt/live/example.com/chain.pem

For Nginx >= 1.3.7 you’ll need to do the following:

ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

You’ll obviously need the additional SSL configuration options for protocols, ciphers and the like, which I won’t go into here, but here are a few useful links:

Once your SSL certificate is issued and web server is configured to use it, all you need is to add an entry to the crontab to renew the certificates which are expiring in 30 days or less.  You’ll only need a single entry for all your certificates on this machine.  Edit your /etc/crontab file and add the following (adjust for your web server software, obviously):

# Renew Let's Encrypt certificates at 6pm every Sunday
0 18 * * 0 root (/opt/letsencrypt/certbot-auto renew && service httpd restart)

That’s about it.  Once all is up and running, verify and adjust your SSL configuration, using Qualys SSL Labs excellent tool.

Let’s Encrypt is not in Beta anymore

Let’s Encrypt – anew Certificate Authority, which is free, open, and automated – announced that it’s leaving beta.  Just look at how many SSL certificates they’ve issued, and at what rate!

Issuance-April-10-2016

I’ve first written about Let’s Encrypt back in November 2014.  It hasn’t been that long ago, but boy, what a journey!

Cipherli.st – strong ciphers for Apache, Nginx and Lighttpd

Cipherli.st – provides ready to use cipher configurations for a variety of applications, such as Apache, Nginx, Lighttpd, HAProxy, Exim, Postfix, Dovecot, OpenSSH, and others.  This is a huge time-saver for those of us not well versed in cryptography and security.

Don’t forget to use Qyalis SSL Labs SSL Server Test tool for the complete analysis of where you went wrong.