HTTPS on Stack Overflow: The End of a Long Road

Way too often I hear rants from random people (unfortunately, many of them work in the IT industry and should have a deep understanding of the underlying issues) complaining that company X or product Y doesn’t implement this or that feature.  As someone who has been involved in dozens, if not hundreds, of projects, I can almost always think of a number of reasons why even the seemingly simplest features go unimplemented for years.  These range from the business side of things (insufficient budgets, strategic goals, and the like) to the technical, such as architectural limitations, insufficient expertise, or insufficient resources.

One of the frequent recent rants that keeps coming up is “Why don’t they just enable HTTPS?”.  Again, as someone who has been involved in HTTPS setup for several different environments, I can think of a number of reasons why.  Until very recently, SSL certificates cost money and were quite cumbersome to install.  Thanks to the Let’s Encrypt effort, SSL certificates are now free and quite easy to issue and renew.  But that’s only part of the problem.  Enabling HTTPS requires infrastructural changes, and the more complex your infrastructure, the more changes are needed.  Just think of a few points here: web server configuration (especially when you have multiple web servers running varied software, such as Apache, Nginx, or IIS, in varied versions), load balancers, web application firewalls, reverse proxies, caching servers, and so on.

Apart from the infrastructural changes, HTTPS often needs changes at the application level: caching, cookies, headers, redirects, making sure that all your resources are HTTPS-only, and the like.
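To make the application-level part a bit more concrete, here is a minimal Python sketch of three of the items above: redirects, cookies, and headers.  The function names are hypothetical, the one-year HSTS `max-age` is just an assumed value, and a real application would normally rely on its framework or web server for this rather than on hand-rolled helpers.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed policy value: ask browsers to remember HTTPS-only for one year.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000")


def https_redirect_target(url: str) -> Optional[str]:
    """Return the https:// version of an http:// URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Keep host, path, query and fragment; swap only the scheme.
    return urlunsplit(("https",) + tuple(parts[1:]))


def harden_set_cookie(header_value: str) -> str:
    """Append the Secure attribute to a Set-Cookie value if it is missing."""
    attributes = [a.strip().lower() for a in header_value.split(";")[1:]]
    if "secure" in attributes:
        return header_value
    return header_value + "; Secure"
```

Even this toy version hints at why the switch is not trivial: every place that builds a URL, sets a cookie, or emits a header has to be found and checked.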

All of the above issues are multiplied by a gazillion when your project is publicly available, used by tonnes of people, and provides embeddable content or APIs to third parties (hello, backward compatibility).

This is not to mention that HTTPS itself is a complex subject, not well understood by even the most experienced system administrators and developers.  There are different protocols and versions (SSL vs. TLS), cipher suites, handshakes, and protocol details.  Just have a look at the variety of checks performed, and the length of the report produced, by Qualys’ SSL Labs Server Test.  Even giants like Google, who employ thousands of smart people, can’t get it all right.
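For a small taste of how many knobs there are, here is a sketch using Python’s standard `ssl` module.  It builds a client-side context, refuses anything older than TLS 1.2 (an assumed policy picked for illustration, not a recommendation from any of the posts quoted here), and lists the cipher suites that remain enabled.  The function name is hypothetical.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context with certificate checks on and old protocols off."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumed policy: reject SSLv3, TLS 1.0 and TLS 1.1.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
# Each entry describes one enabled cipher suite; the list depends on the
# local OpenSSL build, which is part of what makes this hard to reason about.
cipher_names = [cipher["name"] for cipher in ctx.get_ciphers()]
```

The length of `cipher_names` alone, and the fact that it differs from machine to machine, is a decent illustration of the point.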

But for some reason, people either don’t know about or prefer to ignore all these complexities, and whine and cry anyway.

Recently, Stack Overflow, a well-known collection of sites on a variety of technical subjects, completed its migration to HTTPS everywhere.  These are also people with a lot of knowledge and expertise and with access to all the information.  Just have a look at their long journey, which took not months, but years: HTTPS on Stack Overflow: The End of a Long Road.

Today, we deployed HTTPS by default on Stack Overflow. All traffic is now redirected to https:// and Google links will change over the next few weeks. The activation of this is quite literally flipping a switch (feature flag), but getting to that point has taken years of work. As of now, HTTPS is the default on all Q&A websites.

We’ve been rolling it out across the Stack Exchange network for the past 2 months. Stack Overflow is the last site, and by far the largest. This is a huge milestone for us, but by no means the end. There’s still more work to do, which we’ll get to. But the end is finally in sight, hooray!

So next time you are about to start crying about somebody not having feature X or Y, just give it a minute first.  Try to imagine what goes on on the other side.  You aren’t the only one with low budgets, pressing deadlines, insufficient knowledge, bad colleagues and horrible bosses…

Stack Overflow: Helping One Million Developers Exit Vim

OK, this one is socially funny and statistically cool: the Stack Overflow question on how to exit the Vim editor has been viewed over a million times in the last few years.  Now, there’s a breakdown of all sorts of statistics about who gets stuck in Vim the most.  It’s pretty amazing what kind of questions and answers one can ponder when having access to a lot of statistical data.

:wq

The Most Mentioned Books On StackOverflow

Slashdot links to “The Most Mentioned Books On StackOverflow”.

How we did it:

  • We got database dump of all user-contributed content on the Stack Exchange network (can be downloaded here)
  • Extracted questions and answers made on stackoverflow
  • Found all amazon.com links and counted it
  • Created tag-based search for your convenience
  • Brought it to you
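The link-counting step is simple enough to sketch in a few lines of Python.  This is only an illustration of the idea, not the authors’ actual code: the regular expression, the function name, and the sample posts are all made up, and a real run would read post bodies from the data dump instead of a hard-coded list.

```python
import re
from collections import Counter

# Assumed pattern: an amazon.com URL runs until whitespace, a quote, an angle
# bracket, or a closing parenthesis.
AMAZON_LINK = re.compile(r"https?://(?:www\.)?amazon\.com/[^\s\"'<>)]+")


def count_amazon_links(posts):
    """Count how often each amazon.com URL appears across post bodies."""
    counts = Counter()
    for body in posts:
        counts.update(AMAZON_LINK.findall(body))
    return counts


# Hypothetical sample data standing in for questions and answers.
sample_posts = [
    'See <a href="http://www.amazon.com/dp/0201633612">this book</a>',
    "I liked http://www.amazon.com/dp/0201633612 too",
    "Try https://amazon.com/dp/0132350882 instead",
]
top = count_amazon_links(sample_posts).most_common(1)
```

In practice the same book shows up under many different URL shapes, so some normalisation would be needed before the counts mean much.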

I’ve previously linked to a similar selection of “Top 29 books on Amazon from Hacker News comments”.

The RegEx that killed StackOverflow

Here’s an outage postmortem from the recent StackOverflow downtime.  It just shows you how easy it is to break things, even when they were built by some of the smartest people around.  Programming is tough and there is no way around it.

Technical Details

The regular expression was: ^[\s\u200c]+|[\s\u200c]+$ Which is intended to trim unicode space from start and end of a line. A simplified version of the Regex that exposes the same issue would be \s+$ which to a human looks easy (“all the spaces at the end of the string”), but which means quite some work for a simple backtracking Regex engine. The malformed post contained roughly 20,000 consecutive characters of whitespace on a comment line that started with — play happy sound for player to enjoy. For us, the sound was not happy.
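To see what that expression is meant to do on a well-behaved input, here it is in Python.  Python’s `re` module is also a backtracking engine, but it is not the engine Stack Overflow runs on, so treat this purely as an illustration of the pattern’s intent; the function name is hypothetical.

```python
import re

# Same pattern as in the postmortem: whitespace or U+200C (zero-width
# non-joiner) at the start of the string, or at its end.
TRIM = re.compile(r"^[\s\u200c]+|[\s\u200c]+$")


def regex_trim(line: str) -> str:
    """Strip leading and trailing whitespace and U+200C, keeping inner spacing."""
    return TRIM.sub("", line)


trimmed = regex_trim(" \u200c hello  world \t")
```

On a short line like this one the cost is negligible, which is exactly why nobody looks twice at such a pattern.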

If the string to be matched against contains 20,000 space characters in a row, but not at the end, then the Regex engine will start at the first space, check that it belongs to the \s character class, move to the second space, make the same check, etc. After the 20,000th space, there is a different character, but the Regex engine expected a space or the end of the string. Realizing it cannot match like this it backtracks, and tries matching \s+$ starting from the second space, checking 19,999 characters. The match fails again, and it backtracks to start at the third space, etc.

So the Regex engine has to perform a “character belongs to a certain character class” check (plus some additional things) 20,000+19,999+19,998+…+3+2+1 = 199,990,000 times, and that takes a while. This is not classic catastrophic backtracking (talk on backtracking) (performance is O(n²), not exponential, in length), but it was enough. This regular expression has been replaced with a substring function.
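Both the arithmetic and the fix are easy to sketch in Python.  The first function below imitates the simplified model from the quote (for every starting position, count the whitespace characters the engine walks over), and the second trims with plain index arithmetic, which is one way a “substring function” could look.  Neither is the actual Stack Overflow code, and all names are hypothetical.

```python
def simulated_checks(text: str) -> int:
    """Count whitespace checks in the simplified model of matching \\s+$."""
    total = 0
    for start in range(len(text)):
        pos = start
        # Walk over the run of whitespace beginning at this start position.
        while pos < len(text) and text[pos].isspace():
            total += 1
            pos += 1
    return total


def triangular(n: int) -> int:
    """Closed form of 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def is_trimmable(ch: str) -> bool:
    """Whitespace or U+200C, mirroring the character class in the regex."""
    return ch.isspace() or ch == "\u200c"


def substring_trim(line: str) -> str:
    """Trim both ends in a single linear pass from each side, no backtracking."""
    start, end = 0, len(line)
    while start < end and is_trimmable(line[start]):
        start += 1
    while end > start and is_trimmable(line[end - 1]):
        end -= 1
    return line[start:end]


# A small run keeps the quadratic simulation fast: 200 spaces, then a letter.
small = simulated_checks(" " * 200 + "x")
# The same count for the 20,000 spaces in the incident, via the closed form.
incident = triangular(20_000)
```

For n = 20,000 the closed form n(n+1)/2 gives 200,010,000, so the total quoted above looks slightly off (199,990,000 is the sum up to 19,999).  The difference is irrelevant to the argument: the work grows quadratically with the length of the whitespace run, while the substring version touches each character at most once.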