Software debugging is like finding the hay in the needle stack.
Here goes the story of me learning a few new swear words and pulling out nearly all my hair. Grab a cup of coffee, this will take a while to tell…
First of all, here is a diagram to make things a little bit more visual.
As you can see, we have an office network with NAT on the gateway. We have an Amazon VPC with NAT on the bastion host. And then there’s the rest of the Internet.
The setup is pretty straightforward. There are no outgoing firewalls anywhere, no VLANs, no network equipment – all of the involved machines are a variety of Linux boxes. The whole thing has been working fine for a while now.
A couple of weeks ago we had an issue with our ISP in the office. The Internet connection was alive, but we were getting extremely high packet loss – around 80%. The technician passed by, changed the cables, rebooted the ADSL modem, and we also rebooted the gateway. The problem was fixed, except for one annoying bit. We could access all of the Internet just fine, except for our Amazon VPC bastion host. Here’s where it gets interesting.
httpdiff – perform the same request against two HTTP servers and diff the results
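The idea behind such a tool can be sketched in a few lines of Python. This is only an illustration of the technique, not how httpdiff itself is implemented; the function names `fetch`, `render` and `diff_responses` are made up for this example, and it assumes plain GET requests with text bodies.

```python
import difflib
import urllib.request


def fetch(url, timeout=10):
    """Return (status, headers, body text) for a GET request to url.

    Note: urlopen raises HTTPError for 4xx/5xx responses; a real tool
    would probably want to catch that and diff the error pages too.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return resp.status, dict(resp.getheaders()), body


def render(status, headers, body):
    """Flatten a response into a list of lines so it can be diffed."""
    lines = [f"status: {status}"]
    # Lowercase and sort header names so ordering and case do not show up as noise.
    lines += [f"{k}: {v}" for k, v in sorted((k.lower(), v) for k, v in headers.items())]
    lines.append("")
    lines += body.splitlines()
    return lines


def diff_responses(a, b, name_a="server-a", name_b="server-b"):
    """Return a unified diff (list of lines) of two (status, headers, body) tuples."""
    return list(
        difflib.unified_diff(render(*a), render(*b), name_a, name_b, lineterm="")
    )
```

Calling `diff_responses(fetch(url_a), fetch(url_b))` with two URLs of your own and printing the resulting lines gives a quick view of where the two servers disagree. An empty list means the status, headers and body were identical.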
I read this story a while ago, but it is a beautiful piece of system administration reality, so here it goes again.
“We’re having a problem sending email out of the department.”
“What’s the problem?” I asked.
“We can’t send mail more than 500 miles,” the chairman explained.
I choked on my latte. “Come again?”
“We can’t send mail farther than 500 miles from here,” he repeated. “A little bit more, actually. Call it 520 miles. But no farther.”
More stories here.
morgue – post mortem tracker
Tools of the Trade – a huge collection of tools (mostly software as a service) for all kinds of web work: development, troubleshooting, project management, testing, emails, etc.
Easylogging++ – single header only, extremely light-weight high performance logging library for C++ applications
sysdig – system troubleshooting for Linux
- Log structured data in a readable format
- Add a dash of color
- Logs let your app communicate with you and your team
- Seriously though, don’t put exception stack traces in your logs!
- Log URLs for easy access to more context
- Add emotional context to your logs
Most of these are somewhat expected, but emotional context in logs was definitely new to me. I wonder why I’ve never even thought of this.
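A few of the tips above (structured data in a readable format, a dash of color, URLs for more context) can be sketched with Python's standard `logging` module. This is a minimal illustration under my own assumptions: the `KeyValueFormatter` class and the `fields` attribute are names invented for this example, not part of any library.

```python
import logging

# ANSI color codes per log level (assumes a terminal that understands them).
COLORS = {
    "DEBUG": "\033[36m",
    "INFO": "\033[32m",
    "WARNING": "\033[33m",
    "ERROR": "\033[31m",
    "CRITICAL": "\033[35m",
}
RESET = "\033[0m"


class KeyValueFormatter(logging.Formatter):
    """Render a record as 'LEVEL message key=value ...' with an optional colored level."""

    def __init__(self, color=True):
        super().__init__()
        self.color = color

    def format(self, record):
        # 'fields' is a hypothetical attribute, passed in via extra={"fields": {...}}.
        fields = getattr(record, "fields", {})
        pairs = " ".join(f"{key}={value!r}" for key, value in fields.items())
        level = record.levelname
        if self.color:
            level = f"{COLORS.get(level, '')}{level}{RESET}"
        return f"{level} {record.getMessage()} {pairs}".rstrip()
```

To try it, attach the formatter to a `logging.StreamHandler` and log with something like `logger.error("payment failed", extra={"fields": {"order_id": 42, "url": "https://example.com/orders/42"}})`. The structured fields end up as readable `key=value` pairs, and the URL gives whoever reads the log a direct way to more context.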