Month: April 2016

WhatsApp introduces end-to-end encryption for everything

WhatsApp introduces end-to-end encryption for all communications – chats, pictures, videos, etc. I’m sure it’ll help them get more individuals and businesses on the network, as well as probably get the app banned in a handful of countries.

WhatsApp has always prioritized making your data and communication as secure as possible. And today, we’re proud to announce that we’ve completed a technological development that makes WhatsApp a leader in protecting your private communication: full end-to-end encryption. From now on when you and your contacts use the latest version of the app, every call you make, and every message, photo, video, file, and voice message you send, is end-to-end encrypted by default, including group chats.

The idea is simple: when you send a message, the only person who can read it is the person or group chat that you send that message to. No one can see inside that message. Not cybercriminals. Not hackers. Not oppressive regimes. Not even us. End-to-end encryption helps make communication via WhatsApp private – sort of like a face-to-face conversation.

Absolute stupidity of include directive in /etc/sudoers, and Microsoft Azure

I’ve just spent three hours (!!!) trying to troubleshoot why sudo was misbehaving on a brand new CentOS 7 server. I was doing the setup of two identical servers in parallel (for two different clients). One server worked as expected, the other one didn’t.

The thing I was trying to do was trivial – allow users in the wheel group to run sudo commands without a password. I’ve done it a gadzillion times in the past, and probably at least a dozen times just this week alone. Here’s what’s needed:

Add user to the wheel group.
Edit the /etc/sudoers file to uncomment the line (as in: remove the hash comment character from the beginning of the line): # %wheel ALL=(ALL) NOPASSWD: ALL
Enjoy!
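
For the record, here is a minimal sketch of that "uncomment" step as a shell function. The function name enable_wheel_nopasswd is made up for illustration, and it assumes the rule looks like the stock CentOS 7 line quoted above. It is meant to be run against a copy of the file, not against the live /etc/sudoers.

```shell
# Uncomment the "%wheel ALL=(ALL) NOPASSWD: ALL" rule in a sudoers-style file.
# Hypothetical helper; run it on a COPY, never on the live /etc/sudoers.
# Usage: enable_wheel_nopasswd /path/to/sudoers.copy
enable_wheel_nopasswd() {
    file=$1
    tmp=$(mktemp) || return 1
    # Remove the leading "#" (and any whitespace after it) from that one rule only.
    # Lines starting with "##" and all other comments are left untouched.
    if sed -E 's/^#[[:space:]]*(%wheel[[:space:]]+ALL=\(ALL\)[[:space:]]+NOPASSWD:[[:space:]]*ALL)[[:space:]]*$/\1/' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

Adding the user to the group is typically "usermod -aG wheel USERNAME" as root. For the real file, visudo is still the safer route, and "visudo -c -f FILE" will syntax-check an edited copy before it goes anywhere near production.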

Imagine my surprise when it only worked on one server and not on the other. I’ve dug deep and wide. Took a break. And dug again. Then, I’ve summoned the great troubleshooting powers of my brother. But even those didn’t help.

Lots of logging, diff-ing, strace-ing, swearing and hair pulling later, the problem was found and fixed. The issue was due to two separate reasons.

Reason 1: /etc/sudoers syntax uses the hash character (#) for two different purposes.

For comments, which there are plenty of in the file.
For the “#include” and “#includedir” directives, which include other files into the configuration.

The default /etc/sudoers file is full of lengthy comments. Just to give you an idea:

(root@host ~)# wc -l /etc/sudoers
118 /etc/sudoers
(root@host ~)# grep -v '^#' /etc/sudoers | grep -v '^$' | wc -l
12

Yup. 118 lines in total vs. 12 lines of configuration (comments and empty lines removed). Like with banner blindness, this causes comment blindness. Especially towards the end of the file. Especially if you’ve seen this file a billion times before.

And that’s where the problem starts. Right at the bottom of the file, there are these two lines:

##Read drop-in files from /etc/sudoers.d (the # here does not mean a comment)
#includedir /etc/sudoers.d

Interesting, right? Usually there is nothing in the /etc/sudoers.d/ folder on a brand new CentOS box. But even if there was something, by now you’d assume that the include of the folder is commented out. Much like that wheel group configuration I mentioned earlier. I found it by accident, while reading the sudoers(5) manual page, trying to find out if there are any other locations or defaults for included configurations. About 600 lines into the manual, there is this:

To include /etc/sudoers.local from within /etc/sudoers we
would use the following line in /etc/sudoers:

#include /etc/sudoers.local

When sudo reaches this line it will suspend processing of
the current file (/etc/sudoers) and switch to
/etc/sudoers.local.

So that comment is not a comment at all, but an include of the folder. That’s the first part of the problem.
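
Note that the grep -v '^#' I used earlier falls into the same trap: it throws the include directives away together with the real comments. A rough sketch of a filter that keeps them is below. The function name effective_sudoers is my own invention, and it only knows about the "#include" and "#includedir" spellings discussed here.

```shell
# Print the lines of a sudoers file that sudo actually acts on.
# Real comments and blank lines are dropped, but "#include" and
# "#includedir" directives are kept, because they are NOT comments.
# Usage: effective_sudoers /etc/sudoers
effective_sudoers() {
    awk '
        /^[ \t]*$/                { next }         # blank line
        /^#include(dir)?[ \t]/    { print; next }  # include directive, keep it
        /^[ \t]*#/                { next }         # ordinary comment
        { print }                                  # everything else is configuration
    ' "$1"
}
```

Run against the stock file, this should show the "#includedir /etc/sudoers.d" line right next to the actual rules, which is exactly where it is hard to miss.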

Reason 2: Windows Azure Linux Agent

As I mentioned above, the servers aren’t part of my infrastructure – they were provided by the clients. I was basically given an IP address, a username and a password for each server – which is usually all I need. In most cases I don’t really care where the server is hosted and what’s the hosting company in use. Turns out, I should.

The server with the problem was hosted on the Microsoft Azure cloud infrastructure. I assumed I was working off a brand new vanilla CentOS 7 box, but in fact I wasn’t. Microsoft adds packages to the default install. One of the packages that it adds is the Windows Azure Linux Agent, which “rpm -qi WALinuxAgent” describes as follows:

The Windows Azure Linux Agent supports the provisioning and running of Linux VMs in the Microsoft Azure cloud. This package should be installed on Linux disk images that are built to run in the Microsoft Azure environment.

Harmless, right? Well, not so much. What I found in the /etc/sudoers.d/ folder was a little file, called waagent, which contained a different sudo configuration for the user I had a problem with.

During the troubleshooting process, I’ve created a new test user, added the account to the wheel group and found out that it was working fine. From there, I needed to find the differences between the two users.
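
In hindsight, a quick search of the drop-in folder for the user name would have saved me a lot of time. Something along these lines, where sudoers_files_for_user is a hypothetical helper name and the folder defaults to /etc/sudoers.d:

```shell
# List sudoers drop-in files that mention a given user name as a whole word.
# Hypothetical helper; needs read access to the folder (usually root).
# Usage: sudoers_files_for_user USER [DIR]
sudoers_files_for_user() {
    user=$1
    dir=${2:-/etc/sudoers.d}
    # -r: recurse, -l: print file names only, -w: whole word, -F: fixed string
    grep -rlwF -- "$user" "$dir" 2>/dev/null
}
```

Running "sudo -l -U USERNAME" as root is a good companion to this: it lists the rules sudo itself believes apply to that user, wherever they were defined.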

I guess the user that I was using initially was created by the client’s system administrator using the Microsoft Azure web interface. A quick Google search brings up this page from the Azure documentation:

By default, the root user is disabled on Linux virtual machines in Azure. Users can run commands with elevated privileges by using the sudo command. However, the experience may vary depending on how the system was provisioned.

SSH key and password OR password only – the virtual machine was provisioned with either a certificate (.CER file) or SSH key as well as a password, or just a user name and password. In this case sudo will prompt for the user’s password before executing the command.

SSH key only – the virtual machine was provisioned with a certificate (.cer, .pem, or .pubfile) or SSH key, but no password. In this case sudo will not prompt for the user’s password before executing the command.

I checked the user’s home folder and found no keys in there, so I think it was provisioned using the first option, with password only.

I think Microsoft should make it much more obvious that the system behavior might be different. Amazon AWS provides a good example to follow. When you log in to an Amazon AMI instance, you see a message of the day (motd) banner, which looks like this:

$ ssh server.example.com
Last login: Tue Apr  5 17:25:38 2016 from 127.0.0.1

__|  __|_  )
_|  (     /   Amazon Linux AMI
___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2016.03-release-notes/

([email protected])$

It’s dead obvious that you are now on an Amazon EC2 machine and you should adjust your ~~expectations~~ assumptions accordingly.

Deleting the file immediately solved the problem. To avoid similar issues in the future, the #includedir directive can be moved further up in the file, and surrounded by more visible comments. Like, maybe, an ASCII art skull, or something.

With that, I am off to heavy drinking and recovery… Stay sane!

Share your public keys easily with GitHub

Here’s a handy thing that I didn’t know about – you can easily share your public keys by adding them to your GitHub account and then accessing the URL of the form https://github.com/YOUR_USERNAME.keys . What you get is a plain text response with all your public keys, ready to be inserted into your .ssh/authorized_keys file or anywhere else you want them.

Here’s an example of mine – https://github.com/mamchenkov.keys . Don’t forget to configure two factor authentication for your GitHub account for an extra layer of security. You probably don’t want any bugger who got your password inserting his own public keys into your account.
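
A small sketch of how the plain text response could be fed into authorized_keys without piling up duplicates on repeated runs. The helper name add_authorized_keys is made up, and the target file defaults to ~/.ssh/authorized_keys.

```shell
# Append public keys read from stdin to an authorized_keys file,
# skipping blank lines and keys that are already present.
# Hypothetical helper.
# Usage: curl -fsSL https://github.com/YOUR_USERNAME.keys | add_authorized_keys [FILE]
add_authorized_keys() {
    file=${1:-$HOME/.ssh/authorized_keys}
    mkdir -p "$(dirname "$file")" && touch "$file" && chmod 600 "$file" || return 1
    # The "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        # -x: whole-line match, -F: treat the key as a fixed string
        grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
    done
}
```

Only do this with accounts you trust: whoever controls that GitHub account controls which keys end up on your server.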

Top level domain nonsense and how it can break your stuff

Call me old school, but I really (I mean REALLY) don’t like the recent explosion of the top level domains. I understand that most good names are taken in .com, .org, and .net zones, but do we really need all those .blue, .parts, and .yoga TLDs?

Why am I whining about all this all of a sudden? I’ll tell you why. Because a new top level domain – .aws – is about to be introduced, and it already broke something for me in a non-obvious manner.

I manage a few Virtual Private Clouds on the Amazon AWS. Many of these use and rely on some hostname naming convention (yeah, I’m familiar with the pets vs. cattle idea). Imagine you have a few servers, which are separated into generic infrastructure and client segments, like so:

bastion.aws.example.com
firewall.aws.example.com
lb.aws.example.com
web.client1.example.com
db.client1.example.com
web.client2.example.com
db.client2.example.com
… and so on.

Working with such long FQDNs (fully qualified domain names) isn’t very convenient. So add “search example.com” to your /etc/resolv.conf file and now you can use short hostnames like firewall.aws and web.client1. And life is beautiful …

… until one day, when you see the following:

[email protected]$> ssh firewall.aws
Permission denied (publickey).

And that’s when your heart misses a beat, the world freezes, and you go: “WTF?”. All kinds of thoughts are rushing through your head. Is it a typo? Am I in the right place? Did the server get compromised? How’s that for a little panic …

Trying a few things here and there, you manage to get into the server from somewhere else. You are very careful. You are looking around for any traces of the break-in, but you see nothing. You dig through the logs both on the server and off it. Still nothing. You dive into all those logwatch and cron messages in your Trash, that you were automatically deleting, because things were working fine for so long. There! You find that cron was complaining that the backup script couldn’t get into this machine. Uh-oh. This has been happening for a few days now. A black cloud of combined worry for the compromised machine and outdated backup kills the sunlight in your life. Dammit!

Take a break to calm down. Try to think clearly. Don’t panic. Stop assuming things, and start troubleshooting.

A few minutes later, you establish that the problem is not limited to that particular machine. All your .aws hosts share this headache. A few more minutes later, you learn that ‘ssh firewall.aws.example.com’ works fine, while ‘ssh firewall.aws’ still doesn’t.

That points toward a hostname resolution issue. With that, it takes only a few more moments to see the following:

[email protected]$> host firewall.aws
firewall.aws has address 127.0.53.53
firewall.aws mail is handled by 10 your-dns-needs-immediate-attention.aws.

Say what? That’s not at all what I expected. And what is it that I need to fix with my DNS? A Google search brings up this beauty:

This is problably because the .dev and .local are now valid top level extensions.

Really? Who’s the genius behind that? I thought people chose those specifically to make them internal. So is there an .aws top level extension now too? You bet there is!
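
For context: 127.0.53.53 is the special address ICANN uses during the "controlled interruption" period of a newly delegated TLD, precisely to flag name collisions like this one. That makes it easy to check for. Here is a minimal sketch, with a made-up function name, that reads lookup output from stdin:

```shell
# Succeed (exit 0) if the lookup output on stdin contains the
# name-collision marker address 127.0.53.53, fail otherwise.
# Hypothetical helper.
# Usage: host firewall.aws | is_name_collision && echo "name collision!"
is_name_collision() {
    # The guards around the address avoid matching longer strings
    # such as 127.0.53.530 or 1127.0.53.53.
    grep -qE '(^|[^0-9.])127\.0\.53\.53([^0-9.]|$)'
}
```

A check like this could sit in a monitoring script next to whatever else watches your DNS, so the next surprise TLD shows up as an alert and not as a heart attack.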

Solution? Well, as far as I am concerned, from this day onward, I don’t trust the brief hostnames anymore. It’s FQDN or nothing.
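
If typing the full names gets tiring, the expansion can at least be made explicit and kept out of the resolver's hands. A rough sketch, where the fqdn function name is hypothetical and example.com is the same placeholder domain as above:

```shell
# Expand a short host name to a fully qualified one, without relying
# on the resolver search list. Names already under DOMAIN are left alone.
# Hypothetical helper; example.com is a placeholder.
# Usage: fqdn HOST [DOMAIN]
fqdn() {
    host_name=$1
    domain=${2:-example.com}
    case $host_name in
        "$domain"|*".$domain") printf '%s\n' "$host_name" ;;            # already qualified
        *)                     printf '%s.%s\n' "$host_name" "$domain" ;;
    esac
}
```

With that in place, ssh "$(fqdn firewall.aws)" connects to firewall.aws.example.com, no matter which new top level domains appear next.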

Ansible setup for Fedora project

Real life working examples are some of the most useful things when learning a new system. The more – the better. That’s why this git repository of the Ansible setup for the Fedora project is a pure gold mine. It is large. It is complex. It covers a whole lot of things. But most importantly, it is alive and well tested.