After using Ansible for only a week, I am deeply in love. I am doing more and more with less and less, and that’s exactly how I want my automation.
Today I had to solve an interesting problem. Ansible operates based on its host and group inventory. As I mentioned before, I now always rely on FQDNs (fully qualified domain names) for my host names. But what happens when DNS wildcards come into play with things like load balancers and reverse proxies? Consider an example:
- Nginx is configured as a reverse proxy on the machine proxy1.example.com with the IP address 10.0.0.10.
- DNS wildcard is in place: *.example.com 3600 IN CNAME proxy1.example.com.
- Ansible contains proxy1.example.com in the host inventory and a playbook to set up the reverse proxy with Nginx.
- Ansible contains a few other hosts in the inventory and a playbook to set up Nginx as a web server.
- Somebody adds a new host to the inventory, another-web-server.example.com, without specifying any other host details, like the ansible_ssh_host variable. They also forget to update the DNS zone with a new A or CNAME record.
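To see why the forgotten DNS record is dangerous, here is a tiny illustrative Python sketch of how a wildcard record behaves. The `ZONE` dict and `resolve()` function are hypothetical stand-ins for real DNS resolution, not actual DNS code:

```python
# Hypothetical zone data: only these hosts have explicit records.
ZONE = {
    "proxy1.example.com": "10.0.0.10",
    "web1.example.com": "10.0.0.21",
}
# *.example.com 3600 IN CNAME proxy1.example.com
WILDCARD_TARGET = "proxy1.example.com"

def resolve(hostname: str) -> str:
    """Return the host's address, falling back to the wildcard CNAME."""
    if hostname in ZONE:
        return ZONE[hostname]
    if hostname.endswith(".example.com"):
        # Wildcard match: any other name under example.com
        # silently resolves to the proxy.
        return ZONE[WILDCARD_TARGET]
    raise KeyError(hostname)

# The host that was never added to DNS resolves to the proxy's IP:
print(resolve("another-web-server.example.com"))  # → 10.0.0.10
```

The wildcard makes the mistake invisible: name resolution succeeds, just for the wrong machine.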
Now the Ansible play for the web server configuration is executed. All previously existing machines are fine. But the new machine's host name, another-web-server.example.com, resolves to proxy1.example.com, which is where Ansible connects and runs the Nginx setup, overwriting the existing configuration, triggering a service restart, and screwing up your life. Just kidding, of course. :) It'll be trivial to find out what happened, and fixing Nginx isn't too difficult either, especially if you have backups in place. But it's still better to avoid the whole mess altogether.
To help prevent these cases, I decided to create a new safety net role. Given a variable like:
```yaml
---
# Aliased IPs is a list of hosts, which can be reached in
# multiple ways due to DNS wildcards. Both IPv4 and IPv6
# can be used. The hostname value is the primary hostname
# for the IP - any other inventory hostname having any of
# these IPs will cause a failure in the play.
aliased_ips:
  "10.0.0.10": "proxy1.example.com"
  "192.168.0.10": "proxy1.example.com"
```
And the following code in the role’s tasks/main.yml:
```yaml
---
- debug: msg="Safety net - before IPv4"

- name: Check all IPv4 addresses against aliased IPs
  fail: msg="DNS is not configured for host '{{ inventory_hostname }}'. It resolves to '{{ aliased_ips[ item.0 ] }}'."
  when: "('{{ item[0] }}' == '{{ item[1] }}') and ('{{ inventory_hostname }}' != '{{ aliased_ips[ item.0 ] }}')"
  with_nested:
    - "{{ aliased_ips | default({}) }}"
    - "{{ ansible_all_ipv4_addresses }}"

- debug: msg="Safety net - after IPv4 and before IPv6"

- name: Check all IPv6 addresses against aliased IPs
  fail: msg="DNS is not configured for host '{{ inventory_hostname }}'. It resolves to '{{ aliased_ips[ item.0 ] }}'."
  when: "('{{ item[0] }}' == '{{ item[1] }}') and ('{{ inventory_hostname }}' != '{{ aliased_ips[ item.0 ] }}')"
  with_nested:
    - "{{ aliased_ips | default({}) }}"
    - "{{ ansible_all_ipv6_addresses }}"

- debug: msg="Safety net - after IPv6"
```
the safety net is in place. The first check will connect to the remote server, get the list of all configured IPv4 addresses, and then compare each one with each IP address in the aliased_ips variable. For every matching pair, it will check whether the remote server's host name from the inventory file matches the host name from the aliased_ips value for the matched IP address. If the host names match, it'll continue. If not, a failure in the play occurs (Ansible-speak for a thrown exception). Other tasks will continue execution for other hosts, but nothing else will be done during this play run for this particular host.
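The nested comparison can be sketched in plain Python, if that reads more naturally than the Jinja2 condition above. This is an illustrative equivalent, not Ansible code; `check_host` and its arguments are hypothetical names:

```python
# Mirrors the aliased_ips variable from the role defaults.
aliased_ips = {
    "10.0.0.10": "proxy1.example.com",
    "192.168.0.10": "proxy1.example.com",
}

def check_host(inventory_hostname, all_ip_addresses):
    """Fail if any of the host's IPs is an aliased IP whose primary
    hostname differs from the inventory hostname."""
    for aliased_ip in aliased_ips:           # with_nested, first list
        for host_ip in all_ip_addresses:     # with_nested, second list
            if aliased_ip == host_ip and inventory_hostname != aliased_ips[aliased_ip]:
                raise RuntimeError(
                    f"DNS is not configured for host '{inventory_hostname}'. "
                    f"It resolves to '{aliased_ips[aliased_ip]}'."
                )

check_host("proxy1.example.com", ["10.0.0.10"])   # primary host: passes
# check_host("another-web-server.example.com", ["10.0.0.10"])  # raises
```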
The second check will do the same but with IPv6 addresses. You can mix and match both IPv4 and IPv6 in the same aliased_ips variable. And Ansible is smart enough to exclude the localhost IPs too, so things shouldn’t break too much.
I’ve tested the above and it seems to work well for me.
There is a tiny issue with elegance here, though: the host name to IP mappings are already configured in the DNS zone, so duplicating them in the aliased_ips variable seems annoying. Personally, I don't have that many reverse proxies and load balancers to handle, and they don't change too often either, so I don't mind. Also, there is something about relying on DNS while trying to protect against DNS mis-configuration that rubs me the wrong way. But if you are the adventurous type, have a look at Ansible's dig lookup, which you can use to fetch the IP addresses from the DNS server of your choice.
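For the adventurous, a hedged and untested sketch of what that might look like: the dig lookup (which requires the dnspython library on the control node) could resolve the proxy's address at play time instead of hard-coding it. The variable name `resolved_proxy_ip` is purely illustrative:

```yaml
# Untested sketch: fetch the proxy's current IP from DNS
# instead of duplicating it in aliased_ips.
- set_fact:
    resolved_proxy_ip: "{{ lookup('dig', 'proxy1.example.com') }}"
```

You would still need to map the resolved IP back to its primary hostname, so this only removes half of the duplication.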
As always, if you see any potential issues with the above or know of a better way to solve it, please let me know.