Current state of the Ansible inventory and how it might evolve

This is an introductory post about the inventory in Ansible, in which I look at the current design and implementation, including some of its internals, and where I hope to spark some discussion and ideas on how it could be improved and extended. A recent discussion at DevOpsDays Ghent last week rekindled my interest in this topic, with some people actively showing interest in participating. Part of that interest is about building a standard inventory tool with an API and frontend, similar to what Vincent Van der Kussen started (as did lots of other, often now abandoned, projects) but going much further. Of course, that exercise would be pointless without looking at which parts need to happen upstream and which parts fit better in a separate project, or not at all. That’s also why my initial call to people interested in this didn’t focus on immediately bringing that discussion to one of the Ansible mailing lists.


In its current state – at the time of writing, 2.2 was just released – the Ansible inventory hasn’t changed from how it was modelled during the early 0.x releases. This post tries to explain a bit how it was designed, how it works, and what its limits might be. I may oversimplify some details whilst focusing on the internal model and how data is kept in data structures.

[diagram: the Ansible host inventory model]

Whilst most people tend to see the inventory as just the host list and a way to group hosts, it is much more than that: parameters to each host – inventory variables – are also part of the inventory. Initially, those group and host variables were implemented as vars plugins, and whilst the documentation still seems to imply this, it hasn’t been true since a major bugfix and update to the inventory in the 1.7 release, now over two years ago; this part is now a fixed part of the ansible.inventory code. As far as I know, nobody seems to use custom vars plugins. I’d argue that managing variables – parameters to playbooks and roles – is the most important part of the inventory. Structurally, the inventory model comes down to this:

The inventory is basically a list of hosts, each of which can be a member of one or more groups; each group can in turn be a child of one or more other (parent) groups. One can define variables on each of those hosts and groups, with different values per host or group. Groups and hosts inherit variables from their parent groups.
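
As a minimal sketch of that model – all group, host and variable names here are hypothetical – an INI hosts file could look like this:

[fileservers]
srv1
srv2

[dc-brussels:children]
fileservers

[dc-brussels:vars]
ntp_server=ntp.example.com

[fileservers:vars]
nfs_exports=/srv/data

Here srv1 inherits ntp_server from dc-brussels through its fileservers membership; a host_vars/srv1 file could still override either variable for that one host.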

The inventory is mostly pre-parsed at the beginning of an Ansible run. After that, you can consider the inventory to be a set of groups in which hosts live, where each host has a set of variables, each with one specific value attached to it. A common misunderstanding, which comes up on the mailing list every now and then, is thinking that a host can have a different value for a specific variable depending on which group was used to target that host in a playbook. Ansible doesn’t work like that. Ansible takes a hosts: definition and calculates a list of hosts; in the end, which exact group was used to get to that host no longer matters. Before hosts are touched and variables are used, those variables are always flattened down to the host. Assigning different values to different groups is how you manage them, but in the end you could choose to never use group_vars, put everything manually in host_vars, and get the same result.
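
You can watch that flattening with the debug module; using the hypothetical inventory file sketched above:

ansible -i hosts srv1 -m debug -a "var=ntp_server"

Whether ntp_server was set on dc-brussels, on fileservers or in host_vars/srv1, the host ends up with exactly one value.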

Now, if a host is a member of multiple groups, and the same variable is defined in (some of) those groups, the question is: which value will prevail? Variable precedence in Ansible is quite a beast, and can be daunting to both new and experienced users. The Ansible docs overview doesn’t explain all the nifty details, and I couldn’t even find an explanation of how precedence works within group_vars. The short story is: the more specific value, further down the tree, wins. A child group wins over a parent group, and host vars always win over group vars.

When host kid is a member of a father group, and that father group is a member of a grandfather group, then kid will inherit variables from grandfather and father. Father can overrule a value from grandfather, and kid can overrule both father and grandfather if he wants. Modern family values.
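
A sketch of that inheritance in group_vars and host_vars files (hypothetical names and values):

# group_vars/grandfather.yml
dns_server: 10.0.0.1

# group_vars/father.yml
dns_server: 10.0.1.1

# host_vars/kid.yml
dns_server: 10.0.1.53

With father a child of grandfather, and kid a member of father, kid ends up with 10.0.1.53; remove the host_vars entry and it gets 10.0.1.1 from father, the deepest group that defines the variable.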

There are also two special default groups: all and ungrouped. The former is the parent of every group that is not defined as a child of another group (and hence contains every host); the latter contains all hosts that are not made a member of any group.
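
A quick way to see what ends up in those special groups (again assuming a hypothetical inventory file named hosts):

ansible -i hosts all --list-hosts
ansible -i hosts ungrouped --list-hosts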

[diagram: inventory tree showing the special all and ungrouped groups and the grandfather > father > kid hierarchy]

But what if I have two application groups, app1 and app2, which are not parent-child related, and both define the same variable? In this case, both groups live on the same level and have the same ‘depth‘. Which one prevails depends – IIRC – on the alphanumerical sorting of both names, but I’m not even sure of the details.

[diagram: groups app1 and app2 at depth 1, child group app21 at depth 2, with hosts node1 and node2]

That depth parameter is actually an internal parameter of the Group object. Each time a group is made a member of a parent group, it gets the depth of its parent + 1, unless its depth was already bigger than that newly calculated depth. The special all group has a depth of 0, app1 and app2 both have a depth of 1, and app21 gets a depth of 2. For a variable defined in all those groups, node2 will inherit the value from app21, whilst node1 will get the value from either app1 or app2, which is more or less undefined. That’s one of the reasons why I recommend not defining the same variables in multiple group “trees”, where a group tree is one big category of groups. It’s already hard to manage groups within one specific subtree whilst keeping precedence (and hence depth) in mind; it’s totally impractical to track that amongst groups from different subtrees.
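
That depth bookkeeping comes down to roughly this – a simplified sketch of what ansible.inventory.group.Group does, not the verbatim upstream code:

class Group(object):
    def __init__(self, name):
        self.name = name
        self.depth = 0            # the special 'all' group stays at depth 0
        self.child_groups = []
        self.parent_groups = []

    def add_child_group(self, group):
        if group not in self.child_groups:
            self.child_groups.append(group)
            # a child sits at least one level below its parent, but its
            # depth never decreases if it already sat deeper elsewhere
            group.depth = max(self.depth + 1, group.depth)
            group.parent_groups.append(self)

g_all = Group('all')
app1 = Group('app1')
app21 = Group('app21')
g_all.add_child_group(app1)     # app1.depth becomes 1
app1.add_child_group(app21)     # app21.depth becomes 2

When variables are flattened onto a host, the value from the deepest group wins; for groups at equal depth the tie-break is, as said, more or less undefined.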

Oh, if you want to generate a nice graphic of your inventory, to see how it lays out, Will Thames maintains a nice project (https://github.com/willthames/ansible-inventory-grapher) that does just that. As long as graphviz manages to generate a graph small enough to fit on a sheet of paper whilst remaining readable, you probably have a small inventory.

Unless you write your own application to manage all this, with a dynamic inventory script to export its data to Ansible, you are mostly stuck with the INI-style hosts files and the YAML group_vars and host_vars files to manage those groups and variables. If you need to manage lots of hosts and lots of applications, you probably end up with lots of categories to organise those groups, and then it’s very easy to lose any overview of how the groups are structured and how variables inherit across them. It becomes hard to predict which value will prevail for a particular host, which doesn’t help to ensure consistency. If you change default values in the all group whilst some hosts, but not all, have an overriding value defined in child groups, you suddenly change values for a bunch of hosts. That might be your intention, but when managing hundreds of different groups, such a mistake can easily happen when all you have are the basic Ansible inventory files.

Ansible tends (or at least often used) to recommend not doing such complicated things – “better remodel how you structure your groups” – but I think managing even moderately large infrastructures quickly puts complex demands on how you structure your data model. Different levels of organisation (subtrees) are often needed in this concept. Many of us need to manually create intersections of groups, such as app1 ~ development => app1-dev, to be able to manage different parameters for different applications in different environments. At scale this quickly becomes cumbersome with the INI and YAML file based inventory. Maybe we need a good pattern for dynamic intersections implemented upstream? Yes, there is hosts: app1:&dev, but that is parsed at run time, and you can’t assign vars to such an intersection; see the sketch below. An interesting approach is how Saltstack – which doesn’t have the notion of groups the way Ansible does – lets you target hosts using grains, a kind of dynamic groups filtering hosts based on facts. Perhaps this project does something similar?
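
To make that concrete: in a play you can target the run-time intersection directly,

hosts: app1:&dev

but to attach variables you have to materialise the intersection as a real group (hypothetical names):

[app1-dev]
node1

[app1-dev:vars]
app_debug=true

which is exactly the manual bookkeeping that becomes cumbersome at scale.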

Putting more logic into how we manage the inventory could be beneficial in several ways. Some ideas.

Besides scaling the current model with better tooling, it could for example allow writing simpler roles and playbooks, e.g. where the only lookup plugin we would need for with_ iterations is the standard with_items, as the other ones are actually used to handle more complex data and generate lists from it. That could be part of the inventory, where the real infrastructure-as-data is modelled, giving a better decoupling of code (roles) and config (inventory). Possibly a way to really describe infrastructure as a higher-level data model, describing a multi-tier application, that gets translated into specific configuration and actions at the node level.

How about tracking the source of (the value for) a variable? That would allow distinguishing between a variable that keeps inheriting a default, following changes to that default in a more generic group (e.g. the DNS servers for the DC), and a variable that is merely instantiated from a default once and never changes afterwards (e.g. the Java version development starts out with during the very first deploy).

Where are gathered facts kept? For some, just caching those as currently happens is more than enough. For others, especially for more specific custom facts – I’d call those run-time values – that have a direct relationship with inventory parameters, it might make more sense to keep them in the inventory. Think about the version of your application: you configure that as a parameter, but how do you know whether that version is also the one currently deployed? One might need to keep track of the deployed version (a run-time fact) and compare it to what was configured to be deployed (an inventory parameter).
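
A sketch of that comparison as it can be done today, assuming a hypothetical JSON fact file /etc/ansible/facts.d/myapp.fact on the node (which surfaces as ansible_local.myapp after fact gathering) and an inventory variable myapp_version holding the configured version:

- name: redeploy only when the deployed version differs from the configured one
  command: /usr/local/bin/deploy-myapp {{ myapp_version }}
  when: ansible_local.myapp.version | default('none') != myapp_version

The inventory only knows the intent; the fact reports reality, and the play reconciles the two.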

An often needed pattern when deploying clusters is the concept of a specific cluster group. I’ve seen roles acting on the master node in a cluster by conditionally running with when: inventory_hostname == groups['myapplicationcluster'][0].

How many of you are aware that the order of a group’s host list was reversed somewhere between two 1.x releases? It was initially reverse-alphanumerically ordered.

This should nicely illustrate the problem of relying on undocumented and untested behaviour. The pattern also makes you hard-code an inventory group name in a role, which I think is ugly and makes your role less portable. Doing a run_once can solve a similar problem, but what if your play targets a group of multiple clusters, instead of one specific cluster, and you need to run_once per cluster? Perhaps we need to introduce a special kind of group that can handle the concept of a cluster? Metadata on groups, hosts and their variables?
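
Within today’s model, the group name can at least be parameterised instead of hard-coded – a sketch, with cluster_group being a hypothetical inventory variable set on each cluster’s group:

[cluster1]
node1
node2

[cluster1:vars]
cluster_group=cluster1

and in the role:

- name: initialise the cluster from its first node only
  command: /usr/local/bin/cluster-init
  when: inventory_hostname == groups[cluster_group][0]

That keeps the role portable across clusters, but it still leans on the undocumented ordering of groups[...], so it works around the problem more than it solves it.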

Another pattern that is hard to implement in Ansible is what Puppet solves with its exported resources. Take a list of applications that are deployed on multiple (application) hosts, and put that in a list so we can configure a load balancer on one specific node. Or monitoring, or ACLs to access those different applications. As Ansible in the end just manages vars per host, doing things amongst multiple hosts is hard. You can’t solve everything with a delegate_to.
[slide: node collaboration with exported resources and PuppetDB]
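
What is possible today is aggregating over hostvars in a template – a sketch for a load balancer backend list, assuming a hypothetical app1 group whose hosts each define an app_port variable:

# upstreams.conf.j2
upstream app1 {
{% for host in groups['app1'] %}
    server {{ hostvars[host]['ansible_default_ipv4']['address'] }}:{{ hostvars[host]['app_port'] }};
{% endfor %}
}

This works, but only if facts for all those hosts were gathered in the same run (or are cached) – exactly the kind of implicit coupling an exported-resources-like mechanism would make explicit.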

At some point, we might want to parse the code (the roles) that will be run, to import a list of the variables that are needed, and build some front-end so the user can instantiate his node definitions and parameterise the right vars in the inventory.

How about integrating a well-managed inventory with external sources? Currently, that happens by defining the Ansible inventory as a directory, combining INI files with dynamic inventory scripts. I won’t get started on the merits of dir.py, but let’s say we really need to redesign that into something more clever that integrates multiple sources, keeps track of metadata, etc. William Leemans started something along these lines after a discussion at Loadays in 2014, implementing specific external connectors.

Perhaps as a side note, I have also been thinking a lot about versioning – versioning of the deploy code, and especially of roles. I want to be able to track the life cycle of an application, its different versions, and the life cycle and versions of its deployment. Imagine I start a business offering hosted Gitlab to multiple customers. Each of those customers potentially has multiple organisations, and hence multiple Gitlab instances. Some of them want to always run the latest version, whilst others want to remain for ages on the same enterprisey first install, and others are somewhere in between. Some might even have different DTAP environments – Gitlab might be a bad example here, as not all customers will do custom Gitlab development, but you get the idea. In the end you really have LOTS of Gitlab instances to manage. Will you manage them all with one single role? At some point the role needs development, needs testing; a new version of the role needs to be used in production for a specific customer’s instance. How can I manage these different role versions in the Ansible ecosystem? Doing that in the inventory, as the central source of information, sounds plausible.

Lots of ideas. Lots of things to discuss. I fully expect to hear about many other use cases I never thought of, or would never even need. And so as not to re-invent the wheel, insights from other tools are very welcome here!

New GPG Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1,SHA512

Date: 22 JUNE 2014

For a number of reasons[0], I've recently set up a new OpenPGP key,
and will be transitioning away from my old one.

The old key will continue to be valid for some time, but I prefer all
future correspondence to come to the new one. I would also like this
new key to be re-integrated into the web of trust. This message is
signed by both keys to certify the transition.

The old key was:

sec 1024D/0x8CC387DA097F5468 2004-07-14
Key fingerprint = 0FAC 6A6C D9D5 134C C87E 4FF3 8CC3 87DA 097F 5468

And the new key is:

sec 4096R/0xD08FC082B8E46E8E 2014-06-22 [expires: 2019-06-21]
Key fingerprint = F744 94B0 7042 6B14 BB90 D283 D08F C082 B8E4 6E8E

To fetch the full key from a public key server, you can simply do:

gpg --keyserver keys.riseup.net --recv-key 0xD08FC082B8E46E8E

If you already know my old key, you can now verify that the new key is
signed by the old one:

gpg --check-sigs 0xD08FC082B8E46E8E

If you don't already know my old key, or you just want to be double
extra paranoid, you can check the fingerprint against the one above:

gpg --fingerprint 0xD08FC082B8E46E8E

If you are satisfied that you've got the right key, and the UIDs match
what you expect, I'd appreciate it if you would sign my key. You can
do that by issuing the following command:

**
NOTE: if you have previously signed my key but did a local-only
signature (lsign), you will not want to issue the following, instead
you will want to use --lsign-key, and not send the signatures to the
keyserver
**

gpg --sign-key 0xD08FC082B8E46E8E

I'd like to receive your signatures on my key. You can either send me
an e-mail with the new signatures (if you have a functional MTA on
your system):

gpg --export 0xD08FC082B8E46E8E | gpg --encrypt -r '$your_fingerprint' --armor | mail -s 'OpenPGP Signatures' serge@vanginderachter.be

Additionally, I highly recommend that you implement a mechanism to keep your key
material up-to-date so that you obtain the latest revocations, and other updates
in a timely manner. You can do regular key updates by using parcimonie to
refresh your keyring. Parcimonie is a daemon that slowly refreshes your keyring
from a keyserver over Tor. It uses a randomized sleep, and fresh tor circuits
for each key. The purpose is to make it hard for an attacker to correlate the
key updates with your keyring.

I also highly recommend checking out the excellent Riseup GPG best
practices doc, from which I stole most of the text for this transition
message ;-)

https://we.riseup.net/debian/openpgp-best-practices

Please let me know if you have any questions, or problems, and sorry
for the inconvenience.

If you have a keybase account and if you are into it, you can also check my
keybase page[1].

Serge van Ginderachter

0. https://www.debian-administration.org/users/dkg/weblog/48
1. https://keybase.io/svg

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iKYEARECAGYFAlOm06hfFIAAAAAALgAoaXNzdWVyLWZwckBub3RhdGlvbnMub3Bl
bnBncC5maWZ0aGhvcnNlbWFuLm5ldDBGQUM2QTZDRDlENTEzNENDODdFNEZGMzhD
QzM4N0RBMDk3RjU0NjgACgkQjMOH2gl/VGh5QgCdE2dKZly+MECXFfH0WCje9Rpo
/HoAoL+6jQ15wWq0FMrisRx24dX5OtOeiQJ8BAEBCgBmBQJTptOoXxSAAAAAAC4A
KGlzc3Vlci1mcHJAbm90YXRpb25zLm9wZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRG
NzQ0OTRCMDcwNDI2QjE0QkI5MEQyODNEMDhGQzA4MkI4RTQ2RThFAAoJENCPwIK4
5G6OnB0P/jw77jLrcLlP6GUXTrfui1Pbrk/W6hysplKHPzh53OoVJB0Bq6NARlOz
yptDwRW2LivFNz2M9FxObij+2CDGQ/FoOWdWlKatg9bvqhkQwglpMFNyGDQ/EOxV
a2ObFOySsGU2hXnYvVSUUCc1SMt5M3RKw3264ZsxqIda8o2lqF7ZO9qDijY7peHy
Ll0aPPxlFiqUjN0Q5P4PzoQcWbHDFLDO1Mm+P52gyod/Rh0PrWKOk2kwMEHFBwUd
tgi2jT+W4wv7yAOdvrIwiRpdqAM4be9MPDmXDjYrEHsJrKwqkXfDRRV53ZRFo8f3
bKXSnAV0i2svIEOscNWHhNrpmk5iqzyvr5CeJse7nEjXAP7HntTxPFIvWs2c3dvt
HItslcDcU2ZIrCh3rIi+fv7pcjX6JE/A0CzZkTo294wnGexoIRiRcC7wojS5e3PV
v83NZPRBz7tpPVQaMP74UiXvQpTm2GEiIXYtFkyZFEtyxwfEOY8L50QpMAJ0HXPm
7xH+XIaCcBljgeVoP0VlUecGW6aJubTryNTUimIBUnL7ItWjNLl7uJtDlGdjsOZV
QVgpQ6G3Tx8lDp+qo4SD4YI8zoWK59Ef9MUCSJn3ngWI0dG5jElONqOOY/W1zcyA
ce2wJs8ua79HJV/GXiadtlSCJpG8XfanyvhrvePSCp9O/5mZLnWs
=evlZ
-----END PGP SIGNATURE-----

Packt Publishing Ansible Configuration Management review

Around late November 2013, I – too – got contacted by Packt Publishing, asking me to review Ansible Configuration Management. I was a bit surprised, as I had declined their offer to write that book, which they had made exactly two months earlier. Two months seemed like a short period of time to write a book and get it published.

Either way, I kind of agreed, got the book as a PDF, printed it out, started some reading, lent it to a colleague (we use Ansible extensively at work), and just recently got it back so I could finish having a look at it.

“Ansible Configuration Management” is an introductory book for beginners. I won’t introduce Ansible here; there are a lot of good resources on that, just duck it. Ansible, being relatively new, has evolved quite a bit over the previous year, releasing 1.4 by the end of November. The current development cycle focuses more on bug fixes and under-the-hood work, and less on new syntax, which was quite the opposite when going from 0.9 through 1.2, up to the then current 1.3.

Knowing what major changes would get into 1.3 was easy when you followed the project. One of the major changes is the syntax for variables and templates: basically, don’t use $myvar or ${othervar} any more, only use {{ anicevar }}. If you know Ansible, you know this is an important thing. I was very disappointed to notice the author didn’t stress this. Whilst most examples use the new syntax, at one point all syntaxes are presented as equally valid – which was correct for the then latest 1.3, but it was well known at the time that the older forms would be deprecated.
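
For illustration, the same template line in both styles (using the variable names from above):

# deprecated pre-1.2 style:
$myvar and ${othervar}
# the 1.2+ style the book should have standardised on:
{{ myvar }} and {{ othervar }}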

Of course, a tech book on a rapidly evolving open source tool will always be somewhat outdated by the time it gets published. But that should be expected, and a good book on such a subject should of course focus on the most recent possible release, but also try to mention the newer features that are to be expected. Especially for a publisher that also focuses on open source.

A quirk shows up when code snippets are discussed. Some of the longer snippets are printed across more than one page, and the book refers to certain line numbers – which is confusing, and even unreadable, when the snippets don’t have line numbers. Later in the book, line numbers are sometimes used, but not in a very standard way:

 

[screenshot: code snippet with oddly placed line numbers]

Whilst most of this book has a clear layout at first sight, things like this don’t feel very professional.

This book gives a broad overview and discusses several basic things in Ansible. It goes from basic syntax, over inventory, small playbooks and extended playbooks, and also mentions custom code. It gives lots of examples, and discusses special variables, modules, plugins… and many more. Not all of them, but that is not needed, given the very good documentation the project publishes. This book is an introduction to Ansible, so focusing on the big principles is more important at this point than having a full inventory of all features. As it’s a relatively short book (around 75 pages), it’s small enough to be appealing as a quick introduction.

It’s a pity the publisher and the author didn’t pay more attention to detail. The less critical reader, with little to no previous Ansible experience, will however get a good enough introduction from this book, with some more hand-holding and overview than what can easily be found freely online.

Git and Github: keeping a feature branch updated with upstream?

Git and github, you gotta love them for managing and contributing to (FLOSS) projects.

Contributing to a Github-hosted project becomes very easy: fork the project to your personal Github account, clone your fork locally, create a feature branch, make some patch, commit, push back to your personal Github account, and issue a pull request from your feature branch to the upstream (master) branch.


git clone -o svg git@github.com:sergevanginderachter/ansible.git
cd ansible
git remote add upstream git://github.com/ansible/ansible.git
git checkout -b user-non-unique
vi library/user
git add library/user
git commit -m "Add nonunique option to user module, translating to the -o/--non-unique option to useradd and usermod."
git push --set-upstream svg user-non-unique
[go to github and issue the pull request]

Now, imagine upstream (1) doesn’t approve your commit and asks for a further tweak, and (2) you need to pull in newer changes (upstream changes that were committed after you created your feature branch).

How do we keep this feature branch up to date? Merging the newest upstream commits is easy, but you want to avoid creating a merge commit, as that won’t be appreciated when pushed to upstream: you would effectively be re-submitting upstream changes through your branch. This is especially important because those merged commits would be reflected in your Github pull request when you push the updates to your personal Github feature branch (even if you do that after you issued the pull request).

That’s why we need to rebase instead of merging:


git checkout devel   # devel is ansible's HEAD aka "master" branch
git pull --rebase upstream devel
git checkout user-non-unique
git rebase devel

Both the rebase option and the rebase command keep your tree clean and avoid merge commits. But keep in mind that it is your first commits (with which you issued your pull request) that are being rebased: they now have new commit hashes, different from the original hashes that are still in your remote Github repo branch.

Now, pushing those updates out to your personal Github feature branch will fail, as both branches differ: the local branch tree and the remote branch tree are “out of sync” because of those different commit hashes. Git will tell you to first git pull --rebase and then push again, but this won’t be a simple fast-forward push, as your history got rewritten. Don’t do that!

The problem is that you would again fetch your original commits as they were, and they would get merged on top of your local branch. Because of the out-of-sync state, this pull does not apply cleanly, and you get a b0rken history in which your commits appear twice. If you then push all of this to your Github feature branch, those changes will be reflected in the original pull request, which will get very, very ugly.

AFAIK, there is actually no totally clean solution to this. The best solution I found is to force-push your local branch to your Github branch (actually forcing a non-fast-forward update):

As per git-push(1):

Update the origin repository’s remote branch with local branch, allowing non-fast-forward updates. This can leave unreferenced commits dangling in the origin repository.

So don’t pull, just force push like this:

git push svg +user-non-unique
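
The leading + on the refspec is what forces the update; it is the per-branch equivalent of:

git push --force svg user-non-unique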

This will plainly overwrite your remote branch with everything in your local branch. The commits that were in the remote branch (and caused the failure) will remain there, but as dangling commits, which will eventually get deleted by git-gc(1). No big deal.

As I said, this is AFAICS the cleanest solution. The downside is that your PR will be updated with those newest commits, which get a later date and can appear out of order in the comment history of the PR. No big problem, but potentially confusing.

bash redirection target gets funky

Can anybody explain to me how this funky behaviour in bash works?

find /root  >output 2>error 3

Yes, that’s just “error” followed by a space followed by “3”.

serge@goldorak:~/tmp$ ls -l
total 8
-rw-rw-r-- 1 serge serge 71 Aug 30 13:57 error 3
-rw-rw-r-- 1 serge serge 6 Aug 30 13:57 output
serge@goldorak:~/tmp$

Let’s create a file with a space in it:

serge@goldorak:~/tmp$ touch "test 1"
serge@goldorak:~/tmp$ ls -l
total 8
-rw-rw-r-- 1 serge serge 71 Aug 30 13:57 error 3
-rw-rw-r-- 1 serge serge 6 Aug 30 13:57 output
-rw-rw-r-- 1 serge serge 0 Aug 30 13:58 test 1
serge@goldorak:~/tmp$

using bash completion I get:

serge@goldorak:~/tmp$ ls -l test 1
-rw-rw-r-- 1 serge serge 0 Aug 30 13:58 test 1
serge@goldorak:~/tmp$ ls -l error 3
-rw-rw-r-- 1 serge serge 71 Aug 30 13:57 error 3
serge@goldorak:~/tmp$

It seems the space in “error 3” is not a space but some other character?
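
One way to check that hypothesis is to dump the exact bytes of the filenames – these are standard tools, though I haven’t run them against the actual directory:

ls | cat -A    # GNU cat: makes non-printing characters visible
ls | od -c     # or dump every byte; a plain space shows up as ' '

A UTF-8 non-breaking space, for instance, would show up in od’s output as the two bytes 302 240 instead of a single space.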

The Linux-Training Project: Linux Training v2 released

As announced in February, new versions of the Linux training courses were being (re)written by Paul.

I’m pleased to announce that v2 has been merged into the master branch on github.

If you want to test it or just check it out:


git clone git://github.com/linuxtraining/lt.git
cd lt
git submodule init
git submodule update
./make.sh
./make.sh build fundamentals

Feedback is welcome, by mail at contributors@linux-training.be or via a github issue.

If you prefer to just download the latest books in PDF format, check out the download page. These are nightly builds from the master branch.

The Linux-Training Project: New Books

You may have noticed that since February 2011 there have been no updates to the free PDFs. That is because we have been redesigning the DocBook source code to accommodate smaller books (we call them minibooks). This mandates a rewrite of the build system and a refitting of the .xml files and directories. This is all being done in the experimental git branch.

We also take the minibook redesign as an opportunity to rewrite and improve the content of each subject, which makes this a time-consuming process. The good news is that some minibooks should be coming online soon (say July 2011). We plan to finish at least the Linux Fundamentals (big) book by the end of July (it will consist of six or seven minibooks).