VEXXHOST has started to open source the tooling we use to provision reliable infrastructure. Read on to find out more about yum-mirror-tools, our first release!
One of the more complicated issues when running a large number of servers at scale, whether bare metal or virtual machines, is ensuring a consistent set of packages across your entire fleet. This is usually not a problem at the start of a deployment; however, as new servers are added or old ones are updated over time, you end up with a very inconsistent set of packages.
This creates a maintenance nightmare. For example, you can end up with groups of servers behaving differently due to different kernel versions (some older, some newer), or worse, hitting unexpected behaviour such as a bad driver that crashes the system, which leads to confusion when it happens across servers with the same hardware but different kernels.
What's Out There
There are already a few tools out there to manage package versions, with package promotion and other fancy features. Our requirement is very simple and basic, so all of those added features didn't make sense for our deployments; we go over the alternatives briefly here. It's also important to note that we build on top of the CentOS ecosystem, so there may be other solutions available for Debian-based deployments. We use open source tooling and software only, which rules out any paid or proprietary solutions.
- Katello: This is the base for the newer releases of Red Hat Satellite, as Spacewalk is no longer the base for new versions. In our case, Katello brings in a lot of extra tooling (it builds on top of Foreman) which is not necessary for our deployments.
- Pulp: This is actually the underlying tool which Katello uses for all of its package management. The biggest issue for us was that it seemed to require an agent and a whole subscription-model setup to operate. We keep our hosts as lean as possible, so an extra agent/tool that doesn't have to be there isn't necessary.
Ours seemed like a simple problem to solve (and one that likely covers other use cases too). We needed an uncomplicated tool that lets us capture a consistent snapshot of package metadata at any moment, so that the hosts consuming that metadata always get the exact same versions of packages when they update. This means that once all of your systems are updated, your package manager will report that there are no packages to update, even if new packages have since been released upstream, until you cut a new "release" of the package set.
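To make the idea concrete, here is a minimal sketch of the underlying snapshot concept using the standard reposync and createrepo tools rather than the project's own scripts; the mirror path, repository name and release naming below are illustrative assumptions, not the project's actual layout.

```bash
#!/bin/bash
# Illustrative sketch only: snapshot a yum mirror so that clients see a
# frozen set of package metadata. Paths and naming are assumptions.
set -euo pipefail

MIRROR=/srv/mirror            # hypothetical mirror root
RELEASE=$(date +%Y%m%d)       # e.g. 20190501

# Pull down the latest packages for the "base" repo (yum-utils reposync).
reposync --repoid=base --download_path="$MIRROR/pool"

# Snapshot the current state via hardlinks, so keeping many releases is cheap.
mkdir -p "$MIRROR/releases/$RELEASE"
cp -al "$MIRROR/pool/base" "$MIRROR/releases/$RELEASE/base"

# Generate repodata for the snapshot; this metadata never changes again,
# so clients pointed at it will always resolve the exact same versions.
createrepo "$MIRROR/releases/$RELEASE/base"

# "Promote" the snapshot by repointing the "latest" symlink.
ln -sfn "releases/$RELEASE" "$MIRROR/latest"
```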
What We've Built
We've built a simple set of tools, hosted in our GitHub account as yum-mirror-tools, which allows you to do exactly that. All you need to do is configure the set of yum repositories that you need to mirror, and it takes care of updating the mirrors (using the sync-packages.sh command), releasing a new set of packages (using the create-release.sh command) and promoting a release to the latest version (using the promote-repo.sh command), as sketched below.
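The day-to-day workflow might look like the following. This is a hedged sketch that assumes the scripts run without arguments and read their repository list from the project's configuration; check the repository's README for the exact invocation.

```bash
#!/bin/bash
# Hypothetical operator workflow for yum-mirror-tools; the exact
# arguments and configuration location are assumptions, see the README.

# 1. Pull the newest packages from upstream into the local mirror.
./sync-packages.sh

# 2. Cut an immutable, tagged release from the current mirror state.
./create-release.sh

# 3. After the release has been validated in staging, promote it so that
#    hosts pointing at "latest" pick it up on their next update.
./promote-repo.sh
```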
Only once you actually promote a release (assuming your servers point to latest) will your servers update to the new set of packages. In addition, once you have a tagged release, you can point your staging environment at it directly and test the updates there first.
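On the client side, that split could look something like the following pair of yum repo files, where staging pins one tagged release and production tracks whatever has been promoted to latest; the mirror hostname and URL layout here are illustrative assumptions.

```bash
# Hypothetical /etc/yum.repos.d/mirror.repo on a STAGING host: pinned to
# a single tagged release, so the update being tested is fully reproducible.
cat > /etc/yum.repos.d/mirror.repo <<'EOF'
[base]
name=Mirrored base (pinned release)
baseurl=https://mirror.example.com/releases/20190501/base/
gpgcheck=1
EOF

# Hypothetical equivalent on a PRODUCTION host: follows whichever release
# has most recently been promoted to "latest".
cat > /etc/yum.repos.d/mirror.repo <<'EOF'
[base]
name=Mirrored base (latest promoted release)
baseurl=https://mirror.example.com/latest/base/
gpgcheck=1
EOF
```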
There should be absolutely no surprises, because you're going from one tested, functional set of packages to another; the upgrade behaviour will therefore be exactly the same in your staging and production environments. Without this solution, the upgrade can differ between environments simply because the package sets themselves differ.
This simple set of tools solves a (fairly) complicated problem in a simple fashion. It doesn't solve it for everyone, but it might well help you. We're more than happy to receive even small contributions back to the project, and we'll continue to maintain it in the open. We're going to slowly start adding more projects that solve problems for running infrastructure at scale, alongside posts like this one, to share more about how we deliver reliable and secure infrastructure at scale.