The Glorious Journey to a New Version of Openstack

As you might expect, upgrading Openstack is a pain in the ass. In theory, it is very simple, but any software that has a more complicated upgrade process than “[package manager of your choice] upgrade” is less than ideal to begin with.

An ACM alumnus who works with Openstack professionally has said that he would reserve approximately a week for an Openstack major version upgrade. You have been warned! That being said, when we did this it wasn’t too terrible and only took a day.

The ACM currently runs: Openstack Juno.

Openstack Upgrade Documentation and other references

Before starting, you should read http://docs.openstack.org/ops-guide/ops-upgrades.html and possibly other documents available from the Openstack documentation, like the Release Notes: https://wiki.openstack.org/wiki/ReleaseNotes/Juno. The Upgrade Guide walks you through the basic procedure; this document covers some specific things about our setup you should know and offers advice from sysadmins who have done the Openstack upgrade before.

The Openstack docs once had specific version-to-version guides with the actual commands you needed to run, but I cannot seem to find them anymore. Various other articles on the internet document this; https://access.redhat.com/articles/1169003 might be helpful, and so might https://www.chriscowley.me.uk/blog/2015/08/11/upgrade-openstack-from-juno-to-kilo/.

Back Up EVERYTHING

Seriously, we mean it. Ensure that the Openstack databases, generated Openstack configuration, and anything else relevant are backed up. Even if it all gets backed up nightly by AFS, I would err on the side of paranoia and do a manual backup before proceeding. You might even want to just mirror the root disks of gomes and friends, although that might not be necessary.
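
As a rough sketch of the manual config backup, the function below tars up whichever configuration directories exist on a node. The function name, the archive naming, and the directory list in the usage note are all assumptions for illustration, not something our setup already has.

```shell
#!/bin/sh
# Sketch: archive Openstack config directories before an upgrade.
# backup_configs is a hypothetical helper; paths are assumptions.

# backup_configs DEST DIR...
# Writes DEST/openstack-config-<timestamp>.tar.gz containing every DIR
# that exists, and prints the archive path on success.
backup_configs() {
    dest=$1
    shift
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/openstack-config-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    existing=""
    for dir in "$@"; do
        # Skip services that are not installed on this node.
        if [ -d "$dir" ]; then
            existing="$existing $dir"
        fi
    done
    if [ -z "$existing" ]; then
        echo "nothing to back up" >&2
        return 1
    fi
    # $existing is deliberately unquoted so it splits into separate paths;
    # this sketch therefore does not handle paths containing spaces.
    tar -czf "$archive" $existing && echo "$archive"
}
```

On the controller this might be called as “backup_configs /root/pre-upgrade /etc/nova /etc/neutron /etc/keystone /etc/glance”. If the databases live in MySQL (an assumption; check the service configs), “mysqldump --all-databases --single-transaction > all-databases.sql” is one common way to dump them alongside the archive.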

Test First?

If you want, you can set up a test Openstack instance (even inside Openstack!) and test the upgrade before actually doing it on the production cluster. We did not do so when upgrading Icehouse to Juno and everything went okay, but it would be the safer way to do things.

The Debian Problem

Generally, you should upgrade from Openstack N to Openstack N+1 in a (futile) attempt to minimize breakage. While Openstack supports upgrading multiple releases in one go, I wouldn’t recommend it.

Most ACM servers currently run Debian Jessie, where the last available version of Openstack is Icehouse. The current version, Mitaka, is what’s available in Stretch and Jessie Backports. This is a lot of versions to upgrade in one go, and Mitaka may no longer have the database migrations for upgrading from Icehouse. How, then, is someone meant to upgrade a server from Debian Jessie to Debian Stretch when running Openstack?

Well, according to this old tutorial: https://wiki.debian.org/OpenStackHowto/Upgrades, the answer is to use the repositories on archive.gplhost.com (http://archive.gplhost.com/debian/pool/), where the Debian Openstack team maintains repositories of each Openstack release for older Debian versions. Unfortunately, these repositories don’t necessarily contain the latest minor release of each major Openstack version: the Juno repos contain only .2 although the last Juno patch release was .4, and the Kilo repositories don’t contain any patch releases.

Also, when we did the Icehouse to Juno upgrade, we discovered that the Juno repository metadata was not in sync with the actual deb files present in the directory. You may need to yell at the Debian Openstack team if you encounter similar issues with these packages.

Upgrading With Configuration Management

Our Openstack cluster was originally set up using Puppet to generate and manage the Openstack configuration. If Puppet or Ansible or similar software is currently being used to manage Openstack, it should just be a matter of upgrading to the newer version’s Puppet or Ansible modules and telling Puppet or Ansible to do its thing.

However, as of this writing, we have stopped using Puppet because the Puppet Openstack modules for Openstack Icehouse were extremely buggy, referred to other modules by the wrong name, and had a number of local patches that past sysadmins had applied manually. Thus you cannot do this, and should refer to the next section:

Upgrading Without Configuration Management

This is slightly more involved, because we don’t have software that will automatically do database migrations and upgrade the configuration files.

You should read the release notes for the version of Openstack you are attempting to upgrade to, and the “Upgrade Guide” if you can find one. These notes will hopefully cover any major issues and also tell you what has changed in the format of the configuration files (the packages will _not_ automatically upgrade them, so it is your responsibility to fix deprecated syntax).

You should then find the right Debian repository for the upgrade on archive.gplhost.com. I recommend using these repositories unless upgrading to the version in current Debian stable (see notes above). You will need to ensure that the packages are also built for the version of Debian you’re on; for Juno, for example, there are “juno”, “juno-backports”, and “jessie-juno-backports”, where the first is only for Wheezy and the latter two are for Jessie.

Once you’ve found the right repositories, edit /etc/apt/sources.list.d/openstack.list and update it to point at them. This file needs adjusting on every Openstack node: currently, that is gomes (the controller and network node) and antonio, enrique, and serrao (the compute nodes). Its current (as of Openstack Juno) contents look something like this:

deb http://archive.gplhost.com/debian/ juno-backports main
deb http://archive.gplhost.com/debian/ jessie-juno-backports main
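
Since the same file has to change on every node, a small helper like the one below might save some typing. It is only a sketch: write_openstack_sources is a made-up name, and it assumes the repositories for other releases follow the same “RELEASE-backports” and “jessie-RELEASE-backports” naming as the Juno ones, which you should confirm by browsing archive.gplhost.com first.

```shell
#!/bin/sh
# Sketch: rewrite the Openstack apt sources for a given release on Jessie.
# write_openstack_sources is a hypothetical helper, and the suite naming
# for releases other than Juno is an assumption.

# write_openstack_sources RELEASE [FILE]
write_openstack_sources() {
    release=$1
    file=${2:-/etc/apt/sources.list.d/openstack.list}
    # Keep the previous file so the change is easy to revert.
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak" || return 1
    fi
    cat > "$file" <<EOF
deb http://archive.gplhost.com/debian/ ${release}-backports main
deb http://archive.gplhost.com/debian/ jessie-${release}-backports main
EOF
}
```

Running “write_openstack_sources juno” as root would reproduce the two lines shown above and leave the old file in openstack.list.bak.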

At this point, I would shut down the compute nodes and upgrade the controller (apt-get update and apt-get dist-upgrade), and wait.

Assuming all packages install successfully, you can now proceed to apply database migrations to each service, upgrade their config files, and (re)start their daemons (though see “Restarting Services” for a note on that). The Openstack Upgrade Guide lists the order you should apply database migrations in.

Generally the command to do the database migrations is something like “nova-manage db sync” or “neutron-db-manage … upgrade kilo”; they vary from service to service. In newer Openstack releases, there is a unified command (“openstack”) which might just be able to do all of them.
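
A minimal sketch of stepping through the migrations one service at a time follows. The list of services and their order here are assumptions; take the real order from the Upgrade Guide and trim the list to what is actually installed. neutron-db-manage is left out on purpose, because its arguments depend on which config files and target release you are using. run_migrations and the DRY_RUN variable are made up for this example.

```shell
#!/bin/sh
# Sketch: print or run the per-service database migrations in sequence.
# The service list and its order are assumptions, not our verified setup.
MIGRATIONS='keystone-manage db_sync
glance-manage db_sync
nova-manage db sync
cinder-manage db sync'

# With DRY_RUN=1 (the default) the commands are only printed.
run_migrations() {
    echo "$MIGRATIONS" | while IFS= read -r cmd; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: $cmd"
        else
            echo "running: $cmd"
            # Stop at the first failure so later services are not migrated
            # after an earlier one has already broken. stdin is redirected
            # so the command cannot swallow the rest of the list.
            $cmd </dev/null || exit 1
        fi
    done
}
```

Calling “run_migrations” prints the plan; “DRY_RUN=0 run_migrations” actually applies it and returns non-zero as soon as one migration fails.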

Once you’re finished upgrading the controller, upgrade nova-compute on the compute nodes too, updating their configuration files if appropriate.
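
The compute-node step could be looped over the three hosts roughly as below. This is a sketch under assumptions: it presumes root SSH access to each node and that openstack.list has already been updated there; upgrade_compute_nodes and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch: upgrade nova-compute on each compute node over SSH.
# Assumes root SSH access and already-updated apt sources on every node.
COMPUTE_NODES="antonio enrique serrao"

# With DRY_RUN=1 (the default) the ssh commands are only printed.
upgrade_compute_nodes() {
    cmd="apt-get update && apt-get install --only-upgrade nova-compute"
    for host in $COMPUTE_NODES; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "ssh root@$host '$cmd'"
        else
            # Stop at the first node that fails rather than carrying on.
            ssh "root@$host" "$cmd" || return 1
        fi
    done
}
```

Config file changes on the compute nodes still have to be made by hand after the packages are upgraded.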

When Things Go Wrong

First, don’t panic: you made a backup, so you can always roll back.

Don’t be afraid to ask for help, either in #jhuacm or at admins@, where people who have done this before will probably be happy to give you whatever advice they can, or even from Openstack upstream. If it’s a packaging problem you can yell at the Debian maintainers of Openstack too.

Restarting Services

The Openstack daemons need to be restarted in a specific order. There are a lot of them and nobody in the ACM knows what that order actually is.

However, the init system does. So when you’re all finished I recommend just rebooting gomes. It will restart all the Openstack daemons in the right order.

Testing the Upgrade

Make sure it’s still possible to start, stop, and schedule VMs once you’re all done. Also make sure that a user can

You might need to cherry-pick patches from minor releases of Openstack that aren’t available in the archive.gplhost repositories.

Assuming it all worked, declare success!