The Tao of Ops: Managing installed server packages

Every Operations Team needs to maintain the system packages installed on their servers. There are various paths toward that goal, with one extreme being to track the packages manually - a tedious, soul-crushing endeavor even if you automate it using Puppet, Fabric, Chef, or (our favorite at &yet) Ansible.

Why? Because even when you automate, you have to be aware of what packages need to be updated. Automating "apt-get upgrade" will work, yes - but you won't discover any regression issues (and related surprises) until the next time you cycle an app or service.

A more balanced approach is to automate the tedious aspects and let the Operations Team handle the parts that require a purposeful decision. How the upgrade step is performed, via automation or manually, is beyond the scope of this brief post. Instead, I'll focus on the first step: how to gather data that can be used to make the required decisions.

Gathering Data

The first step is to find out what packages need to be updated. To do that we will use the operating system's package manager. For the purposes of this post I'll use the apt utility for Debian/Ubuntu and yum for Red Hat/CentOS.

apt-get -s dist-upgrade
yum list updates

Apt will return output that looks like this:

Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
  libxfixes-dev
The following packages will be upgraded:
  base-files openssl tzdata
3 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Inst base-files [6.5ubuntu6.7] (6.5ubuntu6.8 Ubuntu:12.04/precise-updates [amd64])
Conf base-files (6.5ubuntu6.8 Ubuntu:12.04/precise-updates [amd64])
Inst tzdata [2014c-0ubuntu0.12.04] (2014e-0ubuntu0.12.04 Ubunt
Inst openssl [1.0.1-4ubuntu5.14] (1.0.1-4ubuntu5.17 Ubuntu:12.04/precise-security [amd64])

Yum will return output that contains:

Updated Packages
audit.x86_64       2.2-4.el6_5       rhel-x86_64-server-6
audit-libs.x86_64  2.2-4.el6_5       rhel-x86_64-server-6
avahi-libs.x86_64  0.6.25-12.el6_5.1 rhel-x86_64-server-6

Both of these tools provide the core data we need: package name and version. Apt even gives us a clue that it's a security update - the presence of "-security" in the repo name. I imagine that yum can also provide that; I just haven't found the proper command-line argument to use.
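The yum listing above is simple enough to parse as well. Here is a minimal sketch, assuming every entry fits on one line with three whitespace-separated columns as in the sample (yum may wrap long entries across lines, which this sketch simply skips). The function name parse_yum_updates is hypothetical and is not part of the tool shared later in this post.

```python
def parse_yum_updates(text):
    """Parse the 'Updated Packages' section of `yum list updates` output.

    Returns a dict mapping package name to its arch, new version and repo.
    """
    packages = {}
    in_section = False
    for line in text.splitlines():
        if line.strip() == 'Updated Packages':
            in_section = True
            continue
        if not in_section:
            continue
        fields = line.split()
        if len(fields) != 3:
            # Blank or wrapped line; this sketch does not try to rejoin them.
            continue
        # The first column is "<name>.<arch>"; split on the last dot.
        name, dot, arch = fields[0].rpartition('.')
        if not dot:
            name, arch = fields[0], ''
        packages[name] = {'arch': arch, 'new': fields[1], 'repo': fields[2]}
    return packages


# Sample taken from the yum output shown above.
sample = """Updated Packages
audit.x86_64       2.2-4.el6_5       rhel-x86_64-server-6
audit-libs.x86_64  2.2-4.el6_5       rhel-x86_64-server-6
avahi-libs.x86_64  0.6.25-12.el6_5.1 rhel-x86_64-server-6
"""

for name, info in sorted(parse_yum_updates(sample).items()):
    print(name, info['new'], info['repo'])
```

Unlike the apt output, this listing only carries the new version, so the currently installed version would have to come from somewhere else.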

The Next Step

Having this data is still not enough - we need to gather, store, and then process it. To that end I'll share a small Python program that parses the output from apt so the data can be stored. At &yet we use etcd for storage, but any backend data store will suffice. Processing the data for each server is the second step of our path - reducing the firehose of data into actionable parts that can then be carried along the path for the next step.

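#!/usr/bin/env python
import json
import datetime
import subprocess
import etcd

# `uname -n` prints the node name; strip the trailing newline before using it in a key
hostname = subprocess.check_output(['uname', '-n'], universal_newlines=True).strip()
ec       = etcd.Client(host='127.0.0.1', port=4001)
normal   = {}
security = {}
# universal_newlines=True returns text instead of bytes, so split('\n') also works on Python 3
output   = subprocess.check_output(['apt-get', '-s', 'dist-upgrade'], universal_newlines=True)
for line in output.split('\n'):
    # upgrade lines look like: Inst <name> [<old version>] (<new version> <repo> [<arch>])
    if line.startswith('Inst'):
        items   = line.split()
        pkgName = items[1]
        if items[2].startswith('['):
            oldVersion = items[2][1:-1]
            newVersion = items[3][1:]
        else:
            # newly installed packages have no [<old version>] field
            oldVersion = None
            newVersion = items[2][1:]
        # security updates come from a repo whose name contains "-security"
        if '-security' in line:
            security[pkgName] = { 'old': oldVersion, 'new': newVersion }
        else:
            normal[pkgName] = { 'old': oldVersion, 'new': newVersion }
data = { 'timestamp': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
         'normal': normal,
         'security': security,
       }
# one key per server, e.g. /packages/<hostname>
key = '/packages/%s' % hostname
ec.write(key, json.dumps(data))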

When you run this on each server, etcd ends up with one entry per server (stored under /packages/<hostname>) listing the packages that need updating.

The remaining steps along the path are now attainable because the groundwork is done - for example, you can write other cron jobs to scan that list, check the timestamp, and produce a report for all servers that need updates. Heck, you can even use your trusty Ops Bot to generate an alert in your team chat channel if a server has gone more than a day without being checked or having a security update applied.
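As an illustration of that kind of follow-up job, here is a rough sketch of the reporting logic. It assumes the entries have already been read out of the data store into a dict of hostname to JSON string, in the format written by the program above. The function name servers_needing_attention, the one-day threshold, and the sample hosts are all made up for the example; it also flags any pending security update regardless of how long it has been pending, which is a simplification.

```python
import datetime
import json

TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'  # same format the collector writes


def servers_needing_attention(entries, now, max_age=datetime.timedelta(days=1)):
    """Return {hostname: [reasons]} for servers that need a closer look.

    entries maps hostname -> JSON string with 'timestamp', 'normal'
    and 'security' keys.
    """
    report = {}
    for host, raw in entries.items():
        data = json.loads(raw)
        checked = datetime.datetime.strptime(data['timestamp'], TIMESTAMP_FORMAT)
        reasons = []
        if now - checked > max_age:
            reasons.append('not checked since %s' % data['timestamp'])
        if data['security']:
            reasons.append('%d security update(s) pending' % len(data['security']))
        if reasons:
            report[host] = reasons
    return report


# Hypothetical entries, shaped like the JSON the collector stores.
entries = {
    'web1': json.dumps({'timestamp': '2014-07-02 09:00:00', 'normal': {},
                        'security': {'openssl': {'old': '1.0.1-4ubuntu5.14',
                                                 'new': '1.0.1-4ubuntu5.17'}}}),
    'db1': json.dumps({'timestamp': '2014-06-30 09:00:00', 'normal': {},
                       'security': {}}),
    'app1': json.dumps({'timestamp': '2014-07-02 10:00:00',
                        'normal': {'tzdata': {'old': '2014c-0ubuntu0.12.04',
                                              'new': '2014e-0ubuntu0.12.04'}},
                        'security': {}}),
}

now = datetime.datetime(2014, 7, 2, 12, 0, 0)
report = servers_needing_attention(entries, now)
for host in sorted(report):
    print(host, '; '.join(report[host]))
```

Keeping the logic in a plain function that takes a dict means the same check can be fed from etcd, a file, or whatever store you use, and the output can go to a report or a chat bot.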

The point is this - if you're not monitoring, you are guessing. The tool above enables you to monitor your installed package environment and that's the first step along the many varied paths toward mastering your server environments.
