LXC and Docker DIY

For things that almost, but don’t quite, need their own real machine, we have Docker and LXC for containerization.

LXC

Templates

Hey look, Linux evolved into a microkernel. If you’re feeling like doing a DIY LXC application container setup, here’s a rough sketch of a lot of the pieces.

Create a UID and GID map entry

This is straightforward, though coming up with numbers is up to you:

echo root:1000000:10 >> /etc/subuid
echo root:1000000:10 >> /etc/subgid

Those fields are “base” and “count”, so these lines allocate UIDs (and GIDs) 1000000 through 1000009. You might consider putting matching entries in /etc/passwd and /etc/group, too.

I tend to keep a UIDMAPS file next to the containers just to index containers to sub-id regions. Note that allocating 10 UIDs for each jail allows the use of UIDs 0 through 9 in sensible ways internally, which may require creating /etc/passwd and /etc/group inside the container just so programs don’t get jumpy.
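If you want to see at a glance which host UIDs a given entry covers, a few lines of shell will do the arithmetic; the entries below are sample data, not a real /etc/subuid:

```shell
#!/bin/sh
# Print the host UID range covered by each user:base:count entry.
# Sample input lines; pipe in /etc/subuid itself for the real thing.
printf '%s\n' 'root:1000000:10' 'root:1000010:10' |
while IFS=: read -r user base count; do
  echo "$user: $base-$((base + count - 1))"
done
```

For the sample input this prints root: 1000000-1000009 and root: 1000010-1000019, which is a quick way to spot overlapping allocations.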

Make a Minimal Root

mkdir -p /path/to/root/filesystem
cd /path/to/root/filesystem
mkdir bin dev etc home lib lib64 opt proc root sbin sys tmp usr var
cd dev
mkdir mqueue pts shm
mknod console c 5 1
mknod full c 1 7
mknod null c 1 3
mknod ptmx c 5 2
mknod random c 1 8
mknod urandom c 1 9
mknod zero c 1 5
cd ..
cp /etc/{localtime,resolv.conf} etc
mkdir var/tmp
chmod 1777 var/tmp tmp
chown -R 1000000:1000000 .
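If you want to double-check the major/minor numbers used above, the host’s own /dev is a handy reference; stat’s %t and %T print the device numbers in hex (all single digits here):

```shell
#!/bin/sh
# Cross-check the mknod numbers against the host's /dev.
# e.g. /dev/null is character device 1:3, /dev/zero is 1:5.
for d in null zero full random urandom; do
  stat -c '%n: %t %T' "/dev/$d"
done
```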

Template LXC Configuration File

lxc.utsname = template
lxc.rootfs = /path/to/root/filesystem
lxc.network.type = none
lxc.pts = 1024
lxc.id_map = u 0 1000000 10
lxc.id_map = g 0 1000000 10
lxc.mount.entry = /lib lib none ro,bind 0 0
lxc.mount.entry = /bin bin none ro,bind 0 0
lxc.mount.entry = /usr usr none ro,bind 0 0
lxc.mount.entry = /sbin sbin none ro,bind 0 0
lxc.mount.entry = /lib64 lib64 none ro,bind 0 0
lxc.mount.entry = tmpfs /dev/shm tmpfs defaults 0 0
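If the service needs a writable area of its own on top of the read-only system mounts, an extra bind entry along the same lines works; the host path and target here are purely illustrative:

```
lxc.mount.entry = /path/to/servicename/data var/lib/service none rw,bind 0 0
```

As with the other entries, a target path without a leading slash is taken relative to lxc.rootfs, and the directory must already exist on both sides (chowned into the container’s sub-UID range).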

Wrap your service

Write a shell script along these lines:

#!/bin/sh
exec >/dev/null 2>&1
exec lxc-execute -n servicename -f /path/to/config.conf -- \
  /path/to/service args

You can, of course, just run “bash” to get an interactive prompt in the container environment.
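The exec-redirection trick in that wrapper is worth noting on its own: redirecting the script’s descriptors before the final exec means the service inherits them. A standalone sketch of the same pattern, logging to a file instead of /dev/null (the log path is arbitrary):

```shell
#!/bin/sh
# Same pattern as the wrapper above: redirect first, then run the
# service; both stdout and stderr land in the log it inherited.
log=/tmp/service.demo.log
sh -c "exec > $log 2>&1; echo started on stdout; echo and stderr >&2"
cat "$log"
```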

We tend to use different -n arguments for different processes running inside the same container filesystem (e.g. for the KDC container on Chicago, we have “-n kdc_krb5kdc” and “-n kdc_kpropd” with appropriate invocations).

It’s also worth noting that you can either put a keytab in the container’s filesystem and work with that, or, if you’re feeling paranoid, you can grab a PAG before running lxc-execute and get the AFS tokens within all the same. That is:

k5start -f /etc/krb5.keytab -U -t -k KEYRING:session -- \
  lxc-execute -n testing -f /r/lxc/template.conf -- \
  keyctl show

will show both an afs_pag key and a full Kerberos keyring. There’s not much point to that, perhaps, as the TGT (though not the keytab) is available to the container via the keyring, but

k5start -f /etc/krb5.keytab -U -t -k /tmp/krb5cc_for_service -- \
  lxc-execute -n testing -f /r/lxc/template.conf -- \
  keyctl show

is a lot more interesting: the inner process has no idea that Kerberos is involved in getting its AFS rights and has no access to key material.

Specific Examples

Running OpenAFS within an LXC container is documented (initially by one of us) at http://wiki.openafs.org/InstallingOpenAFSinLXC/ .