Venti and AFS

Overview

On Chicago, we run a program called Venti for all our long-term archives. It’s a content-addressable block store, allowing for efficient storage of slowly-changing contents like homedirs and service backups. Or at least, that’s the theory.

In any case, the basic automation pipeline is something like this:

  • On a relatively frequent schedule, AFS volumes are released to Chicago.

  • On a less frequent schedule, these volumes are dumped (with ‘vos dump’), processed, and archived. The full procedure is in fact:

    • For each volume on Chicago’s archival partition, vos dump feeds data to a rabinsplit program, which generates a series of files that are then vac-ed into Venti.
    • The rabinsplit program helps us with deduplication by recovering from non-block-sized insertions into the dump files. See its source for details, but the net result is that we use it to produce a directory of files, x/00000000, x/00000001, etc., which, when concatenated in name-ascending-sorted order, yield the stream produced by vos dump.
    • These directories are then fed to the Venti archival tool, vac, in a way that causes it to produce ‘archive files’. These are in Chicago’s /mnt/vicepa.dump directory and are simply pointers into the Venti store.
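
The naming and reassembly invariant described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: split(1) (GNU options) stands in for rabinsplit and cuts at fixed offsets rather than content-defined boundaries, random bytes stand in for the vos dump stream, and the work directory is a throwaway temp directory rather than anything on Chicago.

```shell
# Stand-in for the chunking step. split(1) cuts at fixed offsets, whereas
# rabinsplit chooses content-defined boundaries; only the file naming and
# the reassembly property are illustrated here.
work=$(mktemp -d)

# Hypothetical stand-in for a `vos dump` stream: 1 MiB of random bytes.
head -c 1048576 /dev/urandom > "$work/stream"

mkdir "$work/x"
# GNU split: -d numeric suffixes, -a 8 eight-character suffixes, so the
# pieces are named x/00000000, x/00000001, ...
split -d -a 8 -b 65536 "$work/stream" "$work/x/"

# Concatenating the pieces in name-ascending order must reproduce the stream.
find "$work/x" -type f | sort | xargs cat > "$work/rebuilt"
cmp "$work/stream" "$work/rebuilt" && echo "round trip ok"
```

With fixed-offset splitting, a one-byte insertion near the start of the stream shifts every later piece, so none of them deduplicate; avoiding that is the job rabinsplit does in the real pipeline.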

The automation is overseen by AFS BOS; venti is overseen by runit so that the entire AFS subsystem may be restarted without affecting Venti.

Looking at or restoring an archive file

The unvac tool can show us what’s inside an archive file. Find a .vac file and list its contents:

/usr/local/plan9/bin/unvac -h "tcp\!localhost\!venti" -t $VACFILE

The contents will be files named YYYY/MMDD/NNNNNNNN. Pick the particular YYYY/MMDD you want and extract it:

/usr/local/plan9/bin/unvac -h "tcp\!localhost\!venti" $VACFILE YYYY/MMDD

Then it’s simply a matter of feeding the extracted files, in order, to vos restore:

find YYYY/MMDD -type f | sort -n | xargs cat | vos restore ...

(The use of find | sort | xargs is because we may create archives whose list of files would overflow the maximum command line length; xargs manages that for us by running cat in batches. find is given the directory itself rather than a YYYY/MMDD/* glob, because the shell would expand the glob onto find’s own command line and run into the same limit.)
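
To see why splitting the file list across several cat invocations is harmless, here is a small sketch that needs neither Venti nor AFS. The 2024/0131 directory and the chunk contents are made up for illustration, and -n 2 is used only to force xargs to batch; in real use xargs picks the batch size itself.

```shell
work=$(mktemp -d)
mkdir -p "$work/2024/0131"      # hypothetical YYYY/MMDD directory
for i in 0 1 2 3 4; do
    printf 'chunk-%s\n' "$i" > "$work/2024/0131/0000000$i"
done

# -n 2 makes xargs run cat three times (2 + 2 + 1 files), mimicking what
# happens when the real list exceeds the command line limit. The batches
# run one after another, so the concatenated output stays in sorted order.
restored=$(cd "$work" && find 2024/0131 -type f | sort -n | xargs -n 2 cat)
printf '%s\n' "$restored"
```

The output lists chunk-0 through chunk-4 in order, which is the property the restore pipeline relies on.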

Restoring from archive without nuking an existing volume

If a user has asked for an emergency restore but does not want their home directory clobbered, consider creating, mounting, and restoring a new volume for them. Something like:

vos create $SERVER $PARTITION recover.$USER
fs mkm ~$USER/acmsys/recover recover.$USER.readonly
find YYYY/MMDD -type f | sort -n | xargs cat | vos restore $SERVER $PARTITION recover.$USER -readonly

And then ‘vos remove’ the volume (not the partition) when the user has gotten their files back; the mount point can go at the same time with ‘fs rmmount’.
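
Since these commands are typed under pressure, it can help to print them for review before running anything. The following is a minimal dry-run sketch, not an existing tool: plan_recover is a hypothetical helper, and the user, server, partition, and date arguments in the example call are placeholders. It only echoes the commands from this section and never touches AFS.

```shell
# Dry-run helper (hypothetical): prints the recovery commands instead of
# running them, so they can be checked before being pasted into a shell.
plan_recover() {
    user=$1
    server=$2
    partition=$3
    dumpdir=$4                  # extracted YYYY/MMDD directory
    vol="recover.$user"
    echo "vos create $server $partition $vol"
    echo "fs mkm ~$user/acmsys/recover $vol.readonly"
    echo "find $dumpdir -type f | sort -n | xargs cat | vos restore $server $partition $vol -readonly"
}

# Placeholder arguments, for illustration only.
plan=$(plan_recover alice afs1.example.org a 2024/0131)
printf '%s\n' "$plan"
```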

Manually inserting a dump into the archive

The easiest thing to do is to vos release ${VOLUMENAME}

Todo

And then what?