Venti Archival Server

Extracting metadata from Venti

From time to time we want to gather information about our Venti setup. The commands below are examples meant to be adapted. They assume VHTTP points at Venti's HTTP interface; on Chicago, for example, we would use:

VHTTP=http://localhost:17035
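
If you would rather not retype the wget flags each time, a small wrapper helps. This is only a sketch: venti_get is a made-up name, not part of Venti, and it assumes nothing beyond the wget flags already used on this page. It refuses to run when VHTTP is unset or empty.

```shell
# venti_get is a hypothetical helper, not a Venti tool.
# Fetch one page from Venti's HTTP interface and write it to stdout.
# The ${VHTTP:?...} expansion aborts with a message if VHTTP is unset or empty
# (inside a script, that ends the script).
venti_get() {
  wget -q -O - "${VHTTP:?set VHTTP first, e.g. http://localhost:17035}/$1"
}
```

With that defined, "venti_get storage" and "venti_get index" stand in for the longer wget invocations.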

Extract just a storage summary:

wget -q -O - $VHTTP/storage

Filter the index dump to show human-readable dates instead of raw timestamps (Perl golfed a bit by as_df of #cslounge; credit where it’s due):

wget -q -O - $VHTTP/index \
  | perl -pe 's/(created|modified)=\K\d+/localtime$&/ge'

Sorted list of the differences between creation and modification times of sealed arenas, printed as human-readable durations:

wget -q -O - $VHTTP/index \
   | perl -n -e 'if (/created=(\d+) modified=(\d+).*sealed.*/) { print $2-$1, "\n"; }' \
   | sort -n \
   | perl -p -e 'BEGIN{ use Time::Duration; } $_ = duration($_)."\n";'
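
Time::Duration is a CPAN module rather than part of core Perl, so the last stage of that pipeline may fail on a fresh machine. As a fallback, here is a rough pure-shell formatter. fmt_duration is a hypothetical name, and its output format (days, hours, minutes, seconds) is my own choice, not what Time::Duration prints.

```shell
# fmt_duration is a hypothetical helper, not a drop-in for Time::Duration.
# Print a whole number of seconds as "Nd HHh MMm SSs".
fmt_duration() {
  printf '%dd %02dh %02dm %02ds\n' \
    $(($1 / 86400)) \
    $(($1 % 86400 / 3600)) \
    $(($1 % 3600 / 60)) \
    $(($1 % 60))
}
```

To use it, replace the final perl stage with: while read -r s; do fmt_duration "$s"; done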

Sanity checking

A simple way to validate a VAC file (if there’s any output, be alarmed):

/usr/local/plan9/bin/venti/copy -m 'tcp!localhost!venti' 'tcp!localhost!venti' `cat $FILE`
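
For unattended checks, the "any output is bad" rule can be turned into an exit status. The helper below is a sketch with a made-up name, expect_silent. It also treats a non-zero exit status as a failure, which is an assumption on my part rather than something stated above.

```shell
# expect_silent is a hypothetical helper.
# Run the given command; succeed only if it prints nothing (stdout or stderr)
# and exits with status 0. Otherwise show what it printed and return 1.
expect_silent() {
  out=$("$@" 2>&1)
  rc=$?
  if [ -n "$out" ] || [ "$rc" -ne 0 ]; then
    printf 'be alarmed: %s exited %d and printed:\n%s\n' "$1" "$rc" "$out" >&2
    return 1
  fi
  return 0
}
```

Prefix the copy command above with expect_silent and test the result in a cron job or script.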

The contents of a vac file can be investigated with:

/usr/local/plan9/bin/unvac -h 'tcp!localhost!venti' -t -v $FILE

(Though note that, because we tend to dump rabinsplit AFS dumps into Venti, you’re only going to get two levels of meaningful structure (YYYY/MMDD[.NTH]) before it’s just a pile of numbers.)

Adding storage to Venti

Because you’re mucking about with the Venti archival store itself, please make sure that you have made off-machine backups AND an on-machine ZFS snapshot. Taking one is easy: “zfs snapshot z/venti/arenas@`date +%Y%m%d`”. You can always destroy it later (though its overhead is likely minimal!).

Create a new arena file (the size here is somewhat arbitrary):

(cd arenas; dd if=/dev/zero of=${NEXT_ARENA_NUMBER} bs=1 count=0 seek=68719476736)
(cd arenas; /usr/local/plan9/bin/venti/fmtarenas arenas${NEXT_ARENA_NUMBER} ${NEXT_ARENA_NUMBER})
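
The seek value passed to dd above is 64 GiB written out in bytes; with bs=1 and count=0, dd just extends the file to that offset without writing data. If you want a different size, it is less error-prone to compute the number than to type it. A minimal sketch, with gib_to_bytes as a hypothetical helper name:

```shell
# gib_to_bytes is a hypothetical helper: convert whole GiB to bytes.
# Needs a shell with 64-bit arithmetic (bash, or dash on a 64-bit system).
gib_to_bytes() {
  echo $(($1 * 1024 * 1024 * 1024))
}
```

For example, seek=$(gib_to_bytes 64) gives the arena size used here, and gib_to_bytes 5 gives the 5368709120 used for index sections further down.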

Stop the venti service, update the configuration, and rebuild the index in append mode:

sv stop /etc/service/venti

echo "arenas /venti/arenas/${NEXT_ARENA_NUMBER}" >> venti.conf
/usr/local/plan9/bin/venti/fmtindex -a venti.conf

sv start /etc/service/venti
sleep 5
sv stat /etc/service/venti
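
The same stop, change, start, check sequence comes back in the index and Bloom filter procedures further down, so it may be worth wrapping. This is a sketch only: with_venti_stopped is a made-up name, and leaving the service stopped when the maintenance command fails is my own design choice (so that you can investigate before Venti touches the store again), not a rule from this page.

```shell
# with_venti_stopped is a hypothetical helper.
# Stop venti, run the given command, and restart venti only if it succeeded.
with_venti_stopped() {
  sv stop /etc/service/venti || return 1
  if ! "$@"; then
    echo "maintenance command failed; venti left stopped" >&2
    return 1
  fi
  sv start /etc/service/venti
  sleep 5                       # give the service a moment to come up
  sv stat /etc/service/venti
}
```

For the step above, that would be: with_venti_stopped /usr/local/plan9/bin/venti/fmtindex -a venti.conf (after appending the new arenas line to venti.conf).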

Adding index space to Venti (or rebuilding the index)

This step will take notably longer than just adding storage. Add a new index section (try to keep the total index size at roughly 10% of the total arena storage):

(cd index; dd if=/dev/zero of=${NEXT_INDEX_NUMBER} bs=1 count=0 seek=5368709120)
(cd index; /usr/local/plan9/bin/venti/fmtisect ${NEXT_INDEX_NUMBER} ${NEXT_INDEX_NUMBER})
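
To see how much index the ~10% rule of thumb asks for, a quick calculation is enough. The sketch below assumes all arenas are the same size, which may not hold on every setup; index_target_gib is a hypothetical helper name.

```shell
# index_target_gib is a hypothetical helper.
# Arguments: number of arenas, size of each arena in GiB (assumed equal).
# Prints one tenth of the total arena size in GiB, rounded up.
index_target_gib() {
  echo $((($1 * $2 + 9) / 10))
}
```

With eight 64 GiB arenas, index_target_gib 8 64 prints 52, so about eleven of the 5 GiB index sections created above would be needed to reach that target.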

Stop the venti service, update the configuration, and rebuild the index from scratch:

sv stop /etc/service/venti

echo "isect ${NEXT_INDEX_NUMBER}" >> venti.conf

for i in index/*; do /usr/local/plan9/bin/venti/fmtisect $i $i; done
/usr/local/plan9/bin/venti/fmtindex venti.conf

/usr/local/plan9/bin/venti/buildindex -b venti.conf

sv start /etc/service/venti
sleep 5
sv stat /etc/service/venti

Growing the Bloom Filter

Venti uses a Bloom filter to help it quickly decide that a block is new, saving it a trip to the index. Occasionally, this Bloom filter needs to grow and be recomputed. Something like this should work:

sv stop /etc/service/venti

dd if=/dev/zero of=bloom bs=1024 count=${NEW_BLOOM_BLOCKS}

/usr/local/plan9/bin/venti/fmtbloom bloom
/usr/local/plan9/bin/venti/buildindex -b venti.conf

sv start /etc/service/venti
sleep 5
sv stat /etc/service/venti