ZFS on Linux

ZFS is a fantastic filesystem originally developed by Sun. It’s unusual compared to other filesystems in that it combines a filesystem and a logical volume manager, which gives you great flexibility, features and performance. It supports things like integrated snapshots, native NFSv4 ACLs and end-to-end data integrity checking.

I’m now running an HP ProLiant MicroServer N36L, which is a small NAS unit containing a 4-bay SATA enclosure. It has a low-performance AMD CPU, and comes with 1GB of RAM and a 250GB hard disk. I’ve upgraded mine to 4GB of RAM and 4 x 2TB Seagate Barracuda drives.

The benefit of these units is that they’re standard x86 machines, so you can easily install any OS you like. They’re also really cheap and often have cash-back promotions.

I bought mine when I was in the UK and brought it back with me to Australia. I waited until I got back to upgrade it, to save myself the trouble of shipping the extra hard disks.

In this post, I’ll document how to easily install ZFS on Debian Wheezy and some basic ZFS commands you’ll need to get started.


UPDATE: ZFS on Linux now has its own Debian Wheezy repository! http://zfsonlinux.org/debian.html

Install the ZFS packages

# apt-get install debian-zfs

This should use DKMS to build some new modules specific to your running kernel and install all the required packages.

Pull the new module into the kernel
# modprobe zfs

If all went well, you should see that spl and zfs have been loaded into the kernel.


Prepare disks

ZFS works best if you give it full access to your disks. I’m not going to run ZFS on my root filesystem, so this makes things much simpler.

Find our ZFS disks. We use the disk IDs instead of the standard /dev/sdX naming because it’s more stable.
# ls /dev/disk/by-id/ata-*
lrwxrwxrwx 1 root root 9 Jan 21 19:18 /dev/disk/by-id/ata-ST2000DM001-1CH164_Z1E1GYH5 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jan 21 08:55 /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1E2ACRM -> ../../sda
lrwxrwxrwx 1 root root 9 Jan 21 08:55 /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1F1SHN4 -> ../../sdb

Create partition tables on the disks so we can use them in a zpool:
# parted /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1E2ACRM mklabel gpt
# parted /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1F1SHN4 mklabel gpt
# parted /dev/disk/by-id/ata-ST2000DM001-1CH164_Z1E1GYH5 mklabel gpt
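With several identical disks, the labelling step is easy to script. A minimal sketch using the disk IDs above — the echo makes it a dry run so you can eyeball the commands first:

```shell
# Print the parted command for each disk; drop 'echo' to run for real.
for disk in ata-ST2000DM001-9YN164_Z1E2ACRM \
            ata-ST2000DM001-9YN164_Z1F1SHN4 \
            ata-ST2000DM001-1CH164_Z1E1GYH5; do
    echo parted "/dev/disk/by-id/${disk}" mklabel gpt
done
```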


Create a new pool

ZFS uses the concept of pools in a similar way to how LVM would handle volume groups.

Create a pool called mypool, with the initial member being a RAIDZ composed of the remaining three drives.
# zpool create -m none -o ashift=12 mypool raidz \
    /dev/disk/by-id/ata-ST2000DM001-1CH164_Z1E1GYH5 \
    /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1E2ACRM \
    /dev/disk/by-id/ata-ST2000DM001-9YN164_Z1F1SHN4

RAIDZ is a little like RAID-5. I’m using RAID-Z1, meaning that in a 3-disk pool I can lose one disk without losing access to the data.

NOTE: Unlike conventional RAID, once you build your RAIDZ you cannot grow it by adding individual disks. It’s a long story.

The -m none means that we don’t want to specify a mount point for this pool yet.

The -o ashift=12 forces ZFS to use 4K sectors instead of 512-byte sectors. Many new drives use 4K sectors, but report 512-byte sectors to the OS for ‘compatibility’ reasons. My first ZFS filesystem used 512-byte sectors, and write performance was shocking (~10Mb/s).

See http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives for more information about it.

# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
mypool  5.44T  1.26T  4.18T    23%  1.00x  ONLINE  -

Disable atime for a small I/O boost
# zfs set atime=off mypool

Deduplication is probably not worth the CPU overhead on my NAS.
# zfs set dedup=off mypool

Our pool is now ready for use.


Create some filesystems

Create our documents filesystem, mount and share it by NFS
# zfs create mypool/documents
# zfs set mountpoint=/mnt/documents mypool/documents
# zfs set sharenfs=on mypool/documents

Create our photos filesystem, mount and share it by NFS
# zfs create mypool/photos
# zfs set mountpoint=/mnt/photos mypool/photos
# zfs set sharenfs=on mypool/photos

Photos are important, so keep two copies of them around
# zfs set copies=2 mypool/photos

Documents are really important, so we’ll keep three copies of them on disk
# zfs set copies=3 mypool/documents

Documents are mostly text, so we’ll compress them.
# zfs set compression=on mypool/documents


ZFS pools should be scrubbed at least once a week. A scrub reads back all the data in the pool, verifies the checksums and repairs any data integrity errors it finds.
# zpool scrub <pool>

To do automatic scrubbing once a week, set the following line in your root crontab
# crontab -e
30 19 * * 5 zpool scrub <pool>
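One thing to watch: cron runs jobs with a minimal PATH that often doesn’t include the sbin directories where zpool lives, so it’s worth setting it explicitly. A sketch of the full crontab entry (pool name assumed to be mypool):

```shell
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Scrub every Friday at 19:30
30 19 * * 5 zpool scrub mypool
```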

Coming soon is a follow-up to this post with some disk fail/recovery steps.

Posted in Linux at October 7th, 2013. 9 Comments.

Automating Debian installs with Preseeding

Following on from my post about building Debian virtual machines with libvirt, I’ve now got automated installations of Debian Lenny using the preseeding method. Coupled with virt-install, I can have a Debian virtual machine installed in only a few minutes. No questions asked.

The virt-install command takes an extra-args argument, where you can fill in the machine-specific parts of the preseeding. I don’t want to set an IP address in the file as it’s going to be used to build lots of machines, so I just specify that at install time. The url part is where our preseed config file is stored. This obviously means that the machine needs to be able to contact the webserver at install time to download the config.

$ NAME=debian-test
virt-install --name=${NAME} \
--ram=512 --file=/var/lib/xen/images/${NAME}.img \
--file-size 8 \
--nographics \
--paravirt \
--network=bridge:br0 \
--location=http://mirrors.uwa.edu.au/debian/dists/lenny/main/installer-i386 \
--extra-args="auto=true interface=eth0 hostname=${NAME} domain=vpac.org netcfg/get_ipaddress= netcfg/get_netmask= netcfg/get_gateway= netcfg/get_nameservers= netcfg/disable_dhcp=true url=http://webserver/preseed.cfg"

To get an idea of the contents of the preseed config file, the best place to start is the Debian stable example preseed file. It lists lots of different options, with plenty of comments so you can understand what’s going on.

To get a fully-automated install, I used these options. It’s fairly standard, but definitely worth reading the comments about each line.

$ egrep -v "(^#|^$)" preseed.cfg
d-i debian-installer/locale string en_AU
d-i console-keymaps-at/keymap select us
d-i netcfg/choose_interface select eth0
d-i netcfg/disable_dhcp boolean true
d-i netcfg/dhcp_options select Configure network manually
d-i netcfg/confirm_static boolean true
d-i mirror/protocol string http
d-i mirror/country string manual
d-i mirror/http/hostname string mirrors.uwa.edu.au
d-i mirror/http/directory string /debian
d-i mirror/http/proxy string
d-i clock-setup/utc boolean true
d-i time/zone string Australia/Melbourne
d-i clock-setup/ntp boolean true
d-i clock-setup/ntp-server string ntp.vpac.org
d-i partman-auto/method string regular
d-i partman-lvm/device_remove_lvm boolean true
d-i partman-md/device_remove_md boolean true
d-i partman-lvm/confirm boolean true
d-i partman-auto/choose_recipe select atomic
d-i partman/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i passwd/make-user boolean false
d-i passwd/root-password-crypted password [MD5 Sum of the password]
tasksel tasksel/first multiselect standard
d-i pkgsel/include string openssh-server vim puppet
popularity-contest popularity-contest/participate boolean false
d-i grub-installer/only_debian boolean true
d-i grub-installer/with_other_os boolean false
d-i finish-install/reboot_in_progress note
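The [MD5 Sum of the password] placeholder above is an MD5-crypt hash, not a plain md5sum of the password. One way to generate it — assuming you have openssl available; mkpasswd from the whois package works too:

```shell
# Generate an MD5-crypt hash suitable for passwd/root-password-crypted.
# 'sEcReT' is an example password and 'ab12cd34' a fixed salt so the
# output is reproducible here; omit -salt to get a random one.
openssl passwd -1 -salt ab12cd34 sEcReT
```

Paste the resulting `$1$...` string in place of the placeholder.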


Posted in Geek, Linux, Work at September 18th, 2009. 3 Comments.

How do you clone an LVM partition?

It’s actually more difficult than you might think. From the bit of googling that I did, it seems that you can’t just ‘clone’ an LVM logical volume while it’s running.

One method I found was to use the ‘snapshot’ feature of LVM to create a ‘frozen image’ copy of the logical volume, which is then suitable for copying to a new logical volume, while leaving the original intact.

Here’s our original logical volume that we want to clone.

# lvdisplay
--- Logical volume ---
LV Name                /dev/vg/host-disk
VG Name                vg
LV UUID                UK1rjH-LS3l-f7aO-240S-EwGw-0Uws-5ldhlW
LV Write Access        read/write
LV Status              available
# open                 1
LV Size                9.30 GB
Current LE             2382
Segments               1
Allocation             inherit
Read ahead sectors     0
Block device           254:0

Let’s now create our snapshot logical volume. For the size, it should only need 10 – 20% of the original, as the snapshot only has to store blocks that change on the original volume while it exists.

# lvcreate --size 2G --snapshot --name host-disk-snap /dev/vg/host-disk

Let’s take a look at our new volume

# lvdisplay
--- Logical volume ---
LV Name                /dev/vg/host-disk
VG Name                vg
LV UUID                UK1rjH-LS3l-f7aO-240S-EwGw-0Uws-5ldhlW
LV Write Access        read/write
LV snapshot status     source of /dev/vg/host-disk-snap [active]
LV Status              available
# open                 1
LV Size                9.30 GB
Current LE             2382
Segments               1
Allocation             inherit
Read ahead sectors     0
Block device           254:0
--- Logical volume ---
LV Name                /dev/vg/host-disk-snap
VG Name                vg
LV UUID                9zR5X5-OhM5-xUI0-OolP-vLjG-pexO-nk36oz
LV Write Access        read/write
LV snapshot status     active destination for /dev/vg/host-disk
LV Status              available
# open                 1
LV Size                9.30 GB
Current LE             2382
COW-table size         2.00 GB
COW-table LE           512
Allocated to snapshot  0.01%
Snapshot chunk size    8.00 KB
Segments               1
Allocation             inherit
Read ahead sectors     0
Block device           254:5

From the output, you should be able to see that we’ve now got some snapshot fields shown in our output. We’ll create another logical volume, which will be our final target for our new virtual machine.

# lvcreate --size 10G --name newhost-disk vg

With our source and target partitions ready to go, we need to begin copying the data. You have some choices here, depending on your setup.

If your source and target partitions are the same size, you could use dd, or even xfs_copy if you’re using XFS.

In my case, I wanted the new target partition to be smaller than the original. If you want a different size, or a different filesystem, the only real way to do it is to copy the files.

We’ll need to make the new filesystem on our target partition

# mkfs.xfs /dev/vg/newhost-disk

and mount our filesystems

# mkdir /mnt/host-disk-snap
# mount -o ro /dev/vg/host-disk-snap /mnt/host-disk-snap
# mkdir /mnt/newhost-disk
# mount /dev/vg/newhost-disk /mnt/newhost-disk

I wasn’t sure how changes to the snapshot filesystem would affect the original, so I thought I’d stay on the safe side and mount it read-only.

Change into the source filesystem

# cd /mnt/host-disk-snap

Using a mix of find and cpio, copy the files

# find . -mount -print | cpio -pdm /mnt/newhost-disk

Wait a few minutes, depending on your filesystem size, and you’re done.

When you’re satisfied, you can just use lvremove to remove your snapshot volume.

# umount /mnt/host-disk-snap
# lvremove /dev/vg/host-disk-snap

After all that, you should finally have a cloned filesystem to use. I’m sure there’s an easier way, but this worked for me.

Posted in Linux at October 17th, 2008. 11 Comments.

Mastering MythTV MPEG2 files to DVD

MythTV is a wonderful project, which I’ve been running for nearly 12 months. In the 2007 AFL season, I was downloading the Geelong games from the internet. Unfortunately, you’re at the mercy of whoever encoded the game, in terms of quality. With Geelong winning the premiership last season, most of the games were broadcast on free-to-air TV, and I was recording them all using MythTV, with the intention of putting them on DVD later.

I tried using MythArchive, but it was clumsy and didn’t give me enough control over the DVD creation process. I wanted the video to fit exactly on a single-sided DVD, with minimal loss of quality, so I had to do the process manually.

The problem I found when encoding MythTV’s MPEG2 files was that the video and audio would always end up out of sync. The broadcast was around 3 hours long, so by the end, the sound could be more than a second behind the video. I think this is due to little pieces of data missing from the broadcast.

To fix the broadcasting issues, use ProjectX to split the video and audio
java -jar /usr/share/java/ProjectX.jar -demux -out <your tmp directory> -name <your output filename prefix> <input mythtv recording file>

You’ll see some output like this:

demuxing DVB MPEG-TS file 1010_20080802192500.mpg
!> PID 0x0 (PAT) (0 #1) -> ignored
!> PID 0x100 (PMT) (188 #2) -> ignored
ok> PID 0x200 has PES-ID 0xEA (MPEG Video) (376 #3)
ok> PID 0x240 has PES-ID 0xBD (private stream 1) (TTX)  (5264 #29)
ok> PID 0x28B has PES-ID 0xBD (private stream 1) (20680 #111)
ok> PID 0x28A has PES-ID 0xC0 (MPEG Audio) (84412 #450)
-> video basics: 720*576 @ 25fps @ 0.7031 (16:9) @ 9000000bps, vbvBuffer 95
-> starting export of video data @ GOP# 0
!> dropping useless B-Frames @ GOP# 0 / new Timecode 00:00:00.000
6 %!> PID 0x200 -> packet 2687051 @ pos. 505165400 out of sequence (15/11) (shifting..) (~00:12:34.480)
!> PID 0x200 -> packet 2687616 @ pos. 505271620 out of sequence (1/14) (shifting..) (~00:12:34.960)
12 %!> PID 0x200 -> packet 5943135 @ pos. 1117309192 out of sequence (12/6) (shifting..) (~00:27:44.080)
16 %!> PID 0x200 -> packet 7860852 @ pos. 1477839988 out of sequence (8/14) (shifting..) (~00:36:38.320)
17 %!> PID 0x200 -> packet 8427938 @ pos. 1584452156 out of sequence (6/10) (shifting..) (~00:39:16.720)

Once this is done, multiplex the Video and Audio files back together again
mplex -f 9 -o <your output filename> <video file>.m2v <audio file>.mp2
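The demux and remux steps are easy to wrap in a small script. A sketch — the recording name, output prefix and tmp directory are just examples, and the echoes make it a dry run:

```shell
#!/bin/sh
# Dry-run wrapper for the ProjectX demux + mplex remux steps.
# Drop the echoes to run for real.
input="1010_20080802192500.mpg"   # the MythTV recording
prefix="match"                    # output filename prefix (example)
tmpdir="/tmp/demux"

echo java -jar /usr/share/java/ProjectX.jar -demux -out "$tmpdir" -name "$prefix" "$input"
echo mplex -f 9 -o "${prefix}-clean.mpg" "${tmpdir}/${prefix}.m2v" "${tmpdir}/${prefix}.mp2"
```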

This should give us a nice clean MPEG2 file, which we can then cut all the ads out of. Rather than using something like nuvexport and marking the ads in MythTV itself, I found it much easier to use Avidemux.

Load the video into Avidemux, then

  • Use the arrow keys and mousewheel to find cutting spots.
  • Use the A_> and <_B buttons to mark the start and finish of ads
  • Hit the delete key to cut the ads

Once you’re done, I always save a project file, in case Avidemux crashes.

You can then load the ‘DVD’ profile from the Auto -> DVD menu. Leave the ratios at 1:1.

Hit the calculator button, select Format: MPEG and Medium: DVD5. Hit apply, and then close.

Hit the configure button underneath the video part on the left. Configure the video output to Interlaced (I used TFF, but prob doesn’t matter) and 16:9 Aspect ratio.

Then, just save the new encoded video. On my AMD 3200+ machine, it takes somewhere around 5-6 hours to encode about 2.5 hours of video.

For me, the Avidemux encoding sometimes failed in the 2nd pass at one of my defined cutpoints. To work around this, I opened the output video and skipped to the end of the file to find which cutpoint the encoding failed at. I would then open my project file in Avidemux again and remove a few extra frames either side of the original cut. This was usually enough to get it over the line.

Once this is done, use DeVeDe to create the DVD ISO file.

When adding the newly encoded MPEG PS file, make sure you open the properties of the file, click ‘Advanced Options’, and in the Misc tab select the checkbox ‘This file is already a DVD/xCD-suitable MPEG-PS file’. Don’t worry if DeVeDe tells you the file is 106%; it will still fit on a 4.7GB DVD.

Once this is done, you should have a nice ISO image which you can burn straight to disc.

Posted in Geek, Linux at August 31st, 2008. 2 Comments.

Finally getting everything to work on Gutsy

I had everything working quite well on my Macbook Pro (Core 2 Duo) with Ubuntu Feisty, but due to some badness from the shitty, shitty ATI fglrx driver, I couldn’t use Gutsy with everything working.

The new kernel in Gutsy moved from the SLAB allocator to the SLUB allocator. (Btw, I have no idea what that actually is..). So, this meant that I couldn’t put the machine into Suspend mode while using the ATI drivers. Although the RadeonHD driver works quite well, it means that I can’t actually play BZFlag.

To solve this, I ended up compiling my own custom kernel for Ubuntu, and switching it to use the SLAB allocator. I thought it would be appropriate to blog it here in case anyone else is interested.

I followed most of the instructions from the MacBook Pro page on the Ubuntu Wiki, but I built a deb package for my kernel. Also have a look at the Unofficial ATI Linux driver wiki page for installing on Ubuntu Gutsy.

Here we go:

Symlink the firmware directory so your new kernel version can find the firmware files. This is usually needed for the initrd.
sudo ln -s /lib/firmware/2.6.22-14-generic /lib/firmware/<your new kernel version>

Install all the required packages
sudo apt-get install linux-source libncurses5-dev build-essential kernel-package fakeroot module-assistant dh-make debhelper debconf libstdc++5 linux-headers-generic

Extract the kernel source, import the old config and start the make config
cd /usr/src/
sudo tar -xvjpf linux-source-2.6.22.tar.bz2
sudo ln -sf linux-source-2.6.22 linux
cd linux
sudo cp /boot/config-2.6.22-14-generic .config
sudo make menuconfig

In the menu, browse to ‘General setup‘, then select ‘Choose SLAB allocator‘ at the last entry. Change this from ‘SLUB’ to ‘SLAB‘, then exit ‘General setup’.

Select ‘Processor type …’ and ‘Processor family’ and change the CPU from ‘Generic-x86-64’ to ‘Intel Core2 / newer Xeon’, then exit ‘Processor type …’.

Go to ‘Device Drivers‘ > ‘Sound‘ > ‘Advanced Linux Sound Architecture‘ > ‘PCI devices‘ and Hit the M key to enable the ‘Intel HD Audio‘ module.

I think that Ubuntu packages this driver as part of an extra modules package, but the in-kernel one works fine

Save the new config and exit.

UPDATE: Edit the file /etc/kernel-pkg.conf, and add the line:

CONCURRENCY_LEVEL := 2

This should make use of both cores when compiling your new kernel, taking the build time down from about an hour.

Build your new kernel package
time make-kpkg --rootcmd fakeroot --uc --us --initrd binary

I think it’s a good idea to boot into your new kernel here. Make sure everything’s working before you move on.

Download the ATI driver installer ati-driver-installer-7-11-x86.x86_64.run

Build Ubuntu fglrx packages from the ATI driver script
sudo bash ati-driver-installer-7-11-x86.x86_64.run --buildpkg Ubuntu/gutsy

Install the fglrx packages
sudo dpkg -i xorg-driver-fglrx_8.433-1*.deb fglrx-kernel-source_8.433-1*.deb fglrx-amdcccle_8.433-1*.deb

Build the fglrx kernel module package.
sudo module-assistant prepare,update
sudo module-assistant build,install fglrx -f
sudo depmod -a

Install the fglrx kernel module
sudo dpkg -i fglrx-kernel-*.deb

Make sure you’ve got fglrx set in the device section of your xorg.conf.

That should do it.

Posted in Geek, Linux, Ubuntu at December 21st, 2007. 2 Comments.

Suspend to Ram on a MacBook Pro

I’m running Gentoo and Ubuntu Feisty on my MacBook Pro (Core 2 Duo), and it just refused to resume after a suspend to ram. Let this be a note to anybody else going through the frustration that I was.

In /etc/default/acpi-support, I changed POST_VIDEO from true to false:

# Should we attempt to warm-boot the video hardware on resume?
POST_VIDEO=false

Works a treat now ;)

Posted in Geek, Linux at April 18th, 2007. 2 Comments.

Banshee 0.11.0 is rad


The new Banshee rocks. Not only does it do two-way iPod sync, but it also does cover art, iTMS (from a plugin) and Last.FM recommendations. I nearly forgot, it also does DAAP for sharing with iTunes over mDNS.
I made some ebuilds for Gentoo, and submitted a bug report for it.

Posted in Geek, Linux at September 20th, 2006. 2 Comments.

I scored a UPS!

I scored myself an APC Back-UPS RS 500 the other day. It’s a small unit designed for running pretty much one machine on it.

APC Back-UPS RS 500

The nice thing about this is that it has a USB interface, which I have connected to my server hosting andybotting.com, allowing me to monitor it using the apcupsd software. It even comes with some basic CGI scripts so I can monitor it online. If there is a power outage, the UPS alerts apcupsd, which at a certain point can instruct the server to shut down gracefully.

My ADSL modem isn’t plugged into it yet, though, because it uses different plugs. I need to find some way of converting the IEC power socket to something I can plug my Australian power plug into. One possible solution might be a standard 4-port powerboard with an IEC plug on the end, but I can’t say I’ve actually seen one.

Posted in Geek, Gentoo, Linux at November 19th, 2005. 3 Comments.

An update for the last couple of weeks

Haven’t blogged for a while, so here’s a quick update of what’s been going on.
At work late tonight, helping out because of a function. It was supposed to be finished at 9pm, but it’s now 10:15pm and they’re still here. Let’s bring on the tear gas and tasers :)

We started getting pissed off with the builders across the lane from us, so we started hassling the council about it, and maybe something’s going to get done. I moved my webcam server home, and set up two cams outside our window, so at any time, I can see what they’re doing. You can check it out at http://andybotting.com/webcam. Whatever you do, don’t leave it running too long, it kills my bandwidth :)

My old man is going to salary sacrifice a new laptop for me. This is nice because it means that I’ll only have to pay about 52c in every dollar. I was tossing up between an IBM ThinkPad and an Apple 15″ PowerBook, but I think the PowerBook is going to be the choice. I set up a Wiki page on everything I could find about running Linux on the new hi-res (powerbook5,8 Oct 2005) model PowerBooks here. It seems that there are a couple of little issues (current kernels don’t have support for the ATA controller, AGP and the Gigabit Ethernet, but these can be patched), but it’s basically OK. I should be ordering it in two weeks’ time.

I broke the Segway at work (not sure how…) but it’s back today. They couldn’t tell me exactly what broke, but it was under warranty anyway. Looks like I’m going back to my old rotation at IP Voice next week. They’ve got plenty of work and need some more bodies, so I’m going back to help out.

Also, the fuckwits managing this company decided not to go ahead with the graduate intake for 2006. This means that the grads who were promised jobs, and signed their contracts, will now be sacked. The Age ran an article about it, and there is also a Whirlpool forum on the subject. If you’re wondering, Nelsie’s still got her spot for next year. Phew.

Jezza has done some terrific work on his new Melways frontend. You can now search for an address to bring up the Melways map filling your full browser window, and even switch to the corresponding Google Maps satellite image. It’s the best we’ve got until Google brings maps to Australia.

… also, I found this which made me laugh.

Note to self: blog more often :)

Posted in Geek, Linux, Personal, Work at November 9th, 2005. 3 Comments.

andybotting.com system monitoring

I have been using LogWatch for a while now and I have been very impressed. It sends me a daily email (at about 3am) summarising the important parts of the logs that were generated throughout the day. It was actually LogWatch that tipped me off that something was not quite right when my server was compromised not long ago. Since then, I have been quite interested in system monitoring applications for Linux, so I can keep a close eye on what’s happening; if something bad happens again, I should know very quickly.

I had a poke around with LogWatch and found that it stores some configuration scripts in /etc/log.d/conf/services, and there are plenty of scripts there for a variety of services. I found that many of them were incorrectly set to monitor the wrong log files, and therefore were not sending me any information about them. I modified the httpd, amavis, openvpn and postfix configs to use the right logs, and I suddenly started getting information about these in my email. It can now tell me how many spam emails it has dropped, how many emails have been sent and received, and how many hits Apache has had.

Another thing I have been playing with is Cacti, which is a PHP-based SNMP monitoring tool. I was easily able to start monitoring simple things like the number of users currently logged in, available disk space, CPU load average and memory usage without any SNMP support, but once I recompiled both php and mod_php and installed net-snmp, I was able to get all sorts of network interface statistics, which I find very informative. You can have a look at my stats here.

I’m also playing with Webalizer and Mailgraph to show me Apache and Postfix statistics. You can see them here and here.

Posted in Geek, Gentoo, Linux at July 8th, 2005. No Comments.