After I learned what LVM is, I started using it everywhere: creating volumes, resizing them, snapshots... These are wonderful tools for managing disks, especially compared to the msdos disk label, which really sucks; even GPT didn't change my mind. Using LVM with libvirt in particular makes me feel like I am the god of the disks.

However, one month ago I started using ZFS. On CentOS it is not in an official repo, so I added the zfsonlinux repo and started using ZFS from there. It is very important to install kernel-devel and zfs at the same time, or to install kernel-devel before zfs: there is no package dependency between zfs and kernel-devel, yet zfs needs to be compiled against your kernel. So if your system is not up to date, update it, and reboot if there was a kernel update. If you are an advanced user, you can instead install the kernel-devel package that matches the kernel version you are currently running.
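
On CentOS 7 the install boils down to something like the following; treat the repo setup as an assumption and check the zfsonlinux.org instructions for the release RPM matching your exact CentOS version:

# enable the zfsonlinux yum repo first (the release RPM differs per CentOS point release)
yum install kernel-devel-$(uname -r) zfs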

I have four 1TB hard disks. They were configured with RAID10 and LVM, I don't have any backup system, and there was already a lot of data on the existing setup. Still, I migrated to ZFS on the fly. For my system it was easy, but it is not always possible.

In a 4-disk RAID10 you have 2 identical mirror pairs of 2 disks each. Assume the disks and the RAID device are

/dev/sdb
/dev/sdc
/dev/sdd
/dev/sde

/dev/md100

To decide which disks I needed to fail and remove, I checked the RAID configuration

cat /proc/mdstat
...
md100 : active raid10 sdb[0] sdc[1] sdd[2] sde[3]
xxxxxxx blocks xxK chunks 2 near-copies [4/4] [UUUU]
...

From this output I know that the RAID uses near-copies, so sdb and sdc are identical and sdd and sde are identical. I failed and removed one disk from each mirror pair.

mdadm /dev/md100 -f /dev/sdc
mdadm /dev/md100 -f /dev/sde
mdadm /dev/md100 -r /dev/sdc
mdadm /dev/md100 -r /dev/sde
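
At this point the array should still be running, just degraded; a quick check (my output is trimmed, but the pattern should look roughly like this):

mdadm --detail /dev/md100   # 2 active and 2 removed devices
cat /proc/mdstat            # degraded array, something like [4/2] [U_U_]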

After that I have 2 empty disks. Then I create two sparse files of 1TB each and set up loop devices on them.

dd if=/dev/zero of=/tmp/disk1.bin bs=1K count=1 seek=1G
losetup /dev/loop1 /tmp/disk1.bin
dd if=/dev/zero of=/tmp/disk2.bin bs=1K count=1 seek=1G
losetup /dev/loop2 /tmp/disk2.bin

The advantage of sparse files is that they do not use any disk space until data is actually written to them; for example, /tmp/disk1.bin and /tmp/disk2.bin use only about 2KB of space on /tmp. After that I create a raidz2 zfs pool from the 2 real disks and the 2 loop disks, and then mark the two loop disks offline. Because raidz2 tolerates two failed devices, the pool keeps working in a degraded state with both loop disks offline.

zpool create myzfs raidz2 /dev/loop1 /dev/loop2 /dev/sdc /dev/sde
zpool offline myzfs /dev/loop1
zpool offline myzfs /dev/loop2
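
A quick sanity check at this point (output omitted): the pool should be DEGRADED but usable, and the sparse files should still occupy almost nothing on /tmp even though their apparent size is about 1TB.

zpool status myzfs
du -h /tmp/disk1.bin /tmp/disk2.bin    # actual blocks used
ls -lh /tmp/disk1.bin /tmp/disk2.bin   # apparent ~1TB size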

Creating a zfs pool named myzfs also creates a default zfs file system and mounts it at /myzfs. After creating the pool you can create zfs file systems and volumes (the latter are similar to lvm logical volumes). You can configure the mount point and quota of a zfs file system, and the size of a zfs volume. I create a temporary zfs file system for the data migration, without a quota and with the mount point /mnt/temp-data, and rsync my data into it.

zfs create -o mountpoint=/mnt/temp-data myzfs/temp-data
rsync -ar /mnt/data/ /mnt/temp-data

Then my CPU went to 100%, so I stopped rsync. After some research I learned that on CentOS I need to change the xattr property to sa. Then I continued the rsync (with the CPU behaving again).

zfs set xattr=sa myzfs
rsync -ar /mnt/data/ /mnt/temp-data
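
You can confirm the property actually took effect with a quick check:

zfs get xattr myzfs   # should report sa, with SOURCE set to local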

However, memory usage shot up. I learned that ZFS uses a lot of memory (roughly 1GB for each 1TB). ZFS has two caches, primary and secondary, and each cache has three possible values: all, metadata and none. I set the primary cache to metadata.

zfs set primarycache=metadata myzfs
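
Most of that memory lives in the ARC (the primary cache). On ZFS on Linux you can watch its size under /proc/spl, and if you prefer a hard limit you can also cap it with the zfs_arc_max module parameter; a sketch, where the 4GB value is only an example:

grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats                   # current ARC size and ceiling
echo "options zfs zfs_arc_max=4294967296" >> /etc/modprobe.d/zfs.conf  # 4GB cap, applied when the module loads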

Memory consumption is now lower, but so is the speed; I need a secondary cache. After some research into how other people configure ZFS, I saw that in server environments admins use SSD disks for cache (a read cache) and log (which acts like a write cache for synchronous writes). I bought an SSD, created two partitions, one for cache and one for log, and attached them to my pool.

zpool add myzfs log /dev/sdf1
zpool add myzfs cache /dev/sdf2
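
To verify that the log and cache devices were picked up (and later to see how much they are used):

zpool status myzfs      # shows separate logs and cache sections
zpool iostat -v myzfs   # per-device I/O statistics, including sdf1 and sdf2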

I reached good speeds. After the rsync of my data finished, I completely removed the RAID setup and replaced the loop disks with the freed real disks.

mdadm --stop /dev/md100
zpool replace myzfs /dev/loop1 /dev/sdb
zpool replace myzfs /dev/loop2 /dev/sdd
losetup -d /dev/loop1
losetup -d /dev/loop2
rm -f /tmp/disk1.bin /tmp/disk2.bin

Now zpool starts resilvering the pool onto the new disks.
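
You can follow the resilver until it finishes (output omitted):

zpool status myzfs   # shows resilver progress, then ONLINE when done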

Migration is completed.

After that, I started planning and creating file systems and volumes on my zpool.
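
For example, something along these lines; the names, quota and volume size are just placeholders for whatever layout you plan:

zfs create -o mountpoint=/srv/backups -o quota=200G myzfs/backups   # file system with its own mount point and quota
zfs create -V 20G myzfs/vm-disk1                                    # 20GB volume (zvol), appears under /dev/zvol/myzfs/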

In the coming days I will write about virtual machine cloning with ZFS.