Migrating to Gluster with existing data

I’ve always run NFS-mounted storage for my media from NAS boxes (OK, they’re just Ubuntu server installs with NFS exports…) and this has worked fine. In the past year I’ve watched the available disk space on one NAS box shrink faster than the others, as the media stored on it was either released more often or at increasingly higher quality. Clearly I needed more space, but I’m not quite ready to go and buy bigger disks just yet.

I needed a way to spread out the media over the different NAS boxes. Initially I created the same directories on the other NAS boxes and copied around some media manually, NFS mounted the new directories on the Kodi machine (/mnt/media1, /mnt/media2, etc) and told Kodi that all of those directories contained the same kind of thing. This worked OK, but it can become a bit of a pain to manage and balance between the NAS boxes.

Enter Gluster. Reading up on Gluster, it seemed like distributed mode might help with my situation without losing any disk space. Each of the NAS boxes runs a RAID5 array, with LVM to manage the space should I want to grow the array in the future:

Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sdd1[3] sda1[4] sdb1[0]
8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>

--- Physical volume ---
PV Name /dev/md0
VG Name storage
PV Size 8.19 TiB / not usable 1.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 2146094
Free PE 0
Allocated PE 2146094

--- Logical volume ---
LV Path /dev/storage/storage
LV Name storage
VG Name storage
LV Write Access read/write
LV Creation host, time ,
LV Status available
# open 1
LV Size 8.19 TiB
Current LE 2146094
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 6144
Block device 252:0

Yep, my lv and vg names are not creative at all.

I started by installing Gluster on each server, from the PPA suggested on the Gluster install page:

sudo apt install software-properties-common
sudo add-apt-repository ppa:gluster/glusterfs-3.8
sudo apt update
sudo apt install glusterfs-server

And then, from one of the servers, I used the peer probe command to add the other NAS boxes running Gluster to the trusted pool:

sudo gluster peer probe <nas2-hostname>
sudo gluster peer probe <nas3-hostname>

Check that the peers are in the cluster and connected:

sudo gluster peer status
Number of Peers: 2

Hostname: <nas2-hostname>
State: Peer in Cluster (Connected)

Hostname: <nas3-hostname>
State: Peer in Cluster (Connected)
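Before creating the volume, the existing data on each NAS needed to move into a dedicated “brick” subdirectory, so that things sitting at the filesystem root (like “lost+found”) stay out of the volume. A sketch of that move, using a throwaway directory in place of /media/storage so it’s safe to run anywhere:

```shell
# Throwaway stand-in for /media/storage so this sketch can run anywhere;
# on the real NAS, DATA_ROOT would be /media/storage itself.
DATA_ROOT=$(mktemp -d)
mkdir "$DATA_ROOT/lost+found"            # present at the root of any ext4 filesystem
mkdir "$DATA_ROOT/Movies"                # stand-in for the existing media
touch "$DATA_ROOT/Movies/example.mkv"

# Create the brick subdirectory and move everything else into it,
# leaving brick itself and lost+found where they are.
mkdir "$DATA_ROOT/brick"
for entry in "$DATA_ROOT"/*; do
  name=$(basename "$entry")
  [ "$name" = "brick" ] && continue
  [ "$name" = "lost+found" ] && continue
  mv -- "$entry" "$DATA_ROOT/brick/"
done

ls "$DATA_ROOT/brick"
```

On the real boxes this is just `mkdir brick` followed by moving everything except brick and lost+found into it, on each NAS.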

From memory, NFS was causing a few problems when creating the volume (it was complaining about exports), so I commented out the lines in /etc/exports and restarted nfs-kernel-server. Since we’re just making a basic distributed volume, the following command should work fine:

sudo gluster volume create media <nas1-hostname>:/media/storage/brick <nas2-hostname>:/media/storage/brick <nas3-hostname>:/media/storage/brick force
sudo gluster volume info
sudo gluster volume start media
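The exports workaround mentioned above boils down to commenting out the active lines in /etc/exports. A sketch against a scratch copy of the file (the real one needs sudo, and nfs-kernel-server restarted afterwards); the export line shown is a made-up example:

```shell
# Scratch copy standing in for /etc/exports so the sketch is safe to run.
EXPORTS=$(mktemp)
printf '/media/storage 192.168.0.0/24(rw,sync,no_subtree_check)\n' > "$EXPORTS"

# Comment out every active (non-blank, non-comment) line.
sed -i -e 's/^[^#[:space:]]/#&/' "$EXPORTS"
cat "$EXPORTS"

# On the real NAS:
#   sudo sed -i.bak -e 's/^[^#[:space:]]/#&/' /etc/exports
#   sudo systemctl restart nfs-kernel-server
```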

I had to use force because the directories I’m using already contain data. I had moved everything into a “brick” subdirectory on each host, since I don’t want “lost+found” getting modified. I mounted this volume on my Kodi machine to see what things looked like:

user@kodi:~$ sudo mount -t glusterfs <nas1-hostname>:/media /mnt/media

Unfortunately, it looked as though many files were missing. I first tried changing into a few directories: if I chose a directory that should have been listed but wasn’t, the cd still worked! It turns out that accessing a path triggers Gluster to recognise the directory, creating the metadata file and making it visible. It was time to figure out how to get all of the files recognised.

My first thought was that a rebalance might cause Gluster to look at every file while it figures out where to move things to even out the disk space.

sudo gluster volume rebalance media start
sudo gluster volume rebalance media status

This balanced the disk space nicely, but it didn’t trigger the metadata creation I wanted. Some websites suggested using “fix-layout”, but that didn’t help either.

Thinking back, accessing a file or folder was enough to trigger the metadata creation. Perhaps there was a way to trigger this for every existing file. On each NAS I ran:

sudo mount -t glusterfs <nas-hostname>:/media /mnt
cd /media/storage/brick
find . -exec stat '/mnt/{}' \;

Basically: mount the GlusterFS volume at /mnt (or wherever you like), cd into the directory containing the original files you want listed in the Gluster volume, then run find over everything in there, executing stat against each file. Crucially, the stat runs not within /media/storage/brick but within /mnt, where the Gluster volume is mounted. Doing this on each server accesses every one of those files and folders, which causes Gluster to create the metadata, and that makes the data appear in the Gluster volume. Hooray!
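The find/stat trick can be illustrated with ordinary directories standing in for the brick and the mounted volume; BRICK and VOLUME below are throwaway stand-ins for /media/storage/brick and /mnt. On a real volume, it’s the lookup through the mount that makes Gluster create the metadata:

```shell
# Throwaway stand-ins: BRICK for /media/storage/brick, VOLUME for the
# Gluster volume mounted at /mnt.
BRICK=$(mktemp -d)
VOLUME=$(mktemp -d)
mkdir -p "$BRICK/Movies"
touch "$BRICK/Movies/example.mkv"
cp -r "$BRICK/." "$VOLUME/"              # simulate the volume's view of the data

# Walk the brick, but stat each relative path through the mount point.
# GNU find substitutes {} anywhere in an argument, which is why the
# '/mnt/{}' style path in the original command works.
cd "$BRICK"
find . -exec stat -c '%n' "$VOLUME/{}" \;
```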

I hope this helps someone else trying to migrate to GlusterFS in place. I’ve only tried the above steps on a distributed volume, not on any of the other volume types.