Moving some home directories around

Over the summer we had to do some back-end work to make users’ lives slightly better, by replacing the servers the home directories were served from (let’s call them the UFAs) with something a bit newer and shinier (let’s call this the SAN). There were a few good reasons for this; the hardware was 13 years old, for a start. We had to do some consolidation work to tidy up home directories from users who were never going to return to the institution, and we needed a consistent policy on home directory creation.

Historically we’ve had really good service from the UFAs, with great bandwidth and throughput, and we’ve always said that a replacement service needs to at least match what we’ve had in the past. That’s the basis of all the hardware replacement we do in HPC: whatever we put in has to provide at least as good a service as what it replaces. So we did some testing, to make sure we knew what the replacement matrix should look like.

When this project started, back in 2014, initial testing wasn’t good; in fact the performance of even a simple dd if=/dev/zero of= bs=1M count=1024 test on a single node, single-threaded, was considerably worse than on the UFAs. However, with a newer SAN and the right mix of NFS mount options and underlying filesystems – work carried out by the servers and storage team, who did an excellent job – we were able to get an improvement on some standard tasks, like extracting a standard application.
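For the curious, the sort of test we’re talking about looks roughly like this – the output path and tarball name below are placeholders, not our real mount point or package:

    # time writing 1 GiB of zeroes to the filesystem under test
    time dd if=/dev/zero of=/mnt/newhome/testfile bs=1M count=1024

    # and one of the "standard tasks": how long does a well-known tarball take to unpack?
    time tar -xzf OpenFOAM-version.tgz -C /mnt/newhome/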

Fig 1 (bar chart) – how long does it take to extract OpenFOAM?

You’ll see that the time taken to extract a file was considerably larger on the replacement service. We found some interesting things; file async is a process where a server makes sure that a file has been written before it sends the acknowledgement that it’s got the file. In these cases turning async on made everything go much quicker, at the risk of data being lost – however we felt that risk was worth taking as the situation where that would occur would be most unlikely and we do have resilient backups. Single threaded performance was equivalent, and although multithreaded was not an improvement it was equivalent or better than writing to local NFS storage.
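For anyone wondering what that looks like in practice, the difference is just an export option on the NFS server; the host range and path below are made up for illustration:

    # /etc/exports on the fileserver
    # 'sync' (the default) makes the server commit data to disk before acknowledging a write;
    # 'async' lets it acknowledge first - faster, but writes in flight can be lost in a crash
    /export/home  10.0.0.0/16(rw,async,no_subtree_check)

    # re-read the exports table after editing
    exportfs -ra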

There’s also an interesting quirk relating to XFS being a 64-bit filesystem; a 32-bit application might be handed an inode number bigger than it knows how to handle, which results in an I/O error. We needed to do a quick bit of work to check how many 32-bit applications were still being used (there are some, but not many, and we have a solution for users who might be affected by this – if you are, get in touch).
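If you want a rough idea of whether this applies to you, something along these lines will flag 32-bit binaries (the paths and device names are examples, not a recipe); on the fileserver side, XFS also has an inode32 mount option that keeps inode numbers within the 32-bit range:

    # spot-check a directory for 32-bit executables
    find ~/bin -type f -executable -exec file {} + | grep 'ELF 32-bit'

    # /etc/fstab on the fileserver: inode32 confines XFS inode numbers to the 32-bit range,
    # at the cost of restricting where inodes can be allocated on the volume
    /dev/mapper/vg_home-lv_home  /export/home  xfs  defaults,inode32  0 0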

In the end a lot of hours were spent on the discovery phase of this project. As we entered the test phase (Mark & I started using the new home directory server about a month before everybody else) we found a few issues that needed sorting, especially with file locking and firewalls. Once that was sorted there was a bunch of scripting to do so that human error was minimised (one of the nice things about being an HPC systems admin is that you very quickly learn how to do the same task programmatically, many times over), and we needed to tidy up the user creation processes – some of which have been around since the early 00s. The error catching and “unusual circumstances” routines – as you’d expect – made up the bulk of that scripting!
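Stripped of all that error catching, the migration scripting boiled down to a loop along these lines – the paths, file names and log locations are made up for the sake of the example:

    #!/bin/bash
    # copy each user's home directory from the old server to the new one,
    # recording anything that goes wrong so a human can follow it up
    set -u
    while read -r user; do
        src="/old_homes/${user}/"
        dst="/new_homes/${user}/"
        if ! rsync -aHAX --numeric-ids "$src" "$dst" >> "/var/log/home-migrate/${user}.log" 2>&1; then
            echo "${user}: rsync reported errors, needs a human" >> /var/log/home-migrate/problems.txt
        fi
    done < users_to_migrate.txt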

We’ve gone from 29 different home directory filesystems to three; performance is about the same, and quotas are larger. We’ve done all the tidying-up work that means future migrations will go more smoothly, and although there was a bit of disruption for everybody it was over quickly and relatively painlessly for the users (which is the most important thing). We are still keeping an eye on things, too.

Huge thanks are due to everybody in the wider IT Service who helped out.