Overview of the MARC1 facility
The latest phase of the ARC service here at Leeds offers an additional Linux-based HPC cluster, based on the CentOS 6 distribution. Due to the way this cluster is funded, there is a slightly different application process to the main HPC service. The machine is prioritised for researchers associated with LIDA (Leeds Institute for Data Analytics) whose research is broadly involved with:
- Medical and Bioinformatics research
- Consumer data analytics
The batch scheduler operates in a similar way to ARC2, with significant improvements to requesting resources for parallel and large-memory jobs.
MARC1 consists of a cluster of HP-based servers and storage. The rack layout, shown in the schematic below, is separated into a high-density component geared towards computation and a low-density portion providing mainly infrastructure:
|Compute: Standard||HP BL460 blade||Each blade houses one Intel Haswell node. Each node is dual socket with a 10-core Intel E5-2660v3 (2.6GHz) processor per socket (20 cores per node); 256GB of DDR4 2133MHz memory per node (configured as 16 x 16GB); a 500GB hard drive and QDR ConnectX InfiniBand. CPUs are AVX/AVX2 capable||57 blades; 114 CPUs; 1140 cores|
|Compute: Large Memory||HP BL460 blade||Each blade houses one Intel Ivy Bridge server node. Each node is quad socket with a 12-core Intel E7-4860v2 (2.6GHz) processor per socket (48 cores per node); **3TB** of DDR3 1866MHz memory per node (configured as 96 x 32GB); a 500GB hard drive and QDR ConnectX InfiniBand. CPUs are AVX capable||2 blades; 8 CPUs; 96 cores|
|Storage||Lustre||Two fail-over pairs delivering 6GB/s via the InfiniBand network to ~520TB usable storage on /nobackup||~520TB|
|Home||Independent home directories not shared with other HPC clusters||20GB per user|
|Gigabit||Management and general networks facilitating system boot. All user traffic is carried over the InfiniBand network|
In common with other HPC clusters at Leeds, MARC1 is configured with:
- A single general purpose queue
- A maximum 48-hour job run time
- Automatic file expiry on /nobackup if files have not been used or accessed for 90 days
By default, jobs will be executed on the standard 20-core/256GB nodes. To execute jobs on the large memory nodes, add the following line to the submission script:
#$ -l node_type=48core-3T
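As an illustration, a complete submission script requesting a large memory node might look like the sketch below. It assumes the scheduler accepts Grid Engine-style directives on lines beginning with `#$`; `-cwd` and `-l h_rt` are standard Grid Engine options, the 24-hour run time is only an example value, and `my_large_memory_program` is a hypothetical placeholder for your own executable.

```shell
#!/bin/bash
# Sketch of a submission script for the large memory nodes.
# Run the job from the directory it was submitted from.
#$ -cwd
# Requested run time (example value; must be within the 48-hour limit).
#$ -l h_rt=24:00:00
# Select a 48-core / 3TB node, as described in the text above.
#$ -l node_type=48core-3T

# NSLOTS is normally set by the scheduler; fall back to 1 outside a job.
slots="${NSLOTS:-1}"
echo "Job running on ${HOSTNAME:-unknown} with ${slots} slot(s)"

# Hypothetical program name; replace with your own executable.
# ./my_large_memory_program
```

Because the directives are ordinary shell comments, the script can also be run directly with `bash` for a quick check before it is submitted.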
All user-facing nodes (login and compute) are connected to the InfiniBand network and use it to transfer all user data. This is a layered network, with the latency of communication dependent upon the number of switch hops required to route between the source and destination devices. The diagram below shows the cluster’s topology, sometimes described as a half-Clos network:
Each server has a 4X quad-data-rate (QDR) connection which can send and receive data at 3.6GB/s. Each switch has two 4X QDR links up to the core, able to transfer data at ~8GB/s.
The latency between servers connected to the same switch is around 1.1 microseconds. Between servers connected to different switches, the latency is around 1.5 microseconds.
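To put these figures in perspective, the sketch below estimates point-to-point transfer time using a deliberately simple model: time = latency + message size / bandwidth. The model is an assumption made for illustration (it ignores protocol overhead and contention), and the function name `transfer_time_us` is hypothetical. The latency and bandwidth values are the ones quoted above.

```shell
#!/bin/bash
# transfer_time_us LATENCY_US BANDWIDTH_GB_PER_S MESSAGE_BYTES
# Prints the estimated transfer time in microseconds, assuming
# time = latency + bytes / bandwidth (a simplified model).
transfer_time_us() {
    awk -v lat="$1" -v bw="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", lat + n / (bw * 1e9) * 1e6 }'
}

# 8-byte message: the time is almost entirely latency.
transfer_time_us 1.1 3.6 8          # same switch
transfer_time_us 1.5 3.6 8          # different switches

# 1 MB message: bandwidth dominates and the extra hop barely matters.
transfer_time_us 1.1 3.6 1000000    # same switch
transfer_time_us 1.5 3.6 1000000    # different switches
```

Under this model the additional switch hop adds roughly a third to the cost of very small messages but well under 1% to a 1 MB transfer, so node placement matters most for codes that exchange many small messages.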
By default, jobs will be allocated to any compute node in the cluster. The following parameter controls how a job's nodes are placed relative to the network topology:
|-l placement=scatter||Ignore network topology and run anywhere, potentially introducing more latency than necessary to all communications (default)|
|-l placement=optimal||Minimises number of switch hops|
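A minimal sketch of a job script that requests topology-aware placement and then reports which nodes it was given is shown below. It assumes Grid Engine-style `#$` directives and relies on `PE_HOSTFILE`, the standard Grid Engine variable that points to the list of hosts allocated to a parallel job. The helper name `list_job_hosts` is hypothetical, and the line requesting the number of cores is omitted because its syntax is site-specific and not covered on this page.

```shell
#!/bin/bash
# Run the job from the directory it was submitted from.
#$ -cwd
# Requested run time (example value).
#$ -l h_rt=01:00:00
# Ask the scheduler to minimise the number of switch hops.
#$ -l placement=optimal

# list_job_hosts FILE
# Prints the distinct host names found in the first column of a
# Grid Engine host file (one "host slots queue ..." entry per line).
list_job_hosts() {
    awk '{ print $1 }' "$1" | sort -u
}

# PE_HOSTFILE is only set inside a parallel job, so guard against its absence.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "${PE_HOSTFILE}" ]; then
    echo "Allocated nodes:"
    list_job_hosts "${PE_HOSTFILE}"
fi
```

Comparing the reported host list between `placement=scatter` and `placement=optimal` runs is one way to see what the scheduler actually did with the request.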
Lustre file system
A large part of the HPC cluster is a special parallel filesystem (using Lustre), which is mounted as /nobackup. This is connected to the compute nodes by the InfiniBand network, and can transfer data at approximately 6GB/s. Although this can be used to store data and code, it is not backed up and any file that is not used for 90 days will be automatically deleted. Users are given reminders two weeks and then one week before a scheduled deletion.
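To see which files are approaching the expiry threshold, something like the following sketch could be used. It assumes that expiry is judged on a file's last access time and that your files live under a per-user directory such as `/nobackup/$USER`; both are assumptions, and the function name `list_stale_files` is hypothetical. The 75-day cut-off is an example chosen to fall shortly before the first reminder.

```shell
#!/bin/bash
# list_stale_files DIR DAYS
# Prints regular files under DIR whose last access time is more than
# DAYS days in the past (uses the standard find -atime test).
list_stale_files() {
    find "$1" -type f -atime +"$2" -print
}

# Assumed location of the user's scratch area; adjust to your own path.
scratch_dir="/nobackup/${USER:-}"
if [ -n "${USER:-}" ] && [ -d "${scratch_dir}" ]; then
    list_stale_files "${scratch_dir}" 75
fi
```

Files worth keeping should be copied to backed-up storage rather than simply touched, since /nobackup is not backed up.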
User home directories
Accounts on MARC1 have a separate ‘HOME’ directory to all the other HPC clusters. The initial quota is 20GB.