ARC3

Operating system

ARC3 is the third phase of the ARC service here at Leeds. It provides a Linux-based HPC service built on the CentOS 7 distribution.

Hardware

ARC3 Rack Layout
Initial configuration
Purpose | Item | Description
Compute | All nodes | All compute nodes use Broadwell E5-2650v4 CPUs. The clock rate is 2.2GHz for non-AVX instructions and 1.8GHz for AVX instructions. Memory bandwidth per core is quoted as 800MHz/core. The system will turbo where it can; using fewer cores means the active cores can turbo more. There are 3 types of node.
Compute | Standard nodes | 165 nodes with 24 cores and 128GB of memory, plus a solid state disk within the node with 100GB of storage.
Compute | High memory nodes | 2 nodes with 24 cores and 768GB of memory, plus a hard disk drive within the node with 800GB of storage.
Compute | GPGPU nodes | 2 nodes with 24 cores and 128GB of memory, plus a hard disk drive within the node with 800GB of storage. Each node has 2 NVIDIA K80s.
Storage | Lustre | Two fail-over pairs delivering 4GB/s via the InfiniBand network to 350TB of usable storage on /nobackup.
Network | InfiniBand | The compute nodes are connected with FDR InfiniBand at 56Gbit/s (compared with 40Gbit/s for QDR) in a 2:1 blocking topology, built up from non-blocking islands of 24 nodes.
Network | Gigabit | Management and general networks facilitating system boot. All other traffic is carried over the InfiniBand network.
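As a rough illustration of what the table adds up to, the sketch below totals the nodes, cores and memory across the three node types. The node counts and per-node figures are copied from the table; the dictionary layout and function names are purely illustrative and are not part of any ARC3 tooling.

```python
# Node counts and per-node specs copied from the "Initial configuration" table.
# The structure and names here are illustrative only.
NODE_TYPES = {
    "standard": {"nodes": 165, "cores": 24, "memory_gb": 128},
    "high_memory": {"nodes": 2, "cores": 24, "memory_gb": 768},
    "gpgpu": {"nodes": 2, "cores": 24, "memory_gb": 128},
}


def totals(node_types):
    """Return (total nodes, total cores, total memory in GB) over all node types."""
    nodes = sum(t["nodes"] for t in node_types.values())
    cores = sum(t["nodes"] * t["cores"] for t in node_types.values())
    memory_gb = sum(t["nodes"] * t["memory_gb"] for t in node_types.values())
    return nodes, cores, memory_gb


def memory_per_core_gb(node_type):
    """Memory available per core on one node of the given type, in GB."""
    return node_type["memory_gb"] / node_type["cores"]


if __name__ == "__main__":
    nodes, cores, memory_gb = totals(NODE_TYPES)
    print(f"{nodes} nodes, {cores} cores, {memory_gb} GB of memory in total")
    for name, spec in NODE_TYPES.items():
        print(f"{name}: {memory_per_core_gb(spec):.1f} GB per core")
```

On these figures a standard node offers a little over 5GB of memory per core, while a high memory node offers 32GB per core, which may help when deciding which node type a job needs.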

The best Intel compiler flag to use is:

Broadwell should be better at maintaining performance when mixing AVX/non-AVX instructions in the same program.

Network Topology

There are 2 login nodes and 2 head nodes. All nodes are connected to the InfiniBand network and use it to transfer all user data.
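To make the 2:1 blocking figure from the hardware table more concrete, here is a minimal sketch of the arithmetic. It assumes that "2:1 blocking" means the links leaving an island carry half the aggregate bandwidth of the node links inside it, and it uses the nominal 56Gbit/s FDR link rate rather than measured throughput. The island count is only a lower bound derived from the compute node total; the real cabling, and where the login and head nodes sit, is not described here.

```python
import math

# Figures taken from the hardware table; everything derived from them is an estimate.
LINK_GBIT_S = 56             # nominal FDR link rate per node
ISLAND_NODES = 24            # nodes per non-blocking island
BLOCKING_RATIO = 2           # 2:1 blocking between islands
COMPUTE_NODES = 165 + 2 + 2  # standard + high memory + GPGPU nodes


def island_bandwidth(link_gbit_s, island_nodes, blocking_ratio):
    """Return (aggregate node bandwidth, aggregate uplink bandwidth) in Gbit/s for one island."""
    injection = link_gbit_s * island_nodes
    uplink = injection / blocking_ratio
    return injection, uplink


def worst_case_per_node(link_gbit_s, blocking_ratio):
    """Per-node bandwidth in Gbit/s if every node in an island talks off-island at once."""
    return link_gbit_s / blocking_ratio


def min_islands(nodes, island_nodes):
    """Smallest number of islands that could hold the given number of nodes."""
    return math.ceil(nodes / island_nodes)


if __name__ == "__main__":
    injection, uplink = island_bandwidth(LINK_GBIT_S, ISLAND_NODES, BLOCKING_RATIO)
    print(f"Within an island: {injection} Gbit/s; leaving it: {uplink} Gbit/s")
    print(f"Worst case off-island: {worst_case_per_node(LINK_GBIT_S, BLOCKING_RATIO)} Gbit/s per node")
    print(f"At least {min_islands(COMPUTE_NODES, ISLAND_NODES)} islands for {COMPUTE_NODES} compute nodes")
```

Under these assumptions, a job that fits within a single island of 24 nodes sees full link bandwidth between its nodes, whereas a job spread across islands may see as little as half that when the uplinks are saturated.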

ARC3 Network Topology

Lustre File System

A large amount of infrastructure is dedicated to the Lustre parallel filesystem, which is mounted on /nobackup. It is accessed over InfiniBand and is configured to deliver ~4GB/s from a 350TB filesystem. The filesystem could be tuned more aggressively (or more conservatively), but this configuration strikes a sensible compromise between data integrity and performance.
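For a feel of what ~4GB/s means in practice, the sketch below estimates how long it would take to stream a given volume of data at that rate. It assumes decimal units (1TB = 1000GB) and treats 4GB/s as the aggregate rate available to a single transfer; since the filesystem is shared, real transfers will usually take longer. The function name is illustrative.

```python
def transfer_seconds(size_tb, rate_gb_s=4.0):
    """Seconds needed to move size_tb terabytes at rate_gb_s gigabytes per second.

    Uses decimal units (1 TB = 1000 GB) and assumes the full rate is sustained.
    """
    if rate_gb_s <= 0:
        raise ValueError("rate_gb_s must be positive")
    return size_tb * 1000 / rate_gb_s


if __name__ == "__main__":
    for size_tb in (1, 10, 350):
        seconds = transfer_seconds(size_tb)
        print(f"{size_tb} TB at 4 GB/s: {seconds:.0f} s ({seconds / 3600:.1f} hours)")
```

By this estimate, 1TB takes a little over four minutes at the full rate, and reading or writing the entire 350TB would take roughly a day.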