Our supercomputer

Our supercomputer 'Hawk' can be accessed from anywhere with an internet connection.

About Hawk

Hawk is our latest High-Performance Computing (HPC) cluster, providing a major increase in capability over its predecessor, Raven, which ceased operation in October 2019. A refresh of the Hawk core service is due to be delivered in 2024.

Since Hawk launched in August 2018, the overall system has grown continuously, expanding by a factor of around 2.5 in less than two years.

This expansion is made possible by the system's architectural design: following the 'pluggable infrastructure' philosophy first demonstrated on Raven, the core partition available to all researchers and specific researcher-funded sub-systems are integrated in a highly efficient and robust fashion.

Our core Hawk Linux cluster comprises both Intel Skylake Gold nodes and AMD nodes built around dual AMD EPYC Rome 7502 processors (32 Zen-2 cores, 2.5 GHz).

Intel Skylake Gold 6148 processors (2.4GHz / 4.8GB per core / 20 cores per processor) form the main parallel MPI partition (including a high-memory SMP section), dual-processor AMD Rome 7502 nodes provide additional x86-64 capability, and further Intel Skylake Gold nodes serve as a serial/high-throughput subsystem. More recently acquired dedicated researcher partitions contain newer-generation Intel and AMD processors.

Accelerated performance is available through a combination of NVIDIA V100 and P100 GPU nodes, resources in high demand from the growing Artificial Intelligence (AI) and Deep Learning (DL) community.

Technical specification

Brief overview:

  • the current total number of cores on Hawk is more than 20,000
  • total memory across the entire cluster is more than 100 TB
  • total useable capacity of Lustre global parallel file storage is 1.2 PB
  • total useable capacity of NFS partition for longer-term data store is 480 TB
  • nodes are connected with InfiniBand EDR technology (100 Gbps / 1.0 μsec latency) from Mellanox.

The current cluster comprises the following node types:

  • Intel Skylake Phase 1 core partition nodes
  • Intel Skylake dedicated researcher expansion
  • Intel Cascade Lake dedicated researcher expansion
  • Intel Broadwell and Haswell Raven migrated sub-system nodes
  • AMD Rome Phase 2 core partition nodes
  • AMD Milan and Rome dedicated researcher expansion

Core MPI Partition

225 x Intel Skylake nodes based on Intel Skylake Gold 6148 processors (2.4GHz / 4.8GB per core / 20 cores per processor). This includes:

  • Intel Xeon Gold 6148 (Skylake) 2.40GHz processors
  • 20 cores/socket (2.40GHz, 10.4GT/s, Turbo+, 150W) giving 40 cores per node
  • 4.8GB RAM per core (ECC DDR4 2666MT/s dual rank RDIMMs)
  • 120GB SSD disk
  • Single-port ConnectX-4 EDR PCIe Gen3-x8 InfiniBand interface.

The Skylake partition includes:

  • 136 x standard compute nodes based on the Dell PowerEdge C6420 ½ U dual-socket server
  • 26 x dual-socket serial compute blades based on the Dell PowerEdge C6420 server, with 192GB RAM per node
  • 26 x SMP, high-memory compute nodes (a 1,040-core subset of the MPI nodes) based on the Dell PowerEdge C6420 server, with 384GB RAM per node.

65 x AMD Rome nodes with dual AMD EPYC Rome 7502 processors (32 Zen-2 cores, 2.5 GHz). Each node includes:

  • AMD 7502 (EPYC) 2.5GHz processors
  • 32 cores/socket (2.5GHz,128M,180W) giving 64 cores per node
  • 4GB RAM per core (ECC DDR4 3200MT/s single rank RDIMMs)
  • 240GB SSD disk
  • Single port ConnectX-6 HDR100 QSFP56 Infiniband Adapter.

GPU accelerator nodes

28 GPU Accelerator nodes comprising 15 x NVIDIA V100 and 13 x NVIDIA P100 nodes.

V100 accelerator nodes

The 15 V100 accelerator nodes are Dell PowerEdge R740 servers, each with 2 x Intel Xeon Gold 6248 processors (2.5GHz, 20 cores/40 threads, 10.4GT/s, 27.5MB cache, Turbo, HT, 150W), 384GB memory (12 x 32GB DDR4 2933MT/s dual rank RDIMMs), dual redundant power supplies, and an InfiniBand host fabric interface card. Each server is fitted with a GPU enablement kit and two NVIDIA Tesla V100 16GB PCIe GPU cards.

Each node contains:

  • Intel Xeon Gold 6248 2.5GHz processors
  • 20 cores/socket (2.5GHz, 10.4GT/s, Turbo+, 150W) giving 40 cores per node
  • 4.8GB RAM per core (192GB ECC DDR4 memory 2933MHz)
  • 240GB SSD disk
  • Single Infiniband EDR/PCIe Gen3-x8 interface embedded in the motherboard.
  • 2 x NVIDIA Tesla V100 16GB PCIe GPU cards

P100 accelerator nodes

The 13 P100 accelerator nodes are Dell PowerEdge R740 servers, each with 2 x Intel Xeon Gold 6148 20-core 2.4GHz processors, 384GB memory (12 x 32GB ECC DDR4 2666MT/s dual rank RDIMMs), dual redundant power supplies, and an InfiniBand host fabric interface card. Each server is fitted with a GPU enablement kit and two NVIDIA Tesla P100 16GB PCIe GPU cards.

Each node contains:

  • Intel Xeon Gold 6148 (Skylake) 2.40GHz processors
  • 20 cores/socket (2.40GHz, 10.4GT/s, Turbo+, 150W) giving 40 cores per node
  • 4.8GB RAM per core (192GB ECC DDR4 memory 2666MHz)
  • 120GB SSD disk
  • Single Infiniband EDR/PCIe Gen3-x8 interface embedded in the motherboard.
  • 2 x NVIDIA Tesla P100 16GB PCIe GPU cards
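
As a sketch of how an application can discover these accelerators at run time, the following C fragment queries the CUDA runtime for the GPUs visible on a node. It assumes the CUDA toolkit (providing cuda_runtime.h and nvcc) is available in the environment; it is an illustration, not site-specific documentation.

  /* gpu_query.c - list the GPUs visible to the CUDA runtime (illustrative only).
   * Build (assumed CUDA toolkit on the path): nvcc gpu_query.c -o gpu_query
   */
  #include <cuda_runtime.h>
  #include <stdio.h>

  int main(void)
  {
      int count = 0;
      if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
          fprintf(stderr, "No CUDA-capable device detected\n");
          return 1;
      }

      for (int i = 0; i < count; i++) {
          struct cudaDeviceProp prop;
          cudaGetDeviceProperties(&prop, i);   /* name, memory, compute capability */
          printf("GPU %d: %s, %.1f GB, compute capability %d.%d\n",
                 i, prop.name, prop.totalGlobalMem / 1073741824.0,
                 prop.major, prop.minor);
      }
      return 0;
  }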

The high-speed, low-latency, high-performance interconnect is provided by an InfiniBand 100Gb/s network using Mellanox MSB7700 Switch-IB based 36-port non-blocking EDR switches. ConnectX-4 single-port EDR PCIe Gen3-x16 HCAs (host channel adapters) are provided in each compute node.

There are two main storage sub-systems: a fast 1.2PB cluster file system based on Dell EMC reference architectures including Lustre software (RAID-6), and a redundant Network File System (NFS) of 480TB useable RAID-6 disk.

Cluster File System

The Lustre implementation consists of 6 x OSS nodes and 6 x OST drive enclosures. Each drive enclosure provides 60 x 4TB drives for a raw capacity of 240TB per enclosure, 1.44PB in total. After RAID6 this provides 192TB per enclosure, 1.2PB in total.

Network File System

The subsystem consists of 2 NFS servers operating in active/passive mode, attached to a shared RAID disk enclosure. The NFS servers are Dell PowerEdge R640 1U servers with 2 x Intel Xeon Gold 6130 16-core 2.1GHz processors, 192GB RAM and dual redundant boot SSD drives.

Each server is fitted with dual redundant power supplies, redundant fans and dual SAS controllers, along with a single Omni-Path adapter and Intel 2 x 10GbE plus 2 x GbE network daughter card. An additional heartbeat failover link is provided.

Each server is connected to the InfiniBand fabric for file serving, to the 10GbE and GbE cluster switches to provide Ethernet-based access, and to the out-of-band network. The shared disk array is a Dell PowerVault MD3460 device with redundant cooling, power supplies and dual RAID controllers. Disks are configured with RAID6 to provide additional resilience, giving 600TB raw capacity (60 x 10TB drives) and 480TB after RAID6.

Intel Parallel Studio XE Cluster Edition software product

This provides the cluster with a tuned development, optimisation and debugging environment for applications written in C/C++ or Fortran, whether serial or parallelised with OpenMP and/or MPI.
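
To illustrate the kind of application this environment supports, the sketch below is a minimal hybrid MPI + OpenMP program in C. The compiler wrapper shown in the build comment (mpiicc with -qopenmp) is an assumption about the Intel toolchain rather than a documented local command.

  /* hybrid_hello.c - minimal hybrid MPI + OpenMP sketch (illustrative only).
   * Build (assumed Intel toolchain): mpiicc -qopenmp hybrid_hello.c -o hybrid_hello
   */
  #include <mpi.h>
  #include <omp.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;

      MPI_Init(&argc, &argv);                 /* one MPI process per allocated task */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      #pragma omp parallel                    /* OpenMP threads within each MPI rank */
      {
          printf("MPI rank %d of %d, OpenMP thread %d of %d\n",
                 rank, size, omp_get_thread_num(), omp_get_num_threads());
      }

      MPI_Finalize();
      return 0;
  }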

Libraries

  • Intel Math Kernel Library - Cluster Edition Medium Cluster License for Linux
  • FFTW (see the usage sketch after this list)
  • netCDF
  • gsl
  • CUDA for GPU applications
  • Software licenses for SuperComputing Suite 5 (SCS) for cluster management and monitoring.
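
As a sketch of how one of the libraries above is typically called from user code, the following C fragment computes a one-dimensional complex DFT with FFTW 3. The header and -lfftw3 link flag are standard FFTW conventions; the exact compiler invocation and module names on the cluster are assumptions.

  /* fft_demo.c - minimal FFTW 3 usage sketch (illustrative only).
   * Build (assumed): icc fft_demo.c -lfftw3 -o fft_demo
   */
  #include <fftw3.h>
  #include <stdio.h>

  #define N 8

  int main(void)
  {
      /* FFTW-managed, properly aligned buffers */
      fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * N);
      fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * N);

      for (int i = 0; i < N; i++) {     /* simple real-valued test signal */
          in[i][0] = (double)i;         /* real part */
          in[i][1] = 0.0;               /* imaginary part */
      }

      /* plan, execute and inspect a forward DFT */
      fftw_plan plan = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
      fftw_execute(plan);

      for (int i = 0; i < N; i++)
          printf("out[%d] = %g + %gi\n", i, out[i][0], out[i][1]);

      fftw_destroy_plan(plan);
      fftw_free(in);
      fftw_free(out);
      return 0;
  }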

The Slurm Workload Manager

Slurm is the open-source job scheduler for Linux and Unix-like kernels. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending jobs.
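
To illustrate how a running task sees the allocation Slurm has granted, the short C sketch below reads a few standard SLURM_* environment variables that the scheduler exports to each launched task. Which variables are set depends on how the job is submitted, so treat this as an illustration rather than a documented interface of this cluster.

  /* slurm_env.c - print a few of the SLURM_* environment variables that the
   * scheduler exports to each task of a running job (illustrative sketch).
   */
  #include <stdio.h>
  #include <stdlib.h>

  static void show(const char *name)
  {
      const char *value = getenv(name);        /* NULL outside a Slurm job */
      printf("%-22s = %s\n", name, value ? value : "(not set)");
  }

  int main(void)
  {
      show("SLURM_JOB_ID");        /* identifier of the allocation */
      show("SLURM_JOB_NUM_NODES"); /* number of nodes granted */
      show("SLURM_NTASKS");        /* total tasks requested */
      show("SLURM_PROCID");        /* this task's rank within the job step */
      return 0;
  }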

Analysers, Profilers and Debuggers

  • Intel Parallel Studio XE 2017 Cluster Edition contains VTune Amplifier, which offers advanced profiling capabilities within a single, friendly analysis interface, including powerful tools to tune OpenCL and GPU code
  • Intel Inspector, an easy-to-use memory and threading error debugger for parallel and distributed memory C, C++ and Fortran applications that run on Windows and Linux
  • Arm/Allinea DDT enterprise debugging software and Allinea Performance Reports.