CLIMB consortium service
The Redwood Datacentre is hosting equipment for the Medical Research Council (MRC) CLIMB consortium system.
The Cloud Infrastructure for Microbial Bioinformatics (CLIMB) is an MRC-funded project across four sites – the University of Warwick, Swansea University, Cardiff University and the University of Birmingham. The Cardiff Principal Investigator is Dr Thomas Connor from the School of Biosciences.
Research aims and requirements
The project aims to:
- create a public/private cloud for use by UK academics
- develop a set of standardised cloud images that implement key pipelines
- establish a storage repository for data that is made available online and within our system, from anywhere (‘eduroam for microbial genomics’)
- provide access to other databases from within the system.
In developing the requirements for the project, the Principal Investigators scoped out a solution distributed over the four sites and connected over the Janet academic network. Four sizes of virtual machine would be required: personal, standard, large memory, and huge memory. The solution must be able to support thousands of virtual machines simultaneously, with four petabytes of object storage across the four sites and 300 terabytes of local high performance storage per site.
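As a rough back-of-envelope check on these figures, the storage totals can be sketched as below. This is an illustration only: it assumes decimal units (1 petabyte = 1000 terabytes) and, purely for the sake of the arithmetic, that the four petabytes of object storage are split evenly across the sites. Neither assumption comes from the project requirements, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1000 TB


@dataclass(frozen=True)
class StoragePlan:
    """Hypothetical container for the headline storage figures."""
    sites: int
    object_storage_total_pb: float    # object storage across all sites
    local_storage_per_site_tb: float  # high performance storage at each site

    def local_storage_total_tb(self) -> float:
        # Local storage is specified per site, so scale by the site count.
        return self.sites * self.local_storage_per_site_tb

    def object_storage_per_site_tb(self) -> float:
        # Assumption: an even split across sites (not stated in the text).
        return self.object_storage_total_pb * TB_PER_PB / self.sites

    def total_tb(self) -> float:
        # Combined object and local storage across the whole system.
        return (self.object_storage_total_pb * TB_PER_PB
                + self.local_storage_total_tb())


plan = StoragePlan(sites=4,
                   object_storage_total_pb=4,
                   local_storage_per_site_tb=300)

print(f"Local storage, all sites: {plan.local_storage_total_tb():.0f} TB")
print(f"Object storage per site:  {plan.object_storage_per_site_tb():.0f} TB")
print(f"Total storage:            {plan.total_tb():.0f} TB")
```

Under these assumptions the four sites together hold 1,200 terabytes of local high performance storage in addition to the four petabytes of object storage.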
Both ARCCA and Portfolio Management and IT Services have been involved in the planning and installation stages of the project. The tender was released in Q3 2014 via a framework procurement rather than an open procurement. The service has different constraints to those traditionally encountered on an HPC supercomputer, not least the requirement to support a wide variety of operating systems and environments through a virtual infrastructure, with no job schedulers to allocate the computing resources.
Given the distributed nature of the deployment and the cutting-edge design of this solution, there are a number of practical considerations where the project may require ARCCA support. The sites of the two Principal Investigators were awarded technical support, with the aim of providing distributed support to the other sites. This may well work once the system is commissioned and in a stable production state, but during implementation and design it was strongly recommended that local technical staff assist in deploying software and in testing networks, performance and so on. Furthermore, given the novel approach of this proposal, expertise is limited across the entire community, so there will be a steep learning curve for the four sites as well as for the technology providers.
Pooling all expertise is essential to ensuring this project succeeds, particularly as it requires an understanding of clusters and storage systems that are not configured for traditional HPC services. The cutting-edge approach means the design and performance aspects are not as well defined as those for a traditional supercomputer, particularly as some of the configuration needs to remain flexible to adapt to situations that may arise as the service is tested and developed.
With the final design in place in late 2014, a preliminary site survey was undertaken in January 2015. To accommodate the increased power demand, the CLIMB consortium funded the fourth power supply, which was installed during the datacentre refurbishment works in December. With the final rack layout agreed, including power and networking designs, the systems were delivered to Cardiff and installed in early March 2015. Cardiff was the first site to take delivery of systems from this procurement (Birmingham went out to tender separately to provide a pilot evaluation set-up to inform the CLIMB tender criteria). The Cardiff system has successfully passed the associated acceptance tests, and the service is now operational.
The CLIMB infrastructure will also enable ARCCA to gain greater awareness of new and emerging solutions, helping to define future service directions and aspects of support for diverse community requirements.