EIDF now houses the first Cerebras CS-1 system in Europe
The Edinburgh International Data Facility (EIDF) brings together regional, national and international datasets to create new products, services, and research, and is an integral part of the Data Driven Innovation Programme of the Edinburgh and South-East Scotland City Region Deal. However, whilst we are blessed with massive amounts of data and the ambition to exploit it, without the compute power to process that data the value extracted will fall short of its potential. Whilst the EIDF contains some of the most powerful hardware for data processing, it is also important to consider the cutting-edge novel hardware that has grown up around Data Science and AI.
One such highly innovative system is the Cerebras CS-1, and the EIDF now houses the first such system in Europe. Designed to provide a step change in data processing capabilities, the CS-1 is built around the world’s largest processor, the Wafer Scale Engine (WSE). Whilst commodity CPU dies tend to be around the size of a postage stamp (although the packaging is often larger to fit in all the connections), the WSE is around the size of a dinner plate. The best way to emphasise what a difference this makes is in terms of numbers. For comparison, ARCHER2, the UK’s national supercomputer, has 64 cores per CPU, whereas the WSE in the CS-1 has around 400,000 cores! This is combined with around 18 GB of very fast on-chip memory and over 20,000 times more bandwidth than the latest generation of GPUs. Put simply, the WSE at the heart of the CS-1 is a behemoth, delivering thousands of times more performance than legacy alternatives for AI workloads, and at a fraction of the power draw and physical space.
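To put the core counts quoted above side by side, the short Python sketch below works out how many 64-core CPUs it would take to match the WSE's core count. It uses only the two figures given in the text; the function name is ours, and the result compares raw core counts only, which says nothing about how capable each individual core is.

```python
# Figures quoted in the article; the WSE count is "around 400,000",
# so the result should be read as approximate.
ARCHER2_CORES_PER_CPU = 64
WSE_CORES = 400_000


def cpus_to_match_core_count(accelerator_cores: int, cores_per_cpu: int) -> float:
    """Return how many CPUs have, in total, the same number of cores."""
    return accelerator_cores / cores_per_cpu


ratio = cpus_to_match_core_count(WSE_CORES, ARCHER2_CORES_PER_CPU)
print(f"One WSE has roughly as many cores as {ratio:,.0f} such CPUs")
```

On these figures the ratio is 6,250, which gives a sense of the scale involved even though a WSE core and a CPU core are not directly comparable.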
Of course, the CS-1 is a bit like a racing car: without a talented driver it is nothing, and that is where the benefits of the wider EIDF come in. Connecting the compute performance of the CS-1 with the rest of the EIDF means that one gains the benefits of massive high-performance data storage, high-speed connections to and from the CS-1, and the familiar, well-supported environment of the EIDF. Whilst programming 400,000 cores might sound like a daunting proposition, Cerebras have done an excellent job with their software tooling stack by supporting common existing machine learning frameworks, including TensorFlow and PyTorch. It really is very convenient: a data scientist takes their existing ML workload and, if it already uses a Cerebras-supported framework, they are good to go; otherwise they must first convert it to one of these common ML tools. The underlying Cerebras technology then takes care of all the details around exactly which parts of the model should run where on the chip in order to make the most effective use of all those compute cores, memory, and on-chip bandwidth. This means that no specific knowledge or experience of the hardware is required on the part of the data scientist to obtain extremely high performance for these ML tasks. The experience is actually rather similar to that on CPUs and GPUs, where the data scientist focusses entirely on interrogating their data, only on the CS-1 the answers tend to arrive far more quickly!
Given the benefits of the CS-1 described above, it is little wonder that, since it became operational in May 2021, a variety of academic and industrial projects have been busy running on the hardware and reaping the rewards. One of the fascinating aspects has been the breadth of areas using the CS-1, including genetics, physics, climate change modelling, and robotics. This range is set to grow further and, irrespective of the details of the problem, users in all of these areas have been excited about the ability to unlock new AI capabilities for their data sets.
The CS-1 is available to pilot and long-term projects, and we are keen for data scientists to experiment with this exciting hardware and to understand its suitability for their workloads. As EIDF and the DDI programme develop over the coming years, access to the CS-1 will be crucial in delivering the goals and ambitions of the programme, both for the Edinburgh and South-East Scotland City Region and for the nation as a whole.
For further information about the system and to request access please contact: EIDF@epcc.ed.ac.uk