CSC supercomputer ranked among the world’s fastest

Posted by Julia Werner

- NeIC web

Building a bridge to the LUMI supercomputer

Written by Arne Vollertsen

Invisible helper, gatekeeper, and accountant rolled into one: The Puhuri project aims at making it easier for researchers to access and use the new pre-exascale LUMI supercomputer.

Supercomputers are the race cars of research. But while Formula 1 cars can be piloted only by a select few, supercomputers should be used as much and by as many researchers as possible, to get the most out of our investment in this expensive machinery.

To achieve that, you need a technical platform that makes access and usage easy and manageable. This is where Puhuri comes in. Named after the spirit of cold and winter in Finnish mythology, Puhuri is building a digital bridge between LUMI and its users.

A valuable resource

Supercomputers are expensive machines, and the new LUMI machine, which will be taken into production in spring 2021, is no exception. Located in Kajaani, Finland, and with a total budget of 207 million euros, it is a valuable resource indeed. Furthermore, like other high-performance computers, it has a fairly short lifespan of between 4 and 6 years.

So, no wonder the 9 countries participating in the LUMI consortium are eager to achieve the best possible return on their investment. The Puhuri project is designed to do just that, playing multiple roles, serving as invisible helper, gatekeeper, and accountant as well.

European project

LUMI is part of the EuroHPC Joint Undertaking, a 1 billion euro endeavour to build a European infrastructure of next-generation high-performance computers, consisting of 8 hosting sites, with 5 petascale systems and 3 pre-exascale systems.

The pre-exascale systems are located in Bologna, Italy; Barcelona, Spain; and Kajaani, Finland. Each of the three will be about ten times more powerful than the most powerful supercomputer currently in Europe.

Researchers in many different fields rely on supercomputers. Climate scientists run complex Earth System Models on them to predict long-term effects of climate change. Engineers use supercomputers to optimize the design and structure of airplane wings. Linguists rely on heavy computation as well, for example to develop tools that let computers process human speech and writing, for machine translation and speech recognition.

In high demand

LUMI’s superpowers are in high demand. Not only the 9 LUMI countries will be using the machine, but also researchers from other countries in Europe and across the world. Furthermore, European industry and SMEs are invited to use LUMI’s superpowers as well.

This means the system needs to be able to handle many different groups of users, enabling them to use LUMI in an easy and secure fashion, and moreover helping them to manage the processing power put at their disposal. Puhuri will develop middleware to bridge the gap between LUMI and various national portals for access and resource allocation.

Gatekeeping

Obviously, gatekeeping is important. You have to be sure that only authorized people can access LUMI. For that you need an identity management solution that lets users identify themselves and lets the system authorize their access. Such Authentication and Authorization Infrastructures (AAIs) already exist at the national level and for specific research communities. Now Puhuri is building a generic AAI, functioning as a middle layer between LUMI and existing national and community-specific portals. This means that, instead of needing a separate password to log in to LUMI, a researcher can access it using his or her home organisation login and password.

Applications and accounting

In addition to gatekeeping, resource allocation management and accounting are equally important. Researchers who want to use the LUMI machine do not simply queue up and wait in line, first come, first served.

They have to write an application describing their research project and the computing resources they need (GPUs, CPUs and storage), and submit it to the LUMI resource management board. The board then distributes LUMI’s computing resources, based on a peer-review process and on the share of resources assigned to each LUMI country.

Divided into shares

LUMI is divided into “shares”. Half of its resources belong to the EuroHPC Joint Undertaking; the other half belongs to the LUMI Consortium countries: Belgium, Czech Republic, Denmark, Estonia, Finland, Norway, Poland, Sweden, and Switzerland. As mentioned, industry and SMEs can apply for resources, and LUMI will also have a channel for urgent computing, for time-critical tasks related to, for instance, national security or pandemics.
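To make the split concrete, here is a minimal sketch of the arithmetic in Python. The article only states the half-and-half division and the list of countries; the size of each national share is not given here, so the equal weights, the `split_shares` function and the figure of 1000 units are purely hypothetical placeholders.

```python
from fractions import Fraction

# Consortium countries as listed in the article.
CONSORTIUM = [
    "Belgium", "Czech Republic", "Denmark", "Estonia", "Finland",
    "Norway", "Poland", "Sweden", "Switzerland",
]


def split_shares(total_units, weights):
    """Give half of total_units to EuroHPC JU and divide the other half
    among the consortium countries in proportion to their weights.

    The weights are an assumption: the real national shares are not
    stated in the article.
    """
    total = Fraction(total_units)
    eurohpc = total / 2
    consortium_pool = total - eurohpc
    weight_sum = sum(weights.values())
    national = {
        country: consortium_pool * Fraction(weight) / weight_sum
        for country, weight in weights.items()
    }
    return eurohpc, national


# Hypothetical example: 1000 resource units and equal weights.
equal_weights = {country: 1 for country in CONSORTIUM}
eurohpc, national = split_shares(1000, equal_weights)
print("EuroHPC JU:", eurohpc)
for country, units in national.items():
    print(f"{country}: {float(units):.1f}")
```

Exact fractions are used so that the national shares always add up to exactly half of the total, whatever weights are plugged in.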

Currently, the LUMI partners have their own national portals for applying for computing resources. Puhuri will develop middleware that connects them to LUMI, and will also provide an interface to an accounting and billing service, showing users, for instance, which services they have used and how many billing units remain. Again, the aim is to ease access and interaction between users and the computing resources put at their disposal.
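The kind of bookkeeping described above can be pictured with a small Python sketch. It is not Puhuri's actual data model or API: the `ProjectAccount` class, its fields and the service names are hypothetical, chosen only to show how "services used" and "billing units remaining" relate.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectAccount:
    """Hypothetical per-project ledger of billing units."""

    allocated_units: float
    # Maps a service name to the billing units consumed so far.
    usage: dict = field(default_factory=dict)

    def remaining(self):
        """Billing units still available to the project."""
        return self.allocated_units - sum(self.usage.values())

    def record(self, service, units):
        """Charge `units` to `service`, refusing to overdraw the allocation."""
        if units < 0:
            raise ValueError("units must be non-negative")
        if units > self.remaining():
            raise ValueError("charge exceeds remaining allocation")
        self.usage[service] = self.usage.get(service, 0.0) + units


# Hypothetical usage: a project with 100 billing units.
account = ProjectAccount(allocated_units=100.0)
account.record("gpu", 30.0)
account.record("storage", 20.0)
account.record("gpu", 10.0)
print(account.usage)
print(account.remaining())
```

Refusing a charge that exceeds the remaining units is one possible policy; a real service might instead allow overdraft and flag it.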

Generic solution

Generic – that is one of the key words for what Puhuri wants to achieve in its two-year project run. It is building a generic AAI solution, meaning that it is meant to work not only on LUMI, but on other HPC resources as well. The Puhuri ambition is to develop a digital authorisation and resource allocation platform for accessing HPC resources that can be used by other similar initiatives, instead of them having to invent an ad-hoc solution.

In short, Puhuri, invisible helper, gatekeeper, and accountant rolled into one, is positioning itself as the middleman of the future when it comes to seamless access to supercomputing for researchers.

Lumi-supercomputer/LUMI-EasyBuild-contrib

This is a repository for contributed EasyConfig files that LUMI users can install at their own discretion or use as a starting base for their own build recipes.

Recipes in this repository are not installed centrally on LUMI, but can be installed by users using the EasyBuild-user module, or adapted to their own needs.

These build recipes are not as carefully tested as the ones that are centrally installed and part of the main LUMI-SoftwareStack production repository. We do not always port them to newer versions of the software or of the LUMI software stack, but they can be a great source of inspiration for users writing their own build recipes.

The structure of the repository follows the standard structure used by EasyBuild, so it should be compatible with the GitHub management features of EasyBuild.

CSC supercomputer ranked among the world’s fastest

EuroHPC Joint Undertaking’s LUMI supercomputer reached the third spot on the latest Top500 list of the world’s fastest supercomputers, released at the ISC22 conference in Hamburg, Germany, on 30 May 2022. LUMI reached a measured High-Performance Linpack (HPL) performance of 151.9 petaflops. This makes LUMI the fastest supercomputer in Europe.

LUMI is a unique endeavour thanks to its scale, sustainability and pan-European nature, hosted by a consortium of ten countries.

– The EuroHPC JU is extremely proud to see its first pre-exascale supercomputer reach the third position on the Top500 list. This is the latest tangible success for the EuroHPC initiative and validates the work that the JU is doing to achieve its mission of developing a world-class supercomputing infrastructure in Europe. To see LUMI so well ranked is an incredible result which shows the importance of European collaboration and the impact we can achieve when we work together, says Anders Dam Jensen, Executive Director of the EuroHPC Joint Undertaking.

– The LUMI consortium is excited about LUMI’s Top500 ranking and is looking forward to seeing LUMI’s remarkable computing power being used for RDI efforts with major societal impact. In addition to being an enabler of scientific breakthroughs and industrial innovation, LUMI will also be a platform for research collaboration allowing for mutual learning and competence development as well as for developing emerging technologies, such as AI and quantum. All these factors together make LUMI a key instrument for Europe’s future success and strategic autonomy, says Kimmo Koski, Managing Director of CSC – IT Center for Science, Finland, on behalf of the LUMI consortium.

– LUMI is not only a very powerful supercomputer, it is also an exceptionally green one. It runs on 100% renewable hydroelectricity, uses free cooling and has an advanced waste heat utilisation system with its waste heat being used for local district heating. This makes LUMI’s contribution to Europe’s future all the more remarkable, Koski continues.

The LUMI system is supplied by Hewlett Packard Enterprise (HPE), based on an HPE Cray EX supercomputer.

– The LUMI supercomputer represents a landmark achievement for Europe that will unlock innovation in critical areas such as drug discovery, healthcare, weather forecasting, and AI-driven initiatives, and power economic growth for European nations and corporations, says Justin Hotard, executive vice president and general manager for HPC & AI, HPE.

– It is an honor to closely collaborate with EuroHPC JU, CSC, and AMD to support this mission, and to build the system with HPE Cray EX supercomputers, which deliver next-generation supercomputing and AI capabilities. We celebrate this significant performance milestone for LUMI and look forward to many more, he continues.

LUMI’s second pilot phase for selected users will start in August 2022, and the system will become generally available for users in late September 2022.

– We are thrilled to see LUMI among the top systems of the Top500 list – it’s been a while since a supercomputer in Europe has reached such positions. Critical technologies such as HPC play an ever more important role, and interest towards Europe is increasing. We are eager to see LUMI in its full glory later this year and look forward to a golden era of data-driven scientific discovery in Europe, enabled by LUMI and other EuroHPC systems, says Pekka Manninen, Director of LUMI leadership computing facility, CSC – IT Center for Science, Finland.

LUMI’s GPU partition, called LUMI-G, was not yet fully installed for this edition of the Top500 list. The HPL performance is expected to grow to up to 375 petaflops during June 2022.

On the new HPCG (High-Performance Conjugate Gradient) list LUMI was also ranked number 3. The HPCG benchmark provides an alternative metric for assessing supercomputer performance and is meant to complement the HPL measurement.

LUMI’s architecture

The full system architecture of LUMI is the following:

The GPU partition consists of 2560 nodes, each node with one 64-core AMD Trento CPU and four AMD MI250X GPUs.

Each GPU node features four 200 Gbit/s network interconnect cards, giving it 800 Gbit/s of injection bandwidth.

The committed Linpack performance of LUMI-G in its final configuration is 375 Pflop/s.

The MI250X GPU comes with a total of 128 GB of HBM2e memory offering over 3.2 TB/s of memory bandwidth.

A single MI250X card is capable of delivering 42.2 TFLOP/s of performance in the HPL benchmarks. More in-depth performance results for the card can be found on AMD’s website.
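The figures above can be combined in a few lines of Python. This is only a back-of-the-envelope sketch: the constants are taken from the text, while the "naive sum" simply multiplies the single-card HPL number by the card count and ignores all scaling losses, so it is an illustrative upper bound rather than a performance prediction.

```python
# Figures quoted in the article for the LUMI-G partition.
GPU_NODES = 2560
GPUS_PER_NODE = 4
NICS_PER_NODE = 4
NIC_GBIT_PER_S = 200
HBM_GB_PER_GPU = 128
HPL_TFLOPS_PER_CARD = 42.2
COMMITTED_HPL_PFLOPS = 375.0

total_gpus = GPU_NODES * GPUS_PER_NODE               # 10240 cards
injection_gbit_per_s = NICS_PER_NODE * NIC_GBIT_PER_S  # 800 Gbit/s per node
total_hbm_gb = total_gpus * HBM_GB_PER_GPU           # aggregate GPU memory

# Naive sum: every card delivers its single-card HPL figure (assumption).
naive_hpl_pflops = total_gpus * HPL_TFLOPS_PER_CARD / 1000
committed_ratio = COMMITTED_HPL_PFLOPS / naive_hpl_pflops

print(f"GPUs in total:            {total_gpus}")
print(f"Injection bandwidth/node: {injection_gbit_per_s} Gbit/s")
print(f"Aggregate HBM2e:          {total_hbm_gb / 1e6:.2f} PB")
print(f"Naive HPL sum:            {naive_hpl_pflops:.1f} Pflop/s")
print(f"Committed / naive sum:    {committed_ratio:.0%}")
```

Under these assumptions the committed 375 Pflop/s corresponds to somewhat under nine tenths of the naive per-card sum.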

In addition to the GPU partition, LUMI has another partition (LUMI-C) of CPU-only nodes, featuring 64-core 3rd-generation AMD EPYC™ CPUs and between 256 GB and 1024 GB of memory. There are 1,536 dual-socket CPU nodes in total.

LUMI also has a partition with large memory nodes, with a total of 32 TB of memory in the partition.

For visualization workloads LUMI has 64 Nvidia A40 GPUs.

LUMI’s storage system, based on the Cray ClusterStor E1000 storage system from HPE, consists of three components. First, there is an 8 petabyte all-flash Lustre system for short-term fast access. Next, there is a longer-term, more traditional 80 petabyte Lustre system based on mechanical hard drives.

Finally, for easy data sharing and project lifetime storage, LUMI has 30 petabytes of Ceph-based storage.
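As a quick tally, the three storage tiers can be summed in Python. The capacities come from the text; the dictionary keys are just illustrative labels.

```python
# Storage capacities in petabytes, as quoted in the article.
STORAGE_TIERS_PB = {
    "flash_lustre": 8,   # all-flash Lustre, short-term fast access
    "disk_lustre": 80,   # hard-drive Lustre, longer-term
    "ceph": 30,          # Ceph, data sharing and project lifetime storage
}

total_pb = sum(STORAGE_TIERS_PB.values())
flash_fraction = STORAGE_TIERS_PB["flash_lustre"] / total_pb

print(f"Total storage: {total_pb} PB")
print(f"Flash tier share of total: {flash_fraction:.1%}")
```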

LUMI also has an OpenShift/Kubernetes container cloud platform for running microservices.

All the different compute and storage partitions are connected by the HPE Slingshot interconnect, which runs at 200 Gbit/s.

LUMI takes up nearly 300 m² of space, which is about the size of two tennis courts. The weight of the system is nearly 150 000 kilograms (150 metric tons).

Inauguration in June

LUMI’s inauguration event will take place on Monday 13 June 2022 in Kajaani, Finland. The event can be followed via live stream. Please follow LUMI’s website and social media channels for further information. There will also be a press conference during the inauguration. Media representatives are asked to register for accreditation to the press conference via the link below.

This article was first published on 30 May by CSC.
