Periodic Reporting for period 1 - EUPEX (EUROPEAN PILOT FOR EXASCALE)
Reporting period: 2022-01-01 to 2022-09-30
The prototype will be designed to be open, scalable and flexible, including the modular OpenSequana-compliant platform and the corresponding HPC software ecosystem for the Modular Supercomputing Architecture.
Scientifically, it is a vehicle to prepare HPC, AI, and Big Data processing communities for upcoming European Exascale systems and technologies.
The hardware platform is sized to be large enough for relevant application preparation and scalability forecasting, and to serve as a proof of concept for a modular architecture relying on European technologies in general and on European Processor Initiative (EPI) technology in particular. In this context, a strong emphasis is put on the system software stack and the applications.
Being the first of its kind, EUPEX sets the ambitious challenge of gathering, distilling and integrating European technologies that the scientific and industrial partners use to build a production-grade prototype.
EUPEX will lay the foundations for Europe's future digital sovereignty. It has the potential for the creation of a sustainable European scientific and industrial HPC ecosystem and should stimulate science and technology more than any national strategy.
The consortium – made up of key actors on the European HPC scene – has the capacity and the will to make a fundamental contribution to the consolidation of the European supercomputing ecosystem. EUPEX aims to directly support an emerging and vibrant European entrepreneurial ecosystem in AI and Big Data processing that will leverage HPC as a main enabling technology.
The project has the following objectives:
1. Co-design a modular Exascale-pilot system. A set of key applications has been selected to identify the requirements that the pilot’s hardware and software must address.
2. Build and deploy a pilot hardware and software platform integrating European technology. The SiPearl Rhea chip, the Modular Supercomputing Architecture (MSA), and Bull SequanaX will be used as the basis of a system in which all hardware and software components will be selected whenever possible from technologies developed in Europe.
3. Demonstrate the readiness and the scalability of the pilot technology in general and the MSA in particular, towards Exascale. The pilot system will not be Exaflop capable itself, but its capability to scale to this level will be determined. Hardware and software readiness shall enable a Go-to-Market path and maximize the chances that a major part of these European IPs and outcomes become globally competitive and are used in the main European systems by Horizon 2024.
4. Prepare applications and European users to efficiently exploit the future Exascale machines. The co-design applications will be adapted and optimized to efficiently exploit the capabilities and modularity of Exascale-ready systems. Access to the pilot platform will also be offered to external developers, e.g. CoEs, to enlarge the portfolio of Exascale-ready applications and to foster acceptance by EU users of the underlying technologies needed for Exascale systems.
- WP3: has analyzed the applications and associated workflows participating in the project, selected the best architectural metrics to analyze the workload characteristics of the applications, and launched the co-design, including GPU technology. For application support, several in-kind systems used as SDVs are accessible, notably the IRENE system at the CEA/GENCI facility. The structure of D3.1 has been defined.
- WP4: has provided preliminary architectures to help crystallize the software and application requirements. Definition of the OpenSequana platform is ongoing. A first version of the modular platform and a preliminary OpenSequana specification have been delivered.
- WP5: has worked on getting an overview of the software ecosystem to be deployed on the pilot system, via presentations and discussions covering all components at the management stack, execution environment, and performance and energy efficiency levels. As a result, the deployment roadmap of the collaboration platform has been defined, and the WP has contributed to the co-design process by providing the requirements of the execution environment to the system architecture.
- WP6: has provided the partners of WP3 and WP5 with access to the early in-kind systems, along with the associated ticketing. Joint webinars presenting the in-kind resources have been held, and an additional description is available in the EUPEX wiki. The partners ensure first-level support for those systems and carry out maintenance. Around 35 accounts have been created on the in-kind systems. D6.1 has been submitted.
In addition:
- WP2: has set up the project's website and social media channels. These are complemented by virtual conferences, events and workshops, for which numerous presentations and publications have been produced. Joint communication with other EuroHPC projects has been enhanced. D2.1, D2.2 and D3.3 have been submitted.
- WP1: The project management team continuously monitors the status of the work to guarantee that it progresses according to the Description of Action, and to identify any risk that could endanger its execution. The management bodies are established, as is the legal framework that regulates the interactions between the partners: the Consortium Agreement. Tools for intra-project communication are in place. For quality control, each deliverable follows an internal review process before it is submitted. For financial control, each partner sends a quarterly financial report to the project management team detailing all expenses as well as actual and estimated person-months (PMs) per WP. Collaboration with EUPILOT and EPI is ongoing, and the Collaboration Agreement is being prepared for signature by all Parties. D1.1, D1.2 and D1.6 have been submitted.
The project will address two fundamental concerns in current large-scale HPC computing systems design:
- a consistent co-design approach supported by a selected number of applications relevant to their respective scientific or engineering domains
- consistent mechanisms to keep the system at the top level of performance for the whole workload while managing energy and power consumption
To address these concerns, EUPEX will develop:
- The first modular HPC system extensively relying on European hardware and software technologies
- The first HPC system co-designed from the ground up, integrating the first-generation EPI processor with tools and utilities specifically designed to enable the system to achieve top performance while keeping energy and power consumption under control
- A co-design approach that takes into account the above, together with the constraints of the data centre where the system will be installed and the requirements of selected applications used in diverse scientific and engineering domains