
MAchinE Learning for Scalable meTeoROlogy and cliMate

Periodic Reporting for period 1 - MAELSTROM (MAchinE Learning for Scalable meTeoROlogy and cliMate)

Reporting period: 2021-04-01 to 2022-09-30

To develop Europe’s computer architecture of the future, MAELSTROM will co-design three components: bespoke compute system designs for optimal application performance and energy efficiency, a software framework to optimise usability and training efficiency for machine learning at scale, and large-scale machine learning applications for the domain of weather and climate science.

For the MAELSTROM compute system designs, machine learning applications are tested across a range of hardware configurations with regard to energy consumption, time-to-solution, numerical precision and solution accuracy. The resulting customised compute systems are optimised for application needs. They are intended to strengthen Europe’s high-performance computing portfolio and to pull recent hardware developments, which are driven by general machine learning applications, toward the needs of weather and climate applications.

The MAELSTROM software framework enables scientists to apply and compare machine learning tools and libraries efficiently across a wide range of computer systems. A user interface will link application developers with compute system designers, and automated benchmarking and error detection of machine learning solutions will be performed during the development phase. Tools will be published as open source.

The MAELSTROM machine learning applications cover all important components of the weather and climate prediction workflow, including the processing of observations, the assimilation of observations to generate initial and reference conditions, model simulations, the post-processing of model data and the development of forecast products. For each application, benchmark datasets with up to 10 terabytes of data are published online for training and for the development of machine learning tools. MAELSTROM machine learning solutions serve as a blueprint for a wide range of future machine learning applications on supercomputers.
In the first half of the project, which is covered by this report, MAELSTROM has already closed the co-design cycle once (from dataset development in D1.1, to software development in D2.2, to hardware benchmarks in D3.4, to machine learning application development in D1.4) and achieved the following:

- The first wave of deliverables consisted of surveys that outlined the state of the art for machine learning applications (D1.2), software (D2.1) and hardware (D3.1 and D3.2).
- MAELSTROM datasets already comprise 16 TB of data, which are documented, published and available for download via the internet.
- The development of MAELSTROM software tools has progressed as planned, and the tools will soon be usable for all MAELSTROM applications.
- The first hardware benchmarks for the MAELSTROM applications have been performed, and the results have been reported and fed back to the application designers.
- The 1st MAELSTROM Dissemination Workshop was held on 28th March, back-to-back with the Machine Learning Workshop at ECMWF from 29th March to 1st April. The dissemination workshop attracted 208 registered participants.
- The 1st MAELSTROM hackathon (called MAELSTROM Bootcamp) was very successful, with more than 30 participants and 16 scientific advisors meeting from 27th to 30th September at JSC.
- MAELSTROM scientists have already given more than 30 presentations and published 13 papers.
- We have designed a project webpage that provides access to all important information on the project: https://www.maelstrom-eurohpc.eu/.
- MAELSTROM applications are already used by industry as reference benchmarks for HPC performance and machine learning application quality, including companies such as GRAPHCORE, Microsoft and ATOS.
MAELSTROM has contributed significantly to an improved understanding of how to design machine learning applications that are customised for the weather and climate (W&C) domain. Plenty of synergies have been created between the applications, for example between the downscaling application and the application investigating the use of crowd-sourced data. In the long term, this will allow for the design of improved weather and climate prediction systems.

MAELSTROM has also established benchmark datasets and problems that describe machine learning workloads for the HPC community, as well as new software tools that facilitate the training and inference of machine learning tools. These datasets and tools will be used for years to come and will make the W&C community more visible when the next generation of supercomputers is designed. Finally, MAELSTROM has already generated insight into the main performance bottlenecks of W&C machine learning applications on state-of-the-art compute system designs.
MAELSTROM co-design cycle