
Pilot using Independent Local & Open Technologies

Deliverables

Collaboration roadmap and collaboration agreement with EUPEX

D3.5. Collaboration roadmap and collaboration agreement with EUPEX. EUPEX aims at delivering a large-scale modular demonstrator based on the ARM-based general-purpose processor design under development in EPI. In contrast, the European PILOT will deliver a demonstrator based on the RISC-V accelerators in EPI. The European PILOT output could be integrated as an additional module into the EUPEX modular supercomputer. For this reason, we will define a collaboration roadmap between the two pilots to ensure the integration of the two projects into a global framework. A joint Collaboration Agreement will be signed to that effect.

Parallel Programming Runtimes specifications

D7.1. Parallel Programming Runtimes specifications (BSC, R, PU) [M6]. This deliverable will define the functionalities and interfaces that will have to be integrated in the Pilot. Beyond the basic MPI and OpenMP support based on MPICH and the LLVM OpenMP runtime, it will include the TAMPI interface for improved interoperability between MPI and OpenMP, resulting in more productive mechanisms to achieve communication-computation overlap. It will also include the DLB interfaces to dynamically reassign cores between OpenMP threads in different processes. The document will specify the fine-grain resource management policies to be implemented by these runtimes within the processes and at the node level, as well as the vertical interface to the coarser-grain schedulers in WP5. It will also specify the optimizations to be implemented in the internals of the runtime, like vectorization and offloading to communication devices, as well as mechanisms to be used to minimize the impact of noise (OS, communications) on performance.

Design of AI frameworks for the Pilot platform

D6.1. Design of AI frameworks for the Pilot platform (ETH, R, PU) [M6]. This deliverable will present the design of the AI frameworks (ONNX/DaCe, TensorFlow, Tarantella) for accelerated (ONNX/DaCe, TF) and distributed (Tarantella) learning, taking into account the requirements of the respective WP1 verticals.

Dissemination and Communication Plan

D2.1. Dissemination and Communication Plan (BSC, R, PU) [M3]. This deliverable will set out the dissemination and communication strategy and the activities to be undertaken to achieve it. Results of the dissemination work will be reported in the periodic and final reports.

Project Management and Quality Guidelines
Compilation and Emulation infrastructure

D9.1. Compilation and Emulation infrastructure (BSC, O, PU) [M9]. This deliverable will provide an updated version of the EPI compilation and emulation infrastructure (Vehave), extended to support v1.0 of the RISC-V ISA. It will support C/C++ and will include automatic vectorization capabilities.

Publications

A Heterogeneous In-Memory Computing Cluster for Flexible End-to-End Inference of Real-World Deep Neural Networks

Author(s): Angelo Garofalo; Geethan Karunaratne; Francesco Conti; Davide Rossi; Irem Boybat; Gianmarco Ottavi; Luca Benini
Published in: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, Issue 1, 2022, ISSN 2156-3357
Publisher: IEEE Circuits and Systems Society
DOI: 10.1109/jetcas.2022.3170152

Scalable Hierarchical Instruction Cache for Ultralow-Power Processors Clusters

Author(s): Jie Chen; Igor Loi; Eric Flamand; Giuseppe Tagliavini; Luca Benini; Davide Rossi
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 31 (4), Issue 8, 2023, ISSN 1557-9999
Publisher: IEEE
DOI: 10.1109/tvlsi.2022.3228336

Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode

Author(s): Gianmarco Ottavi; Angelo Garofalo; Giuseppe Tagliavini; Francesco Conti; Alfio Di Mauro; Luca Benini; Davide Rossi
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, 70 (6), Issue 8, 2023, ISSN 1558-0806
Publisher: IEEE
DOI: 10.1109/tcsi.2023.3254810

Darkside: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training

Author(s): Angelo Garofalo; Yvan Tortorella; Matteo Perotti; Luca Valente; Alessandro Nadalini; Luca Benini; Davide Rossi; Francesco Conti
Published in: IEEE Open Journal of the Solid-State Circuits Society, Issue 1, 2022, ISSN 2644-1349
Publisher: IEEE
DOI: 10.1109/ojsscs.2022.3210082

STen: An Interface for Efficient Sparsity in PyTorch

Author(s): A. Ivanov, N. Dryden, T. Hoefler
Published in: Sparsity in Neural Networks workshop 2022, 2022
Publisher: ETH Zurich, Scalable Parallel Computing Laboratory
DOI: 10.48550/arxiv.2304.07613

I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication

Author(s): Niels Gleinig, Maciej Besta, Torsten Hoefler
Published in: 36th IEEE International Parallel and Distributed Processing Symposium, 2022, ISBN 978-1-6654-8106-9
Publisher: 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

The Red-Blue Pebble Game on Trees and DAGs with Large Input

Author(s): Niels Gleinig, Torsten Hoefler
Published in: Structural Information and Communication Complexity. SIROCCO 2022, Lecture Notes in Computer Science, 2022, Page(s) 135-153, ISBN 978-3-031-09992-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-09993-9_8

Lifting C Semantics for Dataflow Optimization

Author(s): Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler
Published in: ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing, 2022
Publisher: ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing
DOI: 10.1145/3524059.3532389

MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores

Author(s): Bertaccini, Luca; Paulin, Gianna; Fischer, Tim; Mach, Stefan; Benini, Luca
Published in: 2022 IEEE 29th Symposium on Computer Arithmetic (ARITH), Issue 5, 2022
Publisher: IEEE
DOI: 10.1109/arith54963.2022.00010

Model-Agnostic Federated Learning

Author(s): Gianluca Mittone; Walter Riviera; Iacopo Colonnelli; Robert Birke; Marco Aldinucci
Published in: Euro-Par 2023: Parallel Processing, Lecture Notes in Computer Science, Issue 3, 2023, Page(s) 383-396, ISBN 978-3-031-39698-4
Publisher: Springer Nature
DOI: 10.1007/978-3-031-39698-4_26

Benchmarking FedAvg and FedCurv for Image Classification Tasks

Author(s): Bruno Casella, Roberto Esposito, Carlo Cavazzoni, Marco Aldinucci
Published in: The 1st Italian Conference on Big Data and Data Science, 2022, Page(s) 99-100
Publisher: CEUR-WS
DOI: 10.48550/arxiv.2303.17942

Fast Arbitrary Precision Floating Point on FPGA

Author(s): Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler
Published in: 2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2022
Publisher: 2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
DOI: 10.1109/fccm53951.2022.9786219

Arax: A Runtime Framework for Decoupling Applications from Heterogeneous Accelerators

Author(s): Manos Pavlidakis, Stelios Mavridis, Antony Chazapis, Giorgos Vasiliadis, Angelos Bilas
Published in: SoCC '22: Proceedings of the 13th Symposium on Cloud Computing, 2022, ISBN 978-1-4503-9414-7
Publisher: Association for Computing Machinery
DOI: 10.1145/3542929.3563467

A Data-Centric Optimization Framework for Machine Learning

Author(s): Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler
Published in: ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing, 2022, ISBN 978-1-4503-9281-5
Publisher: Association for Computing Machinery
DOI: 10.1145/3524059.3532364
