CORDIS - EU research results

Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology

Deliverables

Final report on ARM-optimized Fortran compiler and mathematics libraries.

Produced by tasks T7.3.1 and T7.3.2, this deliverable will include a final report on the state of the ARM-optimized Fortran compiler and mathematics libraries.

Report on application tuning and optimization on ARM platform

Produced by task T6.4, this report includes the description of the tuning and optimisation for ARM-based platforms applied to a subset of the applications, and the results of their evaluation.

Final report on runtime support, optimization and programming productivity for compute accelerator and symbiotic cores.

Produced by tasks T7.2.4 and T7.2.5, this deliverable will contain a report on runtime support and extensions within OmpSs for the symbiotic heterogeneous cores whilst leveraging simulation models used for heterogeneous cores. It will also report on the programmability and performance of the compute accelerator explored in T4.2 whilst using the task-based approach of OmpSs.

Final report on enhancements to message passing.

Produced by tasks T7.5.1 and T7.5.4, this deliverable will provide a final report on the OmpSs communication task implementation and the TinyMPI interface with a TinyTask reference implementation, and final measurements regarding the memory copy saving techniques.

Final Report

This report will include the following: a final publishable summary of the work completed to date covering results, the conclusions and socio-economic impact of the project, a chapter on awareness and wider societal implications as well as a report on the distribution of the Community financial contribution. It will be presented in conjunction with the final report of Dissemination (D2.5).

Report on correlated and fine tuned multi-scale simulation infrastructure

Produced by task T5.7, this report will describe the final fine-tuned automated multi-scale simulation workflow(s) and summarize the results of the correlation work to be carried out through the final year of the project.

Final report on memory hierarchy investigations.

Produced by task T3.1, this deliverable shall gather the final T3.1 simulation results obtained with high-level models such as gem5, specifically exploring the energy and latency aspects. The key outcomes and recommendations shall be included in D3.8.

Final report of applications on the project test platform: performance evaluation and optimizations

This document merges the two WP6 final deliverables into a single, more comprehensive document that includes both i) an evaluation of the Dibona platform (from the applications' point of view) and ii) a demonstration of more advanced techniques (e.g. throughput-oriented (BSC), parallel-in-time (UGraz), machine learning (AVL)) applied to our applications running on our prototype.

Report on regions of interest as mini-application candidates

Produced by task T6.2, this report will include a list, description and justification of the regions of interest of the initial set of applications. Foreseen metrics include the percentage of time spent on those regions of interest with respect to total application execution time and lines of code compared to the full application.

Report on profiling and benchmarking of the initial set of applications on ARM-based HPC systems

Produced by tasks T6.1 and T6.2, this report will include the results of profiling and benchmarking of the initial set of applications on ARM-based HPC systems, such as the existing Mont-Blanc prototype and mini-clusters, using the sets of metrics and methodology defined in T6.2.

Initial report on automatic region of interest extraction and porting to OpenMP4.0-OmpSs

Produced by task T6.5, this report will include the description of the extraction process of the regions of interest defined in D6.2. It will also include the description of the strategy and issues faced during the porting of the regions of interest to OpenMP4.0-OmpSs. These codes form the initial list of mini-applications.

Final report on OpenMP and operating system adjustments and scheduling policies.

Produced by tasks T7.1.4 and T7.2.3, this deliverable will include a final evaluation of which new Linux kernel and OpenMP heterogeneous scheduling policies we found to be most suitable, including a final description of the heterogeneous devices specification.

Public project website

The public website is the subject of this deliverable. It shall be maintained and updated after the initial delivery, in order to continue providing a live view of the project.

Publications

Is Arm software ecosystem ready for HPC?

Authors: Banchelli Gracia, Fabio F.; Ruiz, Daniel; Hao Xu Lin, Ying; Mantovani, Filippo
Published in: Poster, Issue 1, 2017
Publisher: SC17

A Domain Decomposition Multilevel Preconditioner for Interpolation with Radial Basis Functions

Authors: Gundolf Haase, Dirk Martin, Patrick Schiffmann, Günter Offner
Published in: Large-Scale Scientific Computing. LSSC 2017. Lecture Notes in Computer Science, vol 10665, 2017, Page(s) 499-506
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-73441-5_55

To distribute or not to distribute: The question of load balancing for performance or energy

Authors: Esteban Stafford, Borja Pérez, Jose Luis Bosque, Ramón Beivide, Mateo Valero
Published in: Proceedings of Euro-Par 2017, 2017, Page(s) 710-722, ISBN 978-3-319-64203-1
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-64203-1_51

Towards performance portability through locality-awareness for applications using one-sided communication primitives

Authors: Zhou, Huan; Gracia, Jose
Published in: Issue 14, 2016
Publisher: International Workshop on Legacy HPC Application Migration (LHAM16)

Multi-Node Advanced Performance and Power Analysis with Paraver

Authors: Mantovani, Filippo; Calore, Enrico
Published in: Parallel Computing is Everywhere (series: Advances in Parallel Computing), Issue 7, 2018, Page(s) 723-732, ISSN 0927-5452
Publisher: IOS Press
DOI: 10.3233/978-1-61499-843-3-723

Implementation of the K-Means Algorithm on Heterogeneous Devices: A Use Case Based on an Industrial Dataset

Authors: Xu, Ying hao; Vidal-Piñol, Miquel; Arejita, Beñat; Diaz, Javier; Alvarez, Carlos; Jiménez-González, Daniel; Martorell, Xavier; Mantovani, Filippo
Published in: Parallel Computing is Everywhere (series: Advances in Parallel Computing), Issue 5, 2018, Page(s) 642-651, ISSN 0927-5452
Publisher: IOS Press
DOI: 10.3233/978-1-61499-843-3-642

TaskGenX: A Hardware-Software Proposal for Accelerating Task Parallelism

Authors: Kallia Chronaki, Marc Casas, Miquel Moreto, Jaume Bosch, Rosa M. Badia
Published in: Proceedings of ISC 2018, 2018, Page(s) 389-409, ISBN 978-3-319-92039-9
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-92040-5_20

Computational Fluid and Particle Dynamics Simulations for Respiratory System - Runtime Optimization on an Arm Cluster

Authors: Marta Garcia-Gasulla, Marc Josep-Fabrego, Beatriz Eguzkitza, Filippo Mantovani
Published in: Proceedings of the 47th International Conference on Parallel Processing Companion - ICPP '18, 2018, Page(s) 1-8, ISBN 9781-450365239
Publisher: ACM Press
DOI: 10.1145/3229710.3229736

Efficient Programming for Multicore Processor Heterogeneity: OpenMP versus OmpSs

Authors: Anastasiia Butko; Florent Bruguier; Abdoulaye Gamatié; Gilles Sassatelli
Published in: OpenSuCo 1 (ISC17), June 2017, 2017
Publisher: HAL

Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies

Authors: Isaac Sánchez Barrera, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero, Marc Casas
Published in: Proceedings of the 2018 International Conference on Supercomputing - ICS '18, 2018, Page(s) 207-217, ISBN 9781-450357838
Publisher: ACM Press
DOI: 10.1145/3205289.3205310

Extending OmpSs for OpenCL Kernel Co-Execution in Heterogeneous Systems

Authors: Borja Perez, Esteban Stafford, Jose Luis Bosque, Ramon Beivide, Sergi Mateo, Xavier Teruel, Xavier Martorell, Eduard Ayguade
Published in: 2017 29th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2017, Page(s) 1-8, ISBN 978-1-5090-1233-6
Publisher: IEEE
DOI: 10.1109/SBAC-PAD.2017.8

Characterizing and Improving the Performance of Many-Core Task-Based Parallel Programming Runtimes

Authors: Jaume Bosch, Xubin Tan, Carlos Alvarez, Daniel Jimenez-Gonzalez, Xavier Martorell, Eduard Ayguade
Published in: 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017, Page(s) 1285-1292, ISBN 978-1-5386-3408-0
Publisher: IEEE
DOI: 10.1109/IPDPSW.2017.32

How The Flang Frontend Works - Introduction to the interior of the Open-Source Fortran frontend for LLVM

Authors: Paul Osmialowski
Published in: Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC - LLVM-HPC'17, 2017, Page(s) 1-14, ISBN 9781-450355650
Publisher: ACM Press
DOI: 10.1145/3148173.3148183

Evaluation of Heterogeneous Multicore Cluster Architectures Designed for Mobile Computing

Authors: David Novo, Alejandro Nocua, Florent Bruguier, Abdoulaye Gamatie, Gilles Sassatelli
Published in: 2018 13th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2018, Page(s) 1-8, ISBN 978-1-5386-7957-9
Publisher: IEEE
DOI: 10.1109/ReCoSoC.2018.8449376

Main memory organization trade-offs with DRAM and STT-MRAM options based on gem5-NVMain simulation frameworks

Authors: Manu Komalan, Oh Hyung Rock, Matthias Hartmann, Sushil Sakhare, Christian Tenllado, Jose Ignacio Gomez, Gouri Sankar Kar, Arnaud Furnemont, Francky Catthoor, Sophiane Senni, David Novo, Abdoulaye Gamatie, Lionel Torres
Published in: 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, Page(s) 103-108, ISBN 978-3-9819263-0-9
Publisher: IEEE
DOI: 10.23919/DATE.2018.8341987

Efficient Router Bypass via Hybrid Flow Control

Authors: Ivan Perez, Enrique Vallejo, Ramon Beivide
Published in: 2018 11th International Workshop on Network on Chip Architectures (NoCArc), 2018, Page(s) 1-6, ISBN 978-1-5386-8552-5
Publisher: IEEE
DOI: 10.1109/NOCARC.2018.8541147

Stencil codes on a vector length agnostic architecture

Authors: Adrià Armejach, Helena Caminal, Juan M. Cebrian, Rekai González-Alberquilla, Chris Adeniyi-Jones, Mateo Valero, Marc Casas, Miquel Moretó
Published in: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques - PACT '18, 2018, Page(s) 1-12, ISBN 9781-450359863
Publisher: ACM Press
DOI: 10.1145/3243176.3243192

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Authors: Paul Caheny, Lluc Álvarez, Mateo Valero, Miquel Moreto, Marc Casas
Published in: Proceedings of SC18, 2018
Publisher: ACM

Filling the gap between education and industry: evidence-based methods for introducing undergraduate students to HPC

Authors: Filippo Mantovani, Fabio Banchelli
Published in: Proceedings of the Workshop on Education for High Performance Computing (EduHPC) at SC18, 2018
Publisher: SC18

Teaching HPC Systems and Parallel Programming with Small Scale Clusters of Embedded SoCs

Authors: Lluc Alvarez, Eduard Ayguade, Filippo Mantovani
Published in: Proceedings of the Workshop on Education for High Performance Computing (EduHPC) at SC18, 2018
Publisher: SC18

CATA: Criticality Aware Task Acceleration for Multicore Processors

Authors: Emilio Castillo, Miquel Moreto, Marc Casas, Lluc Alvarez, Enrique Vallejo, Kallia Chronaki, Rosa Badia, Jose Luis Bosque, Ramon Beivide, Eduard Ayguade, Jesus Labarta, Mateo Valero
Published in: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016, Page(s) 413-422, ISBN 978-1-5090-2140-6
Publisher: IEEE
DOI: 10.1109/IPDPS.2016.49

Piecewise Holistic Autotuning of Compiler and Runtime Parameters

Authors: Mihail Popov, Chadi Akel, William Jalby, Pablo De Oliveira Castro
Published in: Euro-Par 2016 Parallel Processing - 22nd International Conference, 2016, Page(s) 238-250, ISBN 978-3-319-43659-3
Publisher: Springer
DOI: 10.1007/978-3-319-43659-3_18

MUSA: a multi-level simulation approach for next-generation HPC machines

Authors: Allande, César; Moreto Planas, Miquel; Grass, Thomas; Ayguadé Parra, Eduard; Rico, Alejandro; Armejach, Adrià; Casas, Marc; Labarta, Jesús; Valero Cortés, Mateo
Published in: SC '16 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Issue 1, 2016, Page(s) Article No. 45, ISBN 978-1-4673-8815-3
Publisher: IEEE Press

TaskPoint: Sampled simulation of task-based programs

Authors: Thomas Grass, Alejandro Rico, Marc Casas, Miquel Moreto, Eduard Ayguade
Published in: 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2016, Page(s) 296-306, ISBN 978-1-5090-1953-3
Publisher: IEEE
DOI: 10.1109/ISPASS.2016.7482104

Reducing cache coherence traffic with hierarchical directory cache and NUMA-aware runtime scheduling

Authors: Caheny, Paul; Saintes, Maxime; Valero Cortés, Mateo; Casas, Marc; Moreto Planas, Miquel; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Gloaguen, Hervé
Published in: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation - PACT '16, Issue 1, 2016, Page(s) 275-286, ISBN 978-1-4503-4121-9
Publisher: ACM
DOI: 10.1145/2967938.2967962

POSTER

Authors: Valero, Mateo; Chronaki, Kallia; Ayguadé, Eduard; Casas, Marc; Rico, Alejandro; Badia, Rosa M.; Moretó, Miquel; Labarta, Jesus
Published in: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation - PACT '16, Issue 1, 2016, Page(s) 415-417, ISBN 978-1-4503-4121-9
Publisher: ACM
DOI: 10.1145/2967938.2976038

HPC Benchmarking: Problem Size Matters

Authors: Vladimir Marjanovic, Jose Gracia, Colin W. Glass
Published in: 2016 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2016, Page(s) 1-10, ISBN 978-1-5090-5218-9
Publisher: IEEE
DOI: 10.1109/PMBS.2016.006

Asynchronous Progress Design for a MPI-Based PGAS One-Sided Communication System

Authors: Huan Zhou, Jose Gracia
Published in: 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS), 2016, Page(s) 999-1006, ISBN 978-1-5090-4457-3
Publisher: IEEE
DOI: 10.1109/icpads.2016.0133

Synthetic Traffic Model of the Graph500 Communications

Authors: Pablo Fuentes, Enrique Vallejo, José Luis Bosque, Ramón Beivide, Andreea Anghel, Germán Rodríguez, Mitch Gusat, Cyriel Minkenberg
Published in: Proceedings of ICA3PP 2016: Algorithms and Architectures for Parallel Processing, 2016, Page(s) 675-683, ISBN 978-3-319-49583-5
Publisher: Springer
DOI: 10.1007/978-3-319-49583-5_52

Extending Commodity OpenFlow Switches for Large-Scale HPC Deployments

Authors: Mariano Benito, Enrique Vallejo, Ramon Beivide, Cruz Izu
Published in: 2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB), 2017, Page(s) 41-48, ISBN 978-1-5090-6354-3
Publisher: IEEE
DOI: 10.1109/HiPINEB.2017.12

Random Folded Clos Topologies for Datacenter Networks

Authors: Cristobal Camarero, Carmen Martinez, Ramon Beivide
Published in: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2017, Page(s) 193-204, ISBN 978-1-5090-4985-1
Publisher: IEEE
DOI: 10.1109/HPCA.2017.26

dist-gem5: Distributed simulation of computer clusters

Authors: Alian Mohammad, Umur Darbaz, Gabor Dozsa, Stephan Diestelhorst, Daehoon Kim, Nam Sung Kim
Published in: 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017, Page(s) 153-162, ISBN 978-1-5386-3890-3
Publisher: IEEE
DOI: 10.1109/ISPASS.2017.7975287

Crossing the architectural barrier: Evaluating representative regions of parallel HPC applications

Authors: Alexandra Ferrerón, Radhika Jagtap, Sascha Bischoff, Roxana Rusitoru
Published in: 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017, Page(s) 109-120, ISBN 978-1-5386-3890-3
Publisher: IEEE
DOI: 10.1109/ISPASS.2017.7975275

FlexVC: Flexible Virtual Channel Management in Low-Diameter Networks

Authors: Pablo Fuentes, Enrique Vallejo, Ramon Beivide, Cyriel Minkenberg, Mateo Valero
Published in: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017, Page(s) 842-854, ISBN 978-1-5386-3914-6
Publisher: IEEE
DOI: 10.1109/IPDPS.2017.110

Power monitoring on ARM-based HPC clusters: experiences from young and old

Authors: Filippo Mantovani
Published in: EECS Seminar 2017 Energy Efficient Computing Systems, 2017
Publisher: NTNU Norway

ElasticSimMATE: A fast and accurate gem5 trace-driven simulator for multicore systems

Authors: Alejandro Nocua, Florent Bruguier, Gilles Sassatelli, Abdoulaye Gamatie
Published in: 2017 12th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2017, Page(s) 1-8, ISBN 978-1-5386-3344-1
Publisher: IEEE
DOI: 10.1109/ReCoSoC.2017.8016146

Towards co-execution of massive data-parallel OpenCL kernels on CPU and Intel Xeon Phi

Authors: Raul Nozal, Borja Perez, Jose Luis Bosque
Published in: Proceedings of the 17th International Conference on Computational and Mathematical Methods in Science and Engineering, Volume 5, 2017, Page(s) 1561-1572, ISBN 978-84-617-8694-7
Publisher: CMMSE

Efficient CFD code implementation for the ARM-based Mont-Blanc architecture

Authors: G. Oyarzun, R. Borrell, A. Gorobets, F. Mantovani, A. Oliva
Published in: Future Generation Computer Systems, Issue 79, 2018, Page(s) 786-796, ISSN 0167-739X
Publisher: Elsevier BV
DOI: 10.1016/j.future.2017.09.029

A scalable synthetic traffic model of Graph500 for computer networks analysis

Authors: Pablo Fuentes, Mariano Benito, Enrique Vallejo, José Luis Bosque, Ramón Beivide, Andreea Anghel, Germán Rodríguez, Mitch Gusat, Cyriel Minkenberg, Mateo Valero
Published in: Concurrency and Computation: Practice and Experience, Issue 29/24, 2017, Page(s) e4231, ISSN 1532-0626
Publisher: John Wiley & Sons Inc.
DOI: 10.1002/cpe.4231

Performance and Power Analysis of HPC Workloads on Heterogeneous Multi-Node Clusters

Authors: Filippo Mantovani, Enrico Calore
Published in: Journal of Low Power Electronics and Applications, Issue 8/2, 2018, Page(s) 13, ISSN 2079-9268
Publisher: Multidisciplinary Digital Publishing Institute (MDPI)
DOI: 10.3390/jlpea8020013

Optimizing a RBF Interpolation Solver for Energy on Heterogeneous Systems

Authors: Patrick Schiffmann, Dirk Martin, Gundolf Haase, Günter Offner
Published in: Advances in Parallel Computing Volume 32: Parallel Computing is Everywhere, 2018, Page(s) 287-296
Publisher: IOS Press
DOI: 10.3233/978-1-61499-843-3-287

Reducing Cache Coherence Traffic with a NUMA-Aware Runtime Approach

Authors: Paul Caheny, Lluc Alvarez, Said Derradji, Mateo Valero, Miquel Moreto, Marc Casas
Published in: IEEE Transactions on Parallel and Distributed Systems, Issue 29/5, 2018, Page(s) 1174-1187, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/TPDS.2017.2787123

Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems

Authors: B. Pérez, E. Stafford, J.L. Bosque, R. Beivide, S. Mateo, X. Teruel, X. Martorell, E. Ayguadé
Published in: Journal of Parallel and Distributed Computing, Issue 125, 2019, Page(s) 45-57, ISSN 0743-7315
Publisher: Academic Press
DOI: 10.1016/j.jpdc.2018.11.001

Piecewise holistic autotuning of parallel programs with CERE

Authors: Mihail Popov, Chadi Akel, Yohan Chatelain, William Jalby, Pablo de Oliveira Castro
Published in: Concurrency and Computation: Practice and Experience, 2017, Page(s) e4190, ISSN 1532-0626
Publisher: John Wiley & Sons Inc.
DOI: 10.1002/cpe.4190

Task Scheduling Techniques for Asymmetric Multi-Core Systems

Authors: Kallia Chronaki, Alejandro Rico, Marc Casas, Miquel Moreto, Rosa M. Badia, Eduard Ayguade, Jesus Labarta, Mateo Valero
Published in: IEEE Transactions on Parallel and Distributed Systems, Issue 28/7, 2017, Page(s) 2074-2087, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/TPDS.2016.2633347

Efficiency modeling and exploration of 64-bit ARM compute nodes for exascale

Authors: J. Wanza Weloli, S. Bilavarn, M. De Vries, S. Derradji, C. Belleudy
Published in: Microprocessors and Microsystems, Issue 53, 2017, Page(s) 68-80, ISSN 0141-9331
Publisher: Elsevier BV
DOI: 10.1016/j.micpro.2017.06.019

Application Productivity and Performance Evaluation of Transparent Locality-aware One-sided Communication Primitives

Authors: Huan Zhou, José Gracia
Published in: International Journal of Networking and Computing, Issue 7/2, 2017, Page(s) 136-153, ISSN 2185-2839
Publisher: International Journal of Networking and Computing
DOI: 10.15803/ijnc.7.2_136

Interconexiones balanceadas y eficientes para supercomputadores Exascale

Authors: Fuentes Sáez, Pablo
Published in: TDR (Tesis Doctorales en Red), Issue 1, 2017
Publisher: University of Cantabria

EngineCL: Usability and Performance in Heterogeneous Computing

Authors: Nozal, Raúl; Bosque, Jose Luis; Beivide, Ramón
Published in: Issue 3, 2018
Publisher: Cornell University Library

Parallel Low Memory Footprint Eikonal Solver in Cardiovascular Applications

Authors: Daniel Ganellari, Gundolf Haase
Published in: Proceedings of the PhD Forum (posters) at ISC 2018, 2018
Publisher: ISC 2018

HPC with Unstructured Meshes on Novel Architectures

Authors: Alban Lumi, Gundolf Haase
Published in: Proceedings of the Research Paper session at ISC 2018, 2018
Publisher: ISC 2018

Enabling PAPI support for advanced performance analysis on ThunderX SoC

Authors: Ruiz, Daniel; Calore, Enrico; Mantovani, Filippo
Published in: UPCommons, Issue 1, 2017
Publisher: UPC

Patterns for OpenMP Task Data Dependency Overhead Measurements

Authors: Joseph Schuchart, Mathias Nachtmann, José Gracia
Published in: Scaling OpenMP for Exascale Performance and Portability, Issue 10468, 2017, Page(s) 156-168, ISBN 978-3-319-65577-2
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-65578-9_11

The Impact of Taskyield on the Design of Tasks Communicating Through MPI

Authors: Joseph Schuchart, Keisuke Tsugane, José Gracia, Mitsuhisa Sato
Published in: Evolving OpenMP for Evolving Architectures - 14th International Workshop on OpenMP, IWOMP 2018, Barcelona, Spain, September 26–28, 2018, Proceedings, Issue 11128, 2018, Page(s) 3-17, ISBN 978-3-319-98520-6
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-98521-3_1
