Giorgi, R., Procaccini, M. (2019). Bridging a Data-Flow Execution Model to a Lightweight Programming Model. In 2019 International Conference on High Performance Computing & Simulation (HPCS) (pp. 165-168). New York: Institute of Electrical and Electronics Engineers Inc. [10.1109/HPCS48598.2019.9188183].
Bridging a Data-Flow Execution Model to a Lightweight Programming Model
Roberto Giorgi, Member of the Collaboration Group
Marco Procaccini, Member of the Collaboration Group
2019-01-01
Abstract
Starting from a Data-Flow execution model called “DF-Threads”, we defined a minimalistic API that enables an efficient hardware implementation of the distribution of threads across the cores of a single multi-core system and across the remote cores of a cluster. We aim to propose this API as a simple programming model in the C language that can potentially provide an easy interface between DF-Threads and generic programming models. Clusters are typically programmed with MPI; therefore, we evaluated our approach against OpenMPI. In terms of delivered GFLOPS per core, DF-Threads are also competitive with CUDA. In the basic examples used in this initial investigation, DF-Threads achieve better performance per core than OpenMPI and CUDA. In particular, OpenMPI spends a large portion of its execution in OS-kernel activity, which slows down its performance.

File | Type | License | Size | Format
---|---|---|---|---
Bridging_a_Data-Flow_Execution_Model_to_a_Lightweight_Programming_Model.pdf (not available) | Publisher's PDF | NOT PUBLIC - Private/restricted access | 120.87 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1081831