Peccerillo, B., Bartolini, S. (2022). Flexible task-DAG management in PHAST library: Data-parallel tasks and orchestration support for heterogeneous systems. CONCURRENCY AND COMPUTATION, 34(2) [10.1002/cpe.5842].
Flexible task-DAG management in PHAST library: Data-parallel tasks and orchestration support for heterogeneous systems
Peccerillo B.; Bartolini S.
2022-01-01
Abstract
Heterogeneous architectures have proved successful in achieving unprecedented performance and energy efficiency. However, taking advantage of these diverse processing elements is still hard. Programmers need to code with the different approaches suitable for each target architecture and need to decide how to distribute activities across the different resources. The majority of current frameworks focus on either performance or productivity. The former mainly provide low-level, target-specific programming interfaces, and the latter offer high-level tools that often fail to achieve high performance. In both cases, the design is usually data-parallel, as task-parallelism is not supported. In this work, we propose a task-based solution within the data-parallel, heterogeneous, single-source PHAST library. Tasks can be coded in a target-agnostic fashion, can be compiled and parallelized on multi-core CPUs and NVIDIA GPUs automatically, and support the choice of the execution platform at runtime. We evaluate the capabilities of the proposed task directed acyclic graph (task-DAG) support on an extensive set of randomly generated task-based applications with different sizes and characteristics. We compare it against a SYCL implementation in terms of performance and code complexity metrics, highlighting that PHAST achieves about 1.56× and 2.60× speedup over SYCL on multi-core CPU and GPU, respectively, while also improving code complexity metrics.

File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
Concurrency and Computation - 2020 - Peccerillo - Flexible task‐DAG management in PHAST library Data‐parallel tasks and.pdf | Not available | Publisher's PDF | Not public (private/restricted access) | 2.69 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1111986