With their potential to overcome the memory and power walls, many-core, multi-threaded processors have become a trend in processor design. However, this kind of architecture is far from mature, because it brings challenges such as scalability and a larger design space than single-core architectures. Within the many-core design space, data-flow based architectures are alternatives that deal efficiently with concurrency, long memory latencies, and synchronization stalls. Nevertheless, even in this sub-area, many factors still affect the scalability and performance of the architecture. In this paper, we explore the design trade-offs for the Decoupled Threaded Architecture (DTA), a data-flow many-core architecture. Using a well-known bioinformatics benchmark, ClustalW, we evaluate various DTA configurations with different numbers of synchronization and execution pipelines. We find that a configuration with two synchronization pipelines (SPs) and one execution pipeline (EP) per processing element (PE) achieves almost the same performance as a configuration with two SPs and two EPs per PE. By employing the former configuration, we can save 32.5% of the area required for each DTA processing element.

Zhibin, Y.u., Andrea, R., Giorgi, R. (2011). A Case Study on the Design Trade-off of a Thread Level Data Flow based Many-core Architecture. In Proceedings of Future Computing (pp.100-106).

A Case Study on the Design Trade-off of a Thread Level Data Flow based Many-core Architecture

GIORGI, ROBERTO
2011-01-01

2011
9781612081540

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/46818