In this paper we describe a new approach to designing multithreaded architecture that can be used as the basic building blocks in high-end computing architectures. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are post-stored after the completion of the thread execution. The decoupling of memory accesses from thread execution requires a separate unit to perform the necessary pre-loads and post-stores, and to control the allocation of hardware thread contexts to the enabled threads. The non-blocking nature of threads reduces the number of context switches, thus reducing the overhead in scheduling threads. Our functional execution paradigm eliminates complex hardware required for dynamic scheduling of instructions used in modern superscalar architectures. We will present our preliminary results obtained from an instruction set simulator using several benchmark programs. We compare the execution of our architecture with that of MIPS architecture as facilitated by DLX simulator.

K., K., Giorgi, R., J., A. (2000). Comparing execution performance of Scheduled Dataflow Architecture with RISC processors. In Proc. of the 13th ISCA Parallel and Distributed Computing Systems Conference (pp.41-47).

Comparing execution performance of Scheduled Dataflow Architecture with RISC processors

GIORGI, ROBERTO;
2000-01-01

Abstract

In this paper we describe a new approach to designing multithreaded architecture that can be used as the basic building blocks in high-end computing architectures. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are post-stored after the completion of the thread execution. The decoupling of memory accesses from thread execution requires a separate unit to perform the necessary pre-loads and post-stores, and to control the allocation of hardware thread contexts to the enabled threads. The non-blocking nature of threads reduces the number of context switches, thus reducing the overhead in scheduling threads. Our functional execution paradigm eliminates complex hardware required for dynamic scheduling of instructions used in modern superscalar architectures. We will present our preliminary results obtained from an instruction set simulator using several benchmark programs. We compare the execution of our architecture with that of MIPS architecture as facilitated by DLX simulator.
2000
188084334X
K., K., Giorgi, R., J., A. (2000). Comparing execution performance of Scheduled Dataflow Architecture with RISC processors. In Proc. of the 13th ISCA Parallel and Distributed Computing Systems Conference (pp.41-47).
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11365/46861
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo