Grani, P., Bartolini, S. (2018). Scalable path-setup scheme for all-optical dynamic circuit switched NoCs in cache coherent CMPs. ACM Journal on Emerging Technologies in Computing Systems, 14(1), 1-27 [10.1145/3154840].
Scalable path-setup scheme for all-optical dynamic circuit switched NoCs in cache coherent CMPs
Grani, Paolo; Bartolini, Sandro
2018-01-01
Abstract
Nanophotonics is a promising solution for on-chip interconnection due to its intrinsic low-latency and low-power features, which can benefit both performance and energy consumption in future Chip Multi-Processors (CMPs). This article proposes a novel arbitrated all-optical path-setup scheme for tiled CMPs adopting circuit-switched optical networks. It aims at significantly reducing path-setup latency and overall energy consumption. The proposed arbitrated scheme can configure multiple photonic switches simultaneously, instead of sequentially as is done in state-of-the-art proposals. This fast optical path-setup solution reduces the overhead of each transmission and, most importantly, allows optical circuit-switched networks to effectively serve cache coherence traffic, which is mainly composed of relatively small messages. Specifically, we propose a Single-Arbiter scheme in which the whole topology is managed by a central module (Single-Arbiter) that takes care of the path-setup procedures. Then, to tackle scalability, we propose a logically clustered architecture (Multi-Arbiter) in which an arbiter is allocated to each logical core-cluster and an ad-hoc distributed reservation protocol coordinates the arbiters to manage inter-cluster path reservations. We show that our proposed Single-Arbiter architecture outperforms a state-of-the-art optical network with sequential path-setup (Optical Baseline) for 8- and 16-core tiled CMP setups. However, due to serialization issues, the Single-Arbiter solution cannot compete with a reference Electronic Baseline for the larger 32- and 64-core setups, even though it still performs much better than the Optical Baseline. Conversely, our hierarchical Multi-Arbiter solution improves performance by up to almost 20% and 40% for 32- and 64-core setups, respectively, demonstrating the wide applicability of the proposed technique.
Energy-wise, the analyzed solutions enable significant savings compared to both the Optical Baseline with sequential path setup and the electronic counterpart. Specifically, results show more than 25% average improvement for the Single-Arbiter in the 8- and 16-core cases, and more than 40% and 15% savings for the Multi-Arbiter in the 32- and 64-core cases, respectively.

File | Size | Format |
---|---|---|
3154840.pdf (not available; type: publisher's PDF; license: not public, private/restricted access) | 5.33 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1027085