Nanophotonics is a promising solution for on-chip interconnection thanks to its intrinsically low latency and low power consumption, which can benefit both performance and energy efficiency in future Chip Multi-Processors (CMPs). This article proposes a novel arbitrated all-optical path-setup scheme for tiled CMPs that adopt circuit-switched optical networks. It aims to significantly reduce path-setup latency and overall energy consumption. The proposed arbitrated scheme can configure multiple photonic switches simultaneously, rather than sequentially as is done in state-of-the-art proposals. This fast optical path-setup solution reduces the overhead of each transmission and, most importantly, allows optical circuit-switched networks to effectively serve cache coherence traffic, which is mainly composed of relatively small messages. Specifically, we first propose a Single-Arbiter scheme in which the whole topology is managed by a central module (the Single-Arbiter) that takes care of the path-setup procedures. Then, to tackle scalability, we propose a logically clustered architecture (Multi-Arbiter) in which an arbiter is allocated to each logical core cluster and an ad-hoc distributed reservation protocol coordinates the arbiters to manage inter-cluster path reservations. We show that the proposed Single-Arbiter architecture outperforms a state-of-the-art optical network with sequential path-setup (Optical Baseline) in 8- and 16-core tiled CMP setups. However, due to serialization issues, the Single-Arbiter solution cannot compete with a reference Electronic Baseline in larger 32- and 64-core setups, although it still performs much better than the Optical Baseline. Conversely, our hierarchical Multi-Arbiter solution improves performance by up to almost 20% and 40% in 32- and 64-core setups, respectively, demonstrating the wide applicability of the proposed technique.
Energy-wise, the analyzed solutions enable significant savings compared to both the Optical Baseline with sequential path-setup and its electronic counterpart. Specifically, results show an average improvement of more than 25% for the Single-Arbiter in the 8- and 16-core setups, and savings of more than 40% and 15% for the Multi-Arbiter in the 32- and 64-core setups, respectively.
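As a rough illustration of the central idea, the sketch below models a single arbiter that reserves every switch along a path in one decision instead of hop by hop. It is a toy under stated assumptions, not the scheme evaluated in the article: the 2D mesh, the X-then-Y routing, the one-circuit-per-switch rule, and all names (`xy_route`, `SingleArbiter`, `setup`, `teardown`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Tile = Tuple[int, int]  # (x, y) coordinates of a tile in an assumed 2D mesh


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """Tiles crossed by dimension-ordered routing (X first, then Y)."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


class SingleArbiter:
    """Toy central arbiter. Simplification: each tile's switch carries
    at most one circuit at a time (real photonic switches may allow more)."""

    def __init__(self) -> None:
        self.owner: Dict[Tile, int] = {}  # switch -> id of the circuit using it
        self.next_id = 0

    def setup(self, src: Tile, dst: Tile) -> Optional[int]:
        """Reserve all switches on the path at once, or none of them."""
        path = xy_route(src, dst)
        if any(tile in self.owner for tile in path):
            return None  # conflict: the request must be retried later
        circuit_id = self.next_id
        self.next_id += 1
        for tile in path:  # all switches are configured in the same decision
            self.owner[tile] = circuit_id
        return circuit_id

    def teardown(self, circuit_id: int) -> None:
        """Release every switch held by the given circuit."""
        self.owner = {t: c for t, c in self.owner.items() if c != circuit_id}


arbiter = SingleArbiter()
first = arbiter.setup((0, 0), (2, 1))    # granted: all four switches reserved
blocked = arbiter.setup((2, 0), (2, 2))  # rejected: switch (2, 0) is in use
other = arbiter.setup((0, 1), (1, 1))    # granted: disjoint set of switches
```

Because every request passes through the single `setup` call, the sketch also hints at the serialization bottleneck mentioned above for larger core counts; a clustered variant would instantiate one such arbiter per cluster and add a protocol for paths that cross cluster boundaries.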
|Title:||Scalable path-setup scheme for all-optical dynamic circuit switched NoCs in cache coherent CMPs|
|Citation:||Scalable path-setup scheme for all-optical dynamic circuit switched NoCs in cache coherent CMPs / Grani, Paolo; Bartolini, Sandro. - PRINT. - 14:1(2018), pp. 12.1-12.27.|
|Appears in types:||1.1 Journal article|