Designing an efficient memory system is a major challenge for future multicore systems. In particular, multicore systems increase the number of requests to the memory system, so the design of efficient on-chip caches is crucial to achieving an adequate level of performance. Solutions based on conventional, large caches are limited by wire-delay effects, so NUCA and D-NUCA caches may represent an alternative, thanks to their ability to limit such effects. Another important design issue of such systems is coherence management: the theory of caches kept coherent via directory-based coherence protocols was successful in designing high-performance DSM machines, and must now consider the requirements of the new scenario: many cores on a chip, and NUCA organizations. In this paper, we address some of these aspects by presenting a NUCA-based last-level cache (LLC) architecture. The architecture is based on a D-NUCA scheme, i.e. a banked LLC architecture with a migration mechanism that places frequently accessed data near the requesting processor. To improve the access time to shared copies, which is limited by ping-pong effects, we adopted copy replication, which allows the replication of shared copies that are requested by processors located on the opposite side of the cache. Finally, we adapted a directory-based, distributed coherence protocol to a D-NUCA cache with migration and replication. The resulting cache memory subsystem performs better than a statically sub-banked LLC. The adoption of all these mechanisms forced us to deal with race conditions that may compromise data coherence inside the chip and the memory, and consequently to modify the baseline coherence protocol. This experience demonstrated that, in the multicore era, coherence protocols must still be considered of the utmost importance by researchers and designers when facing the design of such systems.
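The ping-pong effect and the benefit of replication mentioned in the abstract can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the protocol or architecture from the paper: the class `ToyDNuca`, the eight-bank row layout, the one-bank-per-access migration step, the first-touch placement, and the replication threshold are all hypothetical choices made for illustration. Writes, directory state, and the race conditions the paper discusses are not modeled.

```python
from dataclasses import dataclass, field

NUM_BANKS = 8  # hypothetical layout: banks in a row; side 0 is next to bank 0, side 1 next to bank 7


@dataclass
class ToyDNuca:
    """Toy read-only model: tracks, per block, which banks hold a copy."""
    replicate: bool = False
    copies: dict = field(default_factory=dict)  # block address -> set of bank indices

    def read(self, addr: int, side: int) -> int:
        """Serve a read from `side`; return the hop distance to the copy used."""
        home = 0 if side == 0 else NUM_BANKS - 1  # bank closest to the requester
        banks = self.copies.get(addr)
        if banks is None:
            # Assumption: first touch allocates next to the requester (miss cost ignored).
            self.copies[addr] = {home}
            return 0
        nearest = min(banks, key=lambda b: abs(b - home))
        distance = abs(nearest - home)
        if distance == 0:
            return 0
        if self.replicate and distance > NUM_BANKS // 2:
            # The copy sits on the opposite side: keep it and add a replica near the requester.
            banks.add(home)
        else:
            # Migration: move the copy one bank toward the requester.
            step = 1 if home > nearest else -1
            banks.remove(nearest)
            banks.add(nearest + step)
        return distance


def total_hops(replicate: bool, rounds: int = 4) -> int:
    """Two sides alternately read the same block; sum the hop distances."""
    cache = ToyDNuca(replicate=replicate)
    return sum(cache.read(0x40, side) for _ in range(rounds) for side in (0, 1))
```

In this toy setting, `total_hops(False)` keeps paying for the block being pulled back and forth between the two sides, while `total_hops(True)` pays the long distance once and then serves both sides locally. A real design would also have to keep the replicas coherent on writes, which is where the abstract's concern about race conditions comes in.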
Bartolini, S., Foglia, P., Prete, C.A., Solinas, M. (2014). Coherence in the CMP ERA: Lesson learned in designing a LLC architecture. WSEAS TRANSACTIONS ON COMPUTERS, 13, 195-206.
Coherence in the CMP ERA: Lesson learned in designing a LLC architecture
BARTOLINI, SANDRO
2014-01-01
File | Size | Format |
---|---|---|
bartolini-wseas-2014-a245705-346.pdf (open access; description: main article; type: publisher's PDF; license: PUBLIC - Public with Copyright) | 1.07 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1011112