Performance Analysis of Parallel Applications Running on SMP

Giorgi, Roberto
2001-01-01

Abstract

In this work, we use dynamic analysis techniques to analyze how a workload can be accelerated on a shared-bus shared-memory multiprocessor. It is well known that, in this kind of system, the bus is the critical element that can limit the scalability of the machine. Nevertheless, many factors that influence bus utilization have not yet been investigated for this kind of workload, in particular the effects of thread migration. Operating system effects are also considered in our evaluation. We analyzed a basic four-processor machine and a high-end sixteen-processor machine, implementing three different coherence protocols (including MESI and another solution from the literature). We show that, even in the four-processor case, the overhead induced by the sharing of private data as a consequence of process migration, namely passive sharing, cannot be neglected. Indeed, the analysis shows that a protocol based on a selective strategy for dealing with private and shared data performs better than protocols that either rely on the detection of migratory access patterns or use a pure Write-Invalidate strategy, like MESI. We varied the architectural parameters to show how passive sharing and other coherence overheads are influenced by different cache choices. We then considered the sixteen-processor case, where the effects on performance are more evident. We also conclude that performance benefits from large caches and cache-affinity scheduling. However, even with affinity scheduling, a selective protocol delivers better performance.
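As a rough illustration of passive sharing, the following sketch models a single private memory block under a simplified MESI-style Write-Invalidate protocol and counts bus transactions as the owning process migrates between processors. It is a toy model written for this summary, not the simulator or methodology used in the paper: the names `ToyMesiBus` and `run_private_block`, the single-block cache, and the one-read-one-write access pattern per time slice are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    M = "modified"
    E = "exclusive"
    S = "shared"
    I = "invalid"


class ToyMesiBus:
    """Toy MESI-style model of ONE memory block on a shared bus (hypothetical)."""

    def __init__(self, n_cpus):
        self.state = [State.I] * n_cpus
        self.misses = 0          # block fetches that go over the bus
        self.invalidations = 0   # bus transactions that invalidate remote copies

    def _others(self, cpu):
        return [c for c in range(len(self.state)) if c != cpu]

    def read(self, cpu):
        if self.state[cpu] is not State.I:
            return  # cache hit, no bus traffic
        self.misses += 1
        holders = [c for c in self._others(cpu) if self.state[c] is not State.I]
        for c in holders:
            self.state[c] = State.S  # remote M/E copy is downgraded to shared
        self.state[cpu] = State.S if holders else State.E

    def write(self, cpu):
        if self.state[cpu] is State.I:
            # Write miss: fetch with ownership, remote copies are invalidated.
            self.misses += 1
            for c in self._others(cpu):
                self.state[c] = State.I
        elif self.state[cpu] is State.S:
            # Write hit on a shared copy: an invalidation goes on the bus.
            self.invalidations += 1
            for c in self._others(cpu):
                self.state[c] = State.I
        # E or M: silent upgrade, no bus traffic.
        self.state[cpu] = State.M


def run_private_block(schedule, n_cpus=4):
    """Run one process that reads, then writes, its private block in each time slice.

    `schedule` lists the CPU the process runs on in each slice (an assumption
    of this sketch). Returns (misses, invalidations).
    """
    bus = ToyMesiBus(n_cpus)
    for cpu in schedule:
        bus.read(cpu)
        bus.write(cpu)
    return bus.misses, bus.invalidations


if __name__ == "__main__":
    print("no migration:", run_private_block([0, 0, 0, 0]))
    print("migration:   ", run_private_block([0, 1, 0, 1]))
```

In this toy model the block is private to one process, yet every migration leaves a stale copy in the previous processor's cache, so the next write must invalidate it over the bus. Without migration, the only bus transaction is the initial fetch.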
2001
189251270X
Foglia, P., Giorgi, R., Prete, C.A. (2001). Performance Analysis of Parallel Applications Running on SMP. In International Conference on Parallel and Distributed Processing Techniques and Applications (pp. 1634-1640).
Use this identifier to cite or link to this document: https://hdl.handle.net/11365/46859