Elliptic-curve cryptography (ECC) is promising for enabling information security in constrained embedded devices. To be efficient on a target architecture, ECC requires careful selection and tuning of the algorithms that perform the underlying mathematical operations. This paper contributes a cycle-level analysis of how ECC performance depends on the interaction between the features of the mathematical algorithms and the architectural and microarchitectural features of an ARM-based Intel XScale processor. A second contribution is the cycle-level analysis of a modified ARM processor that includes a word-level finite field polynomial multiplier (poly_mul) in its data path. This extension is a good trade-off among applicability in a number of contexts, simplicity of integration within the processor, and performance. The paper identifies the most advantageous mix of elliptic curve (EC) parameters both for the standard ARM-based Intel XScale platform and for the one equipped with the poly_mul unit. In particular, the latter reduces execution time by more than 41 percent on the considered benchmarks. Last, the paper investigates the correlation between the possible architectural organizations of a processor equipped with poly_mul unit(s) and EC benchmark performance. For instance, only superscalar pipelines can exploit the features of out-of-order execution, and only very complex organizations (for example, four-way superscalar) can exploit a high number of available ALUs. Conversely, we show that there is no benefit in endowing the processor with more than one poly_mul unit, and we point out a possible trade-off between performance and added complexity: a two-way in-order pipeline and a two-way out-of-order pipeline increase Instructions per Cycle (IPC) by 50 percent and 90 percent, respectively. Finally, we show that there are no critical constraints on the latency and pipelining capability of the poly_mul unit for the basic EC point multiplication.
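To make the idea of a word-level polynomial multiplier concrete, the following is a minimal software sketch of the kind of operation such a unit could perform: a carry-less (GF(2)) product of two machine words, then reused in a schoolbook multi-word multiplication. The 32-bit word size, the function names `poly_mul_word` and `poly_mul_multiword`, and the little-endian word layout are assumptions made for illustration; the abstract does not specify the instruction's semantics, and reduction modulo the field polynomial is deliberately left out.

```python
WORD_BITS = 32                    # assumed word size, not stated in the abstract
WORD_MASK = (1 << WORD_BITS) - 1


def poly_mul_word(a, b):
    """Carry-less product of two words, i.e. multiplication in GF(2)[x].

    Hypothetical software model of a word-level multiplier: additions
    are XORs, so no carries propagate between bit positions.
    """
    a &= WORD_MASK
    b &= WORD_MASK
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1
            result ^= a    # add (XOR) the shifted copy of a
        a <<= 1
        b >>= 1
    return result          # up to 2 * WORD_BITS - 1 bits wide


def poly_mul_multiword(x, y):
    """Schoolbook product of two polynomials given as little-endian word lists.

    Each word-by-word partial product is the step a hardware unit would
    replace; the result is NOT reduced modulo a field polynomial.
    """
    out = [0] * (len(x) + len(y))
    for i, xw in enumerate(x):
        for j, yw in enumerate(y):
            p = poly_mul_word(xw, yw)
            out[i + j] ^= p & WORD_MASK        # low half of the partial product
            out[i + j + 1] ^= p >> WORD_BITS   # high half of the partial product
    return out
```

For example, `poly_mul_word(0b11, 0b11)` returns `0b101`, since (x + 1)^2 = x^2 + 1 over GF(2). In software the inner loop costs roughly one iteration per bit of the operand, which is the overhead a single-instruction word-level multiplier is meant to remove.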

Bartolini, S., Branovic, I., Giorgi, R., Martinelli, E. (2008). Effects of instruction-set extensions on an embedded processor: A case study on elliptic-curve cryptography over GF(2^m). IEEE Transactions on Computers, 57(5), 672-685 [10.1109/TC.2007.70832].

Effects of instruction-set extensions on an embedded processor: A case study on elliptic-curve cryptography over GF(2^m)

BARTOLINI, SANDRO; GIORGI, ROBERTO
2008-01-01


Use this identifier to cite or link to this document: https://hdl.handle.net/11365/4300