A Highly-efficient Lattice-based Post-Quantum Cryptography Processor for IoT Applications

Authors

  • Zewen Ye (Zhejiang University, Hangzhou, China; City University of Hong Kong, Hong Kong, China)
  • Ruibing Song (Zhejiang University, Hangzhou, China)
  • Hao Zhang (Zhejiang University, Hangzhou, China)
  • Donglong Chen (BNU-HKBU United International College, Zhuhai, China)
  • Ray Chak-Chung Cheung (City University of Hong Kong, Hong Kong, China)
  • Kejie Huang (Zhejiang University, Hangzhou, China)

DOI:

https://doi.org/10.46586/tches.v2024.i2.130-153

Keywords:

Post-quantum Cryptography, RISC-V, Single-Instruction-Multiple-Data, Lattice-Based Cryptography, Internet-of-Things

Abstract

Lattice-Based Cryptography (LBC) schemes, like CRYSTALS-Kyber and CRYSTALS-Dilithium, have been selected for the NIST Post-Quantum Cryptography standard. However, implementing these schemes in resource-constrained Internet-of-Things (IoT) devices is challenging, considering efficiency, power consumption, area overhead, and flexibility to support various operations and parameter settings. Some existing ASIC designs that prioritize lower power and area cannot achieve optimal performance efficiency, which makes them impractical for battery-powered devices. Custom hardware accelerators in prior co-processor and processor designs have limited applications and flexibility, incurring significant area and power overheads for IoT devices. To address these challenges, this paper presents an efficient lattice-based cryptography processor with customized Single-Instruction-Multiple-Data (SIMD) instructions. First, our proposed SIMD architecture supports efficient parallel execution of various polynomial operations in 256-bit mode and acceleration of Keccak in 320-bit mode, both utilizing efficiently reused resources. Additionally, we introduce data shuffling hardware units to resolve data dependencies within SIMD data. To further enhance performance, we design a dual-issue path for memory accesses and corresponding software design methodologies to reduce the impact of data load/store blocking. Through a hardware/software co-design approach, our proposed processor achieves high efficiency in supporting all operations in lattice-based cryptography schemes. Evaluations of Kyber and Dilithium show our proposed processor achieves over 10x speedup compared with the baseline RISC-V processor and over 5x speedup versus ARM Cortex M4 implementations, making it a promising solution for securing IoT communications and storage. Moreover, silicon synthesis results show our design can run at 200 MHz with 2.01 mW for Kyber KEM 512 and 2.13 mW for Dilithium 2, which outperforms state-of-the-art works in terms of PPAP (Performance x Power x Area).
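
The two SIMD widths named in the abstract can be illustrated with a small software model. The sketch below is an assumption about the lane layout, not the paper's documented datapath: it treats a 256-bit vector as 16 lanes of 16 bits (enough for a Kyber coefficient modulo 3329) and a 320-bit vector as one Keccak-f[1600] plane of five 64-bit lanes. The function names `simd_poly_add` and `keccak_theta` are hypothetical.

```python
# Software model of the lane arithmetic only. The lane mapping is an
# assumption for illustration, not the processor's actual datapath.

KYBER_Q = 3329            # Kyber modulus
LANES_16 = 256 // 16      # 16 lanes of 16 bits in a 256-bit vector
MASK64 = (1 << 64) - 1


def simd_poly_add(a, b, q=KYBER_Q, lanes=LANES_16):
    """Add two polynomials coefficient-wise, one `lanes`-wide chunk per step."""
    assert len(a) == len(b) and len(a) % lanes == 0
    out = []
    for i in range(0, len(a), lanes):
        # One hypothetical SIMD instruction would handle this whole chunk.
        out.extend((x + y) % q for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    return out


def rotl64(x, n):
    """Rotate a 64-bit lane left by n bits."""
    return ((x << n) | (x >> (64 - n))) & MASK64


def keccak_theta(state):
    """Theta step of Keccak-f[1600]; state[y][x] is a 64-bit lane.

    Each state[y] is one plane: 5 lanes x 64 bits = 320 bits.
    """
    # Column parities: XOR of the five planes, lane by lane.
    c = [0] * 5
    for plane in state:
        c = [ci ^ lane for ci, lane in zip(c, plane)]
    # D[x] = C[x-1] xor rotl(C[x+1], 1)
    d = [c[(x - 1) % 5] ^ rotl64(c[(x + 1) % 5], 1) for x in range(5)]
    return [[lane ^ d[x] for x, lane in enumerate(plane)] for plane in state]
```

Under this assumed mapping, a 256-coefficient Kyber polynomial addition takes 256 / 16 = 16 vector steps, and the theta step touches the state one 320-bit plane at a time, which is one possible reason for a 320-bit Keccak mode.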

Published

2024-03-12

Section

Articles

How to Cite

A Highly-efficient Lattice-based Post-Quantum Cryptography Processor for IoT Applications. (2024). IACR Transactions on Cryptographic Hardware and Embedded Systems, 2024(2), 130-153. https://doi.org/10.46586/tches.v2024.i2.130-153