Low-Latency Design and Implementation of the Squaring in Class Groups for Verifiable Delay Function Using Redundant Representation

Authors

  • Danyang Zhu, School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
  • Rongrong Zhang, School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
  • Lun Ou, School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
  • Jing Tian, School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
  • Zhongfeng Wang, School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China

DOI:

https://doi.org/10.46586/tches.v2023.i1.438-462

Keywords:

Verifiable delay functions, squaring, extended GCD, low-latency, ASIC, architecture, class groups, redundant representation

Abstract

A verifiable delay function (VDF) is a function whose evaluation requires a prescribed number of sequential steps over a group, while its result can be verified efficiently. As cryptographic primitives, VDFs have been adopted in a rapidly growing range of applications for decentralized systems. For VDFs to be secure in practice, it is widely agreed that the fastest implementation of the VDF evaluation, namely sequential squarings in a group of unknown order, should be publicly available. To this end, we propose a hardware implementation of the squaring in class groups that aims at the minimum possible latency through co-optimization at the algorithmic and architectural levels. Firstly, we devise low-latency architectures for large-number division, multiplication, and addition using redundant representation. Secondly, we present two hardware-friendly algorithms that avoid the time-consuming divisions involved in calculations related to the extended greatest common divisor (XGCD), and we design the corresponding low-latency architectures. In addition, we schedule and reuse these computation modules under compact instruction control to achieve good resource utilization. Finally, we code the proposed design and synthesize it in TSMC 28 nm CMOS technology. The experimental results show that our design achieves a 3.6x speedup over the state-of-the-art implementation of the squaring in the class group. Moreover, it is 9.1x faster than the optimal C++ implementation on an advanced CPU.
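As a rough illustration of what "sequential squarings in a class group" involves, the Python sketch below squares a binary quadratic form (a, b, c) of negative discriminant with a textbook XGCD-based duplication formula, reduces the result, and repeats this T times. It is not the division-free algorithm or the redundant-representation architecture proposed in the paper. The function names (xgcd, reduce_form, square, evaluate) and the toy discriminant -23 are assumptions made for this example only.

```python
# Illustrative sketch only: textbook squaring of a binary quadratic form
# (a, b, c) with discriminant D = b^2 - 4ac < 0. All names are hypothetical;
# this is NOT the paper's division-free XGCD algorithm or its hardware design.

def xgcd(x, y):
    """Return (g, u, v) with u*x + v*y == g == gcd(x, y) and g >= 0."""
    r0, r1 = x, y
    u0, u1 = 1, 0
    v0, v1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0


def reduce_form(a, b, c):
    """Return the reduced form equivalent to (a, b, c), assuming a > 0 and D < 0."""
    # Normalize so that -a < b <= a.
    r = (a - b) // (2 * a)
    a, b, c = a, b + 2 * r * a, a * r * r + b * r + c
    # Swap-and-normalize until a <= c (with b >= 0 when a == c).
    while a > c or (a == c and b < 0):
        s = (c + b) // (2 * c)
        a, b, c = c, -b + 2 * s * c, c * s * s - b * s + a
    return a, b, c


def square(form):
    """Square a primitive form; each call needs one XGCD and one reduction."""
    a, b, c = form
    disc = b * b - 4 * a * c
    g, u, _ = xgcd(b, a)          # u*b + v*a == g
    big_a = a // g
    big_c = (-c * u) % big_a
    a3 = big_a * big_a
    b3 = b + 2 * big_a * big_c
    c3 = (b3 * b3 - disc) // (4 * a3)   # exact division by construction
    return reduce_form(a3, b3, c3)


def evaluate(form, t):
    """Apply t squarings one after another; step i needs the output of step i-1."""
    for _ in range(t):
        form = square(form)
    return form


# Toy example with D = -23, whose class group has order 3.
print(evaluate((2, 1, 3), 3))  # (2, -1, 3)
```

Each iteration of evaluate depends on the previous one, which is why the latency of a single squaring, and of the XGCD and large-number arithmetic inside it, determines the overall evaluation time. Practical VDFs use discriminants of a thousand bits or more, so the toy value above only serves to make the arithmetic checkable by hand.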

Published

2022-11-29

Section

Articles

How to Cite

Zhu, D., Zhang, R., Ou, L., Tian, J., & Wang, Z. (2022). Low-Latency Design and Implementation of the Squaring in Class Groups for Verifiable Delay Function Using Redundant Representation. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2023(1), 438-462. https://doi.org/10.46586/tches.v2023.i1.438-462