Low-Latency Design and Implementation of the Squaring in Class Groups for Verifiable Delay Function Using Redundant Representation

Danyang Zhu; Rongrong Zhang; Lun Ou; Jing Tian; Zhongfeng Wang

doi:10.46586/tches.v2023.i1.438-462

Authors

Danyang Zhu School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
Rongrong Zhang School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
Lun Ou School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
Jing Tian School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China
Zhongfeng Wang School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China

DOI:

https://doi.org/10.46586/tches.v2023.i1.438-462

Keywords:

Verifiable delay functions, squaring, extended GCD, low-latency, ASIC, architecture, class groups, redundant representation

Abstract

A verifiable delay function (VDF) is a function whose evaluation requires a prescribed number of sequential steps over a group, while its result can be verified efficiently. As cryptographic primitives, VDFs have been adopted in a rapidly growing range of applications for decentralized systems. For VDFs to be secure in practice, it is widely agreed that the fastest implementation of the VDF evaluation, namely sequential squaring in a group of unknown order, should be publicly available. To this end, we propose a hardware implementation of the squaring in class groups that aims at the minimum possible latency through co-optimization at the algorithmic and architectural levels. First, low-latency architectures for large-number division, multiplication, and addition are devised using redundant representation. Second, we present two hardware-friendly algorithms that avoid the time-consuming divisions involved in calculations related to the extended greatest common divisor (XGCD), and we design the corresponding low-latency architectures. In addition, we schedule and reuse these computation modules under compact instruction control to achieve good resource utilization. Finally, we code and synthesize the proposed design in TSMC 28 nm CMOS technology. The experimental results show that our design achieves a 3.6x speedup over the state-of-the-art implementation of the squaring in the class group. Moreover, our implementation is 9.1x faster than the optimal C++ implementation on an advanced CPU.
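For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of one squaring step in a class group of binary quadratic forms, followed by the sequential loop that a VDF evaluation repeats. It is the textbook formulation, not the division-free hardware algorithms proposed in the paper. It assumes a form (a, b, c) with negative discriminant and gcd(a, b) = 1, and the function names `normalize`, `reduce_form`, `square`, and `vdf_eval` are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: textbook squaring of a binary quadratic form
# (a, b, c) with negative discriminant D = b*b - 4*a*c.
# Assumptions: gcd(a, b) == 1 and Python 3.8+ (for pow(x, -1, m)).
# All function names here are hypothetical, not taken from the paper.

def normalize(a, b, c):
    """Shift b into the range -a < b <= a without changing the class."""
    r = (a - b) // (2 * a)
    return a, b + 2 * r * a, a * r * r + b * r + c


def reduce_form(a, b, c):
    """Return the unique reduced form equivalent to (a, b, c)."""
    a, b, c = normalize(a, b, c)
    while a > c or (a == c and b < 0):
        s = (c + b) // (2 * c)
        # The right-hand side uses the old a, b, c throughout.
        a, b, c = c, -b + 2 * s * c, c * s * s - b * s + a
    return a, b, c


def square(a, b, c):
    """Square the class of (a, b, c) and return the reduced result."""
    # Solve b * mu = c (mod a); this modular inverse is where an
    # extended GCD (XGCD) computation enters.
    mu = (c * pow(b % a, -1, a)) % a if a > 1 else 0
    big_a = a * a
    big_b = b - 2 * a * mu
    big_c = mu * mu - (b * mu - c) // a  # the division is exact
    return reduce_form(big_a, big_b, big_c)


def vdf_eval(form, steps):
    """Apply `steps` squarings one after another (inherently sequential)."""
    a, b, c = form
    for _ in range(steps):
        a, b, c = square(a, b, c)
    return a, b, c
```

As a small check, for discriminant -23 the class group has three elements, and `square(2, 1, 3)` returns `(2, -1, 3)`. Each iteration of `vdf_eval` depends on the previous result, so the latency of a single squaring sets the evaluation time. This is why the modular inverse and the reduction loop, both of which involve large-number division in this naive form, are natural optimization targets.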
