Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs

Authors

  • Wonkyung Jung Seoul National University, Seoul, Republic of Korea
  • Sangpyo Kim Seoul National University, Seoul, Republic of Korea
  • Jung Ho Ahn Seoul National University, Seoul, Republic of Korea
  • Jung Hee Cheon Seoul National University, Seoul, Republic of Korea; Crypto Lab. Inc, Seoul, South Korea
  • Younho Lee SeoulTech, Seoul, Republic of Korea

DOI:

https://doi.org/10.46586/tches.v2021.i4.114-148

Keywords:

Fully Homomorphic Encryption, Bootstrapping, Logistic regression, GPU, Kernel fusion

Abstract

Fully homomorphic encryption (FHE) has been gaining popularity as an emerging means of enabling an unlimited number of operations on an encrypted message without decryption. A major drawback of FHE is its high computational cost. In particular, the bootstrapping step, which refreshes the noise accumulated through consecutive FHE operations on a ciphertext, can take minutes. This significantly limits the practical use of FHE in many real applications.
By exploiting the massive parallelism available in FHE, we demonstrate the first GPU implementation of bootstrapping for CKKS, one of the most promising FHE schemes, which supports arithmetic on approximate numbers. By analyzing CKKS operations, we find that the major performance bottleneck is their high main-memory bandwidth requirement, which is exacerbated by existing optimizations that aim to reduce the required computation. These observations motivate us to make extensive use of memory-centric optimizations such as kernel fusion and reordering of primary functions.
Our GPU implementation shows a 7.02× speedup for a single CKKS multiplication compared to the state-of-the-art GPU implementation and an amortized bootstrapping time of 0.423 µs per bit, which corresponds to a speedup of 257× over a single-threaded CPU implementation. Applying it to logistic regression model training, we achieve a 40.0× speedup over the previous 8-thread CPU implementation on the same data.
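
To make the idea of kernel fusion mentioned in the abstract concrete, the following is a minimal toy sketch in Python, not the authors' GPU code. It assumes a simple element-wise computation (a * b + c) mod q and counts simulated main-memory element transfers when the multiply and add run as two separate passes versus one fused pass. The modulus Q, the Traffic counter, and all function names are hypothetical and chosen only for illustration.

```python
# Toy model of kernel fusion: count simulated main-memory element transfers
# for computing (a * b + c) mod Q over vectors, unfused vs. fused.
# All names are hypothetical and are not taken from the paper.

Q = 65537  # small illustrative prime modulus (an assumption)


class Traffic:
    """Counts simulated main-memory element loads and stores."""

    def __init__(self):
        self.loads = 0
        self.stores = 0

    @property
    def total(self):
        return self.loads + self.stores


def mul_mod_kernel(a, b, t):
    """Separate pass: out[i] = a[i] * b[i] mod Q."""
    out = []
    for x, y in zip(a, b):
        t.loads += 2            # read a[i] and b[i]
        out.append((x * y) % Q)
        t.stores += 1           # write the intermediate to memory
    return out


def add_mod_kernel(a, b, t):
    """Separate pass: out[i] = a[i] + b[i] mod Q."""
    out = []
    for x, y in zip(a, b):
        t.loads += 2            # read the intermediate and c[i]
        out.append((x + y) % Q)
        t.stores += 1           # write the final result
    return out


def unfused(a, b, c, t):
    """Two passes: the intermediate product makes a round trip to memory."""
    tmp = mul_mod_kernel(a, b, t)
    return add_mod_kernel(tmp, c, t)


def fused(a, b, c, t):
    """One pass: the intermediate product stays in a local variable."""
    out = []
    for x, y, z in zip(a, b, c):
        t.loads += 3            # read a[i], b[i], c[i]
        out.append(((x * y) % Q + z) % Q)
        t.stores += 1           # write only the final result
    return out


n = 8
a = list(range(1, n + 1))
b = [3 * x + 1 for x in a]
c = [Q - x for x in a]

t_unfused, t_fused = Traffic(), Traffic()
r_unfused = unfused(a, b, c, t_unfused)
r_fused = fused(a, b, c, t_fused)
```

In this toy model both versions return the same result, but the unfused version moves 6n elements and the fused version 4n, because the intermediate product is never written out and read back. The arithmetic work is identical, which is why fusion helps when memory bandwidth rather than computation is the bottleneck.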

Published

2021-08-11

Section

Articles

How to Cite

Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs. (2021). IACR Transactions on Cryptographic Hardware and Embedded Systems, 2021(4), 114-148. https://doi.org/10.46586/tches.v2021.i4.114-148