SPA-GPT: General Pulse Tailor for Simple Power Analysis Based on Reinforcement Learning

Ziyu Wang; Yaoling Ding; An Wang; Yuwei Zhang; Congming Wei; Shaofei Sun; Liehuang Zhu

doi:10.46586/tches.v2024.i4.40-83

Authors

Ziyu Wang, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Yaoling Ding, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
An Wang, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Yuwei Zhang, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Congming Wei, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Shaofei Sun, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Liehuang Zhu, School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China

DOI:

https://doi.org/10.46586/tches.v2024.i4.40-83

Keywords:

Side-channel Analysis, Power Trace Segmentation, Public-key Algorithms, Kyber, Reinforcement Learning, Deep Q-Network

Abstract

In side-channel analysis of public-key algorithms, we usually classify operations based on the differences between the power traces produced by different basic operations (such as modular squaring or modular multiplication) in order to recover secret information such as private keys. The more accurately the power traces are segmented, the more efficiently they can be classified. Two segmentation methods are commonly used. Equidistant segmentation requires a fixed number of basic operations and similar trace lengths for each type of operation, which limits the scenarios where it applies. Peak-based segmentation relies on personal experience to configure parameters, which makes it inflexible and hard to generalize.
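The two baseline approaches described above can be illustrated with a minimal sketch. This is not code from the paper: the function names, the `height` and `min_distance` parameters, and the toy trace are assumptions chosen only to show why equidistant cuts need a known operation count and why peak-based cuts depend on hand-tuned thresholds.

```python
def equidistant_segments(trace, n_ops):
    """Split a trace into n_ops segments of equal length.

    Assumes the number of basic operations is known and that every
    operation takes roughly the same number of samples. The last
    segment absorbs any remainder.
    """
    seg_len = len(trace) // n_ops
    bounds = [i * seg_len for i in range(n_ops)] + [len(trace)]
    return [(bounds[i], bounds[i + 1]) for i in range(n_ops)]


def peak_segments(trace, height, min_distance):
    """Cut the trace at local maxima that reach `height`.

    `height` and `min_distance` are the hand-tuned parameters: a peak is
    kept only if it is at least `min_distance` samples after the
    previously kept peak. Returns (start, end) pairs between peaks.
    """
    peaks = []
    for i in range(1, len(trace) - 1):
        is_peak = (
            trace[i] >= height
            and trace[i] > trace[i - 1]
            and trace[i] >= trace[i + 1]
        )
        if is_peak and (not peaks or i - peaks[-1] >= min_distance):
            peaks.append(i)
    return list(zip(peaks, peaks[1:]))


# Toy trace (hypothetical values): three pulses with unequal spacing.
toy_trace = [0, 5, 0, 0, 0, 5, 0, 0, 0, 0, 0, 5, 0]
equal_cuts = equidistant_segments(toy_trace, 3)
peak_cuts = peak_segments(toy_trace, height=3, min_distance=2)
```

On this toy trace the equidistant cuts ignore the unequal pulse spacing, while the peak-based cuts follow it but only because `height` and `min_distance` happen to suit the data.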

In this paper, we propose an automated trace segmentation method based on reinforcement learning that is applicable to a wide range of common implementations of public-key algorithms. To our knowledge, this is the first time that reinforcement learning, which does not need labels, has been applied to trace processing for side-channel analysis. Our method generalizes well to traces with varying segment lengths and differing peak heights. By using a Deep Q-Network algorithm optimized with prioritized experience replay, we reduce the number of required parameters to one, namely the key length. We also employ several techniques to improve segmentation effectiveness, such as a clustering algorithm and envelope-based feature enhancement. We validate the effectiveness of the new method in nine scenarios involving hardware and software implementations of different public-key algorithms executed on diverse platforms such as microcontrollers, SAKURA-G, and smart cards. One of these implementations is protected by time randomization countermeasures. Experimental results show that a basic version of our method can correctly segment most traces. The enhanced version is capable of reconstructing the sequence of operations during trace segmentation, achieving an accuracy rate of 100% for the majority of the traces. For traces that cannot be entirely restored, we use the reward values of reinforcement learning to correct errors and achieve full recovery. We also conducted comparative experiments with supervised seq2seq methods, in which our approach showed 42% higher accuracy in operation recovery and a 96% improvement in time efficiency. In addition, we applied our method to the post-quantum cryptographic scheme Kyber and successfully recovered an intermediate value crucial for deriving the secret key.
The power traces collected from these devices have also been uploaded as open databases, which researchers working on public-key algorithms can use to conduct related experiments or verify our method.
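The abstract names prioritized experience replay as the optimization applied to the Deep Q-Network. As a rough illustration of that general technique (proportional prioritization, where transitions with larger temporal-difference error are replayed more often), a minimal sketch follows. It is not the authors' implementation: the class name, the `alpha` and `eps` defaults, and the string transitions are all assumptions, and a practical version would use a sum-tree and importance-sampling weights.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity  # maximum number of stored transitions
        self.alpha = alpha        # 0 = uniform sampling, 1 = fully prioritized
        self.eps = eps            # keeps zero-error transitions sampleable
        self.items = []
        self.priorities = []

    def add(self, transition, td_error):
        # Drop the oldest transition once the buffer is full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def probabilities(self):
        # Sampling probability of each stored transition.
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, rng=random):
        # Draw with replacement, weighted by priority.
        return rng.choices(self.items, weights=self.priorities, k=batch_size)


# Hypothetical transitions labeled by name, with made-up TD errors.
buffer = PrioritizedReplay(capacity=2)
buffer.add("a", td_error=0.1)
buffer.add("b", td_error=2.0)
buffer.add("c", td_error=0.5)  # evicts "a"
probs = buffer.probabilities()
batch = buffer.sample(4, rng=random.Random(0))
```

In this sketch the transition with the larger error ("b") receives the larger sampling probability, which is the property that distinguishes prioritized replay from uniform replay.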
