- PMFault: Faulting and Bricking Server CPUs through Management Interfaces
Authors: Zitai Chen, David Oswald Pages: 1 - 23 Abstract: Apart from the actual CPU, modern server motherboards contain other auxiliary components, for example voltage regulators for power management. Those are connected to the CPU and the separate Baseboard Management Controller (BMC) via the I2C-based PMBus. In this paper, using the case study of the widely used Supermicro X11SSL motherboard, we show how remotely exploitable software weaknesses in the BMC (or other processors with PMBus access) can be used to access the PMBus and then perform hardware-based fault injection attacks on the main CPU. The underlying weaknesses include insecure firmware encryption and signing mechanisms, a lack of authentication for the firmware upgrade process and the IPMI KCS control interface, as well as the motherboard design (with the PMBus connected to the BMC and SMBus by default). First, we show that undervolting through the PMBus allows breaking the integrity guarantees of SGX enclaves, bypassing Intel’s countermeasures against previous undervolting attacks like Plundervolt/V0ltPwn. Second, we experimentally show that overvolting outside the specified range has the potential of permanently damaging Intel Xeon CPUs, rendering the server inoperable. We assess the impact of our findings on other server motherboards made by Supermicro and ASRock. Our attacks, dubbed PMFault, can be carried out by a privileged software adversary and do not require physical access to the server motherboard or knowledge of the BMC login credentials. We responsibly disclosed the issues reported in this paper to Supermicro and discuss possible countermeasures at different levels. To the best of our knowledge, the 12th generation of Supermicro motherboards, which was designed before we reported PMFault to Supermicro, is not vulnerable. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.1-23
- Peek into the Black-Box: Interpretable Neural Network using SAT Equations in Side-Channel Analysis
Authors: Trevor Yap, Adrien Benamira, Shivam Bhasin, Thomas Peyrin Pages: 24 - 53 Abstract: Deep neural networks (DNN) have become a significant threat to the security of cryptographic implementations with regard to side-channel analysis (SCA), as they automatically combine the leakages without any preprocessing needed, leading to a more efficient attack. However, these DNNs for SCA remain mostly black-box algorithms that are very difficult to interpret. Benamira et al. recently proposed an interpretable neural network called Truth Table Deep Convolutional Neural Network (TT-DCNN), which is both expressive and easier to interpret. In particular, a TT-DCNN has a transparent inner structure that can entirely be transformed into SAT equations after training. In this work, we analyze the SAT equations extracted from a TT-DCNN when applied in an SCA context, eventually obtaining the rules and decisions that the neural networks learned when retrieving the secret key from the cryptographic primitive (i.e., exact formula). As a result, we can pinpoint the critical rules that the neural network uses to locate the exact Points of Interest (PoIs). We validate our approach first on simulated traces for higher-order masking. However, applying TT-DCNN on real traces is not straightforward. We propose a method to adapt TT-DCNN for application on real SCA traces containing thousands of sample points. Experimental validation is performed on software-based ASCADv1 and hardware-based AES_HD_ext datasets. In addition, TT-DCNN is shown to be able to learn the exact countermeasure in a best-case setting. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.24-53
- Garbled Circuits from an SCA Perspective
Authors: Itamar Levi, Carmit Hazay Pages: 54 - 79 Abstract: Garbling schemes, invented in the 80’s by Yao (FOCS’86), have been a versatile and fundamental tool in modern cryptography. A prominent application of garbled circuits is constant round secure two-party computation, which led to a long line of study of this object, where one of the most influential optimizations is Free-XOR (Kolesnikov and Schneider ICALP’08), introducing a global offset Δ for all garbled wire values where XOR gates are computed locally without garbling them. To date, garbling schemes have not been studied with respect to their side-channel attack (SCA) security characteristics, even though SCA pose a significant security threat to cryptographic devices. In this research, we demonstrate that adversaries utilizing advanced SCA tools such as horizontal attacks, mixed with advanced hypothesis building and standard (vertical) SCA tools, can jeopardize garbling implementations. Our main observation is that garbling schemes utilizing a global secret Δ open a door to quite trivial side-channel attacks. We model our side-channel attacks on the garbler’s device and discuss the asymmetric setting where various computations are not performed on the evaluator side. This enables dangerous leakage extraction on the garbler and renders our attack impossible on the evaluator’s side. Theoretically, we first demonstrate in a simulated environment that such attacks are quite devastating. Concretely, our attack is capable of extracting Δ when the circuit embeds only 8 input non-linear gates with fifth/first-order attack Success-Rates of 0.65/0.7. With as little as 3 such gates, our attack reduces the first-order Guessing Entropy of Δ from 128 to ∼ 48-bits. We further demonstrate our attack via an implementation and power measurement data over an STM 32-bit processor software implementing circuit garbling, and discuss their limitations and mitigation tactics on logical, protocol and implementation layers.
PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.54-79
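The role of the global offset Δ in Free-XOR can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 128-bit label size matches the abstract's entropy figures, but the helper names (`new_wire`, `free_xor`, `evaluate_xor`) are hypothetical, and non-XOR gates are omitted entirely. The point it shows is that every wire's two labels differ by the same secret Δ, which is why a single leaked value affects the whole circuit.

```python
import secrets

LABEL_BITS = 128  # assumed label size for this sketch


def new_wire(delta):
    """Garbler: pick a random label for 'false'; the 'true' label is offset by delta."""
    w0 = secrets.randbits(LABEL_BITS)
    return (w0, w0 ^ delta)


def free_xor(wire_a, wire_b, delta):
    """Garbler: output labels of an XOR gate, derived without any garbled table."""
    c0 = wire_a[0] ^ wire_b[0]
    return (c0, c0 ^ delta)


def evaluate_xor(label_a, label_b):
    """Evaluator: XOR the two labels it holds; it never sees delta itself."""
    return label_a ^ label_b


delta = secrets.randbits(LABEL_BITS)  # the single global secret offset
a = new_wire(delta)
b = new_wire(delta)
c = free_xor(a, b, delta)

# The evaluator always lands on the label that encodes bit_a XOR bit_b.
for bit_a in (0, 1):
    for bit_b in (0, 1):
        assert evaluate_xor(a[bit_a], b[bit_b]) == c[bit_a ^ bit_b]
```

Because `a[0] ^ a[1]`, `b[0] ^ b[1]` and `c[0] ^ c[1]` all equal `delta`, the garbler manipulates the same secret on every wire, which is the repeated exposure the abstract identifies as the attack surface.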
- On Protecting SPHINCS+ Against Fault Attacks
Authors: Aymeric Genêt Pages: 80 - 114 Abstract: SPHINCS+ is a hash-based digital signature scheme that was selected by NIST in their post-quantum cryptography standardization process. The establishment of a universal forgery on the seminal scheme SPHINCS was shown to be feasible in practice by injecting a fault when the signing device constructs any non-top subtree. Since the attack was made public, little effort has been spent to protect the SPHINCS family against attacks by faults. This paper works in this direction in the context of SPHINCS+ and analyzes the current algorithms that aim to prevent fault-based forgeries. First, the paper adapts the original attack to SPHINCS+ reinforced with randomized signing and extends the applicability of the attack to any combination of faulty and valid signatures. Considering the adaptation, the paper then presents a thorough analysis of the attack. In particular, the analysis shows that, with high probability, the security guarantees of SPHINCS+ significantly drop when a single random bit flip occurs anywhere in the signing procedure and that the resulting faulty signature cannot be detected with the verification procedure. The paper shows both in theory and experimentally that the countermeasures based on caching the intermediate W-OTS+s offer marginally greater protection against unintentional faults, and that such countermeasures are circumvented with a tolerable number of queries in an active attack. Based on these results, the paper recommends real-world deployments of SPHINCS+ to implement redundancy checks. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.80-114
- Areion: Highly-Efficient Permutations and Its Applications to Hash Functions for Short Input
Authors: Takanori Isobe, Ryoma Ito, Fukang Liu, Kazuhiko Minematsu, Motoki Nakahashi, Kosei Sakamoto, Rentaro Shiba Pages: 115 - 154 Abstract: In real-world applications, the overwhelming majority of cases require hashing with relatively short input, say up to 2K bytes. The length of almost all TCP/IP packets is between 40 and 1.5K bytes, and the maximum packet lengths of major protocols, e.g., Zigbee, Bluetooth low energy, and Controller Area Network (CAN), are less than 128 bytes. However, existing schemes are not well optimized for short input. To bridge the gap between real-world needs (in future) and limited performances of state-of-the-art hash functions for short input, we design a family of wide-block permutations Areion that fully leverages the power of AES instructions, which are widely deployed in many devices. As its applications, we propose several hash functions. Areion significantly outperforms existing schemes for short input and is even competitive for relatively long messages. Indeed, our hash function is surprisingly fast, and its performance is less than 3 cycles/byte in the latest Intel architecture for any message size. In particular, it is about 10 times faster than existing state-of-the-art schemes for short messages up to around 100 bytes, which are the most widely used input sizes in real-world applications, on both the latest CPU architectures (Ice Lake, Tiger Lake, and Alder Lake) and mobile platforms (Pixel 6 and iPhone 13). PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.115-154
- Threshold Implementations in Software: Micro-architectural Leakages in Algorithms
Authors: John Gaspoz, Siemen Dhooghe Pages: 155 - 179 Abstract: This paper provides necessary properties to algorithmically secure first-order maskings in scalar micro-architectures. The security notions of threshold implementations are adapted following micro-processor leakage effects which are known in the literature. The resulting notions, which are based on the placement of shares, are applied to a two-share randomness-free PRESENT cipher and Keccak-f. The assembly implementations are put on a RISC-V and an ARM Cortex-M4 core. All designs are validated in the glitch and transition extended probing model and their implementations via practical lab analysis. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.155-179
- High-order masking of NTRU
Authors: Jean-Sébastien Coron, François Gérard, Matthias Trannoy, Rina Zeitoun Pages: 180 - 211 Abstract: The main protection against side-channel attacks consists in computing every function with multiple shares via the masking countermeasure. While the masking countermeasure was originally developed for securing block-ciphers such as AES, the protection of lattice-based cryptosystems is often more challenging, because of the diversity of the underlying algorithms. In this paper, we introduce new gadgets for the high-order masking of the NTRU cryptosystem, with security proofs in the classical ISW probing model. We then describe the first fully masked implementation of the NTRU Key Encapsulation Mechanism submitted to NIST, including the key generation. To assess the practicality of our countermeasures, we provide a concrete implementation on ARM Cortex-M3 architecture, and eventually a t-test leakage evaluation. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.180-211
- FaultMeter: Quantitative Fault Attack Assessment of Block Cipher Software
Authors: Keerthi K, Chester Rebeiro Pages: 212 - 240 Abstract: Fault attacks are a potent class of physical attacks that exploit a fault injected during device operation to steal secret keys from a cryptographic device. The success of a fault attack depends intricately on (a) the cryptographic properties of the cipher, (b) the program structure, and (c) the underlying hardware architecture. While there are several tools that automate the process of fault attack evaluation, none of them consider all three influencing aspects. This paper proposes a framework called FaultMeter that builds on the state of the art by not just identifying fault vulnerable locations in a block cipher software, but also providing a quantification for each vulnerable location. The quantification provides a probability that an injected fault can be successfully exploited. It takes into consideration the cryptographic properties of the cipher, structure of the implementation, and the underlying Instruction Set Architecture’s (ISA) susceptibility to faults. We demonstrate an application of FaultMeter to automatically insert optimal amounts of countermeasures in a program to meet the user’s security requirements while minimizing overheads. We demonstrate the versatility of the FaultMeter framework by evaluating five cipher implementations on multiple hardware platforms, namely, ARM (32 and 64 bit), RISC-V (32 and 64 bit), TI MSP-430 (16-bit) and Intel x86 (64-bit). PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.212-240
- How Secure is Exponent-blinded RSA–CRT with Sliding Window Exponentiation?
Authors: Rei Ueno, Naofumi Homma Pages: 241 - 269 Abstract: This paper presents the first security evaluation of exponent-blinded RSA–CRT implementation with sliding window exponentiation against cache attacks. Our main contributions are threefold. (1) We demonstrate an improved cache attack using Flush+Reload on RSA–CRT to estimate the squaring–multiplication operational sequence. The proposed method can estimate a correct squaring–multiplication sequence from one Flush+Reload trace, while the existing Flush+Reload attacks always contain errors in the sequence estimation. This is mandatory for the subsequent steps in the proposed attack. (2) We present the first partial key exposure attack on exponent-blinded RSA–CRT with a random-bit leak. The proposed attack first estimates a random mask for blinding the exponent using a modification of the Schindler–Wiemers continued fraction attack, and then recovers the secret key using an extension of the Heninger–Shacham branch-and-prune attack. We experimentally show that the proposed attack on RSA–CRT using a practical window size of 5 with 16-, 32-, and 64-bit masks is carried out with complexity of 2^{25.6}, 2^{67.7}, and 2^{161}, respectively. (3) We then investigate the tradeoffs between mask bit length and implementation performance. Exponent-blinded RSA–CRT using a sliding window with a 32- and 64-bit mask is 15% and 10% faster than that with a 128-bit mask, respectively, as we confirmed that 32- and 64-bit masks are sufficient to defeat the proposed attack. Our source code used in the experiment is publicly available. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.241-269
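To make the "squaring–multiplication operational sequence" concrete, the following Python sketch implements a textbook left-to-right sliding window exponentiation and records each operation as `S` or `M`. It is an illustration under stated assumptions, not the code evaluated in the paper: the function name `sliding_window_pow` and the trace format are hypothetical, and the initial squarings of 1 that a real library would skip are kept for simplicity.

```python
def sliding_window_pow(base, exponent, modulus, w=5):
    """Return (base**exponent % modulus, trace) where trace is a string of S/M ops."""
    if exponent == 0:
        return 1 % modulus, ""
    base %= modulus
    square = base * base % modulus
    # Precompute the odd powers base^1, base^3, ..., base^(2^w - 1).
    odd_powers = {1: base}
    for k in range(3, 1 << w, 2):
        odd_powers[k] = odd_powers[k - 2] * square % modulus

    bits = bin(exponent)[2:]
    result, trace, i = 1, [], 0
    while i < len(bits):
        if bits[i] == "0":
            # A zero bit outside any window costs one squaring.
            result = result * result % modulus
            trace.append("S")
            i += 1
        else:
            # Take the longest window of at most w bits that ends in a 1 bit.
            j = min(i + w, len(bits))
            while bits[j - 1] == "0":
                j -= 1
            window = int(bits[i:j], 2)
            for _ in range(j - i):
                result = result * result % modulus
                trace.append("S")
            result = result * odd_powers[window] % modulus
            trace.append("M")
            i = j
    return result, "".join(trace)


value, trace = sliding_window_pow(7, 0b1011001, 1009, w=3)
assert value == pow(7, 0b1011001, 1009)
```

The position of each `M` in the trace reveals where a window ended on a 1 bit, so an observer who recovers the trace learns some exponent bits but not all of them; this partial, scattered knowledge is the kind of "random-bit leak" the abstract refers to.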
- Some New Methods to Generate Short Addition Chains
Authors: Yuanchao Ding, Hua Guo, Yewei Guan, Hutao Song, Xiyong Zhang, Jianwei Liu Pages: 270 - 285 Abstract: Modular exponentiation and scalar multiplication are important operations in most public-key cryptosystems, and their efficient computation is essential to cryptosystems. The shortest addition chain is one of the most important mathematical concepts to realize the optimization of computation. However, finding a shortest addition chain of length r is generally regarded as an NP-hard problem, whose time complexity is comparable to O(r!). This paper proposes some novel methods to generate short addition chains. We first present a Simplified Power-tree method by deeply deleting the power-tree whose time complexity is reduced to O(r^2). In this paper, a Cross Window method and its variant are introduced by improving the Window method. The Cross Window method uses the cross correlation to deal with the windows and its pre-computation is optimized by a new Addition Sequence Algorithm. The theoretical analysis is conducted to show the correctness and effectiveness. Meanwhile, our experiments show that the new methods can obtain shorter addition chains compared to the existing methods. The Cross Window method with the Addition Sequence algorithm can attain 44.74% and 9.51% reduction of the addition chain length, in the best case, compared to the Binary method and the Window method respectively. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.270-285
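As background for the Binary method used as a baseline above, here is a small Python sketch of how it produces an addition chain, together with a validity checker. The function names `binary_chain` and `is_addition_chain` are hypothetical, and none of the paper's own methods (Simplified Power-tree, Cross Window) are reproduced here.

```python
def binary_chain(n):
    """Addition chain for n >= 1 from the left-to-right binary (double-and-add) method."""
    chain = [1]
    for bit in bin(n)[3:]:          # skip the '0b' prefix and the leading 1 bit
        chain.append(chain[-1] * 2)  # doubling step
        if bit == "1":
            chain.append(chain[-1] + 1)  # add-one step
    return chain


def is_addition_chain(chain):
    """Check that the chain starts at 1 and every later element is a sum of two earlier ones."""
    if not chain or chain[0] != 1:
        return False
    for k in range(1, len(chain)):
        earlier = chain[:k]
        if not any((chain[k] - a) in earlier for a in earlier):
            return False
    return True


chain_15 = binary_chain(15)
assert chain_15 == [1, 2, 3, 6, 7, 14, 15]       # 6 additions
assert is_addition_chain([1, 2, 3, 6, 12, 15])   # a shorter chain: 5 additions
```

For n = 15 the Binary method needs 6 additions while the hand-picked chain needs 5, which shows on a tiny case why searching for shorter chains can pay off when the chain length equals the number of group operations.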
- Efficient Private Circuits with Precomputation
Authors: Weijia Wang, Fanjie Ji, Juelin Zhang, Yu Yu Pages: 286 - 309 Abstract: At CHES 2022, Wang et al. described a new paradigm for masked implementations using private circuits, where most intermediates can be precomputed before the input shares are accessed, significantly accelerating the online execution of masked functions. However, the masking scheme they proposed mainly featured (and was designed for) the cost amortization, leaving its (limited) suitability in the above precomputation-based paradigm just as a bonus. This paper aims to provide an efficient, reliable, easy-to-use, and precomputation-compatible masking scheme. We propose a new masked multiplication over the finite field Fq suitable for the precomputation, and prove its security in the composable notion called Probing-Isolating Non-Inference (PINI). Particularly, the operations (e.g., AND and XOR) in the binary field can be achieved by assigning q = 2, allowing the bitsliced implementation that has been shown to be quite efficient for the software implementations. The new masking scheme is applied to leverage the masking of AES and SKINNY block ciphers on ARM Cortex M architecture. The performance results show that the new scheme contributes to a significant speed-up compared with the state-of-the-art implementations. For SKINNY with block size 64, the speed and RAM requirement can be significantly improved (saving around 45% cycles in the online-computation and 60% RAM space for precomputed values) from AES-128, thanks to its smaller number of AND gates. Besides the security proof by hand, we provide formal verifications for the multiplication and T-test evaluations for the masked implementations of AES and SKINNY. Because of the structure of the new masked multiplication, our formal verification can be performed for security orders up to 16. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.286-309
- Conditional Variational AutoEncoder based on Stochastic Attacks
Authors: Gabriel Zaid, Lilian Bossuet, Mathieu Carbone, Amaury Habrard, Alexandre Venelli Pages: 310 - 357 Abstract: In recent years, the cryptanalysis community has leveraged the potential of research on Deep Learning to enhance attacks. In particular, several studies have recently highlighted the benefits of Deep Learning based Side-Channel Attacks (DLSCA) to target real-world cryptographic implementations. While this new research area on applied cryptography provides impressive results to recover a secret key even when countermeasures are implemented (e.g. desynchronization, masking schemes), the lack of theoretical results makes the construction of appropriate and powerful models a notoriously hard problem. This can be problematic during an evaluation process where a security bound is required. In this work, we propose the first solution that bridges DL and SCA in order to get this security bound. Based on theoretical results, we develop the first Machine Learning generative model, called Conditional Variational AutoEncoder based on Stochastic Attacks (cVAE-SA), designed from the well-known Stochastic Attacks, which were introduced by Schindler et al. in 2005. This model reduces the black-box property of DL and eases the architecture design for every real-world crypto-system as we define theoretical complexity bounds which only depend on the dimension of the (reduced) trace and the target variable over F_{2^n}. We validate our theoretical proposition through simulations and public datasets on a wide range of use cases, including multi-task learning, curse of dimensionality and masking scheme. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.310-357
- Speeding Up Multi-Scalar Multiplication over Fixed Points Towards Efficient zkSNARKs
Authors: Guiwen Luo, Shihui Fu, Guang Gong Pages: 358 - 380 Abstract: The arithmetic of computing multiple scalar multiplications in an elliptic curve group and then adding them together is called multi-scalar multiplication (MSM). MSM over fixed points dominates the time consumption in the pairing-based trusted setup zero-knowledge succinct non-interactive argument of knowledge (zkSNARK), thus for practical applications we would appreciate fast algorithms to compute it. This paper proposes a bucket set construction that can be utilized in the context of Pippenger’s bucket method to speed up MSM over fixed points with the help of precomputation. When the proposed construction is instantiated over the BLS12-381 curve and used to compute n-scalar multiplications for n = 2^e (10 ≤ e ≤ 21), theoretical analysis indicates that it saves more than 21% computational cost compared to Pippenger’s bucket method, and that it saves 2.6% to 9.6% computational cost compared to the most popular variant of Pippenger’s bucket method. Finally, our experimental result demonstrates the feasibility of accelerating the computation of MSM over fixed points using large precomputation tables as well as the effectiveness of our new construction. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.358-380
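For readers unfamiliar with the baseline, the following Python sketch shows the plain Pippenger bucket method for MSM. It deliberately does not implement the paper's new bucket set construction or any precomputation. As a loudly stated assumption, the elliptic curve group is replaced by a toy additive group (integers modulo `M`) so the result can be checked directly; the names `msm_bucket`, `toy_add` and `M` are hypothetical.

```python
def msm_bucket(scalars, points, c, scalar_bits, add, zero):
    """Compute sum(s_i * P_i) with the bucket method, using only the group operation `add`."""
    num_windows = (scalar_bits + c - 1) // c
    window_sums = []
    for w in range(num_windows):
        # buckets[d] accumulates every point whose c-bit digit in this window equals d.
        buckets = [zero] * (1 << c)
        for s, p in zip(scalars, points):
            digit = (s >> (w * c)) & ((1 << c) - 1)
            if digit:
                buckets[digit] = add(buckets[digit], p)
        # Running-sum trick: sum over d of d * buckets[d] without scalar multiplications.
        running, total = zero, zero
        for d in range((1 << c) - 1, 0, -1):
            running = add(running, buckets[d])
            total = add(total, running)
        window_sums.append(total)
    # Combine windows from the most significant one down, doubling c times in between.
    result = zero
    for total in reversed(window_sums):
        for _ in range(c):
            result = add(result, result)
        result = add(result, total)
    return result


# Toy stand-in group (NOT an elliptic curve): integers modulo M under addition.
M = 2**61 - 1
toy_add = lambda a, b: (a + b) % M

scalars = [0x1234, 0xBEEF, 0x0001, 0xFFFF]
points = [11, 22, 33, 44]
got = msm_bucket(scalars, points, c=4, scalar_bits=16, add=toy_add, zero=0)
assert got == sum(s * p for s, p in zip(scalars, points)) % M
```

With real curve points, `add` would be the point addition and `zero` the point at infinity; the cost is dominated by the number of `add` calls, which is the quantity the abstract's percentage savings refer to.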
- A Closer Look at the Chaotic Ring Oscillators based TRNG Design
Authors: Shuqin Su, Bohan Yang, Vladimir Rožić, Mingyuan Yang, Min Zhu, Shaojun Wei, Leibo Liu Pages: 381 - 417 Abstract: TRNG is an essential component for security applications. A vulnerable TRNG could be exploited to facilitate potential attacks or be related to a reduced key space, and eventually result in a compromised cryptographic system. A digital FIRO-/GARO-based TRNG with high throughput and high entropy rate was introduced by Jovan Dj. Golic (TC’06). However, the fact that periodic oscillation is a main failure of FIRO-/GARO-based TRNGs was noted by Markus Dichtl (ePrint’15). We verify this problem and estimate the consequential entropy loss using Lyapunov exponents and the test suite of the NIST SP 800-90B standard. To address the problem of periodic oscillations, we propose several implementation guidelines based on a gate-level model, a design methodology to build a reliable GARO-based TRNG, and an online test to improve the robustness of FIRO-/GARO-based TRNGs. The gate-level implementation guidelines illustrate the causes of periodic oscillations, which are verified by actual implementation and bifurcation diagram. Based on the design methodology, a suitable feedback polynomial can be selected by evaluating the feedback polynomials. The analysis and understanding of periodic oscillation and FIRO-/GARO-based TRNGs are deepened by delay adjustment. A TRNG with the selected feedback polynomial may occasionally enter periodic oscillations, due to active attacks and the delay inconstancy of implementations. This inconstancy might be caused by self-heating, temperature and voltage fluctuation, and the process variation among different silicon chips. Thus, an online test module, as one indispensable component of TRNGs, is proposed to detect periodic oscillations. The detected periodic oscillation can be eliminated by adjusting feedback polynomial or delays to improve the robustness.
The online test module is composed of a lightweight and responsive detector with a high detection rate, outperforming the existing detector design and statistical tests. The areas, power consumptions and frequencies are evaluated based on the ASIC implementations of a GARO, the sampling circuit and the online test module. The gate-level implementation guidelines promote the future establishment of the stochastic model of FIRO-/GARO-based TRNGs with a deeper understanding. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.381-417
- Pushing the Limits of Generic Side-Channel Attacks on LWE-based KEMs - Parallel PC Oracle Attacks on Kyber KEM and Beyond
Authors: Gokulnath Rajendran, Prasanna Ravi, Jan-Pieter D’Anvers, Shivam Bhasin, Anupam Chattopadhyay Pages: 418 - 446 Abstract: In this work, we propose generic and novel adaptations to the binary Plaintext-Checking (PC) oracle based side-channel attacks for Kyber KEM. These attacks operate in a chosen-ciphertext setting, and are fairly generic and easy to mount on a given target, as the attacker requires minimal information about the target device. However, these attacks have an inherent disadvantage of requiring a few thousand traces to perform full key recovery. This is due to the fact that these attacks typically work by recovering a single bit of information about the secret key per query/trace. In this respect, we propose novel parallel PC oracle based side-channel attacks, which are capable of recovering a generic P number of bits of information about the secret key in a single query/trace. We propose novel techniques to build chosen-ciphertexts so as to efficiently realize a parallel PC oracle for Kyber KEM. We also build a multi-class classifier, which is capable of realizing a practical side-channel based parallel PC oracle with very high success rate. We experimentally validated the proposed attacks (up to P = 10) on the fastest implementation of unprotected Kyber KEM in the pqm4 library. Our experiments yielded improvements in the range of 2.89× to 7.65× in the number of queries, compared to state-of-the-art binary PC oracle attacks, while arbitrarily higher improvements are possible for a motivated attacker, given the generic nature of the proposed attacks. We further conduct a thorough study on applicability to different scenarios, based on the presence/absence of a clone device, and also partial key recovery.
Finally, we also show that the proposed attacks are able to achieve the lowest number of queries for key recovery, even for implementations protected with low-cost countermeasures such as shuffling. Our work therefore concretely demonstrates the power of PC oracle attacks on Kyber KEM, thereby stressing the need for concrete countermeasures such as masking for Kyber and other lattice-based KEMs. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.418-446
- Fiddling the Twiddle Constants - Fault Injection Analysis of the Number Theoretic Transform
Authors: Prasanna Ravi, Bolin Yang, Shivam Bhasin, Fan Zhang, Anupam Chattopadhyay Pages: 447 - 481 Abstract: In this work, we present the first fault injection analysis of the Number Theoretic Transform (NTT). The NTT is an integral computation unit, widely used for polynomial multiplication in several structured lattice-based key encapsulation mechanisms (KEMs) and digital signature schemes. We identify a critical single fault vulnerability in the NTT, which severely reduces the entropy of its output. This in turn enables us to perform a wide range of attacks applicable to lattice-based KEMs as well as signature schemes. In particular, we demonstrate novel key recovery and message recovery attacks targeting the key generation and encryption procedure of Kyber KEM. We also propose novel existential forgery attacks targeting deterministic and probabilistic signing procedure of Dilithium, followed by a novel verification bypass attack targeting its verification procedure. All proposed exploits are demonstrated with high success rate using electromagnetic fault injection on optimized implementations of Kyber and Dilithium, from the open-source pqm4 library on the ARM Cortex-M4 microcontroller. We also demonstrate that our proposed attacks are capable of bypassing concrete countermeasures against existing fault attacks on lattice-based KEMs and signature schemes. We believe our work motivates the need for more research towards development of countermeasures for the NTT against fault injection attacks. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.447-481
- Prime-Field Masking in Hardware and its Soundness against Low-Noise SCA Attacks
Authors: Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, François-Xavier Standaert Pages: 482 - 518 Abstract: A recent study suggests that arithmetic masking in prime fields leads to stronger security guarantees against passive physical adversaries than Boolean masking. Indeed, it is a common observation that the desired security amplification of Boolean masking collapses when the noise level in the measurements is too low. Arithmetic encodings in prime fields can help to maintain an exponential increase of the attack complexity in the number of shares even in such a challenging context. In this work, we contribute to this emerging topic in two main directions. First, we propose novel masked hardware gadgets for secure squaring in prime fields (since squaring is non-linear in non-binary fields) which prove to be significantly more resource-friendly than corresponding masked multiplications. We then formally show their local and compositional security for arbitrary orders. Second, we attempt to experimentally evaluate the performance vs. security tradeoff of prime-field masking. In order to enable a first comparative case study in this regard, we exemplarily consider masked implementations of the AES as well as the recently proposed AES-prime. AES-prime is a block cipher partially resembling the standard AES, but based on arithmetic operations modulo a small Mersenne prime. We present cost and performance figures for masked AES and AES-prime implementations, and experimentally evaluate their susceptibility to low-noise side-channel attacks. We consider both the dynamic and the static power consumption for our low-noise analyses and emulate strong adversaries. Static power attacks are indeed known as a threat for side-channel countermeasures that require a certain noise level to be effective because of the adversary’s ability to reduce the noise through intra-trace averaging.
Our results show consistently that for the noise levels in our practical experiments, the masked prime-field implementations provide much higher security for the same number of shares. This compensates for the overheads prime computations lead to and remains true even if / despite leaking each share with a similar Signal-to-Noise Ratio (SNR) as their binary equivalents. We hope our results open the way towards new cipher designs tailored to best exploit the advantages of prime-field masking. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.482-518
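The difference between Boolean and prime-field (arithmetic) masking, and the reason squaring needs a dedicated gadget in a prime field, can be sketched in Python as follows. This is an illustration only: the modulus `P = 127` is an assumed example of a small Mersenne prime, the helper names are hypothetical, and nothing here models the paper's hardware gadgets or their security proofs.

```python
import secrets

P = 2**7 - 1  # 127, an assumed small Mersenne prime for this sketch


def share_boolean(x, n, bits=8):
    """Split x into n shares whose XOR equals x."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    last = x
    for s in shares:
        last ^= s
    return shares + [last]


def unshare_boolean(shares):
    x = 0
    for s in shares:
        x ^= s
    return x


def share_prime(x, n, p=P):
    """Split x into n shares whose sum modulo p equals x."""
    shares = [secrets.randbelow(p) for _ in range(n - 1)]
    return shares + [(x - sum(shares)) % p]


def unshare_prime(shares, p=P):
    return sum(shares) % p


assert unshare_boolean(share_boolean(0xAB, 3)) == 0xAB
assert unshare_prime(share_prime(100, 3)) == 100

# Squaring each share separately does NOT square the secret in a prime field:
# 5 = 3 + 2 (mod 127), but 3^2 + 2^2 = 13 while 5^2 = 25.
fixed_shares = [3, 2]
assert unshare_prime(fixed_shares) == 5
assert unshare_prime([s * s % P for s in fixed_shares]) != (5 * 5) % P
```

The last assertion reflects the cross term 2·3·2 that share-wise squaring drops; a masked squaring gadget has to compute that term without ever recombining the shares, which is the kind of circuit the abstract refers to.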
- Efficient Persistent Fault Analysis with Small Number of Chosen Plaintexts
Authors: Fan Zhang, Run Huang, Tianxiang Feng, Xue Gong, Yulong Tao, Kui Ren, Xinjie Zhao, Shize Guo Pages: 519 - 542 Abstract: In 2018, Zhang et al. introduced the Persistent Fault Analysis (PFA) for the first time, which uses statistical features of ciphertexts caused by faulty Sbox to recover the key of block ciphers. However, for most of the variants of PFA, the prior knowledge of the fault (location and value) is required, where the corresponding analysis will get more difficult under the scenario of multiple faults. To bypass this prerequisite and improve the analysis efficiency for multiple faults, we propose Chosen-Plaintext based Persistent Fault Analysis (CPPFA). CPPFA introduces chosen-plaintext to facilitate PFA and can reduce the key search space of AES-128 to an extremely small size. Our proposal requires 256 ciphertexts, while previous state-of-the-art work still requires 1509 and 1448 ciphertexts under 8 and 16 faults, respectively, at the only cost of requiring 256 chosen plaintexts. In particular, CPPFA can be applied to the multiple faults scenarios where all fault locations, values and quantity are unknown, and the worst time complexity of CPPFA is O(2^{8+n_f}) for AES-128, where n_f represents the number of faults. The experimental results show that when n_f > 4, 256 pairs of plaintext-ciphertext can recover the master key of AES-128. As for LED-64, only 16 pairs of plaintext-ciphertext reduce the remaining key search space to 2^{10}. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.519-542
- RDS: FPGA Routing Delay Sensors for Effective Remote Power Analysis Attacks
Authors: David Spielmann, Ognjen Glamočanin, Mirjana Stojilović Pages: 543 - 567 Abstract: State-of-the-art sensors for measuring FPGA voltage fluctuations are time-to-digital converters (TDCs). They allow detecting voltage fluctuations in the order of a few nanoseconds. The key building component of a TDC is a delay line, typically implemented as a chain of fast carry propagation multiplexers. In FPGAs, the fast carry chains are constrained to dedicated logic and routing, and need to be routed strictly vertically. In this work, we present an alternative approach to designing on-chip voltage sensors, in which the FPGA routing resources replace the carry logic. We present three variants of what we name a routing delay sensor (RDS): one vertically constrained, one horizontally constrained, and one free of any constraints. We perform a thorough experimental evaluation on both the Sakura-X side-channel evaluation board and the Alveo U200 datacenter card, to evaluate the performance of RDS sensors in the context of a remote power side-channel analysis attack. The results show that our best RDS implementation in most cases outperforms the TDC. On average, for breaking the full 128-bit key of an AES-128 cryptographic core, an adversary requires 35% fewer side-channel traces when using the RDS than when using the TDC. Besides making the attack more effective, given the absence of the placement and routing constraint, the RDS sensor is also easier to deploy. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.543-567
- Improved Attacks on (EC)DSA with Nonce Leakage by Lattice Sieving with
Predicate Authors: Luyao Xu, Zhengyi Dai, Baofeng Wu, Dongdai Lin Pages: 568 - 586 Abstract: Lattice reduction algorithms have proved to be one of the most powerful and versatile tools in public key cryptanalysis. In this work, we primarily concentrate on lattice attacks against (EC)DSA with nonce leakage obtained via side-channel analysis. Previous works relying on lattice reduction algorithms such as LLL and BKZ eventually run into the “lattice barrier”: lattice algorithms become infeasible when only a few nonce bits are known. Recently, Albrecht and Heninger introduced lattice algorithms augmented with a predicate and broke the lattice barrier (Eurocrypt 2021). We improve their work in several aspects. We first propose a more efficient predicate algorithm which aims to search for the target lattice vector in a large database. Then, we combine the sieving-with-predicate algorithm with the “dimensions for free” and “progressive sieving” techniques to further improve the performance of our attacks. Furthermore, we give a theoretical analysis of how to choose the optimal Kannan embedding factor. As a result, our algorithm outperforms the state-of-the-art lattice attacks for existing records such as 3-bit nonce leakage for a 256-bit curve and 2-bit nonce leakage for a 160-bit curve in terms of running time, sample numbers and success probability. We also break the lattice records on the 384-bit curve with 3-bit nonce leakage and the 256-bit curve with 2-bit nonce leakage, which were previously thought infeasible. Finally, we give the first lattice attack against ECDSA with a single-bit nonce leakage, which enables us to break a 112-bit curve with 1-bit nonce leakage in practical time. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.568-586
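For readers unfamiliar with the setting, the textbook reduction from ECDSA nonce leakage to a lattice problem can be written out as follows. This is the standard hidden number problem formulation with a Kannan-style embedding, given as a sketch and not necessarily the exact lattice used in the paper. The symbols $m$ (number of signatures), $\ell$ (leaked bits, assumed for simplicity to be leading zero bits), $K$, $t_i$, $a_i$ and $\tau$ (embedding factor) are notation introduced here.

```latex
% Signature (r_i, s_i) on hash h_i, secret key d, nonce k_i, group order q.
\[
  s_i \equiv k_i^{-1}(h_i + r_i d) \pmod{q}
  \;\Longrightarrow\;
  k_i \equiv t_i d + a_i \pmod{q},
  \qquad t_i = s_i^{-1} r_i,\quad a_i = s_i^{-1} h_i .
\]
% If the top \ell bits of every nonce are zero, each k_i is unusually small:
\[
  0 \le k_i < K = q / 2^{\ell}, \qquad i = 1, \dots, m .
\]
% Rows of M generate a lattice that contains a short vector encoding d:
\[
  M =
  \begin{pmatrix}
    q      &        &        &     &      \\
           & \ddots &        &     &      \\
           &        & q      &     &      \\
    t_1    & \cdots & t_m    & K/q &      \\
    a_1    & \cdots & a_m    &     & \tau
  \end{pmatrix},
  \qquad
  \mathbf{v} = \bigl(k_1, \dots, k_m,\; K d / q,\; \tau\bigr) .
\]
% v = d * (row of t_i) + 1 * (row of a_i) + integer multiples of the q-rows.
```

The fewer bits leak, the larger $K$ is and the less $\mathbf{v}$ stands out from other lattice vectors, which is the "lattice barrier" the abstract refers to.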
- “Whispering MLaaS”
Authors: Shubhi Shukla, Manaar Alam, Sarani Bhattacharya, Pabitra Mitra, Debdeep Mukhopadhyay Pages: 587 - 613 Abstract: While recent advancements of Deep Learning (DL) in solving complex real-world tasks have spurred their popularity, the usage of privacy-rich data for their training in varied applications has made them an overly-exposed threat surface for privacy violations. Moreover, the rapid adoption of cloud-based Machine-Learning-as-a-Service (MLaaS) has broadened the threat surface to various remote side-channel attacks. In this paper, for the first time, we show one such privacy violation by observing a data-dependent timing side-channel (which we name Class-Leakage) originating from a non-constant-time branching operation in a widely popular DL framework, namely PyTorch. We further escalate this timing variability to a practical inference-time attack where an adversary with user-level privileges and hard-label black-box access to an MLaaS can exploit Class-Leakage to compromise the privacy of MLaaS users. DL models have also been shown to be vulnerable to Membership Inference Attack (MIA), where the primary objective of an adversary is to deduce whether any particular data has been used while training the model. Differential Privacy (DP) has been proposed in recent literature as a popular countermeasure against MIA, where inclusivity and exclusivity of a data-point in a dataset cannot be ascertained by definition. In this paper, we also demonstrate that the existence of a data-point within the training dataset of a DL model secured with DP can still be distinguished using the identified timing side-channel. In addition, we propose an efficient countermeasure to the problem by introducing a constant-time branching operation that alleviates the Class-Leakage. We validate the approach using five pre-trained DL models trained on two standard benchmarking image classification datasets, CIFAR-10 and CIFAR-100, over two different computing environments having Intel Xeon and Intel i7 processors. PubDate: 2023-03-06 DOI: 10.46586/tches.v2023.i2.587-613
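The general shape of a data-dependent branch and a branch-free replacement can be sketched as below. This is an illustration of the pattern only, not the authors' countermeasure or PyTorch code: the abstract does not say which operation is affected, so a running maximum is used purely as an example, and all function names are hypothetical. Pure Python gives no real constant-time guarantee; the point is the structure of the selection.

```python
def branching_max(a: int, b: int) -> int:
    """Data-dependent branch: which path runs depends on the values compared."""
    if a < b:
        return b
    return a


def branchless_max(a: int, b: int) -> int:
    """Same result without an if/else: select b through an all-ones mask."""
    mask = -int(a < b)            # 0 when a >= b, -1 (all bits set) when a < b
    return a ^ ((a ^ b) & mask)   # a when mask == 0, b when mask == -1


def running_max(values, select=branchless_max):
    """Reduce a non-empty sequence with the chosen selection primitive."""
    result = values[0]
    for v in values[1:]:
        result = select(result, v)
    return result


print(running_max([3, 9, -4, 9, 0]))
```

In a compiled kernel the same mask-and-select idea executes an identical instruction sequence for every input, which is what removes the timing difference.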