Part I: Introduction to Post Quantum Cryptography
Tutorial@CHES 2017 - Taipei Tim Güneysu Ruhr-Universität Bochum & DFKI 04.10.2017
Overview
Goals: Provide a high-level introduction to Post-Quantum Cryptography (PQC)
– Code-Based Cryptography – Lattice-Based Cryptography – Hash-Based Cryptography
and systems long-term security is an essential requirement
have tight constraints with their computational resources
10-30 years > 15 years 10 years 5-25 years
– Symmetric encryption: Advanced Encryption Standard – Asymmetric encryption: RSA (Factorization Problem), ElGamal or Elliptic Curve Cryptography (DLOG Problem)
available for these real-world cryptosystems
resist best known (cryptanalytic) attack
and computing capabilities of a powerful attacker (e.g. NSA)
Source: ECRYPT II Yearly Key Size Report
Short-term security (days to months) Mid-term security (years to decades) Long-term security (many years)
(symmetric)
cryptosystems are closely related
cryptanalysis is likely to affect both PKC classes
powerful quantum computers
cryptosystems is required
NIST Call for PQC (Nov 30)
NP-hard problems?
(such as Grover's/Shor's alg.) with quantum computers
comparable to currently employed cryptosystems
– Class of cryptographic schemes based on the classical computing paradigm – Designed to provide security in the era of quantum computers
– PQC ≠ quantum cryptography!
– Code-based – Lattice-based – Hash-based – Multivariate-quadratic – Supersingular isogenies
and/or digital signatures
– CHES 2001: Bailey et al.: NTRU in Small Devices
– CHES 2004: Yang et al.: TTS on SmartCards
– CHES 2008: Bogdanov et al.: MQ-Cryptosystems in HW
– CHES 2009: Eisenbarth et al.: MicroEliece
– CHES 2011: Session on Lattice-based attacks (3 papers)
– CHES 2012: High-Performance McEliece+MQ+Lattices; GLS-Cryptosystem
– CHES 2013: McBits + QC-MDPC McEliece Implementations
– CHES 2014: RingLWE + Lattice-based Signature Implementations
– CHES 2015: Session on Lattice crypto (2 papers), Homomorphic Encryption
– CHES 2016: QcBits, Fault-Attack on BLISS signature scheme
– CHES 2017: Tomorrow's session on PQC (3 papers)
cryptographic constructions
techniques and algorithmic tweaks
analysis and fault-injection attacks
confidence considering potential attacks
attacks from quantum-computers
devices, Internet infrastructures and Cloud services
Implementierungsaspekte alternativer asymmetrischer Kryptosysteme (Implementation Aspects of Alternative Asymmetric Cryptosystems)
ICT-644729
(2015-2019) (2015-2019) (2012-2017)
+ more
– Code-Based Cryptography – Lattice-Based Cryptography – Hash-Based Cryptography
applications
redundancy
Some problems in coding theory are NP-complete
Possible foundation for Code-Based Cryptosystems (CBC)
Matrix size of G: k x n
– $H$: parity-check matrix of size $(n - k) \times n$
– $s$: vector in $\mathrm{GF}(2)^{n-k}$
– $t$: positive integer (defined by error correction capability)
Find a vector $e \in \mathrm{GF}(2)^n$ of weight $w(e) \le t$ s.t. $H \cdot e^T = s$
– E.R. BERLEKAMP, R.J. MCELIECE and H.C. VAN TILBORG On the inherent intractability of certain coding problems. IEEE Transactions on Information Theory, 24(3), May 1978.
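The syndrome decoding problem above can be made concrete with a brute-force search. The following is a minimal sketch, assuming a tiny illustrative parity-check matrix (`H_EXAMPLE`, the [7,4] Hamming code, chosen here and not taken from the slides). The function names are hypothetical. The search tries all error patterns of weight at most $t$, so its cost grows with $\sum_{w \le t} \binom{n}{w}$, which is what makes the problem infeasible at cryptographic sizes.

```python
from itertools import combinations

# Assumption: toy parity-check matrix of the [7,4] Hamming code (n = 7, k = 4).
H_EXAMPLE = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(H, e):
    """Compute s = H * e^T over GF(2)."""
    return [sum(h & b for h, b in zip(row, e)) % 2 for row in H]

def syndrome_decode(H, s, t):
    """Exhaustively search for e with weight <= t and H * e^T = s."""
    n = len(H[0])
    for weight in range(t + 1):
        for support in combinations(range(n), weight):
            e = [1 if i in support else 0 for i in range(n)]
            if syndrome(H, e) == s:
                return e
    return None  # no error vector of weight <= t matches the syndrome

e = [0, 0, 0, 1, 0, 0, 0]
s = syndrome(H_EXAMPLE, e)
print(s, syndrome_decode(H_EXAMPLE, s, t=1))
```

For this toy code every nonzero syndrome matches exactly one column of the matrix, so weight-1 errors are found immediately.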
Niederreiter [N86]
Key Generation: Given a code $C[n, k, d]$ with parity-check matrix $H$ and error-correcting capability $t$
Private Key: $(S, H, P)$, where $S$ is a scrambling and $P$ a permutation matrix
Public Key: $H' = S \cdot H \cdot P$
Encryption: Encode the message $m$ into an error vector $e \in_R \mathbb{F}_2^n$, $\mathrm{wt}(e) \le t$
$x \leftarrow H' \cdot e^T$
Decryption: Let $\Psi_H$ be a $t$-error-correcting decoding algorithm.
$P \cdot e^T \leftarrow \Psi_H(S^{-1} \cdot x)$
Extract $e$ by computing $P^{-1} \cdot (P \cdot e^T)$ and transposing, then recover $m$ from $e$.
McEliece [M78]
Key Generation: Given a code $C[n, k, d]$ with generator matrix $G$ and error-correcting capability $t$
Private Key: $(S, G, P)$, where $S$ is a scrambling and $P$ a permutation matrix
Public Key: $G' = S \cdot G \cdot P$
Encryption: Message $m \in \mathbb{F}_2^{n-r}$ (with $n - r = k$), error vector $e \in_R \mathbb{F}_2^n$, $\mathrm{wt}(e) \le t$
$x \leftarrow mG' + e$
Decryption: Let $\Psi_H$ be a $t$-error-correcting decoding algorithm.
$m \cdot S \leftarrow \Psi_H(x \cdot P^{-1})$, removing the error $e$
Extract $m$ by computing $(m \cdot S) \cdot S^{-1}$
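The McEliece steps can be traced on a toy instance. The sketch below is only an illustration under stated assumptions. It uses the [7,4] Hamming code with $t = 1$ (chosen here for brevity, not a code family from the slides and completely insecure), a syndrome lookup as the decoder, and hypothetical function names.

```python
import random

# Assumption: systematic [7,4] Hamming code, corrects t = 1 error.
G = [[1, 0, 0, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0], [1, 0, 1, 1, 0, 1, 0], [0, 1, 1, 1, 0, 0, 1]]

def mat_mul(A, B):
    """Matrix product over GF(2)."""
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def gf2_inverse(M):
    """Invert a square matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def keygen(rng):
    while True:  # draw random S until it is invertible
        S = [[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
        try:
            S_inv = gf2_inverse(S)
            break
        except ValueError:
            continue
    perm = list(range(7))
    rng.shuffle(perm)
    P = [[int(perm[i] == j) for j in range(7)] for i in range(7)]
    return mat_mul(mat_mul(S, G), P), (S_inv, P)  # public G' = S*G*P, private key

def encrypt(public, m, rng):
    e = [0] * 7
    e[rng.randrange(7)] = 1                      # error vector of weight t = 1
    return [c ^ b for c, b in zip(mat_mul([m], public)[0], e)]

def decrypt(private, x):
    S_inv, P = private
    P_inv = [list(col) for col in zip(*P)]       # inverse of a permutation matrix
    y = mat_mul([x], P_inv)[0]                   # y = m*S*G + e*P^-1
    s = tuple(sum(h & b for h, b in zip(row, y)) % 2 for row in H)
    if any(s):                                   # syndrome equals the error column
        y[list(zip(*H)).index(s)] ^= 1
    return mat_mul([y[:4]], S_inv)[0]            # systematic code: first k bits are m*S

rng = random.Random(2017)
public, private = keygen(rng)
print(decrypt(private, encrypt(public, [1, 0, 1, 1], rng)))
```

Practical schemes replace the Hamming code with much larger codes (for example Goppa or QC-MDPC codes) and a matching decoder $\Psi_H$.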
Code families proposed for McEliece [M78] / Niederreiter [N86]*:
Generalized Reed-Solomon, Goppa, Reed-Muller, Concatenated, LRPC/LDPC/MDPC, Srivastava, Elliptic, Rank-Metric
* This is a selection based on presenter's choice.
– Properties of code determine key size, matrices are often large – Structures in codes reduce key size, but might enable attacks – Encoding is fast on most platforms (matrix multiplication) – Decoding requires efficient techniques in terms of time and memory
[Figure: Encrypt computes y = Mx + e from the message x using Kpub = M (matrix); Decrypt recovers x = Ψ(y, Kpriv)]
– Code-Based Cryptography – Lattice-Based Cryptography – Hash-Based Cryptography
– Impractical but provably secure – Practical but without proof (GGH/NTRU) – Lately: Ideal lattices can potentially combine both
Learning with Errors (LWE) Problem
[Figure: a random 7×4 matrix is multiplied by a 4×1 secret vector (6, 9, 11, 11) and a small 7×1 noise vector is added; the resulting 7×1 vector looks random. Given the matrix and the resulting vector (blue in the slide), find the secret (red in the slide).]
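An LWE instance of the size shown in the figure can be generated in a few lines. This is a minimal sketch, assuming the modulus 13 used in the later ring example and noise drawn uniformly from {-1, 0, 1} (a simplification chosen here; practical schemes sample from a discrete Gaussian). The function names are hypothetical.

```python
import random

def lwe_samples(secret, m, q, noise_bound, rng):
    """Return (A, b, e) with b = A*secret + e mod q for m random rows of A."""
    n = len(secret)
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    e = [rng.randint(-noise_bound, noise_bound) for _ in range(m)]
    b = [(sum(a * s for a, s in zip(row, secret)) + err) % q
         for row, err in zip(A, e)]
    return A, b, e

def residuals(A, b, candidate, q):
    """Centered values of b - A*candidate mod q; small only for a good candidate."""
    return [((bi - sum(a * s for a, s in zip(row, candidate)) + q // 2) % q) - q // 2
            for row, bi in zip(A, b)]

rng = random.Random(0)
secret = [6, 9, 11, 11]
A, b, e = lwe_samples(secret, m=7, q=13, noise_bound=1, rng=rng)
print(b, residuals(A, b, secret, 13))
```

For the true secret the residuals equal the noise vector; for a wrong candidate they are spread over the whole range modulo q.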
– Significant ciphertext expansion for (R-)LWE encryption – Decryption error probability with (R-)LWE encryption
but also from Discrete Gaussian distributions (not a trivial task!)
– (Ideal lattices) Make use of FFT for polynomial multiplication – (Standard lattices) Matrix-vector arithmetic
– Given for encryption/signatures constructions – Unclear for advanced services such as functional encryption (e.g., FHE)
– Code-Based Cryptography – Lattice-Based Cryptography – Hash-Based Cryptography
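Hash-based Cryptography: Lamport-Diffie One-Time Signatures (LD-OTS, 1979)
$U_n = \{0,1\}^n$ and a one-way function $h: U_n \to U_n$
Signature key: $X = (x_{0,0}, x_{0,1}, x_{1,0}, x_{1,1}, \dots, x_{n-1,0}, x_{n-1,1})$
Verification key: $Y = (y_{0,0}, y_{0,1}, \dots, y_{n-1,0}, y_{n-1,1})$ with $\forall i, j:\ y_{i,j} = h(x_{i,j})$
[Figure: each pair of secret values $x_{i,0}, x_{i,1}$ in $X$ is mapped through $h$ to the pair $y_{i,0}, y_{i,1}$ in $Y$]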
$n$-bit message $M = (m_0, \dots, m_{n-1})$ to sign
revealing the corresponding secret values $x_{i,m_i}$.
[Figure: for each message bit $m_i$ the signature $\sigma$ contains the selected secret value $x_{i,m_i}$; applying $h$ to it must give $y_{i,m_i}$ in $Y$]
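Key generation, signing and verification of LD-OTS fit in a short sketch. The version below assumes SHA-256 as the one-way function $h$ with $n = 256$, and it hashes the message first to obtain the $n$ bits to sign (both are choices made here, not taken from the slides). The function names are hypothetical. A key pair must sign only one message, since each signature reveals half of the secret values.

```python
import hashlib
import secrets

N = 256  # number of message bits (assumption: SHA-256 digest length)

def h(data: bytes) -> bytes:
    """One-way function h, instantiated with SHA-256 for illustration."""
    return hashlib.sha256(data).digest()

def message_bits(message: bytes):
    """Hash the message and return its N digest bits m_0 .. m_{N-1}."""
    digest = h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N)]

def keygen():
    """X holds pairs (x_{i,0}, x_{i,1}); Y holds the pairs of their hashes."""
    X = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(N)]
    Y = [(h(x0), h(x1)) for x0, x1 in X]
    return X, Y

def sign(X, message: bytes):
    """Reveal x_{i, m_i} for every message bit m_i."""
    return [X[i][bit] for i, bit in enumerate(message_bits(message))]

def verify(Y, message: bytes, sigma) -> bool:
    """Check h(sigma_i) == y_{i, m_i} for every message bit."""
    bits = message_bits(message)
    return len(sigma) == N and all(
        h(s) == Y[i][bit] for i, (s, bit) in enumerate(zip(sigma, bits)))

X, Y = keygen()
sigma = sign(X, b"CHES 2017")
print(verify(Y, b"CHES 2017", sigma), verify(Y, b"CHES 2018", sigma))
```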
the validity of many OTS verification keys to a single verification key using a binary tree
– Max. signature count determined by height H of tree (fixed at setup)
– Needs to keep track of already used signatures in the tree ⇒ stateful signature scheme
– Can be used with any one-time signature scheme and (collision-resistant) cryptographic hash function
– OTS key pairs $(X_i, Y_i)$ with $0 \le i < 2^H$
– Tree nodes $V_i[j]$ with $0 \le i \le H$ and $0 \le j < 2^{H-i}$
– Leaves: $V_0[j] = g(Y_j)$
– Inner nodes: $V_i[j] = g(V_{i-1}[2j] \,\|\, V_{i-1}[2j+1])$ with $0 < i \le H$ and $0 \le j < 2^{H-i}$
[Figure: Merkle tree for the example $H = 3$. The root $V_3[0]$ is the public MSS key PK; below it are $V_2[0], V_2[1]$ and $V_1[0], \dots, V_1[3]$; the leaves $V_0[0], \dots, V_0[7]$ are computed from the public OTS keys $Y_0, \dots, Y_7$ of the key pairs $(X_0, Y_0), \dots, (X_7, Y_7)$.]
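The tree construction for $H = 3$ can be sketched as follows. This is an illustration under assumptions: SHA-256 stands in for the hash function $g$, the OTS verification keys are placeholder byte strings, and the authentication-path helpers (`auth_path`, `root_from_path`) are hypothetical names added here to show how one leaf is checked against the single public key.

```python
import hashlib

def g(data: bytes) -> bytes:
    """Tree hash function g, instantiated with SHA-256 for illustration."""
    return hashlib.sha256(data).digest()

def build_tree(ots_public_keys):
    """Return levels V_0 .. V_H; the number of keys must be a power of two."""
    level = [g(Y) for Y in ots_public_keys]             # V_0[j] = g(Y_j)
    tree = [level]
    while len(level) > 1:
        level = [g(level[2 * j] + level[2 * j + 1])     # V_i[j] = g(V_{i-1}[2j] || V_{i-1}[2j+1])
                 for j in range(len(level) // 2)]
        tree.append(level)
    return tree

def auth_path(tree, leaf):
    """Sibling nodes needed to recompute the root from leaf index `leaf`."""
    path = []
    for level in tree[:-1]:
        path.append(level[leaf ^ 1])
        leaf //= 2
    return path

def root_from_path(Y, leaf, path):
    """Recompute the root from one OTS verification key and its path."""
    node = g(Y)
    for sibling in path:
        node = g(node + sibling) if leaf % 2 == 0 else g(sibling + node)
        leaf //= 2
    return node

keys = [bytes([i]) * 32 for i in range(8)]              # placeholder Y_0 .. Y_7
tree = build_tree(keys)
public_key = tree[-1][0]                                # PK = V_3[0]
print(root_from_path(keys[5], 5, auth_path(tree, 5)) == public_key)
```

A signature in such a scheme would carry the OTS signature, the OTS verification key, the leaf index and the authentication path, so the verifier needs only the root.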
– Second preimage (older schemes: collision) resistant hash function – Pseudorandom functions for OTS (XMSS)
– Height of the tree determines max. # of signatures (issue with DoS attacks for real-world systems) – Requires track record of signatures already used (critical in untrusted environments!) – Increasing tree height increases memory requirements and computational complexity
– Code-Based Cryptography – Lattice-Based Cryptography – Hash-Based Cryptography
– Code-based encryption schemes are the most mature candidates – Digital signatures from hash-based cryptography with high confidence with respect to security and under standardization – Lattice-based cryptography has high potential and extremely high versatility
– Efficient implementation strategies for Code-Based Cryptosystems – Efficient implementation of Lattice-Based Cryptosystems
including slides by Ingo von Maurich and Thomas Pöppelmann
Key Generation: Given a $[n, k]$-code $C$ with generator matrix $G$ and error-correcting capability $t$
Private Key: $(S, G, P)$, where $S$ is a scrambling and $P$ is a permutation matrix
Public Key: $G' = S \cdot G \cdot P$
Encryption: Message $m \in \mathbb{F}_2^k$, error vector $e \in_R \mathbb{F}_2^n$, $\mathrm{wt}(e) \le t$
$x \leftarrow mG' + e$
Decryption: Let $\Psi_H$ be a $t$-error-correcting decoding algorithm.
$m \cdot S \leftarrow \Psi_H(x \cdot P^{-1})$, removes the error $e \cdot P^{-1}$
Extract $m$ by computing $m \cdot S \cdot S^{-1}$
– Properties of code determine key size, short keys essential – Structures in codes reduce key size, but can enable attacks – Encoding is a fast operation on all platforms (matrix multiplication) – Decoding requires efficient techniques in terms of time and memory
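Code/Key Generation
1. Generate $n_0$ first rows of parity-check matrix blocks $H_i$: $h_i \in_R \mathbb{F}_2^r$ of weight $w_i$, $w = \sum_{i=0}^{n_0-1} w_i$
2. Obtain remaining rows by $r - 1$ quasi-cyclic shifts of $h_i$
3. $H = [H_0 | H_1 | \dots | H_{n_0-1}]$
4. Generator matrix of systematic form $G = [I_k | Q]$ with
$Q = \begin{pmatrix} (H_{n_0-1}^{-1} \cdot H_0)^T \\ (H_{n_0-1}^{-1} \cdot H_1)^T \\ \vdots \\ (H_{n_0-1}^{-1} \cdot H_{n_0-2})^T \end{pmatrix}$
$n_0 = 2$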
Encryption: Message $m \in \mathbb{F}_2^k$, error vector $e \in_R \mathbb{F}_2^n$, $\mathrm{wt}(e) \le t$
$x \leftarrow mG + e$
Decryption: Let $\Psi_H$ be a $t$-error-correcting (QC-)MDPC decoding algorithm.
$mG \leftarrow \Psi_H(mG + e)$
Extract $m$ from the first $k$ positions.
Parameters for 80-bit equivalent symmetric security [MTSB13]: $n_0 = 2$, $n = 9602$, $r = 4801$, $w = 90$, $t = 84$
– Encryption/Encoding:
(with large matrices, either to be stored or to be generated on-the-fly);
– Decryption/Decoding:
hard-decision decoding with simple (bitwise) operations preferred
G
codeword ciphertext message
Decoders for LDPC/MDPC codes: bit flipping and belief propagation
“Bit-Flipping” Decoder
1. Compute syndrome $s$ of the ciphertext
2. Count unsatisfied parity-check equations #upc for each ciphertext bit
3. Flip ciphertext bits that violate $\ge b$ equations
4. Recompute syndrome
5. Repeat until $s = 0$ or reaching max. iterations (decoding failure)
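The five decoder steps can be sketched directly. The example below assumes a small 3×3 product parity-check code (chosen here because single errors are easy to follow by hand; it is not a QC-MDPC code) and a fixed threshold $b = 2$. The function names are hypothetical.

```python
def product_code_checks(side=3):
    """Assumption: toy code on a side x side bit grid with one parity check per row and column."""
    size = side * side
    rows = [[int(i // side == r) for i in range(size)] for r in range(side)]
    cols = [[int(i % side == c) for i in range(size)] for c in range(side)]
    return rows + cols

def syndrome(H, word):
    """Step 1 / 4: s = H * word^T over GF(2)."""
    return [sum(h & w for h, w in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, word, threshold, max_iterations=10):
    word = list(word)
    for _ in range(max_iterations):
        s = syndrome(H, word)
        if not any(s):                       # step 5: s = 0, decoding succeeded
            return word
        # Step 2: unsatisfied parity-check equations (#upc) per bit.
        upc = [sum(s[r] for r in range(len(H)) if H[r][i]) for i in range(len(word))]
        # Step 3: flip all bits that violate at least `threshold` equations.
        for i, count in enumerate(upc):
            if count >= threshold:
                word[i] ^= 1
    return word if not any(syndrome(H, word)) else None  # None: decoding failure

H = product_code_checks()
codeword = [1, 1, 0, 1, 1, 0, 0, 0, 0]
received = codeword[:]
received[4] ^= 1                             # single bit error
print(bit_flip_decode(H, received, threshold=2) == codeword)
```

With two errors in different rows and columns the same decoder oscillates and reports a failure, which mirrors the decoding failures that QC-MDPC decoders have to handle.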
Target: Xilinx Spartan-6 FPGA Scheme: QC-MDPC Encryption
compute 𝑦 = 𝑛𝐻 + 𝑓
row and the redundant part (3x4801-bit vectors)
in a separate register
the second half of the error vector
Control + XOR
m G redundant part
m BRAM
32 flip flops
QC-MDPC Decryption
syndrome while rotating both
– We can get signatures and public key encryption from lattices and also more advanced services (IBE, FHE) – A lot of development on theory side; schemes are improving – Implementation of lattice-based cryptography is a young field;
(e.g., 532x840)
Ideal Lattices
512 coefficients
$q < 2^{32}$
with 256 or 512 coefficients
Random Lattices
Two important lines of research: random lattices and ideal lattices
(ideal lattices are more structured)
[Figure: LWE instance: a random 7×4 matrix times a 4×1 secret (6, 9, 11, 11) plus a small 7×1 noise vector gives a 7×1 vector that looks random]
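[Figure: structured 7×4 matrix modulo 13. Each row is the previous row shifted by one position, with a sign change in case of wrap around (e.g., 10 ⇒ −10 ≡ 3 mod 13):
 4  1 11 10
 3  4  1 11
 2  3  4  1
12  2  3  4
 9 12  2  3
10  9 12  2
11 10  9 12
Only one line (the first row 4 1 11 10) has to be stored]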
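The structured matrix can be regenerated from its first row, which is why only that row needs to be stored. The sketch below uses the row (4, 1, 11, 10) and modulus 13 from the slide; the function names are hypothetical. It also checks that combining the first four rows with a coefficient vector equals polynomial multiplication modulo $x^4 + 1$.

```python
def negacyclic_rows(first_row, count, q):
    """Shift the previous row right; the wrapped entry is negated modulo q."""
    rows = [list(first_row)]
    for _ in range(count - 1):
        prev = rows[-1]
        rows.append([(-prev[-1]) % q] + prev[:-1])
    return rows

def negacyclic_mul(a, b, q):
    """Schoolbook product of a and b in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sign = -1 if i + j >= n else 1   # x^n = -1 in this ring
            c[(i + j) % n] = (c[(i + j) % n] + sign * a[i] * b[j]) % q
    return c

a, s, q = [4, 1, 11, 10], [6, 9, 11, 11], 13
rows = negacyclic_rows(a, 7, q)
for row in rows:
    print(row)

# Row i holds the coefficients of a(x) * x^i, so sum_i s_i * row_i equals a * s.
combined = [sum(s[i] * rows[i][k] for i in range(4)) % q for k in range(4)]
print(combined == negacyclic_mul(a, s, q))
```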
[Figure: ring-LWE sample: a random polynomial times a small secret (Gaussian) plus a small error (Gaussian) gives a result that looks random]
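the ring $R = \mathbb{Z}_q[x]/(x^n+1)$
sample is: $\mathbf{t} = \mathbf{a}\mathbf{s} + \mathbf{e} \in R$ for uniform $\mathbf{a} \in R$ and small discrete Gaussian distributed $\mathbf{s}, \mathbf{e} \leftarrow D_\sigma$
– Search-RLWE: Find $\mathbf{s}$ when given $\mathbf{t}$ and $\mathbf{a}$
– Decision-RLWE: Distinguish $\mathbf{t}$ from uniform when given $\mathbf{t}$ and $\mathbf{a}$
[Figure: elements of $\mathbb{Z}_q[x]/(x^n+1)$ written as coefficient vectors, e.g. (4, 2, 0, 1) and (2, 1, 4, 0). A uniform element of $R = \mathbb{Z}_{4093}[x]/(x^{256}+1)$ has large coefficients (1020, 502, …, 572); an element sampled with probability proportional to $\exp(-x^2/(2\sigma^2))$ has small coefficients (4, …, 1).]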
Remark on arithmetic of x-distributed values:
Uniform * Gaussian = Uniform
Gaussian * Gaussian = larger Gaussian
Rejection Sampling, Bernoulli Sampling, Knuth-Yao Sampling, Cumulative Distribution Table (CDT) Sampling
[DG14] Efficient sampling from discrete Gaussians for lattice-based cryptography on a constrained device, Dwarakanath and Galbraith, Applicable Algebra in Engineering, Communication and Computing, 2014
[DDLL14] Lattice Signatures and Bimodal Gaussians, Léo Ducas, Alain Durmus, Tancrède Lepoint and Vadim Lyubashevsky, CRYPTO '13
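Of the listed samplers, the CDT approach is the simplest to sketch: precompute the cumulative probabilities once, then draw a uniform number and look it up. The version below is an illustration under assumptions. It uses floating-point probabilities, a tail cut at $10\sigma$ and Python's non-cryptographic `random` module, and it is not constant-time, so it is unsuitable for a real implementation. The function names are hypothetical.

```python
import bisect
import math
import random

def build_cdt(sigma, tail=10):
    """Cumulative table for a discrete Gaussian on [-tail*sigma, tail*sigma]."""
    bound = int(math.ceil(tail * sigma))
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-x * x / (2 * sigma * sigma)) for x in support]
    total = sum(weights)
    cdt, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdt.append(acc)
    cdt[-1] = 1.0                 # guard against floating-point rounding
    return support, cdt

def sample(support, cdt, rng):
    """Draw u in [0, 1) and return the first value whose cumulative weight exceeds u."""
    return support[bisect.bisect_right(cdt, rng.random())]

rng = random.Random(42)
support, cdt = build_cdt(sigma=4.5)
print([sample(support, cdt, rng) for _ in range(16)])
```

The table grows with $\sigma$ and the tail cut, which is the memory versus speed trade-off between CDT sampling and the other listed methods on constrained devices.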
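Gen: Choose $\mathbf{a} \leftarrow R$ and $\mathbf{r}_1, \mathbf{r}_2 \leftarrow D_\sigma$; pk: $\mathbf{p} = \mathbf{r}_1 - \mathbf{a} \cdot \mathbf{r}_2 \in R$; sk: $\mathbf{r}_2$
Enc($\mathbf{a}, \mathbf{p}, m \in \{0,1\}^n$): $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3 \leftarrow D_\sigma$. $\mathbf{m} = \mathrm{encode}(m)$. Ciphertext: $[\mathbf{c}_1 = \mathbf{a} \cdot \mathbf{e}_1 + \mathbf{e}_2,\ \mathbf{c}_2 = \mathbf{p} \cdot \mathbf{e}_1 + \mathbf{e}_3 + \mathbf{m}]$
Dec($c = [\mathbf{c}_1, \mathbf{c}_2], \mathbf{r}_2$): Output $\mathrm{decode}(\mathbf{c}_1 \cdot \mathbf{r}_2 + \mathbf{c}_2)$
Correctness: $\mathbf{c}_1\mathbf{r}_2 + \mathbf{c}_2 = (\mathbf{a}\mathbf{e}_1 + \mathbf{e}_2)\mathbf{r}_2 + \mathbf{p}\mathbf{e}_1 + \mathbf{e}_3 + \mathbf{m} = \mathbf{r}_2\mathbf{a}\mathbf{e}_1 + \mathbf{r}_2\mathbf{e}_2 + \mathbf{r}_1\mathbf{e}_1 - \mathbf{r}_2\mathbf{a}\mathbf{e}_1 + \mathbf{e}_3 + \mathbf{m} = \mathbf{m} + \mathbf{r}_2\mathbf{e}_2 + \mathbf{r}_1\mathbf{e}_1 + \mathbf{e}_3$
(the encoded message $\mathbf{m}$ is large; the noise term $\mathbf{r}_2\mathbf{e}_2 + \mathbf{r}_1\mathbf{e}_1 + \mathbf{e}_3$ is small)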
Encode (per message bit $m$):
– Return $m \cdot q/2$
Decode (per coefficient $x$):
– If ($q/4 < x < 3q/4$) return 1
– Else return 0
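Gen, Enc and Dec together with the threshold encoding can be run end to end. The sketch below uses $n = 256$ and $q = 7681$ from the slides' parameter sets, but it draws the small polynomials from {-1, 0, 1} instead of the discrete Gaussian $D_\sigma$. That substitution is made here so that the noise term is always below $q/4$ and decryption never fails. The function names are hypothetical and the schoolbook multiplication is slow by design.

```python
import random

N, Q = 256, 7681

def poly_mul(a, b):
    """Schoolbook multiplication in Z_Q[x]/(x^N + 1)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                c[i + j] += a[i] * b[j]
            else:
                c[i + j - N] -= a[i] * b[j]   # x^N = -1
    return [x % Q for x in c]

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def small_poly(rng):
    """ASSUMPTION: ternary noise stands in for the discrete Gaussian D_sigma."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]

def keygen(rng):
    a = [rng.randrange(Q) for _ in range(N)]
    r1, r2 = small_poly(rng), small_poly(rng)
    p = [(x - y) % Q for x, y in zip(r1, poly_mul(a, r2))]   # p = r1 - a*r2
    return (a, p), r2

def encode(bits):
    return [bit * (Q // 2) for bit in bits]

def decode(poly):
    return [1 if Q // 4 < x < 3 * Q // 4 else 0 for x in poly]

def encrypt(pk, bits, rng):
    a, p = pk
    e1, e2, e3 = small_poly(rng), small_poly(rng), small_poly(rng)
    c1 = poly_add(poly_mul(a, e1), e2)
    c2 = poly_add(poly_add(poly_mul(p, e1), e3), encode(bits))
    return c1, c2

def decrypt(sk, ciphertext):
    c1, c2 = ciphertext
    return decode(poly_add(poly_mul(c1, sk), c2))

rng = random.Random(2017)
pk, sk = keygen(rng)
message = [rng.randint(0, 1) for _ in range(N)]
print(decrypt(sk, encrypt(pk, message, rng)) == message)
```

With Gaussian noise the term $\mathbf{r}_2\mathbf{e}_2 + \mathbf{r}_1\mathbf{e}_1 + \mathbf{e}_3$ can occasionally exceed the decoding threshold, which is the decryption error probability mentioned for (R-)LWE encryption.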
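[Figure: example with $R = \mathbb{Z}_{4093}[x]/(x^{256}+1)$: encode maps the $n$-bit message (1 … 1) to coefficients (2046 … 2046); decode maps the noisy coefficients of $\mathbf{m} + \mathbf{r}_2\mathbf{e}_2 + \mathbf{r}_1\mathbf{e}_1 + \mathbf{e}_3$ (e.g. 402, 1907, …, 2631, 4024) back to message bits]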
– Message space: $n$ bits – Expansion $2 \cdot \log_2 q$ – Two large polynomials ($\mathbf{c}_1, \mathbf{c}_2$)
Parameter sets (sizes in bits):
Parameter set                   n     q       σ      |c1, c2|   |sk|    |pk|     security
(256, 4093, 8.35) [LP11]        256   4093    ~4.5   6,144      1,792   6,144    ~106 bits
(256, 7681, 11.32) [GFSBH12]    256   7681    ~4.8   6,656      1,792   6,656    ~106 bits
(512, 12289, 12.18) [GFSBH12]   512   12289   ~4.9   14,336     3,584   14,336   ~256 bits
– Simple address generation – Sample coefficient of $\mathbf{e}_1$, add row of $\mathbf{c}_1$ then add row of $\mathbf{c}_2$, add coefficient of $\mathbf{e}_2$ and $\mathbf{e}_3$
Multiplication (DSP) Modular reduction (power
Post-place-and-route performance on a Spartan-6 LX9 FPGA.
Area savings by power of two modulus
parameters
multiplications modulo q for one polynomial multiplication
– $128^2 = 16384$ – $256^2 = 65536$ – $512^2 = 262144$ – $1024^2 = 1048576$
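(DFT) defined over a finite field or ring. For a given primitive $n$-th root of unity $\omega$:
– Forward transformation: NTT
$A[i] = \sum_{j=0}^{n-1} a[j]\,\omega^{ij} \bmod q$, $i = 0, 1, \dots, n-1$
– Inverse transformation: INTT
$a[i] = n^{-1} \sum_{j=0}^{n-1} A[j]\,\omega^{-ij} \bmod q$, $i = 0, 1, \dots, n-1$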
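vectors/polynomials with the help of the NTT
– $\mathbf{c} = \mathrm{INTT}(\mathrm{NTT}(\mathbf{a}) \circ \mathrm{NTT}(\mathbf{b}))$
– Efficient algorithms are known for conversion in both directions
– Polynomial multiplication in $\mathbb{Z}_q[x]/(x^n + 1)$
– Runtime $O(n \log n)$
– No appending of zeros required (as for regular convolution)
– Implicit polynomial reduction by $x^n + 1$
[Figure: $\mathbf{a}$ and $\mathbf{b}$ each pass through an NTT, are multiplied coefficient-wise, and an INTT yields $\mathbf{c}$]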
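The convolution property can be checked with a direct implementation of the NTT definition. The sketch below is for illustration only. It evaluates the transform sums directly, so it runs in $O(n^2)$ rather than the $O(n \log n)$ of a butterfly-based NTT, and it obtains the implicit reduction by $x^n + 1$ by pre-scaling with powers of a $2n$-th root of unity $\psi$ (with $\omega = \psi^2$), which requires $q \equiv 1 \bmod 2n$ and $n$ a power of two. The function names are hypothetical; `pow(x, -1, q)` needs Python 3.8 or newer.

```python
def find_psi(n, q):
    """Find a primitive 2n-th root of unity mod prime q (n a power of two)."""
    assert (q - 1) % (2 * n) == 0, "need q = 1 mod 2n"
    for g in range(2, q):
        psi = pow(g, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:          # psi^n = -1, so psi has order 2n
            return psi
    raise ValueError("no root of unity found")

def ntt(a, omega, q):
    """A[i] = sum_j a[j] * omega^(i*j) mod q, evaluated directly."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q for i in range(n)]

def intt(A, omega, q):
    """a[i] = n^-1 * sum_j A[j] * omega^(-i*j) mod q."""
    n = len(A)
    n_inv, omega_inv = pow(n, -1, q), pow(omega, -1, q)
    return [n_inv * sum(A[j] * pow(omega_inv, i * j, q) for j in range(n)) % q
            for i in range(n)]

def ntt_multiply(a, b, q):
    """Product in Z_q[x]/(x^n + 1) via c = INTT(NTT(a) o NTT(b))."""
    n = len(a)
    psi = find_psi(n, q)
    omega, psi_inv = psi * psi % q, pow(psi, -1, q)
    a_scaled = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_scaled = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    C = [x * y % q for x, y in zip(ntt(a_scaled, omega, q), ntt(b_scaled, omega, q))]
    return [x * pow(psi_inv, i, q) % q for i, x in enumerate(intt(C, omega, q))]

def schoolbook_multiply(a, b, q):
    """Reference product with n^2 coefficient multiplications."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sign = -1 if i + j >= n else 1
            c[(i + j) % n] = (c[(i + j) % n] + sign * a[i] * b[j]) % q
    return c

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
print(ntt_multiply(a, b, 17), schoolbook_multiply(a, b, 17))
```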
modulo $q$ ($\frac{n}{2} \log_2(n)$ times)
Multiplication by $\omega^0 = 1$
twiddle factors
NTT is very fast but still quite small
Lots of improvement since [GFS+12]
http://www.seceng.rub.de/research/projects/pqc/
including slides by Ingo von Maurich and Thomas Pöppelmann
ARM-based 32-bit Microcontroller
AVR-based 8-bit Microcontroller
−1and 𝐼0
Scheme              Platform     Cycles/Op     Time
McE MDPC (keygen)   STM32F407    148,576,008   884 ms
McE MDPC (enc)      STM32F407    16,771,239    100 ms
McE MDPC (dec)      STM32F407    37,171,833    221 ms
McE MDPC (enc)      ATxmega256   26,767,463    836 ms
McE MDPC (dec)      ATxmega256   86,874,388    2.71 s
– Additional conversion (e.g., via Fujisaki-Okamoto, includes the necessity for hash-function and re-encryption)
– Masking schemes (SCA) for McEliece by Eisenbarth et al. [SAC15], does not include CCA2 security
– Guo et al [ASIACRYPT16] identifies correlation between decoding failures in iterative decoders (bit flipping decoding)
void encrypt(poly a, poly p, unsigned char * plaintext, poly c1, poly c2) {
    int i, j;
    poly e1, e2, e3;

    gauss_poly(e1);
    gauss_poly(e2);
    gauss_poly(e3);

    poly_init(c1, 0, n); // init with 0
    poly_init(c2, 0, n); // init with 0

    for (i = 0; i < n; i++) { // multiplication loops
        for (j = 0; j < n; j++) {
            c1[(i + j) % n] = modq(c1[(i + j) % n] + (a[i] * e1[j] * (i + j >= n ? -1 : 1)));
            c2[(i + j) % n] = modq(c2[(i + j) % n] + (p[i] * e1[j] * (i + j >= n ? -1 : 1)));
        }
        c1[i] = modq(c1[i] + e2[i]);
        c2[i] = (plaintext[i >> 3] & (1 << (i % 8)))
                    ? modq(c2[i] + e3[i] + q / 2)
                    : modq(c2[i] + e3[i]);
    }
}
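This has to be fast
$A[i] = \sum_{j=0}^{n-1} a[j]\,\omega^{ij}$, $i = 0, 1, \dots, n-1$
$a[i] = n^{-1} \sum_{j=0}^{n-1} A[j]\,\omega^{-ij}$, $i = 0, 1, \dots, n-1$
$q \equiv 1 \bmod 2n$
reduction modulo $q$ ($\frac{n}{2} \log_2(n)$ times)
Multiplication by $\omega^0 = 1$
twiddle factors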
– “Standard” $\mathrm{NTT}_{bo \to no}$ requires bitreversed input and produces naturally ordered output
– Bitreversal before each forward or inverse NTT
– Natural to bitreversed for forward: $\mathrm{NTT}_{no \to bo}$
– Bitreversed to natural for inverse: $\mathrm{INTT}_{bo \to no}$
– No bitreversal necessary anymore:
Removal of expensive “helper” functions
transformation is expensive
pretransformed constants (e.g., $\mathbf{a}$, $\mathbf{p}$, or $\mathbf{r}_2$)
– Put $n^{-1}$ into these constants
– Multiplication by scalar does not change much as
– Store $\mathbf{a}' = n^{-1}\mathbf{a}$
– Only possible with forward transformation and current butterfly (see next picture)
transformation is finished
(CT) allows merging of inverse multiplication by powers of $\psi^{-1}$
– CT: $a + \omega b$ and $a - \omega b$
– GS: $a + b$ and $(a - b)\omega$
– No multiplication by one in first stage anymore – Can be mitigated by using lookup tables if coefficients for e are small
Textbook
(*) FFT people probably know most of these tricks
Optimized (*)
– a $\log_2 q \times \log_2 q$ multiplication
– a mod $q$ modulo reduction
– two additions or subtractions modulo $q$
– General methods like Montgomery or Barrett reduction
– Reductions that depend on special primes like Solinas primes
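As an example of a general reduction method, Barrett reduction replaces the division by $q$ with one multiplication by a precomputed constant and a shift. The sketch below assumes $q = 7681$ and inputs below $q^2$ (the product of two reduced coefficients). The constant names `K` and `MU` and the function name are hypothetical.

```python
Q = 7681
K = 2 * Q.bit_length()        # 2^K >= Q^2, so every product of two coefficients fits
MU = (1 << K) // Q            # precomputed floor(2^K / Q)

def barrett_reduce(x):
    """Return x mod Q for 0 <= x < Q*Q without a division by Q."""
    t = (x * MU) >> K         # estimate of floor(x / Q), too small by at most 1
    r = x - t * Q             # hence 0 <= r < 2*Q
    if r >= Q:                # one conditional subtraction finishes the reduction
        r -= Q
    return r

product = 7000 * 7500
print(barrett_reduce(product), product % Q)
```

On hardware or microcontrollers the multiplication by `MU` and the shift map to a multiplier and wiring, while special primes allow reductions built only from shifts and additions.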
performance impact of larger parameter set
decryption
practice (only CPA and decryption errors)
Schoolbook was 12 million
[POG15] High-Performance Ideal Lattice-Based Cryptography on 8-bit ATxmega Microcontrollers, Thomas Pöppelmann, Tobias Oder, and Tim Güneysu, Latincrypt’15
Code size is not significantly increased Sampler is the bottleneck
Table from [CRV+15]: Ruan de Clercq, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede: Efficient software implementation of ring-LWE encryption. DATE 2015: 339-344
– Additional conversion (e.g., via Fujisaki-Okamoto, includes the necessity for hash-function and re-encryption)
– Masking schemes (SCA) by Reparaz et al [CHES15, PQCRYPTO16], does not include CCA2 security
– Loop-Abort attacks by Espitau et al. [ePrint 16] – Fault Sensitivity by Bindel et al. [FDTC16]