SLIDE 1

Improved Blind Side-Channel Analysis by Exploitation of Joint Distributions of Leakages

Christophe Clavier, Léo Reynaud

Université de Limoges - XLIM

SLIDE 2

Introduction Joint distributions and maximum of likelihood Extension to masked implementations Conclusion

Side Channel attacks

Common non-profiled side-channel attacks

DPA
CPA
MIA

Figure 1: Internal state variables (known input m, guessed key k, predicted x = m ⊕ k and y = S(x))

Needs

Leakage on some internal state
Knowledge and variability of plaintext/ciphertext

How can we attack when the plaintext/ciphertext is unknown or does not vary?

SLIDE 5

EMV: no exploitable plaintext/ciphertext

Figure 2: EMV session key derivation (the block ATC high, ATC low, 0x00 ... 0x00 is encrypted with AES under the master key to give the session key; the cryptogram is then computed from IV and M1, M2, ..., Mn by chaining AES and ⊕)

SLIDE 6

What is available?

We only have access to power consumption traces

Figure 3: Leakages (m unknown; x and y unpredictable)

⇒ Exploit the joint distributions of the leakages

SLIDE 9

Joint distributions: theoretical distributions

For each key k, we can count the pairs (HW(m), HW(y)) obtained when exhausting all inputs m

Figure 4: Theoretical distribution for k = 39

Figure 5: Theoretical distribution for k = 126

⇒ All theoretical distributions are different
⇒ Discrimination of the key
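The counting described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: S is taken to be the AES S-box (the EMV example uses AES, but the slides do not spell out the table), x = m ⊕ k and y = S(x), and the helper names `gf_mul`, `aes_sbox` and `theoretical_distribution` are hypothetical.

```python
from collections import Counter

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def rotl8(v, n):
    """Rotate a byte left by n bits."""
    return ((v << n) | (v >> (8 - n))) & 0xFF

def aes_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    table = []
    for x in range(256):
        inv = 0 if x == 0 else next(c for c in range(1, 256) if gf_mul(x, c) == 1)
        s = inv
        for n in range(1, 5):
            s ^= rotl8(inv, n)
        table.append(s ^ 0x63)
    return table

SBOX = aes_sbox()  # assumption: S is the AES S-box

def hamming_weight(v):
    return bin(v).count("1")

def theoretical_distribution(k):
    """Count the pairs (HW(m), HW(y)) with y = S(m XOR k) over all bytes m."""
    return Counter((hamming_weight(m), hamming_weight(SBOX[m ^ k])) for m in range(256))

distributions = {k: theoretical_distribution(k) for k in range(256)}

# Number of distinct tables among the 256 keys; the slide states they all differ.
distinct = len({frozenset(d.items()) for d in distributions.values()})
print("distinct distributions:", distinct)
```

Each table has at most 9 × 9 non-zero cells and sums to 256, since every input m contributes exactly one pair.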

SLIDE 12

Joint distributions in a nutshell

Cons

Need to locate the points of interest (PoI)
Relies on Hamming weights (HW) rather than on the raw consumption

Pros

Works without plaintext/ciphertext
Any round can be attacked

SLIDE 13

Joint distributions: steps

Step 1: Locate the PoI where the considered variables leak
Step 2: Infer HW from the leakages observed at the PoI
Step 3: Build the joint distribution for each key
Step 4: Select the key whose distribution best fits the observations

SLIDE 14

Contribution

Prior work

Linge
Slice method → HW (Step 2)
Distances between histograms → key (Step 4)

Le Bouder
Maximum of likelihood → key (Step 4)

Our contribution

Variance method → HW (Step 2)
Improvement on the maximum of likelihood (Step 4)
Extension to masked implementations (Step 3)

SLIDE 15

Slice method to infer HW (Linge)

Consumption model

$\ell = \alpha\,\mathrm{HW}(v) + \beta + \omega$

Slice method

Sort the N leakages in ascending order
Assign HW = 0 to the $N \cdot \binom{8}{0} / 2^8$ lowest
Assign HW = 1 to the $N \cdot \binom{8}{1} / 2^8$ next
...
⇒ Integer-valued HW

Figure 6: Slice method to infer HW (the sorted leakages are cut into consecutive slices of $N \cdot \binom{8}{0} / 2^8$, $N \cdot \binom{8}{1} / 2^8$, $N \cdot \binom{8}{2} / 2^8$, ... values, labelled HW = 0, HW = 1, HW = 2, ...)
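A minimal sketch of the slice method as described on this slide, assuming 8-bit variables and a positive α (so that a larger leakage means a larger HW). The function name `slice_method` and the rounding of the slice boundaries are illustrative choices, not taken from the talk.

```python
from math import comb

def hamming_weight(v):
    return bin(v).count("1")

def slice_method(leakages, n_bits=8):
    """Assign an integer HW to each leakage from its rank among all leakages."""
    n = len(leakages)
    # Cumulative slice boundaries: the first N*C(n_bits,0)/2^n_bits ranks get HW 0,
    # the next N*C(n_bits,1)/2^n_bits ranks get HW 1, and so on.
    bounds, cumulative = [], 0
    for w in range(n_bits + 1):
        cumulative += comb(n_bits, w)
        bounds.append(round(n * cumulative / 2 ** n_bits))
    # Indices of the leakages sorted in ascending order (assumes alpha > 0).
    order = sorted(range(n), key=lambda i: leakages[i])
    inferred = [0] * n
    w = 0
    for rank, index in enumerate(order):
        while rank >= bounds[w]:
            w += 1
        inferred[index] = w
    return inferred

# Noise-free sanity check on all 256 byte values (alpha and beta are arbitrary).
alpha, beta = 2.0, 5.0
values = list(range(256))
leakages = [alpha * hamming_weight(v) + beta for v in values]
inferred = slice_method(leakages)
print(inferred == [hamming_weight(v) for v in values])
```

With noisy leakages the slices still have the right sizes, but individual values near a slice boundary can be assigned a neighbouring HW.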

SLIDE 16

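Variance method to infer HW

Goal: infer α and β in order to invert the leakage function

$$\mathrm{Var}(\ell) = \mathrm{Var}(\alpha\,\mathrm{HW}(v) + \beta) + \mathrm{Var}(\omega) = \alpha^2\,\mathrm{Var}(\mathrm{HW}(v)) + \mathrm{Var}(\omega) = 2\alpha^2 + \mathrm{Var}(\omega)$$

(the last step uses Var(HW(v)) = 2 for a uniformly distributed byte v)

$$\alpha = \pm\sqrt{\frac{\mathrm{Var}(\ell) - \mathrm{Var}(\omega)}{2}}$$

$$\beta = \mathrm{E}(\ell) - \alpha\,\mathrm{E}(\mathrm{HW}(v))$$

Once α and β are known:

$$\mathrm{HW}(v) = \frac{\ell - \beta}{\alpha}$$

⇒ Real-valued HW

Figure 7: Standard deviation trace (Var(ℓ) and Var(ω) marked around the PoI)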
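The variance method can be tried on simulated leakages. The sketch below makes several assumptions that are not in the slides: v is a uniformly distributed byte (so E(HW(v)) = 4 and Var(HW(v)) = 2), the noise variance Var(ω) is already known, and the positive root is taken for α. The function name `estimate_alpha_beta` is hypothetical.

```python
import random
import statistics

def hamming_weight(v):
    return bin(v).count("1")

def estimate_alpha_beta(leakages, noise_variance):
    """Estimate (alpha, beta) of l = alpha*HW(v) + beta + noise for uniform bytes v."""
    var_l = statistics.pvariance(leakages)
    # Var(l) = 2*alpha^2 + Var(noise); the positive root is an assumption.
    alpha = (max(var_l - noise_variance, 0.0) / 2) ** 0.5
    # beta = E(l) - alpha*E(HW(v)), with E(HW(v)) = 4 for a uniform byte.
    beta = statistics.fmean(leakages) - 4 * alpha
    return alpha, beta

# Simulated leakages with arbitrary illustrative parameters.
rng = random.Random(1)
true_alpha, true_beta, sigma = 1.5, 10.0, 0.7
leakages = [
    true_alpha * hamming_weight(rng.randrange(256)) + true_beta + rng.gauss(0, sigma)
    for _ in range(20000)
]

alpha, beta = estimate_alpha_beta(leakages, sigma ** 2)
# Invert the leakage function: real-valued HW estimates.
real_valued_hw = [(l - beta) / alpha for l in leakages]
print(round(alpha, 2), round(beta, 2))
```

Unlike the slice method, the inferred values are not rounded, so the noise on each observation is kept and can be modelled by the distinguisher.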

SLIDE 22

Distinguisher: maximum of likelihood

Observations

$h_m = h_m^* + \omega_m$
$h_y = h_y^* + \omega_y$

$h_m^*, h_y^*$: correct HW (integer)
$\omega_m, \omega_y$: noise

Bayes

$$\Pr(k \mid (h_m, h_y)) = \frac{\Pr((h_m, h_y) \mid k) \cdot \Pr(k)}{\Pr((h_m, h_y))} \propto \Pr((h_m, h_y) \mid k) \cdot \Pr(k)$$

Law of total probability

$$\Pr((h_m, h_y) \mid k) = \sum_{h_m^*, h_y^*} \Pr((h_m, h_y) \mid (h_m^*, h_y^*)) \cdot \Pr((h_m^*, h_y^*) \mid k)$$

Noise probability

$$\Pr((h_m, h_y) \mid (h_m^*, h_y^*)) = \Pr(\omega_m = h_m - h_m^*) \cdot \Pr(\omega_y = h_y - h_y^*)$$

SLIDE 23

Improvements on basic attack

Key likelihood can be computed for other/more observed HW

Better results with new/more variables:

m-x-y
m-x-y-3·y
m-3·y
...

Every combination of m and other variables (not involving other keys) may bring information

Attack on the HW of the key

Attack on m and x:

Only gives information about the HW of k
Very efficient
HW of all extended key bytes sufficient to retrieve the key

Illustrated distributions: HW(k) = 0 and HW(k) = 2
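As an illustration, the maximum-likelihood distinguisher can be applied to the m-x attack, which needs no S-box table. The sketch assumes x = m ⊕ k, independent Gaussian noise of known standard deviation σ on both inferred HW values, and a likelihood that ignores the prior Pr(k). All names and parameters are hypothetical; the same likelihood formula applies to (h_m, h_y) with the m-y tables.

```python
import math
import random
from collections import Counter

def hamming_weight(v):
    return bin(v).count("1")

def mx_distribution(k):
    """Pr((HW(m), HW(x)) | k) for x = m XOR k, exhausting all bytes m."""
    counts = Counter((hamming_weight(m), hamming_weight(m ^ k)) for m in range(256))
    return {pair: c / 256 for pair, c in counts.items()}

def gaussian(delta, sigma):
    """Density of a centred Gaussian noise taking the value delta."""
    return math.exp(-delta * delta / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood(observations, distribution, sigma):
    """Sum over observations of log Pr((h_m, h_x) | k), via the law of total probability."""
    total = 0.0
    for h_m, h_x in observations:
        pm = [gaussian(h_m - a, sigma) for a in range(9)]
        px = [gaussian(h_x - b, sigma) for b in range(9)]
        total += math.log(sum(pm[a] * px[b] * p for (a, b), p in distribution.items()))
    return total

# Simulated noisy real-valued HW observations for an arbitrary key byte.
rng = random.Random(0)
sigma, true_key = 0.5, 0x2B
observations = []
for _ in range(1000):
    m = rng.randrange(256)
    observations.append((hamming_weight(m) + rng.gauss(0, sigma),
                         hamming_weight(m ^ true_key) + rng.gauss(0, sigma)))

# One representative key per HW class, since m-x only reveals HW(k).
candidates = {w: mx_distribution((1 << w) - 1) for w in range(9)}
scores = {w: log_likelihood(observations, d, sigma) for w, d in candidates.items()}
best_hw = max(scores, key=scores.get)
print("most likely HW(k):", best_hw)
```

Working with log-likelihoods keeps the product over many observations numerically stable.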

SLIDE 24

Simulation results

Figure 8: Improvements using m-x-y (correct key rank versus number of observations, for m-y and m-x-y with σ = 0.7, 1.0 and 1.5)

Figure 9: m-x attack (rank of the HW of the correct key versus number of observations, for m-x with σ = 0.7, 1.0 and 1.5)

SLIDE 25

Experimental results

Table 1: Rank of the correct key byte for an m-y attack with unknown plaintexts on an unprotected implementation (1000 traces)

Byte 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
In. Prod. 0 167 29 45 187 192 45 77 108 36 124 5 104 64 147
Eucl. Dist. 1 80 210 106 3 62 186 17 38 68 194 48 27 120 21 116
ML (slice) 1 1 29 1 46 1 1 32 36 19 26 67 66 28
ML (var.) 2 6 1 1 17 1 1 19 5 15 4 40 19 13

SLIDE 27

Vulnerable schemes

A masking scheme is vulnerable if m and all other variables included in the attack (x, y, ...) are masked with the same mask

Attack on the key (case m-y)

Theoretical distributions must now exhaust all masks for each pair (m, y)
Harder to distinguish
But still feasible!

Illustrated distributions: k = 39 and k = 167

Attack on the HW (case m-x)

The set of all (m, x) is the same as the set of all (m ⊕ u, x ⊕ u)
⇒ The joint distributions of (HW(m), HW(x)) and (HW(m ⊕ u), HW(x ⊕ u)) are the same
⇒ The attack is not impacted by masking!
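The invariance claimed for the m-x case can be checked exhaustively. A small sketch, assuming x = m ⊕ k and 8-bit masks u; the function names are hypothetical.

```python
from collections import Counter

def hamming_weight(v):
    return bin(v).count("1")

def mx_distribution(k, mask=0):
    """Joint counts of (HW(m XOR mask), HW(x XOR mask)) with x = m XOR k, over all bytes m."""
    return Counter(
        (hamming_weight(m ^ mask), hamming_weight(m ^ k ^ mask)) for m in range(256)
    )

def invariant_under_masking(k):
    """True if every mask u gives the same joint distribution as the unmasked case."""
    reference = mx_distribution(k)
    return all(mx_distribution(k, u) == reference for u in range(256))

print(invariant_under_masking(39), invariant_under_masking(167))  # True True
# Two keys with the same Hamming weight give the same m-x distribution.
print(mx_distribution(0x0F) == mx_distribution(0xF0))  # True
```

The first check holds because m ↦ m ⊕ u is a bijection on bytes and x ⊕ u = (m ⊕ u) ⊕ k. The second holds because permuting bit positions preserves Hamming weights, which is why the m-x attack only reveals HW(k).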

SLIDE 28

Masked schemes

Figure 10: Examples of boolean masking

(a) m ⊕ u, k, S', x ⊕ u, y ⊕ u
(b) m ⊕ u, k ⊕ w, S', x ⊕ u ⊕ w, y ⊕ u
(c) m ⊕ u, k ⊕ w, S', x ⊕ u ⊕ w, then ⊕ w to get x ⊕ u, y ⊕ u
(d) m ⊕ u, k ⊕ w, S', x ⊕ u ⊕ w, y ⊕ v

m-y m-x m-x-y

X X X X X
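For a scheme such as (a), where m and y carry the same mask u, the theoretical m-y distribution has to be built by exhausting the masks as well as the inputs. The sketch below assumes that S is the AES S-box, that y = S(m ⊕ k), and that u is a uniform 8-bit mask. The helper names are hypothetical.

```python
from collections import Counter

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def rotl8(v, n):
    """Rotate a byte left by n bits."""
    return ((v << n) | (v >> (8 - n))) & 0xFF

def aes_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    table = []
    for x in range(256):
        inv = 0 if x == 0 else next(c for c in range(1, 256) if gf_mul(x, c) == 1)
        s = inv
        for n in range(1, 5):
            s ^= rotl8(inv, n)
        table.append(s ^ 0x63)
    return table

SBOX = aes_sbox()  # assumption: S is the AES S-box

def hamming_weight(v):
    return bin(v).count("1")

def masked_my_distribution(k):
    """Count (HW(m XOR u), HW(y XOR u)) over all inputs m and all masks u."""
    return Counter(
        (hamming_weight(m ^ u), hamming_weight(SBOX[m ^ k] ^ u))
        for m in range(256)
        for u in range(256)
    )

d39 = masked_my_distribution(39)
d167 = masked_my_distribution(167)
print("tables differ:", d39 != d167)
```

Each table now sums to 256 × 256 = 65536 pairs. The mask spreads the counts over more cells, which matches the "harder to distinguish" remark for the masked m-y attack.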


SLIDE 29

Simulation results

Figure 11: Key rank on masked implementation m-y and m-x-y (σ = 1.0); correct key rank versus number of observations (10 to 10^6, logarithmic scale), for m-y masked and m-x-y masked

SLIDE 31

Conclusion

Summary

New method to infer HW: variance
New ways of using joint distributions:
m-x-y: better results
m-x: quite appealing on protected implementations
Extension to masked implementations: independent masks still secure

Future work

Attack with independent masks
Extend to other consumption models (template-like)
Identify protocols that require blind attacks
