Deriving cryptographic keys via power consumption, Roman Korkikian



slide-1
SLIDE 1

Deriving cryptographic keys via power consumption Roman Korkikian

slide-2
SLIDE 2

Power trace plotting

Start the virtual machine; you should see a terminal with three tabs and an editor (also with several tabs). Go to the editor and open the Task1 tab; we will need it later (if you do not have a printout). All slides are on the flash drives. Open them if you cannot see the projector.

slide-3
SLIDE 3

What is this workshop about?

Roma You

Questions are welcome during, after, and instead of the workshop.

slide-4
SLIDE 4

Who am I?

Roman Korkikian

slide-5
SLIDE 5

What is this workshop about?

The workshop agenda:

  • 1. Hands-on theory (you will execute a couple of commands). During this part I will not come around to help you; I will help you after the theory.
  • 2. Theory (1-2 hours).
  • 3. Practice (1-2 hours); you may leave if you do not want to do it.
slide-6
SLIDE 6

What is this workshop about?

The workshop goal: attract your interest to side-channel attacks.

slide-7
SLIDE 7

What is this workshop about?

The workshop is about side-channel attacks: how to extract the cryptographic key using physical data.

slide-8
SLIDE 8

What is this workshop about?

The workshop is about side-channel attacks: how to extract the cryptographic key using physical data.

Sounds crazy?

slide-9
SLIDE 9

PART 0. Power trace acquisition

slide-10
SLIDE 10

Power trace acquisition

Every electronic device requires power to perform computations. Typical supply voltages are 0.95–1.8 V, depending on the technology. While it works, an electronic device drains current to ground, transforms energy, and so on; in other words, the device consumes power.

Pin connected to power supply

slide-11
SLIDE 11

Power trace acquisition

https://github.com/kokke/tiny-AES128-C/blob/master/aes.c

slide-12
SLIDE 12

Power trace plotting

AES algorithm

slide-13
SLIDE 13

Power trace explanation

slide-14
SLIDE 14

Power trace plotting

Start the virtual machine and open the folder Desktop/DPA_Workshop/AES_STM8_Example. It should contain the file Task1; open it and follow steps 7-13 to plot the power curves.

slide-15
SLIDE 15

Power trace acquisition

slide-16
SLIDE 16

Power trace explanation

  • AES (here AES-128) has a distinctive structure:
  • 11 key mixing (AddRoundKey) operations;
  • 10 Sbox and ShiftRows operations;
  • 9 MixColumn operations.
  • Hence, from the power trace we can tell where each operation took place.
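The operation counts above can be checked with a short sketch. It assumes the standard AES-128 round structure (an initial key mixing, nine full rounds, and a final round without MixColumn); the function names are illustrative and do not come from the workshop files.

```python
def aes128_schedule():
    """Return the ordered list of operations in one AES-128 encryption."""
    ops = ["AddRoundKey"]                      # initial key mixing
    for _ in range(9):                         # rounds 1..9 are full rounds
        ops += ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"]
    # The last round (round 10) skips MixColumns.
    ops += ["SubBytes", "ShiftRows", "AddRoundKey"]
    return ops


def count_ops(ops):
    """Count how many times each operation name occurs."""
    counts = {}
    for op in ops:
        counts[op] = counts.get(op, 0) + 1
    return counts


if __name__ == "__main__":
    for name, count in count_ops(aes128_schedule()).items():
        print(name, count)
```

Each of these operations shows up as a repeated pattern in the trace, which is why counting the patterns lets you locate the rounds.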

slide-17
SLIDE 17

Power trace explanation

slide-18
SLIDE 18

Power trace explanation

  • AES engine acquisition:
  • 10,000 plaintext/ciphertext pairs and the corresponding power traces recorded during encryption are available in the folder data.
  • We clearly see that the AES algorithm reveals its operations via power consumption.
  • What will we do with all this precious information?! EXTRACT THE KEY!

slide-19
SLIDE 19

Power trace explanation

  • Key extraction:
  • We build simple power models that depend on a key byte and on known information; then we try all values of the byte and choose the correct one among them.
  • Let's go back to the virtual machine and continue Task 1 (items 14 to 17).

slide-20
SLIDE 20

Power trace explanation

slide-21
SLIDE 21

Power trace explanation

slide-22
SLIDE 22

Power trace explanation

slide-23
SLIDE 23

Power trace explanation

  • Steps 14 to 17 compute the relationship between the power traces and the first byte of the last round state: S = Sbox[C xor K]
  • S is used twice in the program:
  • the first time it is obtained after AddRndKey;
  • the second time it is read from memory to compute [ ]

slide-24
SLIDE 24

Power trace explanation

  • We were correlating the first byte of the key. Now I want you to change two numbers in the code so that you correlate another key byte.
  • Do Task1, lines 18 to 25.
  • Check the position of the peak: it should move.
slide-25
SLIDE 25

Power trace explanation

slide-26
SLIDE 26

Power trace explanation

slide-27
SLIDE 27

Power trace explanation

  • So in this workshop I will try to explain:
  • Why and how power consumption leaks binary information.
  • How to build simple power consumption models based on binary data.
  • How to search for a key.
slide-28
SLIDE 28

PART 1. The truth is out there…*

* This part is boring; do not fall asleep.

slide-29
SLIDE 29
slide-30
SLIDE 30

Intuition lies

  • R. Pacalet
slide-31
SLIDE 31

Examples of side-channels

Examples of side-channel techniques that YOU KNOW FROM SCHOOL

slide-32
SLIDE 32

Examples of side channels: model of the atom

slide-33
SLIDE 33

Examples of side channels: volume definition

slide-34
SLIDE 34

Examples of side-channels

  • Side-channel techniques are methods that characterize an object's properties by observing the reaction and feedback of other objects.
  • For cryptography, side-channel attacks are attacks that use physical information to extract binary data.

slide-35
SLIDE 35

How are binary and physical data linked?

slide-36
SLIDE 36

Algorithms are just physical signals

Algorithm

Computing machinery

Computing is a physical process. To perform computations, energy has to be spent irreversibly.

slide-37
SLIDE 37

Side channel leakages

  • CMOS (electronic) devices consume electrical power, but they also transform energy into other observable types of side channel.

slide-38
SLIDE 38

Side channel leakages

slide-39
SLIDE 39

Power consumption

slide-40
SLIDE 40

Power consumption

  • Hardware runs at frequencies from several MHz to several GHz.
  • Power consumption has to be measured accordingly (the sampling rate may be lower than the actual clock frequency).
  • Use fast digital oscilloscopes.
  • In our case the bandwidth was 250 MHz, with the microcontroller running at 16 MHz.

slide-41
SLIDE 41

Power consumption

  • Home setup.
  • Bandwidth up to 250 MHz, sampling 10 GHz; can attack devices running at up to 500 MHz.
  • $1.8K
  • Industrial setup.
  • All types of devices
  • >$30K
slide-42
SLIDE 42

Power consumption

  • Roughly speaking, power consumption is a measure of the current/voltage the device needs to work correctly; the current that flows through the device differs from one moment to the next.
  • We consider CMOS systems: they consume power only on transitions.

slide-43
SLIDE 43

Power consumption

Let's keep it simple: all hardware is built from registers (and logic), and a register consumes a lot of power on a transition, when 0 overwrites 1 or 1 overwrites 0.
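A minimal sketch of this register model, assuming that each flipped bit costs roughly the same amount of power. The helper names are hypothetical; they only count how many bits change when a new value overwrites an old one.

```python
def hamming_distance(old, new):
    """Number of bit positions that differ between the old and new register values."""
    return bin(old ^ new).count("1")


def transition_counts(old, new, width=8):
    """Return (rising, falling): bits going 0 -> 1 and bits going 1 -> 0."""
    mask = (1 << width) - 1
    rising = bin(~old & new & mask).count("1")    # 0 overwritten by 1
    falling = bin(old & ~new & mask).count("1")   # 1 overwritten by 0
    return rising, falling


if __name__ == "__main__":
    old, new = 0b10100101, 0b01100110
    print("bits flipped:", hamming_distance(old, new))
    print("rising, falling:", transition_counts(old, new))
```

Bits that keep their value (0 to 0 or 1 to 1) do not appear in either count, which matches the idea that they cost little power.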
slide-44
SLIDE 44

Power consumption: D flip flop

  • A D flip-flop consumes power on the clock edge.
  • Let's watch a small GIF.
slide-45
SLIDE 45

Power consumption: D flip flop

  • Flip-flop power consumption on a D input transition

[Figure: power consumption (axis 0.2 to 1.2) for the D input transitions 0 to 1, 1 to 0, 1 to 1, and 0 to 0]

slide-46
SLIDE 46

Power consumption in microcontroller

slide-47
SLIDE 47

Power consumption

Different instructions – different impedance/path/power consumption

slide-48
SLIDE 48

Power consumption

slide-49
SLIDE 49

Power consumption

  • Well… registers consume power, but the rest of the circuit also consumes a lot (even more than one register).
  • What to do?
slide-50
SLIDE 50

Mathematics behind side channel attacks

slide-51
SLIDE 51

Mathematics

  • Consider a small example:
  • Two asset values, 2 and 1.
  • Each time an asset value is returned, significant noise is added:

[Figure: asset values + noise values]

slide-52
SLIDE 52

Mathematics

slide-53
SLIDE 53

Mathematics

Guess 1: sequence of two asset values
Guess 2: sequence of two asset values
Red is 2, blue is 1.
We know the asset values, but we do not know the order in which they appear.

slide-54
SLIDE 54

Mathematics

  • Take all values, average, and subtract:
  • mean(red bars) – mean(blue bars) should be approximately 1
  • For guess 1 this difference is 1.042
  • For guess 2 this difference is -0.297
  • Why?
slide-55
SLIDE 55

Mathematics

  • Law of large numbers (check the wiki): the average of the results obtained from a large number of trials should be close to the expected value, and tends to get closer as more trials are performed.

slide-56
SLIDE 56

Mathematics

  • Asset values 0 and 1
  • Noise is Gaussian with mean 128 and variance 78
  • The average for asset value 1 converges to 128 + 1, and for asset value 0 the average converges to 128 + 0
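The convergence can be tried out with a small simulation. This is only a sketch under the slide's assumptions (Gaussian noise with mean 128 and variance 78 added to an asset value of 0 or 1); the function name and the seed are arbitrary choices.

```python
import random


def average_observation(asset, n, rng, mean=128.0, variance=78.0):
    """Average of n noisy observations of one asset value."""
    sigma = variance ** 0.5
    return sum(asset + rng.gauss(mean, sigma) for _ in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    for n in (10, 1000, 100000):
        avg0 = average_observation(0, n, rng)
        avg1 = average_observation(1, n, rng)
        print(f"n={n:6d}  avg(asset 0)={avg0:8.3f}  avg(asset 1)={avg1:8.3f}")
```

With few observations the two averages are hard to tell apart; as n grows they settle near 128 and 129.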

slide-57
SLIDE 57

Mathematics

  • Asset values 0 and 1
  • Noise is Gaussian with mean 128 and variance 78
  • The average for asset value 1 converges to 128 + 1, and for asset value 0 the average converges to 128 + 0

slide-58
SLIDE 58

Mathematics

  • Law of large numbers for our case:
  • Correct guess:

mean(n21+2, n22+2, n23+2 … n2i+2) = mean(noise) + 2
mean(n11+1, n12+1, n13+1 … n1i+1) = mean(noise) + 1

  • Wrong guess:

mean(n21+2, n22+1, n23+2 … n2i+1) = mean(noise) + 1.5
mean(n11+1, n12+2, n13+1 … n1i+1) = mean(noise) + 1.5
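The same reasoning can be sketched in code. The simulation below is an illustration only: it assumes asset values 1 and 2 drawn at random, Gaussian noise with an arbitrarily chosen spread, and a "wrong guess" that is simply a random labelling unrelated to the real sequence.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_of_means(observations, guess):
    """mean(observations labelled 2) - mean(observations labelled 1)."""
    twos = [o for o, g in zip(observations, guess) if g == 2]
    ones = [o for o, g in zip(observations, guess) if g == 1]
    return mean(twos) - mean(ones)


def simulate(n, rng, noise_mean=128.0, noise_sigma=8.0):
    """Return the hidden sequence of asset values and the noisy observations."""
    secret = [rng.choice((1, 2)) for _ in range(n)]
    observations = [s + rng.gauss(noise_mean, noise_sigma) for s in secret]
    return secret, observations


if __name__ == "__main__":
    rng = random.Random(0)
    secret, observations = simulate(100000, rng)
    wrong = [rng.choice((1, 2)) for _ in secret]  # labelling unrelated to the secret
    print("correct guess:", round(difference_of_means(observations, secret), 3))
    print("wrong guess:  ", round(difference_of_means(observations, wrong), 3))
```

For the correct labelling the noise averages out and the difference approaches 2 - 1 = 1; for the unrelated labelling both groups contain a mix of 1s and 2s, so the difference approaches 0.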

slide-59
SLIDE 59

Mathematics

Guess 1: sequence of two asset values. Difference = 1.042
Guess 2: sequence of two asset values. Difference = -0.297

slide-60
SLIDE 60

Mathematics

  • That was for arbitrary data, but what happens in reality for an arbitrary instruction: R = f(x,y)?

slide-61
SLIDE 61

Mathematics

[Figure: averaged power consumption versus number of averages for R = func(x,y), one curve per Hamming weight HW = 0 to HW = 8]

slide-62
SLIDE 62

Mathematics

  • The mathematics is simple: averaging shows the difference in power consumption, revealing the Hamming weight only for the correct model.
  • Because of noise we need a lot of acquisitions.
  • Let's finally move on to side-channel attacks.
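A rough sketch of this averaging, assuming a simulated leakage that is linear in the Hamming weight of an 8-bit result plus Gaussian noise. The offset, noise level, and function names are assumptions made for the illustration, not measurements from the workshop board.

```python
import random


def hamming_weight(value):
    """Number of bits set to 1."""
    return bin(value).count("1")


def simulate_samples(values, rng, offset=128.0, sigma=8.0):
    # Assumed leakage model: offset + HW(value) + Gaussian noise.
    return [offset + hamming_weight(v) + rng.gauss(0.0, sigma) for v in values]


def average_by_hw(values, samples):
    """Average the samples separately for each Hamming weight class 0..8."""
    sums, counts = [0.0] * 9, [0] * 9
    for v, s in zip(values, samples):
        hw = hamming_weight(v)
        sums[hw] += s
        counts[hw] += 1
    return [sums[h] / counts[h] if counts[h] else None for h in range(9)]


if __name__ == "__main__":
    rng = random.Random(4)
    values = [rng.randrange(256) for _ in range(100000)]
    samples = simulate_samples(values, rng)
    for hw, avg in enumerate(average_by_hw(values, samples)):
        print("HW =", hw, "average =", None if avg is None else round(avg, 2))
```

The noise is much larger than the step between neighbouring Hamming weights, yet the class averages separate once enough samples are accumulated.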
slide-63
SLIDE 63

PART 2. Side channel attacks

slide-64
SLIDE 64

Side channel preliminaries

  • Power consumption does depend on the number of bits set to 1.
  • An 8-bit operation (Sbox[C xor K]) happened during the measurement.
  • You know C (10,000 values) but you do not know K.
  • You know the power consumption trace (let's assume for simplicity that you know the time of the operation).

slide-65
SLIDE 65

Side channel preliminaries

  • 1. Create a power model for each key value: S = Sbox[C xor K]
slide-66
SLIDE 66

Power consumption: D flip flop

  • Flip-flop power consumption on a D input transition

[Figure: power consumption (axis 0.2 to 1.2) for the D input transitions 0 to 1, 1 to 0, 1 to 1, and 0 to 0]

slide-67
SLIDE 67

Mathematics

[Figure: averaged power consumption versus number of averages for R = func(x,y), one curve per Hamming weight HW = 0 to HW = 8]

slide-68
SLIDE 68

Side channel preliminaries

S = Sbox[C xor K]

Ciphertext                        First byte  Key 0x00    Key 0x01    ...  Key 0xFF
                                              S   HW(S)   S   HW(S)        S   HW(S)
9cb4cad1a9b4031c678e5afc997dd75a  9C          1C  3       75  5       ...
505d4db951930c707c8ef4bbe52f25ea  50          6C  4       70  3       ...  1B  4
a3a1e523d7c2d1b17c0da136d0be1559  A3          71  4       1A  3       ...  A7  5
0f8164f091690deff2e3f326c9877c1f  0F          FB  7       D7  6       ...  17  4
9c4df2500df0f92cd6f6baaff1f765b5  9C          1C  3       75  5       ...
...                               ...         ... ...     ... ...     ...  ... ...
ad38d78c2590807ac3a8b992056b6c3a  AD          18  2       AA  4       ...  48  2
c766e6776c7d80e1acd68dd825591595  C7          31  3       C7  5       ...  76  5
fd3d63130367d08e31fe96dd2b5d49ca  FD          21  2       55  4       ...  6A  4
368d5ad4525dce11c9a7c63afe4bf2b2  36          24  2       B2  4       ...  12  2
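A sketch of how such a model table can be generated. One observation drives the code: the S values printed in the table (for example 9C giving 1C for key 0x00) match the inverse AES S-box applied to C xor K, so the sketch uses the inverse table. The S-box is computed from its mathematical definition instead of being typed in, and all function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sboxes():
    """Return (sbox, inv_sbox) for AES as two 256-entry lists."""
    sbox, inv_sbox = [0] * 256, [0] * 256
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):          # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = inv ^ 0x63
        for shift in range(1, 5):         # affine transform: XOR of four rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[x] = s
        inv_sbox[s] = x
    return sbox, inv_sbox


def hamming_weight(value):
    return bin(value).count("1")


def model_row(ciphertext_hex, inv_sbox):
    """For the first ciphertext byte, return (S, HW(S)) for every key guess 0..255."""
    c = bytes.fromhex(ciphertext_hex)[0]
    return [(inv_sbox[c ^ k], hamming_weight(inv_sbox[c ^ k])) for k in range(256)]


if __name__ == "__main__":
    _, inv_sbox = build_sboxes()
    row = model_row("9cb4cad1a9b4031c678e5afc997dd75a", inv_sbox)
    for key in (0x00, 0x01):
        s, hw = row[key]
        print(f"key 0x{key:02X}: S = {s:02X}, HW(S) = {hw}")
```

Running it for the first ciphertext reproduces the 1C / 3 and 75 / 5 entries of the first table row.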

slide-69
SLIDE 69

Side channel preliminaries

  • 1. Create a power model for each key value: S = Sbox[C xor K]
  • 2. Distribute the power measurements according to the model.
slide-70
SLIDE 70

Side channel preliminaries

S = Sbox[C xor K]

slide-71
SLIDE 71

Side channel preliminaries

S = Sbox[C xor K]

HW(S): 3 4 4 7 3 ... 2 3 2 2

slide-72
SLIDE 72

Side channel preliminaries

S = Sbox[C xor K]

HW(S): 3 4 4 7 3 ... 2 3 2 2

slide-73
SLIDE 73

Side channel preliminaries

S = Sbox[C xor K]

HW(S): 3 4 4 7 3 ... 2 3 2 2

slide-74
SLIDE 74

Side channel preliminaries

S = Sbox[C xor K]

HW(S): 5 3 3 6 5 ... 4 5 4 4

slide-75
SLIDE 75

Side channel preliminaries

  • 1. Create a power model for each key value: S = Sbox[C xor K]
  • 2. Distribute the power measurements according to the model.
  • 3. Among all distributions, pick the key with the strongest dependency.
slide-76
SLIDE 76

Side channel preliminaries

slide-77
SLIDE 77

Side channel preliminaries

  • 1. These results are not comfortable to judge by eye; instead, use a coefficient of mutual dependence such as the correlation coefficient.
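A compact sketch of the whole attack on simulated data, using the Pearson correlation coefficient as the measure of dependence. Everything here is an assumption made for illustration: the leakage is simulated as HW(S) plus Gaussian noise at a single known time sample, S is computed with the inverse AES S-box (consistent with the values tabulated earlier), and the secret key byte 0x3C is arbitrary. It is not the code from Task1.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_inv_sbox():
    """Generate the AES S-box from its definition and return its inverse."""
    inv_sbox = [0] * 256
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):          # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = inv ^ 0x63
        for shift in range(1, 5):         # affine transform: XOR of four rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_sbox[s] = x
    return inv_sbox


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_leakage(ciphertext_bytes, key_byte, inv_sbox, rng, sigma=2.0):
    # Assumed leakage: Hamming weight of the intermediate value plus noise.
    return [hamming_weight(inv_sbox[c ^ key_byte]) + rng.gauss(0.0, sigma)
            for c in ciphertext_bytes]


def recover_key_byte(ciphertext_bytes, leakage, inv_sbox):
    """Try all 256 key guesses and return the one with the highest correlation."""
    best_key, best_corr = 0, -2.0
    for k in range(256):
        model = [hamming_weight(inv_sbox[c ^ k]) for c in ciphertext_bytes]
        corr = pearson(model, leakage)
        if corr > best_corr:
            best_key, best_corr = k, corr
    return best_key, best_corr


if __name__ == "__main__":
    rng = random.Random(7)
    inv_sbox = build_inv_sbox()
    ciphertexts = [rng.randrange(256) for _ in range(2000)]
    leakage = simulate_leakage(ciphertexts, 0x3C, inv_sbox, rng)
    key, corr = recover_key_byte(ciphertexts, leakage, inv_sbox)
    print(f"best guess: 0x{key:02X}, correlation: {corr:.3f}")
```

On real traces the same loop would run once per time sample, and the correct key would show up as a peak in the correlation curve.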

slide-78
SLIDE 78

Side channel preliminaries

Let's do a small demo, and then you will try to develop your own first side-channel attack. Do steps 26-27 of Task1.

slide-79
SLIDE 79

Side channel preliminaries

slide-80
SLIDE 80

Power trace explanation

  • Steps 14 to 17 compute the relationship between the power traces and the first byte of the last round state: S = Sbox[C xor K]
  • S is used twice in the program:
  • the first time it is obtained after AddRndKey;
  • the second time it is read from memory to compute [ ]

slide-81
SLIDE 81

Side channel preliminaries

slide-82
SLIDE 82

PART 2. Practice 1-2 hours

slide-83
SLIDE 83

Practice

If you did not finish Task1 and still want to do it, I will help you after a short explanation.

slide-84
SLIDE 84

Practice

  • The AES algorithm can also be implemented in hardware (faster, more reliable, protected against timing attacks, etc.).
  • You will develop and attack the algorithm in two ways.
  • First, a short explanation of how the algorithm is implemented.
slide-85
SLIDE 85

Practice

[Block diagram: 128-bit register, ShiftRows / Sbox, MixColumn, AddRoundKey]

slide-86
SLIDE 86

Practice

  • Assume that the only thing that consumes power is the 128-bit register, so you need to correlate its values.
  • AES is executed in 11 clock cycles.
  • You need to identify a register value that you can compute and correlate. Follow the description of Task2.
slide-87
SLIDE 87

Misc

  • What to read:
  • Scholar.google.com (side channel attacks, power analysis, correlation power analysis, etc.)
  • COSADE, CARDIS, CHES and other academic conferences.
  • Хакер magazine: I am writing a series of articles there on hardware attacks (not only side channels). For now, once every two months.

slide-88
SLIDE 88

Practice

  • This is just the beginning of side-channel attacks. Nowadays there are methods that allow you to:
  • Extract the key without plaintext or ciphertext information.
  • Attack protected algorithms.
  • Reverse the code using side-channel techniques.
  • Reverse the circuit (approximate IP location).
  • Use side channels to communicate.
  • Possibly other applications.
  • Side channels are definitely not dead, and there is enormous room for improvement.

slide-89
SLIDE 89

Roman Korkikian Roman.korkikian@yandex.com