
Fast Homomorphic Evaluation of Deep Discretized Neural Networks

Florian Bourse Michele Minelli Matthias Minihold Pascal Paillier

ENS, CNRS, PSL Research University, INRIA (Work done while visiting CryptoExperts)

CRYPTO 2018 – UCSB, Santa Barbara

Machine Learning as a Service (MLaaS)

Alice sends her input x to a server, which evaluates its model M and returns the prediction M(x). Problem: this exposes Alice's private data. Possible solution: FHE. Alice sends Enc(x) instead, and the server returns Enc(M(x)).

✓ Privacy: data is encrypted (both input and output)
✗ Efficiency: the main issue with FHE-based solutions

Goal of this work: homomorphic evaluation of trained networks.

(Very quick) refresher on neural networks

[Diagram: a feed-forward network with an input layer, d hidden layers, and an output layer.]

Computation for every neuron: inputs x_1, x_2, ... with weights w_1, w_2, ... produce an output y, where x_i, w_i, y ∈ R and

y = f(∑_i w_i x_i), where f is an activation function.
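
As a plaintext reference point, here is a minimal sketch of this per-neuron computation (assuming NumPy; the names are illustrative):

```python
import numpy as np

def neuron(x, w, f):
    """One neuron: activation f applied to the multisum <w, x>."""
    return f(np.dot(w, x))

# Example with two inputs and a sign activation.
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
y = neuron(x, w, np.sign)   # f(w_1 x_1 + w_2 x_2) = sign(0.04) = 1.0
```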

A specific use case

We consider the problem of digit recognition: given an image of a handwritten digit (e.g., a 7), classify which digit it represents.

Dataset: MNIST (60 000 training images + 10 000 test images).
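
For concreteness, one common way to obtain the dataset (this sketch assumes torchvision is installed; any MNIST loader works):

```python
# Hypothetical setup: fetch MNIST via torchvision (any loader works).
from torchvision import datasets

train = datasets.MNIST(root="data", train=True, download=True)
test = datasets.MNIST(root="data", train=False, download=True)
print(len(train), len(test))   # 60000 10000
```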

State of the art: Cryptonets [DGBL+16]

✓ Achieves blind, non-interactive classification
✓ Near state-of-the-art accuracy (98.95%)
✗ Replaces sigmoidal activation functions with the low-degree f(x) = x^2
✗ Uses SHE ⇒ parameters have to be chosen at setup time

Main limitation
The computation at neuron level depends on the total multiplicative depth of the network ⇒ bad for deep networks!

Goal: make the computation scale-invariant ⇒ bootstrapping.

A restriction on the model

We want to homomorphically compute the multisum ∑_i w_i x_i. Given w_1, ..., w_p and Enc(x_1), ..., Enc(x_p), compute ∑_i w_i · Enc(x_i).

Proceed with caution
In order to maintain correctness, we need w_i ∈ Z ⇒ trade-off efficiency vs. accuracy!
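
A toy sketch of this multisum over an additively homomorphic, LWE-style scheme (illustration only: the parameters are insecure and the names enc/dec/scale/add are hypothetical). Scaling a ciphertext by a plaintext weight only stays meaningful when the weight is an integer, which is exactly the restriction above:

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, Delta = 64, 2**20, 2**10        # toy dimension, modulus, scaling factor

s = rng.integers(0, 2, size=n)        # binary secret key

def enc(x):
    """Toy LWE-style encryption of a small integer x (insecure, illustrative)."""
    a = rng.integers(0, q, size=n)
    e = rng.integers(-2, 3)           # small noise
    return a, (a @ s + e + Delta * x) % q

def dec(ct):
    a, b = ct
    noisy = (b - a @ s) % q
    noisy = noisy if noisy < q // 2 else noisy - q    # center around 0
    return round(noisy / Delta)

def scale(w, ct):
    """Multiply a ciphertext by a plaintext integer weight w."""
    a, b = ct
    return (w * a) % q, (w * b) % q

def add(ct1, ct2):
    (a1, b1), (a2, b2) = ct1, ct2
    return (a1 + a2) % q, (b1 + b2) % q

# Homomorphic multisum with integer weights.
xs, ws = [3, -2, 5], [2, 4, -1]
acc = scale(ws[0], enc(xs[0]))
for w, x in zip(ws[1:], xs[1:]):
    acc = add(acc, scale(w, enc(x)))
assert dec(acc) == sum(w * x for w, x in zip(ws, xs))   # -7
```

A fractional weight (say w = 0.5) has no meaning for these integer ciphertexts, hence the integer restriction and the resulting efficiency vs. accuracy trade-off.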

Discretized neural networks (DiNNs)

Goal: an FHE-friendly model of neural network.

Definition
A DiNN is a neural network whose inputs are integer values in {−I, ..., I}, and whose weights are integer values in {−W, ..., W}, for some I, W ∈ N. For every activated neuron of the network, the activation function maps the multisum to integer values in {−I, ..., I}.

Not as restrictive as it seems: e.g., binarized NNs;
Trade-off between size and performance;
(A basic) conversion is extremely easy (see the sketch below).
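
A minimal sketch of one such basic conversion (an assumption about what "basic" means here: scale each layer's real weights so the largest magnitude hits W, then round):

```python
import numpy as np

def discretize(weights, W):
    """Naive DiNN conversion: map real weights to integers in {-W, ..., W}
    by scaling the largest magnitude to W and rounding."""
    scale = W / np.max(np.abs(weights))
    return np.clip(np.round(weights * scale), -W, W).astype(int)

w_real = np.array([0.73, -0.12, 0.05, -0.88])
print(discretize(w_real, W=10))   # e.g. [  8  -1   1 -10]
```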

Homomorphic evaluation of a DiNN

1 Evaluate the multisum: easy – just need a linearly homomorphic scheme: ∑_i w_i · Enc(x_i) = Enc(∑_i w_i x_i)
2 Apply the activation function: depends on the function: Enc(f(∑_i w_i x_i))
3 Bootstrap: can be costly: Enc*(f(∑_i w_i x_i))
4 Repeat for all the layers

Issues:
Choose the message space: guess, statistics, or worst-case
The noise grows: need to start from a very small noise
How do we apply the activation function homomorphically? (See the sketch below for the overall structure.)
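
The overall control flow as a structural sketch (the `hom` object and its methods are hypothetical placeholders for the homomorphic operations named in steps 1-4):

```python
def evaluate_dinn(layers, cts, hom):
    """Evaluate a DiNN homomorphically, layer by layer (structural sketch).

    layers: list of integer weight matrices, one row per output neuron.
    cts:    ciphertexts of the current layer's inputs.
    hom:    hypothetical handle exposing the homomorphic operations."""
    for weights in layers:                        # step 4: repeat per layer
        nxt = []
        for w_row in weights:
            acc = hom.scale(w_row[0], cts[0])     # step 1: multisum
            for w, ct in zip(w_row[1:], cts[1:]):
                acc = hom.add(acc, hom.scale(w, ct))
            acc = hom.activate(acc)               # step 2: activation
            nxt.append(hom.bootstrap(acc))        # step 3: refresh the noise
        cts = nxt
    return cts
```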

Basic idea: activate during bootstrapping

Combine bootstrapping & activation function: Enc(x) → Enc*(f(x)).

For every neuron, given Enc(x_1), Enc(x_2), ... and weights w_1, w_2, ..., output Enc*(y) with y = f(∑_i w_i x_i). Two steps:

1 Compute the multisum ∑_i w_i x_i
2 Bootstrap to the activated value

TFHE: a framework for faster bootstrapping [CGGI16,CGGI17]

T := R/Z. Basic assumption: learning with errors (LWE) over the torus:

(a, b = ⟨s, a⟩ + e mod 1) ≈_c (a, u),   where e ← χ_α, s ←$ {0,1}^n, a, u ←$ T^n.

Scheme   Message      Ciphertext
LWE      scalar       (n + 1) scalars
TLWE     polynomial   (k + 1) polynomials

Overview of the bootstrapping procedure:
1 Homomorphically compute X^(b−⟨s,a⟩): spin the wheel
2 Pick the ciphertext pointed to by the arrow
3 Switch back to the original key
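
A plaintext toy of an LWE sample over the torus (insecure toy parameters; taking χ_α to be a Gaussian of standard deviation α is an assumption of this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 16, 2**-15                  # toy dimension and noise parameter

s = rng.integers(0, 2, size=n)         # s <-$ {0,1}^n

def lwe_sample(mu=0.0):
    """One LWE sample over T = R/Z, encoding the torus message mu."""
    a = rng.random(n)                  # a <-$ T^n
    e = rng.normal(0, alpha)           # e <- chi_alpha
    b = (a @ s + e + mu) % 1.0         # b = <s, a> + e + mu  (mod 1)
    return a, b

a, b = lwe_sample(mu=0.25)
phase = (b - a @ s) % 1.0              # decryption recovers mu up to e
print(round(phase, 4))                 # ~0.25
```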

Our activation function

We focus on f(x) = sign(x).

[Plot: the sign step function on inputs in {−I, ..., I}: value −1 on negative inputs, +1 on positive inputs.]
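
A plaintext toy of how sign extraction maps onto the bootstrapping wheel (a sketch of the intuition only, not the actual TFHE procedure): embed the multisum, bounded by the message space, into the torus; one half-turn of the wheel decodes to +1, the other to −1.

```python
def sign_via_wheel(multisum, M):
    """Toy model of sign-during-bootstrapping: a multisum in [-M, M]
    lands on one of 2M 'slices' of the torus wheel; the half-turn it
    falls in determines the sign (sketch of the intuition only)."""
    phase = (multisum / (2 * M)) % 1.0
    return +1 if phase < 0.5 else -1

assert sign_via_wheel(7, M=100) == +1
assert sign_via_wheel(-3, M=100) == -1
```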

Refining TFHE

1 Reducing bandwidth usage
2 Dynamically changing the message space

Reducing bandwidth usage. Standard packing technique: encrypt a polynomial instead of a scalar: ct = TLWE.Encrypt(∑_i p_i X^i). Do the same for the weights (in the clear) of the first hidden layer: w_pol := ∑_i w_i X^{−i}. The constant term of ct · w_pol is then Enc(∑_i w_i x_i), so a single ciphertext carries the whole input. (A plaintext check of this identity follows below.)

Dynamically changing the message space.
Fact: we can keep the message space constant (a bound on all multisums).
Better idea: change the message space to reduce errors. Intuition: fewer slices when we do not need them.
How: details in the paper. Quick intuition: change what we put in the wheel.

Bottom line
We can start with any message space at encryption time, and change it dynamically during the bootstrapping.
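
A plaintext sanity check of the packing identity, treating the product as a plain Laurent polynomial (this sketch ignores the reduction mod X^N + 1 that the real scheme works with):

```python
import numpy as np

p = np.array([3, 1, 4, 1, 5])      # inputs p_i: coefficients of X^0..X^4
w = np.array([2, -1, 0, 7, 1])     # weights w_i: coefficients of X^0..X^-4

# Reverse w so index j holds the coefficient of X^(j-4), then convolve:
# the product's exponents run -4..4, and index len(w)-1 is the X^0 term.
prod = np.convolve(p, w[::-1])
constant_term = prod[len(w) - 1]

assert constant_term == np.dot(p, w)   # the multisum  sum_i w_i p_i
print(constant_term)                   # 17
```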

Overview of the process

Evaluation of a DiNN with 30 neurons in the hidden layer:

1 User: encrypt the packed input as 1 TLWE ciphertext Enc(∑_i p_i X^i) and send it to the server.
2 Server: multiply by the weight polynomials ∑_i w_i X^{−i}: 30 TLWE ciphertexts, one per hidden neuron.
3 Server: extract the constant terms: 30 LWE ciphertexts.
4 Server: sign bootstrapping: 30 LWE ciphertexts of the activated values.
5 Server: weighted sums: 10 LWE ciphertexts, one per class, sent back to the user.
6 User: decrypt to 10 scores and take the argmax: the predicted digit (here, 7).
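
The same pipeline as a structural sketch (every `srv.*` call is a hypothetical stand-in for the TFHE operation named in the corresponding step):

```python
def classify_encrypted_digit(ct_packed, hidden_w_pols, output_weights, srv):
    """Server side of the FHE-DiNN pipeline (structural sketch only)."""
    hidden = []
    for w_pol in hidden_w_pols:                        # 30 weight polynomials
        tlwe = srv.mul_by_plain_poly(ct_packed, w_pol)   # step 2
        lwe = srv.extract_constant_term(tlwe)            # step 3
        hidden.append(srv.sign_bootstrap(lwe))           # step 4
    return [srv.weighted_sum(hidden, w_out)              # step 5: 10 scores,
            for w_out in output_weights]                 # decrypted by the user
```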

Experimental results

On inputs in the clear:

Network       Original NN (R)   DiNN + hard_sigmoid   DiNN + sign
30 neurons    94.76%            93.76% (-1%)          93.55% (-1.21%)
100 neurons   96.75%            96.62% (-0.13%)       96.43% (-0.32%)

On encrypted inputs:

Network   Accur.   Disag.          Wrong BS       Disag. (wrong BS)   Time
30 or     93.71%   273 (105-121)   3383/300000    196/273             0.515 s
30 un     93.46%   270 (119-110)   2912/300000    164/270             0.491 s
100 or    96.26%   127 (61-44)     9088/1000000   105/127             1.679 s
100 un    96.35%   150 (66-58)     7452/1000000   99/150              1.64 s

(or = original, un = unfolded)

Benchmarks

Model          Neurons   Size of ct.   Accuracy   Time enc     Time eval   Time dec
FHE-DiNN 30    30        8.0 kB        93.71%     0.000168 s   0.49 s      0.0000106 s
FHE-DiNN 100   100       8.0 kB        96.35%     0.000168 s   1.65 s      0.0000106 s

The ciphertext size is independent of the network; the evaluation time scales linearly with the number of neurons.

Open problems and future directions

Build better DiNNs: more attention to the conversion (+ retraining)
Implement on GPU to have realistic timings
More models (e.g., convolutional NNs) and machine learning problems

Research needed
We need a fast way to evaluate other, more complex, functions (e.g., max or ReLU, where ReLU(x) = max(0, x)).

Thank you for your attention! Questions?