
SLIDE 1

Machine Learning Classification over Encrypted Data

Raphael Bost, Raluca Ada Popa, Stephen Tu, Shafi Goldwasser

SLIDE 2

Classification

(Machine Learning)

Supervised learning (training)
Classification

server: data set → training phase → model → classification phase

client: data → prediction

SLIDE 3

Secure Classification

The provider’s model is sensitive

financial model, genetic sequences, …

Client’s private data

medical records, credit history, …

SLIDE 4

Secure Classification

The provider’s model is sensitive

financial model, genetic sequences, …

Client’s private data

medical records, credit history, …

MPC / 2PC

SLIDE 5

Using General 2PC?

+ Works for every circuit
+ Constant number of interactions

- Have to build circuits
- Hard to ‘compose’
- Not easily reusable

➡ Ad hoc protocols

SLIDE 6

Scope of our work

Secure classification, no learning

the model is already known

Differential privacy is out of scope

can be treated separately

Classifiers as specialized 2PC, but not a specialized classifier

SLIDE 7

Approach

Security model: passive (honest-but-curious) adversary

Identify and construct reusable building blocks
Practical performance as a primary goal
Choose the best fitted primitives

Homomorphic Encryption, FHE, Garbled Circuits, …

SLIDE 8

Building Blocks

Dot product
Encrypted Comparison
Encrypted (arg)max
Decision trees
Encryption scheme switching

SLIDE 9

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

SLIDE 10

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

Compare everything

SLIDE 11

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

Compare everything ⇒ $O(n^2)$

SLIDE 12

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

Compare everything ⇒ $O(n^2)$

SLIDE 13

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

Compare everything ⇒ $O(n^2)$
‘Classical’ algorithm

SLIDE 14

Argmax

Alice: $(\llbracket a_1 \rrbracket, \dots, \llbracket a_n \rrbracket, PK)$
Bob: $SK$

The comparison pattern must not depend on the values

Compare everything ⇒ $O(n^2)$
‘Classical’ algorithm ⇒ $O(n)$
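The move from $O(n^2)$ to $O(n)$ can be illustrated on plaintext values. The sketch below is only an assumption about what the ‘classical’ algorithm refers to (a single linear scan keeping a running maximum); the names `argmax_linear` and `count_comparisons` are hypothetical. The point it shows is that the schedule of comparisons (running maximum against element i, for i = 2..n) is fixed in advance, so the pattern itself does not depend on the values.

```python
from typing import Callable, List, Tuple


def argmax_linear(values: List[int],
                  less_than: Callable[[int, int], bool]) -> Tuple[int, int]:
    """Linear scan: return (index, value) of the maximum.

    Exactly len(values) - 1 comparisons are made, always in the same order:
    the running maximum is compared with element 1, then element 2, and so on.
    """
    best_idx, best = 0, values[0]
    for i in range(1, len(values)):
        # In a secure version this branch would be replaced by an oblivious
        # swap on ciphertexts; here it is a plain conditional.
        if less_than(best, values[i]):
            best_idx, best = i, values[i]
    return best_idx, best


def count_comparisons(values: List[int]) -> Tuple[int, int, int]:
    """Run argmax_linear and also report how many comparisons were used."""
    calls = 0

    def lt(a: int, b: int) -> bool:
        nonlocal calls
        calls += 1
        return a < b

    idx, val = argmax_linear(values, lt)
    return idx, val, calls
```

Calling `count_comparisons` on a list of n elements always reports n - 1 comparisons, whatever the values are.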

SLIDE 15

Compare & Swap

Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

SLIDE 16

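Compare & Swap

Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

Compare: Alice gets ∅, Bob gets $(v < w)$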

SLIDE 17

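Compare & Swap

Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

Compare: Alice gets ∅, Bob gets $(v < w)$
Swap: Bob gets ∅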

SLIDE 18

Compare & Swap

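Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$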

SLIDE 19

Compare & Swap

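Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$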

SLIDE 20

Compare & Swap

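Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$
Alice → Bob: $\llbracket v' \rrbracket, \llbracket w' \rrbracket$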

SLIDE 21

Compare & Swap

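Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$
Alice → Bob: $\llbracket v' \rrbracket, \llbracket w' \rrbracket$
Bob: $\llbracket m' \rrbracket \leftarrow \llbracket w' \rrbracket$ if $b$, $\llbracket v' \rrbracket$ otherwise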

SLIDE 22

Compare & Swap

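Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$
Alice → Bob: $\llbracket v' \rrbracket, \llbracket w' \rrbracket$
Bob: $\llbracket m' \rrbracket \leftarrow \llbracket w' \rrbracket$ if $b$, $\llbracket v' \rrbracket$ otherwise
Bob → Alice: $(\llbracket b \rrbracket, \llbracket m' \rrbracket)$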

SLIDE 23

Compare & Swap

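Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$
Alice → Bob: $\llbracket v' \rrbracket, \llbracket w' \rrbracket$
Bob: $\llbracket m' \rrbracket \leftarrow \llbracket w' \rrbracket$ if $b$, $\llbracket v' \rrbracket$ otherwise
Bob → Alice: $(\llbracket b \rrbracket, \llbracket m' \rrbracket)$
Alice: $\llbracket m \rrbracket \leftarrow \llbracket m' \rrbracket \cdot (g^{-1} \cdot \llbracket b \rrbracket)^{r} \cdot \llbracket b \rrbracket^{-s}$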

SLIDE 24

Compare & Swap

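Alice: input $(PK, \llbracket v \rrbracket, \llbracket w \rrbracket)$, output $\llbracket \max(v, w) \rrbracket$
Bob: input $SK$, output $(v < w)$

EncCompare: Bob obtains $b = (v < w)$
Alice: $(r, s) \leftarrow M^2$, $\llbracket v' \rrbracket = \llbracket v + r \rrbracket$, $\llbracket w' \rrbracket = \llbracket w + s \rrbracket$
Alice → Bob: $\llbracket v' \rrbracket, \llbracket w' \rrbracket$
Bob: $\llbracket m' \rrbracket \leftarrow \llbracket w' \rrbracket$ if $b$, $\llbracket v' \rrbracket$ otherwise
Bob → Alice: $(\llbracket b \rrbracket, \llbracket m' \rrbracket)$
Alice: $\llbracket m \rrbracket \leftarrow \llbracket m' \rrbracket \cdot (g^{-1} \cdot \llbracket b \rrbracket)^{r} \cdot \llbracket b \rrbracket^{-s}$
that is, $\llbracket m \rrbracket \leftarrow \llbracket m' - \bar{b} \cdot r - b \cdot s \rrbracket$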
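The blinding and unblinding steps of Compare & Swap can be checked with a toy additively homomorphic scheme. The sketch below assumes Paillier encryption with $g = n + 1$ and deliberately tiny primes; the slides do not say which scheme or parameters are used, so this is illustrative only. The bit `b` is passed in directly as a stand-in for the output of EncCompare, and Bob's re-randomization of the chosen ciphertext is an added assumption. All function names are hypothetical. Negative exponents in `pow` need Python 3.8 or later.

```python
import math
import random

# Toy Paillier parameters (assumption: tiny primes, insecure, illustration only).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # secret key, held by Bob
MU = pow(LAM, -1, N)


def enc(m: int) -> int:
    """Encrypt m (taken mod N) under the public key (N, G)."""
    while True:
        rho = random.randrange(1, N)
        if math.gcd(rho, N) == 1:
            break
    return (pow(G, m % N, N2) * pow(rho, N, N2)) % N2


def dec(c: int) -> int:
    """Decrypt with the secret key (LAM, MU)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def compare_and_swap(cv: int, cw: int, b: bool) -> int:
    """Return an encryption of max(v, w), given [[v]], [[w]] and b = (v < w)."""
    # Alice: blind both ciphertexts with fresh random r and s.
    r, s = random.randrange(N), random.randrange(N)
    cv_blind = (cv * enc(r)) % N2          # [[v + r]]
    cw_blind = (cw * enc(s)) % N2          # [[w + s]]

    # Bob: pick the blinded ciphertext of the larger value and re-randomize it
    # (assumption) so that Alice cannot tell which one was chosen.
    chosen = cw_blind if b else cv_blind
    cm_blind = (chosen * enc(0)) % N2      # [[m']]
    cb = enc(1 if b else 0)                # [[b]]

    # Alice: remove the blinding, [[m]] = [[m']] * (g^-1 * [[b]])^r * [[b]]^-s.
    b_minus_one = (pow(G, -1, N2) * cb) % N2   # [[b - 1]]
    return (cm_blind * pow(b_minus_one, r, N2) * pow(cb, -s, N2)) % N2
```

For example, `dec(compare_and_swap(enc(17), enc(42), 17 < 42))` recovers 42: when `b` is 1 the term `b * s` cancels the blinding on `w`, and when `b` is 0 the term `(1 - b) * r` cancels the blinding on `v`.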

SLIDE 25

Argmax

Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$

SLIDE 26

Argmax

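Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_2) \rrbracket$; Bob gets $(m < a_2)$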

SLIDE 27

Argmax

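Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_2) \rrbracket$; Bob gets $(m < a_2)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_i) \rrbracket$; Bob gets $(m < a_i)$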

SLIDE 28

Argmax

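Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_2) \rrbracket$; Bob gets $(m < a_2)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_i) \rrbracket$; Bob gets $(m < a_i)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(m, a_n) \rrbracket$; Bob gets $(m < a_n)$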

SLIDE 29

Argmax

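Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(a_1, a_2) \rrbracket$; Bob gets $(m < a_2)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,i]} a_j \rrbracket$; Bob gets $(m < a_i)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,n]} a_j \rrbracket$; Bob gets $(m < a_n)$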

SLIDE 30

Argmax

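Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_1 \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(a_1, a_2) \rrbracket$; Bob gets $(a_1 < a_2)$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,i]} a_j \rrbracket$; Bob gets $(m < a_i) \Rightarrow \operatorname{argmax}_{j \in [1,i]} a_j$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,n]} a_j \rrbracket$; Bob gets $(m < a_n) \Rightarrow \operatorname{argmax}_{j \in [1,n]} a_j$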

SLIDE 31

Argmax

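Protocol: $n - 1$ Compare & Swap

Alice: $\llbracket m \rrbracket \leftarrow \llbracket a_{\pi(1)} \rrbracket$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max(a_{\pi(1)}, a_{\pi(2)}) \rrbracket$; Bob gets $(a_{\pi(1)} < a_{\pi(2)})$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,i]} a_{\pi(j)} \rrbracket$; Bob gets $(m < a_{\pi(i)}) \Rightarrow \operatorname{argmax}_{j \in [1,i]} a_{\pi(j)}$
C & S: Alice gets $\llbracket m \rrbracket \leftarrow \llbracket \max_{j \in [1,n]} a_{\pi(j)} \rrbracket$; Bob gets $(m < a_{\pi(n)}) \Rightarrow \operatorname{argmax}_{j \in [1,n]} a_{\pi(j)}$

Result: $\pi(\operatorname{argmax}_j a_j)$, $\max_j a_j$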
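The role of the permutation π can be seen in a plaintext simulation. This is a rough sketch under stated assumptions: the values are handled in the clear, the comparison bit stands in for one Compare & Swap round, and the function name `permuted_argmax` and the representation of π as a shuffled index list are hypothetical. It shows that the party seeing the comparison bits only learns a position in the shuffled order, which the other party maps back to the true index.

```python
import random
from typing import List


def permuted_argmax(values: List[int], rng: random.Random) -> int:
    """Simulate the permuted argmax on plaintext values (distinct values assumed)."""
    n = len(values)

    # Alice: pick a random permutation; perm[k] is the original index of the
    # element placed at shuffled position k.
    perm = list(range(n))
    rng.shuffle(perm)

    # n - 1 sequential rounds. In the protocol each round is one Compare & Swap;
    # here the running maximum m is a plain integer.
    best_pos = 0
    m = values[perm[0]]
    for k in range(1, n):
        b = m < values[perm[k]]  # the only information Bob sees in this round
        if b:
            best_pos, m = k, values[perm[k]]

    # Bob reports the winning shuffled position; Alice undoes the permutation.
    return perm[best_pos]
```

Whatever the seed of `rng`, the returned index is the position of the maximum in the original list, while `best_pos` alone reveals nothing about that position.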

SLIDE 32

Argmax

Protocol: $n - 1$ Compare & Swap

SLIDE 33

Argmax

Protocol: $n - 1$ Compare & Swap

sequentially

SLIDE 34

Argmax

Protocol: $n - 1$ Compare & Swap

sequentially

or in parallel

SLIDE 35

Argmax

Protocol: $n - 1$ Compare & Swap

sequentially

or in parallel

[Figure: time (ms) against number of elements for the argmax protocol; series: Party A, Party B, Communication, Tree]

SLIDE 36

Decision Trees

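[Figure: the (x, y) plane split by thresholds x1, x2, y1, y2 into regions A, B, C, D, E, next to the matching decision tree with tests x < x1, x ≥ x1, y < y1, y > y2, x < x2, x ≥ x2 and leaves A, B, C, D, E]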

SLIDE 37

Decision Trees

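[Figure: binary decision tree with decision nodes b1, b2, b3, b4 and leaves c1, c2, c3, c4, c5]

$P(b_1, b_2, b_3, b_4, c_1, \dots, c_5) = b_1 \cdot (b_3 \cdot (b_4 \cdot c_5 + (1 - b_4) \cdot c_4) + (1 - b_3) \cdot c_3) + (1 - b_1) \cdot (b_2 \cdot c_2 + (1 - b_2) \cdot c_1)$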
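As a sanity check, the polynomial P can be compared with an ordinary walk down the tree. The sketch below works on plaintext bits and integer leaf labels; the function names are hypothetical and the tree shape is read off the polynomial itself (b1 at the root, b2 on its 0 branch, b3 then b4 on its 1 branch).

```python
from itertools import product
from typing import Sequence


def tree_polynomial(b1: int, b2: int, b3: int, b4: int, c: Sequence[int]) -> int:
    """Evaluate P(b1, ..., b4, c1, ..., c5) exactly as written on the slide."""
    c1, c2, c3, c4, c5 = c
    return (b1 * (b3 * (b4 * c5 + (1 - b4) * c4) + (1 - b3) * c3)
            + (1 - b1) * (b2 * c2 + (1 - b2) * c1))


def tree_walk(b1: int, b2: int, b3: int, b4: int, c: Sequence[int]) -> int:
    """Follow the branches of the same tree with ordinary conditionals."""
    c1, c2, c3, c4, c5 = c
    if b1:
        if b3:
            return c5 if b4 else c4
        return c3
    return c2 if b2 else c1


def polynomial_matches_tree(c: Sequence[int]) -> bool:
    """Check that both evaluations agree on all 16 assignments of the bits."""
    return all(tree_polynomial(*bits, c) == tree_walk(*bits, c)
               for bits in product((0, 1), repeat=4))
```

Because each bit is 0 or 1, exactly one product of branch factors equals 1 for any assignment, so the polynomial selects a single leaf value without any data-dependent branching.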

SLIDE 38

Decision Trees

$P(b_1, b_2, b_3, b_4, c_1, \dots, c_5) = b_1 \cdot (b_3 \cdot (b_4 \cdot c_5 + (1 - b_4) \cdot c_4) + (1 - b_3) \cdot c_3) + (1 - b_1) \cdot (b_2 \cdot c_2 + (1 - b_2) \cdot c_1)$

Polynomial evaluation

Leveled Homomorphic Encryption

Binary Variables
Binary Coefficients ! (SIMD)

Efficient LHE


SLIDE 39

Classifiers

Linear Classifier
Naïve Bayes Classifier
Decision Trees

In Practice

SLIDE 40

Linear Classifier

Separate two sets of points

Very common classifier

Dot product + Encrypted compare
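The "dot product" half of the linear classifier can be sketched with an additively homomorphic scheme: one party encrypts its vector, and the other raises each ciphertext to its own plaintext coefficient and multiplies the results. The code below assumes toy Paillier encryption with tiny primes, and the split of roles (who holds `x`, who holds `w`) is an assumption made for illustration; the slides do not specify either. The final sign test is done on the decrypted score purely for demonstration, whereas the slides use an encrypted comparison at that step. All names are hypothetical.

```python
import math
import random
from typing import List

# Toy Paillier parameters (assumption: tiny primes, insecure, illustration only).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def enc(m: int) -> int:
    """Encrypt m (taken mod N, so small negative values are allowed)."""
    while True:
        rho = random.randrange(1, N)
        if math.gcd(rho, N) == 1:
            break
    return (pow(G, m % N, N2) * pow(rho, N, N2)) % N2


def dec(c: int) -> int:
    """Decrypt to a residue in [0, N)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def to_signed(m: int) -> int:
    """Interpret a residue mod N as a signed integer in (-N/2, N/2]."""
    return m - N if m > N // 2 else m


def encrypted_dot(enc_x: List[int], w: List[int]) -> int:
    """Compute [[<x, w>]] from the ciphertexts [[x_i]] and plaintext weights w_i."""
    acc = enc(0)
    for cx, wi in zip(enc_x, w):
        # [[x_i]]^(w_i) is an encryption of x_i * w_i; multiplying adds plaintexts.
        acc = (acc * pow(cx, wi % N, N2)) % N2
    return acc


def classify(x: List[int], w: List[int]) -> bool:
    """Return True when <x, w> > 0 (sign test done in the clear for the demo)."""
    score = to_signed(dec(encrypted_dot([enc(v) for v in x], w)))
    return score > 0
```

With `x = [3, -2, 5]` and `w = [4, 1, -1]` the encrypted dot product decrypts to 12 - 2 - 5 = 5, so `classify` returns `True`.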

SLIDE 41

Linear Classifier

              Computation          Time / protocol
Model Size    Client     Server    Dot Product   Enc. Comp.   Total     Comm.      Inter.
30            46.4 ms    43.8 ms   194 ms        9.67 ms      204 ms    35.84 kB   7
47            55.5 ms    43.8 ms   194 ms        23.6 ms      217 ms    40.19 kB   7

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

SLIDE 42

Naïve Bayes Classifier

SLIDE 43

Naïve Bayes Classifier

Classification

$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

SLIDE 44

Naïve Bayes Classifier

Classification
Bayes Formula

$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; \frac{p(C = c_i, X = x)}{p(X = x)}$

SLIDE 45

Naïve Bayes Classifier

Classification
Bayes Formula

$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X = x)$

SLIDE 46

Naïve Bayes Classifier

Classification
Bayes Formula
Naïve Model

$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X_1 = x_1, \dots, X_d = x_d)$

SLIDE 47

Naïve Bayes Classifier

Classification
Bayes Formula
Naïve Model

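$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i) \prod_{j=1}^{d} p(X_j = x_j \mid C = c_i)$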

SLIDE 48

Naïve Bayes Classifier

Classification
Bayes Formula
Naïve Model

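$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i) \prod_{j=1}^{d} p(X_j = x_j \mid C = c_i)$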

SLIDE 49

Naïve Bayes Classifier

Classification
Bayes Formula
Naïve Model

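$\operatorname{argmax}_{i \in [k]} \; p(C = c_i \mid X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i, X = x)$

$= \operatorname{argmax}_{i \in [k]} \; p(C = c_i) \prod_{j=1}^{d} p(X_j = x_j \mid C = c_i)$

$= \operatorname{argmax}_{i \in [k]} \; \left( \log p(C = c_i) + \sum_{j=1}^{d} \log p(X_j = x_j \mid C = c_i) \right)$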
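The last step of the derivation replaces a product of probabilities by a sum of logarithms, which leaves the argmax unchanged because the logarithm is increasing. The plaintext sketch below checks this on a small made-up model; the table layout `likelihoods[i][j][v]` (class i, feature j, feature value v) and the function names are hypothetical. A likely motivation, stated here as an assumption, is that sums are what an additively homomorphic scheme can evaluate directly. `math.prod` needs Python 3.8 or later.

```python
import math
from typing import List, Sequence

# likelihoods[i][j][v] = p(X_j = v | C = c_i); priors[i] = p(C = c_i)
Likelihoods = List[List[List[float]]]


def classify_product(priors: List[float], likelihoods: Likelihoods,
                     x: Sequence[int]) -> int:
    """argmax_i of p(C = c_i) * prod_j p(X_j = x_j | C = c_i)."""
    scores = [
        priors[i] * math.prod(likelihoods[i][j][xj] for j, xj in enumerate(x))
        for i in range(len(priors))
    ]
    return max(range(len(scores)), key=lambda i: scores[i])


def classify_log(priors: List[float], likelihoods: Likelihoods,
                 x: Sequence[int]) -> int:
    """argmax_i of log p(C = c_i) + sum_j log p(X_j = x_j | C = c_i)."""
    scores = [
        math.log(priors[i])
        + sum(math.log(likelihoods[i][j][xj]) for j, xj in enumerate(x))
        for i in range(len(priors))
    ]
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up model with k = 2 classes and d = 2 binary features.
PRIORS = [0.6, 0.4]
LIKELIHOODS = [
    [[0.8, 0.2], [0.7, 0.3]],  # class 0
    [[0.3, 0.7], [0.4, 0.6]],  # class 1
]
```

For the input `(1, 1)` the product scores are 0.6 · 0.2 · 0.3 = 0.036 for class 0 and 0.4 · 0.7 · 0.6 = 0.168 for class 1, so both functions return class 1.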

SLIDE 50

Naïve Bayes Classifier

            Computation
k     d     Client     Server     Running Time   Comm.      Inter.
2     9     150 ms     104 ms     479 ms         72.47 kB   14
5     9     537 ms     368 ms     1415 ms        150.7 kB   42
24    70    1652 ms    1664 ms    3810 ms        1911 kB    166

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

SLIDE 51

Decision Tree

Combination of other classifiers
In this example, linear classifiers
Linear classifier + ES Switching + Decision Trees

SLIDE 52

Decision Tree

Tree Specs.   Computation          Time / Protocol           FHE
N     D       Client     Server    Lin. Class.   ES Switch   Eval      Decrypt     Comm.     Inter.
4     4       1579 ms    798 ms    446 ms        1639 ms     239 ms    33.51 ms    2639 kB   30
6     4       2297 ms    1723 ms   1410 ms       7406 ms     899 ms    35.1 ms     3555 kB   44

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

SLIDE 53

In conclusion

Composable building blocks for secure classifiers
Practical performance

Future work:

Fewer round trips (work on the protocols)
More parallelism (work on the implementation)

SLIDE 54


Questions?