Sequence Obfuscation to Thwart Pattern Matching Attacks


slide-1
SLIDE 1

Sequence Obfuscation to Thwart Pattern Matching Attacks

Bo Guan, Nazanin Takbiri, Dennis L. Goeckel, Amir Houmansadr, Hossein Pishro-Nik
University of Massachusetts Amherst
IEEE International Symposium on Information Theory (ISIT), Los Angeles, California, June 2020

slide-2
SLIDE 2

Internet of Things

Privacy Threats in Internet of Things Applications

slide-3
SLIDE 3

Mathematical Model of Problem

LBS (Location Based Applications)

  • Data is created in the form of long time series.
  • The collected data of user $u$ at different times: $X_u = [X_u(1), X_u(2), X_u(3), \ldots]$.
  • Sharing $X_u$ is needed for utility, but the user loses privacy.

slide-4
SLIDE 4

Motivation

Users: User 1, User 2, ..., User $n$. Sequences: Sequence 1, Sequence 2, ..., Sequence $n$.

Adversary: match prior behaviors (user profiles) with realizations $X_u = [X_u(1), X_u(2), X_u(3), \ldots]$.

User $u$ could be identified by a habitual pattern $q_u^{(1)} q_u^{(2)} \cdots q_u^{(l)}$.

slide-5
SLIDE 5

Contribution

  • Our goal is to provide a privacy-preserving mechanism (PPM), even if we do not know the exact statistical model of the data points.
  • We propose an obfuscation method that confuses an adversary's pattern matching attack.
  • This method achieves privacy by generating all the possible patterns within the obfuscation noise.

slide-6
SLIDE 6

Overview

  • I. System Model and Definitions
  • II. Privacy Guarantee for Model-Free PPMs
  • III. Numerical Results
  • IV. Conclusion

slide-7
SLIDE 7

Overview

  • I. System Model and Definitions
  • II. Privacy Guarantee for Model-Free PPMs
  • III. Numerical Results
  • IV. Conclusion

slide-8
SLIDE 8

Data Point Model

$\mathbf{X}_u = [X_u(1), X_u(2), \cdots, X_u(m)]^T$, $\mathbf{X} = [\mathbf{X}_1, \mathbf{X}_2, \cdots, \mathbf{X}_n]$, where $X_u(k)$ is a data point of user $u$.

  • Here we make no assumptions about the statistical model of the data points.
  • Instead, we assume only that there are $r \geq 2$ possible values for each user's data points, in a finite set $\mathcal{R} = \{0, 1, \cdots, r-1\}$.

slide-9
SLIDE 9

Definition 1. A pattern is a sequence $Q = q^{(1)} q^{(2)} \cdots q^{(l)}$, where $q^{(i)} \in \{0, 1, \cdots, r-1\}$ for any $i \in \{1, 2, \cdots, l\}$. A user $u$ is said to have the pattern $Q$ if:

  • The sequence $Q$ is a subsequence (not necessarily consecutive) of user $u$'s (obfuscated) sequence, and
  • For $i \in \{1, 2, \cdots, l-1\}$, $q^{(i)}$ and $q^{(i+1)}$ appear in the (obfuscated) sequence of user $u$ with distance less than or equal to $h$.

For instance, consider the following fragment of a data sequence: ... 3 1 4 2 6 3 4 5 6 ... If we set $h = 3$, the pattern 164 appears in the fragment, but the pattern 134 does not, since the nearest 3 after the 1 is four positions away.
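Definition 1 can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation; the function name `has_pattern` and the reading of "distance" as the difference between positions are assumptions made for illustration.

```python
def has_pattern(seq, pattern, h):
    """Return True if `pattern` occurs in `seq` as a subsequence whose
    consecutive matched positions are at most `h` apart (Definition 1)."""
    if not pattern:
        return True
    # Positions where the first pattern symbol can be matched.
    reachable = [i for i, x in enumerate(seq) if x == pattern[0]]
    for symbol in pattern[1:]:
        nxt = set()
        for i in reachable:
            # Look at most h positions ahead of a previously matched position.
            for j in range(i + 1, min(i + h, len(seq) - 1) + 1):
                if seq[j] == symbol:
                    nxt.add(j)
        reachable = sorted(nxt)
        if not reachable:
            return False
    return bool(reachable)


# The fragment from the slide, with h = 3.
fragment = [3, 1, 4, 2, 6, 3, 4, 5, 6]
print(has_pattern(fragment, [1, 6, 4], h=3))  # True
print(has_pattern(fragment, [1, 3, 4], h=3))  # False
```

Tracking every reachable position, rather than only the earliest match, matters here: a greedy earliest match can miss a valid occurrence when a later match of the same symbol is the one within distance $h$ of the next symbol.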

slide-10
SLIDE 10

Definition 2. A sequence is an $(r, l)$-superstring if it contains all possible $r^l$ length-$l$ strings on a size-$r$ alphabet $\mathcal{R}$ as its contiguous substrings. Note that:

  • Cyclic tail-to-head ligation is not allowed here.
  • Repeated symbols for the substrings are allowed.

For instance, the following sequence is a $(2, 3)$-superstring, since it contains all possible length-3 strings over $\mathcal{R} = \{0, 1\}$: 000001010011100101110111
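The superstring property in Definition 2 is easy to verify by enumeration. The sketch below is illustrative only; the helper name `is_superstring` is an assumption, and the check deliberately uses no wrap-around, matching the note that cyclic ligation is not allowed.

```python
from itertools import product


def is_superstring(seq, r, l):
    """Check that every length-l string over {0, ..., r-1} occurs in `seq`
    as a contiguous substring (no cyclic wrap-around)."""
    windows = {tuple(seq[i:i + l]) for i in range(len(seq) - l + 1)}
    return all(w in windows for w in product(range(r), repeat=l))


# The binary example from the slide: all 8 strings of length 3.
example = [int(c) for c in "000001010011100101110111"]
print(is_superstring(example, r=2, l=3))  # True
```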
slide-11
SLIDE 11

Obfuscation Mechanism

$\mathbf{Z}_u = [Z_u(1), Z_u(2), \cdots, Z_u(m)]^T$, $\mathbf{Z} = [\mathbf{Z}_1, \mathbf{Z}_2, \cdots, \mathbf{Z}_n]$.

When $W_u(k) = 1$, the data point $X_u(k)$ is replaced by $a_u(j)$, the $j$-th element of the superstring sequence $\mathbf{a}_u$, where $j = \sum_{i=1}^{k} W_u(i)$.

[Figure: example with $W_u(k) = 1$ at $k = 3, 5, 9$]

$\mathbf{X}_u$: $X_u(1)$ $X_u(2)$ $X_u(3)$ $X_u(4)$ $X_u(5)$ $X_u(6)$ $X_u(7)$ $X_u(8)$ $X_u(9)$ ...
$\mathbf{Z}_u$: $X_u(1)$ $X_u(2)$ $a_u(1)$ $X_u(4)$ $a_u(2)$ $X_u(6)$ $X_u(7)$ $X_u(8)$ $a_u(3)$ ...
$j$: $1$ at $k = 3$, $1 + 1$ at $k = 5$, $1 + 1 + 1$ at $k = 9$, ...

slide-12
SLIDE 12

Anonymization Mechanism

[Figure: the data sequences of Alice, Bob, and Carol before and after anonymization; the order of the sequences is permuted.]
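The anonymization step on this slide amounts to applying a random permutation to the users' sequences. A minimal sketch follows, with the function name `anonymize` and the uniform choice of permutation as assumptions.

```python
import random


def anonymize(sequences, rng=None):
    """Return the sequences in a random order, plus the permutation used.

    Assumption: the permutation is drawn uniformly at random. In the
    adversary model, the permutation itself is hidden from the adversary.
    """
    rng = rng or random.Random()
    perm = list(range(len(sequences)))
    rng.shuffle(perm)
    return [sequences[i] for i in perm], perm


users = {"Alice": [1, 2, 3], "Bob": [4, 5, 6], "Carol": [7, 8, 9]}
released, secret_perm = anonymize(list(users.values()))
print(released)
```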

slide-13
SLIDE 13

Adversary Model

Adversary does know:

  • The sequence $Y_u(1), Y_u(2), \cdots, Y_u(m)$ for each user $u$.
  • The privacy-preserving mechanism: obfuscation + anonymization.
  • The identifying pattern of each user $u$.

Adversary does NOT know:

  • The permutation employed in the anonymization.
  • The actual superstring employed for obfuscation.

slide-14
SLIDE 14

Other Approaches

"Perfect privacy" is proposed and achieved in Takbiri's journal paper [1]:

  • A statistical model is assumed for the data points.
  • However, the statistical model may be unknown or unavailable.

In this paper:

  • We propose a new PPM based on a model-free approach.
  • The PPM protects against a specific attack: pattern matching.

[1] Takbiri, Nazanin, et al. "Matching anonymized and obfuscated time series to users' profiles." IEEE Transactions on Information Theory 65.2 (2018): 724-741.
slide-15
SLIDE 15

Definition 3. User $v$ with data pattern $q_v^{(1)} q_v^{(2)} \cdots q_v^{(l)}$ has $\epsilon$-privacy if:

  • for any other user $u$, the probability that user $u$ has the same pattern as user $v$ in their obfuscated data sequence is at least $\epsilon$.

Definition 4. $\mathbb{P}(\mathcal{B}_u)$ is defined as the probability that the obfuscated sequence $\mathbf{Z}_u$ has user 1's identifying pattern due to obfuscation by an $(r, l)$-superstring with length $l r^l$.

  • The superstring is obtained by arranging all the possible $r^l$ substrings without overlapping.

For instance, a $(3, 2)$-superstring, which contains all possible length-2 strings over $\mathcal{R} = \{0, 1, 2\}$: 001122011002201221
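The non-overlapping construction in Definition 4 can be sketched as a concatenation of all $r^l$ blocks. The function name `concatenated_superstring` is an assumption, and the optional shuffle of the block order is also an assumption, motivated by the adversary model in which the actual superstring is unknown.

```python
from itertools import product
import random


def concatenated_superstring(r, l, rng=None):
    """Build an (r, l)-superstring of length l * r**l by writing all r**l
    length-l strings one after another, without overlap.

    Assumption: if `rng` is given, the block order is shuffled.
    """
    blocks = list(product(range(r), repeat=l))
    if rng is not None:
        rng.shuffle(blocks)
    return [symbol for block in blocks for symbol in block]


s = concatenated_superstring(3, 2, rng=random.Random(0))
print(len(s))  # 18, which equals l * r**l for r = 3, l = 2

# The slide's example is one such arrangement of the nine length-2 blocks.
slide = "001122011002201221"
print(len({slide[i:i + 2] for i in range(0, len(slide), 2)}))  # 9
```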

slide-16
SLIDE 16

Definition 5. $\mathbb{P}(\mathcal{B}'_u)$ is defined as the probability that the obfuscated sequence $\mathbf{Z}_u$ has user 1's identifying pattern due to obfuscation by the shortest $(r, l)$-superstring with length $f(r, l) = r^l + l - 1$.

  • The superstring is constructed from a De Bruijn sequence, which is the shortest cyclic sequence containing all the possible substrings.

For instance, a shortest $(3, 2)$-superstring constructed by using the De Bruijn sequence $B(3, 2)$: $B(3, 2)$ = "001021122", which gives the superstring 0010211220.

slide-17
SLIDE 17

Overview

  • I. System Model and Metrics
  • II. Privacy Guarantee for Model-Free PPMs
  • III. Numerical Results
  • IV. Conclusion

slide-18
SLIDE 18

Theorem 1. If $\mathbf{Z}$ is the obfuscated version of $\mathbf{X}$, and $\mathbf{Y}$ is the anonymized version of $\mathbf{Z}$ as defined previously, there exists a lower bound $\epsilon$ for the probability $\mathbb{P}(\mathcal{B}_u)$:

$$\mathbb{P}(\mathcal{B}_u) \geq \left[1 - (1 - p_{\mathrm{obf}})^{h}\right]^{l-1} \frac{1}{r^l} \sum_{\alpha=0}^{\min\{r^l - 1,\; G p_{\mathrm{obf}} / l\}} \left(1 - \exp\left(-\frac{\delta_\alpha^2}{2}\, G p_{\mathrm{obf}}\right)\right), \quad (1)$$

where $G = m - h(l - 1)$ and $\delta_\alpha = 1 - \frac{\alpha l}{G p_{\mathrm{obf}}}$, for $\alpha = 0, 1, \cdots, r^l - 1$.
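To get a feel for the bound in Theorem 1, it can be evaluated numerically. The sketch below is illustrative: the function name `theorem1_lower_bound`, the parameter values, and the rounding down of the upper summation limit to an integer are assumptions, not taken from the slides.

```python
import math


def theorem1_lower_bound(m, l, r, h, p_obf):
    """Evaluate the right-hand side of (1) for sequence length m, pattern
    length l, alphabet size r, distance h, and obfuscation probability p_obf."""
    G = m - h * (l - 1)
    mu = G * p_obf  # expected number of obfuscated positions among G
    if mu <= 0:
        return 0.0
    gap_term = (1 - (1 - p_obf) ** h) ** (l - 1)
    # Assumption: the upper summation limit is rounded down to an integer.
    alpha_max = min(r ** l - 1, math.floor(mu / l))
    total = 0.0
    for alpha in range(alpha_max + 1):
        delta = 1 - alpha * l / mu
        total += 1 - math.exp(-(delta ** 2) * mu / 2)
    return gap_term * total / r ** l


# Hypothetical parameters, chosen only for illustration.
for m in (100, 1000):
    print(m, theorem1_lower_bound(m=m, l=2, r=3, h=10, p_obf=0.1))
```

With these hypothetical values the bound grows with the sequence length $m$ and, for large $m$, approaches the first factor $[1 - (1 - p_{\mathrm{obf}})^h]^{l-1}$.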

slide-19
SLIDE 19

The idea behind the proof:

Define two events:

  • $\mathcal{E}_u$: user 1's pattern appears in user $u$'s obfuscated data points $\mathbf{Z}_u$.
  • $\mathcal{F}_u$: the distance between any neighboring points of the pattern in $\mathbf{Z}_u$ is smaller than or equal to $h$.

$\mathcal{E}_u$: $M_{u,1}^{(1)} \leq m - h(l - 1) = G$
$\mathcal{F}_u$: $D_u^{(1)} \leq h;\; D_u^{(2)} \leq h;\; \cdots;\; D_u^{(l-1)} \leq h$

The two events are independent, which achieves a lower bound: $\mathbb{P}(\mathcal{B}_u) \geq \mathbb{P}(\mathcal{E}_u)\, \mathbb{P}(\mathcal{F}_u)$

slide-20
SLIDE 20

The idea behind the proof:

$\mathcal{E}_u$: $M_{u,1}^{(1)} \leq m - h(l - 1) = G$, or equivalently $M_{u,1}^{(1)} + h(l - 1) \leq m$ (reserve enough space, within the sequence of length $m$, for the obfuscation).

$\mathcal{F}_u$: $D_u^{(1)} \leq h;\; D_u^{(2)} \leq h;\; \cdots;\; D_u^{(l-1)} \leq h$ (distance requirement).

The two events are independent, which achieves a lower bound: $\mathbb{P}(\mathcal{B}_u) \geq \mathbb{P}(\mathcal{E}_u)\, \mathbb{P}(\mathcal{F}_u)$

slide-21
SLIDE 21

The idea behind the proof:

Probability of event $\mathcal{E}_u$: it can be categorized on $L_{u,1}$, the location of the pattern head in the superstring. There are $r^l$ possible location choices for the pattern head in the superstring, and they are equally likely:

$\mathbb{P}(L_{u,1} = \alpha l + 1) = \frac{1}{r^l}, \quad \alpha = 0, 1, \cdots, r^l - 1$

For instance, a $(3, 2)$-superstring, which contains all possible length-2 strings over $\mathcal{R} = \{0, 1, 2\}$: 001122011002201221

$\mathbb{P}(\mathcal{E}_u) = \mathbb{P}(\text{at least } L_{u,1} \text{ successes in } G \text{ trials})$

slide-22
SLIDE 22

The idea behind the proof:

By the law of total probability:

$$\mathbb{P}(\mathcal{E}_u) = \sum_{\alpha=0}^{r^l - 1} \mathbb{P}(\text{at least } L_{u,1} \text{ successes in } G \text{ trials} \mid L_{u,1} = \alpha l + 1)\, \mathbb{P}(L_{u,1} = \alpha l + 1)$$

$$= \frac{1}{r^l} \sum_{\alpha=0}^{r^l - 1} \mathbb{P}(\text{at least } \alpha l + 1 \text{ successes in } G \text{ trials}) \quad \text{(equally likely)}$$

$$= \frac{1}{r^l} \sum_{\alpha=0}^{r^l - 1} \left[1 - \mathbb{P}(\text{less than } \alpha l + 1 \text{ successes in } G \text{ trials})\right]$$

slide-23
SLIDE 23

The idea behind the proof: $\mathbb{P}(\mathcal{B}_u) \geq \mathbb{P}(\mathcal{E}_u)\, \mathbb{P}(\mathcal{F}_u)$

By using the Chernoff bound: define the event $\mathcal{A}_\alpha$ with probability $\mathbb{P}(\mathcal{A}_\alpha) = \mathbb{P}(\text{less than } \alpha l + 1 \text{ successes in } G \text{ trials})$.
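The Chernoff step is not written out in the text of this slide. The following is a sketch of how the standard multiplicative Chernoff bound for the lower tail would apply, assuming the number of successes $S$ in the $G$ trials is binomial with success probability $p_{\mathrm{obf}}$; the symbol $S$ is introduced here only for illustration.

```latex
% Assumption: S ~ Binomial(G, p_obf) counts the successes in G trials.
\begin{align*}
\mu &= \mathbb{E}[S] = G\, p_{\mathrm{obf}}, \\
\mathbb{P}(\mathcal{A}_\alpha) &= \mathbb{P}(S \le \alpha l)
  = \mathbb{P}\bigl(S \le (1 - \delta_\alpha)\mu\bigr),
  \qquad \delta_\alpha = 1 - \frac{\alpha l}{G\, p_{\mathrm{obf}}}, \\
\mathbb{P}(\mathcal{A}_\alpha) &\le \exp\!\left(-\frac{\delta_\alpha^{2}}{2}\, G\, p_{\mathrm{obf}}\right)
  \qquad \text{for } 0 \le \delta_\alpha \le 1 .
\end{align*}
```

The condition $\delta_\alpha \ge 0$ is equivalent to $\alpha l \le G p_{\mathrm{obf}}$, which is consistent with the sum in Theorem 1 being truncated at $\min\{r^l - 1,\; G p_{\mathrm{obf}} / l\}$.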

slide-24
SLIDE 24

The idea behind the proof: $\mathbb{P}(\mathcal{B}_u) \geq \mathbb{P}(\mathcal{E}_u)\, \mathbb{P}(\mathcal{F}_u)$

The last step is the probability of event $\mathcal{F}_u$. Thus, by (5), (8) and (9), we obtain (1).
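The expression for the probability of $\mathcal{F}_u$ did not survive in the text of this slide. A sketch that is consistent with the first factor of the bound in Theorem 1, assuming each position is obfuscated independently with probability $p_{\mathrm{obf}}$ so that the gaps between consecutive obfuscated positions are independent and geometric:

```latex
% Assumption: independent obfuscation with probability p_obf per position,
% so each gap D_u^{(i)} between consecutive obfuscated positions is geometric.
\begin{align*}
\mathbb{P}\bigl(D_u^{(i)} \le h\bigr) &= 1 - (1 - p_{\mathrm{obf}})^{h}, \\
\mathbb{P}(\mathcal{F}_u) &= \prod_{i=1}^{l-1} \mathbb{P}\bigl(D_u^{(i)} \le h\bigr)
  = \bigl[1 - (1 - p_{\mathrm{obf}})^{h}\bigr]^{l-1}.
\end{align*}
```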

slide-25
SLIDE 25

Lemma 1. The length of a shortest $(r, l)$-superstring is equal to $r^l + l - 1$. That is, $f(r, l) = r^l + l - 1$.

Constructing the shortest superstring by Lemma 1:

  • A De Bruijn sequence is a shortest cyclic sequence that contains every possible length-$l$ substring, with the substrings overlapping as much as possible.
  • Here we use a De Bruijn sequence to construct the shortest superstring: add the first $(l - 1)$ symbols of the sequence at its end.
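The construction behind Lemma 1 can be sketched with the standard recursive (Lyndon-word based) De Bruijn generator. The function names `de_bruijn` and `shortest_superstring` are assumptions for illustration; the slides do not say which De Bruijn algorithm the authors used.

```python
def de_bruijn(r, l):
    """Return a De Bruijn sequence over {0, ..., r-1} for substrings of
    length l, using the standard recursive Lyndon-word construction."""
    a = [0] * (r * l)
    seq = []

    def db(t, p):
        if t > l:
            if l % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, r):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def shortest_superstring(r, l):
    """Unroll the cyclic sequence by appending its first l - 1 symbols."""
    seq = de_bruijn(r, l)
    return seq + seq[:l - 1]


print("".join(map(str, de_bruijn(3, 2))))             # 001021122
print("".join(map(str, shortest_superstring(3, 2))))  # 0010211220
```

The resulting length is $r^l + (l - 1)$, matching $f(r, l)$ in Lemma 1: the cyclic sequence has one symbol per substring, and the appended prefix replaces the wrap-around that a non-cyclic superstring is not allowed to use.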

slide-26
SLIDE 26

Theorem 2. If $\mathbf{Z}$ is the obfuscated version of $\mathbf{X}$, and $\mathbf{Y}$ is the anonymized version of $\mathbf{Z}$ as defined previously, there exists a lower bound $\epsilon'$ for the probability $\mathbb{P}(\mathcal{B}'_u)$:

$$\mathbb{P}(\mathcal{B}'_u) \geq \left[1 - (1 - p_{\mathrm{obf}})^{h}\right]^{l-1} \frac{1}{r^l} \sum_{\alpha=0}^{\min\{r^l - 1,\; G p_{\mathrm{obf}}\}} \left(1 - \exp\left(-\frac{(\delta'_\alpha)^2}{2}\, G p_{\mathrm{obf}}\right)\right),$$

where $G = m - h(l - 1)$ and $\delta'_\alpha = 1 - \frac{\alpha}{G p_{\mathrm{obf}}}$, for $\alpha = 0, 1, \cdots, r^l - 1$.

This can be proved similarly. In the shortest superstring the patterns overlap as much as possible, so the location index of the pattern head has $r^l$ possible choices, and they are equally likely:

$\mathbb{P}(L_{u,1} = \alpha + 1) = \frac{1}{r^l}, \quad \alpha = 0, 1, \cdots, r^l - 1$

For instance, a shortest $(3, 2)$-superstring constructed by using the De Bruijn sequence $B(3, 2)$: $B(3, 2)$ = "001021122", which gives the superstring 0010211220.
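The bounds of the two theorems differ only in how far apart the possible pattern-head locations are: $l$ symbols for the non-overlapping superstring, one symbol for the shortest superstring. The sketch below evaluates both forms with a single hypothetical helper, `lower_bound`, whose `step` parameter, integer rounding of the summation limit, and example values are assumptions for illustration.

```python
import math


def lower_bound(m, l, r, h, p_obf, step):
    """Evaluate the lower bound with pattern-head locations alpha*step + 1.

    step = l gives the Theorem 1 form (non-overlapping superstring);
    step = 1 gives the Theorem 2 form (shortest superstring).
    """
    G = m - h * (l - 1)
    mu = G * p_obf
    if mu <= 0:
        return 0.0
    gap_term = (1 - (1 - p_obf) ** h) ** (l - 1)
    # Assumption: the upper summation limit is rounded down to an integer.
    alpha_max = min(r ** l - 1, math.floor(mu / step))
    total = sum(
        1 - math.exp(-((1 - alpha * step / mu) ** 2) * mu / 2)
        for alpha in range(alpha_max + 1)
    )
    return gap_term * total / r ** l


# Hypothetical parameters, chosen only for illustration.
eps = lower_bound(m=100, l=2, r=3, h=10, p_obf=0.1, step=2)
eps_prime = lower_bound(m=100, l=2, r=3, h=10, p_obf=0.1, step=1)
print(eps, eps_prime)
```

For these example values the shortest-superstring bound is the larger of the two, since the pattern head is reached after fewer obfuscated positions.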

slide-27
SLIDE 27

Numerical Results

  • $\epsilon$: theoretical lower bound of $\mathbb{P}(\mathcal{B}_u)$ from Theorem 1.
  • $\epsilon'$: theoretical lower bound of $\mathbb{P}(\mathcal{B}'_u)$ from Theorem 2.
  • Increasing $m$ or $p_{\mathrm{obf}}$ achieves a larger lower bound.

slide-28
SLIDE 28

Conclusion

  • We proposed a PPM that can be used to provide privacy guarantees against pattern matching attacks.
  • Our obfuscation method is model-free: it makes no assumption about the statistical model of the users' data points.
  • Numerical results show that, by using the proposed obfuscation method, at least a certain percentage of users have the same pattern as user 1.