CSC2412: Definition of Differential Privacy, Sasho Nikolov (PowerPoint PPT presentation)


SLIDE 1

CSC2412: Definition of Differential Privacy

Sasho Nikolov
SLIDE 4

An Ideal Goal

The study reveals nothing new about any particular individual to an adversary. (Annotation: not much new.)

Example:
  • Adversary believes humans have four fingers on each hand.
  • In particular, believes Sasho has four fingers on each hand.
  • Study reveals distribution of number of fingers per person's hand.
  • Adversary now has learned Sasho probably has five fingers per hand.

Another example:
  • Adversary believes there is no link between smoking and cancer.
  • Also knows that Sasho smokes.
  • Study reveals link between smoking and cancer.

(Annotation: learning about the world also means learning about individuals.)
SLIDE 5

Statistical vs Personal Information

In the examples, the adversary learns statistical information that pertains to Sasho.
  • If science works, it better reveal something about me.

What information is statistical and what information is personal?

Test: could the adversary have learned this information if my data were not analyzed?
  • Yes → statistical (e.g., four vs. five fingers; smoking and cancer).
  • No → personal.
SLIDE 6

Towards a Definition

The algorithm doing the analysis should do almost the same thing in all of the following cases:
  • my data is included in the data set
  • my data is not included in the data set
  • my data is changed in the data set

I.e., what the algorithm publishes does not depend too strongly on my data.
SLIDE 7

Data Model

Data set: a (multi-)set X of n data points, X = {x1, . . . , xn}.
  • each data point (or row) xi is the data of one person
  • each data point comes from a universe X
  • E.g., d binary attributes: xi ∈ X = {0, 1}^d

A data analysis algorithm (a mechanism) is a randomized algorithm M that takes a data set X and produces the results of the data analysis as output.
  • The output M(X) is random for any dataset X.
SLIDE 8

Almost a Definition

We call two data sets X and X′ neighbouring if either
  1. (variable n) we can get X′ from X by adding or removing an element, or
  2. (fixed n) we can get X′ from X by replacing an element with another.

In either case, X and X′ differ in the data of a single individual:
  • variable n: X = {x1, . . . , xn} and X′ = {x1, . . . , xi−1, xi+1, . . . , xn};
  • fixed n: X = {x1, . . . , xn} and X′ = {x1, . . . , xi−1, x′i, xi+1, . . . , xn}.

Definition. A mechanism M is differentially private if, for any two neighbouring datasets X, X′,

  M(X) ≈ M(X′),

i.e., M(X) and M(X′) are "similar" as random variables.
SLIDE 9

Total Variation Distance Differential Privacy

Definition. A mechanism M is δ-TV differentially private if, for any two neighbouring datasets X, X′, and any set of outputs S,

  |P(M(X) ∈ S) − P(M(X′) ∈ S)| ≤ δ.

Equivalently, d_TV(M(X), M(X′)) ≤ δ, where d_TV(M(X), M(X′)) = max_S |P(M(X) ∈ S) − P(M(X′) ∈ S)| is the total variation distance.

What should δ be?
  • δ < 1/(2n)? Then the definition is too strong: for any two datasets X, X′ there are k ≤ n intermediate datasets X = X(1), . . . , X(k) = X′ with consecutive ones neighbouring, so by the triangle inequality |P(M(X) ∈ S) − P(M(X′) ∈ S)| ≤ nδ < 1/2 for all pairs of datasets. The mechanism does almost the same thing on every dataset, and cannot be useful.
  • δ ≥ 1/(2n)? Then the definition is too weak: consider the "name and shame" mechanism, which for each i independently publishes xi with probability δ, and publishes nothing about xi with probability 1 − δ. This mechanism is δ-TV differentially private, yet not intuitively private: some data point is published in the clear with constant probability.
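The "name and shame" argument above can be checked exactly on a toy instance. The sketch below (dataset contents and δ are made-up values) enumerates the full output distribution of the mechanism and computes the total variation distance between M(X) and M(X′) for a neighbouring pair: it comes out to exactly δ, even though every raw record is published with probability δ.

```python
import itertools
import math

def name_and_shame_dist(X, delta):
    """Exact output distribution of the 'name and shame' mechanism:
    each record is independently published with probability delta.
    An output is a tuple whose i-th slot holds X[i] or None (unpublished)."""
    dist = {}
    for mask in itertools.product([0, 1], repeat=len(X)):
        p = math.prod(delta if b else 1 - delta for b in mask)
        out = tuple(x if b else None for x, b in zip(X, mask))
        dist[out] = dist.get(out, 0.0) + p
    return dist

def tv_distance(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

n, delta = 4, 1 / 8              # delta >= 1/(2n) for n = 4
X  = ("a", "b", "c", "d")        # hypothetical records
Xp = ("a", "b", "c", "e")        # neighbouring: last record replaced
dP = name_and_shame_dist(X, delta)
dQ = name_and_shame_dist(Xp, delta)
print(tv_distance(dP, dQ))       # equals delta: the mechanism is delta-TV DP
# ...yet each person's record is published in the clear with probability delta.
```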
SLIDE 10

Finally, Differential Privacy

Definition (Dwork, McSherry, Nissim, Smith 2006). A mechanism M is ε-differentially private if, for any two neighbouring datasets X, X′, and any set of outputs S,

  P(M(X) ∈ S) ≤ e^ε P(M(X′) ∈ S).

Notes:
  • ε is a small positive constant, and e^ε ≈ 1 + ε for small ε.
  • In Vadhan's notes: any conclusion an adversary draws from M(X) could also have been drawn from M(X′).
  • The "name and shame" mechanism fails this definition: for S = {xi is published}, P(M(X) ∈ S) > 0 while P(M(X′) ∈ S) = 0 when X′ omits xi.
  • Take S to be the event that something bad happens to me. Then, for any such S, my risks if my data are used are almost the same as if they are not used.
SLIDE 11

A Hypothesis Testing Viewpoint

(This viewpoint is not essential for the course; see Wasserman and Zhou.)

Suppose X = {X1, . . . , Xn} are drawn IID from some distribution. The adversary A, who sees M(X) and outputs "H0" or "H1", wants to test which hypothesis holds:
  • H0: Xi = y0 (e.g., "Sasho does not smoke")
  • H1: Xi = y1 (e.g., "Sasho smokes")

If M is ε-DP, then for any A,

  P(A(M(X)) = "H1" | H1) ≤ e^ε P(A(M(X)) = "H1" | H0).

The left-hand side is the true positive rate (1 − Type II error); the right-hand side is e^ε times the false positive rate (Type I error).
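As an illustration of the bound, the sketch below computes the true and false positive rates exactly for a single-bit release using randomized response (the mechanism introduced on the next slide) and an adversary that outputs "H1" iff the released bit is 1. The choice ε = 0.5 is arbitrary; the ratio TPR/FPR comes out to exactly e^ε, so the bound is tight here.

```python
import math

eps = 0.5  # arbitrary privacy parameter for the illustration
p = math.exp(eps) / (1 + math.exp(eps))  # prob. the true bit is reported

# Single-bit randomized response: release Y = x w.p. p, and Y = 1 - x w.p. 1 - p.
# Adversary A outputs "H1" (x = 1, "smoker") iff Y = 1.
tpr = p        # P(A says H1 | H1): P(Y = 1 | x = 1)
fpr = 1 - p    # P(A says H1 | H0): P(Y = 1 | x = 0)

print(tpr / fpr)                          # equals e^eps (up to rounding)
assert tpr <= math.exp(eps) * fpr + 1e-12  # the hypothesis-testing bound holds
```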
SLIDE 12

Randomized Response

Given
  • a dataset X = {x1, . . . , xn} ⊆ X,
  • a query q : X → {0, 1} (e.g., q(x) = 1 if x is a smoker),
the mechanism outputs M(X) = (Y1(x1), . . . , Yn(xn)), where, independently,

  Yi(xi) = q(xi) with probability e^ε/(1 + e^ε), and Yi(xi) = 1 − q(xi) with probability 1/(1 + e^ε).

Note that e^ε/(1 + e^ε) > 1/2, so each reported bit is biased towards the truth. [Warner 1965]
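A minimal Python sketch of the mechanism above. The record format and the query `q` (a "smoker" field) are made-up for the example; only the flipping probabilities come from the definition.

```python
import math
import random

def randomized_response(X, q, eps, rng=random):
    """Release one noisy bit per person: report q(x_i) truthfully with
    probability e^eps / (1 + e^eps), and report the flipped bit otherwise."""
    p_true = math.exp(eps) / (1 + math.exp(eps))
    return [q(x) if rng.random() < p_true else 1 - q(x) for x in X]

# Hypothetical usage: each record is a dict with a 'smoker' attribute.
X = [{"smoker": 1}, {"smoker": 0}, {"smoker": 1}]
q = lambda x: x["smoker"]
Y = randomized_response(X, q, eps=1.0)
print(Y)  # a list of 3 bits; each is the true answer with prob. e/(1+e)
```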
SLIDE 13

Privacy Analysis

It is enough to show that for any y ∈ {0, 1}^n and any neighbouring X, X′,

  P(M(X) = y) ≤ e^ε P(M(X′) = y).

Indeed, since the output space is discrete, for any set S of outputs,

  P(M(X) ∈ S) = Σ_{y ∈ S} P(M(X) = y) ≤ Σ_{y ∈ S} e^ε P(M(X′) = y) = e^ε P(M(X′) ∈ S).

Take some neighbouring X, X′ differing in the i-th element: X = {x1, . . . , xn} and X′ = {x1, . . . , xi−1, x′i, xi+1, . . . , xn}. Since the Yi are independent,

  P(M(X) = y) = P(Y1(x1) = y1) · · · P(Yi(xi) = yi) · · · P(Yn(xn) = yn),
  P(M(X′) = y) = P(Y1(x1) = y1) · · · P(Yi(x′i) = yi) · · · P(Yn(xn) = yn).

The two products differ only in the i-th factor, and

  P(Yi(xi) = yi) / P(Yi(x′i) = yi) ≤ (e^ε/(1 + e^ε)) / (1/(1 + e^ε)) = e^ε,

so P(M(X) = y) ≤ e^ε P(M(X′) = y), as required.
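Since the output space is finite, the inequality can be verified exhaustively for a small instance. The sketch below (the particular bit vectors and ε = 1 are arbitrary choices) computes P(M(X) = y) / P(M(X′) = y) over all 2^3 outputs for one neighbouring pair; the maximum ratio is exactly e^ε, attained when yi matches q(xi) but not q(x′i).

```python
import itertools
import math

def output_prob(bits, y, eps):
    """P(M(X) = y) for randomized response, given the true bits q(x_i)."""
    p = math.exp(eps) / (1 + math.exp(eps))
    return math.prod(p if yi == b else 1 - p for b, yi in zip(bits, y))

eps = 1.0
bits, bits_prime = (1, 0, 1), (1, 0, 0)   # neighbouring: differ in one record
ratios = [output_prob(bits, y, eps) / output_prob(bits_prime, y, eps)
          for y in itertools.product((0, 1), repeat=3)]
print(max(ratios))  # equals e^eps (up to rounding): the DP bound is tight
```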
SLIDE 14

Accuracy Analysis

We want to approximate q(X) = (1/n) Σ_{i=1}^n q(xi), e.g., the fraction of smokers.

Claim: (1/n) Σ_{i=1}^n Zi ≈ q(X), where Zi = ((1 + e^ε)Yi − 1)/(e^ε − 1).

Indeed,

  E[Yi] = q(xi) · e^ε/(1 + e^ε) + (1 − q(xi)) · 1/(1 + e^ε),

so E[Zi] = q(xi), and E[(1/n) Σ_{i=1}^n Zi] = q(X).

The Z1, . . . , Zn are independent, and each Zi lies in the interval [−1/(e^ε − 1), e^ε/(e^ε − 1)], of width (1 + e^ε)/(e^ε − 1). By Hoeffding's inequality,

  P(|(1/n) Σ_{i=1}^n Zi − q(X)| ≥ t) ≤ 2 exp(−2nt²(e^ε − 1)²/(1 + e^ε)²),

which is at most β as soon as n ≥ log(2/β)(1 + e^ε)²/(2t²(e^ε − 1)²) = Θ(log(1/β)/(ε²t²)) for small ε.
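The debiased estimator from the claim can be tried out numerically. In the sketch below the dataset (30% "smokers", n = 100,000) and ε = 1 are made-up choices; the point is that the average of the Zi lands close to the empirical mean q(X), with error on the order of 1/(ε√n) as the Hoeffding bound predicts.

```python
import math
import random

def estimate_mean(X, q, eps, rng):
    """Run randomized response on each record, then debias:
    Z_i = ((1 + e^eps) * Y_i - 1) / (e^eps - 1) has E[Z_i] = q(x_i)."""
    e = math.exp(eps)
    p_true = e / (1 + e)
    zs = []
    for x in X:
        y = q(x) if rng.random() < p_true else 1 - q(x)
        zs.append(((1 + e) * y - 1) / (e - 1))
    return sum(zs) / len(zs)

rng = random.Random(0)               # seeded for reproducibility
eps, n = 1.0, 100_000
X = [rng.random() < 0.3 for _ in range(n)]  # hypothetical data: ~30% smokers
q = lambda x: int(x)
true_mean = sum(map(q, X)) / n
est = estimate_mean(X, q, eps, rng)
print(abs(est - true_mean))  # small: concentrates around q(X)
```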