SLIDE 1
Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden - - PowerPoint PPT Presentation
Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden - - PowerPoint PPT Presentation
Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Stenography A social network occurs anywhere there is social interaction between people. Examples include Email, instant messaging, Facebook, blogging
SLIDE 2
SLIDE 3
SLIDE 4
The structure of social networks can be
interesting
How are friendships usually structured? Are there hubs, such as Heather, who connect separate networks? How many degrees of Kevin Bacon? We can investigate these questions if we have the data to mine.
SLIDE 5
For our examples, we will use a network
- f emails sent between users.
How do we protect users’ privacy while
still releasing the data for research?
John Mary Vertex Vertex Directed edge
SLIDE 6
Remove any identifiable information, such
as name and other attributes.
Randomly rename the vertices
R3579X R73313
SLIDE 7
Convert directed edges to undirected
- edges. This increases the complexity and
makes it harder to attack.
R3579X R73313 Undirected edge
SLIDE 8
Let’s say you want to know if
two vertices are connected on the graph.
All the identifying info has been
removed, so how do we do it?
SLIDE 9
An active attack involves the adversary
creating vertices in the graph before the graph is released
The adversary will create edges between
the vertices in a fashion that it can then recognize later on in when the graph is released
SLIDE 10
We create k new vertices around 2*(log n)
where n is the total number of vertices
We create new do – d1 edges between
these new vertices and the other ones in the graph
Then, we randomly create edges between
these new nodes with independent probability of 1/2
SLIDE 11
Given the graph, how do we find the
subgraph that we created?
Create a search tree, pruning the tree
based on the properties of our subgraph, such as the number of degrees of our new vertices
SLIDE 12
Tom John Mary Mike Zoe
SLIDE 13
Tom John Mary k5 k1 k2 k4 k3 Mike Zoe
SLIDE 14
Tom John Mary k5 k1 k2 k4 k3 Mike Zoe
SLIDE 15
Tom John Mary k5 k1 k2 k4 k3 Mike Zoe
SLIDE 16
JKL ZXCV QWER DFG WER UYT ASD HGF ASDF BNM
SLIDE 17
JKL ZXCV QWER k5 k1 k2 k4 k3 ASDF BNM
SLIDE 18
JKL John Mary k5 k1 k2 k4 k3 ASDF BNM
SLIDE 19
The paper proves that the search tree
does not grow too large and that the algorithm displays good performance
Also, it proves that the subgraph is unique
so that we don’t identify the wrong subgraph
SLIDE 20
They simulate an attack on LiveJournal
friendship links. They create the accounts
- n the website, make the connections,
and then crawl the site and anonymize the data
The network has 4.4 million nodes and 77
million edges
SLIDE 21
SLIDE 22
Only needs sqrt(log(n)) new nodes to
attack the graph
However, it’s much more computationally
intensive and less practical in the real world, although it takes less nodes
SLIDE 23
SLIDE 24
It’s a lot like an active attack, except you
don’t create new nodes, instead you collaborate with your friends and find yourselves in the graph
However, because you did not specifically
target certain people, you may not be able to identify other people when you find yourself
SLIDE 25