Stadium: A Distributed Metadata-private Messaging System




slide-1
SLIDE 1

Stadium: A Distributed Metadata-private Messaging System

Nirvan Tyagi, Yossi Gilad, Derek Leung, Matei Zaharia, Nickolai Zeldovich

SOSP 2017

slide-2
SLIDE 2

Previous talk: Anonymous broadcast

slide-3
SLIDE 3

This talk: Private messaging

[Figure: Alice and Bob]

slide-4
SLIDE 4

Problem: Communication metadata

[Figure: Alice and Bob; label: oncologist]

slide-5
SLIDE 5

Goal: Hiding communication metadata

[Figure: Alice and Bob communicating through Stadium; label: oncologist]

slide-8
SLIDE 8

Related work

  • Metadata-private systems with cryptographic security are limited in throughput: ~ 1.5 - 65 K messages / min
    ○ Dissent [OSDI’12], Riposte [S&P’15], Pung [OSDI’16], Atom [SOSP’17]
  • Throughput increased by relaxing guarantees to differential privacy
    ○ Vuvuzela [SOSP’15]: ~ 2 M messages / min
    ○ Stadium [SOSP’17]: > 10 M messages / min

Stadium: first metadata-private messaging system to scale horizontally

slide-11
SLIDE 11
Vuvuzela: Differentially private messaging

  • Dead-drops: virtually hosted addresses at which user messages are exchanged
  • Mixnet: servers re-randomize and permute messages
  • Noise: servers add fake messages to obscure adversary observations
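The "re-randomize and permute" step can be illustrated with a toy mix server. This is only a sketch under stated assumptions: it uses textbook ElGamal over a small prime field with a single key pair, and the names `P`, `G`, `keygen`, `encrypt`, `rerandomize`, `decrypt`, and `mix` are hypothetical. It is not the encryption scheme or code of Vuvuzela or Stadium, and the parameters are not secure.

```python
import random

# Toy parameters (assumptions for illustration only; not secure).
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3           # base element chosen for this sketch


def keygen(rng):
    """Return a (secret, public) ElGamal key pair."""
    x = rng.randrange(2, P - 1)
    return x, pow(G, x, P)


def encrypt(pub, m, rng):
    """Encrypt an integer message m with 1 <= m < P."""
    r = rng.randrange(2, P - 1)
    return pow(G, r, P), (m * pow(pub, r, P)) % P


def rerandomize(pub, ct, rng):
    """Fold fresh randomness into a ciphertext without changing its plaintext."""
    c1, c2 = ct
    s = rng.randrange(2, P - 1)
    return (c1 * pow(G, s, P)) % P, (c2 * pow(pub, s, P)) % P


def decrypt(sec, ct):
    """Recover the plaintext; the inverse uses Fermat's little theorem (P is prime)."""
    c1, c2 = ct
    shared = pow(c1, sec, P)
    return (c2 * pow(shared, P - 2, P)) % P


def mix(pub, batch, rng):
    """One mix step: re-randomize every ciphertext, then permute the batch."""
    out = [rerandomize(pub, ct, rng) for ct in batch]
    rng.shuffle(out)
    return out
```

Decrypting the output of `mix` yields the same set of plaintexts as the input batch, but the ciphertexts and their order have changed, which is what keeps an observer from matching outputs to inputs by inspection.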

slide-12
SLIDE 12

Scaling limitations

  • Every server handles all messages
  • Running a server is expensive (e.g. 2M users / minute = 1.3 Gbps)
slide-13
SLIDE 13

Challenge: How to distribute workload across untrustworthy servers?

1. How to mix messages?
2. How to add noise?

slide-14
SLIDE 14

Stadium design

Collaborative noise generation + verifiable parallel mixnet

slide-17
SLIDE 17

Contributions

  • Stadium design

    ○ Parallel mixnet
    ○ Collaborative noise generation
    ○ Verifiable processing including fast zero-knowledge proofs of shuffle

  • Multidimensional differential privacy analysis
  • Implementation and evaluation of prototype

10 M messages/min with per-server costs of ~100 Mbps

slide-18
SLIDE 18

Parallel mixnets with cryptographic security of mixing have large depth.

  • Iterated butterfly topology [ICALP ‘14] as used by Atom [SOSP ‘17]
  • Large depth not good for low latency applications

[Figure: iterated butterfly network; one butterfly iteration is repeated; axis: # of servers]

slide-19
SLIDE 19

Stadium uses 2-layer mixnet with differential privacy analysis.

slide-23
SLIDE 23

Traffic analysis attacks take advantage of uneven routings.

  • Trace messages by modeling likely paths through mixnet ( Borisov [PET ‘05] )
  • Even if links are padded with dummy messages, adversary can incorporate adversary-known inputs and outputs to infer uneven routing

slide-24
SLIDE 24
Add noise messages to provide differential privacy for uneven routings.

  • Adversary manipulates padding through known message injection
  • Unlike padding, noise messages are independent of adversary action

slide-25
SLIDE 25

Noising internal links not helpful if messages aren’t mixed.

  • Adversary learns path of all messages through compromised servers
slide-27
SLIDE 27
Ensure mixing by organizing providers into small groups of servers.

  • Probability that a randomly assigned group is compromised falls exponentially with group size
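A back-of-the-envelope sketch of why small groups suffice. It assumes, for illustration, that each server is malicious independently with the same probability and that a group is compromised only when every member is malicious; the function name and the 20% figure are hypothetical, not numbers from the talk.

```python
def group_compromise_probability(malicious_fraction: float, group_size: int) -> float:
    """Probability that all servers in a randomly assigned group are malicious.

    Assumes independent assignment: each member is malicious with
    probability `malicious_fraction`.
    """
    if not 0.0 <= malicious_fraction <= 1.0:
        raise ValueError("malicious_fraction must be in [0, 1]")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return malicious_fraction ** group_size


# Each added group member multiplies the probability by the malicious fraction.
for size in (1, 2, 4, 8, 16):
    print(size, group_compromise_probability(0.2, size))
```

Under these assumptions each additional member multiplies the compromise probability by the malicious fraction, which is the exponential decay the slide refers to.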

slide-28
SLIDE 28

Problem: Scaling noise generation

[Figure: # of fake messages generated by a Vuvuzela server]

slide-30
SLIDE 30

Problem: Distributed noise generation

[Figure: # of fake messages generated by each Stadium server, and the aggregate probability distribution]

slide-31
SLIDE 31

Poisson distribution for distributed noise generation

Noise mechanisms compared: Laplace, Gaussian, Poisson
Properties compared: Additive, Discrete, Non-negative

  • Poisson provides all properties nicely
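The additive property can be checked numerically: the sum of two independent Poisson variables with rates a and b is Poisson with rate a + b, so noise drawn independently by several servers aggregates to a Poisson total. The sketch below convolves two Poisson probability mass functions and compares the result with the PMF at the summed rate. The function names and rates are hypothetical, chosen only for illustration.

```python
import math


def poisson_pmf(rate: float, k: int) -> float:
    """Probability that a Poisson(rate) variable equals k."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def convolved_pmf(rate_a: float, rate_b: float, k: int) -> float:
    """Probability that two independent Poisson draws sum to k."""
    return sum(
        poisson_pmf(rate_a, i) * poisson_pmf(rate_b, k - i) for i in range(k + 1)
    )


def max_gap(rate_a: float, rate_b: float, upto: int = 40) -> float:
    """Largest difference between the convolution and Poisson(rate_a + rate_b)."""
    return max(
        abs(convolved_pmf(rate_a, rate_b, k) - poisson_pmf(rate_a + rate_b, k))
        for k in range(upto)
    )


# The gap is at floating-point rounding level for any pair of rates.
print(max_gap(3.0, 4.5))
```

Poisson samples are also non-negative integers, so each server's draw can be used directly as a count of fake messages.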

slide-32
SLIDE 32

Multidimensional analysis for reducing noise requirements

  • When a user changes communication pattern, only a few links are affected
  • Reduce noise by a factor that depends on the probability that a link is affected
slide-33
SLIDE 33
Verifiable processing pipeline

  • Ensure noise messages stay in system
  • Utilize various cryptographic zero knowledge proofs of integrity
  • Hybrid verification scheme
  • Zero knowledge proof of shuffle is bottleneck processing cost
    ○ Multicore Bayer-Groth verifiable shuffle on Curve25519
    ○ ~ 20X performance speedup over state of the art
    ○ E.g. 100K ciphertext shuffle speedup from 128 seconds to ~7 seconds

slide-34
SLIDE 34

Implementation

  • Prototype

    ○ Control and networking logic in Go (2500 lines of code)
    ○ Verifiable processing protocols in C++ (9000 lines of code)
      ■ Highly optimized Bayer-Groth verifiable shuffle implementation
    ○ Available at github.com/nirvantyagi/stadium

slide-35
SLIDE 35

Evaluation

  • Recall goal: horizontal scalability with inexpensive servers
  • What is the cost of operating a Stadium server?
  • Does Stadium horizontally scale?
slide-36
SLIDE 36

Evaluation methodology

  • Deploy Stadium on up to 100 Amazon c4.8xlarge EC2 VMs

    ○ 36 virtual cores, 60 GB memory
    ○ US East region
    ○ Message size: 144 B

  • Extrapolate scaling patterns to larger deployment sizes
slide-37
SLIDE 37

Operating costs of a Stadium server are relatively small

  • Bandwidth is dominant cost
  • Operating costs ~ $110 / month*
  • Top 300 relays in Tor offer > 140 Mbps

88 - 173 Mbps: 6-13% of Vuvuzela’s 1.3 Gbps

*W. Norton. 2010. Internet Transit Prices - Historical and Projected. Technical Report. http://drpeering.net/white-papers/Internet-Transit-Pricing-Historical-And-Projected.php

slide-38
SLIDE 38

Messages are effectively distributed across servers to reduce latency

[Figure: Stadium]

slide-39
SLIDE 39

Conclusion

  • Stadium: high-throughput, horizontally-scaling, metadata-private system

    ○ Verifiable parallel mixnet resistant to traffic analysis
    ○ Fast zero-knowledge proofs of shuffle
    ○ Collaborative noise generation with Poisson distribution

  • Multidimensional differential privacy analysis
  • Implementation and evaluation of prototype

Prototype at github.com/nirvantyagi/stadium

slide-40
SLIDE 40

Reserve Slides

slide-41
SLIDE 41

Dead-drop message exchange

[Figure: three dead-drop accesses; two to d4cf2802a26e60e489a0b6949a8d881c and one to e0784f9889a878fdb3c6c27d6a8318fb]

slide-42
SLIDE 42

Dead-drop message exchange

slide-43
SLIDE 43

Dead-drop message exchange

Easy to observe conversations

slide-44
SLIDE 44

Dead-drop message exchange

[Figure: dead-drops d4cf2802a26e60e... and e0784f9889a878f...]

slide-45
SLIDE 45

Dead-drop message exchange

slide-46
SLIDE 46

Dead-drop message exchange

[Figure: dead-drops d4cf2802a26e60e... and e0784f9889a878f... with access counts 2 and 1]

Dead-drop access counts reveal conversation

slide-47
SLIDE 47

Dead-drop message exchange

[Figure: dead-drops d4cf2802a26e60e... and e0784f9889a878f... with access counts 2 and 1]

Dead-drop access counts reveal conversation

Add “noise” to access counts with fake messages!
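A minimal sketch of what an observer sees, assuming the observer learns only how many dead-drops were accessed once and how many were accessed twice. The function name, the dead-drop labels, and the noise counts are hypothetical; in a real system the noise counts would be drawn at random by the servers.

```python
from collections import Counter


def access_histogram(real_accesses, noise_singles=0, noise_pairs=0):
    """Count dead-drops by number of accesses, as an observer would.

    real_accesses: one dead-drop ID per user access.
    noise_singles: fake dead-drops that receive exactly one access.
    noise_pairs:   fake dead-drops that receive exactly two accesses.
    """
    per_drop = Counter(real_accesses)
    histogram = Counter(per_drop.values())
    histogram[1] += noise_singles
    histogram[2] += noise_pairs
    return histogram


# Two users share "drop_a" (a conversation); a third user accesses "drop_b" alone.
accesses = ["drop_a", "drop_a", "drop_b"]
print(access_histogram(accesses))
print(access_histogram(accesses, noise_singles=5, noise_pairs=7))
```

Without noise, the single dead-drop with two accesses reveals that a conversation took place. With fake accesses added, the observed counts no longer pin down the number of real conversations.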

slide-50
SLIDE 50

Differential Privacy

[Figure: probability vs. # of 2-message dead-drops, for Pr[Alice talking to Bob] and Pr[Alice not talking to Bob]; axis marks at 1 (no noise) and ~1000 (with noise)]
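For reference, the standard definition of differential privacy can be written as below. The symbols are assumptions not present on the slides: M is the randomized system, x and x' are two worlds that differ in one user's behavior (for example, Alice talking to Bob versus not), S is any set of adversary observations (for example, counts of 2-message dead-drops), and ε and δ are the privacy parameters.

```latex
\Pr[\,M(x) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(x') \in S\,] + \delta
```

Read against the figure, noise makes the two distributions overlap, so any given observation is nearly as likely whether or not Alice is talking to Bob.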