SLIDE 1

The Price of Data

Simone Galperti Aleksandr Levkun Jacopo Perego

UC San Diego UC San Diego Columbia University

October 2020

SLIDE 2

Motivation

introduction

Data has become an essential input in modern economies. Few formal markets for data; often data is collected “for free” (Posner-Weyl ’18)

Question: what is the individual value of a datapoint? → price
▶ value that each datapoint in a database individually generates for its owner? ⇝ WTP for an additional datapoint
▶ drivers of prices?
▶ effects of privacy concerns?
▶ compensating data sources for their data?

SLIDE 3

This Paper

introduction

Simple insight:
▶ data-pricing problem intimately related to how the owner uses data, given her objective
▶ combine data as inputs to produce actionable information
▶ to make own decisions or to influence others’ decisions

⇒ data usage: a mechanism/information design problem
▶ when carefully formulated, pricing and usage problems are in a special mathematical relationship: duals

Goals for today:
  • 1. formalize the data usage–pricing relationship + novel interpretation
  • 2. (preliminary) characterization of price determinants and properties
  • 3. showcase properties through examples
SLIDE 4

Related Literature

introduction

Mechanism Design: Myerson (’82, ’83) ...
Information Design: Kamenica & Gentzkow (’11), Bergemann & Morris (’16, ’19) ...
Duality & Correlated Equilibrium: Nau & McCardle (’90), Nau (’92), Hart & Schmeidler (’89), Myerson (’97)
Duality & Bayesian Persuasion: Kolotilin (’18), Dworczak & Martini (’19), Dizdar & Kovac (’19), Dworczak & Kolotilin (’19)
Markets for Information: Bergemann & Bonatti (’15), Bergemann, Bonatti, Smolin (’18), Posner & Weyl (’18), Bergemann & Bonatti (’19)
Information Privacy: Acquisti, Taylor, Wagman (’16), Ali, Lewis, Vasserman (’20), Bergemann, Bonatti, Gan (’20), Acemoglu, Makhdoumi, Malekian, Ozdaglar (’20)

This Paper

− formulation of data usage − subclass of data usage
− duality to characterize CE − feasible mechanisms for principal
− dual not as a solution method, but as focus of analysis
− independent economic question − games, mechanisms
− individual prices of data
− formal method for assessing effects of privacy on value of data
SLIDE 5

illustrative example

SLIDE 6

Internet Platform (Bergemann et al. ’15)

example

Internet platform owns data (cookies) about each potential buyer of the product of a monopolistic seller (MC = 0)

Database: big list (continuum) of datapoints = buyer ID and valuation
▶ share µ of datapoints has valuation ω0 = 1
▶ share 1 − µ of datapoints has valuation ω0 = 2

Platform mediates the interaction between each buyer and the seller:
▶ bins buyers into market segments (information production)
▶ discloses segments to seller for setting price a
▶ objective: maximize buyers’ surplus

SLIDE 7

Internet Platform

example

Questions: what price p(ω0)
▶ would capture the individual value that an ω0-datapoint has for the platform?
▶ would/should the platform be willing to spend to add one datapoint with valuation ω0 to the database?

Broadly refer to these questions as the data-pricing problem

p(ω0) not interpreted as a monetary transfer to buyers for their data
▶ important, yet distinct issue (later)

SLIDE 8

Internet Platform

example

Given the optimal segmentation, let v∗(ω0) be the realized surplus of an ω0-buyer

Question: does it make sense to set p(ω0) = v∗(ω0)?

Extreme cases: µ = 1 ⇒ v∗(1) = 0 and µ = 0 ⇒ v∗(2) = 0

If µ ∈ (0, 0.5), optimal market segmentation:

             s′              s′′           v∗(ω0)
ω0 = 1       1               0               0
ω0 = 2    µ/(1−µ)      1 − µ/(1−µ)       µ/(1−µ)
→ a(s)       1               2

Idea: 1-buyers ‘help’ the platform achieve positive surplus with some 2-buyers
Punchline: v∗ misses this, so not a good measure for p(ω0)

SLIDE 9

Internet Platform

example

If µ ∈ (0, 0.5), optimal market segmentation:

             s′              s′′           v∗(ω0)
ω0 = 1       1               0               0
ω0 = 2    µ/(1−µ)      1 − µ/(1−µ)       µ/(1−µ)
→ price      1               2

Idea: 1-buyers ‘help’ the platform achieve positive surplus with some 2-buyers
Our approach will yield p∗(1) = 1 > v∗(1) and p∗(2) = 0 < v∗(2)
▶ 1-datapoints useful ⇝ induce seller to set suboptimal price for 2-buyers
▶ 1-datapoints scarce ‘input’ in database (µ < 0.5)
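The segmentation above can be sanity-checked with a few lines of arithmetic. A minimal sketch, assuming a concrete value µ = 0.4 (the slides leave µ generic) and the standard construction in which segment s′ mixes types so the seller is indifferent between prices 1 and 2:

```python
# Sketch of the optimal segmentation for mu in (0, 0.5).
# Assumption (not fixed on the slide): mu = 0.4, valuations omega0 in {1, 2}.
mu = 0.4

# Segment s': all 1-buyers plus an equal mass of 2-buyers, which makes the
# seller indifferent between prices 1 and 2 there, so the price is 1.
s_prime = {"omega0=1": mu, "omega0=2": mu}   # masses, price a(s') = 1
s_second = {"omega0=2": 1 - 2 * mu}          # remaining 2-buyers, price a(s'') = 2

# Realized surplus per buyer type: 1-buyers pay their value (0 surplus);
# a 2-buyer gets surplus 1 only if she lands in s', which happens to a
# fraction mu / (1 - mu) of the 2-buyers.
v1 = 0.0
v2 = mu / (1 - mu)

# Aggregate buyer surplus equals mu.
total = mu * v1 + (1 - mu) * v2
assert abs(total - mu) < 1e-12
print(v1, v2, total)
```

The printed v2 is the v∗(2) = µ/(1−µ) entry of the table, and the aggregate surplus µ is the value VU that reappears in the formal problem later.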

SLIDE 10

model

SLIDE 11

Overview

model

Principal (she) mediates economic interaction between a group of agents (he) — e.g., buyer-seller trade
⇝ general formulation: Bayes incentive problem á la Myerson (’82, ’83)

Each interaction characterized by data — e.g., buyer’s valuation
Principal uses data to mediate the interaction — e.g., segmentation

Question: what is the value for the principal of the individual data characterizing each interaction she can mediate?

SLIDE 12

Standard Primitives

model

Parties: principal i = 0, agents i ∈ I = {1, . . . , n}
Action privately controlled by party i: ai ∈ Ai ⇝ A = A0 × · · · × An
Piece of data privately and directly accessed by party i: ωi ∈ Ωi ⇝ Ω = Ω0 × · · · × Ωn
Payoff function of party i: ui : A × Ω → R

⇒ every ω = (ω0, . . . , ωn) pins down one type of economic interaction the principal can mediate

Letting µ ∈ ∆(Ω), assume Γ = (I, (Ω, µ), (Ai, ui)_{i=0}^{n}) is common knowledge

SLIDE 13

Principal as Data User

model

Myerson’s principal can commit to mediating the interaction by
▶ eliciting agents’ private data
▶ setting rules/incentives agents face: A0 (mechanism)
▶ sending signals to affect agents’ private actions: Ai (information)

As usual, focus on direct mechanisms x : Ω → ∆(A) that satisfy IC
▶ honesty: optimal for each agent to report ωi truthfully
▶ obedience: optimal for each agent to follow recommended ai

⇒ data-usage problem involves
▶ production technologies = IC mechanisms
▶ inputs = data ω ∈ Ω
▶ objective = ∑_{ω,a} u0(a, ω) x(a|ω) µ(ω)

SLIDE 14

Datapoints and Databases

model

Frequentist interpretation:
▶ population of distinct economic interactions between agents (e.g., monopolist-buyer trade for all buyers in market)
▶ Ω = set of types of interactions
▶ each interaction of type ω = datapoint of type ω
▶ population = database
▶ µ(ω) = stock of ω-datapoints as share of total quantity in database
▶ principal commits ex ante to how she mediates all interactions (e.g., all monopolist-buyer trades)

Incentive compatibility ⇒ as if
▶ principal already owns database with entire datapoints (e.g., platform owns all buyers’ valuations even if elicitation needed)
▶ but restricted to using IC mechanisms

SLIDE 15

The Notion of A Price

model

Data-pricing problem: given µ, find a function p : Ω → R s.t. p(ω) reflects the principal’s willingness to pay for replacing/adding a marginal ω-datapoint to those already in the database

Interpretation:
  • derivation of demand functions for each ω ∈ Ω
  • each demand depends on the overall µ, as mechanisms ∼ non-separable production technology

SLIDE 16

Some Other Examples

examples

Internet platform mediating competing firms (Armstrong-Zhou ’19)
▶ platform’s own data about buyers’ demand
▶ firms’ internal data from market intelligence

Auctions with(out) information design (Bergemann-Pesendorfer ’07; Daskalakis et al. ’16)
▶ data from bidders’ reports about their valuations
▶ auctioneer’s own data about features of the item for sale

Navigation app routing drivers (Kremer et al. ’14, Das et al. ’17, Liu-Whinston ’19)
▶ app’s own data about overall traffic conditions
▶ drivers’ data about desired destination and road conditions

SLIDE 17

data-pricing formulation

SLIDE 18

Omniscient Principal

data-pricing formulation

Important case: principal’s data fully reveals all parties’ data (omniscient)

  • 1. simpler to develop concepts and intuitions
  • 2. in many instances (Posner-Weyl ’18), principal already knows agents’ data and can use it without their consent (akin to no privacy protection)
  • 3. benchmark for problem where principal has to elicit agents’ data with their consent (akin to privacy protection)

SLIDE 19

Data Usage Formalized

data-pricing formulation

Consider mechanisms x that have to satisfy only obedience

Problem U:
VU = max_x ∑_{ω,a} u0(a, ω) x(a|ω) µ(ω)
s.t. for all i, ωi, ai, and a′i:
∑_{ω−i, a−i} ( ui(ai, a−i, ω) − ui(a′i, a−i, ω) ) x(ai, a−i | ω) µ(ω) ≥ 0

Question: what is the proper share of VU to attribute to ω? → p(ω)

One approach: define the direct value of ω as v∗(ω) = ∑_a u0(a, ω) x∗(a|ω)

Clearly, ∑_ω µ(ω) v∗(ω) = VU. But v∗ may give incorrect shares/prices ...
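For the internet-platform example, Problem U is a small linear program. A sketch, assuming µ = 0.4 and SciPy's `linprog` (neither of which is specified on the slides), with variables χ(ω, a) = x(a|ω)µ(ω):

```python
# Solve Problem U for the platform example as an LP over
# chi(omega, a) = x(a|omega) * mu(omega).
# Assumptions not in the slides: mu = 0.4 and SciPy's HiGHS solver.
import numpy as np
from scipy.optimize import linprog

mu = np.array([0.4, 0.6])                 # stocks of omega0 = 1, 2 datapoints
u0 = np.array([[0.0, 0.0], [1.0, 0.0]])   # buyer surplus, rows omega0, cols a
u1 = np.array([[1.0, 0.0], [1.0, 2.0]])   # seller profit, rows omega0, cols a

# Variable order: chi(w1,a1), chi(w1,a2), chi(w2,a1), chi(w2,a2).
c = -np.array([u0[0, 0], u0[0, 1], u0[1, 0], u0[1, 1]])  # maximize => minimize -c

# Obedience: sum_omega (u1(a,omega) - u1(a',omega)) chi(omega,a) >= 0,
# written as <= 0 rows for linprog.
A_ub = np.array([
    [-(u1[0, 0] - u1[0, 1]), 0.0, -(u1[1, 0] - u1[1, 1]), 0.0],  # a=1 vs a'=2
    [0.0, -(u1[0, 1] - u1[0, 0]), 0.0, -(u1[1, 1] - u1[1, 0])],  # a=2 vs a'=1
])
A_eq = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])    # sum_a chi = mu

res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2), A_eq=A_eq, b_eq=mu,
              bounds=(0, None), method="highs")
VU = -res.fun
chi = res.x
# Direct values v*(omega) = sum_a u0(a, omega) x*(a|omega)
v1 = (u0[0, 0] * chi[0] + u0[0, 1] * chi[1]) / mu[0]
v2 = (u0[1, 0] * chi[2] + u0[1, 1] * chi[3]) / mu[1]
print(VU, v1, v2)   # VU = mu; v*(1) = 0, v*(2) = mu/(1-mu)
```

This reproduces the example's values VU = µ, v∗(1) = 0, v∗(2) = µ/(1−µ), and confirms ∑_ω µ(ω)v∗(ω) = VU.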

SLIDE 20

An Alternative Approach

data-pricing formulation

Using primitives Γ, we can define a data-pricing problem

Principal designs, for each agent i, ai, and ωi:
ℓi(·|ai, ωi) ∈ ∆(Ai) and qi(ai, ωi) ∈ R++

Problem P:
VP = min_{ℓ,q} ∑_ω p(ω) µ(ω)
s.t. for all ω, p(ω) = max_{a∈A} { u0(a, ω) + ∑_i T_{ℓi,qi}(a, ω) }
where T_{ℓi,qi}(a, ω) = qi(ai, ωi) ∑_{a′i∈Ai} ( ui(ai, a−i, ω) − ui(a′i, a−i, ω) ) ℓi(a′i|ai, ωi)

SLIDE 27

Dual Relationship

data-pricing formulation

Why is P the ‘right’ data-pricing problem?

Lemma: Problem P is equivalent to the dual of Problem U. By strong duality,
∑_ω v∗(ω) µ(ω) = VU = VP = ∑_ω p∗(ω) µ(ω)

▶ p(ω) corresponds to the U-constraint ∑_a χ(ω, a) = µ(ω) ∀ω, where χ(ω, a) = x(a|ω) µ(ω)
▶ p(ω) captures the shadow price of the stock µ(ω) of ω-datapoints
▶ p(ω) = principal’s WTP for a marginal ω-datapoint in the database
▶ P-variables (ℓ, q) correspond to U-obedience constraints

SLIDE 28

A Normative Interpretation

data-pricing formulation

P offers a rigorous way of assessing the individual price of each datapoint, viewed as an input in a mechanism-information-design problem

A classic interpretation of duality (Dorfman, Samuelson, Solow ’58):
▶ reminiscent of the operations of a frictionless competitive market
▶ competition among data users forces them to offer data sources the full value to which their data give rise
▶ competition among data sources drives data prices down to the minimum consistent with this full value

⇝ normative meaning to p∗
▶ takes into account the full value that each datapoint generates in the database
▶ a benchmark for actual markets for data

SLIDE 29

back to example

SLIDE 41

Internet Platform

example

Seller’s profit u1(a, ω0):
            a = 1    a = 2
ω0 = 1        1        0
ω0 = 2        1        2

Buyer’s surplus u0(a, ω0):
            a = 1    a = 2
ω0 = 1        0        0
ω0 = 2        1        0

Data-pricing problem (seller is the only agent):
min_{ℓ,q} p(1)µ + p(2)(1 − µ)
s.t. p(1) = max { u0(1, 1) + Tℓ,q(1, 1), u0(2, 1) + Tℓ,q(2, 1) } = q(1)ℓ(2|1)
     p(2) = max { u0(1, 2) + Tℓ,q(1, 2), u0(2, 2) + Tℓ,q(2, 2) } = max { 1 − q(1)ℓ(2|1), q(2)ℓ(1|2) }

Assuming µ < 1/2, the solution involves setting q∗(1)ℓ∗(2|1) = 1 and q∗(2)ℓ∗(1|2) = 0, so that
p(1) = 1 > v∗(1) = 0 and p(2) = 0 < v∗(2) = µ/(1 − µ)
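The claimed solution can be checked by direct substitution into the dual constraints (a minimal sketch; µ = 0.4 is an assumed concrete value):

```python
# Substitute the optimal gamble q*(1) l*(2|1) = 1, q*(2) l*(1|2) = 0
# into the dual constraints of the platform example. Assumption: mu = 0.4.
mu = 0.4
q1_l21 = 1.0   # q*(1) * l*(2|1), the optimal choice for mu < 1/2
q2_l12 = 0.0   # q*(2) * l*(1|2)

# Dual constraints: p(omega) = max over a of u0(a, omega) + T(a, omega).
p1 = max(0 + q1_l21 * (1 - 0),      # a = 1: seller deviating to 2 is a mistake
         0 + q2_l12 * (0 - 1))      # a = 2
p2 = max(1 + q1_l21 * (1 - 2),      # a = 1: surplus 1, but the gamble now loses
         0 + q2_l12 * (2 - 1))      # a = 2

objective = p1 * mu + p2 * (1 - mu)
print(p1, p2, objective)            # p* = (1, 0), objective = mu = VU
```

The objective p∗(1)µ + p∗(2)(1 − µ) = µ matches VU, consistent with strong duality.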

SLIDE 42

information externalities

SLIDE 43

Externalities Between Data Entries

externalities

Principal combines datapoints to produce actionable information
What ω yields depends on which/how other ω′ are combined with it
Information externalities between datapoints, which v∗ fails to capture

Proposition: Let x∗ and (ℓ∗, q∗) be optimal for U and P. Then
  • 1. p∗(ω) > v∗(ω) for some ω ⟺ p∗(ω′) < v∗(ω′) for some ω′
  • 2. p∗(ω) − v∗(ω) = ∑_a ( ∑_i T_{ℓ∗i,q∗i}(a, ω) ) x∗(a|ω) for all ω

  • 1. ⇐ strong duality: ∑_ω [v∗(ω) − p∗(ω)] µ(ω) = 0
  • 2. ⇐ complementary slackness: x∗(a|ω) { p∗(ω) − u0(a, ω) − ∑_i T_{ℓ∗i,q∗i}(a, ω) } = 0
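Identity 2 can be verified on the platform example, using the optimal mechanism and gambles derived there (a sketch; µ = 0.4 is an assumed value):

```python
# Numeric check of identity 2 on the platform example. Assumption: mu = 0.4.
# Optimal mechanism: x*(1|1) = 1; x*(1|2) = mu/(1-mu), x*(2|2) = 1 - mu/(1-mu).
# Optimal gamble: q*(1) l*(2|1) = 1, q*(2) l*(1|2) = 0.
mu = 0.4
x_1_given_2 = mu / (1 - mu)

# Transfers T(a, omega) = q(a) * sum_{a'} (u1(a,omega) - u1(a',omega)) l(a'|a):
T = {(1, 1): 1.0 * (1 - 0),    # omega=1, a=1: seller's ex-post mistake, gamble wins
     (1, 2): 1.0 * (1 - 2),    # omega=2, a=1: gamble loses
     (2, 2): 0.0}              # omega=2, a=2: zero stake

gap1 = T[(1, 1)] * 1.0                                     # sums T against x*(.|1)
gap2 = T[(1, 2)] * x_1_given_2 + T[(2, 2)] * (1 - x_1_given_2)
assert abs(gap1 - (1 - 0)) < 1e-12                         # = p*(1) - v*(1)
assert abs(gap2 - (0 - mu / (1 - mu))) < 1e-12             # = p*(2) - v*(2)
```

The gambles transfer exactly the value µ/(1−µ) per 2-datapoint over to the 1-datapoints, which is the information externality the proposition describes.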

SLIDE 44

Externalities Between Data Entries

externalities

Why transfer value VU from ω-datapoints to ω′-datapoints?

Definition (Augmented Correlated Equilibrium): ACE(Γω) = distributions y ∈ ∆(A) s.t. for all i ∈ I and ai, a′i ∈ Ai,
∑_{a−i} ( ui(ai, a−i, ω) − ui(a′i, a−i, ω) ) y(ai, a−i) ≥ 0

Proposition: If v∗(ω) > p∗(ω), there must exist a such that x∗(a|ω) > 0 and
u0(a, ω) > v̄(ω) = max_{y∈ACE(Γω)} ∑_a u0(a, ω) y(a)

Achieve u0(a, ω) > v̄(ω) by pooling ω with ω′ → p∗(ω′) > v∗(ω′)
In paper: sufficient conditions for p∗ ≠ v∗ and for p∗ = v∗
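For the platform example, v̄(ω) is itself a tiny LP over the seller's play y ∈ ∆(A). A sketch (SciPy is an assumption; actions are indexed 0, 1 for prices 1, 2):

```python
# Compute v_bar(omega) = max buyer surplus over ACE of the one-agent game
# at each omega0, for the platform example. SciPy's linprog is an assumption.
import numpy as np
from scipy.optimize import linprog

u0 = {1: np.array([0.0, 0.0]), 2: np.array([1.0, 0.0])}  # buyer surplus by price
u1 = {1: np.array([1.0, 0.0]), 2: np.array([1.0, 2.0])}  # seller profit by price

def v_bar(omega):
    # ACE with one agent: (u1(a) - u1(a')) y(a) >= 0 for all a, a'.
    A_ub, b_ub = [], []
    for a in (0, 1):
        for a_dev in (0, 1):
            row = np.zeros(2)
            row[a] = -(u1[omega][a] - u1[omega][a_dev])
            A_ub.append(row)
            b_ub.append(0.0)
    res = linprog(-u0[omega], A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.ones((1, 2)), b_eq=[1.0], bounds=(0, 1),
                  method="highs")
    return -res.fun

print(v_bar(1), v_bar(2))   # both 0: alone, each omega0 yields zero buyer surplus
```

Since v∗(2) = µ/(1−µ) > 0 = v̄(2), achieving x∗(1|2) > 0 with u0(1, 2) = 1 > v̄(2) requires pooling ω0 = 2 with ω0 = 1, exactly the externality the proposition isolates.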

SLIDE 45

Recap

price determinants

Which datapoints tend to be less valuable?
▶ ω pooled with other ω′ to produce information that achieves otherwise impossible outcomes for ω

Which datapoints tend to be more valuable?
▶ ω pooled with other ω′ to help ω′ achieve otherwise impossible outcomes
SLIDE 46

what drives p∗

SLIDE 47

Dual Side of Mechanisms-Information Design

price determinants

An independent interpretation of P to understand what drives p∗

Recall: min_{ℓ,q} ∑_ω p(ω) µ(ω) s.t. p(ω) = max_{a∈A} { u0(a, ω) + ∑_i T_{ℓi,qi}(a, ω) } ∀ω

→ p ultimately determined by (ℓ, q) through the best trade-off between
  • 1. principal’s direct payoff u0
  • 2. “transfer” functions T_{ℓi,qi} that account for information externalities

What are ℓ and q?

SLIDE 48

Dual Side of Mechanisms-Information Design

price determinants

Fix (a, ω) and recall qi(ai, ωi) ∈ R++, ℓi(·|ai, ωi) ∈ ∆(Ai), and
T_{ℓi,qi}(a, ω) = qi(ai, ωi) ∑_{a′i∈Ai} ( ui(ai, a−i, ω) − ui(a′i, a−i, ω) ) ℓi(a′i|ai, ωi)

Principal designs gambles against agents contingent on (a, ω)
▶ (ℓi, qi) family of gambles (lottery & stake) contingent on (ai, ωi)
▶ given (a, ω), ℓi(a′i|ai, ωi) yields prize ui(ai, a−i, ω) − ui(a′i, a−i, ω)
▶ principal wins iff ui(ai, a−i, ω) < ui(a′i, a−i, ω)
  ↔ had i known (a−i, ω), he would have preferred a′i ≠ ai (ex-post mistake)
▶ for every ω, value p(ω) given by best trade-off between u0(a, ω) and gambles ∑_i T_{ℓi,qi}(a, ω) across a
▶ principal commits to (ℓ, q) ex ante → average with respect to µ

SLIDE 49

Structure of Gambles

price determinants

min_{ℓ,q} ∑_ω p(ω) µ(ω) ⇝ principal wants to win gambles as much as possible

Constraint 1: Limited Flexibility
gambles against i can be tailored to (ai, ωi), but not (a−i, ω−i)
⇝ links between pricing formulas of (ωi, ω−i) and (ωi, ω′−i)
− manifestation in P of non-separabilities in U across ω
− still pin down individual prices for each ω
⇝ trade-offs across datapoints: using (ℓi, qi) to lower p(ωi, ω−i) may cost raising p(ωi, ω′−i)

SLIDE 50

Structure of Gambles

price determinants

min_{ℓ,q} ∑_ω p(ω) µ(ω) ⇝ principal wants to win gambles as much as possible

Constraint 2: Agents’ Joint Rationality (Nau ’92)
∼ agents accept gambles where they lose in (a, ω) only if they win in (a′, ω′)

Proposition: For every∗ (ℓ, q), if ∑_i T_{ℓi,qi}(a, ω) < 0 for some (a, ω), there must exist (a′, ω′) such that ∑_i T_{ℓi,qi}(a′, ω′) > 0

⇒ key trade-off for principal: winning less important for relatively scarce data (low µ) ⇝ higher price

SLIDE 51

Relations between Pricing and Using Data

price determinants

Optimal (ℓ∗, q∗) for P has corresponding optimal x∗ for U (and vice versa)

Proposition: Generically, ℓ∗i(a′i|ai, ωi) > 0 if and only if, given ωi, agent i is indifferent between a′i and recommendation ai from x∗
∼ only indifferent agents under x∗ contribute to the gap p∗(ω) − v∗(ω)

Proposition: Generically, x∗(a|ω) > 0 if and only if p∗(ω) = u0(a, ω) + ∑_i T_{ℓ∗i,q∗i}(a, ω)
∼ all uses of ω-datapoints under x∗ yield the same (maximal) total value p∗(ω)

SLIDE 52

Recap

price determinants

Which datapoints tend to be more valuable?
  • 1. ω that helps principal trick agents into making ex-post mistakes for some other ω′
  • 2. ω relatively scarce in database (i.e., low µ(ω))

Which datapoints tend to be less valuable?
  • 1. ω where agents make ex-post mistakes with help of some other ω′
  • 2. ω relatively abundant in database (i.e., high µ(ω))
SLIDE 53

example II

SLIDE 54

Mediated Cournot Competition in e-Commerce

example

To illustrate, operator (principal) manages an online marketplace

Two firms (agents), each chooses to participate or not: produce ai ∈ {0, 1}
Profits: ui(ai, a−i, ω0) = ( ω0 − ∑_i ai ) ai
Demand strength: Ω0 = {ω0, ω̄0}, µ(ω0) = µ(ω̄0) = 1/2

Operator maximizes total production: u0(a, ω) = ∑_i ai
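The stated profit function can be tabulated directly (a minimal sketch, using the demand values ω0 ∈ {0, 3} assumed later in the example):

```python
# Profit u_i(a_i, a_-i, omega0) = (omega0 - a_1 - a_2) * a_i for the
# mediated Cournot example, with omega0 in {0, 3} as assumed on a later slide.
def profit(a_i, a_other, omega0):
    return (omega0 - a_i - a_other) * a_i

# Strong demand (omega0 = 3): producing is profitable even if the rival produces.
assert profit(1, 1, 3) == 1
assert profit(1, 0, 3) == 2
# Weak demand (omega0 = 0): producing always loses money.
assert profit(1, 0, 0) == -1
assert profit(1, 1, 0) == -2
# Not producing is always safe.
assert profit(0, 1, 3) == 0
```

Since the operator's objective is total production ∑_i ai, it wants both firms in whenever demand is strong, which is what the information design below works toward.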

SLIDE 55

Mediated Cournot Competition in e-Commerce

example

Firms have own data about demand strength: Ωi = {ωi, ω̄i}

Given ω0:       ω2         ω̄2
ω1              γ²       γ(1−γ)
ω̄1           γ(1−γ)     (1−γ)²

Given ω̄0:      ω2         ω̄2
ω1            (1−γ)²     γ(1−γ)
ω̄1           γ(1−γ)       γ²

where 1/2 < γ < 1

Data usage: given ω, convey info to influence a1 and a2
Data pricing: find p(ω) = p(ω0, ω1, ω2) for all ω
Today, assume ω0 ∈ {0, 3}

SLIDE 56

Optimal Prices p∗

example

[Figure: p∗(ω̄0, ·) and p∗(ω0, ·) across signal profiles (ω1, ω2), (ω1, ω̄2), (ω̄1, ω2), (ω̄1, ω̄2)]

Case 1: firms’ data gives weak signal, γ < γ̲
▶ prices independent of (ω1, ω2)
▶ ω̄0 is more valuable than ω0
− p∗(ω0, ω1, ω2) < v∗(ω0, ω1, ω2) and p∗(ω̄0, ω1, ω2) > v∗(ω̄0, ω1, ω2)
− gambles: q∗i(1, ωi) ℓ∗i(0|1, ωi) = q∗i(1, ω̄i) ℓ∗i(0|1, ω̄i) > 0, for all i

SLIDE 57

Optimal Prices p∗

example

[Figure: p∗(ω̄0, ·) and p∗(ω0, ·) across signal profiles]

Case 2: firms’ data gives strong signal, γ > γ̄
▶ pessimistic firms ⇝ pooling harder ⇝ larger externality
p∗(ω0, ω1, ω2) < v∗(ω0, ω1, ω2) < v∗(ω̄0, ω1, ω2) < p∗(ω̄0, ω1, ω2)
▶ optimistic firms ⇝ always produce ⇝ no externalities
p∗(ω0, ω̄1, ω̄2) = v∗(ω0, ω̄1, ω̄2) = v∗(ω̄0, ω̄1, ω̄2) = p∗(ω̄0, ω̄1, ω̄2)
▶ gambles: q∗i(1, ωi) ℓ∗i(0|1, ωi) > 0 = q∗i(1, ω̄i) ℓ∗i(0|1, ω̄i), for all i

SLIDE 58

Optimal Prices p∗

example

[Figure: p∗(ω̄0, ·) and p∗(ω0, ·) across signal profiles]

Case 3: firms’ data gives intermediate signal, γ̲ < γ < γ̄
▶ pessimistic firms ⇝ pooling harder ⇝ larger externality
p∗(ω0, ω1, ω2) < v∗(ω0, ω1, ω2) < v∗(ω̄0, ω1, ω2) < p∗(ω̄0, ω1, ω2)
▶ optimistic firms ⇝ always produce ⇝ no externalities
p∗(ω0, ω̄1, ω̄2) = v∗(ω0, ω̄1, ω̄2) = v∗(ω̄0, ω̄1, ω̄2) = p∗(ω̄0, ω̄1, ω̄2)
▶ gambles: q∗i(1, ωi) ℓ∗i(0|1, ωi) > 0 = q∗i(1, ω̄i) ℓ∗i(0|1, ω̄i), for all i

SLIDE 59

prices under privacy

SLIDE 60

Privacy Protection

privacy

Suppose principal has to incentivize agents to report their private data

Incentives:
▶ directly from how principal commits to use data (no monetary transfers)
▶ in some settings, monetary transfers as part of mechanisms

Formally, mechanisms in problem U must satisfy honesty and obedience
Question: How are prices affected by the need to elicit data?

SLIDE 62

Data Usage with Elicitation

privacy

Elicitation does not change the mathematical structure of the problem

Problem Ue:
VUe = max_x ∑_{ω,a} u0(a, ω) x(a|ω) µ(ω)
s.t. for all i, ωi, ω′i, and δi : Ai → Ai:
∑_{ai,a−i,ω−i} ui(ai, a−i, ω) x(ai, a−i | ωi, ω−i) µ(ωi, ω−i)
  ≥ ∑_{ai,a−i,ω−i} ui(δi(ai), a−i, ω) x(ai, a−i | ω′i, ω−i) µ(ωi, ω−i)

SLIDE 63

Data Pricing with Elicitation

privacy

Principal chooses, for each agent i and ωi:
ℓ̂i(·|ωi) ∈ ∆(Ωi × Di) and q̂i(ωi) ∈ R++

Problem Pe:
VPe = min_{ℓ̂,q̂} ∑_ω p(ω) µ(ω)
s.t. for all ω, p(ω) = max_{a∈A} { u0(a, ω) + ∑_i T_{ℓ̂i,q̂i}(a, ω) }

SLIDE 64

Data Pricing with Elicitation

privacy

Data pricing with vs without elicitation:
▶ transfer function T_{ℓ̂i,q̂i} now involves richer gambles (ℓ̂, q̂)
▶ principal can win against agent when
  • 1. deviating from obedience is ex-post beneficial (as before)
  • 2. deviating from honesty is ex-post beneficial (new)
  • 3. both (new)

Work in progress:
▶ p(ω) incorporates the difficulty of honestly eliciting ω: new externalities
▶ compare p(ω) under omniscience and elicitation ⇝ insights into effects on value of data (e.g., effects of privacy protection)
▶ compare p(ω) under elicitation with monetary transfer (if any) to agents for their data ⇝ are they properly rewarded?

SLIDE 65

back to example II

SLIDE 69

Cournot Competition with Elicitation

example

Clearly, total value VU decreases with elicitation. What about individual p∗? Take γ > γ̄

[Figure: p∗(ω̄0, ·) and p∗(ω0, ·) across signal profiles]

  • 1. elicitation ⇝ qualitative change in p(ω̄0, ω1, ω2)
− ω̄i tempted to mimic ωi to get a more informative recommendation
− ω induces temptation to lie → suffers negative externality (gambles)
− x∗ distorted to make mimicking ωi less attractive, despite ω̄0
SLIDE 70

Cournot Competition with Elicitation

example

Clearly, total value VU decreases with elicitation. What about individual p∗? Take γ > γ̄

[Figure: p∗(ω̄0, ·) and p∗(ω0, ·) across signal profiles]

  • 2. p∗(ω̄0, ω̄1, ω̄2) higher than in the omniscient case
− mimicking gamble ℓ̂i(ωi, ·|ω̄i) > 0 → loss for principal if ω0 = ω̄0
− (ω̄0, ω̄1, ω̄2) only data left with full participation under x∗ ⇝ value ↑
SLIDE 71

extensions

SLIDE 72

Next Steps

Robust data usage:
▶ robust mechanisms that do not rely on agents’ higher-order beliefs
▶ for example, ex-post equilibrium → LP and similar data pricing

Restrictions on data usage:
▶ mechanism x can depend only on parts of datapoint ω
▶ for example, auctioneer can use data to influence bidders’ valuations, but not to directly run the auction (Bergemann-Pesendorfer ’07)
▶ formulated as linear constraints on x → LP and similar data pricing

Value of more precise data for each mediated interaction:
▶ ω′0 is more precise data than ω0 about buyer’s valuation for seller’s product (e.g., longer cookie history) ⇝ databases (Ω, µ) and (Ω′, µ′)
▶ individual value of extra data = p∗(ω′0) − p∗(ω0)

SLIDE 73

summary

SLIDE 74

Summary

A theory of how to price datapoints in a database to reflect their individual value

Basic insight:
▶ data-usage problem = mechanism-information design problem
▶ data-pricing problem = its dual

Preliminary analysis reveals:
▶ prices take into account information externalities across datapoints
▶ valuable data: scarce + helps trick agents into making mistakes
▶ rigorous method to assess effects of privacy protection: can have significant impact and increase prices of some types of data