slide-1
SLIDE 1

Learning to Bid Without Knowing your Value

Zhe Feng, Harvard Joint work with Chara Podimata (Harvard) and Vasilis Syrgkanis (MSR)

19th ACM Conference on Economics and Computation, EC’18 6/21/2018 1

slide-2
SLIDE 2

Warm-up

Auction theory & Mechanism Design

Auction: bidder $i$ with value $v_i$ submits bid $b_i$ and receives allocation and payment $(a_i, p_i)$

Utility to buyer $i$: $u_i = a_i v_i - p_i$

slide-3
SLIDE 3

Motivation

Key assumption in Auction Theory & Mechanism Design: the valuation is private, but known to the bidder himself/herself

slide-5
SLIDE 5

Motivation

Key assumption in Auction Theory & Mechanism Design

  • Small markets: bidders have time to prepare to bid (market research)
  • Digital economy (online advertisement auctions): no time to prepare to bid (market research)

slide-6
SLIDE 6

Main question

How should the learner design a bidding strategy in online advertisement auctions when he/she does not know the value before submitting the bid?

slide-12
SLIDE 12

Sponsored Search Example

  • The advertiser (learner) bids
  • The platform (auctioneer) generates $x_t(\cdot)$, $p_t(\cdot)$
  • If the ad is clicked by users, it generates value $v_t$
  • Reward: $v_t - p_t(\cdot)$
  • The learner observes (estimated) $x_t(\cdot)$, $p_t(\cdot)$
  • Expected utility: $u_t(b) = (v_t - p_t(b)) \cdot x_t(b)$

slide-15
SLIDE 15

Simple Model: Single-item Auctions

  • On each day $t$:
  • Designer and competitors choose the allocation rule $x_t(\cdot)$ and the payment rule $p_t(\cdot)$
  • Learner submits $b_t \in B$ (finite set)
  • The learner wins the item with probability $x_t(b_t)$
  • At the end, the learner observes $x_t(\cdot)$, $p_t(\cdot)$
  • If the learner wins, he/she observes $v_t$
  • Expected utility function: $u_t(b) = (v_t - p_t(b)) \cdot x_t(b)$
  • Goal: minimize expected regret

$$R(T) = \sup_{b^*} \mathbb{E}\left[\sum_{t=1}^{T} u_t(b^*)\right] - \mathbb{E}\left[\sum_{t=1}^{T} u_t(b_t)\right]$$

The first term is the utility with the best fixed bid in hindsight; the second term is the utility with the bids generated by the algorithm.
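The utility and regret definitions above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `expected_utility`, `regret_in_hindsight`, `x_toy`, and `p_toy` are hypothetical, the toy auction (win if the bid is at least 0.5, pay your bid) is invented for the example, and the function evaluates one realized bid sequence rather than the expectation over the algorithm's randomness.

```python
from typing import Callable, Sequence, Tuple

# One day of the model: (value v_t, allocation rule x_t, payment rule p_t).
Day = Tuple[float, Callable[[float], float], Callable[[float], float]]


def expected_utility(v: float, x: Callable[[float], float],
                     p: Callable[[float], float], b: float) -> float:
    """Expected utility u_t(b) = (v_t - p_t(b)) * x_t(b)."""
    return (v - p(b)) * x(b)


def regret_in_hindsight(days: Sequence[Day], bids: Sequence[float],
                        bid_set: Sequence[float]) -> float:
    """Utility of the best fixed bid in bid_set minus utility of the played bids."""
    best_fixed = max(
        sum(expected_utility(v, x, p, b) for v, x, p in days) for b in bid_set
    )
    played = sum(expected_utility(v, x, p, b) for (v, x, p), b in zip(days, bids))
    return best_fixed - played


# Hypothetical toy auction: the learner wins iff the bid is at least 0.5 and pays the bid.
def x_toy(b: float) -> float:
    return 1.0 if b >= 0.5 else 0.0


def p_toy(b: float) -> float:
    return b


days = [(1.0, x_toy, p_toy)] * 3
toy_regret = regret_in_hindsight(days, bids=[0.0, 1.0, 0.5], bid_set=[0.0, 0.5, 1.0])
```

In this toy run the best fixed bid is 0.5, which earns 0.5 per day, while the played sequence earns 0.5 in total.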

slide-16
SLIDE 16

Multi-Armed Bandit (MAB)

At each round $t = 1, \dots, T$:

  • Adversary chooses a reward vector $r_t = (r_{1,t}, \dots, r_{K,t})$
  • Learner chooses an action $i_t \in B$
  • Learner gets reward $r_{i_t,t}$ and only observes $r_{i_t,t}$

EXP3 achieves regret $O(\sqrt{T|B|})$
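For reference, the bandit protocol above can be sketched with a textbook-style EXP3 variant. This is a sketch under assumptions, not the slides' exact formulation: it works with losses (one minus a reward assumed to lie in [0, 1]), uses no explicit exploration mixing, and the names `exp3`, `exp3_distribution`, and `reward_fn` as well as the step size are illustrative choices.

```python
import math
import random
from typing import Callable, List


def exp3_distribution(cum_loss_est: List[float], eta: float) -> List[float]:
    """Exponential weights over the estimated cumulative losses."""
    low = min(cum_loss_est)  # shift for numerical stability; cancels after normalizing
    weights = [math.exp(-eta * (loss - low)) for loss in cum_loss_est]
    total = sum(weights)
    return [w / total for w in weights]


def exp3(num_arms: int, horizon: int, reward_fn: Callable[[int, int], float],
         eta: float, rng: random.Random) -> List[float]:
    """Run loss-based EXP3; reward_fn(t, arm) must return a reward in [0, 1]."""
    cum_loss_est = [0.0] * num_arms
    for t in range(horizon):
        probs = exp3_distribution(cum_loss_est, eta)
        arm = rng.choices(range(num_arms), weights=probs)[0]
        # Bandit feedback: only the chosen arm's reward is observed.
        loss = 1.0 - reward_fn(t, arm)
        # Importance weighting keeps the loss estimate unbiased.
        cum_loss_est[arm] += loss / probs[arm]
    return exp3_distribution(cum_loss_est, eta)


K, T = 3, 2000
eta = math.sqrt(2.0 * math.log(K) / (T * K))  # a common textbook step size
final_probs = exp3(K, T, lambda t, arm: 1.0 if arm == 2 else 0.0, eta, random.Random(0))
```

Each estimate is divided by the probability of the sampled arm, which is why the variance term of EXP3 scales with the number of actions.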

slide-17
SLIDE 17

Formal main question

Can we design an online learning algorithm for the learner to achieve better regret than generic MAB?

slide-18
SLIDE 18

Our Results: WIN-EXP algorithm

  • Utilize partial feedback information from the auctions.
  • Partial feedback: between bandit feedback and full-information feedback
  • Recall: EXP3 achieves $O(\sqrt{T|B|})$

Theorem 1. The WIN-EXP algorithm achieves regret at most $4\sqrt{T \log|B|}$
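To get a feel for the two rates, the sketch below evaluates both expressions numerically. It is only a rough comparison: the EXP3 rate is stated up to an unspecified constant (taken as 1 here, which is an assumption), the logarithm is taken as the natural logarithm, and the function names are illustrative.

```python
import math


def exp3_rate(horizon: int, num_bids: int) -> float:
    """sqrt(T * |B|), the EXP3 rate with its hidden constant assumed to be 1."""
    return math.sqrt(horizon * num_bids)


def win_exp_bound(horizon: int, num_bids: int) -> float:
    """4 * sqrt(T * log|B|), the bound of Theorem 1."""
    return 4.0 * math.sqrt(horizon * math.log(num_bids))


T = 10_000
for B in (10, 100, 1000):
    print(f"|B|={B:5d}  sqrt(T|B|)={exp3_rate(T, B):8.1f}  "
          f"4*sqrt(T log|B|)={win_exp_bound(T, B):8.1f}")
```

With these constants the Theorem 1 bound is larger for a very coarse bid grid and becomes smaller as the grid is refined, since it grows only logarithmically in $|B|$.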

slide-19
SLIDE 19

Related Work

No-regret learning in GT/MD

  • From the auctioneer side: [Blum et al., 04], [Amin et al., 05], [Amin et al., 06], [Cesa-Bianchi et al., 15], …
  • From the bidder side: [Dikkala & Tardos, 13], [Balseiro & Gur, 17], [Weed et al., 16]

Learning with partial feedback

  • Contextual bandits: [Bubeck & Cesa-Bianchi, 12], [Agarwal et al., 14], …
  • Feedback graphs: [Alon et al., 13], [Alon et al., 15]

slide-20
SLIDE 20

Technical Parts


slide-24
SLIDE 24

The Abstraction: Win-Only Feedback

On each day $t$:

  • Learner chooses an action $b_t \in B$.
  • The adversary chooses a reward function $r_t: B \to [-1, 1]$ and an allocation function $x_t(\cdot)$.
  • The learner wins reward $r_t(b_t)$ with probability $x_t(b_t)$.
  • Feedback: the learner always learns the allocation rule $x_t$; if she wins, she also learns $r_t(\cdot)$.

slide-28
SLIDE 28

WIN-EXP Algorithm For Win-Only Feedback

At each round $t$:

  • Draw a bid $b_t \sim \pi_t$
  • Observe the allocation rule $x_t$; if the learner wins, observe $r_t(\cdot)$
  • Compute the unbiased estimator of $u_t(b) - 1$:

$$\tilde{u}_t(b) = \begin{cases} \dfrac{(r_t(b) - 1) \cdot x_t(b)}{\sum_{b'} \pi_t(b')\, x_t(b')} & \text{if the learner wins} \\[2ex] -\dfrac{1 - x_t(b)}{1 - \sum_{b'} \pi_t(b')\, x_t(b')} & \text{if the learner doesn't win} \end{cases}$$

  • Update: $\pi_{t+1}(b) \propto \pi_t(b) \cdot \exp(\eta \cdot \tilde{u}_t(b))$
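One round of the algorithm can be sketched in Python as follows. This is a minimal sketch rather than the authors' code. It assumes the expected utility is the reward times the allocation probability, $u_t(b) = r_t(b) \cdot x_t(b)$, stores the sampling distribution, allocation rule, and rewards as lists indexed by bid, and adds a small numerical floor on the denominators that is not part of the slides. The names `win_exp_estimates`, `win_exp_update`, `run_win_exp`, and `toy_env`, and the toy environment itself, are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def win_exp_estimates(pi: Sequence[float], x: Sequence[float],
                      r: Sequence[float], won: bool) -> List[float]:
    """Estimate u_t(b) - 1 for every bid index b (rewards r are read only if won)."""
    # Probability of winning under the current sampling distribution.
    p_win = sum(pb * xb for pb, xb in zip(pi, x))
    if won:
        denom = max(p_win, 1e-12)  # numerical floor, not part of the slides
        return [(rb - 1.0) * xb / denom for rb, xb in zip(r, x)]
    denom = max(1.0 - p_win, 1e-12)  # numerical floor, not part of the slides
    return [-(1.0 - xb) / denom for xb in x]


def win_exp_update(pi: Sequence[float], estimates: Sequence[float],
                   eta: float) -> List[float]:
    """Update pi_{t+1}(b) proportional to pi_t(b) * exp(eta * estimate(b))."""
    shift = max(estimates)  # a constant shift cancels after normalizing
    weights = [pb * math.exp(eta * (e - shift)) for pb, e in zip(pi, estimates)]
    total = sum(weights)
    return [w / total for w in weights]


Env = Callable[[int], Tuple[List[float], List[float]]]


def run_win_exp(num_bids: int, horizon: int, env: Env, eta: float,
                rng: random.Random) -> List[float]:
    """env(t) returns (x_t, r_t) as lists over bid indices; returns the final distribution."""
    pi = [1.0 / num_bids] * num_bids
    for t in range(horizon):
        bid = rng.choices(range(num_bids), weights=pi)[0]
        x, r = env(t)
        won = rng.random() < x[bid]
        pi = win_exp_update(pi, win_exp_estimates(pi, x, r, won), eta)
    return pi


# Hypothetical second-price-style environment: highest competing bid 0.6, value 0.9.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]


def toy_env(t: int) -> Tuple[List[float], List[float]]:
    x = [1.0 if b > 0.6 else 0.0 for b in grid]
    r = [0.9 - 0.6 for _ in grid]  # reward on winning: value minus second price
    return x, r


final_pi = run_win_exp(len(grid), 200, toy_env, eta=0.05, rng=random.Random(0))
```

Unlike EXP3, the estimate is computed for every bid in every round, because the allocation rule is always observed; the denominators are the overall win and loss probabilities rather than the probability of a single bid.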

slide-31
SLIDE 31

Proof Sketch of Theorem 1.

  • 1. The regret w.r.t. $u_t(b)$ is equal to the regret w.r.t. $u_t(b) - 1$
  • 2. $\tilde{u}_t(b)$ is an unbiased estimator of $u_t(b) - 1$ [Lemma 1], so

$$R(T) \le \frac{\eta}{2} \sum_{t=1}^{T} \sum_{b \in B} \pi_t(b) \cdot \mathbb{E}\left[\tilde{u}_t(b)^2\right] + \frac{1}{\eta} \log(|B|)$$

  • 3. Variance of the estimator:

$$\sum_{b \in B} \pi_t(b) \cdot \mathbb{E}\left[\tilde{u}_t(b)^2\right] \le 5 \qquad (1)$$

Q.E.D.

Note: in EXP3, the left-hand side of (1) grows with the number of actions.
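Steps 2 and 3 can be checked by a short calculation. The derivation below is a sketch that assumes $u_t(b) = r_t(b)\, x_t(b)$ and writes $X_t = \sum_{b'} \pi_t(b')\, x_t(b')$ for the probability that the learner wins; the step size in the last line is one choice that is consistent with Theorem 1, not one stated on the slides.

```latex
\begin{align*}
\mathbb{E}\left[\tilde{u}_t(b)\right]
  &= X_t \cdot \frac{(r_t(b) - 1)\, x_t(b)}{X_t}
     - (1 - X_t) \cdot \frac{1 - x_t(b)}{1 - X_t}
   = r_t(b)\, x_t(b) - 1 = u_t(b) - 1, \\
\sum_{b \in B} \pi_t(b)\, \mathbb{E}\left[\tilde{u}_t(b)^2\right]
  &= \sum_{b \in B} \pi_t(b) \left[ \frac{(r_t(b) - 1)^2\, x_t(b)^2}{X_t}
     + \frac{(1 - x_t(b))^2}{1 - X_t} \right] \\
  &\le 4 \cdot \frac{\sum_{b} \pi_t(b)\, x_t(b)}{X_t}
     + \frac{\sum_{b} \pi_t(b)\, (1 - x_t(b))}{1 - X_t} = 5, \\
R(T) &\le \frac{5 \eta T}{2} + \frac{\log|B|}{\eta}
   = \sqrt{10\, T \log|B|} \le 4 \sqrt{T \log|B|}
   \quad \text{for } \eta = \sqrt{\frac{2 \log|B|}{5T}}.
\end{align*}
```

The inequality in the second step uses $(r_t(b) - 1)^2 \le 4$, $x_t(b)^2 \le x_t(b)$, and $(1 - x_t(b))^2 \le 1 - x_t(b)$, all of which follow from $r_t(b) \in [-1, 1]$ and $x_t(b) \in [0, 1]$.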

slide-36
SLIDE 36

Extension 1: Outcome-based Feedback

Beyond binary outcomes: a set of outcomes $O$

  • Reward function $r_t: B \times O \to [-1, 1]$ and allocation $x_t: B \to \Delta(O)$
  • $o_t$ is drawn from the distribution $x_t(b_t)$, and the learner wins reward $r_t(b_t, o_t)$.
  • Feedback: the learner observes $x_t$ and $r_t(\cdot, o_t)$

Theorem 2. The WIN-EXP algorithm with outcome-based feedback achieves regret at most $2\sqrt{2T|O|\log|B|}$
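The slides do not show the estimator used in this setting. One natural candidate, assumed here purely for illustration, weights the observed reward by the probability of the realized outcome under each bid and divides by the overall probability of that outcome. The sketch below checks that this candidate is unbiased for $u_t(b) - 1$, where $u_t(b) = \sum_{o} x_t(b)(o)\, r_t(b, o)$. The function name and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def outcome_estimates(pi: Sequence[float], x: Sequence[Sequence[float]],
                      r_at_outcome: Sequence[float], outcome: int) -> List[float]:
    """Assumed estimator of u_t(b) - 1 for every bid b after observing one outcome.

    x[b][o] is the probability of outcome o under bid b, and
    r_at_outcome[b] is the observed reward r_t(b, o_t).
    """
    # Overall probability of the realized outcome under the sampling distribution.
    p_outcome = sum(pb * xb[outcome] for pb, xb in zip(pi, x))
    return [(rb - 1.0) * xb[outcome] / p_outcome
            for rb, xb in zip(r_at_outcome, x)]


# Toy instance with two bids and three outcomes.
pi_toy = [0.5, 0.5]
x_table = [[0.2, 0.3, 0.5], [0.6, 0.1, 0.3]]
r_table = [[0.5, -0.5, 0.0], [1.0, 0.2, -1.0]]

# Average the estimates over the outcome distribution induced by pi_toy.
mean_estimates = [0.0, 0.0]
for o in range(3):
    p_o = sum(pb * xb[o] for pb, xb in zip(pi_toy, x_table))
    est = outcome_estimates(pi_toy, x_table, [row[o] for row in r_table], o)
    mean_estimates = [m + p_o * e for m, e in zip(mean_estimates, est)]

true_shifted = [sum(xo * ro for xo, ro in zip(xb, rb)) - 1.0
                for xb, rb in zip(x_table, r_table)]
```

With two outcomes (win, not win) and a reward of zero on a loss, this reduces to the win-only estimator.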

slide-40
SLIDE 40

Application 1: Outcome-based feedback

Binary Outcome

  • Second-price auction
  • $O$ = {win, not win}
  • Recover the [Weed et al., 16] result by choosing the discretization appropriately
  • Value-per-click auction
  • $O$ = {get clicked, not clicked}

Non-Binary Outcome

  • Unit-demand $K$-item auctions
  • $O = \{1, 2, \dots, K+1\}$, where outcome $K+1$ is associated with not getting an item

slide-41
SLIDE 41

Extension 2: Continuous action spaces

Piecewise-Lipschitz rewards

Theorem 3 (Regret of the WIN-EXP algorithm in the continuous action space with $\Delta_o$-Piecewise $L$-Lipschitz Average Utilities). The WIN-EXP algorithm achieves regret at most $2\sqrt{2dT|O|\log(\max\{1/\Delta_o, LT\})} + 1$

$\Delta_o$: length of the minimum interval

slide-42
SLIDE 42

Application 2: Continuous action spaces

  • Second-price auctions
  • $\Delta_o$ is the smallest difference between the highest other bids at any two iterations $t$ and $t'$
  • $L = 0$
slide-43
SLIDE 43

Application 2: Continuous action spaces

  • Second-price auctions
  • First-price and All-pay auctions
  • $\Delta_o$ is the smallest difference between the highest bids at any two iterations $t$ and $t'$
  • $L = 1$

slide-44
SLIDE 44

Application 2: Continuous action spaces

  • Second-price auctions
  • First-price and All-pay auctions
  • Weighted GSP auction
  • Each bidder is assigned a score $s_i \in [0, 1]$ (drawn by the auctioneer)
  • Slots are allocated in decreasing order of the score-weighted bids $s_i \cdot b_i$
  • If a bidder wins slot $k$, he/she is charged $\rho_{k+1} / s_k$, where $\rho_{k+1}$ is the score-weighted bid of the bidder who wins slot $k+1$
  • Utility is Lipschitz if the score is generated from a distribution with a Lipschitz CDF
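The weighted GSP rule described above can be sketched in a few lines of Python. The sketch makes several assumptions that the slides do not specify: scores are strictly positive, ties are broken by bidder index, a slot winner with no bidder ranked below pays zero, and slots are numbered from 0. The function name `weighted_gsp` is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def weighted_gsp(bids: Sequence[float], scores: Sequence[float],
                 num_slots: int) -> List[Tuple[Optional[int], float]]:
    """Return (slot or None, payment) for each bidder.

    Slots go to bidders in decreasing order of the score-weighted bid
    s_i * b_i; the winner of slot k pays the next score-weighted bid
    divided by his/her own score.
    """
    n = len(bids)
    order = sorted(range(n), key=lambda i: scores[i] * bids[i], reverse=True)
    result: List[Tuple[Optional[int], float]] = [(None, 0.0)] * n
    for k, i in enumerate(order[:num_slots]):
        if k + 1 < n:
            j = order[k + 1]
            rho_next = scores[j] * bids[j]
        else:
            rho_next = 0.0  # assumption: nobody ranked below, so the payment is zero
        result[i] = (k, rho_next / scores[i])
    return result


outcome = weighted_gsp(bids=[1.0, 0.8, 0.5, 0.2],
                       scores=[0.5, 1.0, 0.8, 1.0], num_slots=2)
```

In the example the score-weighted bids are 0.5, 0.8, 0.4, and 0.2, so bidder 1 takes the top slot and pays 0.5, and bidder 0 takes the second slot and pays 0.8.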

slide-45
SLIDE 45

Simulations

Setup:

  • Weighted GSP auctions; $v_i \in [0, 1]$; randomly draw 20 bidders; 3 slots
  • Consider three behaviors for the other bidders (opponents): Stochastic, EXP3, WIN-EXP

slide-46
SLIDE 46

Different discretizations of the bidding space

slide-47
SLIDE 47

Robust to Noisy CTR Estimates: $\mathcal{N}(0, \frac{1}{m})$

Stochastic adversaries

slide-48
SLIDE 48

Robust to Noisy CTR Estimates: $\mathcal{N}(0, \frac{1}{m})$

EXP3 adversaries

slide-49
SLIDE 49

Robust to Noisy CTR Estimates: $\mathcal{N}(0, \frac{1}{m})$

WIN-EXP adversaries

slide-50
SLIDE 50

Robust to CTR/payment estimates with regression

slide-51
SLIDE 51

Conclusion

  • Design an online learning algorithm (WIN-EXP) for bidding in repeated auctions without knowing your value
  • Utilize partial feedback to achieve better regret than a generic MAB algorithm
  • Applications to many auction settings
  • Robust experimental performance

slide-52
SLIDE 52

Thanks for your attention!