Inverse Game Theory: Learning Utilities in Succinct Games

SLIDE 1

Inverse Game Theory: Learning Utilities in Succinct Games

Hesam Nikpey, Pooya Shati
Social and Economical Networks

  • Dr. Fazli

Spring 96-97

SLIDE 2

PAPER

  • Inverse Game Theory: Learning Utilities in Succinct Games

  • Volodymyr Kuleshov and Okke Schrijvers
  • WINE 2015 conference

SLIDE 3

OUTLINE

  • Problem Introduction
  • Related works
  • Equilibrium Concepts
  • Succinct Games
  • Rationalizing a Game
  • Learning Utilities

SLIDE 4

PROBLEM INTRODUCTION

  • Classic Game Theory
  • Inverse Game Theory
  • Succinct Games

SLIDE 5

APPLICATIONS

  • Economics: mechanism design
  • Machine learning: helicopter autopilots
  • Developing predictive techniques
  • Forecasting the agents’ behavior

SLIDE 6

RELATED WORKS

  • Computer science:
      • Computational complexity of rationalizing stable matchings
      • Correlated equilibria
  • Economics:
      • Inferring utilities of bidders in online ad auctions
      • Rationalizing agent behavior

SLIDE 7

NASH EQUILIBRIUM

  • Each player $i$ chooses a mixed strategy over her action set $A_i$:
  • $p_i \in \Delta(A_i)$
  • And no one is interested in changing her choice:
  • $\forall q_i \in \Delta(A_i):\; u_i(p_i, p_{-i}) \ge u_i(q_i, p_{-i})$

SLIDE 8

CORRELATED EQUILIBRIUM

  • The joint distribution $p$ is not necessarily a product of distributions
  • Equilibrium defined as: for every player $i$ and every pair of actions $a_j^i, a_k^i \in A_i$,
  • $\sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_j^i, a_{-i}) \ge \sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_k^i, a_{-i})$
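The correlated-equilibrium inequalities can be checked directly on a small game. The following is a minimal sketch, not code from the paper: the function name, the dictionary-based representation of $p$ and $u_i$, and the Game of Chicken payoffs are all assumptions chosen for illustration.

```python
from itertools import product

def is_correlated_equilibrium(p, u, action_counts, tol=1e-9):
    """Check every correlated-equilibrium inequality for a joint distribution.

    p: dict mapping action profiles (tuples of action indices) to probabilities.
    u: list of dicts; u[i][profile] is player i's utility for that profile.
    action_counts: number of actions available to each player.
    """
    profiles = list(product(*(range(n) for n in action_counts)))
    for i, n_actions in enumerate(action_counts):
        for j in range(n_actions):        # recommended action a_j
            for k in range(n_actions):    # candidate deviation a_k
                gain = 0.0
                for a in profiles:
                    if a[i] != j:
                        continue
                    deviated = a[:i] + (k,) + a[i + 1:]
                    gain += p.get(a, 0.0) * (u[i][a] - u[i][deviated])
                if gain < -tol:           # deviating to a_k would pay off
                    return False
    return True

# Illustrative Game of Chicken (assumed payoffs): 0 = dare, 1 = chicken out.
u1 = {(0, 0): 0, (0, 1): 7, (1, 0): 2, (1, 1): 6}
u2 = {(0, 0): 0, (0, 1): 2, (1, 0): 7, (1, 1): 6}
p_correlated = {(0, 1): 1 / 3, (1, 0): 1 / 3, (1, 1): 1 / 3}
p_both_dare = {(0, 0): 1.0}

print(is_correlated_equilibrium(p_correlated, [u1, u2], (2, 2)))
print(is_correlated_equilibrium(p_both_dare, [u1, u2], (2, 2)))
```

The loop enumerates all action profiles, so it is only practical for tiny games; the later slides are about avoiding exactly this enumeration.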

SLIDE 9

POLYNOMIAL MIXTURE OF PRODUCTS

  • A specific kind of correlated equilibrium
  • The probability distribution is a sum of products of distributions:
  • $p = \sum_{k=1}^{K} q_k$
  • Where $K$ is polynomial in the input size and every $q_k$ is a product of distributions
  • Every game has an easy-to-compute PMP equilibrium

SLIDE 10

SUCCINCT GAMES

  • Every player's utility is determined by a limited number of observations
  • Interesting because only a small number of parameters is required to represent the utility
  • They cover a wide range of games

SLIDE 11

SUCCINCT GAMES

Definition

LINEAR SUCCINCT GAMES

  • A set of (not necessarily disjoint) factors for every player and a utility for every factor
  • $G := [(A_i)_{i=1}^{n}, (v_i)_{i=1}^{n}, (O_i)_{i=1}^{n}]$
  • $O_i \in \{0,1\}^{m \times d}$, $v_i \in \mathbb{R}^{d}$
  • $\forall i:\; u_i = O_i v_i$
  • $O_i$ is dimensionally large but has a compact representation

SLIDE 12

SUCCINCT GAMES

Example 1

GRAPHICAL GAMES

  • Player $i$'s utility depends solely on her own and her neighbors' actions.
  • $(O_i)_{a,\, a_{N(i)}} = 1$ if $a$ and $a_{N(i)}$ agree on the actions of $N(i)$, and $0$ otherwise.
SLIDE 13

SUCCINCT GAMES

Example 2

CONGESTION GAMES

  • Players choose from possible subsets of the set of resources.
  • Each player pays the cost of its chosen resources according to the function:
  • $\sum_{e \in a_i} d_e(\ell_e)$
  • Where $d_e$ is the cost function of resource $e$ and $\ell_e$ is the number of players using $e$
  • A generalization of network flow games
  • $(O_i)_{a,\,(e,L)} = 1$ if $e \in a_i$ and $\ell_e(a) = L$, and $0$ otherwise.
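To make the representation $u_i = O_i v_i$ concrete for congestion games, here is a minimal sketch that builds one row of $O_i$ and evaluates the utility as a dot product. The resource names, the helper functions, and the numeric parameters in `v` are hypothetical; in particular, storing each parameter as the negated cost of a factor $(e, L)$ is an assumption made for the example.

```python
def congestion_outcome_row(profile, i, factors):
    """Row of O_i for one action profile.

    The entry for factor (e, L) is 1 when player i uses resource e
    and exactly L players use e in this profile, and 0 otherwise.
    """
    load = {}
    for chosen in profile:
        for e in chosen:
            load[e] = load.get(e, 0) + 1
    return [1 if e in profile[i] and load[e] == L else 0 for (e, L) in factors]

def utility(row, v):
    """u_i(a) = (O_i v_i)(a): dot product of one row of O_i with v_i."""
    return sum(x * w for x, w in zip(row, v))

resources = ["r1", "r2"]
n_players = 2
# Factors are ordered (r1,1), (r1,2), (r2,1), (r2,2).
factors = [(e, L) for e in resources for L in range(1, n_players + 1)]
# Hypothetical parameters: v[(e, L)] = -(cost of e when L players use it).
v = [-1.0, -3.0, -2.0, -5.0]

# Player 0 uses r1 only; player 1 uses r1 and r2.
profile = (frozenset({"r1"}), frozenset({"r1", "r2"}))
row0 = congestion_outcome_row(profile, 0, factors)
row1 = congestion_outcome_row(profile, 1, factors)
print(row0, utility(row0, v))
print(row1, utility(row1, v))
```

Only the rows that are needed get materialized, which reflects the point that $O_i$ is large but compactly described.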
SLIDE 14

RATIONALIZING A GAME

  • First we write the correlated equilibrium as a linear constraint:
  • $\sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_j^i, a_{-i}) \ge \sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_k^i, a_{-i})$
    $\;\rightarrow\; p^T C_{ijk} u_i = p^T C_{ijk} O_i v_i \ge 0$
  • Where $C_{ijk}$ is the matrix whose rows and columns are indexed by action profiles:
  • $C_{ijk}(a_{row}, a_{col}) = \begin{cases} 1 & \text{if } a_{row} = (a_j^i, a_{-i}) \text{ and } a_{col} = a_{row} \\ -1 & \text{if } a_{row} = (a_j^i, a_{-i}) \text{ and } a_{col} = (a_k^i, a_{-i}) \\ 0 & \text{otherwise.} \end{cases}$

SLIDE 15

RATIONALIZING A GAME

Example

  • Two players with two actions each, $i = 1$, $j = 1$, $k = 2$:
  • $p(a_1^1, a_1^2)\, u_1(a_1^1, a_1^2) + p(a_1^1, a_2^2)\, u_1(a_1^1, a_2^2) \ge p(a_1^1, a_1^2)\, u_1(a_2^1, a_1^2) + p(a_1^1, a_2^2)\, u_1(a_2^1, a_2^2)$
  • $(q_1, q_2, q_3, q_4) \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u_1(a_1^1, a_1^2) \\ u_1(a_1^1, a_2^2) \\ u_1(a_2^1, a_1^2) \\ u_1(a_2^1, a_2^2) \end{pmatrix} \ge 0$
  • Where:
  • $q_1 = p(a_1^1, a_1^2)$
  • $q_2 = p(a_1^1, a_2^2)$
  • $q_3 = p(a_2^1, a_1^2)$
  • $q_4 = p(a_2^1, a_2^2)$
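The matrix in this example can be generated mechanically. The sketch below is illustrative only: the function names are hypothetical, players and actions are 0-indexed, and the matrix is built densely, which is feasible only for very small games. It constructs the deviation matrix for player 0 with recommended action 0 and deviation 1, then evaluates the constraint for an assumed distribution and utility vector.

```python
from itertools import product

def build_C(action_counts, i, j, k):
    """Dense deviation matrix for player i, recommendation j, deviation k.

    Rows index the distribution p, columns index the utility vector u_i;
    both are ordered like itertools.product over the action indices.
    """
    profiles = list(product(*(range(n) for n in action_counts)))
    index = {a: r for r, a in enumerate(profiles)}
    C = [[0] * len(profiles) for _ in profiles]
    for a in profiles:
        if a[i] != j:
            continue                      # rows where i is not told a_j stay 0
        deviated = a[:i] + (k,) + a[i + 1:]
        C[index[a]][index[a]] += 1        # utility of following the advice
        C[index[a]][index[deviated]] -= 1 # utility of deviating to a_k
    return profiles, C

def quad_form(p, C, u):
    """Evaluate p^T C u for list-based vectors and a list-of-lists matrix."""
    return sum(p[r] * C[r][c] * u[c]
               for r in range(len(p)) for c in range(len(u)))

profiles, C = build_C((2, 2), i=0, j=0, k=1)
for row in C:
    print(row)

# Assumed numbers: p puts all mass on profiles where player 0 plays action 0.
p = [0.5, 0.5, 0.0, 0.0]
u = [3.0, 1.0, 2.0, 0.0]
print(quad_form(p, C, u))
```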

SLIDE 16

NON-DEGENERACY CONDITION

  • To avoid trivial, uninteresting solutions like $v_i = 0$
  • We add the condition:
  • $\forall i:\; \sum_{k=1}^{d} v_{ik} = 1$
  • Furthermore, by adding constraints or tweaking the objective function of the optimization problem:
      • We can limit the answer space
      • We can add conditions based on prior knowledge of valuations and their coupling
      • We can encourage properties like sparsity and entropy

SLIDE 17

INVERSE-UTILITY PROBLEM

FORMAL DEFINITION

  • A set of $L$ partially observed succinct $n$-player games:
  • $G_l = [(A_{il})_{i=1}^{n},\; \cdot\;,\; (O_{il})_{i=1}^{n}]$ for $l \in \{1, 2, \dots, L\}$ (the utility parameters are unobserved)
  • Each with an equilibrium: $(p_l)_{l=1}^{L}$
  • Find $(v_i)_{i=1}^{n}$
  • Such that $\forall l, i, j, k:\; p_l^T C_{ijkl} O_{il} v_i \ge 0$

SLIDE 18

COMPUTABILITY PROPERTY

  • We need to compute $c_{ijk}^T = p^T C_{ijk} O_i$ efficiently
  • Computing the probability of each factor is feasible in games that possess this property:
  • The following sum can be computed in polynomial time for any factor $o$, product distribution $p$ and action $a_j^i$, where $A_i(o)$ denotes the action profiles that trigger factor $o$:
  • $\sum_{a_{-i}:\, (a_j^i, a_{-i}) \in A_i(o)} p(a_{-i})$

SLIDE 19

COMPUTABILITY PROPERTY

Example

CONGESTION GAMES

  • Each factor is a tuple $(e, L)$ meaning that player $i$ and $L - 1$ other players used the resource $e$
  • The answer for the $e \notin a_j^i$ case is trivial
  • Otherwise we use dynamic programming to compute the probability that a sum of Bernoulli random variables equals $L - 1$
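The dynamic program mentioned here can be sketched as follows. This is one standard way to compute the distribution of a sum of independent Bernoulli variables, not necessarily the exact procedure used in the paper. The function names are hypothetical, and `use_probs[r]` is assumed to be the total probability that player `r` picks an action containing the resource under the product distribution.

```python
def count_distribution(probs):
    """dist[s] = probability that exactly s of the independent Bernoulli
    variables with success probabilities `probs` equal 1."""
    dist = [1.0]                      # with no variables, the sum is 0
    for q in probs:
        nxt = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            nxt[s] += mass * (1.0 - q)    # this player skips the resource
            nxt[s + 1] += mass * q        # this player uses the resource
        dist = nxt
    return dist

def factor_probability(use_probs, i, L):
    """Probability that exactly L - 1 players other than i use the resource."""
    others = [q for r, q in enumerate(use_probs) if r != i]
    dist = count_distribution(others)
    return dist[L - 1] if 0 <= L - 1 < len(dist) else 0.0

print(count_distribution([0.5, 0.5]))
print(factor_probability([1.0, 0.5, 0.5], i=0, L=2))
```

The table has at most one entry per possible count, so the running time is quadratic in the number of players rather than exponential.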

SLIDE 20

LEARNING UTILITIES

Computing $c_{ijk}^T$ (1)

  • We had
    $\sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_j^i, a_{-i}) \ge \sum_{a_{-i}} p(a_j^i, a_{-i})\, u_i(a_k^i, a_{-i})$
  • Rewriting the left-hand side:
  • $\sum_{a_{-i}} p(a_j^i, a_{-i}) \sum_{o \in T_i(a_j^i, a_{-i})} v_i(o)$
  • $= \sum_{o \in O_i} \sum_{a_{-i}:\, (a_j^i, a_{-i}) \in A_i(o)} p(a_j^i, a_{-i})\, v_i(o)$
  • $= \sum_{o \in O_i} v_i(o) \sum_{a_{-i}:\, (a_j^i, a_{-i}) \in A_i(o)} p(a_j^i, a_{-i})$
  • Where $T_i(a) = \{o \mid O_{a,o} = 1\}$ represents the set of factors triggered by $a$, and $A_i(o)$ is the set of action profiles that trigger factor $o$
  • Similarly for the right-hand side we have:
  • $= \sum_{o \in O_i} v_i(o) \sum_{a_{-i}:\, (a_k^i, a_{-i}) \in A_i(o)} p(a_j^i, a_{-i})$

SLIDE 21

LEARNING UTILITIES

Computing $c_{ijk}^T$ (2)

  • Subtracting the two results we have:
  • $\sum_{o \in O_i} v_i(o) \Big[ \sum_{a_{-i}:\, (a_j^i, a_{-i}) \in A_i(o)} p(a_j^i, a_{-i}) - \sum_{a_{-i}:\, (a_k^i, a_{-i}) \in A_i(o)} p(a_j^i, a_{-i}) \Big] \ge 0$
  • Since $p$ is a product of distributions, $p(a_j^i, a_{-i}) = p_i(a_j^i)\, p(a_{-i})$, so the common non-negative factor $p_i(a_j^i)$ can be factored out:
  • $\sum_{o \in O_i} v_i(o) \Big[ \sum_{a_{-i}:\, (a_j^i, a_{-i}) \in A_i(o)} p(a_{-i}) - \sum_{a_{-i}:\, (a_k^i, a_{-i}) \in A_i(o)} p(a_{-i}) \Big] \ge 0$
  • The remaining inequality is the dot product of $v_i$ and another vector (namely $c_{ijk}^T$), which we know how to compute efficiently

SLIDE 22

LEARNING UTILITIES

Optimization Problem

  • Combining these linear programs for every game yields valid valuations for each player:
  • Minimize $\sum_{i=1}^{n} f(v_i)$
  • Subject to $c_{ijk}^T v_i \ge 0 \quad \forall i, j, k$
    $\mathbf{1}^T v_i = 1 \quad \forall i$
  • Of course, the resulting program is not necessarily feasible
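Solving the program requires an LP solver, which is outside the scope of this sketch. What can be shown with the standard library alone is the feasibility test that any candidate $v_i$ must pass. The function name and the constraint vectors below are made-up illustrations, not data from the paper.

```python
def is_feasible(constraints, v, tol=1e-9):
    """Check one player's candidate valuation vector against the program.

    constraints: list of coefficient vectors c_ijk (one per pair j, k).
    v: candidate valuation vector v_i.
    """
    # Non-degeneracy: the entries of v must sum to 1.
    normalized = abs(sum(v) - 1.0) <= tol
    # Rationalization: every constraint vector has a non-negative dot product.
    rationalizes = all(
        sum(c * w for c, w in zip(row, v)) >= -tol for row in constraints
    )
    return normalized and rationalizes

# Hypothetical constraints over three factors: they force v[0] >= v[1] >= v[2].
constraints = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]

print(is_feasible(constraints, [0.5, 0.3, 0.2]))   # ordered and normalized
print(is_feasible(constraints, [0.2, 0.3, 0.5]))   # violates the ordering
print(is_feasible(constraints, [0.0, 0.0, 0.0]))   # the degenerate solution
```

The last call shows why the normalization constraint matters: the all-zero vector satisfies every rationalization constraint and is rejected only by the sum-to-one condition.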

SLIDE 23

INVERSE-GAME PROBLEM

FORMAL DEFINITION

  • A set of $L$ partially observed succinct $n$-player games:
  • $G_l = [(A_{il})_{i=1}^{n},\; \cdot\;,\; \cdot\;]$ for $l \in \{1, 2, \dots, L\}$ (neither the utility parameters nor the structure are observed)
  • Each with an equilibrium: $(p_l)_{l=1}^{L}$
  • Each with a set of candidate structures: $(S_l)_{l=1}^{L}$
  • Find $(v_i)_{i=1}^{n}$ and choose a structure $(O_{ilg})_{i=1}^{n}$ for each game
  • Such that $\forall l, i, j, k:\; p_l^T C_{ijkl} O_{ilg} v_i \ge 0$

SLIDE 24

INVERSE- GAME PROBLEM

NP-HARDNESS

PROOF SKETCH

  • 3-SAT reduction to a sequence of graphical games
  • For every variable, a vertex with true and false actions, plus one base player with only one action
  • For every clause, a game with three candidate structures, each containing a single edge between one of the literals and the base node
  • Positive nodes play true and negative nodes play false, as pure strategies.

SLIDE 25

THANKS FOR YOUR ATTENTION Q & A