SLIDE 1

Inverse Game Theory: Learning Utilities in Succinct Games

Hesam Nikpey, Pooya Shati

Social and Economical Networks

Dr. Fazli

Spring 96-97

SLIDE 2

PAPER

Inverse Game Theory: Learning Utilities in Succinct Games

Volodymyr Kuleshov and Okke Schrijvers
WINE 2015 conference

SLIDE 3

OUTLINE

Problem Introduction
Related works
Equilibrium Concepts
Succinct Games
Rationalizing a Game
Learning Utilities

SLIDE 4

PROBLEM INTRODUCTION

Classic Game Theory
Inverse Game Theory
Succinct Games

SLIDE 5

APPLICATIONS

Economics: designing mechanisms
Machine learning: helicopter autopilots
Developing predictive techniques
Forecasting agents' behavior

SLIDE 6

RELATED WORKS

Computer science:
Computational complexity of rationalizing stable matchings
Correlated equilibria
Economics:
Inferring utilities of bidders in online ad auctions
Rationalizing agent behavior

SLIDE 7

NASH EQUILIBRIUM

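Each player $j$ chooses a mixed strategy, a probability distribution over her action set $B_j$:
$q_j \in \Delta(B_j)$
and no player can gain by unilaterally changing her choice:
$\forall r_j \in \Delta(B_j):\; v_j(q_j, q_{-j}) \ge v_j(r_j, q_{-j})$

[Figure: the marginal distributions $q_1$ and $q_2$]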
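The condition above can be checked numerically on a small game. The following is a minimal sketch, not part of the original slides: the function names, the dictionary-based utility tables and the matching-pennies example are all assumptions chosen for illustration. It relies on the standard fact that checking pure deviations $r_j$ is enough, because expected utility is linear in $r_j$.

```python
from itertools import product

def expected_utility(utility, strategies, player):
    """Expected utility of `player` when every player mixes independently."""
    total = 0.0
    for profile in product(*[range(len(s)) for s in strategies]):
        prob = 1.0
        for j, action in enumerate(profile):
            prob *= strategies[j][action]
        total += prob * utility[player][profile]
    return total

def is_nash(utility, strategies, tol=1e-9):
    """Check v_j(q_j, q_-j) >= v_j(r_j, q_-j) for every pure deviation r_j."""
    for j, q_j in enumerate(strategies):
        current = expected_utility(utility, strategies, j)
        for action in range(len(q_j)):
            pure = [1.0 if a == action else 0.0 for a in range(len(q_j))]
            deviated = strategies[:j] + [pure] + strategies[j + 1:]
            if expected_utility(utility, deviated, j) > current + tol:
                return False
    return True

# Hypothetical example: matching pennies (player 0 wins on a match).
pennies = [
    {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1},
    {(0, 0): -1, (0, 1): 1, (1, 0): 1, (1, 1): -1},
]
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(is_nash(pennies, uniform))
print(is_nash(pennies, [[1.0, 0.0], [0.5, 0.5]]))
```

The enumeration over all profiles is exponential in the number of players, so this is only meant for toy games.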

SLIDE 8

CORRELATED EQUILIBRIUM

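$q$ is not necessarily a product of distributions
Equilibrium defined as (for every player $j$ and every pair of her actions $b^j_k$, $b^j_l$)
$\sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_k, b_{-j}) \ge \sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_l, b_{-j})$

[Figure: matrix of joint probabilities, from $q_{1,1}, q_{1,2}, \dots, q_{1,|B_j|}$ in the first row to $q_{|B_k|,1}, \dots, q_{|B_k|,|B_j|}$ in the last]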
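As an illustration of the inequality above, here is a small sketch that tests every (player, recommended action, deviation) triple on a joint distribution. The helper name, the data layout (dictionaries keyed by action profiles) and the game-of-chicken payoffs are assumptions made for this example, not taken from the slides.

```python
from itertools import product

def is_correlated_equilibrium(utility, num_actions, q, tol=1e-9):
    """Check sum_{b_-j} q(b_k, b_-j) * [v_j(b_k, b_-j) - v_j(b_l, b_-j)] >= 0."""
    for j, n_j in enumerate(num_actions):
        for k, l in product(range(n_j), repeat=2):
            gain = 0.0
            for profile, prob in q.items():
                if profile[j] != k:
                    continue  # only profiles where j is told to play b_k
                deviated = profile[:j] + (l,) + profile[j + 1:]
                gain += prob * (utility[j][profile] - utility[j][deviated])
            if gain < -tol:
                return False
    return True

# Hypothetical game of chicken: action 0 = dare, action 1 = chicken out.
chicken = [
    {(0, 0): 0, (0, 1): 7, (1, 0): 2, (1, 1): 6},
    {(0, 0): 0, (0, 1): 2, (1, 0): 7, (1, 1): 6},
]
q_ce = {(0, 1): 1 / 3, (1, 0): 1 / 3, (1, 1): 1 / 3}
q_bad = {(1, 1): 1.0}
print(is_correlated_equilibrium(chicken, [2, 2], q_ce))
print(is_correlated_equilibrium(chicken, [2, 2], q_bad))
```

Note that `q_ce` is not a product of two marginals, which is exactly the extra freedom a correlated equilibrium allows.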

SLIDE 9

POLYNOMIAL MIXTURE OF PRODUCTS

A specific kind of correlated equilibrium
The probability distribution is a sum of products of distributions:
$q = \sum_{l=1}^{L} r_l$
where $L$ is polynomial in the input size and every $r_l$ is a product of distributions
Every game has an easy-to-compute PMP equilibrium
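To make the mixture-of-products form concrete, the sketch below assembles a joint distribution from a short list of product components. It assumes each $r_l$ is stored as a mixture weight together with one marginal per player, so that the weights sum to 1; this storage format and the function name are illustrative assumptions rather than anything stated on the slide.

```python
from itertools import product

def mixture_of_products(components, num_actions):
    """Joint q(b) = sum_l weight_l * prod_j marginal_{l,j}(b_j)."""
    q = {}
    for profile in product(*[range(n) for n in num_actions]):
        total = 0.0
        for weight, marginals in components:
            term = weight
            for j, action in enumerate(profile):
                term *= marginals[j][action]
            total += term
        q[profile] = total
    return q

# Two pure product components: both play action 0, or both play action 1.
components = [
    (0.5, [[1.0, 0.0], [1.0, 0.0]]),
    (0.5, [[0.0, 1.0], [0.0, 1.0]]),
]
q = mixture_of_products(components, [2, 2])
print(q)
```

Storing the components takes space linear in their number, whereas the explicit table built here grows exponentially with the number of players; the table is only produced to show what distribution the mixture represents.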

SLIDE 10

SUCCINCT GAMES

Every player's utility is determined by a limited number of observations
Interesting for the small number of parameters required to represent the utility
Covering a vast number of games

SLIDE 11

SUCCINCT GAMES

Definition

LINEAR SUCCINCT GAMES

A set of (not necessarily disjoint) factors for every player and a utility for every factor
$H := [(B_j)_{j=1}^{o}, (w_j)_{j=1}^{o}, (P_j)_{j=1}^{o}]$
$P_j \in \{0,1\}^{n \times e}, \quad w_j \in \mathbb{R}^{e}$
$\forall j:\; v_j = P_j w_j$
$P_j$ is dimensionally large but has a compact representation

SLIDE 12

SUCCINCT GAMES

Example 1

GRAPHICAL GAMES

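Player $j$'s utility depends solely on her own and her neighbors' actions.
$P_j(b, b_{O(j)}) = \begin{cases} 1 & \text{if } b \text{ and } b_{O(j)} \text{ agree on the actions of } O(j) \\ 0 & \text{otherwise} \end{cases}$
where $O(j)$ is the neighborhood of player $j$.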

SLIDE 13

SUCCINCT GAMES

Example 2

CONGESTION GAMES

Players choose from possible subsets of the set of resources.
Each player pays the cost of her chosen resources according to the function:
$\sum_{f \in b_j} e_f(m_f)$
where $e_f$ is $f$'s cost function and $m_f$ is the number of players using $f$
A generalization of network flow games
$P_j(b, (f, M)) = \begin{cases} 1 & \text{if } f \in b_j \text{ and } m_f(b) = M \\ 0 & \text{otherwise} \end{cases}$
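The congestion-game encoding can be sanity-checked on a toy instance: the cost computed directly from the loads should equal the row of $P_j$ times the vector of factor values. In this sketch the resource names, the two cost functions and all helper names are made up for illustration, and the factor value for $(f, M)$ is taken to be $e_f(M)$ (a cost, so the corresponding utility would be its negation).

```python
def loads(profile):
    """m_f: number of players whose chosen subset contains resource f."""
    counts = {}
    for subset in profile:
        for f in subset:
            counts[f] = counts.get(f, 0) + 1
    return counts

def cost(profile, j, cost_fn):
    """Direct cost of player j: sum over f in b_j of e_f(m_f)."""
    m = loads(profile)
    return sum(cost_fn[f](m[f]) for f in profile[j])

def outcome_row(profile, j, factors):
    """Row of P_j for this profile: 1 where f is in b_j and m_f(b) = M."""
    m = loads(profile)
    return [1 if f in profile[j] and m[f] == M else 0 for f, M in factors]

# Hypothetical instance: two resources, two players.
resources = ["x", "y"]
cost_fn = {"x": lambda m: 2 * m, "y": lambda m: m + 3}
factors = [(f, M) for f in resources for M in (1, 2)]
w = [cost_fn[f](M) for f, M in factors]          # one value per factor
profile = (frozenset({"x"}), frozenset({"x", "y"}))
for j in range(2):
    row = outcome_row(profile, j, factors)
    print(cost(profile, j, cost_fn), sum(r * c for r, c in zip(row, w)))
```

Only the factor values `w` need to be learned; the rows of $P_j$ follow from the rules of the game, which is what makes the representation compact.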

SLIDE 14

RATIONALIZING A GAME

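First we write the correlated equilibrium condition as a linear constraint:
$\sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_k, b_{-j}) \ge \sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_l, b_{-j})$
$\rightarrow\; q^{\top} D_{jkl} v_j = q^{\top} D_{jkl} P_j w_j \ge 0$
where, for $k \ne l$, $D_{jkl}$ is
$D_{jkl}(b_{row}, b_{col}) = \begin{cases} 1 & \text{if } b_{row} = b_{col} = (b^j_k, b^{col}_{-j}) \\ -1 & \text{if } b_{row} = (b^j_k, b^{col}_{-j}) \text{ and } b_{col} = (b^j_l, b^{col}_{-j}) \\ 0 & \text{otherwise} \end{cases}$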

SLIDE 15

RATIONALIZING A GAME

Example

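For two players with two actions each, the constraint for player 1 with $k = 1$, $l = 2$ reads:
$q(b^1_1, b^2_1)\, v_1(b^1_1, b^2_1) + q(b^1_1, b^2_2)\, v_1(b^1_1, b^2_2) \ge q(b^1_1, b^2_1)\, v_1(b^1_2, b^2_1) + q(b^1_1, b^2_2)\, v_1(b^1_2, b^2_2)$
In matrix form:
$(r_1, r_2, r_3, r_4) \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} v_1(b^1_1, b^2_1) \\ v_1(b^1_1, b^2_2) \\ v_1(b^1_2, b^2_1) \\ v_1(b^1_2, b^2_2) \end{pmatrix} \ge 0$
where:
$r_1 = q(b^1_1, b^2_1)$
$r_2 = q(b^1_1, b^2_2)$
$r_3 = q(b^1_2, b^2_1)$
$r_4 = q(b^1_2, b^2_2)$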
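The matrix in this example can be generated mechanically. The sketch below builds $D_{jkl}$ for arbitrary action counts and evaluates $q^{\top} D_{jkl} v_j$; the function names, the zero-based indices and the numeric values of `q` and `v` are illustrative assumptions, and the dense matrix is only practical for tiny games.

```python
from itertools import product

def deviation_matrix(num_actions, j, k, l):
    """D_jkl with rows and columns indexed by action profiles."""
    profiles = list(product(*[range(n) for n in num_actions]))
    index = {b: i for i, b in enumerate(profiles)}
    D = [[0] * len(profiles) for _ in profiles]
    for b in profiles:
        if b[j] != k:
            continue  # rows where player j does not play b_k stay zero
        deviated = b[:j] + (l,) + b[j + 1:]
        D[index[b]][index[b]] += 1          # +v_j(b_k, b_-j)
        D[index[b]][index[deviated]] -= 1   # -v_j(b_l, b_-j)
    return profiles, D

def constraint_value(q, D, v):
    """Evaluate the scalar q^T D v."""
    n = len(q)
    return sum(q[r] * D[r][c] * v[c] for r in range(n) for c in range(n))

# Player index 0, k = 0, l = 1 corresponds to the 2x2 example above.
profiles, D = deviation_matrix([2, 2], 0, 0, 1)
for row in D:
    print(row)

q = [0.0, 1 / 3, 1 / 3, 1 / 3]   # hypothetical joint distribution
v = [0, 7, 2, 6]                 # hypothetical utilities of player 0
print(constraint_value(q, D, v) >= 0)
```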

SLIDE 16

NON-DEGENERACY CONDITION

To avoid trivial, uninteresting solutions like $w_j = 0$, we add the condition:
$\forall j:\; \sum_{l=1}^{e} w_{jl} = 1$
Furthermore, by adding constraints or tweaking the objective function of the optimization problem:
We can limit the answer space
We can add conditions based on prior knowledge of valuations and their coupling
We can encourage properties like sparsity and entropy

SLIDE 17

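INVERSE-UTILITY PROBLEM

FORMAL DEFINITION

A set of $M$ partially observed succinct games with $o$ players each (the utilities are unobserved):
$H_m = [(B_{jm})_{j=1}^{o},\; \cdot\;,\; (P_{jm})_{j=1}^{o}]$ for $m \in \{1, 2, \dots, M\}$
Each with an equilibrium: $(q_m)_{m=1}^{M}$
Find $(w_j)_{j=1}^{o}$
such that $\forall m, j, k, l:\; q_m^{\top} D_{jklm} P_{jm} w_j \ge 0$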

SLIDE 18

COMPUTABILITY PROPERTY

We need to compute $d_{jkl}^{\top} = q^{\top} D_{jkl} P_j$ efficiently
Computing the probability of each factor is feasible in games that possess this property:
The following sum can be computed in polynomial time for any factor $p$, product distribution $q$ and action $b^j_k$:
$\sum_{b_{-j}:\, (b^j_k, b_{-j}) \in B_j(p)} q(b_{-j})$
where $B_j(p)$ is the set of outcomes that trigger factor $p$.

SLIDE 19

COMPUTABILITY PROPERTY

Example

CONGESTION GAMES

Each factor is a tuple $(f, M)$, meaning that player $j$ and $M - 1$ other players used the resource $f$
The answer for the $f \notin b^j_k$ case is trivial
Otherwise we use dynamic programming to compute the probability of the sum of Bernoulli random variables being $M - 1$
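The dynamic program mentioned here is the standard recurrence for the distribution of a sum of independent, non-identical Bernoulli variables. A minimal sketch follows, assuming that under the product distribution each other player $i$ uses resource $f$ independently with a known probability (the total mass her marginal puts on subsets containing $f$); the function names and the probabilities in the example are hypothetical.

```python
def load_distribution(use_probs):
    """dist[c] = P(exactly c of the independent players use the resource)."""
    dist = [1.0]
    for p in use_probs:
        nxt = [0.0] * (len(dist) + 1)
        for c, mass in enumerate(dist):
            nxt[c] += mass * (1.0 - p)   # this player skips the resource
            nxt[c + 1] += mass * p       # this player uses the resource
        dist = nxt
    return dist

def factor_probability(use_probs_others, M):
    """P(exactly M - 1 other players use f), for a factor (f, M)."""
    dist = load_distribution(use_probs_others)
    return dist[M - 1] if 1 <= M <= len(dist) else 0.0

# Two other players using the resource with probabilities 0.2 and 0.5.
print(load_distribution([0.2, 0.5]))
print(factor_probability([0.2, 0.5], 2))
```

The table has at most one entry per possible load and is updated once per player, so the running time is quadratic in the number of players rather than exponential.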

SLIDE 20

LEARNING UTILITIES

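Computing $D_{jkl}^{\top}$ (1)

We had
$\sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_k, b_{-j}) \ge \sum_{b_{-j}} q(b^j_k, b_{-j})\, v_j(b^j_l, b_{-j})$
Rewriting the left-hand side:
$\sum_{b_{-j}} q(b^j_k, b_{-j}) \sum_{p \in U_j(b^j_k, b_{-j})} w_j(p)$
$= \sum_{p \in P_j} \sum_{b_{-j}:\, (b^j_k, b_{-j}) \in B_j(p)} q(b^j_k, b_{-j})\, w_j(p)$
$= \sum_{p \in P_j} w_j(p) \sum_{b_{-j}:\, (b^j_k, b_{-j}) \in B_j(p)} q(b^j_k, b_{-j})$
where $U_j(b) = \{p \mid P_j(b, p) = 1\}$ represents the set of factors triggered by $b$
Similarly, for the right-hand side we have:
$\sum_{p \in P_j} w_j(p) \sum_{b_{-j}:\, (b^j_l, b_{-j}) \in B_j(p)} q(b^j_k, b_{-j})$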

SLIDE 21

LEARNING UTILITIES

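Computing $D_{jkl}^{\top}$ (2)

Subtracting the two results we have:
$\sum_{p \in P_j} w_j(p) \Big[ \sum_{b_{-j}:\, (b^j_k, b_{-j}) \in B_j(p)} q(b^j_k, b_{-j}) - \sum_{b_{-j}:\, (b^j_l, b_{-j}) \in B_j(p)} q(b^j_k, b_{-j}) \Big] \ge 0$
Since $q$ is a product of distributions, $q(b^j_k, b_{-j}) = q_j(b^j_k)\, q(b_{-j})$, so the common factor $q_j(b^j_k)$ can be pulled out:
$\sum_{p \in P_j} w_j(p) \Big[ \sum_{b_{-j}:\, (b^j_k, b_{-j}) \in B_j(p)} q(b_{-j}) - \sum_{b_{-j}:\, (b^j_l, b_{-j}) \in B_j(p)} q(b_{-j}) \Big] \ge 0$
The remaining inequality is the dot product of $w_j$ and another vector (namely $d_{jkl}^{\top}$), which we know how to compute efficiently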
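The bracketed term in the final inequality is the coefficient of $w_j(p)$, and it can be cross-checked by brute force on a tiny game. The sketch below is an assumption-laden illustration: `row_of_P` is a hypothetical callback returning the 0/1 row of $P_j$ for a full action profile, the other players' marginals are given explicitly, and the enumeration over $b_{-j}$ is exponential, which is exactly the cost the computability property is meant to avoid.

```python
from itertools import product

def coefficient_vector(row_of_P, others_marginals, num_factors, j, k, l):
    """d(p) = sum_{b_-j} q(b_-j) * (P_j[(b_k, b_-j), p] - P_j[(b_l, b_-j), p])."""
    d = [0.0] * num_factors
    for others in product(*[range(len(m)) for m in others_marginals]):
        prob = 1.0
        for marginal, action in zip(others_marginals, others):
            prob *= marginal[action]
        with_k = others[:j] + (k,) + others[j:]   # insert j's action at slot j
        with_l = others[:j] + (l,) + others[j:]
        row_k, row_l = row_of_P(with_k), row_of_P(with_l)
        for p in range(num_factors):
            d[p] += prob * (row_k[p] - row_l[p])
    return d

# Trivial succinct form of a 2x2 game: one factor per full outcome.
def identity_row(profile):
    row = [0, 0, 0, 0]
    row[2 * profile[0] + profile[1]] = 1
    return row

w = [0, 7, 2, 6]                       # hypothetical factor values
d = coefficient_vector(identity_row, [[0.25, 0.75]], 4, 0, 0, 1)
print(d, sum(di * wi for di, wi in zip(d, w)))
```

Here the dot product equals $0.25 \cdot (0 - 2) + 0.75 \cdot (7 - 6)$, the expected gain of action 0 over action 1 against the other player's mixed strategy.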

SLIDE 22

LEARNING UTILITIES

Optimization Problem

Combining these linear constraints over every game yields a linear program whose solutions are valid valuations for each player:
Minimize $\sum_{j=1}^{o} g(w_j)$
Subject to $d_{jkl}^{\top} w_j \ge 0 \quad \forall j, k, l$
$\mathbf{1}^{\top} w_j = 1 \quad \forall j$
Of course, the resulting program is not necessarily feasible
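To see how feasibility can fail, consider the smallest nontrivial case: a single player with two factors. The normalization forces $w = (t, 1 - t)$, so each constraint becomes a one-sided bound on $t$ and the feasible set is an interval. The sketch below handles only this special case and ignores the objective $g$; it is an illustration under those assumptions, not a general LP solver, and the constraint vectors are invented.

```python
def feasible_interval(constraints):
    """Feasible t for w = (t, 1 - t) with d1 * t + d2 * (1 - t) >= 0 for all (d1, d2)."""
    lo, hi = float("-inf"), float("inf")
    for d1, d2 in constraints:
        slope = d1 - d2          # the constraint reads slope * t >= -d2
        if slope > 0:
            lo = max(lo, -d2 / slope)
        elif slope < 0:
            hi = min(hi, -d2 / slope)
        elif d2 < 0:
            return None          # 0 * t >= -d2 fails for every t
    return (lo, hi) if lo <= hi else None

print(feasible_interval([(1, -1), (-1, 3)]))   # a nonempty range of t
print(feasible_interval([(1, -1), (-3, 1)]))   # contradictory constraints
```

With more factors the same question is answered by a general linear programming solver; any point of the feasible set that minimizes $g$ is then a valid choice of $w_j$.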

SLIDE 23

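INVERSE-GAME PROBLEM

FORMAL DEFINITION

A set of $M$ partially observed succinct games with $o$ players each (neither the utilities nor the structure are observed):
$H_m = [(B_{jm})_{j=1}^{o},\; \cdot\;,\; \cdot\;]$ for $m \in \{1, 2, \dots, M\}$
Each with an equilibrium: $(q_m)_{m=1}^{M}$
Each with a set of candidate structures: $(T_m)_{m=1}^{M}$
Find $(w_j)_{j=1}^{o}$ and choose a structure $(P_{jmh})_{j=1}^{o}$ from $T_m$ for each game
such that $\forall m, j, k, l:\; q_m^{\top} D_{jklm} P_{jmh} w_j \ge 0$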

SLIDE 24

INVERSE-GAME PROBLEM

NP-HARDNESS

PROOF SKETCH

3-SAT reduction to a sequence of graphical games
For every variable, a vertex with true and false actions, plus one base player with only one action
For every clause, a game with three candidate structures, each containing a single edge between one of the literals and the base node
Nodes of positive literals play true and nodes of negative literals play false, both as pure strategies.

SLIDE 25

THANKS FOR YOUR ATTENTION

Q & A