SLIDE 1

Game Theory: Lecture #11

Outline:

  • Strategic form games
  • Best Response
  • Nash equilibrium
SLIDE 2

Strategic games

  • Setup: Strategic games

    – Set of players, $N = \{1, \dots, n\}$
    – A set of actions $A_i$ for each player $i \in N$
    – This induces the set of action profiles $A = A_1 \times A_2 \times \dots \times A_n$
    – For each player, preferences over action profiles are characterized by a function $U_i : A \to \mathbb{R}$

  • Descriptive agenda: What is a reasonable prediction of social behavior? Alternatively, what should an agent do in a given game?
  • Fundamental challenge: An agent’s desired choice depends heavily on the agent’s model of the other agents in the game

  • Previous focus: Zero-sum games

    – Model of other agents: worst-case / adversarial
    – Reasonable choice: Security strategies
    – Expected performance: Security levels

  • Are security strategies a reasonable choice beyond zero-sum games?
  • Example:

              L        R
        T    2, 2     0, 0
        B    0, 0     ε, ε
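To see why security strategies can be a poor fit here, the sketch below computes the row player's maximin mixed strategy for this 2x2 example. It assumes ε is a small positive number (the slide does not give a value; ε = 1/10 is chosen purely for illustration), and the helper name `security_strategy_2x2` is hypothetical. Exact arithmetic requires integer or `Fraction` payoffs.

```python
from fractions import Fraction

def security_strategy_2x2(payoff):
    """Maximin mixed strategy for the row player of a 2x2 game.

    payoff[r][c] is the row player's payoff (ints or Fractions).
    Returns (p, value), where p is the probability placed on the first row.
    """
    (a, b), (c, d) = payoff

    def worst_case(p):
        # Guaranteed payoff: the column player picks the column that hurts most.
        return min(a * p + c * (1 - p), b * p + d * (1 - p))

    # The minimum of two linear functions of p is maximized at an endpoint
    # or where the two lines cross.
    candidates = [Fraction(0), Fraction(1)]
    denom = (a - c) - (b - d)
    if denom != 0:
        p_cross = Fraction(d - c, denom)
        if 0 <= p_cross <= 1:
            candidates.append(p_cross)

    best = max(candidates, key=worst_case)
    return best, worst_case(best)

eps = Fraction(1, 10)  # assumed value of epsilon, for illustration only
p_top, level = security_strategy_2x2([[2, 0], [0, eps]])
print(p_top, level)  # 1/21 2/21
```

Under this assumption the security strategy puts probability ε/(2 + ε) on T, so the row player almost always plays B and secures only about ε, even though (T, L) would give both players 2.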

SLIDE 3

Nash Equilibrium

  • Alternative model: Agents are contingent optimizers
  • Definition: The action profile $a^*$ is a Nash equilibrium if for every player $i$,

        $U_i(a^*) = U_i(a_i^*, a_{-i}^*) \ge U_i(a_i, a_{-i}^*)$

    for every $a_i \in A_i$.

  • Notation:

    – $\{-i\}$ represents all players other than $i$, i.e., $\{-i\} = \{1, \dots, i-1, i+1, \dots, n\}$
    – $a_{-i}$ represents the choices of all players other than $i$, i.e., $a_{-i} = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n)$

  • Compare:

    – Optimization case: An optimizer will play the best action
    – Nash equilibrium: An action profile in which each player is acting as an optimizer
    – The term “rational” implies that an agent is an “optimizer”

  • View: A Nash equilibrium is a reasonable outcome associated with rational players
  • Alternative definition of Nash equilibrium:

    – Definition: The best response function of player $i$, $B_i(\cdot)$, is

        $B_i(a_{-i}) = \{a_i : U_i(a_i, a_{-i}) \ge U_i(a_i', a_{-i}) \text{ for all } a_i' \in A_i\}$

      Note that the best response “function” is actually a “set”.
    – An action profile $a^*$ is a Nash equilibrium if for every player $i$,

        $a_i^* \in B_i(a_{-i}^*)$

      i.e., each player is playing a best response to the actions of the other players.

  • Nash equilibrium restated: No player has a unilateral incentive to change action

SLIDE 4

Descriptive question: What’s the outcome?

  • Descriptive agenda: What is a reasonable prediction of social behavior?

    – Does a Nash equilibrium exist?
    – Is a Nash equilibrium unique?
    – Which Nash equilibrium?
    – Why Nash equilibrium?

  • Prisoner’s dilemma:

    – Setup: Cooperate vs. Defect?

              C        D
        C    3, 3     0, 4
        D    4, 0     1, 1

    – Also used to model: work vs. shirk? arm vs. disarm?
    – What if played several times?

  • Bach or Stravinsky: (coordination)

              B        S
        B    2, 1     0, 0
        S    0, 0     1, 2

  • Stag hunt: (safety and social cooperation)

                 Stag     Hare
        Stag     2, 2     0, 1
        Hare     1, 0     1, 1

  • Typewriter: QWERTY vs. Dvorak (social norms and conventions)

                Alt      Std
        Alt     3, 3     0, 0
        Std     0, 0     1, 1
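The existence and uniqueness questions above can be explored by brute force for small games. The following is a minimal sketch that checks every action profile of the four games on this slide against the Nash condition (no player gains by a unilateral deviation). The function names and the dictionary encoding of the payoff tables are illustrative choices, not part of the lecture.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """payoffs[(r, c)] = (u_row, u_col) for a two-player game.

    Returns all pure action profiles where neither player can gain
    by deviating unilaterally.
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

def game(actions, table):
    """Build a payoff dictionary from a square table of (u_row, u_col) pairs."""
    return {(r, c): table[i][j]
            for i, r in enumerate(actions)
            for j, c in enumerate(actions)}

prisoners_dilemma = game(["C", "D"], [[(3, 3), (0, 4)], [(4, 0), (1, 1)]])
bach_or_stravinsky = game(["B", "S"], [[(2, 1), (0, 0)], [(0, 0), (1, 2)]])
stag_hunt = game(["Stag", "Hare"], [[(2, 2), (0, 1)], [(1, 0), (1, 1)]])
typewriter = game(["Alt", "Std"], [[(3, 3), (0, 0)], [(0, 0), (1, 1)]])

print(pure_nash_equilibria(prisoners_dilemma))   # [('D', 'D')]
print(pure_nash_equilibria(bach_or_stravinsky))  # [('B', 'B'), ('S', 'S')]
```

The prisoner's dilemma has a single pure equilibrium, while the three coordination-style games each have two, which is what makes the "which Nash equilibrium?" question non-trivial.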

SLIDE 5

Nash equilibrium, cont

  • There can be one NE, multiple NE, or no NE
  • Examples: Prisoner’s dilemma, BoS, Stag Hunt, Typewriter, and matching pennies:

               H          T
        H    1, −1     −1, 1
        T    −1, 1     1, −1

  • Curiosity: Equilibrium of what?
  • Cournot adjustment process: At stage $k$, player $i$ uses a best response to the move of players $-i$ at stage $k - 1$

  • Consider matching pennies:

    – Stage 1: (H, H)
    – Stage 2: (H, T)
    – Stage 3: (T, T)
    – Stage 4: (T, H)
    – Stage 5: ...

  • NE is an equilibrium of Cournot adjustment process
  • Cournot adjustment process need not converge to a NE (e.g., stag hunt)
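The cycling behavior described above can be reproduced with a short simulation. This is a sketch under the assumption that both players update simultaneously and that ties, if any, are broken in favor of the first listed action; the function names are illustrative.

```python
def best_response(payoffs, actions, player, others_action):
    """Best pure action of `player` (0 = row, 1 = column) against a fixed opponent action.

    Ties are broken in favor of the action listed first.
    """
    def utility(a):
        profile = (a, others_action) if player == 0 else (others_action, a)
        return payoffs[profile][player]
    return max(actions, key=utility)

def cournot_adjustment(payoffs, actions, start, stages):
    """At each stage, both players best-respond to the opponent's previous action."""
    path = [start]
    for _ in range(stages - 1):
        r, c = path[-1]
        path.append((best_response(payoffs, actions, 0, c),
                     best_response(payoffs, actions, 1, r)))
    return path

matching_pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
                    ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
stag_hunt = {("Stag", "Stag"): (2, 2), ("Stag", "Hare"): (0, 1),
             ("Hare", "Stag"): (1, 0), ("Hare", "Hare"): (1, 1)}

pennies_path = cournot_adjustment(matching_pennies, ["H", "T"], ("H", "H"), 5)
print(pennies_path)
# [('H', 'H'), ('H', 'T'), ('T', 'T'), ('T', 'H'), ('H', 'H')]

hunt_path = cournot_adjustment(stag_hunt, ["Stag", "Hare"], ("Stag", "Hare"), 4)
print(hunt_path)
# [('Stag', 'Hare'), ('Hare', 'Stag'), ('Stag', 'Hare'), ('Hare', 'Stag')]
```

In the stag hunt the process started at a miscoordinated profile keeps swapping roles and never reaches either equilibrium, whereas a start at (Stag, Stag) or (Hare, Hare) stays put.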

SLIDE 6

Example: Routing game

[Figure: two routes from source S to destination D, the High road and the Low road]

  • Assume N players
  • Congestion:

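    – High road: $c_H + n_H$ (where $n_H$ is the number of players on the High road)
    – Low road: $c_L + n_L$ (where $n_L = N - n_H$ is the number of players on the Low road)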

  • Claim: NE when both roads have (almost) same congestion.
  • Characterization of NE:

        High satisfied                               Low satisfied
        $c_H + n_H \le c_L + n_L + 1$                $c_L + n_L \le c_H + n_H + 1$
        $c_H + n_H \le c_L + (N - n_H) + 1$          $c_L + (N - n_H) \le c_H + n_H + 1$
        $2 n_H \le N + c_L - c_H + 1$                $2 n_H \ge N + c_L - c_H - 1$

  • For $N = 100$, $c_H = 20$, $c_L = 6$:

        $85 \le 2 n_H \le 87 \Rightarrow n_H = 43$

  • For $N = 100$, $c_H = 20$, $c_L = 5$:

        $84 \le 2 n_H \le 86 \Rightarrow 42 \le n_H \le 43$

    Both $n_H = 42$ and $n_H = 43$ are NE.
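The two numerical cases can be double-checked by testing the "satisfied" conditions directly for every possible split of players. This is a minimal sketch; the function name `routing_equilibria` is illustrative, and the condition for a road only applies when at least one player is on that road.

```python
def routing_equilibria(N, c_high, c_low):
    """Return every n_high in 0..N at which no player gains by switching roads."""
    equilibria = []
    for n_high in range(N + 1):
        n_low = N - n_high
        # A High-road player who switches would face congestion c_low + n_low + 1.
        high_ok = n_high == 0 or c_high + n_high <= c_low + n_low + 1
        # A Low-road player who switches would face congestion c_high + n_high + 1.
        low_ok = n_low == 0 or c_low + n_low <= c_high + n_high + 1
        if high_ok and low_ok:
            equilibria.append(n_high)
    return equilibria

print(routing_equilibria(100, 20, 6))  # [43]
print(routing_equilibria(100, 20, 5))  # [42, 43]
```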

SLIDE 7

Example: Routing

[Figure: routes from S to D via the High road and the Low road, with cost labels c(x) = x, c(x) = 1, and c(x) = 2x]

  • Setup:

    – Players: Two agents that each control 1/2 unit of splittable traffic.
    – Actions: Each player can split its 1/2 unit of traffic arbitrarily over H and L.
    – Cost: The cost of an agent is the total cost of its traffic:

        $J_i(f_1^H, f_2^H) = f_i^H \, c_H(f_1^H + f_2^H) + (0.5 - f_i^H) \, c_L(1 - f_1^H - f_2^H)$

    – Convention: Use $J_i(\cdot)$ for cost and $U_i(\cdot)$ for benefit

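  • Computing the best response of player 1, $B_1(\cdot)$, taking $c_H(x) = 2x$ and $c_L(x) = 1$ as in the objective below:

        $B_1(x) = \arg\min_{0 \le f_1^H \le 0.5} \; f_1^H \cdot 2(x + f_1^H) + (0.5 - f_1^H)$

  • Taking the derivative with respect to $f_1^H$ and setting it to 0 gives $2x + 4 f_1^H - 1 = 0$, i.e.,

        $B_1(x) = \frac{1}{4} - \frac{x}{2}$

  • Player 1 and player 2 are symmetric, so we have

        $B_2(y) = \frac{1}{4} - \frac{y}{2}$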
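As a sanity check on the closed-form best response, the sketch below minimizes player 1's cost over a fine grid of feasible flows and compares the result with 1/4 − x/2. It assumes the cost functions implied by the objective above (c_H(z) = 2z and c_L(z) = 1); the function names and the grid resolution are illustrative choices.

```python
from fractions import Fraction

def cost_player1(f1, x):
    """Cost of player 1 sending f1 on High when player 2 sends x on High.

    Assumes c_H(z) = 2z and c_L(z) = 1, as in the best-response objective.
    """
    return f1 * 2 * (x + f1) + (Fraction(1, 2) - f1)

def best_response_grid(x, steps=1000):
    """Grid search over f1 in [0, 1/2] using exact rational arithmetic."""
    grid = [Fraction(k, 2 * steps) for k in range(steps + 1)]
    return min(grid, key=lambda f1: cost_player1(f1, x))

def best_response_closed_form(x):
    return Fraction(1, 4) - x / 2

for x in [Fraction(0), Fraction(1, 10), Fraction(1, 4), Fraction(1, 2)]:
    print(x, best_response_grid(x), best_response_closed_form(x))
```

The test values of x are chosen so that the exact minimizer lies on the grid; for other values the grid search only approximates the closed form to within the grid spacing.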

SLIDE 8

Example: Routing

  • What is the routing profile of the NE?
  • NE is mutual best response:

        $f_1^H = B_1(f_2^H)$
        $f_2^H = B_2(f_1^H)$

  • By symmetry $f_1^H = f_2^H = f$, so $f = \frac{1}{4} - \frac{f}{2}$, which gives

        $f_i^H = 1/6$

SLIDE 9

Example: Routing

  • Could have also found NE by iteratively eliminating actions that are not best response
  • Recall: Best response functions

        $f_1^* = B_1(f_2) = \frac{1}{4} - \frac{f_2}{2}$
        $f_2^* = B_2(f_1) = \frac{1}{4} - \frac{f_1}{2}$

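  • Similar to iterated elimination of strictly dominated strategies:

        $f_i^* \in [0, \frac{1}{2}]$
        $f_i^* \in [0, \frac{1}{4}]$
        $f_i^* \in [\frac{1}{8}, \frac{1}{4}]$
        $f_i^* \in [\frac{1}{8}, \frac{3}{16}]$
        $f_i^* \in [\frac{5}{32}, \frac{3}{16}]$
        $f_i^* \in [\frac{10}{64}, \frac{11}{64}]$
        $\vdots$
        $f_i^* = \frac{1}{6}$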
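The shrinking intervals can be generated mechanically: since the best response 1/4 − x/2 is decreasing, it maps an interval [lo, hi] of surviving flows to [B(hi), B(lo)]. The following sketch reproduces the sequence with exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def best_response(x):
    """Best response 1/4 - x/2 from the routing example."""
    return Fraction(1, 4) - x / 2

def shrink_intervals(lo, hi, rounds):
    """Repeatedly map the interval of surviving flows through the best response."""
    history = [(lo, hi)]
    for _ in range(rounds):
        # best_response is decreasing, so the endpoints swap roles.
        lo, hi = best_response(hi), best_response(lo)
        history.append((lo, hi))
    return history

history = shrink_intervals(Fraction(0), Fraction(1, 2), 5)
for lo, hi in history:
    print(lo, hi)
# Last line printed: 5/32 11/64
```

Each round halves the width of the interval, and every interval contains 1/6, so the surviving set collapses to the equilibrium flow.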

SLIDE 10

Dominated strategies

  • Issue: Finding a Nash equilibrium is hard

    – Approach #1: Exhaustively check all joint actions
    – Approach #2: Investigate best response functions
    – Best approach depends on the game of interest

  • Certain structures can greatly simplify the analysis

    – Prisoner’s dilemma: Defect did strictly better than the alternative (strict dominance)
    – Second-price sealed-bid auction: Bidding one’s internal valuation did no worse than the alternatives (weak dominance)

  • Fact: A strictly dominated strategy cannot be used in a NE
  • Why? It is never part of a best response
  • Q: Can a weakly dominated strategy be used in a NE? Yes

              L        R
        T    2, 2     1, 1
        B    1, 1     1, 2

    T weakly dominates B, but both (T, L) and (B, R) are NE (see also the auction example).

  • Viewpoint: If strictly dominated strategies are not used, we can reduce the game to the remaining strategies.

SLIDE 11

Iterated elimination of strictly dominated strategies

  • Recall example from previous lecture
  • Q: What is the NE?
  • Successively eliminating dominated strategies can (sometimes) lead to NE

              L        C        R
        T    4, 3     5, 1     6, 2
        M    2, 1     8, 4     3, 6
        B    3, 0     9, 6     2, 8

  • Row player has no (strictly) dominated strategies
  • Column player can eliminate C
  • Reduced game:

              L        R
        T    4, 3     6, 2
        M    2, 1     3, 6
        B    3, 0     2, 8

  • Row player can now eliminate both M and B:

              L        R
        T    4, 3     6, 2

  • Column player can now eliminate R
  • NE: (T, L) is the sole survivor
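The elimination steps on this slide can be automated. The sketch below is one possible implementation of iterated elimination of strictly dominated pure strategies for a two-player game (domination by mixed strategies is not considered); the function names and the dictionary encoding of the table are illustrative.

```python
def strictly_dominated(payoffs, player, action, own, others):
    """True if another surviving action of `player` does strictly better than
    `action` against every surviving action of the opponent."""
    def u(a, b):
        profile = (a, b) if player == 0 else (b, a)
        return payoffs[profile][player]
    return any(all(u(alt, b) > u(action, b) for b in others)
               for alt in own if alt != action)

def iterated_elimination(payoffs, rows, cols):
    """Remove strictly dominated rows and columns until nothing changes."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if strictly_dominated(payoffs, 0, r, rows, cols):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if strictly_dominated(payoffs, 1, c, cols, rows):
                cols.remove(c)
                changed = True
    return rows, cols

table = {
    ("T", "L"): (4, 3), ("T", "C"): (5, 1), ("T", "R"): (6, 2),
    ("M", "L"): (2, 1), ("M", "C"): (8, 4), ("M", "R"): (3, 6),
    ("B", "L"): (3, 0), ("B", "C"): (9, 6), ("B", "R"): (2, 8),
}
print(iterated_elimination(table, ["T", "M", "B"], ["L", "C", "R"]))
# (['T'], ['L'])
```

On this table the procedure removes C first, then M and B, then R, matching the order of the steps above.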
