Monte-Carlo Tree Search Parallelisation, International Go Symposium



SLIDE 1

Monte-Carlo Tree Search Parallelisation

International Go Symposium 2012
Francois van Niekerk
francoisvn@ml.sun.ac.za
August 2012

SLIDE 2

Collaborators: Steve Kroon, Gert-Jan van Rooyen, Cornelia Inggs

This work was partially supported by the National Research Foundation of South Africa.

SLIDE 3

Outline

  1. Introduction
  2. Background
       • Computer Go
       • Monte-Carlo Tree Search
       • Parallelisation
  3. Implementation
  4. Testing and Results
       • Multi-Core Parallelisation
       • Cluster Parallelisation
  5. New Developments
  6. Conclusions

SLIDE 4

Introduction

  • Top Go programs are currently about 5 dan on KGS.
  • Monte-Carlo Tree Search (MCTS) is the dominant Computer Go algorithm.
  • MCTS parallelisation is possible on multi-core and cluster systems.

SLIDE 5

Computer Go

  • Moves and their follow-ups form a game tree.
  • Exponential tree growth makes brute-force search infeasible.
  • An evaluation function is used to avoid growing the tree too far.
SLIDE 6

Classical Methods

  • Emulate humans using expert knowledge.
  • Difficult to assimilate new knowledge into an already large body of knowledge.
  • Top strength is in the single-digit kyu (SDK) range, far from professional level.
SLIDE 7

Monte-Carlo Tree Search

  • Monte-Carlo methods: stochastic simulations (playouts).
  • The winrate of playouts starting from a position is used as the value of that position.
  • Playouts are used in a tree to form Monte-Carlo Tree Search (MCTS).
  • MCTS can be broken into four parts: selection, expansion, simulation and backpropagation.
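The four parts can be sketched in a few lines. This is a minimal illustration, not Oakfoam's implementation: it assumes a toy Nim game (take 1 to 3 stones, whoever takes the last stone wins) in place of Go, uses the common UCB1 rule for selection, and all names (`Node`, `uct_child`, `playout`, `mcts`, `best_move`) are hypothetical.

```python
import math
import random

class Node:
    """Tree node; wins are counted for the player who moved INTO this node."""
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones                      # stones left in the toy Nim position
        self.parent = parent
        self.move = move                          # move that led here (1, 2 or 3)
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.wins = 0
        self.visits = 0

def uct_child(node, c=1.4):
    # UCB1: prefer children with a high winrate, but keep exploring rarely visited ones.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def playout(stones, rng):
    # Random moves until no stones remain.
    # Returns True if the player to move at the start takes the last stone.
    mover_is_starter = True
    starter_won = False          # with no stones left, the starter has already lost
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        starter_won = mover_is_starter
        mover_is_starter = not mover_is_starter
    return starter_won

def mcts(stones, iterations, rng):
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child, if any move is still untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        to_move_wins = playout(node.stones, rng)
        # 4. Backpropagation: update every node on the path, flipping the
        #    result at each level because the players alternate.
        won = not to_move_wins   # result for the player who moved into `node`
        while node is not None:
            node.visits += 1
            node.wins += int(won)
            won = not won
            node = node.parent
    return root

def best_move(root):
    # The most visited root child is a common choice for the final move.
    return max(root.children, key=lambda ch: ch.visits).move

root = mcts(stones=5, iterations=500, rng=random.Random(0))
print("suggested move:", best_move(root))
```

Every iteration passes through the root, so after the search the root's visit count equals the number of iterations.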

SLIDE 8

Monte-Carlo Tree Search

[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 3/5, 2/3, 1/1, 0/1, 0/1). Step shown: Selection.]

SLIDE 9

Monte-Carlo Tree Search

[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 3/5, 2/3, 1/1, 0/1, 0/1). Step shown: Expansion.]

SLIDE 10

Monte-Carlo Tree Search

[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 3/5, 2/3, 1/1, 0/1, 0/1). Step shown: Simulation (playout), with result W.]

SLIDE 11

Monte-Carlo Tree Search

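[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 3/5, 2/3, 1/1, 0/1, 0/1, 1/1). Step shown: Backpropagation; the new node is recorded as 1/1.]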

SLIDE 12

Monte-Carlo Tree Search

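[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 3/5, 2/3, 1/1, 0/1, 1/2, 1/1). Step shown: Backpropagation; a 0/1 node is updated to 1/2.]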

SLIDE 13

Monte-Carlo Tree Search

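[Figure: MCTS tree with nodes labelled wins/visits (4/9, 1/3, 0/1, 1/1, 0/1, 4/6, 2/3, 1/1, 0/1, 1/2, 1/1). Step shown: Backpropagation; the 3/5 node is updated to 4/6.]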

SLIDE 14

Monte-Carlo Tree Search

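[Figure: MCTS tree with nodes labelled wins/visits (5/10, 1/3, 0/1, 1/1, 0/1, 4/6, 2/3, 1/1, 0/1, 1/2, 1/1). Step shown: Backpropagation; the root is updated from 4/9 to 5/10.]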
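The backpropagation sequence in these slides can be reproduced as a short worked example. The parent links below are an assumption inferred from which counters change between slides (new node, then 0/1, then 3/5, then the root 4/9); the names `Node` and `backpropagate` are hypothetical. Like the slides, this sketch credits the win to every node on the path; engines that store results from the perspective of the side to move flip the result at each level instead.

```python
class Node:
    """Minimal node holding only the statistics shown in the slides."""
    def __init__(self, wins, visits, parent=None):
        self.wins = wins
        self.visits = visits
        self.parent = parent

def backpropagate(leaf, won):
    # Walk from the new leaf up to the root, updating each node on the path.
    node = leaf
    while node is not None:
        node.visits += 1
        if won:
            node.wins += 1
        node = node.parent

# Path assumed from the slides: root 4/9 -> 3/5 -> 0/1 -> newly expanded node.
root = Node(4, 9)
child = Node(3, 5, parent=root)
grandchild = Node(0, 1, parent=child)
new_leaf = Node(0, 0, parent=grandchild)

backpropagate(new_leaf, won=True)
print([(n.wins, n.visits) for n in (new_leaf, grandchild, child, root)])
# [(1, 1), (1, 2), (4, 6), (5, 10)]
```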

SLIDE 15

Parallelisation

  • Improve MCTS: improve the algorithm or increase the number of playouts.
  • Increasing the number of playouts increases playing strength.
  • Increase playouts: increase thinking time or playout rate.
  • Parallelisation: use parallel hardware to increase the playout rate and therefore strength.
  • Three parallelisation methods for MCTS: tree, leaf, and root.

SLIDE 16

Tree Parallelisation

  • Single shared tree.
  • Well-suited to shared-memory systems, such as multi-core systems.
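A minimal sketch of the shared-tree idea, under several assumptions: the tree is reduced to a root with one level of children, a single global lock protects the statistics, playouts are replaced by a coin flip with a fixed per-move win probability, and a virtual loss (one of the additions compared in the multi-core results) is applied by counting the visit at selection time. None of this is taken from Oakfoam; all names are hypothetical.

```python
import math
import random
import threading

class SharedNode:
    def __init__(self, n_moves=0):
        self.wins = 0
        self.visits = 0
        self.children = [SharedNode() for _ in range(n_moves)]

def ucb_score(root, child, c=1.4):
    if child.visits == 0:
        return float("inf")      # try every move at least once
    return child.wins / child.visits + c * math.sqrt(
        math.log(root.visits + 1) / child.visits)

def select_with_virtual_loss(root, lock):
    with lock:
        idx = max(range(len(root.children)),
                  key=lambda i: ucb_score(root, root.children[i]))
        # Virtual loss: record the visit immediately, without a win, so that
        # other threads are steered away from this child while the playout runs.
        root.children[idx].visits += 1
        root.visits += 1
        return idx

def worker(root, lock, win_prob, iterations, seed):
    rng = random.Random(seed)
    for _ in range(iterations):
        idx = select_with_virtual_loss(root, lock)
        won = rng.random() < win_prob[idx]   # stand-in playout, run outside the lock
        if won:
            with lock:                       # turn the virtual loss into a win
                root.children[idx].wins += 1
                root.wins += 1

def run_tree_parallel(n_threads, iterations, win_prob):
    root = SharedNode(len(win_prob))
    lock = threading.Lock()
    threads = [threading.Thread(target=worker,
                                args=(root, lock, win_prob, iterations, t))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return root

root = run_tree_parallel(n_threads=4, iterations=200, win_prob=[0.2, 0.5, 0.8])
print([(ch.wins, ch.visits) for ch in root.children])
```

All threads update the same statistics, which is why this method suits shared memory. A lock-free variant would replace the lock with atomic updates, which this sketch does not attempt.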

SLIDE 17

Leaf Parallelisation

[Figure: a master node and its slave nodes]

  • Master and slave nodes.
  • Only one tree, on the master.
  • Slaves are playout workers.
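The division of labour above can be sketched with a worker pool. This is an illustration under assumptions: threads stand in for slave nodes, a playout is a coin flip with a fixed win probability, and the names `playout` and `leaf_parallel_evaluate` are hypothetical. The master would add the returned totals to the leaf during backpropagation.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def playout(position, seed):
    # Stand-in for a real playout: win with the position's fixed probability.
    rng = random.Random(seed)
    return rng.random() < position["win_prob"]

def leaf_parallel_evaluate(position, n_workers, base_seed=0):
    """Run one playout per worker from the same leaf; return (wins, playouts)."""
    seeds = range(base_seed, base_seed + n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda s: playout(position, s), seeds))
    return sum(results), len(results)

wins, visits = leaf_parallel_evaluate({"win_prob": 0.5}, n_workers=8)
print(f"{wins}/{visits} playouts won from this leaf")
```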
SLIDE 18

Root Parallelisation

  • Each execution node maintains a tree.
  • Each node performs MCTS.
  • Periodic sharing of information.
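The slides do not say what information is shared or how. One simple possibility, shown here purely as an assumption, is to sum the win and visit counts of the root's children across all nodes and pick the move with the most combined visits. The names `merge_root_stats` and `best_move` are hypothetical.

```python
from collections import Counter

def merge_root_stats(per_node_stats):
    """Sum (wins, visits) per move over the trees of all execution nodes."""
    wins, visits = Counter(), Counter()
    for stats in per_node_stats:
        for move, (w, v) in stats.items():
            wins[move] += w
            visits[move] += v
    return {move: (wins[move], visits[move]) for move in visits}

def best_move(merged):
    # Choose the move with the most visits across all nodes.
    return max(merged, key=lambda move: merged[move][1])

# Illustrative numbers only: root statistics from three independent searches.
node_a = {"D4": (30, 50), "Q16": (20, 45)}
node_b = {"D4": (25, 40), "Q16": (35, 60)}
node_c = {"D4": (10, 20), "C3": (2, 10)}

merged = merge_root_stats([node_a, node_b, node_c])
print(merged)
print(best_move(merged))
```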

SLIDE 19

Parallel Effect

  • Strength penalty for parallelisation.
  • Due to change from sequential to parallel execution.
  • More pronounced if the playout updates are delayed, for example in root parallelisation compared with multi-core (tree) parallelisation.

SLIDE 20

Implementation

  • Oakfoam is an open-source cross-platform MCTS engine for Computer Go.
  • Tree parallelisation for multi-core systems.
  • Root parallelisation for cluster systems.

SLIDE 21

Testing and Results

  • Test for playout rate increase.
  • If increase found, test for strength penalty.
  • If strength penalty found, test for overall strength increase.
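The quantities behind these tests can be computed with two small helpers. The slides do not state how the results were evaluated statistically, so the 95% interval below (normal approximation to the binomial) is an assumption, as are the names `speedup` and `winrate_interval`.

```python
import math

def speedup(parallel_rate, single_core_rate):
    """Playout-rate speedup, e.g. playouts per second on N cores vs. one core."""
    return parallel_rate / single_core_rate

def winrate_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a winrate (normal approximation)."""
    p = wins / games
    margin = z * math.sqrt(p * (1 - p) / games)
    return p - margin, p + margin

# Illustrative numbers only, not results from the talk.
print(speedup(8000.0, 1000.0))
low, high = winrate_interval(wins=60, games=100)
print(f"winrate 60%, interval {low:.3f} to {high:.3f}")
```

A strength increase over the one-core baseline is only convincing when the whole interval lies above 50%.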
SLIDE 22

Multi-Core Parallelisation Results

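[Figure: Speedup on 9x9. x-axis: Cores (1, 2, 4, 8); y-axis: Speedup (1, 2, 4, 8). Series: Ideal, No additions, Virtual Loss, Lock-free, Both additions.]

[Figure: Speedup on 19x19. x-axis: Cores (1, 2, 4, 8); y-axis: Speedup (1, 2, 4, 8). Series: Ideal, No additions, Virtual Loss, Lock-free, Both additions.]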

SLIDE 23

Cluster Parallelisation Results

[Figure: Strength Comparison on 9x9. x-axis: Cores/Periods (1, 2, 4, 8, 16); y-axis: Winrate vs. 1-Core [%] (50 to 100). Series: Baseline 10s/move; 10s/move with p = 0.1, p = 0.2 and p = 0.05; 2s/move with p = 0.1, p = 0.2 and p = 0.05.]

[Figure: Strength Comparison on 19x19. x-axis: Cores/Periods (1, 2, 4, 8, 16, 32, 64); y-axis: Winrate vs. 1-Core [%] (50 to 100). Series: Baseline 10s/move; 10s/move with p = 0.1; 2s/move with p = 0.1.]

SLIDE 24

Overview of Results

  • Multi-core: tree parallelisation showed linear scaling up to eight cores (the physical limit in these tests).
  • Cluster: root parallelisation for 19x19 showed scaling up to eight nodes, at which point its strength improvement matched that of an ideal four-core system.

SLIDE 25

New Developments

  • Pachi uses virtual wins and losses to improve cluster scaling.
  • Depth-First UCT changes MCTS from a best-first to a depth-first search.
  • Distributed UCT and Distributed Depth-First UCT use Transposition-table Driven Scheduling to break up the tree across nodes.
  • UCT-Treesplit uses Transposition-table Driven Scheduling to break up the MCTS work across nodes.
  • Of these, only virtual wins and losses have been applied to Computer Go so far.
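The core idea of Transposition-table Driven Scheduling, as it is generally described, is that each position has a home node determined by hashing, and work on a position is sent to that node. The sketch below shows only that mapping; the string position key, the CRC32 hash and the name `home_node` are assumptions for illustration, not details of the systems listed above.

```python
import zlib

def home_node(position_key: str, n_nodes: int) -> int:
    """Map a position to the cluster node that owns its part of the tree."""
    # CRC32 is used because it is deterministic across processes and machines.
    return zlib.crc32(position_key.encode("utf-8")) % n_nodes

# Hypothetical position keys; a Go engine would more likely use a board hash.
for key in ("empty-board", "B:D4", "B:D4 W:Q16"):
    print(key, "-> node", home_node(key, n_nodes=8))
```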
SLIDE 26

Conclusions

  • MCTS is the dominant algorithm for Computer Go.
  • Parallelisation on multi-core systems scales well.
  • Parallelisation on cluster systems is possible, but there is still room for improvement.
  • New developments offer promising directions for cluster parallelisation.
SLIDE 27

Thanks

Thank you for taking the time to listen to this talk.

More information about this talk is available at: http://oakfoam.com/igs2012

Please send any questions to: francoisvn@ml.sun.ac.za