The Gizmo Player

Simon Dollé, Jan Kopcsek, Alper Tunga. Dresden, 13.02.2008


  1. Faculty: Computer Science. Field: Computer Science. Institute: Intelligent Systems
     The Gizmo Player
     Simon Dollé, Jan Kopcsek, Alper Tunga
     Dresden, 13.02.2008

  2. Finding a heuristic function
     Two ways of learning a heuristic function:
     • Deductive
       – Analyzing the rules
       – Identifying common elements like game boards or pieces
       – Finding patterns
     • Inductive
       – Playing and learning from experience
       – Monte Carlo strategy
     TU Dresden, 13.02.2008 Gizmo Player Slide 2 of 10

  3. Monte Carlo Strategy
     • Play random games
     • Compute the mean score for each move
     → Use these means as a heuristic function
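
The Monte Carlo strategy on this slide can be sketched in a few lines of Python. The toy game here (a pile of stones, each player removes 1 to 3, whoever takes the last stone scores 100) is a hypothetical stand-in, not something from the slides, and the names `legal_moves`, `random_playout`, and `monte_carlo_heuristic` are illustrative only.

```python
import random

# Hypothetical toy game (not from the slides): players alternately remove
# 1-3 stones from a pile, and whoever takes the last stone scores 100.
def legal_moves(stones):
    return [k for k in (1, 2, 3) if k <= stones]

def random_playout(stones, player_to_move, rng):
    """Play uniformly random moves to the end; return the score of player 0."""
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        last_mover = player_to_move
        player_to_move = 1 - player_to_move
    return 100 if last_mover == 0 else 0

def monte_carlo_heuristic(stones, n_games=2000, seed=0):
    """Mean score of random games for each first move of player 0."""
    rng = random.Random(seed)
    means = {}
    for move in legal_moves(stones):
        total = 0
        for _ in range(n_games):
            remaining = stones - move
            if remaining == 0:
                total += 100  # player 0 took the last stone
            else:
                total += random_playout(remaining, 1, rng)
        means[move] = total / n_games
    return means
```

With a pile of 3 stones, taking all three always scores 100 and taking two always scores 0 in this sketch, so the per-move means already single out the winning move.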

  8. Monte Carlo Strategy
     • Problem: the same effort is spent on interesting and uninteresting moves
     • Equivalent to playing against a dummy player
     • UCT Algorithm (Upper Confidence Bound for Trees): an algorithm to balance
       – Deeper exploration of the interesting parts of the graph (exploitation)
       – Exploration of new parts
       → Makes random games more realistic

  9. UCT Algorithm
     • As long as there are unexplored moves from our current state, explore them

  15. UCT Algorithm
      • As long as there are unexplored moves from our current state, explore them
      • Otherwise, choose the one with the highest score, computed from:
        – h: the heuristic value
        – n: the number of games through the parent node
        – n_i: the number of games through the node
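
The scoring formula on this slide is not part of the transcript. A common choice that fits the three symbols listed is the UCB1 rule, $h + C\sqrt{\ln n / n_i}$, and the sketch below assumes that rule. The exploration constant `c` and the function name `uct_select` are assumptions, not taken from the slides; in practice `c` would be scaled to the range of the scores.

```python
import math

def uct_select(children, n_parent, c=1.0):
    """Pick a move: unexplored ones first, otherwise the highest UCB1 score.

    children maps move -> (h, n_i): mean score so far and visit count.
    n_parent is n, the number of games through the parent node.
    c is an assumed exploration constant; the slides do not name one.
    """
    for move, (_, n_i) in children.items():
        if n_i == 0:
            return move  # explore untried moves before scoring anything

    def score(move):
        h, n_i = children[move]
        return h + c * math.sqrt(math.log(n_parent) / n_i)

    return max(children, key=score)
```

With `c=0` the rule degenerates to always picking the best mean; larger values push the search toward rarely visited moves.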

  20. UCT Algorithm
      • Which move to play?
        – The one with the highest heuristic value
      • In multiplayer games:
        – Store the heuristic value for each player
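
One way to store a heuristic value per player and pick the final move is sketched below. The `MoveStats` layout and the `best_move` helper are hypothetical; the slides only say that a value is kept for each player and that the move with the highest heuristic value is played.

```python
from dataclasses import dataclass, field

@dataclass
class MoveStats:
    """Running totals for one move; one score slot per player (assumed layout)."""
    n_players: int
    visits: int = 0
    totals: list = field(default_factory=list)

    def __post_init__(self):
        if not self.totals:
            self.totals = [0.0] * self.n_players

    def update(self, scores):
        """Record the final scores of one simulated game, one entry per player."""
        self.visits += 1
        for player, score in enumerate(scores):
            self.totals[player] += score

    def mean(self, player):
        """Heuristic value of this move for the given player."""
        return self.totals[player] / self.visits if self.visits else 0.0

def best_move(stats_by_move, player):
    """Final choice: highest mean score for `player`, with no exploration bonus."""
    return max(stats_by_move, key=lambda m: stats_by_move[m].mean(player))
```

Keeping one total per player lets each player's turn in the tree be scored from that player's own point of view, which is what makes the same statistics usable in games with more than two players.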

  21. Good points
      • Heuristic directly linked to the final score
      • Heuristic converges to minimax values
      • Time scalable
      • Easily parallelisable

  22. Problems
      • Simultaneous moves:
        – Which rule to use for exploring the nodes?
        – Which move to play?
      • Long games and loops:
        – Depth-first search problem

  23. Thank you for your attention, and good luck to your players
