osu!mania Reinforcement Learning Agent - PowerPoint PPT Presentation



SLIDE 1

osu!mania Reinforcement Learning Agent

Χρυσομάλλης Ιάσων (Iason Chrysomallis), ichrysomallis@isc.tuc.gr, 2014030078

SLIDE 2

Contents

  • Introduction
  • osu!mania game
  • Graphical User Interface customization
  • Agent’s environment
  • Approach and variable definition
  • Q-learning
  • Deep reinforcement learning
  • Future plans

SLIDE 3

Introduction

Topic: develop an agent able to learn how to play the video game osu!mania through reinforcement learning. Two agents:

  • Q-learning agent
  • Deep reinforcement learning agent

SLIDE 4

osu!mania game

A rhythm game in which notes fall toward a judgment bar. The numbered elements of the interface:

1. Single-tap notes
2. Hold notes
3. Judgment bar
4. Player keys
5. Combo
6. Hitburst
7. Overall accuracy
8. Score

SLIDE 5

Graphical User Interface customization

The environment is fully customizable: every element can be changed. Each element is painted with a solid color RGB = [X, 100, 100], where X is set in accordance with the element’s identity (see the numbers on the previous slide).
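As an illustration, an element can be identified from a single pixel by checking the fixed G and B channels and looking the red value up in a table. The specific X values below are hypothetical; the slides only say that X matches the element's number:

```python
# Hypothetical mapping from the red-channel value X to the element it
# marks; the slides state the scheme RGB = [X, 100, 100] but not the
# concrete X values.
ELEMENT_BY_RED = {
    1: "single_tap_note",
    2: "hold_note",
    3: "judgment_bar",
    4: "player_key",
}

def identify_element(pixel):
    """Return the element name for an (R, G, B) pixel, or None if the
    pixel is not one of the solid-colour GUI elements."""
    r, g, b = pixel
    if (g, b) != (100, 100):
        return None  # not part of the customised skin palette
    return ELEMENT_BY_RED.get(r)
```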

SLIDE 6

Agent’s environment

The agent records screenshots and translates them into information based on the RGB values assigned above. Only a small fraction of the screen contains relevant information, so only specific boxes are recorded.
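A minimal sketch of the box-recording idea, with hypothetical coordinates (the real regions depend on screen resolution and skin layout, which the slides do not specify):

```python
import numpy as np

# Hypothetical box coordinates as (top, left, height, width).
BOXES = {
    "column": (0, 100, 200, 10),
    "combo":  (300, 50, 20, 40),
}

def extract_boxes(frame):
    """Crop the relevant regions from a full H x W x 3 screenshot, so
    later processing only touches a small fraction of the screen."""
    return {name: frame[t:t + h, l:l + w]
            for name, (t, l, h, w) in BOXES.items()}
```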

SLIDE 7

Approach and variable definition (1)

The agent behaves identically on each column, so the problem can be narrowed down to learning a single column.

Agent’s actions:

1. Instantaneous key tap
2. Key press (no release)
3. Key release
4. Do nothing
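The four actions above can be sketched as an enum (the identifier names are mine, not from the slides):

```python
from enum import Enum

class Action(Enum):
    """The four per-column actions."""
    TAP = 0      # instantaneous key tap: press and release in one step
    PRESS = 1    # key press with no release (start holding)
    RELEASE = 2  # key release (stop holding)
    NOOP = 3     # do nothing
```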

SLIDE 8

Approach and variable definition (2)

Rewards:

Epsilon:

  • Initial value = 1
  • Decay value = 0.9977
  • Minimum value = 0.01
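The epsilon schedule above can be sketched as a multiplicative decay with a floor:

```python
# Values from the slides: start fully exploratory, decay per episode,
# and never drop below 1% exploration.
EPSILON_INITIAL = 1.0
EPSILON_DECAY = 0.9977
EPSILON_MIN = 0.01

def decay_epsilon(epsilon):
    """Multiplicative decay clipped at the minimum value."""
    return max(EPSILON_MIN, epsilon * EPSILON_DECAY)
```

With these values, epsilon falls from 1 to the 0.01 floor after about ln(0.01)/ln(0.9977) ≈ 2000 decay steps.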

SLIDE 9

Approach and variable definition (3)

State:

  • One column of 200 pixels
  • Only the red (R) layer
  • Three possible values (no note, single-tap note, hold note)

Deep reinforcement learning:

  • Raw input of the column

Q-learning:

  • Only 8 pixels due to state complexity, taking one pixel every 15 pixels of the recorded column
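One plausible reading of this sampling, as a sketch (the exact offsets used in the project are not on the slides):

```python
def downsample_column(red_column, step=15, n_pixels=8):
    """Coarse Q-learning state: take one pixel every `step` pixels of
    the recorded red-channel column and keep the first `n_pixels`
    samples, returned as a hashable tuple for Q-table lookup."""
    return tuple(red_column[::step][:n_pixels])
```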

SLIDE 10

Q-learning

Algorithm:

Steps:

  • Receive current state
  • Choose an action based on epsilon
  • Execute the action
  • Receive new state
  • Check if song is over
  • Update Q-table
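The steps above can be sketched as standard tabular Q-learning with epsilon-greedy action selection; the learning rate and discount factor below are hypothetical, since the slides do not give them:

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95  # hypothetical learning rate / discount factor
N_ACTIONS = 4

# Q-table: state -> list of action values, zero-initialised on demand.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)

def choose_action(state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    values = q_table[state]
    return values.index(max(values))

def q_update(state, action, reward, next_state, done):
    """Tabular update: Q(s,a) += alpha * (r + gamma*max Q(s',·) - Q(s,a))."""
    target = reward if done else reward + GAMMA * max(q_table[next_state])
    q_table[state][action] += ALPHA * (target - q_table[state][action])
```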

SLIDE 11

Deep reinforcement learning

Neural network model (Keras).

Steps: identical to the Q-learning steps apart from the last one. Instead of updating a Q-table, transitions are saved in a temporary memory and the model is trained on a smaller, randomly selected sample group (batch).
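The temporary memory and batch sampling can be sketched as a simple replay buffer (the capacity and batch size below are hypothetical):

```python
import random
from collections import deque

class ReplayMemory:
    """Temporary memory of (state, action, reward, next_state, done)
    transitions; training draws a small random batch from it instead
    of learning from each transition immediately."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions fall out

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=32):
        """Randomly selected sample group used to train the model."""
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```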

SLIDE 12

Results

[Plots on slide: Q-learning agent vs. DQN agent]

SLIDE 13

Future plans

  • Try different combinations of neural network model layers
  • Design the neural network model in TensorFlow
  • Run the agent on the GPU instead of the CPU
  • Make use of a high-end computer