Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Oct 31, 2023

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine. University of California, Berkeley.

SLIDE 1

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
University of California, Berkeley

SLIDE 2

How Long Does Learning Take?

~50 million frames [Mnih et al. 2015]
~21 million games [Silver et al. 2017]
~800,000 grasp attempts [Levine et al. 2017]

SLIDE 3

Can we speed this up?

SLIDE 4

Model-Based Reinforcement Learning

Loop: Optimize Policy, Execute Policy, Train Dynamics Model
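
Note: the three-stage loop on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 1-D environment `true_step`, the one-parameter linear model fitted by `fit_model`, and the random-shooting planner `plan` are hypothetical stand-ins, not the authors' implementation, which uses probabilistic neural network ensembles.

```python
import random

def true_step(state, action):
    """Hypothetical environment; the agent never sees this rule directly."""
    return state + 0.5 * action

def fit_model(data):
    """Train Dynamics Model: least-squares fit of w in s' = s + w * a."""
    num = sum(a * (s2 - s) for s, a, s2 in data)
    den = sum(a * a for _, a, _ in data)
    return num / den if den > 0 else 0.0

def plan(w, state, goal, rng, horizon=5, candidates=300):
    """Optimize Policy: random shooting, return first action of the best sequence."""
    best_cost, best_action = float("inf"), 0.0
    for _ in range(candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, cost = state, 0.0
        for a in actions:
            s = s + w * a              # roll out the learned model, not the environment
            cost += (s - goal) ** 2
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action

def run(trials=3, steps=25, goal=3.0, seed=0):
    rng = random.Random(seed)
    data, s = [], 0.0
    for _ in range(5):                 # a few random actions to seed the dataset
        a = rng.uniform(-1.0, 1.0)
        s2 = true_step(s, a)
        data.append((s, a, s2))
        s = s2
    w, final_states = 0.0, []
    for _ in range(trials):
        w = fit_model(data)            # Train Dynamics Model
        s = 0.0
        for _ in range(steps):
            a = plan(w, s, goal, rng)  # Optimize Policy
            s2 = true_step(s, a)       # Execute Policy
            data.append((s, a, s2))    # new experience feeds the next model fit
            s = s2
        final_states.append(s)
    return w, final_states
```

Calling `run()` returns the fitted model parameter and the final state of each trial; because the toy environment is noise-free and linear, the fitted parameter matches the true gain after the first batch of data.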

SLIDE 5

Comparative Performance on HalfCheetah

SLIDE 6

Comparative Performance on HalfCheetah

SLIDE 7

Deterministic Neural Nets as Models

SLIDE 8

Deterministic Neural Nets as Models

SLIDE 9

Deterministic Neural Nets as Models

SLIDE 10

Deterministic Neural Nets as Models

SLIDE 11

Deterministic Neural Nets as Models

SLIDE 12

Probabilistic Neural Nets as Models

SLIDE 13

Probabilistic Ensembles as Models

SLIDE 14

Probabilistic Ensembles as Models

SLIDE 15

Trajectory Sampling for State Propagation

SLIDE 16

Trajectory Sampling for State Propagation

SLIDE 17

Trajectory Sampling for State Propagation

SLIDE 18

Trajectory Sampling for State Propagation

SLIDE 19

Trajectory Sampling for State Propagation

SLIDE 20

Trajectory Sampling for State Propagation

SLIDE 21

Trajectory Sampling for State Propagation

SLIDE 22

Trajectory Sampling for State Propagation

SLIDE 23

Trajectory Sampling for State Propagation

SLIDE 24

Trajectory Sampling for State Propagation
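
Note: the slide text does not describe the propagation procedure, so the following is only a rough sketch of one way particle-based propagation through a probabilistic ensemble could look. The ensemble `ENSEMBLE`, its (gain, noise) parameterization, the choice to pick a random ensemble member for every particle at every step, and the helpers `propagate` and `expected_cost` are all assumptions for illustration.

```python
import random

# Hypothetical ensemble: each member predicts a Gaussian over the next state,
# s' ~ N(s + gain * a, noise_std^2), and is stored as (gain, noise_std).
ENSEMBLE = [(0.45, 0.05), (0.50, 0.05), (0.55, 0.05)]

def propagate(state, actions, n_particles=100, seed=0):
    """Push a cloud of particles through the ensemble for one action sequence."""
    rng = random.Random(seed)
    particles = [state] * n_particles
    for a in actions:
        next_particles = []
        for s in particles:
            gain, std = rng.choice(ENSEMBLE)        # disagreement between members
            next_particles.append(rng.gauss(s + gain * a, std))  # per-member noise
        particles = next_particles
    return particles

def expected_cost(particles, goal):
    """Average squared distance to the goal over all particles."""
    return sum((s - goal) ** 2 for s in particles) / len(particles)
```

For example, `propagate(0.0, [1.0] * 4)` returns 100 sampled final states, and `expected_cost(..., goal=2.0)` averages over them, so an action sequence is scored under both sources of uncertainty rather than under a single point prediction.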

SLIDE 25

Experimental Results

SLIDE 26

Code: https://github.com/kchua/handful-of-trials
Website: https://sites.google.com/view/drl-in-a-handful-of-trials

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

Data efficient
Competitive asymptotic performance
Easy to implement


Poster #165