Learning to Plan with Logical Automata

Dec 01, 2022

SLIDE 1

Learning to Plan with Logical Automata

Brandon Araki¹*, Kiran Vodrahalli²*, Thomas Leech¹,³, Mark Donahue³, Cristian-Ioan Vasile¹, Daniela Rus¹

¹Massachusetts Institute of Technology ²Columbia University ³MIT Lincoln Laboratory

*Equal contributors

SLIDE 2

SLIDE 3

Goals

Learn to plan in an environment with rules

1. Learn the rules in a way that they can be easily interpreted by humans
2. Incorporate the rules into planning so that modifying the rules results in predictable changes in behavior

SLIDE 4

Packing a Lunchbox

Pack a burger or a sandwich; then pack a banana

SLIDE 5

Goal 1 – Interpretability

Pack a burger or a sandwich; then pack a banana

[Figure: the rules drawn as a finite state automaton. State labels: Initial State, Picked up, Packed, Picked up, Packed, GOAL!]
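The lunchbox rule can be written down as a small transition table. The sketch below is one hypothetical encoding, not the authors' implementation. The state names S0 to S3, G, and T appear later in the slides, but the proposition names (`burger`, `sandwich`, `banana`, `lunchbox`, `obstacle`) and the choice of which proposition triggers which transition are assumptions made here for illustration.

```python
# Hypothetical encoding of "pack a burger or a sandwich; then pack a banana".
# Which proposition triggers which transition is an assumption of this sketch.
FSA = {
    ("S0", "burger"): "S1",     # picked up a burger
    ("S0", "sandwich"): "S1",   # picked up a sandwich
    ("S1", "lunchbox"): "S2",   # packed it
    ("S2", "banana"): "S3",     # picked up the banana
    ("S3", "lunchbox"): "G",    # packed the banana: goal
}
ABSORBING = {"G", "T"}


def step(state, proposition):
    """Return the next FSA state after observing one proposition."""
    if state in ABSORBING:
        return state
    if proposition == "obstacle":
        return "T"  # assumed trap state for hitting an obstacle
    # Propositions with no listed transition leave the state unchanged.
    return FSA.get((state, proposition), state)


def run(propositions, start="S0"):
    """Feed a sequence of observed propositions through the FSA."""
    state = start
    for proposition in propositions:
        state = step(state, proposition)
    return state


print(run(["sandwich", "lunchbox", "banana", "lunchbox"]))  # reaches G
print(run(["banana", "lunchbox"]))                          # stays in S0
```

Because each rule is one entry in the table, a reader can check the encoding against the sentence "pack a burger or a sandwich; then pack a banana" line by line.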

SLIDE 6

Factoring the Environment

Pack sandwich or burger; then pack banana

Avoid obstacles

Low-level MDP

High-level MDP

SLIDE 7

Representing the Environment

Pack sandwich or burger; then pack banana

Avoid obstacles

Discrete 2D gridworld

Finite state automaton

[Figure: finite state automaton. State labels: Initial State, Picked up, Packed, Picked up, Packed]

SLIDE 8

Goal 2 – Manipulability

Incorporate FSA into planning

[Figure: FSA diagram with state labels Initial State, Picked up, Packed, Picked up, Packed. FSA states: S0, S1, S2, S3, G, T]

SLIDE 9

Differentiable Recursive Planning

Learn reward

Learn transitions

Learn transitions of FSA

One VIN (value iteration network) for each FSA state

Based on Tamar, Aviv, et al. "Value iteration networks." Advances in Neural Information Processing Systems. 2016.
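To make "one value map per FSA state" concrete, here is a minimal sketch of ordinary tabular value iteration over (FSA state, grid cell) pairs. It is not the differentiable model from the slides: the reward, the grid transitions, and the FSA transitions are all hand-written here rather than learned, and the 3x3 grid layout, the proposition names, the reward values, and the discount factor are assumptions chosen for illustration.

```python
# Plain tabular value iteration over (FSA state, grid cell) pairs.
# Assumptions: hand-written 3x3 grid, deterministic moves, known rewards and
# known FSA transitions. The slides describe learning these quantities instead.
GRID = [
    ["", "sandwich", ""],
    ["", "obstacle", "lunchbox"],
    ["", "", "banana"],
]
GAMMA = 0.9
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
FSA_STATES = ["S0", "S1", "S2", "S3", "G", "T"]
ABSORBING = {"G", "T"}
FSA = {
    ("S0", "burger"): "S1",
    ("S0", "sandwich"): "S1",
    ("S1", "lunchbox"): "S2",
    ("S2", "banana"): "S3",
    ("S3", "lunchbox"): "G",
}


def fsa_step(f, proposition):
    """Advance the FSA on the proposition found at the cell just entered."""
    if f in ABSORBING:
        return f
    if proposition == "obstacle":
        return "T"
    return FSA.get((f, proposition), f)


def move(cell, action):
    """Deterministic grid move; walking off the grid leaves the cell unchanged."""
    r, c = cell[0] + action[0], cell[1] + action[1]
    if 0 <= r < len(GRID) and 0 <= c < len(GRID[0]):
        return (r, c)
    return cell


def reward(f, f_next):
    """Assumed reward: +1 on entering the goal, -1 on entering the trap."""
    if f_next == "G" and f != "G":
        return 1.0
    if f_next == "T" and f != "T":
        return -1.0
    return 0.0


def value_iteration(sweeps=50):
    cells = [(r, c) for r in range(len(GRID)) for c in range(len(GRID[0]))]
    # One value map per FSA state, mirroring "one VIN for each FSA state".
    V = {f: {cell: 0.0 for cell in cells} for f in FSA_STATES}
    for _ in range(sweeps):
        new_V = {f: dict(V[f]) for f in FSA_STATES}
        for f in FSA_STATES:
            if f in ABSORBING:
                continue  # absorbing states keep value 0
            for cell in cells:
                best = float("-inf")
                for action in ACTIONS:
                    nxt = move(cell, action)
                    f_next = fsa_step(f, GRID[nxt[0]][nxt[1]])
                    # The backup looks into the value map of the next FSA state.
                    q = reward(f, f_next) + GAMMA * V[f_next][nxt]
                    best = max(best, q)
                new_V[f][cell] = best
        V = new_V
    return V


V = value_iteration()
print(round(V["S0"][(0, 0)], 4))
```

The key line is the backup that reads `V[f_next][nxt]`: an FSA transition hands planning over to the value map of the next FSA state, which is what couples the per-state planners together.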

SLIDE 10

Experiments – Interpretability

[Figure: panel for S0. Axis labels: Propositions; FSA States (S0, S1, S2, S3, G, T)]

SLIDE 11

Experiments – Interpretability

Picking up the sandwich or the hamburger causes a transition to the next state

[Figure: panel for S0. FSA States: S0, S1, S2, S3, G, T]

SLIDE 12

Experiments – Manipulability

We can modify the FSA so that the agent picks up only the burger and not the sandwich.

[Figure: modified FSA. State labels: Initial State, Picked up, Packed, Picked up, Packed]
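As a rough illustration of this kind of edit, the sketch below removes a single transition from a hand-written FSA table and checks how the accepted behavior changes. The table, the proposition names, and the helper `run` are hypothetical and are not taken from the authors' code.

```python
# Hypothetical FSA table for "pack a burger or a sandwich; then pack a banana".
original = {
    ("S0", "burger"): "S1",
    ("S0", "sandwich"): "S1",
    ("S1", "lunchbox"): "S2",
    ("S2", "banana"): "S3",
    ("S3", "lunchbox"): "G",
}

# Drop the sandwich edge out of S0; every other rule is left untouched.
burger_only = {
    edge: target
    for edge, target in original.items()
    if edge != ("S0", "sandwich")
}


def run(fsa, propositions, start="S0"):
    """Feed propositions through an FSA; unlisted ones leave the state as is."""
    state = start
    for proposition in propositions:
        state = fsa.get((state, proposition), state)
    return state


with_sandwich = ["sandwich", "lunchbox", "banana", "lunchbox"]
with_burger = ["burger", "lunchbox", "banana", "lunchbox"]

print(run(original, with_sandwich))     # G: sandwich is accepted
print(run(burger_only, with_sandwich))  # S0: sandwich no longer makes progress
print(run(burger_only, with_burger))    # G: burger still works
```

Only one entry changes, so the effect on behavior is easy to predict: sequences that rely on the sandwich stop making progress, and everything else is unaffected.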

SLIDE 14

Experiments – Manipulability

We can modify the FSA so that the agent picks up only the burger and not the sandwich.

[Figure: panel for S0. FSA States: S0, S1, S2, S3, G, T]

