
SLIDE 1

Classical Planning

CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017

AIMA, 3rd Edition, Chapter 10 & more about planning

Soleymani

SLIDE 2

What is planning?

 Planning problem: finding a sequence of actions that leads to a goal state, starting from any of the initial states

 A solution (the resulting sequence of actions) is optimal if it minimizes the sum of action costs

 Search-based problem-solving agents are a special case of planning agents

 Result(s, a) is a black-box function, and states are also black boxes that are either goal states or not.

SLIDE 3

General planning problem

 The environment can be

 dynamic, nondeterministic, partially observable, continuous, multi-agent

 Actions may

 take time (have durations)

 have continuous effects

 be taken concurrently

 There may be arbitrarily many initial states, and the goal may be to optimize an objective function

SLIDE 4

Classical planning

 We focus on classical planning

 Environment: deterministic, static, fully observable, discrete, single agent

 Actions: duration-less, taken only one at a time

 Initial state: a unique known one

 Goal state: specified goal states

 Importance

 Most recent progress in planning is based on classical planning

 It also provides useful ideas for more complex problems

SLIDE 5

Applications

 Robotics

 Spacecraft and Mars rover mission control

 Transportation of cargo, people, …

 Interactive decision making

 Military operations

 Astronomical observations

SLIDE 6

Why planning?

 Planning as a form of general problem solving

 Idea: problems are described at a high level and solved automatically

 Representation of planning problems

 Scaling up to larger problems

 Deriving domain-independent heuristics automatically

SLIDE 7

Representation has a key role

 States are represented more explicitly than atomic (black-box) ones.

 Actions for state transitions are represented in a concise and declarative manner.

 This type of representation can be used to solve problems effectively.

SLIDE 8

Representation of states and actions

 Representation of states (logic, set theory, …)

 Conjunction of ground, function-free, positive literals

 Closed-world assumption: any fluents that are not mentioned are false

 Representation of actions (logic, set theory, …)

 Specifying the result of an action in terms of what changes

 e.g., described by sets of preconditions and effects (post-conditions)

SLIDE 9

Representing actions

9

 Actions are described in terms of preconditions and

effects.



Preconditions are predicates that must be true before applying the action



Effects are predicates that are made true (or false) after executing the action

SLIDE 10

Representation language

 Concise description

 PDDL (Planning Domain Definition Language)

 States, actions and goals are described in the language of symbolic logic

 Predicates denote particular features of the world.

 Does not allow quantifiers and functions

 Other languages: STRIPS, ADL

SLIDE 11

PDDL description of a planning problem

 Initial state

 Conjunction of ground atoms

 Goal states

 Conjunction of literals (positive or negative) that may contain variables

 Variables are treated as existentially quantified

 Actions

 Action schema (lifted representation)

 Action name

 List of variables

 Precondition

 Effect

SLIDE 12

Example: Air cargo transfer

 Domain

 Objects:

 airports (SFO, JFK, ...), cargos (C1, C2, …), airplanes (P1, P2, …)

 Predicates:

 At(p, a), In(c, p), Plane(p), Cargo(c), Airport(a)

 States:

 Planes and cargos are at specific airports.

 Actions:

 Load(cargo, plane, airport)

 Fly(plane, airport1, airport2)

 Unload(cargo, plane, airport)

SLIDE 13

Example: Air cargo transfer (actions in PDDL)

 Action(Load(c, p, a),
 PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a),
 EFFECT: ¬At(c, a) ∧ In(c, p))

 Action(Unload(c, p, a),
 PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a),
 EFFECT: At(c, a) ∧ ¬In(c, p))

 Action(Fly(p, from, to),
 PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to),
 EFFECT: ¬At(p, from) ∧ At(p, to))
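One way to hold such action schemas in a program is sketched below. This is an illustration only: the `Action` class and the `load` and `fly` helpers are hypothetical names, and ground atoms are simply represented as strings. Negated effects go into a delete set, positive effects into an add set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground action; precond, add and delete are sets of ground atoms."""
    name: str
    precond: frozenset
    add: frozenset      # positive effects
    delete: frozenset   # negated effects


def load(c, p, a):
    """Ground the Load(c, p, a) schema for concrete objects."""
    return Action(
        name=f"Load({c}, {p}, {a})",
        precond=frozenset({f"At({c}, {a})", f"At({p}, {a})",
                           f"Cargo({c})", f"Plane({p})", f"Airport({a})"}),
        add=frozenset({f"In({c}, {p})"}),
        delete=frozenset({f"At({c}, {a})"}),
    )


def fly(p, origin, dest):
    """Ground the Fly(p, from, to) schema for concrete objects."""
    return Action(
        name=f"Fly({p}, {origin}, {dest})",
        precond=frozenset({f"At({p}, {origin})", f"Plane({p})",
                           f"Airport({origin})", f"Airport({dest})"}),
        add=frozenset({f"At({p}, {dest})"}),
        delete=frozenset({f"At({p}, {origin})"}),
    )


action = load("C1", "P1", "SFO")
print(action.name)            # Load(C1, P1, SFO)
print(sorted(action.delete))  # ['At(C1, SFO)']
```

Splitting the effect into add and delete sets is a design choice that makes the later set-based state update straightforward.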

SLIDE 14

Example: Blocks World

 Domain

 Objects:

 A set of blocks (A, B, C, …) and a table (Table).

 Predicates:

 On(b, x), Clear(b), Block(b)

 States:

 Blocks are stacked on other blocks and the table.

 Actions (here, two actions):

 Move from a tower or table to another tower

 Move to table

[Figure: initial state (C on A; A and B on the table) and goal (A on B, B on C)]

SLIDE 15

Example: Blocks World

 Initial state

 On(A, Table) ∧ On(B, Table) ∧ On(C, A) ∧ Block(A) ∧ Block(B) ∧ Block(C) ∧ Clear(B) ∧ Clear(C)

 Goal state

 On(A, B) ∧ On(B, C)

 Actions

 Move(b, x, y)

 PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ Block(y) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y)

 EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)

 MoveToTable(b, x)

 PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x)

 EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)

SLIDE 16

Example: Blocks World in PDDL

 Init(On(A, Table) ∧ On(B, Table) ∧ On(C, A) ∧ Block(A) ∧ Block(B) ∧ Block(C) ∧ Clear(B) ∧ Clear(C))

 Goal(On(A, B) ∧ On(B, C))

 Action(Move(b, x, y),
 PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ Block(y) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y),
 EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y))

 Action(MoveToTable(b, x),
 PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x),
 EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x))

SLIDE 17

Planning problem & approaches

17

 Planning problem:

 Given: start state, goal conditions, actions  Aim: finding a sequence of actions leading from start to goal

 The most popular approaches to solve it:

 Forward state-space search (+ heuristics)

 e.g., Fast-Forward (FF)

 Backward state-space search (+ constraints)

 e.g., GraphPlan

 Reduction to propositional satisfiability problem

 SATPlan

 Search in the space of plans

 Partial Order Planning (POP)

SLIDE 18

Planning as a state-space search

 States  Actions  Path Cost  Goal Test

18

SLIDE 19

[Figure: blocks-world state space for blocks A, B, C, with the initial state and the goal marked]

SLIDE 20

Forward state-space search

 Progression: starting from the initial state and applying actions to reach a goal state.

[Figure: forward expansion of blocks-world states from the initial state]

SLIDE 21

Forward state-space search (Cont.)

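 An action $a$ is applicable in state $s$ if its precondition is satisfied:

$\mathrm{PRE}^{+}(a) \subseteq s, \qquad \mathrm{PRE}^{-}(a) \cap s = \emptyset$

 The state $s'$ that results from executing $a$ in $s$ (progressing $s$ through $a$) is given by:

$s' = (s - \mathrm{DEL}(a)) \cup \mathrm{ADD}(a)$

$\mathrm{RESULT}(s, a) = (s - \mathrm{DEL}(a)) \cup \mathrm{ADD}(a) \Leftrightarrow a \in \mathrm{ACTIONS}(s)$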
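The applicability test and the progression step map directly onto set operations. The sketch below assumes states are frozensets of ground atoms written as strings; the function names are illustrative, and the Block(...) literals of the blocks world are left out for brevity.

```python
def applicable(state, pre_pos, pre_neg=frozenset()):
    """PRE+(a) is a subset of s, and PRE-(a) shares no atom with s."""
    return pre_pos <= state and not (pre_neg & state)


def result(state, add, delete):
    """Progress a state through an action: s' = (s - DEL(a)) | ADD(a)."""
    return (state - delete) | add


# Blocks world: C sits on A; A and B are on the table.
state = frozenset({"On(A,Table)", "On(B,Table)", "On(C,A)",
                   "Clear(B)", "Clear(C)"})

# Ground instance MoveToTable(C, A).
pre = frozenset({"On(C,A)", "Clear(C)"})
add = frozenset({"On(C,Table)", "Clear(A)"})
delete = frozenset({"On(C,A)"})

if applicable(state, pre):
    next_state = result(state, add, delete)
    print(sorted(next_state))
```

Removing the delete list before adding the add list matters when an atom appears in both; the order above follows the formula on the slide.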

SLIDE 22

Forward Search

Forward-Search(s, g)
    if s satisfies g then return []
    applicable = {a | a is applicable in s}
    if applicable = ∅ then return failure
    for each a ∈ applicable do
        s′ = (s − DEL(a)) ∪ ADD(a)
        π′ = Forward-Search(s′, g)
        if π′ ≠ failure then return [a | π′]
    return failure
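A runnable sketch of this procedure on the three-block problem is given below. It is not the slide's exact algorithm: a shared `visited` set is added so the depth-first recursion cannot loop on repeated states, and the grounding helper, the tuple layout `(name, precond, add, delete)` and the omission of Block(...) literals are assumptions made for brevity. The plan it returns is valid but not necessarily shortest.

```python
from itertools import permutations

BLOCKS = ["A", "B", "C"]


def ground_actions():
    """Ground Move and MoveToTable as (name, precond, add, delete) tuples."""
    actions = []
    for b, x, y in permutations(BLOCKS + ["Table"], 3):
        if b == "Table" or y == "Table":
            continue  # only blocks move, and Move always targets a block
        add = {f"On({b},{y})"}
        if x != "Table":
            add.add(f"Clear({x})")  # the table needs no Clear literal
        actions.append((f"Move({b},{x},{y})",
                        frozenset({f"On({b},{x})", f"Clear({b})", f"Clear({y})"}),
                        frozenset(add),
                        frozenset({f"On({b},{x})", f"Clear({y})"})))
    for b, x in permutations(BLOCKS, 2):
        actions.append((f"MoveToTable({b},{x})",
                        frozenset({f"On({b},{x})", f"Clear({b})"}),
                        frozenset({f"On({b},Table)", f"Clear({x})"}),
                        frozenset({f"On({b},{x})"})))
    return actions


def forward_search(state, goal, actions, visited=None):
    """Depth-first progression; returns a list of action names or None."""
    visited = set() if visited is None else visited
    if goal <= state:
        return []
    visited.add(state)
    for name, pre, add, delete in actions:
        if pre <= state:
            successor = (state - delete) | add
            if successor in visited:
                continue  # repeated state: skip it
            rest = forward_search(successor, goal, actions, visited)
            if rest is not None:
                return [name] + rest
    return None


def run_plan(state, plan, actions):
    """Execute a plan step by step, checking applicability on the way."""
    by_name = {name: (pre, add, delete) for name, pre, add, delete in actions}
    for name in plan:
        pre, add, delete = by_name[name]
        assert pre <= state, f"{name} is not applicable"
        state = (state - delete) | add
    return state


INIT = frozenset({"On(A,Table)", "On(B,Table)", "On(C,A)", "Clear(B)", "Clear(C)"})
GOAL = frozenset({"On(A,B)", "On(B,C)"})
ACTIONS = ground_actions()
plan = forward_search(INIT, GOAL, ACTIONS)
print(plan)
```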

22

SLIDE 23

Backward state-space search

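 An action $a$ is relevant for goal $g$ if $a$ can be the last step in a plan leading to $g$:

$g \cap \mathrm{ADD}(a) \neq \emptyset, \qquad g \cap \mathrm{DEL}(a) = \emptyset$

 Regression: to achieve goal $g$, we regress it through a relevant action $a$ ($a$ as the final step of a plan to reach $g$):

$g' = (g - \mathrm{ADD}(a)) \cup \mathrm{PRE}(a)$

[Figure: blocks-world goal state regressed through Move(A, x, B)]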

SLIDE 24

Regression Example

 goal = {On(B, C), On(A, Table)}

 Relevant action: a = Move(B, A, C)

ADD(a): On(B, C)   DEL(a): On(B, A)   PREC(a): On(B, A), Clear(B), Clear(C)

g ∩ ADD(a) = {On(B, C)} ≠ ∅   g ∩ DEL(a) = ∅

 Regression (add the preconditions of a, remove the predicates in the add list of a)

 goal′ = {On(A, Table), On(B, A), Clear(B), Clear(C)}

[Figure: blocks-world state before Move(B, A, C)]
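The relevance test and the regression step can be checked mechanically. The sketch below uses the simplified ADD, DEL and PREC lists from this example (Block literals and the Clear effects are left out, as on the slide), writes the table literal as On(A,Table), and uses illustrative function names.

```python
def relevant(goal, add, delete):
    """a is relevant for g if it adds part of g and deletes none of it."""
    return bool(goal & add) and not (goal & delete)


def regress(goal, precond, add):
    """Regress g through a: g' = (g - ADD(a)) | PRE(a)."""
    return (goal - add) | precond


goal = frozenset({"On(B,C)", "On(A,Table)"})

# Simplified description of Move(B, A, C), as on the slide.
add = frozenset({"On(B,C)"})
delete = frozenset({"On(B,A)"})
precond = frozenset({"On(B,A)", "Clear(B)", "Clear(C)"})

subgoal = regress(goal, precond, add) if relevant(goal, add, delete) else None
print(sorted(subgoal))
```

The regressed goal no longer mentions On(B,C), because the chosen action supplies it.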

SLIDE 25

Backward Search

Backward-Search(s, g)
    if s satisfies g then return []
    relevant = {a | a is relevant to g}
    if relevant = ∅ then return failure
    for each a ∈ relevant do
        g′ = (g − ADD(a)) ∪ PREC(a)
        π′ = Backward-Search(s, g′)
        if π′ ≠ failure then return [π′ | a]
    return failure

25

SLIDE 26

Backward Search

 Instantiating schemas

 The goal is a conjunction of literals that may contain variables

 To be more efficient, instantiate schema variables by unification rather than generating and testing different actions

 For most domains, it has a lower branching factor than forward search

 Heuristics are more difficult to use

 It is based on sets of states rather than individual states.

SLIDE 27

Regression Example

 goal = {On(B, C), On(A, Table)}

 Relevant action: a = Move(B, ?, C)

ADD(a): On(B, C)   DEL(a): On(B, ?)   PREC(a): On(B, ?), Clear(B), Clear(C)

g ∩ ADD(a) = {On(B, C)} ≠ ∅   g ∩ DEL(a) = ∅

 Regression (add the preconditions of a, remove the predicates in the add list of a)

 goal′ = {On(A, Table), On(B, ?), Clear(B), Clear(C)}

[Figure: blocks-world state before Move(B, ?, C)]

SLIDE 28

State-space search problems

 Both forward and backward search may suffer from repeated states

 visited states must be recorded

 What’s wrong with search?

 The branching factor is usually too high.

 Combinatorial explosion if a state is given by a set of possible worlds/logical interpretations/variable assignments

SLIDE 29

Heuristic for planning

 Solving problems by searching atomic states (Chapter 3)

 Human intelligence is usually used to define domain-specific heuristics

 Assumption: “path cost = number of plan steps”

 We want to estimate the number of steps needed to reach g from s

 In planning problems, we use a factored representation of states

 Allows us to find domain-independent heuristics

SLIDE 30

Heuristic for planning

 Heuristics:

 Relaxed problems:

 Ignore delete lists  Ignore preconditions

 Problem decomposition

 Sub-goal independence assumption

30

SLIDE 31

Heuristics: relaxed problems

31

1)

Ignore delete lists:

Delete negative effects from actions, solve relaxed problem and use the length of plan as heuristic

 Admissible?  Can we solve this problem in polynomial time?

2)

Ignore preconditions:

Delete all preconditions from actions, solve relaxed problem and use the length of plan as heuristic

 Admissible?  Can we solve this problem in polynomial time?

SLIDE 32

Heuristics: problem decomposition

$f(p, s)$: minimum number of steps needed to reach proposition $p$ from $s$

 Sum of the costs of reaching each sub-goal from $s$

$h_{sum}(s) = \sum_{g \in G} f(g, s)$

 Not necessarily admissible

 the independence assumption can be pessimistic

 Max of the costs of reaching each sub-goal from $s$

$h_{max}(s) = \max_{g \in G} f(g, s)$

SLIDE 33

Heuristics: problem decomposition (sum or max)

 Max or sum?

 Admissibility vs. accuracy

 Sum works well in practice for problems that are largely decomposable.

 How to compute $f(p, s)$?

SLIDE 34

Ignore delete lists & problem decomposition

 When both ignoring delete lists and decomposing the problem

 we can compute $f(p, s)$ in polynomial time using the planning graph (introduced in the following slides).

 Examples of planners that use such heuristics:

 HSP

 Fast-Forward (FF)

 Competed in the fully automated track of AIPS 2000

 Granted “Group A distinguished performance Planning System”

 Estimates the heuristic with the help of a planning graph

J. Hoffmann, B. Nebel, “The FF planning system: Fast plan generation through heuristic search”, Journal of Artificial Intelligence Research 14 (2001), 253-302

SLIDE 35

Planning graph

 A way to find accurate heuristics

 (Under)estimating the number of steps required to reach g

 Admissible

 A layered graph that keeps track of literal pairs and action pairs that cannot be reached simultaneously (mutexes)

SLIDE 36

Planning graph: structure

 Directed, leveled graph

 Two types of levels:

 $P$: proposition levels

 $A$: action levels

 Proposition and action levels alternate

 Edges (between levels)

 Precondition: each action at $A_i$ is connected to its preconditions at $P_i$

 Effect: each action at $A_i$ is connected to its effects at $P_{i+1}$

SLIDE 37

Planning graph: layers

 $P_i$ contains all the literals that could hold at time $i$

 $A_i$ contains all actions whose preconditions are satisfied in $P_i$, plus no-op actions (to handle the frame problem).

[Figure: alternating levels P_0 (initial state), A_0, P_1, A_1, P_2, …]

SLIDE 38

Planning graph: layers

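$P_0 = \{p \in \mathit{Init}\}$

$A_i = \{a \text{ is an action} \mid \mathrm{PRECONDS}(a) \subseteq P_i\}$

$P_{i+1} = \{p \in \mathrm{EFFECT}(a) \mid a \in A_i\}$

[Figure: alternating levels P_0 (initial state), A_0, P_1, A_1, …]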

SLIDE 39

Planning graph: Cake example

Init(Have(Cake))

Goal(Have(Cake) ∧ Eaten(Cake))

Action(Eat(Cake)
PRECOND: Have(Cake)
EFFECT: ¬Have(Cake) ∧ Eaten(Cake))

Action(Bake(Cake)
PRECOND: ¬Have(Cake)
EFFECT: Have(Cake))

[Figure: planning graph for the cake example, including no-op actions]

SLIDE 40

Planning graph: Spare tire example

Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))

Goal(At(Spare, Axle))

Action(Remove(obj, loc),
PRECOND: At(obj, loc),
EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))

Action(PutOn(t, Axle),
PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle),
EFFECT: ¬At(t, Ground) ∧ At(t, Axle))

Action(LeaveOvernight,
PRECOND:
EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))

SLIDE 41

Planning graph: Spare tire example

41

SLIDE 42

Planning graphs: properties

 At level $P_i$, both $P$ and $\neg P$ may exist.

 A literal may appear at level $P_i$ even though it cannot actually be true until a later level (if at all)

 A literal never appears too late in the planning graph.

SLIDE 43

Planning graphs: cost of each goal literal

 How difficult is it to achieve a goal literal $g_i$ from $s$?

 Level cost of $g_i$, written $lc(g_i, s)$: the first level of the planning graph at which $g_i$ appears.

 Relation to previously introduced heuristics?

 Is it accurate?

SLIDE 44

Planning graphs: heuristics

$h_{max\_level}(s) = \max_{g_i \in goal} lc(g_i, s)$

$h_{level\_sum}(s) = \sum_{g_i \in goal} lc(g_i, s)$
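Both heuristics can be read off a planning graph built without mutexes. The sketch below is a simplified illustration: literals are strings with `~` marking negation, actions are hypothetical `(precond, effects)` pairs, and no-op actions are modelled by carrying every literal forward to the next level.

```python
def level_costs(init, actions):
    """Map each reachable literal to the first level at which it appears."""
    cost = {p: 0 for p in init}
    current = set(init)
    level = 0
    while True:
        nxt = set(current)  # no-ops carry every literal forward
        for pre, eff in actions:
            if pre <= current:
                nxt |= eff
        if nxt == current:  # the graph has leveled off
            return cost
        level += 1
        for p in nxt - current:
            cost[p] = level
        current = nxt


def h_max_level(goal, cost):
    return max(cost.get(g, float("inf")) for g in goal)


def h_level_sum(goal, cost):
    return sum(cost.get(g, float("inf")) for g in goal)


# Cake example: Eat and Bake as (precond, effects) pairs.
eat = (frozenset({"Have(Cake)"}), frozenset({"~Have(Cake)", "Eaten(Cake)"}))
bake = (frozenset({"~Have(Cake)"}), frozenset({"Have(Cake)"}))

cost = level_costs({"Have(Cake)"}, [eat, bake])
goal = {"Have(Cake)", "Eaten(Cake)"}
print(h_max_level(goal, cost), h_level_sum(goal, cost))
```

For this goal both estimates are 1, while the real plan (Eat, then Bake) needs two steps, which shows how level costs can underestimate when mutexes are ignored.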

SLIDE 45

Planning graphs: constraints

 Mutual exclusion (mutex) links

 Two actions at a given action level are mutually exclusive if no valid plan could possibly contain both.

 Two propositions at a given proposition level are mutually exclusive if no valid plan could possibly make both true.

 This structure helps reduce the search for a sub-graph of a planning graph that might correspond to a valid plan.

SLIDE 46

Planning graphs: constraints

 Mutexes between actions

 Inconsistent effects: one action negates an effect of the other

 Interference: one of the effects of one action is the negation of a precondition of the other

 Competing needs: mutually exclusive preconditions

 Mutexes between literals

 One of the literals is the negation of the other

 Inconsistent support: each possible pair of actions that could achieve them (at this level) is mutually exclusive.
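The three action-mutex conditions can be tested pairwise. In the sketch below, actions are hypothetical `(precond, effects)` pairs of literal strings with `~` for negation, and `literal_mutexes` is an assumed set of unordered literal pairs already known to be mutex at the previous proposition level.

```python
def neg(literal):
    """Return the complementary literal."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def mutex_reasons(a, b, literal_mutexes=frozenset()):
    """Return the set of reasons why actions a and b are mutex (empty if none)."""
    pre_a, eff_a = a
    pre_b, eff_b = b
    reasons = set()
    if any(neg(e) in eff_b for e in eff_a):
        reasons.add("inconsistent effects")
    if any(neg(e) in pre_b for e in eff_a) or any(neg(e) in pre_a for e in eff_b):
        reasons.add("interference")
    if any(neg(p) == q or frozenset({p, q}) in literal_mutexes
           for p in pre_a for q in pre_b):
        reasons.add("competing needs")
    return reasons


# Cake example actions plus two no-ops.
eat = (frozenset({"Have(Cake)"}), frozenset({"~Have(Cake)", "Eaten(Cake)"}))
bake = (frozenset({"~Have(Cake)"}), frozenset({"Have(Cake)"}))
noop_have = (frozenset({"Have(Cake)"}), frozenset({"Have(Cake)"}))
noop_not_eaten = (frozenset({"~Eaten(Cake)"}), frozenset({"~Eaten(Cake)"}))

print(sorted(mutex_reasons(eat, noop_have)))
print(sorted(mutex_reasons(bake, eat)))
```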

SLIDE 47

Planning graphs: constraints Types of mutexes

[Figure: the four types of mutex: inconsistent effects, inconsistent support, competing needs, interference (precondition-effect)]

SLIDE 48

Planning graph: Spare tire example

48

SLIDE 49

Planning graph: more accurate heuristic

 We want to define a more accurate heuristic using the mutexes: $h_2$ (set-level heuristic) is the level at which all the goal literals appear without any pair of them being mutually exclusive.

 $h_1$ (max-level heuristic) is extended to $h_2$ by considering mutexes between all pairs of propositions.

 $h_2$ is more useful than $h_1$ ($0 \le h_1 \le h_2 \le h^*$)

SLIDE 50

Planning graph: more accurate heuristics

 $h_2$ can be extended to $h_3$ by defining and considering inconsistencies among triples of propositions

 In general

 $h_k$ are admissible

 $h_{k+1} \ge h_k$

 Computing $h_k$ is $O(n^k)$ with $n$ propositions

 $k = 2$ is commonly used

SLIDE 51

GraphPlan: basic idea

 Construct a graph that encodes constraints on plans

 Use this graph to constrain the search for a valid plan:

 If a valid plan exists, it is a sub-graph of the planning graph.

 Actions at the same level don’t interfere

 Each action’s preconditions are made true by the plan

 Goals are satisfied

 A planning graph can be built for each problem in polynomial time.

SLIDE 52

GraphPlan: level off

 Definition: Planning Graph levels off if two consecutive

proposition levels are identical (both literals and mutexes).

 We will show that the set of literals never decreases in

the proposition levels and mutexes don’t reappear.

52

SLIDE 53

GraphPlan: level off (Observation 1)

Literals monotonically increase

[Figure: successive proposition levels {p, ¬q, ¬r}, {p, q, ¬q, ¬r}, {p, q, ¬q, r, ¬r}, {p, q, ¬q, r, ¬r}, with actions A and B between them]

Propositions are always carried forward by no-ops.

SLIDE 54

GraphPlan: level off (Observation 2)

Actions monotonically increase (Once an action appears at a level, it will appear at all subsequent levels)

If the preconditions of an action appear at one level, they appear at all subsequent levels, and thus so does the action.

[Figure: successive proposition levels {p, ¬q, ¬r}, {p, q, ¬q, ¬r}, {p, q, ¬q, r, ¬r}, {p, q, ¬q, r, ¬r}, with actions A and B between them]

SLIDE 55

GraphPlan: level off (Observation 3)

Proposition mutex relationships monotonically decrease

[Figure: successive proposition levels p, q, r, … with action A and the mutex links between propositions]

Available actions are monotonically increasing. Thus mutex relations between literals are decreasing.

(When mutexes between literals are due to mutex relations between actions, they may be removed in the next levels)

SLIDE 56

GraphPlan: level off (Observation 4)

Action mutex relationships monotonically decrease

[Figure: successive proposition levels p, q, r, s, … with actions A, B, C and the mutex links between actions]

Mutex relations between actions due to competing needs (when preconditions are not negations of each other) must be decreasing.

SLIDE 57

GraphPlan Algorithm

1) Graph levels are constructed until all goals are reached and no two of them are mutex (a necessary, but usually insufficient, condition for plan existence).

 If the PG levels off before reaching this level, GraphPlan returns failure.

2) ExtractSolution phase: search the PG for a valid plan

3) If none is found, add a level to the PG and go to step 2.

GraphPlan builds the graph forward and extracts the plan backward

SLIDE 58

GraphPlan: “Extract Solution” phase

 Some ways

 As a backward search

 looks for actions that produce the goals while pruning as many of them as possible via incompatibility information.

 As a heuristic search

 computes an admissible heuristic for each state and then uses it during search.

 As a CSP (related to the SATPlan algorithm)

 Variables: a variable for each action at each level

 Domain = {0, 1}

 Constraints: mutexes

SLIDE 59

Extract Solution: backward search

Start from the last level with agenda = goals

 Termination: $k = 0$

 Action selection: at each level $k$, select any conflict-free subset of actions in $A_{k-1}$ whose effects cover the current goals.

 If no such subset is found, return failure

 Preconditions of the selected actions become the new goals for the recursive call at level $k - 1$.

SLIDE 60

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo) with actions such as Fly(P, A, B), Load(C, P, A) and Unload(C, P, B); the goal is At(C, B)]

SLIDE 61

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo) with actions such as Fly(P, A, B), Load(C, P, A) and Unload(C, P, B); the goal is At(C, B)]

SLIDE 62

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo), extended by one more level, with actions such as Fly(P, A, B), Load(C, P, A), Unload(C, P, B) and Load(C, P, B); the goal is At(C, B)]

SLIDE 63

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo), extended by one more level, with actions such as Fly(P, A, B), Load(C, P, A), Unload(C, P, B) and Load(C, P, B); the goal is At(C, B)]

SLIDE 64

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo), extended by one more level, with actions such as Fly(P, A, B), Load(C, P, A), Unload(C, P, B) and Load(C, P, B); the goal is At(C, B)]

SLIDE 65

GraphPlan: Example

[Figure: planning graph for an air cargo problem (A: Airport, P: Plane, C: Cargo), extended by one more level, with actions such as Fly(P, A, B), Load(C, P, A), Unload(C, P, B) and Load(C, P, B); the goal is At(C, B)]

SLIDE 66

GraphPlan: heuristics for backward search

 Pick first the goal literal with the highest level cost

 To achieve a literal, prefer actions with easier preconditions.

 i.e., actions for which the sum (or max) of the level costs of the preconditions is smallest.

SLIDE 67

Planning as a satisfiability problem

 Bounded planning problem $(P, k)$:

 $P$ is a planning problem

 Find a solution for $P$ of length $k$

1) Translate $(P, k)$ into a SAT problem.

2) Solve the SAT problem.

3) Convert the solution to a plan.

[Figure: planning problem → translate to propositional logic → satisfiability problem → satisfiability solver → logical model → decode solution → solution plan]

SLIDE 68

Pictorial view of fluents for (P, k)

[Figure: state fluents s0 (initial state, t = 0) through sk (t = k), interleaved with action fluents a0 through ak-1 (t = k − 1)]

 A truth assignment selects a subset of these nodes to be true

 Satisfying assignments of the propositional formula correspond to valid plans

SLIDE 69

Translating PDDL to propositional logic

 Initial state: conjunction of all literals true at time 0 (and negations of the literals not mentioned)

 Goal state: conjunction of all goal literals at time $k$

 Instantiate literals containing variables (replace with a disjunction ∨ over constants).

 Actions

 successor-state axioms at each time up to $t$

 $F^{t+1} \Rightarrow \mathit{ActionCausesF}^{t} \lor (F^{t} \land \neg \mathit{ActionCausesNotF}^{t})$

 precondition axioms:

 $A^{t} \Rightarrow \mathrm{PRECOND}(A)^{t}$

 action exclusion axioms:

 $\neg A_i^{t} \lor \neg A_j^{t}$

SLIDE 70

Translating PDDL to propositional logic: Example

 Initial state: conjunction of all literals true at time 0 (and negations of the literals not mentioned)

 Init(On(A, B) ∧ On(B, Table))

 $\mathit{On}(A, B)^{0} \land \mathit{On}(B, \mathit{Table})^{0} \land \neg \mathit{On}(B, A)^{0} \land \neg \mathit{On}(A, \mathit{Table})^{0}$

 Goal state: conjunction of all goal literals at time $k$

 Instantiate literals containing variables (replace with a disjunction ∨ over constants).

 Goal(On(B, A))

 $\mathit{On}(B, A)^{1}$ (for $k = 1$)

[Figure: initial state (A on B) and goal (B on A)]

SLIDE 71

Translating PDDL to propositional logic: Example

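 Add successor-state axioms at each time up to $t$

 $F^{t+1} \Rightarrow \mathit{ActionCausesF}^{t} \lor (F^{t} \land \neg \mathit{ActionCausesNotF}^{t})$

 Example: $\mathit{On}(B, A)^{t+1} \Rightarrow \mathit{Move}(B, \mathit{Table}, A)^{t} \lor [\mathit{On}(B, A)^{t} \land \neg \mathit{Move}(B, A, \mathit{Table})^{t}]$

 Add precondition axioms:

 $A^{t} \Rightarrow \mathrm{PRECOND}(A)^{t}$

 Example: $\mathit{Move}(B, \mathit{Table}, A)^{t} \Rightarrow \mathit{On}(B, \mathit{Table})^{t} \land \mathit{Clear}(B)^{t} \land \mathit{Clear}(A)^{t}$

 Is it necessary to include effects: $A^{t} \Rightarrow \mathrm{EFFECT}(A)^{t+1}$?

 Add action exclusion axioms:

 $\neg A_i^{t} \lor \neg A_j^{t}$

 Example: $\neg \mathit{Move}(B, \mathit{Table}, A)^{0} \lor \neg \mathit{MoveToTable}(A, B)^{0}$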

SLIDE 72

Propositional logic solver and decoding

 Apply a SAT solver to the whole sentence

 The sentence is the conjunction of the encoded initial state, goal, successor-state axioms, precondition axioms, and action exclusion axioms

 If an assignment of truth values that satisfies the sentence is found, extract the action sequence.

 This means $P$ has a solution of length $k$

 Extract solution: for $i = 0, \ldots, k - 1$, exactly one action has been assigned “True”

 This is the $i$-th action of the plan.

SLIDE 73

SATPlan

function SATPLAN(init, transition, goal, T_max) returns solution or failure
    inputs: init, transition, goal, constitute a description of the problem
            T_max, an upper limit for plan length
    for t = 0 to T_max do
        cnf ← TRANSLATE_TO_SAT(init, transition, goal, t)
        model ← SAT_SOLVER(cnf)
        if model ≠ {} then
            return EXTRACT_SOLUTION(model)
    return failure

It is guaranteed to find the shortest plan if one exists within T_max steps.

SLIDE 74

SATPlan example

 Domain:

 Robot R

 Two locations L1, L2

 One operator, “move”, which moves the robot

 Initial state: At(R, L1)

 Goal: At(R, L2)

 Action schema:

 Move(r, l, l′)

 PRECOND: At(r, l)

 EFFECT: At(r, l′) ∧ ¬At(r, l)

[Figure: the two locations L1 and L2]

SLIDE 75

SATPlan example (translation to SAT)

 Encode (P, 1)

 Initial state:

 At(R, L1, 0) ∧ ¬At(R, L2, 0)

 Goal:

 At(R, L2, 1)

 Action preconditions:

 Move(R, L1, L2, 0) ⇒ At(R, L1, 0)

 Move(R, L2, L1, 0) ⇒ At(R, L2, 0)

 Action exclusion axiom:

 ¬Move(R, L2, L1, 0) ∨ ¬Move(R, L1, L2, 0)

SLIDE 76

SATPlan example (translation to SAT)

 Fluents (successor-state axioms):

 ¬At(R, L1, 0) ∧ At(R, L1, 1) ⇒ Move(R, L2, L1, 0)

 ¬At(R, L2, 0) ∧ At(R, L2, 1) ⇒ Move(R, L1, L2, 0)

 At(R, L1, 0) ∧ ¬At(R, L1, 1) ⇒ Move(R, L1, L2, 0)

 At(R, L2, 0) ∧ ¬At(R, L2, 1) ⇒ Move(R, L2, L1, 0)

SLIDE 77

SATPlan example (translation to SAT)

SAT formula for (P, 1):

At(R, L1, 0) ∧ ¬At(R, L2, 0)
∧ At(R, L2, 1)
∧ [Move(R, L1, L2, 0) ⇒ At(R, L1, 0)]
∧ [Move(R, L1, L2, 0) ⇒ At(R, L2, 1)]
∧ [¬Move(R, L2, L1, 0) ∨ ¬Move(R, L1, L2, 0)]
∧ [¬At(R, L1, 0) ∧ At(R, L1, 1) ⇒ Move(R, L2, L1, 0)]
∧ [¬At(R, L2, 0) ∧ At(R, L2, 1) ⇒ Move(R, L1, L2, 0)]
∧ [At(R, L1, 0) ∧ ¬At(R, L1, 1) ⇒ Move(R, L1, L2, 0)]
∧ [At(R, L2, 0) ∧ ¬At(R, L2, 1) ⇒ Move(R, L2, L1, 0)]

The formula above is converted to CNF and solved by a SAT solver.

SLIDE 78

SATPlan example (Extracting a plan)

 The formula can be satisfied with Move(R, L1, L2, 0) = true

 ⇒ Move(R, L1, L2, 0) is a solution (and the only one) for the planning problem with a 1-step plan
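For a formula this small, the claim can be checked by brute force instead of a real SAT solver. The sketch below enumerates all truth assignments over the six propositions of (P, 1). It assumes the axioms are the initial state, the goal, the two precondition axioms, the exclusion axiom, and four frame axioms of the form "a fluent changes only if the corresponding Move occurs"; the variable and function names are illustrative.

```python
from itertools import product

VARS = ["At(R,L1,0)", "At(R,L2,0)", "At(R,L1,1)", "At(R,L2,1)",
        "Move(R,L1,L2,0)", "Move(R,L2,L1,0)"]


def implies(p, q):
    return (not p) or q


def formula(v):
    """Conjunction of all axioms of the (P, 1) encoding."""
    a10, a20, a11, a21, m12, m21 = (v[name] for name in VARS)
    return (a10 and not a20                        # initial state
            and a21                                # goal at time 1
            and implies(m12, a10)                  # precondition axioms
            and implies(m21, a20)
            and (not m21 or not m12)               # action exclusion
            and implies(not a10 and a11, m21)      # frame axioms
            and implies(not a20 and a21, m12)
            and implies(a10 and not a11, m12)
            and implies(a20 and not a21, m21))


models = []
for bits in product([False, True], repeat=len(VARS)):
    assignment = dict(zip(VARS, bits))
    if formula(assignment):
        models.append(assignment)

# Decode: the actions set to True in each model form the one-step plan.
plans = {tuple(name for name in VARS[4:] if m[name]) for m in models}
print(plans)
```

Every satisfying assignment selects Move(R, L1, L2, 0) and rejects Move(R, L2, L1, 0). Under these axioms alone At(R, L1, 1) is left unconstrained, which is one way to see why effect axioms are usually added as well.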

SLIDE 79

Layered Plans in SATPlan

 Complete exclusion axiom (only one action at a time):

 For all pairs of actions $a_i$, $b_i$ at each time step $i$:

$\neg a_i \lor \neg b_i$

 Partial exclusion axiom (more than one action may be taken at a time step):

 For any pair of incompatible actions (recall from GraphPlan):

$\neg a_i \lor \neg b_i$

 Fewer time steps may be required (i.e., shorter formulas)

SLIDE 80

Solving SAT problem

80

 Systematic search

 DPLL (Davis Putnam Logemann Loveland)

 Local search

 WalkSAT

SLIDE 81

Partial order planning Sock-shoe example: PDDL

 Init()

 Goal(RightShoeOn ∧ LeftShoeOn)

 Action(RightShoe, PRECOND: RightSockOn, EFFECT: RightShoeOn)

 Action(RightSock, EFFECT: RightSockOn)

 Action(LeftShoe, PRECOND: LeftSockOn, EFFECT: LeftShoeOn)

 Action(LeftSock, EFFECT: LeftSockOn)

SLIDE 82

[Figure: the partial-order plan (Start; Left Sock before Left Shoe and Right Sock before Right Shoe, supplying "Left Sock on" and "Right Sock on"; Finish requires "Left Shoe on" and "Right Shoe on") next to the six total-order plans that linearize it]
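The total-order plans are exactly the linearizations of the partial order. The sketch below enumerates them by filtering permutations against the ordering constraints; the action names and the `linearizations` helper are illustrative, and the approach is only practical for tiny plans.

```python
from itertools import permutations

ACTIONS = ["LeftSock", "LeftShoe", "RightSock", "RightShoe"]
ORDERINGS = [("LeftSock", "LeftShoe"), ("RightSock", "RightShoe")]


def linearizations(actions, orderings):
    """Return every total order of the actions that respects the orderings."""
    result = []
    for perm in permutations(actions):
        position = {a: i for i, a in enumerate(perm)}
        if all(position[a] < position[b] for a, b in orderings):
            result.append(["Start", *perm, "Finish"])
    return result


total_orders = linearizations(ACTIONS, ORDERINGS)
for plan in total_orders:
    print(" -> ".join(plan))
print(len(total_orders))  # 6
```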

SLIDE 83

Partial Order Planning

 Two initial actions

 Start

 No preconditions

 All of the initial state as its effects

 Finish

 All of the goal state as its preconditions

 No effects

SLIDE 84

Partial plan definition

 A partial plan is a tuple ⟨A, O, L⟩ where:

 A: set of actions in the plan (plan steps)

 Initially {Start, Finish}

 O: set of orderings between actions

 Initially {Start < Finish}

 L: set of causal links

 Initially {}

SLIDE 85

Causal links and threats

 Causal link $A_i \xrightarrow{c} A_j$: records the purpose of steps in the plan

 The purpose of $A_i$ is to achieve the precondition $c$ of $A_j$

 Threat: causal links are used to detect when a newly introduced action interferes with past decisions.

 $A_k$ threatens $A_i \xrightarrow{c} A_j$ when:

$A_k$ can come between $A_i$ and $A_j$ ($O \cup \{A_i < A_k < A_j\}$ is consistent)
$A_k$ has $\neg c$ as an effect.

SLIDE 86

Resolving Threats

 Resolve threat: ensure that the threatening action is ordered to come before or after the protected link

 Demotion (placed before): add $S_3 < S_1$ to $O$

 Promotion (placed after): add $S_2 < S_3$ to $O$
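Threat detection and resolution both reduce to checking whether a set of ordering constraints is consistent, that is, free of cycles. The sketch below assumes a causal link from S1 to S2 protecting condition c and a third step S3 with effect ¬c; orderings are pairs `(x, y)` meaning x < y, negation is marked with `~`, and all function names are illustrative.

```python
def consistent(orderings):
    """True if the ordering constraints contain no cycle."""
    successors = {}
    for x, y in orderings:
        successors.setdefault(x, set()).add(y)
    status = {}

    def has_cycle(node):
        status[node] = "open"
        for nxt in successors.get(node, ()):
            if status.get(nxt) == "open":
                return True
            if nxt not in status and has_cycle(nxt):
                return True
        status[node] = "done"
        return False

    return not any(n not in status and has_cycle(n) for n in list(successors))


def threatens(action, effects, link, orderings):
    """action threatens the link if it negates c and can fall inside the link."""
    producer, cond, consumer = link
    between = orderings | {(producer, action), (action, consumer)}
    return ("~" + cond) in effects[action] and consistent(between)


def resolve(action, link, orderings):
    """Return the consistent ways to order the threat outside the link."""
    producer, _, consumer = link
    options = []
    for label, extra in (("demotion", (action, producer)),
                         ("promotion", (consumer, action))):
        if consistent(orderings | {extra}):
            options.append((label, extra))
    return options


orderings = frozenset({("Start", "S1"), ("S1", "S2"), ("S2", "Finish"),
                       ("Start", "S3"), ("S3", "Finish"), ("S3", "S2")})
effects = {"S3": {"~c"}}
link = ("S1", "c", "S2")

print(threatens("S3", effects, link, orderings))
print(resolve("S3", link, orderings))
```

In this instance S3 is already constrained to precede S2, so promotion would create a cycle and only demotion remains.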

SLIDE 87

Spare tire example

Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))

Goal(At(Spare, Axle))

Action(Remove(obj, loc),
PRECOND: At(obj, loc),
EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))

Action(PutOn(t, Axle),
PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle),
EFFECT: ¬At(t, Ground) ∧ At(t, Axle))

Action(LeaveOvernight,
PRECOND:
EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))

SLIDE 88

Spare tire example

88

SLIDE 89

Spare tire example

89

SLIDE 90

Spare tire example

90

SLIDE 91

Spare tire example

91

SLIDE 92

Spare tire example

92

SLIDE 93

POP

 Agenda: open preconditions (along with the actions requiring them)

 Initially all preconditions of Finish

function POP(⟨A, O, L⟩, agenda)
    if agenda = {} then return ⟨A, O, L⟩
    (q, A_need) ← select a goal from agenda
    a ← choose an action that adds q
    if no such action then return failure
    update ⟨A, O, L⟩ and agenda
    add consistent ordering constraints for causal link protection
    if no constraint is consistent then return failure
    return POP(⟨A, O, L⟩, agenda)

SLIDE 94

POP algorithm (more details)

POP(⟨A, O, L⟩, agenda)

1. Termination: If agenda is empty, return ⟨A, O, L⟩
2. Goal selection: Let ⟨Q, Aneed⟩ be a pair on the agenda
3. Action selection: Let Aadd = choose an action that adds Q

If no such action exists, then return failure. Let L′ = L ∪ {Aadd → Aneed} (a causal link labelled Q), and let O′ = O ∪ {Aadd < Aneed}. If Aadd is newly instantiated, then A′ = A ∪ {Aadd} and O′ = O′ ∪ {A0 < Aadd < A∞} (otherwise, let A′ = A)

4. Updating of goal set: Let agenda′ = agenda − {⟨Q, Aneed⟩}.

If Aadd is newly instantiated, then for each conjunct Qi of its precondition, add ⟨Qi, Aadd⟩ to agenda′

5. Causal link protection: For every action At that might threaten a causal link Ap → Ac, add a consistent ordering constraint, either

(a) Demotion: Add At < Ap to O′
(b) Promotion: Add Ac < At to O′

If neither constraint is consistent, then return failure

6. Recursive invocation: POP(⟨A′, O′, L′⟩, agenda′)

SLIDE 95

Shopping example

 Init(At(Home) ∧ Sells(HWS, Drill) ∧ Sells(SM, Milk) ∧ Sells(SM, Banana))

 Goal(Have(Drill) ∧ Have(Milk) ∧ Have(Banana) ∧ At(Home))

 Action(Go(there),
 PRECOND: At(here),
 EFFECT: At(there) ∧ ¬At(here))

 Action(Buy(x),
 PRECOND: At(store) ∧ Sells(store, x),
 EFFECT: Have(x))

SLIDE 96

Shopping example

 Many possible ways to elaborate the initial plan

 Three Buy actions for three preconditions of the Finish action

 Sells precondition of Buy

 Bold arrows: causal links, protection of a precondition

 Light arrows: ordering constraints

SLIDE 97

Shopping example

97

SLIDE 98

Shopping example

98

SLIDE 99

Shopping example

99

SLIDE 100

Shopping example

100

SLIDE 101

Shopping example

101

backtracking

SLIDE 102

Shopping example

102

SLIDE 103

Shopping example

103

SLIDE 104

Shopping example

104

SLIDE 105

Shopping example

105

SLIDE 106

Shopping example

106

SLIDE 107

POP: advantages and disadvantages

 Least commitment may lead to a smaller branching factor

 Postpone instantiating actions (postpone binding values to variables until necessary): e.g., Move(x, B, y) when needing Clear(B).

 More human-like plans

 More complex algorithm than recent planning algorithms

 higher computation per node

 Harder to find proper heuristics for all types of choices

 Action selection, goal selection, order refinement

 How to prune infinitely long paths?

SLIDE 108

Planning advantages

 Planning models are more efficient:

1) Clear action and goal representation to allow selection

2) Problem decomposition by sub-goaling: some goals are independent of most other parts, so a divide-and-conquer strategy can be used

3) Requirement relaxation for sequential construction of solutions

SLIDE 109

Summary

 GraphPlan: winner of the 1998 contest

 SATPlan: winner of the 2004 and 2006 contests

 POP (introduced in the mid-1970s): not competitive with GraphPlan and SATPlan

 Partially ordered plans

 Can generate more human-like plans that can be checked by human operators