Classical Planning
CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017
AIMA, 3rd Edition, Chapter 10 & more about planning
Soleymani
What is planning?
Planning problem: finding a sequence of
Solution (obtained sequence of actions) is optimal if it
Result(s, a) as a black-box function and states are also black-box
dynamic, nondeterministic, partially observable, continuous,
take time (have durations)
have continuous effects
be taken concurrently
Environment: deterministic, static, fully observable, discrete,
Actions: duration-less, taken only one at a time
Initial state: a unique known one
Goal state: specified goal states
Most recent progress is based on classical planning
It also provides useful ideas for more complex problems
Military operations
Astronomical observations

Idea: problems described at high-level and solved automatically
Scaling up to larger problems
Deriving domain-independent heuristics automatically
Conjunction of ground, functionless, and positive literals
Closed world assumption: any fluents that are not mentioned are false
Specifying the result of an action in terms of what changes
e.g., described by sets of preconditions and effects (post-conditions)
States, actions and goals are described in the language of
Predicates denote particular features of the world. Does not allow quantifiers and functions
Conjunction of ground atoms
Conjunction of literals (positive or negative) that may contain variables
Variables are treated as existentially quantified
Action schema (lifted representation)
Action name
List of variables
Precondition
Effect
airports (SFO, JFK, ...), cargos (C1, C2, …), airplanes (P1, P2, …)
At(p, a), In(c, p), Plane(p), Cargo(c), Airport(a)

Action(Load(c, p, a),
  PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a),
  EFFECT: ¬At(c, a) ∧ In(c, p))
Action(Unload(c, p, a),
  PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a),
  EFFECT: At(c, a) ∧ ¬In(c, p))
Action(Fly(p, from, to),
  PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to),
  EFFECT: ¬At(p, from) ∧ At(p, to))
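A lifted action schema stands for every ground action obtained by substituting constants for its variables. The sketch below grounds only the Load schema. The `Action` tuple and the helper `ground_load` are hypothetical names introduced for illustration, and the type predicates Cargo, Plane and Airport are assumed to be enforced by iterating over typed lists of constants rather than stored as facts.

```python
from itertools import product
from typing import NamedTuple


class Action(NamedTuple):
    """A ground STRIPS-style action (illustrative structure, not from the slides)."""
    name: str
    precond: frozenset
    add: frozenset
    delete: frozenset


def ground_load(cargos, planes, airports):
    """Instantiate Load(c, p, a) for every combination of constants."""
    actions = []
    for c, p, a in product(cargos, planes, airports):
        actions.append(Action(
            name=f"Load({c}, {p}, {a})",
            precond=frozenset({f"At({c}, {a})", f"At({p}, {a})"}),
            add=frozenset({f"In({c}, {p})"}),       # positive effect
            delete=frozenset({f"At({c}, {a})"}),    # negated effect
        ))
    return actions


loads = ground_load(["C1", "C2"], ["P1", "P2"], ["SFO", "JFK"])
print(len(loads))     # 2 cargos * 2 planes * 2 airports = 8
print(loads[0].name)
```

The number of ground actions grows with the product of the domain sizes, which is one reason the branching factor of forward search can become large.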
Objects:
A set of blocks (A, B, C, …) and a table (Table).
Predicates:
On(b, x), Clear(b), Block(b)
Blocks are stacked on other blocks and the table.
Move from a tower or table to another tower
Move to table
[Figure: initial state and goal]

Initial state
On(A, Table) ∧ On(B, Table) ∧ On(C, A)
Goal state
On(A, B) ∧ On(B, C)
Actions
Move(b, x, y)
  PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b)
  EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)
MoveToTable(b, x)
  PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x)
  EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)
Init(On(A, Table) ∧ On(B, Table) ∧ On(C, A) ∧ Block(A) ∧ …)
Goal(On(A, B) ∧ On(B, C))
Action(Move(b, x, y),
  PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ Block(y),
  EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y))
Action(MoveToTable(b, x),
  PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x),
  EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x))
Given: start state, goal conditions, actions
Aim: finding a sequence of actions leading from start to goal
Forward state-space search (+ heuristics)
e.g., Fast-Forward (FF)
Backward state-space search (+ constraints)
e.g., GraphPlan
Reduction to propositional satisfiability problem
SATPlan
Search in the space of plans
Partial Order Planning (POP)
[Figure: blocks-world state space over the blocks A, B, C, with the initial state and the goal marked]
RESULT(s, a) = (s − DEL(a)) ∪ ADD(a) ⇔ (a ∈ ACTIONS(s))
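The progression rule can be sketched directly with set operations. This is a minimal illustration that assumes states are sets of ground atoms written as strings; the function names `applicable` and `result` are introduced here, and the Block facts are left out to keep the example short.

```python
def applicable(state, precond):
    """An action is applicable when all its preconditions hold in the state."""
    return precond <= state


def result(state, add, delete):
    """Progression: remove the delete list, then add the add list."""
    return (state - delete) | add


# Blocks-world initial state from the slides (type facts omitted).
state = frozenset({"On(A, Table)", "On(B, Table)", "On(C, A)",
                   "Clear(B)", "Clear(C)"})

# Ground action MoveToTable(C, A).
precond = frozenset({"On(C, A)", "Clear(C)"})
add = frozenset({"On(C, Table)", "Clear(A)"})
delete = frozenset({"On(C, A)"})

if applicable(state, precond):
    next_state = result(state, add, delete)
    print(sorted(next_state))
```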
Move(A, x, B)
?
goal = {on(B, C), on(Table, A), on(B, A), clear(B), clear(C)}
Goal as a conjunction of literals that may contain variables
It is based on sets of states rather than individual states.
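Regression can be sketched in the same set notation. The rule used below, g' = (g − ADD(a)) ∪ PRECOND(a) together with the relevance test, is the usual textbook formulation and is an assumption here, since the slide text does not spell it out. The function names are illustrative.

```python
def relevant(goal, add, delete):
    """Relevant: the action achieves part of the goal and undoes none of it."""
    return bool(goal & add) and not (goal & delete)


def regress(goal, precond, add):
    """Regression: what must hold before the action so the goal holds after."""
    return (goal - add) | precond


goal = frozenset({"On(A, B)", "On(B, C)"})

# Ground action Move(A, Table, B).
precond = frozenset({"On(A, Table)", "Clear(A)", "Clear(B)"})
add = frozenset({"On(A, B)", "Clear(Table)"})
delete = frozenset({"On(A, Table)", "Clear(B)"})

if relevant(goal, add, delete):
    subgoal = regress(goal, precond, add)
    print(sorted(subgoal))
```

The resulting subgoal is a partial description, so it denotes every state that contains these atoms.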
goal = {on(B, C), on(Table, A), on(B, ?), clear(B), clear(C)}
visited states must be recorded
Branching factor is usually too high.
Combinatorial explosion if state given by set of possible worlds/logical
Human intelligence is usually used to define domain-specific
We want to estimate the number of steps needed to reach the goal from s
Allows us to find domain-independent heuristics
Relaxed problems:
Ignore delete lists
Ignore preconditions
Problem decomposition
Sub-goal independence assumption
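Two of the relaxations listed above can be illustrated in a few lines. This is a rough sketch: counting unsatisfied goal literals is only a crude proxy for the ignore-preconditions relaxation (it can be wrong when one action achieves several goals or undoes another), and `h_goal_count` and `relaxed_result` are names introduced here.

```python
def h_goal_count(state, goal):
    """Number of goal literals not yet true in the state."""
    return len(goal - state)


def relaxed_result(state, add):
    """Ignore-delete-lists relaxation: an action only ever adds facts."""
    return state | add


state = frozenset({"On(A, Table)", "On(B, Table)", "On(C, A)",
                   "Clear(B)", "Clear(C)"})
goal = frozenset({"On(A, B)", "On(B, C)"})

print(h_goal_count(state, goal))

# MoveToTable(C, A) in the relaxed problem: On(C, A) is never removed.
relaxed = relaxed_result(state, frozenset({"On(C, Table)", "Clear(A)"}))
print(sorted(relaxed))
```

Because facts are never deleted in the relaxed problem, the set of true atoms only grows, which is what makes the relaxation easier to reason about.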
Admissible? Can we solve this problem in polynomial time?
Admissible? Can we solve this problem in polynomial time?
∈𝐻
Not necessarily admissible
independence assumption can be pessimistic
∈𝐻 𝑔(, 𝑡)
Admissibility vs. accuracy
Sum works well in practice for problems that are largely
HSP
Fast-Forward (FF)
Competed in fully automated track of AIPS’2000
Granted "Group A distinguished performance Planning System"
Estimate the heuristic with the help of a planning graph
Admissible
P: proposition levels
A: action levels
Proposition and action levels alternate
Precondition: each action at A_i is connected to its preconditions at P_i
Effect: each action at A_i is connected to its effects at P_{i+1}
[Figure: levels P_1, A_1, P_2]

P_0 = {p ∈ Init}
A_i = {a is an action | PRECONDS(a) ⊆ P_i}
P_{i+1} = {p ∈ EFFECT(a) | a ∈ A_i}
[Figure: (Initial state) P_0, A_0, P_1]
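The level definitions above can be turned into a small loop. The sketch below ignores mutexes and negative literals, and it models the no-op (persistence) actions implicitly by copying every proposition to the next level. The two actions A and B over propositions p, q, r are made up for illustration.

```python
def expand(props, actions):
    """One expansion step: A_i = applicable actions, P_{i+1} = their effects."""
    layer = [name for name, (pre, eff) in actions.items() if pre <= props]
    nxt = set(props)  # implicit no-op actions carry each proposition forward
    for name in layer:
        nxt |= actions[name][1]
    return layer, frozenset(nxt)


def build_levels(init, actions):
    """Return the proposition levels P_0, P_1, ... until the graph levels off."""
    levels = [frozenset(init)]
    while True:
        _, nxt = expand(levels[-1], actions)
        if nxt == levels[-1]:  # no new propositions: levelled off
            return levels
        levels.append(nxt)


# Hypothetical actions: A needs p and adds q; B needs q and adds r.
actions = {
    "A": (frozenset({"p"}), frozenset({"q"})),
    "B": (frozenset({"q"}), frozenset({"r"})),
}
levels = build_levels({"p"}, actions)
for i, props in enumerate(levels):
    print(i, sorted(props))
```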
no-op action
Init(Have(Cake))
Goal(Have(Cake) ∧ Eaten(Cake))
Action(Eat(Cake)
  PRECOND: Have(Cake)
  EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
Action(Bake(Cake)
  PRECOND: ¬Have(Cake)
  EFFECT: Have(Cake))
Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(obj, loc),
  PRECOND: At(obj, loc),
  EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))
Action(PutOn(t, Axle),
  PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle),
  EFFECT: ¬At(t, Ground) ∧ At(t, Axle))
Action(LeaveOvernight,
  PRECOND:
  EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))
Level-cost of i (lc(i, s)): the first level of the PG at which i appears
Is it accurate?
i ∈ goal: lc(i, s)
i ∈ goal
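Level costs combine into goal heuristics in two common ways, taking the maximum or the sum over the goal literals. The sketch below assumes those two standard definitions (often called max-level and level-sum) and a hand-written list of proposition levels; all names are illustrative.

```python
def level_cost(levels, literal):
    """Index of the first proposition level that contains the literal."""
    for i, props in enumerate(levels):
        if literal in props:
            return i
    return float("inf")  # the literal never appears: goal unreachable


def max_level(levels, goal):
    """Maximum level cost over the goal literals."""
    return max(level_cost(levels, g) for g in goal)


def level_sum(levels, goal):
    """Sum of level costs; relies on the sub-goal independence assumption."""
    return sum(level_cost(levels, g) for g in goal)


# Hypothetical proposition levels P_0, P_1, P_2.
levels = [{"p"}, {"p", "q"}, {"p", "q", "r"}]
goal = {"q", "r"}

print(max_level(levels, goal), level_sum(levels, goal))
```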
Inconsistent effects: one action negates an effect of the other
Interference: one of the effects of one action is the negation of a precondition of the other
Competing needs: mutually exclusive preconditions
One of the literals is the negation of the other
Inconsistent support: each possible pair of actions that could achieve the two literals is mutually exclusive
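The three action-mutex conditions can be checked mechanically. In this sketch a literal is a (name, polarity) pair and an action is a dict with "pre" and "eff" sets; both representations are assumptions made for illustration. Competing needs is simplified to direct negation plus an optional set of known literal mutexes.

```python
def neg(lit):
    """Negate a literal represented as (name, polarity)."""
    name, positive = lit
    return (name, not positive)


def inconsistent_effects(a, b):
    return any(neg(e) in b["eff"] for e in a["eff"])


def interference(a, b):
    return (any(neg(e) in b["pre"] for e in a["eff"])
            or any(neg(e) in a["pre"] for e in b["eff"]))


def competing_needs(a, b, literal_mutexes=frozenset()):
    return any(neg(p) == q or frozenset({p, q}) in literal_mutexes
               for p in a["pre"] for q in b["pre"])


def mutex(a, b, literal_mutexes=frozenset()):
    return (inconsistent_effects(a, b) or interference(a, b)
            or competing_needs(a, b, literal_mutexes))


# Have-cake domain: Eat, Bake, and the no-op that persists Have(Cake).
eat = {"pre": {("Have", True)}, "eff": {("Have", False), ("Eaten", True)}}
bake = {"pre": {("Have", False)}, "eff": {("Have", True)}}
noop_have = {"pre": {("Have", True)}, "eff": {("Have", True)}}

print(mutex(eat, bake), mutex(eat, noop_have))
```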
ℎ2 is more useful than ℎ1 (0 ≤ ℎ1 ≤ ℎ2 ≤ ℎ∗)
h^k are admissible
h^{k+1} ≥ h^k
Computing h^k is O(n^k) with n propositions
If a valid plan exists it is a sub-graph of the Planning Graph.
Actions at the same level don’t interfere
Each action’s preconditions are made true by the plan
Goals are satisfied
[Figure: successive planning-graph levels over the literals p, q, r and their negations, with actions A and B]
Available actions are monotonically increasing. Thus mutex relations between literals are decreasing.
(When mutexes between literals are due to mutex relations between actions, they may be removed in the next levels)
[Figure: planning-graph levels over the literals p, q, r, s with actions A, B, C]
If PG levels off before reaching this level, GraphPlan returns
necessary, but usually insufficient condition for plan existence
looks for actions that produce goals while pruning as many of them as
Variables: a variable for an action at each level
Domain = {0, 1}
Constraints: mutexes
If no such subset is found return failure
[Figure, built up over several slides: planning graph for an air-cargo problem with goal At(C, B). Action levels contain Load(C, P, A), Fly(P, A, B), Unload(C, P, B), Unload(C, P, A), Fly(P, B, A) and Load(C, P, B); proposition levels contain At(P, A), At(P, B), At(C, A), At(C, B), In(C, P) and their negations. Legend: A: Airport, P: Plane, C: Cargo]
Sum (or max) of the level costs of its preconds. is smallest.
P is a planning problem
Find a solution for P of length k
Planning problem → (translate to PL) → Satisfiability problem → (satisfiability solver) → Logical model → (decode solution) → Solution plan
Initial state fluents (t=0)
State fluents at t=k
Action fluents at t=k-1
Instantiate literals containing variable (replace with ∨ over constants).
successor-state axioms at each time up to t
F^{t+1} ⇒ ActionCausesF^t ∨ (F^t ∧ ¬ActionCausesNotF^t)
precondition axioms:
A^t ⇒ PRECOND(A)^t
action exclusion axioms:
¬A_i^t ∨ ¬A_j^t
Init(On(A, B) ∧ On(B, Table))
On(A, B)^0 ∧ On(B, Table)^0 ∧ ¬On(B, A)^0 ∧ ¬On(A, Table)^0
Instantiate literals containing variable (replace with ∨ over constants).
Goal(On(B, A))
On(B, A)^1 (for k = 1)

F^{t+1} ⇒ ActionCausesF^t ∨ (F^t ∧ ¬ActionCausesNotF^t)
Example: On(B, A)^{t+1} ⇒ Move(B, Table, A)^t ∨ [On(B, A)^t ∧ ¬Move(B, A, Table)^t]
A^t ⇒ PRECOND(A)^t
Example: Move(B, Table, A)^t ⇒ On(B, Table)^t ∧ Clear(B)^t ∧ Clear(A)^t
Is it necessary to include effects: A^t ⇒ EFFECT(A)^{t+1}?
¬A_i^t ∨ ¬A_j^t
Example: ¬Move(B, Table, A)^0 ∨ ¬MoveToTable(A, B)^0
: conjunction of encoding initial state, goals, successor-state axioms,
This means P has a solution of length k
This is the i’th action of the plan.
Robot R
One operator: “move” the robot
Move(r, l, l’)
PRECOND: At(r, l)
EFFECT: At(r, l’) ∧ ¬At(r, l)
[Figure: two locations, L1 and L2]
Initial state:
At(R, L1, 0) ∧ ¬At(R, L2, 0)
Goal:
At(R, L2, 1)
Action preconditions:
Move(R, L1, L2, 0) ⇒ At(R, L1, 0)
Move(R, L2, L1, 0) ⇒ At(R, L2, 0)
Action exclusion axiom:
¬Move(R, L2, L1, 0) ∨ ¬Move(R, L1, L2, 0)
At(R, L2, 0) ∧ ¬At(R, L2, 1) ⇒ Move(R, L2, L1, 0)

At(R, L1, 0) ∧ ¬At(R, L2, 0) ∧ At(R, L2, 1)
∧ [Move(R, L1, L2, 0) ⇒ At(R, L1, 0)]
∧ [Move(R, L1, L2, 0) ⇒ At(R, L2, 1)]
∧ [¬Move(R, L2, L1, 0) ∨ ¬Move(R, L1, L2, 0)]
∧ [¬At(R, L1, 0) ∧ At(R, L1, 1) ⇒ Move(R, L2, L1, 0)]
∧ [¬At(R, L2, 0) ∧ At(R, L2, 1) ⇒ Move(R, L1, L2, 0)]
∧ [At(R, L1, 0) ∧ ¬At(R, L1, 1) ⇒ Move(R, L1, L2, 0)]
∧ [At(R, L2, 0) ∧ ¬At(R, L2, 1) ⇒ Move(R, L2, L1, 0)]
SAT formula for (P,1)
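For a formula this small, satisfiability can be checked by enumerating all truth assignments. The sketch below encodes the robot formula for (P, 1) with six Boolean variables, assuming the implications and negations read as explanatory frame axioms (a fluent changes only if the action causing the change is taken). The variable names such as `m12_0` are shorthand introduced here, and the brute-force loop stands in for a real solver such as DPLL or WalkSAT.

```python
from itertools import product

# at1_0 = At(R, L1, 0), m12_0 = Move(R, L1, L2, 0), and so on.
VARS = ["at1_0", "at2_0", "at1_1", "at2_1", "m12_0", "m21_0"]


def implies(a, b):
    return (not a) or b


def phi(v):
    """Conjunction of the clauses of the formula for (P, 1)."""
    return all([
        v["at1_0"], not v["at2_0"],                    # initial state
        v["at2_1"],                                    # goal
        implies(v["m12_0"], v["at1_0"]),               # precondition
        implies(v["m12_0"], v["at2_1"]),               # effect
        (not v["m21_0"]) or (not v["m12_0"]),          # action exclusion
        implies(not v["at1_0"] and v["at1_1"], v["m21_0"]),  # frame axioms
        implies(not v["at2_0"] and v["at2_1"], v["m12_0"]),
        implies(v["at1_0"] and not v["at1_1"], v["m12_0"]),
        implies(v["at2_0"] and not v["at2_1"], v["m21_0"]),
    ])


models = []
for bits in product([False, True], repeat=len(VARS)):
    v = dict(zip(VARS, bits))
    if phi(v):
        models.append(v)

# Decode: every model selects Move(R, L1, L2) at step 0.
print(len(models), all(m["m12_0"] and not m["m21_0"] for m in models))
```

Every satisfying assignment makes Move(R, L1, L2, 0) true, so decoding any model yields the one-step plan. The models differ only in At(R, L1, 1), which this particular set of clauses leaves unconstrained.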
For all pairs of actions at each time step i:
For any pair of incompatible actions (recall from Graphplan):
Fewer time steps may be required (i.e. shorter formulas)
DPLL (Davis Putnam Logemann Loveland)
WalkSAT
Init()
Goal(RightShoeOn ∧ LeftShoeOn)
Action(RightShoe, PRECOND: RightSockOn, EFFECT: RightShoeOn)
Action(RightSock, EFFECT: RightSockOn)
Action(LeftShoe, PRECOND: LeftSockOn, EFFECT: LeftShoeOn)
Action(LeftSock, EFFECT: LeftSockOn)
[Figure: partial-order plan with steps Start, Left Sock, Right Sock, Left Shoe, Right Shoe, Finish, shown next to its total-order linearizations; link labels: Left Sock on, Right Sock on, Left Shoe on, Right Shoe on]
Start
No precondition
All ‘Initial State’ as its effects
Finish
All ‘Goal State’ as its precondition
No effect
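A partial-order plan stands for every total order that respects its ordering constraints. For the shoes-and-socks problem these can be enumerated by filtering permutations, as in the sketch below. The step names and the helper `linearizations` are illustrative, and brute force is only reasonable for a handful of steps.

```python
from itertools import permutations

steps = ["LeftSock", "RightSock", "LeftShoe", "RightShoe"]
# Each sock must go on before the shoe on the same foot.
orderings = [("LeftSock", "LeftShoe"), ("RightSock", "RightShoe")]


def linearizations(steps, orderings):
    """All total orders of the steps that satisfy every 'a before b' pair."""
    plans = []
    for perm in permutations(steps):
        pos = {step: i for i, step in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in orderings):
            plans.append(["Start", *perm, "Finish"])
    return plans


plans = linearizations(steps, orderings)
for plan in plans:
    print(" -> ".join(plan))
```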
A: set of actions in the plan (plan steps)
Initially {Start, Finish}
O: set of orderings between actions
Initially {Start < Finish}
L: set of causal links
Initially {}
Purpose of A_i is to achieve the precondition c of A_j
c A_j when:
Demotion (placed before): add S3 < S1 to O
Promotion (placed after): add S2 < S3 to O
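Resolving a threat amounts to trying both ordering constraints and keeping those that leave the ordering acyclic. The sketch below assumes a causal link from S1 to S2 threatened by S3, with orderings stored as (before, after) pairs. `consistent` and `resolve_threat` are hypothetical helper names.

```python
def consistent(orderings):
    """True if the 'before' relation contains no cycle (depth-first search)."""
    graph = {}
    for a, b in orderings:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def dfs(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return False  # back edge: cycle found
            if nxt not in state and not dfs(nxt):
                return False
        state[node] = "done"
        return True

    return all(dfs(n) for n in graph if n not in state)


def resolve_threat(orderings, producer, consumer, threat):
    """Return the consistent orderings obtained by demotion and promotion."""
    demotion = orderings | {(threat, producer)}    # threat before the link
    promotion = orderings | {(consumer, threat)}   # threat after the link
    return [o for o in (demotion, promotion) if consistent(o)]


# S3 is already ordered after S1, so only promotion remains possible.
O = frozenset({("Start", "S1"), ("S1", "S2"), ("S2", "Finish"), ("S1", "S3")})
options = resolve_threat(O, "S1", "S2", "S3")
print(len(options))
```

If the returned list is empty, neither constraint is consistent and the planner has to backtrack.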
Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(obj, loc),
  PRECOND: At(obj, loc),
  EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))
Action(PutOn(t, Axle),
  PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle),
  EFFECT: ¬At(t, Ground) ∧ At(t, Axle))
Action(LeaveOvernight,
  PRECOND:
  EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Trunk) ∧ ¬At(Spare, Axle) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Trunk) ∧ ¬At(Flat, Axle))
Agenda: open preconditions (along with actions requiring them)
Initially all preconditions of End
function POP(<A, O, L>, agenda)
  if agenda = {} then return <A, O, L>
  (q, A_need) ← Select a goal from agenda
  a ← Choose an action that adds q
  if no such action then return failure
  Update <A, O, L> and agenda
  Add consistent ordering constraints for causal link protection
  if no constraint is consistent then return failure
  POP(<A, O, L>, agenda)
POP(<A, O, L>, agenda)
  if no such action exists, then return failure
  Let L’ = L ∪ {Aadd → Aneed}, and let O’ = O ∪ {Aadd < Aneed}.
  If Aadd is newly instantiated, then A’ = A ∪ {Aadd} and O’ = O ∪ {A0 < Aadd < A∞} (otherwise, let A’ = A)
  If Aadd is newly instantiated, then for each conjunction, Qi,
  threaten a causal link Ap → Ac, add a consistent
  (a) Demotion: Add At < Ap to O’
  (b) Promotion: Add Ac < At to O’
  If neither constraint is consistent, then return failure
Init(At(Home) ∧ Sells(HWS, Drill) ∧ Sells(SM, Milk) ∧ Sells(SM, Banana))
Goal(Have(Drill) ∧ Have(Milk) ∧ Have(Banana) ∧ At(Home))
Action(Go(there),
  PRECOND: At(here),
  EFFECT: At(there) ∧ ¬At(here))
Action(Buy(x),
  PRECOND: At(store) ∧ Sells(store, x),
  EFFECT: Have(x))
Three Buy actions for three preconditions of Finish action
Sells precondition of Buy
Bold arrows: causal links, protection of precondition
Light arrows: ordering constraints
backtracking
higher computation per-node
Action selection, goal selection, order refinement
Partially ordered plans
Can generate more human-like plans that can be checked by