Automatic Creation of Search Heuristics, Stefan Edelkamp (PowerPoint presentation)



Slide 1

Automatic Creation of Search Heuristics

Stefan Edelkamp

Slide 2

1 Overview

  • Automatic Creation of Heuristics
  • Macro Problem Solving
  • Hierarchical A* and Valtorta’s Theorem
  • Pattern Databases
  • Disjoint Pattern Databases
  • Multiple, Bounded, Symmetrical, Dual, and Secondary Databases

Slide 3

2 History

History of the automated creation of admissible heuristics: Gaschnig (1979), Pearl (1984), Prieditis (1993), Guida and Somalvico (1979)

  • Korf: Macro Problem Solver, Learning Real-Time A*
  • Valtorta: A result on complexity of heuristic estimates for A*
  • Holte et al.’s Hierarchical A*: Searching Abstraction Hierarchies Efficiently
  • Recent work: Pattern/Abstraction Databases

The on-line computation of heuristic values in Hierarchical A* and its precursors is probably the main difference from the off-line calculations applied in the construction of pattern databases.

Slide 4

3 Macro Problem Solving

The Macro Problem Solver constructs a table that contains macros solving subproblems.
Search: the solver looks up table entries to sequentially transform the current state into the goal.
Eight-Puzzle example: operators are labeled by the direction in which the blank moves.
Table structure: the entry in row r and column c denotes the operator sequence that moves the tile in position r into position c.
Invariance: after executing a macro, the tiles in positions 1 to r − 1 remain correctly placed.

Slide 5

Macro Table for the Eight-Puzzle

Columns 1–6, rows 1–8:

1: DR
2: D LURD
3: DL URDL URDL LURD
4: L RULD RULD LURRD LURD LULDR
5: UL DRUL RDLU RULD RDLU DLUR RULD RDLU ULDR URDL
6: U DLUR DRU DLUU DRUL LURRD ULDR ULD RDRU DLURU LLDR LLDR
7: UR LDRU ULDDR LDRUL DLUR DRULDL DLUR ULDR ULURD URDRU DRUL URRDLU LLDR
8: R ULDR LDRR LURDR LDRRUL DRUL LDRU UULD ULLDR LDRU RDLU

Slide 6

Running the Table

For each tile i, 1 ≤ i ≤ 6: determine the current position c and the goal position r, and apply macro (c, r)

Initial state (board figure): 1 4 6 3 8 7 5 2

Macro applications: (c=0, r=5), (c=1, r=2), (c=2, r=7), (c=3, r=3), (c=4, r=4), (c=5, r=7), (c=6, r=7)

[intermediate board states omitted]

Applied macros: DLUR DRUL DLUR RDLU ULDDR ULURD LURD UL

Worst-case solution length (sum of column maxima): 2 + 12 + 10 + 14 + 8 + 14 + 4 = 64
Average: 12/9 + 52/8 + 40/7 + 58/6 + 22/5 + 38/4 + 8/3 ≈ 39.78
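The arithmetic above can be checked directly; this small Python sketch just re-computes the two sums from the column maxima, column totals, and entry counts quoted on the slide:

```python
# Column maxima (worst-case macro lengths) and, per column, the total
# macro length and number of table entries, as given on the slide.
col_max = [2, 12, 10, 14, 8, 14, 4]
worst = sum(col_max)
totals = [12, 52, 40, 58, 22, 38, 8]
entries = [9, 8, 7, 6, 5, 4, 3]
avg = sum(t / e for t, e in zip(totals, entries))
print(worst, round(avg, 2))  # 64 39.78
```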

Slide 7

Construction

. . . with Backward-DFS or Backward-BFS starting from the set of goals

  • backward operators m−1 do not necessarily need to be valid
  • given the vector representation of the current position p = (p0, …, pk), p can be reached from the goal position p′ = (p′0, …, p′k) by applying m−1
  • column c(m) of the macro m that transforms p into p′: the length of the longest common prefix of p and p′, i.e., c(m) = min{i ∈ {0, …, k − 1} | pi ≠ p′i}
  • row r(m) of macro m: the position at which the tile p′c(m) is located in p, i.e., the tile that has to be moved to c(m) in the next macro application
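The computation of c(m) and r(m) described above can be sketched in Python; the function names `column` and `row` are illustrative, not from the original:

```python
def column(p, p_prime):
    # c(m): index of the first position where p and p' differ,
    # i.e., the length of their longest common prefix.
    for i, (a, b) in enumerate(zip(p, p_prime)):
        if a != b:
            return i
    return len(p)

def row(p, p_prime):
    # r(m): position in p at which the tile p'_{c(m)} is located.
    c = column(p, p_prime)
    return p.index(p_prime[c])

# The slide-8 example: m^-1 = LDRU, m = DLUR.
p_goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
p = (0, 1, 2, 3, 4, 5, 8, 6, 7)
print(column(p, p_goal), row(p, p_goal))  # 6 7
```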

Slide 8

Example

  • m−1 = LDRU alters the goal position p′ = (0, 1, 2, 3, 4, 5, 6, 7, 8) into p = (0, 1, 2, 3, 4, 5, 8, 6, 7)
  • its inverse m is DLUR

⇒ c(m) = 6 and r(m) = 7, matching the last macro application in the table

  • larger problems: BFS exhausts memory resources before all table entries are fixed
  • larger tables require pattern database heuristic search

Slide 9

4 Patterns and Domain Abstraction

  • pattern: refers to the vector representation v(u); each position i contains an assignment to variable vi, i ∈ {1, …, k}
  • specialized pattern: state with one or more constants replaced by don’t cares
  • generalized pattern: each variable vi with domain Di is mapped to an abstract domain Ai, i ∈ {1, …, k}
  • start and goal pattern: obtained by making the same substitutions in the start and goal states
  • domain abstraction: mapping stating which assignment is replaced by which other
Slide 10

Two Examples in Eight-Puzzle

  • 1. Tiles 1, 2, 7 replaced by don’t care x
    ⇒ φ1(v) = v′ with v′i = vi if vi ∈ {0, 3, 4, 5, 6, 8}, and v′i = x otherwise
  • 2. φ2: additionally map tiles 3 and 4 to y, and tiles 6 and 8 to z

Granularity: vector indicating how many constants of the original domain are mapped to each constant of the abstract domain ⇒ gran(φ2) = (3, 2, 2, 1, 1)

  • 3 constants are mapped to x
  • 2 are mapped to each of y and z
  • constants 5 and 0 (the blank) remain unique
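The abstraction φ2 and its granularity can be sketched as follows; the dictionary encoding of φ2 is an assumed representation, not from the original:

```python
from collections import Counter

# Domain abstraction phi2 from the slide: tiles 1, 2, 7 -> 'x';
# tiles 3, 4 -> 'y'; tiles 6, 8 -> 'z'; tile 5 and the blank 0 stay unique.
phi2 = {0: 0, 5: 5, 1: 'x', 2: 'x', 7: 'x', 3: 'y', 4: 'y', 6: 'z', 8: 'z'}

def abstract(state, phi):
    # Map each constant of the state vector into the abstract domain.
    return tuple(phi[v] for v in state)

def granularity(phi):
    # How many original constants map to each abstract constant,
    # listed in decreasing order.
    counts = Counter(phi.values())
    return tuple(sorted(counts.values(), reverse=True))

print(granularity(phi2))  # (3, 2, 2, 1, 1)
```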

Slide 11

5 Embeddings and Homomorphisms

  • embedding: earliest and most commonly studied type of abstraction transformation
  • informally, φ is an embedding transformation if it adds edges to S
  • e.g., macro-operators, dropped preconditions
  • homomorphism: the other main type of abstraction transformation
  • informally, a homomorphism φ groups together several states in S to create a single abstract state
  • e.g., dropping a predicate entirely from the state space description (Knoblock 1994)

Slide 12

Hierarchical A*

Abstraction works by replacing one state space by another that is easier to search. Hierarchical A* is a version of A* that computes the distance from an abstract state to the abstract goal on-the-fly, by means of an abstract search for each node that is expanded.

  • different from earlier approaches (which explore the abstract space from scratch), Hierarchical A* uses caching to avoid repeated expansion of states in the abstract space
  • restricted to state space abstractions that are homomorphisms

Slide 13

Valtorta’s Theorem

Theorem: If state space S is embedded in S′ and h is computed by blind BFS in S′, then A* using h will expand every state that is expanded by blind BFS in S
⇒ by re-computing heuristic estimates for each state, this option cannot possibly speed up search

  • Absolver II: first system to break this barrier
  • Hierarchical A*: a subsequent one

Slide 14

Example

N × N grid, abstracted by ignoring 2nd coordinate

  • goal state is (N, 1)
  • initial state (1, 1)

Valtorta’s Theorem ⇒ A* expands Ω(N2) nodes
Main observation: the search for h(s) also generates the value of h(u), ∀u ∈ S′

  • the abstraction yields a perfect heuristic on the solution path
  • Hierarchical A* will expand only O(N) nodes

Slide 15

Proof of Valtorta’s Theorem

When A* terminates, each state u is closed, open, or unvisited.
u closed ⇒ it will have been expanded.
u open ⇒ hφ(u) must have been computed:

  • hφ(u) is computed by a search starting at φ(u)
  • φ(u) ∉ φ(T) ⇒ the first step of that search is to expand φ(u)
  • φ(u) ∈ φ(T) ⇒ hφ(u) = 0, and u itself is necessarily expanded

u unvisited ⇒ on every path from s to u there is a state that was added to Open but never expanded. Let w be any such state on a shortest path from s to u.

  • w opened ⇒ hφ(w) must have been computed

Slide 16

Proof (ctd)

To show: in computing hφ(w), φ(u) is expanded

  • u necessarily expanded by BFS ⇒ δ(s, u) < δ(s, T)
  • w on a shortest path ⇒ δ(s, w) + δ(w, u) < δ(s, T)
  • w never expanded by A* ⇒ δ(s, w) + hφ(w) ≥ δ(s, T)
  • combining the two inequalities: δ(w, u) < hφ(w) = δφ(w, T)
  • homomorphism: δφ(w, u) ≤ δ(w, u) ⇒ δφ(w, u) < δφ(w, T)

⇒ φ(u) necessarily expanded

Slide 17

Consistency

Theorem: hφ is consistent.
hφ consistent ⇔ ∀u, v ∈ S: hφ(u) ≤ δ(u, v) + hφ(v)

  • δφ(u, T) shortest path ⇒ δφ(u, T) ≤ δφ(u, v) + δφ(v, T) for all u and v
  • substituting hφ: hφ(u) ≤ δφ(u, v) + hφ(v) for all u and v
  • homomorphism: δφ(u, v) ≤ δ(u, v) ⇒ hφ(u) ≤ δ(u, v) + hφ(v) for all u and v

Slide 18

6 Pattern Databases

The name is inspired by the (n2 − 1)-Puzzle, where a pattern is a selection of tiles.
Pattern database: stores all patterns together with their shortest path distance, on the simplified board, to the goal pattern.
Construction of the PDB: prior to the overall search, in a backward BFS starting with the goal pattern and using inverse abstract state transitions.
Search in the original space: the pattern of the active state is looked up, with the stored distance value as the estimator function.
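The backward-BFS construction described above can be sketched as follows; `inverse_successors` is a hypothetical callback producing the inverse abstract transitions, and the demo graph is an invented toy abstract space:

```python
from collections import deque

def build_pdb(goal_pattern, inverse_successors):
    # Backward BFS from the goal pattern over inverse abstract transitions;
    # stores the shortest abstract distance for every reachable pattern.
    pdb = {goal_pattern: 0}
    queue = deque([goal_pattern])
    while queue:
        u = queue.popleft()
        for v in inverse_successors(u):
            if v not in pdb:
                pdb[v] = pdb[u] + 1
                queue.append(v)
    return pdb

# Toy abstract space: four patterns on a path a-b-c-d with goal pattern 'a'.
graph = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
pdb = build_pdb('a', lambda u: graph[u])
print(pdb)  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```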

Slide 19

Example (n2 − 1)-Puzzle

fringe and the corner pattern (databases):

[board figures of the fringe and corner patterns omitted]

Multiple pattern databases:

Slide 20

Maximizing Pattern Databases

The shortest path distance in pattern space is ≤ the shortest path distance in the original space ⇒ pattern database heuristics are admissible. Combined pattern databases:

  • take maximum of heuristic values provided by different databases
  • use result as admissible heuristic

⇒ optimal solutions for random instances to Rubik’s Cube

Slide 21

Disjoint Pattern Databases

For sliding-tile puzzles only one tile can move at a time ⇒ disjoint pattern databases count moves of the pattern tiles only.
General problem: different pattern databases may count operators twice, since an operator can have a non-trivial image under more than one relaxation.

Definition: Two pattern databases Dφ1 and Dφ2 are disjoint if for all non-trivial O′ ∈ Oφ1, O′′ ∈ Oφ2 we have φ1−1(O′) ∩ φ2−1(O′′) = ∅.

Finding partitions for pairwise disjoint pattern databases automatically is not trivial
⇒ assign cost 1 to each operator in only one relaxation
⇒ the sum of the retrieved pattern database values preserves admissibility, while being more accurate
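A minimal sketch of the additive lookup over disjoint databases, assuming the databases are given as plain dictionaries and the abstractions as functions (both hypothetical representations):

```python
def additive_h(state, databases):
    # Sum the lookups of pairwise disjoint pattern databases: since every
    # operator is charged in at most one relaxation, the sum stays admissible.
    return sum(pdb[phi(state)] for pdb, phi in databases)

# Hypothetical demo: a state is a pair, each component abstracted separately.
pdb1 = {0: 0, 1: 1, 2: 2}   # distances for the first component
pdb2 = {0: 0, 3: 3}         # distances for the second component
h = additive_h((2, 3), [(pdb1, lambda s: s[0]), (pdb2, lambda s: s[1])])
print(h)  # 5
```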

Slide 22

Automated Pattern Selection

Reduce the problem of finding a suitable partition to bin packing.
Task: distribute pattern positions (tiles) to bins in such a way that a minimal number of bins is used.
Size of the bins: determined by the maximum size of the abstract state space, which is approximated.
Adding a position to the pattern multiplies the expected abstract state space size by the domain size ⇒ bin packing based on multiplying the individual object sizes (use logarithms to reduce multiplication to addition).
Bin packing is NP-complete but has several efficient approximations.
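A sketch of this reduction using first-fit decreasing, one of the efficient approximations mentioned above; the memory limit and domain sizes in the demo are made-up values:

```python
import math

def pack_patterns(domain_sizes, memory_limit):
    # First-fit decreasing bin packing of pattern positions. Bin capacity
    # is the log of the largest abstract state space fitting into memory;
    # a position's size is the log of its domain size, so multiplying
    # state-space sizes becomes adding logarithms.
    cap = math.log(memory_limit)
    items = sorted(domain_sizes.items(), key=lambda kv: -math.log(kv[1]))
    bins = []  # each bin: [used capacity, list of positions]
    for pos, d in items:
        size = math.log(d)
        for b in bins:
            if b[0] + size <= cap:
                b[0] += size
                b[1].append(pos)
                break
        else:
            bins.append([size, [pos]])
    return [b[1] for b in bins]

# Hypothetical example: 8 pattern positions, each ranging over 9 values,
# and memory for at most 10,000 abstract states per database.
groups = pack_patterns({i: 9 for i in range(8)}, memory_limit=10_000)
print(groups)  # two databases with four positions each
```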

Slide 23

Korf’s Conjecture

  • n: # states in entire problem space
  • b: brute-force branching factor
  • d: average optimal solution length for a random problem instance
  • e: expected value of heuristic
  • m: amount of memory used, in terms of abstract states stored
  • t: # nodes generated by A* (without duplicate detection)

Estimated average optimal solution length d of a random instance (the depth to which A* must search): d ≈ logb n
Furthermore: e ≈ logb m (abstract space) and t ≈ b^(d−e)
Substituting the values for d and e into this formula gives: t ≈ b^(d−e) ≈ b^(logb n − logb m) = n/m
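The substitution t ≈ b^(d−e) = n/m can be verified numerically; the values of n, b, and m below are arbitrary illustrations:

```python
import math

def estimated_generated_nodes(n, b, m):
    # Korf's estimate: d ~ log_b n, e ~ log_b m, so t ~ b^(d - e) = n / m.
    d = math.log(n, b)
    e = math.log(m, b)
    return b ** (d - e)

# The branching factor b cancels out: with 10^12 states and a pattern
# database of 10^8 abstract states, about 10^4 nodes are generated.
print(round(estimated_generated_nodes(10**12, 2.13, 10**8)))  # 10000
```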

Slide 24

Multiple Pattern Databases

Observation: maximizing over several smaller databases can reduce the number of generated nodes more than one large database.
Eight-Puzzle example: 20 pattern databases of size 252 lead to fewer state expansions (318) than 1 pattern database of size 5,040 (2,160 state expansions).

  • 1. Smaller pattern databases reduce the number of patterns with high h-values, but maximization over several smaller pattern databases can make the number of patterns with low h-values significantly smaller than in the larger pattern database
  • 2. Eliminating low h-values is more important for improving search performance than retaining high h-values

Slide 25

On-Demand Pattern Databases

Secondary A* PDB construction: need for an on-demand extension

[figure: A* search between s and t in the original space, with abstract-space A* searches between s′ and t′, omitted]

Slide 26

Symmetrical and Dual Lookups in Pattern Databases

Symmetry PDB: exploits physical symmetries.
(n2 − 1)-Puzzle: symmetry about the main diagonal ⇒ the PDB for tiles 2, 3, 6, and 7 can be reused for the pattern 8, 9, 12, and 13.
Dual PDB: exploits the symmetry between objects and locations.

[figure: original, abstract, and dual abstract boards omitted]

Slide 27

7 Bounded Computation of Pattern Databases

Theorem: Let U be an upper bound on δ(s, T), φ an abstraction function, and f the cost function in the backward traversal of the abstract space ⇒ the pattern database heuristic only needs to be computed for abstract states φ(u) with f(φ(u)) < U.
Proof: f(φ(u)) ≤ δφ(s, T) ≤ δ(s, T) ≤ U ⇒ all φ(u) with f(φ(u)) > U cannot lead to any solution with cost ≤ U ⇒ ignore u.
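A sketch of the bounded construction, assuming unit-cost operators so that f(φ(u)) is just the abstract BFS depth; the toy graph is invented for illustration:

```python
from collections import deque

def bounded_backward_bfs(goal_patterns, inverse_successors, U):
    # Backward BFS in the abstract space; following the theorem, entries
    # with f(phi(u)) >= U are never computed, since such abstract states
    # cannot lie on a concrete solution of cost <= U.
    pdb = {g: 0 for g in goal_patterns}
    queue = deque(goal_patterns)
    while queue:
        u = queue.popleft()
        if pdb[u] + 1 >= U:  # children would have f >= U: prune
            continue
        for v in inverse_successors(u):
            if v not in pdb:
                pdb[v] = pdb[u] + 1
                queue.append(v)
    return pdb

# Toy abstract space: a path a-b-c-d-e with goal pattern 'a' and bound U = 3.
graph = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c', 'e'], 'e': ['d']}
print(bounded_backward_bfs(['a'], lambda u: graph[u], U=3))  # {'a': 0, 'b': 1, 'c': 2}
```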

Slide 28

8 Planning Pattern Databases

An abstract planning problem P|R = < S|R, O|R, I|R, G|R > of a propositional planning problem < S, O, I, G > with respect to a set of atoms R is defined by

  • 1. S|R = {S|R = S ∩ R | S ∈ S},
  • 2. G|R = {G|R | G ∈ G},
  • 3. O|R = {O|R | O ∈ O}, with O|R = (P|R, A|R, D|R)

πR: solutions for the abstract planning problem P|R
δR: optimal abstract plan length
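The restriction S|R = S ∩ R (and the componentwise operator restriction O|R = (P|R, A|R, D|R)) can be sketched directly on sets of atoms; the atoms and the operator below are hypothetical:

```python
def restrict(state, R):
    # S|R = S ∩ R: keep only the pattern atoms R.
    return frozenset(state) & frozenset(R)

def restrict_operator(op, R):
    # O|R = (P|R, A|R, D|R): restrict preconditions, add and delete lists.
    P, A, D = op
    return (restrict(P, R), restrict(A, R), restrict(D, R))

# Hypothetical move operator restricted to the atom set R.
R = {'at-a', 'at-b'}
op = ({'at-a', 'fuel'}, {'at-b'}, {'at-a'})
print(restrict_operator(op, R))
```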

Slide 29

Pattern Databases in AI Planning

Planning pattern database DR (with respect to a set of propositions R and a propositional planning problem < S, O, I, G >): collection of pairs (h, u) with u ∈ S|R such that h = δR(u), i.e., DR = {(δR(u), u) | u ∈ S|R}

  • the optimal abstract plan π^opt_R for P|R is never longer than an optimal concrete plan π^opt for P, i.e., δR(u|R) ≤ δ(u) for all u ∈ S

Remark: strict inequality δR(u|R) < δ(u) occurs when some abstract operators are void, or when there are alternative even shorter paths in abstract space
