- Y. Arvelo, B. Bonet, M. Vidal. AAAI-06. July 18th, 2006.
Compilation of Query-Rewriting Problems into Tractable Fragments of Propositional Logic - p. 1/19
Compilation of Query-Rewriting Problems into Tractable Fragments of Propositional Logic
Yolif Arvelo, Blai Bonet, María Esther Vidal
Departamento de Computación, Universidad Simón Bolívar, Caracas, Venezuela
■ We consider the problem of rewriting a query using materialized views
■ This problem appears frequently in the context of Data Integration, Web
■ The problem is in general intractable and existing algorithms do not scale
■ OBJECTIVE: Given a query Q, retrieve all tuples obtainable from the
■ Data sources are assumed to be:
  ◆ Independent (i.e., maintained in a distributed manner)
  ◆ Described as views (i.e., the Local As View model)
  ◆ Incomplete
■ ASSUMPTION: Views may be incomplete
■ Then, the solution is the collection of rewritings:
■ Observe that there is no rewriting using onestop(x, y)
■ INPUT: A query Q and a set of views V = {V1, V2, ..., Vn}
■ TASK: Find a maximally-contained set of rewritings of Q using the views
■ A rewriting is a query-like expression that refers only to the views
■ ASSUMPTION: Q and Vi are conjunctive queries without arithmetic
■ Bucket algorithm [Levy & Rajaraman & Ullman 1996]
■ Inverse rules algorithm [Duschka & Genesereth 1997]
■ MiniCon algorithm [Pottinger & Halevy 2001]
■ Exploit independences to decompose into smaller subproblems and then
■ Solutions to subproblems are called MCDs (MiniCon Descriptions)
■ Generate all MCDs (very expensive, since it performs a blind search)
■ Rewritings are generated greedily as combinations of MCDs that:
  ◆ Cover disjoint subsets of subgoals in the query
  ◆ Cover all subgoals in the query
■ In the example, combining M3, M5, M6 produces the rewriting:
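The two covering conditions on this slide amount to an exact cover of the query subgoals. A minimal sketch follows, assuming each MCD is reduced to a hypothetical `(name, covered_subgoals)` pair; real MCDs also carry a view and a variable mapping. It uses plain exhaustive backtracking for illustration and is not MiniCon's own combination procedure.

```python
from typing import FrozenSet, Iterator, List, Sequence, Tuple

# Hypothetical, simplified MCD: a label plus the indices of the query
# subgoals it covers. The view and variable mapping are omitted here.
MCD = Tuple[str, FrozenSet[int]]


def combine_mcds(mcds: Sequence[MCD], num_subgoals: int) -> Iterator[List[str]]:
    """Yield each set of MCDs that covers every subgoal exactly once."""
    all_goals = frozenset(range(num_subgoals))

    def extend(chosen: List[str], covered: FrozenSet[int]) -> Iterator[List[str]]:
        if covered == all_goals:
            yield list(chosen)
            return
        # Always branch on the smallest uncovered subgoal, so every valid
        # combination is produced once rather than once per ordering.
        target = min(all_goals - covered)
        for name, goals in mcds:
            if target in goals and not (goals & covered):
                yield from extend(chosen + [name], covered | goals)

    yield from extend([], frozenset())


# Made-up MCDs over a query with three subgoals (indices 0, 1, 2).
example_mcds: List[MCD] = [
    ("A", frozenset({0})),
    ("B", frozenset({1, 2})),
    ("C", frozenset({0, 1})),
    ("D", frozenset({2})),
    ("E", frozenset({1})),
]
combinations_found = list(combine_mcds(example_mcds, 3))
print(combinations_found)  # [['A', 'B'], ['A', 'E', 'D'], ['C', 'D']]
```

Each returned list names MCDs whose subgoal sets are pairwise disjoint and jointly cover the query; turning such a combination into an actual rewriting would additionally require the variable mappings left out of this sketch.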
■ Given a query Q and a set of views V
■ Build a propositional theory such that its models are in correspondence
■ Generating MCDs is now a problem of model enumeration
■ Model enumeration can be done with modern SAT techniques that
  ◆ Non-chronological backtracking via clause learning
  ◆ Caching of common subproblems
  ◆ Heuristics
■ We also extend the propositional theory such that its models are in
■ We call our approach MCDSAT!!
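To make "model enumeration" concrete, here is a toy sketch that lists all models of a small CNF theory by chronological backtracking. It deliberately omits clause learning, caching, and heuristics, and the example clauses are made up; they are not the MCDSAT encoding.

```python
from typing import Dict, Iterator, List

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def enumerate_models(clauses: List[Clause], num_vars: int) -> Iterator[Dict[int, bool]]:
    """Yield every total assignment over x1..x{num_vars} satisfying all clauses."""

    def falsified(clause: Clause, assignment: Dict[int, bool]) -> bool:
        # A clause is falsified once all its literals are assigned and false.
        return all(
            abs(lit) in assignment and assignment[abs(lit)] != (lit > 0)
            for lit in clause
        )

    def search(assignment: Dict[int, bool], var: int) -> Iterator[Dict[int, bool]]:
        if any(falsified(clause, assignment) for clause in clauses):
            return  # prune this branch
        if var > num_vars:
            yield dict(assignment)  # all variables set, no clause falsified
            return
        for value in (False, True):
            assignment[var] = value
            yield from search(assignment, var + 1)
            del assignment[var]

    yield from search({}, 1)


# Made-up theory: exactly one of x1, x2 is true.
toy_clauses = [[1, 2], [-1, -2]]
toy_models = list(enumerate_models(toy_clauses, 2))
print(toy_models)  # [{1: False, 2: True}, {1: True, 2: False}]
```

In the approach described on this slide, each model would be decoded back into an MCD (or, for the extended theory, a rewriting); that decoding depends on the encoding and is not shown here.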
■ A formula is in Negation Normal Form (NNF) if constructed from literals
■ It can be represented as a rooted DAG whose leaves are literals and
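A minimal sketch of the DAG view of NNF, assuming the standard definition in which internal nodes are conjunctions and disjunctions over literal leaves. The class names `Lit`, `And`, and `Or` are illustrative, not taken from any particular library.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Lit:
    var: str
    positive: bool = True  # False encodes the negated literal


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Lit, And, Or]


def evaluate(node: Node, assignment: Dict[str, bool]) -> bool:
    """Evaluate an NNF node under a total truth assignment."""
    if isinstance(node, Lit):
        return assignment[node.var] == node.positive
    results = (evaluate(child, assignment) for child in node.children)
    return all(results) if isinstance(node, And) else any(results)


# (a AND b) OR (NOT a AND c): negation appears only at the leaves.
a, b, c = Lit("a"), Lit("b"), Lit("c")
nnf = Or((And((a, b)), And((Lit("a", False), c))))
print(evaluate(nnf, {"a": True, "b": True, "c": False}))  # True
```

Because nodes are ordinary objects, the same subformula can be referenced from several parents, which is what makes the structure a DAG rather than a tree.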
■ Introduced by [Darwiche 2001]
■ An NNF is decomposable if each variable appears at most once below
■ An NNF is deterministic if disjuncts are pairwise logically inconsistent
■ A d-DNNF supports a number of operations in linear time:
  ◆ satisfiability
  ◆ clause entailment
  ◆ model counting
  ◆ model enumeration (output linear time)
  ◆ ...
■ Transformation into d-DNNF is intractable in the worst case, but not
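As an illustration of why decomposability and determinism help, the sketch below counts models in a single bottom-up pass: counts multiply at AND nodes (children share no variables) and add at OR nodes (children share no models). The tuple encoding is an assumption made for this example. Tracking variable sets at each node keeps the sketch short but is not strictly linear time; practical implementations typically smooth the d-DNNF first.

```python
from typing import Dict, FrozenSet, Tuple

# Assumed encoding: ("lit", 3) is x3, ("lit", -3) is NOT x3,
# ("and", [children]) and ("or", [children]) are internal nodes.


def count_models(root: tuple, num_vars: int) -> int:
    """Count models of a d-DNNF over variables x1..x{num_vars}."""
    cache: Dict[int, Tuple[int, FrozenSet[int]]] = {}

    def visit(node: tuple) -> Tuple[int, FrozenSet[int]]:
        key = id(node)  # shared subgraphs of the DAG are visited once
        if key in cache:
            return cache[key]
        kind = node[0]
        if kind == "lit":
            result = (1, frozenset([abs(node[1])]))
        else:
            parts = [visit(child) for child in node[1]]
            variables = frozenset().union(*(v for _, v in parts))
            if kind == "and":
                # Decomposable: children mention disjoint variables.
                count = 1
                for child_count, _ in parts:
                    count *= child_count
            else:
                # Deterministic: children have no common model. Variables
                # missing from a child are free, hence the power of two.
                count = sum(
                    child_count * 2 ** (len(variables) - len(v))
                    for child_count, v in parts
                )
            result = (count, variables)
        cache[key] = result
        return result

    count, variables = visit(root)
    return count * 2 ** (num_vars - len(variables))


# (x1 AND x2) OR (NOT x1 AND x3)
ddnnf = ("or", [
    ("and", [("lit", 1), ("lit", 2)]),
    ("and", [("lit", -1), ("lit", 3)]),
])
print(count_models(ddnnf, 3))  # 4
```

The example formula has four models over three variables: two with x1 and x2 true (x3 free) and two with x1 false and x3 true (x2 free).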
■ MCDSAT translates a QRP into a propositional theory T
■ T is compiled into d-DNNF using Darwiche’s c2d compiler
■ Models are obtained from the d-DNNF and transformed into MCDs or
■ c2d and models are off-the-shelf components
■ MCDSAT is written in a scripting language
■ Large benchmark with problems of different sizes and structures
■ Comparison metric: time
■ For lack of space, we report only a few instances
■ MCD Theory: time to generate MCDs (no combination)
■ Extended Theory: time to generate rewritings
■ Structure: Chain and Star
■ Half distinguished variables
■ Queries of different length
■ Different number of views
■ Each point is average over 10 instances
■ Random instances created with generator of [Afrati, Li & Ullman 2001]
[Figure: four plots of time in seconds (log scale) comparing MiniCon and McdSat, half distinguished vars. Panels: chain queries / 80 views and star queries / 80 views (x-axis: number of goals in query, 3 to 10); chain queries / 8 subgoals and star queries / 8 subgoals (x-axis: number of views, 20 to 140).]
[Figure: four plots of time in seconds (log scale) comparing MiniCon and McdSat, half distinguished vars. Panels: chain queries / 80 views and star queries / 80 views (x-axis: number of goals in query, 3 to 10); chain queries / 6 subgoals and star queries / 6 subgoals (x-axis: number of views, 20 to 140).]
■ Proposed a novel method for QRPs using propositional logic which:
  ◆ Uses off-the-shelf propositional components
  ◆ Is easy to implement
  ◆ Shows improved performance over other methods
■ Thus, the logical approach is not only of scientific interest but
■ Similar ideas can be applied to other problems!