SLIDE 1 Query Evaluation Techniques for Large Databases
GOETZ GRAEFE Presented by Long Zhang
- Dept. of Computer Science, UBC
zhlong73@cs.ubc.ca
SLIDE 2 Purpose
To survey efficient algorithms and the software architecture of query execution engines for executing complex queries.
SLIDE 3
Restrictions
Restrictive data definition and manipulation languages can make application development and maintenance unbearably cumbersome. Data volumes might be so large or complex that the real or perceived performance advantage of file systems is considered more important than all other criteria, e.g. the higher levels of abstraction and programmer productivity typically achieved with database management systems.
SLIDE 4
Topics
algorithms and their costs
sorting versus hashing
parallelism
resource allocation and scheduling
general performance-enhancing techniques
…
SLIDE 5 Which parts do we focus on?
User Interface
Database Query Language
Query Optimizer
Query Execution Engine
Files and Indices
I/O Buffer
Disk
SLIDE 6 Query Execution Engine
What is it?
Collection of query execution operators and mechanisms for operator communication and synchronization. The query execution engine defines the space of possible plans that can be chosen by the query optimizer.
SLIDE 7
1) Translate the logical query from SQL into a query tree in logical algebra.
2) The query tree in logical algebra is translated into a physical plan.
3) The optimizer expands the search space and finds the best plan.
4) The optimal physical plan is copied out of the optimizer's memory structure and sent to the query execution engine.
5) The query execution engine executes the plan, using relations in the database as input, and produces the output.
SLIDE 8 Discussion
Fact: On the first page, the author states that DBMSs have not been used in many areas for two reasons.
Application development and maintenance is difficult. (Restrictive data definition and manipulation languages can make development difficult.)
The data in those areas is so big that speed trumps all, and people would rather hand-code.
SLIDE 9 Questions are…
Do you think in today’s world databases are popular for most applications? IF NO, GO TO (1), IF YES, GO TO (2)
1) Why do you think databases aren't used more? Why don't you use them on your data?
2) How do database systems' needs for dealing with large amounts of data differ from other applications or systems, such as an OS? Should we only depend on it? If yes, what should be changed in the architecture of databases or data abstraction?
SLIDE 10
1. Architecture of Query Execution Engines
Focus on useful mechanisms for processing sets of items: records, tuples, entities, objects
SLIDE 11 Logical algebra VS Physical algebra
Logical algebra, e.g. relational algebra, is closely related to the data model and defines which queries can be expressed in that data model. Physical algebra is system-specific: for example, nested-loops join, merge-join, hash join, etc. Systems may implement the same data model and logical algebra with different physical algebras.
SLIDE 12 Logical algebra VS physical algebra
Specific algorithms, and therefore cost functions, are associated only with physical operators, not with logical algebra operators.
Mapping logical to physical operators is non-trivial:
it involves algorithm choices; logical and physical operators are not directly mapped; some operators in the physical algebra may implement multiple logical operators; etc.
SLIDE 13 To sum up
Query optimization is the mapping from logical to physical operations. The query execution engine is the implementation of operations on physical representation types, and of mechanisms for coordination and cooperation among multiple such operations in complex queries.
SLIDE 14 Implementation issues
Prepare an operator for producing data: Open
Produce an item: Next
Perform final housekeeping: Close
SLIDE 15
Iterators
Two important features of operators:
They can be combined into arbitrarily complex evaluation plans.
Any number of operators can schedule and execute each other in a single process without assistance from the underlying OS.
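The open/next/close protocol can be sketched in Python (an illustrative sketch, not code from the paper; all class and function names are invented for this example):

```python
class Scan:
    """Leaf operator: produces items from an in-memory list (a stand-in for a file scan)."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos < len(self.rows):
            row = self.rows[self.pos]
            self.pos += 1
            return row
        return None          # end of input
    def close(self):
        pass

class Filter:
    """Applies a predicate; demands items from its input only on request (pipelining)."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None
    def close(self):
        self.child.close()

# A two-operator plan executed within a single process, one item at a time.
plan = Filter(Scan([1, 5, 2, 8, 3]), lambda r: r > 2)
plan.open()
out = []
while (r := plan.next()) is not None:
    out.append(r)
plan.close()
print(out)  # [5, 8, 3]
```

Because every operator speaks the same three-procedure interface, any operator's input can itself be a whole subplan, which is what allows arbitrarily complex evaluation plans.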
SLIDE 16
Observations
Entire query plan executed within a single process
Operators produce an item at a time, on request
Items never wait in a temporary file or buffer (pipelining)
Efficient in time-space-product memory cost
Iterators can schedule any type of tree, including bushy trees
No operator is affected by the complexity of the whole plan
SLIDE 17
Sorting & Hashing
The purpose of many query-processing algorithms is to perform some kind of matching,
i.e., bringing items that are “alike” together and performing some operation on them.
There are two basic approaches used for this purpose:
sorting and hashing.
These are the basis for many join algorithms.
SLIDE 18
Design Issues
Sorting should be implemented as an iterator,
in order to ensure that the sort module interfaces well with the other operators (e.g., file scan or merge-join).
Input to the sort module must be an iterator, and sort uses open, next, and close procedures to request its input;
therefore, sort input can come from a scan or a complex query plan, and the sort operator can be inserted into a query plan at any place or at several places.
SLIDE 19 Out of Memory
If the input is larger than main memory, the sort operator's open procedure creates sorted runs and merges them until only one final merge phase is left: dividing and combining.
SLIDE 20 More on Sorting
For sorting large data sets there are two distinct sub-algorithms:
one for sorting within main memory, and one for managing subsets of the data set on disk.
For practical reasons, e.g., ensuring that a run fits into main memory, the disk management algorithm typically uses physical dividing and logical combining (merging). A point of practical importance is the fan-in or degree of merging, but this is a parameter rather than a defining algorithm property.
SLIDE 21
Creating level-0 runs
In-memory sort, e.g. quicksort: each run will have the size of the allocated memory.
Replacement selection: uses a priority heap, which supports insert and remove-smallest operations, to generate the longest possible initial ordered sequence for each run file.
SLIDE 22 Replacement Selection Sort
Input Buffer, Output Buffer, and RAM have the same size.
Input File → Input Buffer → RAM → Output Buffer → Output File
SLIDE 23 Replacement Selection Sort
In memory:
31 19 40 56 21 25 12
Next input: 20, 10
We do not need to care when the run grows larger than memory.
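Replacement selection can be sketched with Python's heapq (an illustrative sketch, not the paper's code; the function name and run representation are invented). Tagging each heap entry with a run number sends items smaller than the last output to the next run:

```python
import heapq

def replacement_selection(items, memory_size):
    # Heap entries are (run_number, key): items deferred to the next run
    # sort after everything still belonging to the current run.
    it = iter(items)
    heap = []
    for x in it:
        heap.append((0, x))
        if len(heap) == memory_size:
            break
    heapq.heapify(heap)
    runs, out, current = [], [], 0
    while heap:
        run, x = heapq.heappop(heap)
        if run != current:               # current run exhausted; start the next
            runs.append(out)
            out, current = [], run
        out.append(x)
        nxt = next(it, None)
        if nxt is not None:
            # An item smaller than the last output must wait for the next run.
            heapq.heappush(heap, (run + (nxt < x), nxt))
    runs.append(out)
    return runs

print(replacement_selection([5, 3, 8, 1, 9, 2, 6], 3))
# [[3, 5, 8, 9], [1, 2, 6]] -- the first run exceeds the memory size of 3
```

On random input, runs produced this way average about twice the memory size, which is the main advantage over filling memory, sorting it with quicksort, and writing it out.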
SLIDE 24 Merging
Run 1, Run 2, and Run 3 (on disk) are read through in-memory buffers and merged into the Output Buffer.
SLIDE 25
Merging
Scans are faster if read-ahead and write-behind are used.
Using large cluster sizes for the run files is very beneficial.
The number of runs W is typically not a power of the fan-in.
SLIDE 26
Merge as many runs as possible instead of always merging runs of the same level.
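The merge step can be sketched as a k-way merge driven by a priority heap (illustrative; in practice the stdlib's heapq.merge does the same job). It is written as a generator, producing one item per request, in the spirit of the iterator model:

```python
import heapq

def kway_merge(runs):
    """Merge k sorted runs into one sorted stream using a min-heap
    of (value, run index, position) entries."""
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    while heap:
        val, i, j = heapq.heappop(heap)
        yield val
        if j + 1 < len(runs[i]):
            heapq.heappush(heap, (runs[i][j + 1], i, j + 1))

print(list(kway_merge([[3, 5, 8, 9], [1, 2, 6], [4, 7]])))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The heap size equals the fan-in, so the comparison cost per output item is logarithmic in the number of runs merged at once.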
SLIDE 27
Quicksort VS Replacement Selection
Run files in RS are typically larger than memory, as opposed to QS, where they are the size of memory.
QS results in bursts of reads and writes of entire memory loads from the input file to the initial run files, while RS alternates between individual reads and writes.
In RS, memory management is more complex.
The advantage of having fewer runs must be balanced against the different I/O pattern and the disadvantage of more complex memory management.
SLIDE 28
2.2 Hashing
Alternative to sorting.
Expected complexity of hashing algorithms is O(N) rather than O(N log N) as for sorting.
Hash-based query processing algorithms use an in-memory hash table of database objects to perform their matching task.
SLIDE 29 Hash Table
When the hash table is larger than memory, hash table overflow occurs and must be dealt with:
the input is divided into multiple partition files such that partitions can be processed independently from one another;
the concatenation of the results of all partitions is the result of the entire operation.
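Overflow partitioning can be sketched as follows (an illustrative sketch; the function name and fan-out parameter are invented). Because both inputs of a matching task are partitioned with the same hash function, items with equal keys always land in the same partition, so each partition can be processed on its own:

```python
def partition(items, key, fanout):
    """Split the input into `fanout` partition files (here: lists) by
    hash value; the concatenation of per-partition results is the
    result of the entire operation."""
    parts = [[] for _ in range(fanout)]
    for item in items:
        parts[hash(key(item)) % fanout].append(item)
    return parts

rows = [("a", 1), ("b", 2), ("a", 3)]
parts = partition(rows, key=lambda t: t[0], fanout=4)
# Items with the same key fall into the same partition:
matching = [p for p in parts if ("a", 1) in p][0]
print(("a", 3) in matching)  # True
```

If a partition is still larger than memory, the same step is applied to it recursively.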
SLIDE 30
Discussion 2
Question 2
Fact 1: Nowadays, data is much larger than before.
Fact 2: Memory overflow was a serious problem for both sorting and hashing at the time the paper was written.
Fact 3: Machines are more powerful than at the time of this paper, e.g. they have larger memories.
SLIDE 31
The question is…
For query execution in large database system:
Do you think that in the time since then the issues have gotten:
Better, because memories have gotten larger?
Worse, because there is a bigger gap between the time it takes to access memory and the time to access things on disk?
Do you think the increasing power of machines can handle large-scale data processing, or is it still worth making a great effort in this field, as this paper did? Are there any other new issues worth considering?
SLIDE 32
File Scans
Associative Access Using Indices
Buffer Management
SLIDE 33
3.1 File scans
Make it fast -> Read ahead -> requires contiguous file allocation
SLIDE 34 Associative Access Using Indices
Goal:
To reduce the number of accesses to secondary storage
How?
By employing associative search techniques in the form of indices. Indices map key or attribute values to locator information with which database objects can be retrieved.
SLIDE 35 Some Index Structures:
Clustered & Un-clustered
Clustered: the order or organization of index entries determines the order of items on disk.
Sparse & Dense
Sparse: indices do not contain an entry for each data item in the primary file, but only one entry for each page of the primary file.
Dense: there are as many entries in the index as there are items in the primary file. Non-clustering indices must always be dense.
SLIDE 36
Buffer Management
Goal: reduce I/O cost by caching data in an I/O buffer.
Design Issues
Recovery
Replacement policy
Performance effect of buffer allocation
Interactions of index retrieval and buffer management
Implementation Issues
Interface provided: fixing and unfixing pages
Intermediate results kept in a separate buffer
SLIDE 37 Part 5 Binary Matching Operations
There are a number of database operations that combine information from two inputs, files, or sets. A group of operators that all perform basically the same task are called the one-to-one match operations.
SLIDE 38 5.1 Nested-Loops Join
For each item in one input (called the outer input), scan the entire other input (called the inner input) and find matches. A temporary file is used to store the inner input. Shortcoming: the inner input is scanned very often.
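A minimal nested-loops join sketch (illustrative names, not the paper's code) makes the shortcoming visible: the inner loop rescans the entire inner input for every outer item.

```python
def nested_loops_join(outer, inner, key_o, key_i):
    """Naive nested-loops join: for each outer item, scan the entire
    inner input looking for matching join keys."""
    result = []
    for o in outer:
        for i in inner:      # the inner input is rescanned per outer item
            if key_o(o) == key_i(i):
                result.append((o, i))
    return result

emps = [("alice", 10), ("bob", 20)]
depts = [(10, "sales"), (20, "eng")]
print(nested_loops_join(emps, depts, lambda e: e[1], lambda d: d[0]))
# [(('alice', 10), (10, 'sales')), (('bob', 20), (20, 'eng'))]
```

The cost is proportional to the product of the input sizes, which is why the improvements on the next slide all aim to reduce rescanning.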
SLIDE 39
Improvements for Nested-Loops
If a single match carries all necessary information (semi-join), the scan of the inner input can be terminated after the first match.
Scan the inner input page by page instead of item by item (block nested-loops join): more in-memory computation per scan.
Scan the inner input forward and backward, reusing the last page.
SLIDE 40 5.2 Merge-Join Algorithms
Requires that both inputs are sorted on the join attribute. Similar to the merge process used in sorting.
Heap-filter merge-join: a combination of nested-loops join and merge-join.
Hybrid join (used by IBM in DB2) uses elements from index nested-loops join and merge-join, plus techniques for joining sorted lists of index leaf entries.
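The basic merge-join can be sketched as follows (an illustrative sketch; it assumes both inputs are sorted on the join key and, for simplicity, that keys are unique in the right input):

```python
def merge_join(left, right, key_l, key_r):
    """Merge two sorted inputs on the join key. Assumes unique keys
    on the right input; duplicates on both sides would require
    backing up within the right input."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key_l(left[i]), key_r(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            result.append((left[i], right[j]))
            i += 1   # advance left only: the right item may match again
    return result

left = [(1, "a"), (1, "b"), (2, "c")]
right = [(1, "X"), (3, "Y")]
print(merge_join(left, right, lambda l: l[0], lambda r: r[0]))
# [((1, 'a'), (1, 'X')), ((1, 'b'), (1, 'X'))]
```

Each input is read exactly once, so the join itself is linear; the cost of sorting the inputs is what must be weighed against it.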
SLIDE 41
5.3 Hash Join Algorithms
Hash join algorithms are based on the idea of building an in-memory hash table on one input and then probing this hash table with items from the other input.
Good news: very fast, with no temporary files, if the build input does indeed fit into memory.
SLIDE 42 5.3 Hash Join Algorithms
Build and probe inputs are partitioned using the same partitioning function. The final join result is the concatenation of the join results of corresponding partition pairs.
Recursive partitioning is used when a build partition file is larger than memory.
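The build-and-probe idea, with the partitioning step left out, can be sketched as (illustrative names, not the paper's code):

```python
def hash_join(build, probe, key_b, key_p):
    """In-memory hash join: build a hash table on the (smaller) build
    input, then probe it with each item of the probe input. Assumes the
    build input fits in memory; otherwise both inputs are partitioned
    first and each partition pair is joined this way."""
    table = {}
    for b in build:
        table.setdefault(key_b(b), []).append(b)
    result = []
    for p in probe:
        for b in table.get(key_p(p), []):
            result.append((b, p))
    return result

print(hash_join([(1, "x"), (2, "y")], [(1, "a"), (1, "b"), (3, "c")],
                lambda b: b[0], lambda p: p[0]))
# [((1, 'x'), (1, 'a')), ((1, 'x'), (1, 'b'))]
```

Each input is read once and each probe is an expected constant-time lookup, which is where the O(N) expected cost comes from.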
SLIDE 43
Discussion
The author mentions that many commercial database systems use only the nested loop and merge joins due to the results of the System R project, which did not consider hash join algorithms. He states that hash join is today regarded as the most efficient in many cases. Should hash joins have been considered for use in the System R project? Why? What factors might have made the System R team not consider hash join?
SLIDE 44
7. Duality of Sort- and Hash-Based Query Processing Algorithms
Sort-based algorithms: a large data set is divided into subsets using a physical rule, namely into chunks as large as memory.
Hash-based algorithms: large inputs are cut into subsets by hash value.
SLIDE 45
Several dualities
I/O capabilities (random and sequential)
Memory limitation (fan-in, fan-out)
Read-ahead (merging) VS write-behind (partitioning)
Etc.
There exist many dualities between sorting using multilevel merging and recursive hash table overflow management.
SLIDE 46
Two special cases
If two join inputs are of different size (and the query optimizer can reliably predict this difference), hybrid hash join outperforms merge-join. If the hash function is very poor, merge-join performs better than hash join.
SLIDE 47
8. Execution of Complex Query Plans
Optimal scheduling of multiple operators and the division and allocation of resources in a complex plan are important issues. Operators share physical resources, i.e. memory and disk bandwidth.
SLIDE 48 Earlier relational execution engines VS today's
In earlier times:
- Concurrent execution of multiple subplans in a single query was not possible.
- Writing temporary files impedes concurrency and creates a need for resource allocation.
Now:
- More join algorithms and more complex query plans are considered.
- More concurrent users are supported, and parallel-processing capabilities are exploited.
SLIDE 49
For bushy plans
Stop points (all data in temporary files, no intermediate result data in memory) can be used to switch subplans.
Which subplan is initiated first?
How is memory partitioned among active concurrent operators?
Recursion levels should be executed level by level in recursive hybrid hash join.
SLIDE 50
Other I/O issues, e.g. disk bandwidth and disk arms for seeking in partitioning and merging need to be considered. Scheduling bushy trees in multiprocessor systems is not entirely understood yet.
SLIDE 51
Discussion 4
When tuning your database system for performance, how much would you weigh the common queries versus the computationally complex queries? The queries that touch a large amount of data versus the queries that touch a small amount of data?
SLIDE 52
It’s over….