SLIDE 1 Query Evaluation Techniques for Large Databases
GOETZ GRAEFE Presented by Long Zhang
- Dept. of Computer Science, UBC
zhlong73@cs.ubc.ca
SLIDE 2 Purpose
To survey efficient algorithms and the software architecture of query execution engines for executing complex queries.
SLIDE 3
Restrictions
Restrictive data definition and manipulation languages can make application development and maintenance unbearably cumbersome. Data volumes might be so large or complex that the real or perceived performance advantage of file systems is considered more important than all other criteria, e.g. the higher levels of abstraction and programmer productivity typically achieved with database management systems.
SLIDE 4
Topics
algorithms and their costs
sorting versus hashing
parallelism
resource allocation and scheduling
general performance-enhancing techniques
…
SLIDE 5 Which parts do we focus on?
User Interface
Database Query Language
Query Optimizer
Query Execution Engine
Files and Indices
I/O Buffer
Disk
SLIDE 6 Query Execution Engine
What is it?
Collection of query execution operators and mechanisms for operator communication and synchronization. The query execution engine defines the space of possible plans that can be chosen by the query optimizer.
SLIDE 7
1) Translate the logical query from SQL into a query tree in logical algebra.
2) The query tree in logical algebra is translated into a physical plan.
3) The optimizer expands the search space and finds the best plan.
4) The optimal physical plan is copied out of the optimizer's memory structure and sent to the query execution engine.
5) The query execution engine executes the plan, using relations in the database as input, and produces the output.
SLIDE 8 Discussion
Fact: On the first page, the author states that DBMSs have not been used in many areas for two reasons.
Application development and maintenance is difficult. (Restrictive data definition and manipulation languages can make development difficult.)
The data in those areas is so big that speed trumps all, and people would rather hand-code.
SLIDE 9 Questions are…
Do you think in today’s world databases are popular for most applications? IF NO, GO TO (1), IF YES, GO TO (2)
1) Why do you think databases aren't used more? Why don't you use them on your data?
2) How do database systems' needs for dealing with large amounts of data differ from other applications or systems, such as an OS? Should we only depend on it? If yes, what should be changed in the architecture of databases or data abstraction?
SLIDE 10
1. Architecture of Query Execution Engines
Focus on useful mechanisms for processing sets of items: records, tuples, entities, objects
SLIDE 11 Logical algebra VS Physical algebra
Logical algebra, e.g. relational algebra, is closely related to the data model and defines which queries can be expressed in that data model. Physical algebra is system-specific: for example, nested-loops join, merge-join, hash join, etc. Systems may implement the same data model and logical algebra with different physical algebras.
SLIDE 12 Logical algebra VS physical algebra
Specific algorithms, and therefore cost functions, are associated only with physical operators, not with logical algebra operators.
Mapping logical to physical operators is non-trivial:
it involves algorithm choices; logical and physical operators are not directly mapped; some operators in the physical algebra may implement multiple logical operators; etc.
SLIDE 13 To sum up
Query optimization is the mapping from logical to physical operations. The query execution engine is the implementation of operations on physical representation types, and of mechanisms for coordination and cooperation among multiple such operations in complex queries.
SLIDE 14 Implementation issues
Prepare an operator for producing data: Open
Produce an item: Next
Perform final housekeeping: Close
SLIDE 15
Iterators
Two important features of operators:
They can be combined into arbitrarily complex evaluation plans.
Any number of operators can schedule and execute each other in a single process without assistance from the underlying OS.
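The open/next/close protocol can be sketched in Python (an illustrative sketch, not code from the paper; all class and function names are invented for this example):

```python
class Scan:
    """Leaf operator: produces items from an in-memory list (a stand-in for a file scan)."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos < len(self.rows):
            row = self.rows[self.pos]
            self.pos += 1
            return row
        return None          # end of input
    def close(self):
        pass

class Filter:
    """Applies a predicate; demands items from its input only on request (pipelining)."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None
    def close(self):
        self.child.close()

# A two-operator plan executed within a single process, one item at a time.
plan = Filter(Scan([1, 5, 2, 8, 3]), lambda r: r > 2)
plan.open()
out = []
while (r := plan.next()) is not None:
    out.append(r)
plan.close()
print(out)  # [5, 8, 3]
```

Because every operator speaks the same three-procedure interface, any operator's input can itself be a whole subplan, which is what allows arbitrarily complex evaluation plans.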
SLIDE 16
Observations
Entire query plan executed within a single process
Operators produce an item at a time, on request
Items never wait in a temporary file or buffer (pipelining)
Efficient in time-space-product memory cost
Iterators can schedule any type of tree, including bushy trees
No operator is affected by the complexity of the whole plan
SLIDE 17
Sorting & Hashing
The purpose of many query-processing algorithms is to perform some kind of matching,
i.e., bringing items that are “alike” together and performing some operation on them.
There are two basic approaches used for this purpose:
sorting and hashing.
These are the basis for many join algorithms.
SLIDE 18
Design Issues
Sorting should be implemented as an iterator,
in order to ensure that the sort module interfaces well with the other operators (e.g., file scan or merge-join).
Input to the sort module must be an iterator, and sort uses open, next, and close procedures to request its input;
therefore, sort input can come from a scan or a complex query plan, and the sort operator can be inserted into a query plan at any place or at several places.
SLIDE 19 Out of Memory
If the input is larger than main memory, the sort operator's open procedure creates sorted runs and merges them until only one final merge phase is left: dividing and combining.
SLIDE 20 More on Sorting
For sorting large data sets there are two distinct sub-algorithms:
one for sorting within main memory, and one for managing subsets of the data set on disk.
For practical reasons, e.g., ensuring that a run fits into main memory, the disk management algorithm typically uses physical dividing and logical combining (merging). A point of practical importance is the fan-in or degree of merging, but this is a parameter rather than a defining algorithm property.
SLIDE 21
Creating level-0 runs
In-memory sort, e.g. quicksort: each run will have the size of the allocated memory.
Replacement selection: uses a priority heap, which supports insert and remove-smallest operations, to generate the longest possible initial ordered sequence for each run file.
SLIDE 22 Replacement Selection Sort
Input Buffer, Output Buffer, and RAM have the same size.
Input File → Input Buffer → RAM → Output Buffer → Output File
SLIDE 23 Replacement Selection Sort
In memory:
31 19 40 56 21 25 12
Next input: 20, 10
We do not need to care when the run grows larger than memory.
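Replacement selection can be sketched with Python's heapq (an illustrative sketch, not the paper's code; the function name and run representation are invented). Tagging each heap entry with a run number sends items smaller than the last output to the next run:

```python
import heapq

def replacement_selection(items, memory_size):
    # Heap entries are (run_number, key): items deferred to the next run
    # sort after everything still belonging to the current run.
    it = iter(items)
    heap = []
    for x in it:
        heap.append((0, x))
        if len(heap) == memory_size:
            break
    heapq.heapify(heap)
    runs, out, current = [], [], 0
    while heap:
        run, x = heapq.heappop(heap)
        if run != current:               # current run exhausted; start the next
            runs.append(out)
            out, current = [], run
        out.append(x)
        nxt = next(it, None)
        if nxt is not None:
            # An item smaller than the last output must wait for the next run.
            heapq.heappush(heap, (run + (nxt < x), nxt))
    runs.append(out)
    return runs

print(replacement_selection([5, 3, 8, 1, 9, 2, 6], 3))
# [[3, 5, 8, 9], [1, 2, 6]] -- the first run exceeds the memory size of 3
```

On random input, runs produced this way average about twice the memory size, which is the main advantage over filling memory, sorting it with quicksort, and writing it out.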
SLIDE 24 Merging
Run 1, Run 2, and Run 3 (on disk) are read through in-memory buffers and merged into the Output Buffer.
SLIDE 25
Merging
Scans are faster if read-ahead and write-behind are used.
Using large cluster sizes for the run files is very beneficial.
The number of runs W is typically not a power of the fan-in.
SLIDE 26
Merge as many runs as possible instead of always merging runs of the same level.
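The merge step can be sketched as a k-way merge driven by a priority heap (illustrative; in practice the stdlib's heapq.merge does the same job). It is written as a generator, producing one item per request, in the spirit of the iterator model:

```python
import heapq

def kway_merge(runs):
    """Merge k sorted runs into one sorted stream using a min-heap
    of (value, run index, position) entries."""
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    while heap:
        val, i, j = heapq.heappop(heap)
        yield val
        if j + 1 < len(runs[i]):
            heapq.heappush(heap, (runs[i][j + 1], i, j + 1))

print(list(kway_merge([[3, 5, 8, 9], [1, 2, 6], [4, 7]])))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The heap size equals the fan-in, so the comparison cost per output item is logarithmic in the number of runs merged at once.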
SLIDE 27
Quicksort VS Replacement Selection
Run files in RS are typically larger than memory, as opposed to QS, where they are the size of memory.
QS results in bursts of reads and writes of entire memory loads from the input file to the initial run files, while RS alternates between individual reads and writes.
In RS, memory management is more complex.
The advantage of having fewer runs must be balanced against the different I/O pattern and the disadvantage of more complex memory management.
SLIDE 28
2.2 Hashing
Alternative to sorting.
Expected complexity of hashing algorithms is O(N) rather than O(N log N) as for sorting.
Hash-based query processing algorithms use an in-memory hash table of database objects to perform their matching task.
SLIDE 29 Hash Table
When the hash table is larger than memory, hash table overflow occurs and must be dealt with:
the input is divided into multiple partition files such that partitions can be processed independently from one another;
the concatenation of the results of all partitions is the result of the entire operation.
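Overflow partitioning can be sketched as follows (an illustrative sketch; the function name and fan-out parameter are invented). Because both inputs of a matching task are partitioned with the same hash function, items with equal keys always land in the same partition, so each partition can be processed on its own:

```python
def partition(items, key, fanout):
    """Split the input into `fanout` partition files (here: lists) by
    hash value; the concatenation of per-partition results is the
    result of the entire operation."""
    parts = [[] for _ in range(fanout)]
    for item in items:
        parts[hash(key(item)) % fanout].append(item)
    return parts

rows = [("a", 1), ("b", 2), ("a", 3)]
parts = partition(rows, key=lambda t: t[0], fanout=4)
# Items with the same key fall into the same partition:
matching = [p for p in parts if ("a", 1) in p][0]
print(("a", 3) in matching)  # True
```

If a partition is still larger than memory, the same step is applied to it recursively.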
SLIDE 30
Discussion 2
Question 2
Fact 1: Nowadays, data is much larger than before.
Fact 2: Memory overflow was a serious problem for both sorting and hashing at the time the paper was written.
Fact 3: Machines are more powerful than at the time of this paper, e.g. they have larger memories.
SLIDE 31
The question is…
For query execution in large database system:
Do you think that in the time since then the issues have gotten:
Better, because memories have gotten larger?
Worse, because there is a bigger gap between the time it takes to access memory and the time to access things on disk?
Do you think the increasing power of machines can handle large-scale data processing, or is it still worth making a great effort in this field, as this paper did? Are there any other new issues worth considering?
SLIDE 32
File Scans
Associative Access Using Indices
Buffer Management
SLIDE 33
3.1 File scans
Make it fast -> Read ahead -> requires contiguous file allocation
SLIDE 34 Associative Access Using Indices
Goal:
To reduce the number of accesses to secondary storage
How?
By employing associative search techniques in the form of indices. Indices map key or attribute values to locator information with which database objects can be retrieved.
SLIDE 35 Some Index Structures:
Clustered & Un-clustered
Clustered: the order or organization of index entries determines the order of items on disk.
Sparse & Dense
Sparse: indices do not contain an entry for each data item in the primary file, but only one entry for each page of the primary file.
Dense: there are as many entries in the index as there are items in the primary file. Non-clustering indices must always be dense.
SLIDE 36
Buffer Management
Goal: reduce I/O cost by caching data in an I/O buffer.
Design Issues
Recovery
Replacement policy
Performance effect of buffer allocation
Interactions of index retrieval and buffer management
Implementation Issues
Interface provided: fixing and unfixing pages
Intermediate results kept in a separate buffer
SLIDE 37 Part 5 Binary Matching Operations
There are a number of database operations that combine information from two inputs, files, or sets. A group of operators that all perform basically the same task are called the one-to-one match operations.
SLIDE 38 5.1 Nested-Loops Join
For each item in one input (called the outer input), scan the entire other input (called the inner input) and find matches. A temporary file is used to store the inner input. Shortcoming: the inner input is scanned very often.
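A minimal nested-loops join sketch (illustrative names, not the paper's code) makes the shortcoming visible: the inner loop rescans the entire inner input for every outer item.

```python
def nested_loops_join(outer, inner, key_o, key_i):
    """Naive nested-loops join: for each outer item, scan the entire
    inner input looking for matching join keys."""
    result = []
    for o in outer:
        for i in inner:      # the inner input is rescanned per outer item
            if key_o(o) == key_i(i):
                result.append((o, i))
    return result

emps = [("alice", 10), ("bob", 20)]
depts = [(10, "sales"), (20, "eng")]
print(nested_loops_join(emps, depts, lambda e: e[1], lambda d: d[0]))
# [(('alice', 10), (10, 'sales')), (('bob', 20), (20, 'eng'))]
```

The cost is proportional to the product of the input sizes, which is why the improvements on the next slide all aim to reduce rescanning.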
SLIDE 39
Improvements for Nested-Loops
If a single match carries all necessary information (semi-join), the scan of the inner input can be terminated after the first match.
Scan the inner input page by page instead of item by item (block nested-loops join): more in-memory computation per scan.
Scan the inner input forward and backward, reusing the last page.
SLIDE 40 5.2 Merge-Join Algorithms
Requires that both inputs are sorted on the join attribute. Similar to the merge process used in sorting.
Heap-filter merge-join: a combination of nested-loops join and merge-join.
Hybrid join (used by IBM in DB2) uses elements from index nested-loops join and merge-join, plus techniques for joining sorted lists of index leaf entries.
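The basic merge-join can be sketched as follows (an illustrative sketch; it assumes both inputs are sorted on the join key and, for simplicity, that keys are unique in the right input):

```python
def merge_join(left, right, key_l, key_r):
    """Merge two sorted inputs on the join key. Assumes unique keys
    on the right input; duplicates on both sides would require
    backing up within the right input."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key_l(left[i]), key_r(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            result.append((left[i], right[j]))
            i += 1   # advance left only: the right item may match again
    return result

left = [(1, "a"), (1, "b"), (2, "c")]
right = [(1, "X"), (3, "Y")]
print(merge_join(left, right, lambda l: l[0], lambda r: r[0]))
# [((1, 'a'), (1, 'X')), ((1, 'b'), (1, 'X'))]
```

Each input is read exactly once, so the join itself is linear; the cost of sorting the inputs is what must be weighed against it.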
SLIDE 41
5.3 Hash Join Algorithms
Hash join algorithms are based on the idea of building an in-memory hash table on one input and then probing this hash table with items from the other input.
Good news: very fast, with no temporary files, if the build input does indeed fit into memory.
SLIDE 42 5.3 Hash Join Algorithms
Build and probe inputs are partitioned using the same partitioning function. The final join result is the concatenation of the join results of corresponding partition pairs.
Recursive partitioning is used when a build partition file is larger than memory.
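The build-and-probe idea, with the partitioning step left out, can be sketched as (illustrative names, not the paper's code):

```python
def hash_join(build, probe, key_b, key_p):
    """In-memory hash join: build a hash table on the (smaller) build
    input, then probe it with each item of the probe input. Assumes the
    build input fits in memory; otherwise both inputs are partitioned
    first and each partition pair is joined this way."""
    table = {}
    for b in build:
        table.setdefault(key_b(b), []).append(b)
    result = []
    for p in probe:
        for b in table.get(key_p(p), []):
            result.append((b, p))
    return result

print(hash_join([(1, "x"), (2, "y")], [(1, "a"), (1, "b"), (3, "c")],
                lambda b: b[0], lambda p: p[0]))
# [((1, 'x'), (1, 'a')), ((1, 'x'), (1, 'b'))]
```

Each input is read once and each probe is an expected constant-time lookup, which is where the O(N) expected cost comes from.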
SLIDE 43
Discussion
The author mentions that many commercial database systems use only the nested loop and merge joins due to the results of the System R project, which did not consider hash join algorithms. He states that hash join is today regarded as the most efficient in many cases. Should hash joins have been considered for use in the System R project? Why? What factors might have made the System R team not consider hash join?
SLIDE 44
7. Duality of Sort- and Hash-Based Query Processing Algorithms
Sort-based algorithms: a large data set is divided into subsets using a physical rule, namely into chunks as large as memory.
Hash-based algorithms: large inputs are cut into subsets by hash value.
SLIDE 45
Several dualities
I/O capabilities (random and sequential)
Memory limitation (fan-in, fan-out)
Read-ahead (merging) VS write-behind (partitioning)
Etc.
There exist many dualities between sorting using multilevel merging and recursive hash table overflow management.
SLIDE 46
Two special cases
If two join inputs are of different size (and the query optimizer can reliably predict this difference), hybrid hash join outperforms merge-join. If the hash function is very poor, merge-join performs better than hash join.
SLIDE 47
8. Execution of Complex Query Plans
Optimal scheduling of multiple operators and the division and allocation of resources in a complex plan are important issues. Operators share physical resources, i.e. memory and disk bandwidth.
SLIDE 48 Earlier relational execution engines VS today's
In earlier times:
- Concurrent execution of multiple subplans in a single query was not possible.
- Writing temporary files impedes concurrency and creates a need for resource allocation.
Now:
- More join algorithms and more complex query plans are considered.
- More concurrent users are supported, and parallel-processing capabilities are exploited.
SLIDE 49
For bushy plans
Stop points (all data in temporary files, no intermediate result data in memory) can be used to switch subplans.
Which subplan is initiated first?
How is memory partitioned among active concurrent operators?
Recursion levels should be executed level by level in recursive hybrid hash join.
SLIDE 50
Other I/O issues, e.g. disk bandwidth and disk arms for seeking in partitioning and merging need to be considered. Scheduling bushy trees in multiprocessor systems is not entirely understood yet.
SLIDE 51
Discussion 4
When tuning your database system for performance, how much would you weigh the common queries versus the computationally complex queries? The queries that touch a large amount of data versus the queries that touch a small amount of data?
SLIDE 52
It’s over….