Data Modeling and Database Design Yuri Takhteyev Faculty of - - PowerPoint PPT Presentation

data modeling and database design
SMART_READER_LITE
LIVE PREVIEW

Data Modeling and Database Design Yuri Takhteyev Faculty of - - PowerPoint PPT Presentation

INF1343, Winter 2012 Data Modeling and Database Design Yuri Takhteyev Faculty of Information University of Toronto This presentation is licensed under Creative Commons Attribution License, v. 3.0. To view a copy of this license, visit


slide-1
SLIDE 1

INF1343, Winter 2012

Data Modeling and Database Design

Yuri Takhteyev

Faculty of Information

University of Toronto

This presentation is licensed under Creative Commons Attribution License, v. 3.0. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/. This presentation incorporates images from the Crystal Clear icon collection by Everaldo Coelho, available under LGPL from http://everaldo.com/crystal/.

slide-2
SLIDE 2

Week 9

Advanced Queries

slide-3
SLIDE 3

“Loading” Data

Loading data from tab- delimited CSV Files

the files can be prepared in OpenOffice Calc / Excel Note: tab-delimited

Remote method:

DB server accesses the files

Local method:

Through the database client

slide-4
SLIDE 4

LOAD DATA INFILE

load data infile "<file_name>" into table <table_name>; load data infile "/home/okenobi/names.csv" into table names;

The DB server must have access to the file. Your DB account must have file/super privileges.

slide-5
SLIDE 5

LOAD DATA LOCAL

load data local infile "<file_name>" into table <table_name>; load data local infile "c:/Users/okenobi/Desktop/names.csv" into table names;

The file path in this case is to a file on the computer where your DB client is running. Backslashes (\) in Windows files names must be flipped (/) or escaped (\\).

slide-6
SLIDE 6

select customer.name from customer join vacation_order using(customer_id) where payment_method = "Cash"; What are the names of the Diveshop’s customers who paid cash for their

  • rders?
slide-7
SLIDE 7

select count(*) from vacation_order where payment_method = "Cash"; How many orders were paid in cash?

slide-8
SLIDE 8

select sum(cost) from vacation_order where payment_method = "Cash"; How much cash was paid altogether?

slide-9
SLIDE 9

select sum(cost) from vacation_order where payment_method = "Visa"; And how much money was paid through Visa?

slide-10
SLIDE 10

select sum(cost) from vacation_order where payment_method = "Master Card"; And how much money was paid through Master Card?

slide-11
SLIDE 11

select payment_method, sum(cost) from vacation_order group by payment_method; Can we just get all the sums for each payment method at once?

slide-12
SLIDE 12

select payment_method, sum(cost) from vacation_order group by payment_method

  • rder by sum(cost) desc;

Which payment method brought in most money?

slide-13
SLIDE 13

select payment_method, sum(cost) from vacation_order where no_of_people<=2 group by payment_method

  • rder by sum(cost) desc limit 1;

Which payment method brought in most money for vacations that involved up to 2 people?

Note: “where” before “group by”!

slide-14
SLIDE 14

select payment_method, count(*) from vacation_order where no_of_people<=2 group by payment_method

  • rder by count(*) desc limit 1;

Which payment method was used most often for for vacations that involved up to 2 people?

slide-15
SLIDE 15

select payment_method, count(*) from vacation_order where no_of_people<=2 and cost>5000 group by payment_method

  • rder by count(*) desc limit 1;

Which payment method was used most often for vacations that involved up to 2 people and cost over $5000?

slide-16
SLIDE 16

select payment_method, sum(cost) from vacation_order group by payment_method having sum(cost)>15000

  • rder by count(*) desc limit 1;

Which payment methods brought in more than $15,000 in total? Note the order!

slide-17
SLIDE 17

where selects rows from the original table (after all the joins) having by selects rows from the aggregated table

Where vs Having

slide-18
SLIDE 18

select ... from … join … (several times) where … group by … having …

  • rder by …

limit ...

Order of Clauses

7 1 2 3 4 5 6

slide-19
SLIDE 19

select payment_method, avg(cost) from vacation_order group by payment_method having avg(cost)>15000; Which payment methods brought in more than $15,000 on average per

  • rder?
slide-20
SLIDE 20

select avg(cost) from vacation_order where cost>15000 group by payment_method; What was the average amount coming from each payment methods for

  • rders costing over $15,000?
slide-21
SLIDE 21

select avg(cost) from vacation_order where avg(cost)>15000; What about this query? Invalid!

slide-22
SLIDE 22

select payment_method, avg(cost) from vacation_order group by payment_method having cost > 15000; And this one? Invalid!

slide-23
SLIDE 23

select avg(cost) from vacation_order; select payment_method from vacation_order where cost > 18462.8 group by payment_method; What payment methods were used for vacations costing above average?

slide-24
SLIDE 24

set @avg_cost = ( select avg(cost) from vacation_order ); select payment_method from vacation_order where cost > @avg_cost group by payment_method; What payment methods were used for vacations costing above average?

slide-25
SLIDE 25

select payment_method from vacation_order where cost > ( select avg(cost) from vacation_order ) group by payment_method; What payment methods were used for vacations costing above average? An “uncorrelated” subquery

slide-26
SLIDE 26

select payment_method, avg(cost) from vacation_order group by payment_method having avg(cost) > ( select avg(cost) from vacation_order ); What payment methods on average brought more money than the average for all vacations? Still “uncorrelated”

slide-27
SLIDE 27

select order_id from vacation_order as o where cost > ( select avg(cost) from vacation_order where payment_method =o.payment_method ); Which vacations cost more than the average for their payment method? This is a “correlated” query!

slide-28
SLIDE 28

select category, count(*) from species group by category having count(*) > 1; Which categories of marine life have more than one species? (Use species.)

We did not go through this and subsequent slides in class. Please review them on your own.

slide-29
SLIDE 29

select category, min(length_cm), max(length_cm) from species group by category having count(*) > 1; What are the smallest and the largest lengths in each of the categories that have more than 1 species?

slide-30
SLIDE 30

select category, max(length_cm) / min(length_cm) from species group by category having count(*) > 1; Which category with >1 species has the largest ratio between the largest and the smallest length of species? Step 1

slide-31
SLIDE 31

select category, max(length_cm) / min(length_cm) as ratio from species group by category having count(*) > 1; Which category with >1 species has the largest ratio between the largest and the smallest length of species? Step 2

slide-32
SLIDE 32

select * from ( select category, max(length_cm) / min(length_cm) as ratio from species group by category having count(*) > 1) as categories

  • rder by categories.ratio desc;

Which category with >1 species has the largest ratio between the largest and the smallest length of species? Step 3

slide-33
SLIDE 33

Questions?