Data Presentation and Collection
Week 2
Prepared by: Nurazrin Jupri
Types of data
- Data: quantitative (discrete or continuous) or qualitative
- Data: primary or secondary

Primary data
- Data collected especially for the purpose of whatever survey is being conducted.
- Raw data are primary data which have not been processed at all, and which are still just a list of numbers.
- The main sources of primary data are personal investigation, teams of investigators, interviews, questionnaires and telephone surveys.
- It is reliable, as you know where the data have come from and are aware of any inadequacies or limitations.
- However, it can take time to collect and is expensive.

Secondary data
- Data which have already been collected elsewhere, for some other purpose, but which can be used or adapted for the survey being conducted.
- For example: from government, banks, newspapers, the Internet.
- Secondary data sources may be satisfactory in certain situations, or they may be the only convenient means of obtaining an item of data.
- It is essential to ensure that any secondary data used are accurate and reliable.
- Relatively inexpensive and quick to obtain.
Discrete data
- Can only take whole-number values, e.g. the number of people or the number of houses
Continuous data
Can take on any value within a range; examples include:
- Change in speed (acceleration) of a car
- Change in height of a person
- Change in age of a person
Discrete data are, for example, the number of goals scored by Arsenal against Chelsea in the FA Cup Final: Arsenal could score 0, 1, 2, 3 or even 4 goals (discrete variables = 0, 1, 2, 3, 4), but they cannot score 2.1 or 2.5 goals.
Continuous data include the heights of all the members of your family, as these can take on any value: 1.542 m, 1.639 m and 1.492 m, for example (continuous variables = 1.542, 1.639, 1.492).
Look through the following list of surveys and decide whether each is collecting qualitative data or quantitative data. If you think the data is quantitative, indicate whether it is discrete or continuous.
Statement (tick one: Qualitative / Discrete / Continuous)
(a) A survey of accountancy textbooks, to determine how many diagrams they contain.
(b) A survey of greetings cards on a newsagent's shelf, to determine whether or not each has a price sticker on it.
(d) A survey of swimmers to find out how long they take to swim a kilometre.
- Tabulation means putting data into tables.
- A table is a matrix of data in rows and columns, with the rows and the columns having titles.
FREQUENCY DISTRIBUTIONS
- Converting the set of numbers into the form of a grouped frequency table.
- This involves dividing the range covered by the data into classes and counting the numbers of data values which fall into each class.
- These numbers are the class frequencies.
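The counting step described above can be sketched in Python. This is only an illustration: the function name `grouped_frequencies` and the list of machine ages are assumptions made for the example and do not come from the slides.

```python
from collections import Counter


def grouped_frequencies(values, start, width, n_classes):
    """Count how many values fall into each class 'lower and under upper'."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)  # which class the value belongs to
        if 0 <= index < n_classes:
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


# Hypothetical ages in months (illustrative only, not the data behind the slides).
ages = [3, 7, 12, 14, 18, 21, 22, 26, 31, 34]
table = grouped_frequencies(ages, start=0, width=5, n_classes=7)
for lower, upper, freq in table:
    print(f"{lower} and under {upper}: {freq}")
```

Each value is placed in exactly one class because a class includes its lower limit but not its upper limit.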
Tally Chart for defective items
Age in months      Tally                            Frequency (F)
0 and under 5      ||                               2
5 and under 10     |||                              3
10 and under 15    ||||| ||                         7
15 and under 20    ||||| ||||                       9
20 and under 25    ||||| ||||| ||||| |||||          20
25 and under 30    ||||| ||||| ||||| ||             17
30 and under 35    ||||| ||||| ||||| ||||| ||       22
Total                                               80
Frequency table for defective items by age of machine
Age in months   Frequency
0-5             2
5-10            3
10-15           7
15-20           9
20-25           20
25-30           17
30-35           22
Total           80
Class intervals
- used in grouped data
- e.g. 0 - 9, 10 - 19, 20 - 29
Class boundaries
- the values at which different class intervals meet; e.g. for the second class interval (10 - 19):
- 9.5 is the lower class boundary
- 19.5 is the upper class boundary
- What are the class boundaries of the third class interval (20 - 29)?
- 19.5 is the lower class boundary
- 29.5 is the upper class boundary
Class size (class widths)
- the difference between the upper and lower class boundaries (e.g. 10 in our example)
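As a rough sketch, the class boundaries and class widths for the intervals 0 - 9, 10 - 19 and 20 - 29 can be worked out as below. The helper name `class_boundaries` is hypothetical, and the code assumes the classes are sorted and separated by equal gaps, so that each boundary sits halfway across the gap between neighbouring classes.

```python
def class_boundaries(limits):
    """Return (lower, upper) boundaries for classes given as (lower, upper) limits.

    Assumes sorted classes with the same gap between each pair of neighbours.
    """
    gap = limits[1][0] - limits[0][1]  # e.g. 10 - 9 = 1
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]


limits = [(0, 9), (10, 19), (20, 29)]
boundaries = class_boundaries(limits)
widths = [upper - lower for lower, upper in boundaries]

print(boundaries)  # [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
print(widths)      # [10.0, 10.0, 10.0]
```

The second class comes out with boundaries 9.5 and 19.5 and a width of 10, in line with the example in these notes.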
Class frequencies
- the number of times an observation occurs in a class interval

Class       Frequency
25 - 34     1
35 - 44     2
45 - 54     11
55 - 64     30
65 - 74     36
75 - 84     21
85 - 94     15
95 - 104    3
105 - 114   0
115 - 124   1

For each class: lower boundary? Upper boundary? Class length / width? Class interval?
- How many classes should be used? It is usual to arrange for at least 5 and not more than 15.
- If it can be conveniently arranged, all classes should have the same class width.
TRY: CUMULATIVE TABLE
Histogram
- Set of rectangles (bars), giving a picture of the distribution
- Base of each bar: equal to class size (width)
- Height of each bar: equal to class frequency (or proportional where class size varies)
- The area of a bar above a class interval is proportional to the frequency in that class: AREA, not HEIGHT
- If the class intervals are not of equal size, find the frequency density: height of block = class frequency / class width
[Figure: histogram of number of students (vertical axis, 10 to 40) against score on final exam, maximum 100 (horizontal axis, 20 to 100)]
Histogram (Frequency density)
The following table shows the ages of 25 children on a school bus:

Age        Frequency (no. of children)
5 - 10     6
11 - 15    15
16 - 17    4

Draw a histogram to represent the above data.
Solution:
Age        Frequency   Lower boundary   Upper boundary   Frequency density
5 - 10     6           4.5              10.5             1
11 - 15    15          10.5             15.5             3
16 - 17    4           15.5             17.5             2
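The frequency densities in this solution can be checked with a short Python sketch. It applies the rule frequency density = class frequency / class width to the school bus table; the function and variable names are chosen for illustration only.

```python
# (class label, frequency, lower boundary, upper boundary) from the school bus example
classes = [
    ("5 - 10", 6, 4.5, 10.5),
    ("11 - 15", 15, 10.5, 15.5),
    ("16 - 17", 4, 15.5, 17.5),
]


def frequency_density(frequency, lower, upper):
    """Height of the histogram block: class frequency divided by class width."""
    return frequency / (upper - lower)


densities = [frequency_density(f, lo, hi) for _, f, lo, hi in classes]
for (label, *_), density in zip(classes, densities):
    print(f"{label}: {density}")
```

Because the class widths are 6, 5 and 2, plotting the raw frequencies as heights would overstate the widest class; the densities keep the bar areas proportional to the frequencies.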
Try this!
The ages of children entering a theme park in a 1-hour period are recorded in the table:

Age        Frequency (no. of children)
1 - 3      12
4 - 10     14
11 - 18    48

Find the class widths and frequency densities. Then draw a histogram to represent the data.
Frequency curve
- Connecting the mid-points of the tops of each rectangle in the histogram by straight lines gives the frequency polygon.
- Area beneath the frequency polygon is identical to that of the histogram
- If we 'smooth' the straight lines to form 'curves' then a frequency curve results
[Figure: frequency polygon]
[Figure: frequency curve]
Cumulative frequency table

Range               Cumulative frequency
Less than 34.5      1
Less than 44.5      3
Less than 54.5      14
Less than 64.5      44
Less than 74.5      80
Less than 84.5      101
Less than 94.5      116
Less than 104.5     119
Less than 114.5     119
Less than 124.5     120
FREQUENCY TABLE
Cumulative frequency
- the running total of the figures shown in the frequency column of a frequency table. Can be:
- Cumulative less than
- OR Cumulative more than
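A minimal Python sketch of both running totals, using the class frequencies for the classes 25 - 34 up to 115 - 124 from earlier in these notes. The frequency of the class 105 - 114 is taken as 0, which is what the 'less than' cumulative table implies; the variable names are illustrative.

```python
from itertools import accumulate

lower_boundaries = [24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]
upper_boundaries = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5]
frequencies = [1, 2, 11, 30, 36, 21, 15, 3, 0, 1]

# 'Less than' each upper boundary: running total of the frequencies.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# 'More than' each lower boundary: the total minus everything below that boundary.
more_than = [total - below for below in [0] + less_than[:-1]]

for ub, c in zip(upper_boundaries, less_than):
    print(f"Less than {ub}: {c}")
for lb, c in zip(lower_boundaries, more_than):
    print(f"More than {lb}: {c}")
```

The 'less than' totals rise from 1 to 120, while the 'more than' totals fall from 120 to 1, which is why the two ogives slope in opposite directions.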
‘More than’ Ogive
Using the data given below, construct a 'more than' cumulative frequency table and draw the Ogive.
Solution:
Marks      Lower boundary   Upper boundary   Cumulative frequency (more than upper boundary)
                            0.5              70
1 - 10     0.5              10.5             67
11 - 20    10.5             20.5             49
21 - 30    20.5             30.5             37
31 - 40    30.5             40.5             23
41 - 50    40.5             50.5             13
51 - 60    50.5             60.5             7
61 - 70    60.5             70.5             2
71 - 80    70.5             80.5             0
[Figure: 'more than' ogive, cumulative frequency (vertical axis, 10 to 80) against marks (horizontal axis, class boundaries 0.5 to 80.5)]
‘Less than’ Ogive
Draw a 'less than' ogive curve for the following data:
Solution:
What are the lower and upper boundaries of the classes?

Marks      Lower boundary   Upper boundary   Cumulative frequency
0 - 10     0.5              9.5              2
10 - 20    9.5              19.5             10
20 - 30    19.5             29.5             22
30 - 40    29.5             39.5             40
40 - 50    39.5             49.5             68
50 - 60    49.5             59.5             90
60 - 70    59.5             69.5             96
70 - 80    69.5             79.5             100
[Figure: 'less than' ogive, cumulative frequency (vertical axis, 20 to 120) against marks (horizontal axis, 10 to 90)]
Percentiles
- A particular value below (or above) which a given percentage of the distribution lies
- Lower quartile: that value below which 25% of the distribution lies
- Upper quartile: that value above which 25% of the distribution lies
- Median: that value below/above which 50% of the distribution lies
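One way to estimate these values from grouped data is to read them off the 'less than' ogive. The sketch below does this numerically for the cumulative frequencies given earlier in these notes (total 120), assuming straight-line interpolation between the plotted points; that interpolation, the starting point (24.5, 0) and the function name `percentile` are assumptions for illustration, not a method stated on the slides.

```python
from bisect import bisect_left

# Points of the 'less than' ogive: (class boundary, cumulative frequency).
boundaries = [24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5]
cumulative = [0, 1, 3, 14, 44, 80, 101, 116, 119, 119, 120]


def percentile(p):
    """Estimate the value below which p% of the distribution lies (0 <= p <= 100)."""
    target = p / 100 * cumulative[-1]
    i = bisect_left(cumulative, target)  # first point with cumulative >= target
    if i == 0:
        return boundaries[0]
    x0, x1 = boundaries[i - 1], boundaries[i]
    y0, y1 = cumulative[i - 1], cumulative[i]
    # Straight-line interpolation between the two neighbouring ogive points.
    return x0 + (target - y0) / (y1 - y0) * (x1 - x0)


lower_quartile = percentile(25)
median = percentile(50)
upper_quartile = percentile(75)
print(round(lower_quartile, 2), round(median, 2), round(upper_quartile, 2))
```

For this table the estimates are roughly 59.8 for the lower quartile, 68.9 for the median and 79.3 for the upper quartile.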
Stem-and-leaf
- Order the raw data in an array (e.g. lowest to highest)
- Stems: often the tens
- Leaves: often the units
- Example:
80, 83, 84, 86, 87, 89, 90, 94, 95, 97, 99
8 : 0 3 4 6 7 9
9 : 0 4 5 7 9
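The same display can be produced with a few lines of Python. This is a sketch under the assumption that the stems are the tens and the leaves are the units, as in the example above; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one text row per stem, with tens as stems and units as leaves."""
    stems = defaultdict(list)
    for v in sorted(values):  # order the raw data first
        stems[v // 10].append(v % 10)
    return [
        f"{stem} : {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


data = [80, 83, 84, 86, 87, 89, 90, 94, 95, 97, 99]
rows = stem_and_leaf(data)
for row in rows:
    print(row)
```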
Bar charts
A method of data presentation in which data are represented by bars of equal width, the height / length of the bar corresponding to the value of the data.
Axes must be labelled and there must be a scale to indicate the magnitude of the data.
Simple bar chart
- A chart consisting of one or more bars
- The actual magnitude of each item is shown
- The lengths of bars on the chart allow magnitudes to be compared

Component bar chart
- A bar chart that gives a breakdown of each total into its components
- A percentage component bar chart does not show total magnitudes

Compound bar chart
- Two or more separate bars are used to present sub-divisions of data
- There is usually no space between the bars for data in the same category
Bar chart
[Figure: bar chart of number of students (scale 1 to 8) by nationality of students: China, Turkey, Ukraine, Kyrgyzstan, India, Pakistan, Oman, Saudi Arabia, Ivory Coast, Thailand]
Pie Chart
- Circular 'pie' diagram using sectors to show distribution
- A complete 'pie' = 360° = 100%; 180° = 50%
- Shading and colour distinguish the segments from each other
- Sector angle: fraction of total in sector × 360°
Example (total of 120 items)
- Sector 1: 60 items, angle = (60 / 120) × 360° = 180°
- Sector 2: 40 items, angle = (40 / 120) × 360° = 120°
- Sector 3: 20 items, angle = (20 / 120) × 360° = 60°
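The sector-angle rule (fraction of total in sector × 360°) can be sketched in Python for this example of 120 items. The function name `sector_angles` is chosen for illustration.

```python
def sector_angles(counts):
    """Map each sector name to its angle in degrees: count / total x 360."""
    total = sum(counts.values())
    # Multiply before dividing so that whole-number angles come out exactly.
    return {name: count * 360 / total for name, count in counts.items()}


angles = sector_angles({"Sector 1": 60, "Sector 2": 40, "Sector 3": 20})
for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

The angles always add up to 360°, which is a quick check on the arithmetic before drawing the chart.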
[Figure: pie diagram of 120 items: Sector 1, 60; Sector 2, 40; Sector 3, 20]
Scatter graph
- Plot of points on a graph
- Line of best fit: straight line through the points following the trend
- Positive correlation: line of best fit slopes up; variables rise (and fall) together
- Negative correlation: line of best fit slopes down; one variable rises as the other falls
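As an illustration of how the slope of the line of best fit signals the type of correlation, the sketch below fits a straight line by least squares. Least squares is one common fitting method and is an assumption here (the slides do not say how the line is drawn), and the temperature and sales figures are hypothetical values invented for the example.

```python
def best_fit(xs, ys):
    """Least-squares straight line through the points: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data (illustrative only).
temperature = [10, 15, 20, 25, 30]
ice_cream_sales = [20, 30, 45, 50, 65]
gas_fire_sales = [60, 50, 35, 30, 15]

ice_slope, _ = best_fit(temperature, ice_cream_sales)
gas_slope, _ = best_fit(temperature, gas_fire_sales)
print("ice creams:", "positive" if ice_slope > 0 else "negative", "correlation")
print("gas fires:", "positive" if gas_slope > 0 else "negative", "correlation")
```

With these made-up figures the ice cream line slopes up and the gas fire line slopes down, matching the definitions of positive and negative correlation above.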
- Random scatter
[Figure: scatter graph of Y against X showing random scatter]
Does this represent the sale of gas fires or ice creams?
[Figure: scatter graph of sales (Y) against temperature (X)]
Does this represent the sale of gas fires or ice creams?
[Figure: scatter graph of sales (Y) against temperature (X)]