PRESENTATION OF DATA Data summarization Is the organization of data - - PowerPoint PPT Presentation



slide-1
SLIDE 1

PRESENTATION OF DATA

Tabular presentation

slide-2
SLIDE 2

Data summarization

Data summarization is the organization of data in a way that makes them easy to understand.

  • It is the first step of data interpretation (analysis).
  • It consists of the following steps:

1) Data entry. 2) Ordered array. 3) Summarization.

slide-3
SLIDE 3

Data entering

  • Generally, computers are used for data entry.
  • Nowadays, many software packages are available for data entry, presentation and analysis.
  • Examples of statistical software: MS Excel, Epi-Info, SPSS, Stata.

slide-4
SLIDE 4

Ordered array

  • It is the first step in the process of data organization after data entry.
  • An ordered array is a listing of values from the smallest value to the largest value.
  • It enables one to determine quickly the largest and smallest measurements.
  • It also enables one to determine roughly the proportion of people lying below or above a certain value.
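
As a minimal sketch of the points above, the following Python snippet builds an ordered array and reads off the extremes and a rough proportion. The sample values and the helper name `proportion_below` are illustrative assumptions, not part of the slides.

```python
# Hypothetical sample: hours of sleep for ten patients (illustrative values only).
hours = [7, 3, 10, 1, 8, 5, 12, 4, 7, 3]

# The ordered array: values listed from smallest to largest.
ordered = sorted(hours)
smallest, largest = ordered[0], ordered[-1]


def proportion_below(values, cutoff):
    """Share of observations strictly below the cutoff."""
    return sum(v < cutoff for v in values) / len(values)


print(ordered)
print(smallest, largest)
print(proportion_below(ordered, 7))
```

Once the values are sorted, the smallest and largest measurements are simply the first and last entries.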

slide-5
SLIDE 5

FREQUENCY DISTRIBUTION TABLES

A frequency distribution table gives the number of observations falling into each class.

In qualitative data we count the number of observations in each category. These counts are called frequencies. They are also presented as relative percentages of the total number.

In quantitative data, frequencies can be counted by grouping the data into equal intervals and counting the frequency of events in each interval.

slide-6
SLIDE 6

GROUPING DATA

To group a set of observations, we select a set of contiguous, non-overlapping intervals, such that each value in the set of observations can be placed in one interval only, and no single observation is missed. Each such interval is called a CLASS INTERVAL.

slide-7
SLIDE 7

NUMBER OF CLASS INTERVALS

The number of class intervals should be neither too small, which loses important information, nor too large, which loses the summarization that grouping is meant to provide.

slide-8
SLIDE 8

NUMBER OF CLASS INTERVALS

When an a priori classification already exists for the observations, we can follow it. When there is no such classification, we can follow Sturges' rule.

slide-9
SLIDE 9

NUMBER OF CLASS INTERVALS

Sturges' rule: $k = 1 + 3.322 \log_{10} n$

  • k = number of class intervals
  • n = number of observations in the set
  • The result should not be regarded as final; modification is possible.
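
The rule can be sketched in a few lines of Python. The function name `sturges_classes` is a hypothetical helper, and rounding to the nearest whole number is an assumption; the slides only say the result may be modified.

```python
import math


def sturges_classes(n):
    """Suggested number of class intervals, k = 1 + 3.322 * log10(n), unrounded."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 + 3.322 * math.log10(n)


for n in (45, 57, 100):
    k = sturges_classes(n)
    # Show the raw value and the nearest whole number of classes.
    print(n, round(k, 2), round(k))
```

The unrounded value is returned so that the analyst can decide whether to move up or down a class.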

slide-10
SLIDE 10

WIDTH OF CLASS INTERVAL

The width of the class intervals should be the same, if possible.

$W = R / K$

  • W = width of the class interval
  • R = range (largest value – smallest value)
  • K = number of class intervals
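
A small sketch of the width calculation follows. The helper name `class_width` is hypothetical, and rounding the width up to a whole unit is an assumption. The result is a starting point only: depending on where the first class starts, an extra class may be needed to include the largest value.

```python
import math


def class_width(values, k):
    """Return (range, width), with width = range / k rounded up to a whole unit."""
    r = max(values) - min(values)
    return r, math.ceil(r / k)


# Only the extremes matter for the range, so two values are enough here.
print(class_width([1, 17], 6))
print(class_width([12, 79], 7))
```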

slide-11
SLIDE 11

RELATIVE FREQUENCY DISTRIBUTION

  • It gives the proportion of observations in a particular class interval relative to the total number of observations in the set.

slide-12
SLIDE 12

CUMULATIVE FREQUENCY DISTRIBUTION

  • This is calculated by adding the number of observations in each class interval to the number of observations in the class interval above it, starting from the second class interval onward.
slide-13
SLIDE 13

CUMULATIVE RELATIVE FREQUENCY DISTRIBUTION

This is calculated by adding the relative frequency in each class interval to the relative frequency in the class interval above, starting also from the second class interval onward.
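
The relative, cumulative and cumulative relative columns can be derived from the frequencies alone, as in the sketch below. The function name and the three sample frequencies are assumptions for illustration. The cumulative relative frequency is computed here from the cumulative counts, which gives the same result as adding relative frequencies but avoids accumulating rounding error.

```python
from itertools import accumulate


def frequency_columns(frequencies):
    """Return (relative %, cumulative, cumulative relative %) for a list of class frequencies."""
    total = sum(frequencies)
    cumulative = list(accumulate(frequencies))  # running total, class by class
    relative = [100 * f / total for f in frequencies]
    cumulative_relative = [100 * c / total for c in cumulative]
    return relative, cumulative, cumulative_relative


# Hypothetical frequencies for three class intervals.
relative, cumulative, cumulative_relative = frequency_columns([2, 5, 3])
print(relative)
print(cumulative)
print(cumulative_relative)
```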

slide-14
SLIDE 14

CUMULATIVE DISTRIBUTION

Cumulative frequency and cumulative relative frequency distributions make it easier to obtain the frequency or relative frequency within two or more contiguous class intervals.

slide-15
SLIDE 15

The following are the numbers of hours that 45 patients slept following the administration of a certain hypnotic drug:

10 7 7 1 7 2 3 10 12 12 5 7 8 3 4 11 3 1 5 8 5 13 7 1 3 4 3 17 17 10 3 4 4 4 11 5 7 7 8 5 8 8 1 8 13

slide-16
SLIDE 16

Construct a table showing:

  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Cumulative relative frequency distribution.
slide-17
SLIDE 17

Number of class intervals: $K = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10} 45 = 1 + 3.322 \times 1.653 \approx 6.49 \approx 6$

Width of class interval: $W = R / K = (17 - 1) / 6 \approx 2.7 \approx 3$

slide-18
SLIDE 18

CLASS INTERVAL (hour)   FREQUENCY   RELATIVE FREQUENCY %   CUMULATIVE FREQUENCY   CUM. REL. FREQUENCY %
1-3                     11          24.4                   11                     24.4
4-6                     10          22.2                   21                     46.6
7-9                     13          28.9                   34                     75.5
10-12                   7           15.6                   41                     91.1
13-15                   2           4.4                    43                     95.5
16-18                   2           4.4                    45                     99.9
Total                   45          99.9
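
The table can be reproduced from the raw data with a short script. This is a sketch that assumes integer-valued data, with classes starting at 1 and a width of 3 as chosen in the worked example; `frequency_table` is a hypothetical helper. It computes the cumulative percentages from the cumulative counts, so it prints 46.7, 75.6, 95.6 and 100.0, whereas the table accumulates already-rounded percentages and shows 46.6, 75.5, 95.5 and 99.9.

```python
# Hours slept by the 45 patients, as listed in the example.
hours = [10, 7, 7, 1, 7, 2, 3, 10, 12, 12, 5, 7, 8, 3, 4,
         11, 3, 1, 5, 8, 5, 13, 7, 1, 3, 4, 3, 17, 17, 10,
         3, 4, 4, 4, 11, 5, 7, 7, 8, 5, 8, 8, 1, 8, 13]


def frequency_table(values, start, width, k):
    """Rows of (class, frequency, relative %, cumulative frequency, cumulative relative %)."""
    n = len(values)
    rows, running = [], 0
    for i in range(k):
        low = start + i * width
        high = low + width - 1  # integer data: classes such as 1-3, 4-6
        freq = sum(low <= v <= high for v in values)
        running += freq
        rows.append((f"{low}-{high}", freq, round(100 * freq / n, 1),
                     running, round(100 * running / n, 1)))
    return rows


rows = frequency_table(hours, start=1, width=3, k=6)
for row in rows:
    print(*row, sep="\t")
```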

slide-19
SLIDE 19

The following are the weights (in ounces) of malignant tumours removed from the abdomens of 57 subjects:

slide-20
SLIDE 20

No. 1-10:    68 63 42 27 30 36 28 32 79 27
No. 11-20:   22 23 24 25 44 65 43 25 74 51
No. 21-30:   12 32 49 38 42 27 31 50 38 21
No. 31-40:   16 24 69 47 23 22 43 27 49 28
No. 41-50:   36 19 46 30 43 49 12 42 28 31
No. 51-57:   28 25 45 12 57 51 23

slide-21
SLIDE 21

Construct a table showing:

  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Cumulative relative frequency
slide-22
SLIDE 22

Number of class intervals: $K = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10} 57 = 1 + 3.322 \times 1.76 \approx 6.8 \approx 7$

Width of class interval: $W = R / K = (79 - 12) / 7 = 67 / 7 \approx 9.6 \approx 10$

slide-23
SLIDE 23

Class interval   Frequency   Cum. Freq.   Rel. Freq. %   Cum. Rel. Freq. %
10-19            5           5            8.77           8.77
20-29            19          24           33.33          42.10
30-39            10          34           17.54          59.64
40-49            13          47           22.81          82.45
50-59            4           51           7.02           89.47
60-69            4           55           7.02           96.49
70-79            2           57           3.51           100.00
TOTAL            57                       100.00

slide-24
SLIDE 24

Tabular presentation

Data are presented in tables so as to organize them into a compact, concise and readily comprehensible form. Tables can display the characteristics of data more efficiently than the raw data.

slide-25
SLIDE 25

Types

  • Simple table: includes one variable (quantitative or qualitative) and the corresponding frequency.
  • Cross tabulation (two-dimensional table): two variables are cross-classified.
  • Contingency table: demonstrates the relationship between two or more variables.

slide-26
SLIDE 26

Graphical and Pictorial presentation of data

  • The use of diagrams or pictures to display the distribution or characteristics of one or more sets of data in a compact and readily comprehensible form.
  • They can provide a better visual appreciation of the characteristics of data than tabular presentation.

slide-27
SLIDE 27

Graphs

  • A graph is a pictorial display of quantitative data using a coordinate system, where X is the horizontal axis and Y is the vertical axis.
  • The X-axis usually carries the independent variable (the method of classification).
  • The Y-axis carries the dependent variable (frequency, relative frequency or another indicator).

slide-28
SLIDE 28

Stem-and-Leaf Plot

  • Summarizes quantitative data.
  • Each data point is broken down into a “stem” and

a “leaf.”

  • First, “stems” are aligned in a column.
  • Then, “leaves” are attached to the stems.
slide-29
SLIDE 29

Stem-and-Leaf Plot

Stem-and-leaf of Shoes  N = 139
Leaf Unit = 1.0

  12   0  223334444444
  63   0  555555555555566666666677777778888888888888999999999
 (33)  1  000000000000011112222233333333444
  43   1  555555556667777888
  25   2  0000000000023
  12   2  5557
   8   3  0023
   4   3
   4   4  00
   2   4
   2   5  0
   1   5
   1   6
   1   6
   1   7
   1   7  5
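
A simplified sketch of how such a plot is built: each value is split into a stem (tens digit) and a leaf (units digit), and the leaves are attached to their stems in sorted order. The helper `stem_and_leaf` and the sample values are assumptions. Unlike the output above, this version uses one row per stem and does not print the depth column.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Map each stem to the sorted string of its leaves, including empty stems."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v // leaf_unit, 10)
        stems[stem].append(str(leaf))
    # Fill in stems with no leaves so gaps in the data stay visible.
    return {s: "".join(stems.get(s, [])) for s in range(min(stems), max(stems) + 1)}


# Hypothetical sample of nine values.
plot = stem_and_leaf([12, 5, 23, 9, 15, 10, 4, 30, 18])
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")
```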

slide-30
SLIDE 30

Histogram

  • Graphical display of the frequency distribution of a quantitative variable.
  • The values of the quantitative variable (as class intervals) are placed on the X-axis (representing the width of the rectangles), and the corresponding frequencies (or relative frequencies) are placed on the Y-axis (representing the height of the rectangles).

slide-31
SLIDE 31

Histogram

  • When the class intervals are equal, the area of each rectangle is proportional to its height, so the frequencies in different classes can be compared directly by examining the relative heights of the respective bars.
  • It is important that the class intervals are equal; otherwise the areas, not the heights, should be compared.
  • Only one set of data can be shown in one histogram.
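
One common way to keep areas comparable when class intervals are unequal is to plot frequency density (frequency divided by class width) as the bar height. The sketch below illustrates this. The function name and the sample classes are assumptions, and the slides do not prescribe this particular method.

```python
def frequency_density(classes):
    """classes: list of (lower boundary, upper boundary, frequency).

    Returns the bar height for each class so that area = height * width = frequency.
    """
    return [f / (upper - lower) for lower, upper, f in classes]


# Hypothetical classes: the last one is three times as wide as the others.
classes = [(0, 10, 20), (10, 20, 30), (20, 50, 30)]
heights = frequency_density(classes)
print(heights)
```

The second and third classes have the same frequency, but the wider class gets a bar one third as tall, so its area still represents 30 observations.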

slide-32
SLIDE 32
slide-33
SLIDE 33

Frequency Polygon

  • Another form of graphical presentation of the frequency distribution of quantitative variables.
  • It is similar to the histogram, but instead of using rectangles to present the data, the midpoints of the tops of the rectangles are plotted and connected by straight lines.

slide-34
SLIDE 34

Frequency Polygon

  • More than one set of data can be

demonstrated on the same graph, to facilitate direct comparison.

  • It provides information about underlying

characteristics of data .

  • The area under the frequency polygon is

equal to the area under the equivalent histogram
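
The equal-area property can be checked numerically. The sketch below uses the frequencies from the sleep example (classes 1-3 to 16-18, taken as boundaries 0.5 to 18.5) and assumes the usual convention of closing the polygon on the axis at the midpoint of an empty class added at each end; the slides do not state that convention. Both helper names are hypothetical.

```python
def polygon_points(start, width, frequencies):
    """Vertices of a frequency polygon for equal-width classes.

    start is the lower boundary of the first class. The polygon is closed on
    the axis at the midpoints of an empty class added before and after the data.
    """
    mids = [start + width * (i + 0.5) for i in range(len(frequencies))]
    return [(mids[0] - width, 0)] + list(zip(mids, frequencies)) + [(mids[-1] + width, 0)]


def area_under(points):
    """Trapezoidal area under the straight segments joining consecutive points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


freqs = [11, 10, 13, 7, 2, 2]
points = polygon_points(0.5, 3, freqs)
polygon_area = area_under(points)
histogram_area = 3 * sum(freqs)  # each rectangle is width * frequency
print(polygon_area, histogram_area)
```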

slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38

Scatter diagram

  • A pair of measurements is plotted as a single point on a graph.
  • The value of one variable of each pair is plotted on the X-axis and the value of the other variable is plotted on the Y-axis.
slide-39
SLIDE 39

Scatter diagram

  • The pattern made by the plotted points indicates the relationship between the two variables, which may be linear (if the points follow a straight line) or curvilinear (if the pattern does not follow a straight line).

slide-40
SLIDE 40

Scatter diagram

A scatter diagram could suggest:

  • No relationship: one variable changes with no change in the other variable, or the pattern is bizarre (irregular).
  • Linear relationship: an increase in the first variable is associated with an increase (positive) or a decrease (negative) in the second variable, and the pattern follows a straight line.
  • Curvilinear (positive or negative) relationship: the pattern of increase or decrease does not follow a straight line.

slide-41
SLIDE 41

[Figure: scatter diagram showing the correlation of two methods of cardiac output measurement; both axes in l/min.]

slide-42
SLIDE 42

Bar chart

  • Used to present discrete or qualitative data
  • It includes separated bars of equal width
  • The method of classification of the variable is

usually placed on the X-axis, and the Y-axis usually represents the corresponding frequency or relative frequency.

slide-43
SLIDE 43

Bar chart

  • It can be used to present more than one set of data simultaneously, using different colors or shades. In this case a key should be included.
  • Comparison is made on the basis of the height of the bar (frequency); the width of the bar carries no meaning.
  • It is important that the vertical axis starts at zero; otherwise the heights of the bars are not proportional to the frequencies.

slide-44
SLIDE 44

Estimated Direct and Indirect Costs of Cardiovascular Diseases and Stroke, United States: 2005

[Figure: bar chart, in billions of dollars. Heart Disease 254.8; Coronary Heart Disease 142.1; Stroke 56.8; Hypertensive Disease 59.7; Congestive Heart Failure 27.9; Total CVD* 393.5.]

Source: Heart Disease and Stroke Statistics – 2005 Update.

slide-45
SLIDE 45

Leading Causes of Death for All Males and Females, United States: 2002

[Figure: bar chart, deaths in thousands. Males: A 434, B 289, C 69, D 61, E 34. Females: A 494, B 269, D 64, F 42, E 39.]

Key: A Total CVD (Preliminary); B Cancer; C Accidents; D Chronic Lower Respiratory Diseases; E Diabetes Mellitus; F Alzheimer’s Disease.

Source: CDC/NCHS

slide-46
SLIDE 46

Fig 3: Distribution of unvaccinated children below one year by governorates

[Figure: bar chart. Horizontal axis: governorates (Baghdad, Anbar, Babylon, Wassit, Basrah, Ninevah, Missan, Qadisiya, Diyala, Kerbala, Taamem, Muthana, Thi qar, Najaf, Salah Al Din, Suleimaniya, Erbil, Duhok, Total). Vertical axis: % of unvaccinated children (0% to 35%).]

slide-47
SLIDE 47

Component bar chart

  • It is a type of chart based on proportions.
  • It uses bars that are either shaded or colored to show the relative contribution of each component to the whole bar.
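
The segment sizes of a component bar chart are the percentages that each component contributes to its bar. The sketch below computes them for the meningitis counts used in the exercise later in this deck; the function name `component_percentages` is a hypothetical helper.

```python
# Meningitis cases by agent and sex, from the exercise later in the deck.
cases = {
    "Viral": {"Male": 168, "Female": 84},
    "Bacterial": {"Male": 84, "Female": 42},
    "TB": {"Male": 21, "Female": 21},
}


def component_percentages(table):
    """For each bar, the percentage contributed by each of its components."""
    result = {}
    for category, parts in table.items():
        total = sum(parts.values())
        result[category] = {name: round(100 * count / total, 1)
                            for name, count in parts.items()}
    return result


shares = component_percentages(cases)
for category, parts in shares.items():
    print(category, parts)
```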

slide-48
SLIDE 48

Fig 9: Reasons for non-vaccination among unvaccinated children by governorates

[Figure: component bar chart, 0% to 100%. Horizontal axis: governorates (Baghdad, Babylon, Basrah, Missan, Diyala, Taamem, Thi qar, Salah Al din, Erbil, Total). Components: % not visited by vaccination team, % child absent, % other causes.]

slide-49
SLIDE 49

Distribution of Hypertension Subtype in the Untreated Hypertensive Population in NHANES III by Age

[Figure: component bar chart. Horizontal axis: age (y): <40, 40-49, 50-59, 60-69, 70-79, 80+. Vertical axis: frequency of hypertension subtypes in all untreated hypertensives (%), 0 to 100. Components: ISH (SBP ≥140 mm Hg and DBP <90 mm Hg), SDH (SBP ≥140 mm Hg and DBP ≥90 mm Hg), IDH (SBP <140 mm Hg and DBP ≥90 mm Hg). Numbers at the top of the bars give the overall percentage distribution of untreated hypertension by age: 17%, 16%, 16%, 20%, 20%, 11%.]

Franklin et al. Hypertension 2001;37:869-874.

slide-50
SLIDE 50

[Figure: component bar charts of annual mortality rate per 100 000 by study population (labelled by codes such as POL-WAR, UNK-GLA, CHN-BEI), split into CHD, stroke, other CVD and non-CVD. Separate panels for men (axis 0 to 2000) and women (axis 0 to 1000).]

slide-51
SLIDE 51

Distribution of coronary risk factors among patients with chronic metabolic syndrome

[Figure: bar chart of relative frequency (%), axis 0 to 100. Categories as extracted: Hypertension, Diabetes Mellitus, Family history of ischemic Heart Di..., Smoking habit, Dyslipidemia, Obesity. Bar values as extracted: 48.8, 27.5, 53.8, 66.3, 93.1, 17.5.]

slide-52
SLIDE 52

Pictograms

A pictogram uses a series of small identifying symbols to present the data. Each symbol represents a fixed number of units.

slide-53
SLIDE 53
slide-54
SLIDE 54

Pie chart

  • It is a type of chart based on proportions.
  • It uses wedge-shaped portions of a circle to illustrate the relative contribution of each part to the total (division of the whole into segments).

slide-55
SLIDE 55

Pie chart

  • To determine the angle of each wedge, multiply the relative frequency of each division by 360 degrees.
  • Start at 12 o’clock.
  • It is preferable to arrange the segments in order of their magnitude (starting with the largest) and to proceed clockwise around the chart.
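
The angle calculation can be sketched as follows, using the TB counts from the exercise later in this deck (360, 240 and 200 cases out of 800). The function name `wedge_angles` is a hypothetical helper.

```python
def wedge_angles(counts):
    """Return (label, angle in degrees) pairs, largest wedge first.

    Each angle is the relative frequency of the category multiplied by 360.
    """
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(label, 360 * count / total) for label, count in ordered]


tb_cases = {"Extra PTB": 200, "Smear +ve PTB": 360, "Smear -ve PTB": 240}
angles = wedge_angles(tb_cases)
for label, angle in angles:
    print(label, angle)
```

Sorting in descending order matches the recommendation to start with the largest segment at 12 o’clock and proceed clockwise.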

slide-56
SLIDE 56

Percentage Breakdown of Deaths From Cardiovascular Diseases, United States: 2002 Preliminary

[Figure: pie chart. Segments as extracted: Coronary Heart Disease, Stroke, Congestive Heart Failure, High Blood Pressure, Diseases of the Arteries, Rheumatic Fever/Rheumatic Heart Disease, Congenital Cardiovascular Defects, Other. Percentage labels as extracted: 18%, 6%, 5%, 4%, 0%, 0%, 13%, 53%.]

Source: CDC/NCHS.

slide-57
SLIDE 57

Most Myocardial Infarctions Are Caused by Low-Grade Stenoses

Pooled data from 4 studies: Ambrose et al, 1988; Little et al, 1988; Nobuyoshi et al, 1991; and Giroud et al, 1992. (Adapted from Falk et al.)

Falk E et al, Circulation, 1995.

slide-58
SLIDE 58

Box Plot

  • Summarizes quantitative data.
  • Vertical (or horizontal) axis represents measurement

scale.

  • Lines in box represent the 25th percentile (“first

quartile”), the 50th percentile (“median”), and the 75th percentile (“third quartile”), respectively.
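
The three lines in the box can be computed with Python's standard library, as sketched below for the 45 sleep values from the earlier frequency-table example. Quartile conventions differ between textbooks and software; this sketch uses the default ("exclusive") method of `statistics.quantiles`, which is an assumption and may differ slightly from other packages.

```python
import statistics

# Hours slept by the 45 patients in the earlier frequency-table example.
hours = [10, 7, 7, 1, 7, 2, 3, 10, 12, 12, 5, 7, 8, 3, 4,
         11, 3, 1, 5, 8, 5, 13, 7, 1, 3, 4, 3, 17, 17, 10,
         3, 4, 4, 4, 11, 5, 7, 7, 8, 5, 8, 8, 1, 8, 13]

# n=4 returns the three cut points: first quartile, median, third quartile.
q1, median, q3 = statistics.quantiles(hours, n=4)
print(q1, median, q3)
```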

slide-59
SLIDE 59

Box and whisker plot

[Figure: annotated box-and-whisker plot. Labelled parts: box (lower quartile, median, upper quartile), whiskers (smallest and largest non-outlying values), and outlying values (outlying value, extreme outlying value).]

slide-60
SLIDE 60

Box Plot

[Figure: box plot of the amount of sleep in the past 24 hours of Spring 1998 Stat 250 students; axis from 1 to 10.]
slide-61
SLIDE 61

Map charts

  • These are used to present the geographical

distribution of one or more sets of data

slide-62
SLIDE 62

[Figure: map charts for men showing change in coronary event rate, change in MONICA CHD mortality, and change in case fatality. Legend: significant increase, insignificant change, significant decrease.]

slide-63
SLIDE 63

Suggestions for the design and use of tables, graphs, and charts

  • Choose the method most effective for the data and the purpose.
  • Point out one idea at a time.
  • Limit the amount of data and include one kind of data in each presentation.
  • Use adequate, properly located titles and labels.
  • Mention the source if the data are not yours.
  • Take care and caution when proposing conclusions.
slide-64
SLIDE 64

Exercise

  • The following are the DBP measurements (mmHg) of 60 individuals. Make a suitable graphical or pictorial presentation.

DBP (mmHg)   No.
65-69        3
70-74        5
75-79        9
80-84        18
85-89        13
90-94        9
95-99        3
Total        60

slide-65
SLIDE 65

DBP (mmHg) of 60 men

[Figure: chart of the number of men (vertical axis: No., 0 to 20) in each DBP class from 65-69 to 95-99.]

slide-66
SLIDE 66

Exercise

  • The following are the proportions of the ten commonest cancers in Iraq, 1995.
  • Make a suitable graphical or pictorial presentation.

Primary site            % of total CA
Breast                  14.3
Bronchus & lung         11.2
Urinary Bladder         7.4
Non-Hodgkin Lymphoma    6.2
Larynx                  5.9
Leukemia                5.2
Brain & other CNS       4.8
Skin                    4.3
Stomach                 3.6
Hodgkin Lymphoma        3.0

slide-67
SLIDE 67

Commonest 10 Ca in Iraq

[Figure: chart of % of total CA (vertical axis, 0 to 16) by CA site (horizontal axis), from Breast to Hodgkin Lymphoma.]

slide-68
SLIDE 68

Exercise

  • The following is the distribution of TB cases registered in City X.
  • Make a suitable graphical or pictorial presentation.

Type of TB       No.
Smear +ve PTB    360
Smear –ve PTB    240
Extra PTB        200
Total            800

slide-69
SLIDE 69

Types of TB

[Figure: chart of TB cases by type: Smear +ve PTB, Smear –ve PTB, Extra PTB.]

slide-70
SLIDE 70

Exercise

  • The following is the distribution of meningitis cases, Ibn Al-Khateeb Hospital, 1999.
  • Make a suitable graphical or pictorial presentation.

Agent        Male No.   Female No.   Total
Viral        168        84           252
Bacterial    84         42           126
TB           21         21           42
Total        273        147          420

slide-71
SLIDE 71

Meningitis cases by type and sex

[Figure: chart with a 0% to 100% axis showing each type (Viral, Bacterial, TB) split into two series (Series1, Series2).]