Chapter 4.1 Scatter Diagrams and Linear Correlation
Learning Objectives
At the end of this lecture, the student should be able to:
• Explain what a scattergram is and how to make one
• State what “strength” and “direction” mean with respect to correlations
• Compute correlation coefficient r using the computational formula
• Describe why correlation is not necessarily causation
Introduction
• Making a scatter diagram
• Correlation coefficient r
• Causation and lurking variables
Photograph provided by Dr. John Bollinger
Scattergram Also called Scatter Plots
Scattergrams Graph x,y Pairs
• Explanatory (independent) variable is called x
• Graphed on x-axis
• Response (dependent) variable is called y
• Graphed on y-axis
• Trick to memorizing: x → y, x comes before y, so x “causes” y.
• Scatter diagram is a graph of these x,y pairs
[Figure: empty grid with the x axis and y axis each running from 0 to 8]
Scattergrams Graph x,y Pairs
Does the number of diagnoses a patient has correlate with the number of medications s/he takes?
x (# of dx)    y (# of meds)
1              3
3              5
4              4
7              6
[Figure: scatter diagram of these four pairs, plotted one point at a time; x axis: Number of Diagnoses, y axis: Number of Medications, both running from 0 to 8]
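To make the plotting step concrete, here is a minimal sketch that draws the four diagnosis/medication pairs from the slide as a text scatter diagram. The helper name `ascii_scatter` and the choice of a plain-text grid are assumptions made for illustration; the lecture itself draws the plot by hand on a 0 to 8 grid.

```python
# Pairs from the slide: x = number of diagnoses, y = number of medications.
pairs = [(1, 3), (3, 5), (4, 4), (7, 6)]


def ascii_scatter(points, size=8):
    """Return a text scatter diagram on a 0..size grid (hypothetical helper)."""
    marked = set(points)
    rows = []
    for y in range(size, -1, -1):  # top row holds the largest y value
        cells = ["*" if (x, y) in marked else "." for x in range(size + 1)]
        rows.append(f"{y} | " + " ".join(cells))
    # Bottom row: x-axis tick labels, aligned under the grid columns.
    rows.append("    " + " ".join(str(x) for x in range(size + 1)))
    return "\n".join(rows)


plot = ascii_scatter(pairs)
print(plot)
```

Each `*` sits where its x value (read along the bottom) meets its y value (read along the left edge), which is the same rule used when plotting the points by hand.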
Linear Correlation
• Linear correlation means that when you make a scatterplot of x,y pairs, it looks kind of like a line
• “Perfect” linear correlation looks like graphing points in algebra
x    y
1    2
2    4
3    6
4    8
[Figure: these four pairs plotted on a grid running from 0 to 8 on both axes]
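One way to check the “perfect” case numerically is to confirm that the slope between every pair of neighboring points is the same. The sketch below does this for the four pairs on the slide. The function names are hypothetical, and the check assumes the x values are all different and that the shared slope is not zero (a flat line is treated as no correlation later in the lecture).

```python
from fractions import Fraction

# Pairs from the slide.
pairs = [(1, 2), (2, 4), (3, 6), (4, 8)]


def slopes(points):
    """Slope between each consecutive pair of points, as exact fractions."""
    return [
        Fraction(y2 - y1, x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


def is_perfectly_linear(points):
    """True when every consecutive slope is identical and nonzero.

    Assumes the points are listed with distinct x values.
    """
    distinct = set(slopes(points))
    return len(distinct) == 1 and distinct.pop() != 0


print(slopes(pairs))
print(is_perfectly_linear(pairs))
```

Here every slope equals 2, so the points all sit on the line y = 2x. Running the same check on the diagnosis/medication pairs (1,3), (3,5), (4,4), (7,6) returns False, because those points only roughly follow a line.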
Facts About Linear Correlation
• The line can go up. This is a positive correlation. (Example plot: Number of Medications against Number of Diagnoses)
• The line can go down. This is a negative correlation. (Example plot: Number of Nurses Staffed on Shift against Number of Patient Complaints)
• The line can be flat (horizontal). This is no correlation. (Example plot: Days Spent in Hospital against Total Unique Visitors)
• The line can be goofy. This is also no correlation. (Example plot: Number of Books against Number of Games)
Correlation Has Two Attributes
Direction
• Positive correlation
• Negative correlation
• No correlation
Strength
• Strength refers to how close to the line all the dots fall.
• If they fall really close to the line, it is strong
• If they fall kind of close to the line, it is moderate
• If they aren’t very close to the line, it is weak
[Figures: example scatter diagrams labeled Strong Negative, Strong Positive, Moderate Positive, and Weak Positive; the last one carries the callout “Hey, what’s that?? Outlier!”]
Outliers in Correlation
• Outliers can have a very powerful effect on a correlation
• An outlier in any of the 4 corners of the plot can really affect the direction of the line
• An outlier can also change the correlation from strong or moderate to weak
• It’s good to look at a scatterplot to make sure you identify outliers
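The effect of a corner outlier can be illustrated with a small experiment. The sketch below reuses the four diagnosis/medication pairs from the scattergram slides and adds one made-up point, (8, 0), in the bottom-right corner. That extra point is an assumption for illustration only and does not come from the lecture. The strength is measured with the correlation coefficient r, which the lecture introduces in the next section; the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(points):
    """Sample correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_y2 = sum(y * y for _, y in points)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Pairs from the scattergram slides.
original = [(1, 3), (3, 5), (4, 4), (7, 6)]
# Hypothetical outlier in the bottom-right corner (not from the lecture).
with_outlier = original + [(8, 0)]

r_without = pearson_r(original)
r_with = pearson_r(with_outlier)
print(round(r_without, 2), round(r_with, 2))
```

With these numbers, r moves from roughly 0.88 to roughly -0.26, so a single point in a corner is enough to flip the direction and weaken the correlation. This is why the slide recommends looking at the scatterplot before trusting the number.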
Correlation Coefficient r Putting a Number on Correlation
Correlation Coefficient r
• Remember “coefficient” from CV (coefficient of variation)?
• Coefficient just means a number
• r stands for the sample correlation coefficient
• Remember! Corrrrrrrrrrrrrrrrrrelation
• Population correlation coefficient = ρ (rho)
• We will only focus on r
What is r?
What it is
• A numerical quantification of how correlated a set of x,y pairs are
• Calculated from plugging x,y pairs into an equation
• Has a defining formula and a computational formula
• I will demonstrate computational formula
How to interpret it
• The r calculation produces a number
• The lowest number possible is -1.0
• Perfect negative correlation
• The highest possible number is 1.0
• Perfect positive correlation
• All others are in-between
Examples of Negative r
[Figures: example scatter diagrams with r = -0.25, r = -0.70, and r = -0.44]
OPINION!!! For negative correlations:
• 0.0 to -0.40: Weak
• -0.40 to -0.70: Moderate
• -0.70 to -1.0: Strong
Examples of Positive r
[Figures: example scatter diagrams with r = 0.66 and r = 0.92]
OPINION!!! For positive correlations:
• 0.0 to 0.40: Weak
• 0.40 to 0.70: Moderate
• 0.70 to 1.0: Strong
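The rule-of-thumb ranges above can be written as a small lookup. This is only a sketch of the lecturer's stated opinion, not a standard. The slides list 0.40 and 0.70 in two ranges each, so the code assumes that a value sitting exactly on a boundary goes to the stronger category; the function name `describe_r` is hypothetical.

```python
def describe_r(r):
    """Label r using the lecture's rule-of-thumb ranges.

    Assumption: boundary values (0.40, 0.70) go to the stronger category,
    because the slides list each boundary in two ranges.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1.0 and 1.0")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= 0.70:
        strength = "strong"
    elif size >= 0.40:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


# The example values shown on the two slides.
for value in (-0.25, -0.44, -0.70, 0.66, 0.92):
    print(value, describe_r(value))
```

Under that boundary assumption, the slide's r = -0.70 example is labeled strong negative and r = 0.66 is labeled moderate positive.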
Calculating r Computational Formula
Computational Formula
• FLASHBACK! …to Chapter 3.2
• Notice all the Σ’s
• As before, we will
  • make columns
  • make calculations
• Then add up the columns to get these Σ’s
$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} $$
Hypothetical Scenario
• We have 7 patients
• They have come to the clinic for appointments throughout the year.
• We predict those with a higher diastolic blood pressure (DBP) will have more appointments
• We take DBP at last appointment as “x”
• We take number of appointments over the year as “y”
x=DBP, y=# of Appointments
$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} $$
#    x      y     x²    y²    xy
1    70     3
2    115    45
3    105    21
4    82     7
5    93     16
6    125    62
7    88     12
Σx = 678    Σy = 166
Slide callouts: “NOT!”; “Σ xy will go here”
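The remaining columns and sums can be worked through in code. The sketch below follows the computational formula column by column for the seven patients on the slide. The variable names are assumptions chosen for readability, and the values for the x², y², and xy sums are computed here rather than taken from the slides, which leave those columns blank at this point.

```python
from math import sqrt

# Data from the slide: x = DBP, y = number of appointments.
dbp = [70, 115, 105, 82, 93, 125, 88]
appointments = [3, 45, 21, 7, 16, 62, 12]

n = len(dbp)

# Column totals (the Σ's in the formula).
sum_x = sum(dbp)
sum_y = sum(appointments)
sum_x2 = sum(x * x for x in dbp)
sum_y2 = sum(y * y for y in appointments)
sum_xy = sum(x * y for x, y in zip(dbp, appointments))

# Computational formula for r.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print("sum x =", sum_x, " sum y =", sum_y)
print("sum x^2 =", sum_x2, " sum y^2 =", sum_y2, " sum xy =", sum_xy)
print("r =", round(r, 3))
```

The totals for x and y match the 678 and 166 shown on the slide. Note that Σxy is the total of the xy column (multiply each pair first, then add), which is a different number from (Σx)(Σy), the product of the two column totals. For this data r comes out to about 0.949, a strong positive correlation by the lecture's rule of thumb.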