Causality in Biomedicine Lecture Series: Lecture 2 Ava Khamseh - - PowerPoint PPT Presentation
Causality in Biomedicine Lecture Series: Lecture 2 Ava Khamseh - - PowerPoint PPT Presentation
Causality in Biomedicine Lecture Series: Lecture 2 Ava Khamseh (Biomedical AI Lab) IGMM & School of Informatics 30 Oct, 2020 <latexit
Last time: Observational data, what goes wrong?
p(x|t = 1) 6= p(x|t = 0)
<latexit sha1_base64="WBS6W4REKXQyW0MTnj/H3EUrYU=">AB/3icdVDLSgMxFM3UV62vUcGNm2AR2k3J1GrHhVB047KCfUBbSiZN29BMZkwyYm78FfcuFDErb/hzr8x01ZQ0QMXTs65l9x7vJAzpRH6sBILi0vLK8nV1Nr6xuaWvb1TVUEkCa2QgAey7mFORO0opnmtB5Kin2P05o3uIj92i2VigXiWg9D2vJxT7AuI1gbqW3vhZm7sT5zsrAp6A2cvVC2badRDqFC8QRBQ05d1y0akj9GBYSgY6wYaTBHuW2/NzsBiXwqNOFYqYaDQt0aYakZ4XSakaKhpgMcI82DBXYp6o1mu4/gYdG6cBuIE0JDafq94kR9pUa+p7p9LHuq9eLP7lNSLdVsjJsJIU0FmH3UjDnUA4zBgh0lKNB8agolkZldI+lhiok1kKRPC16Xwf1LN5yjXP6qkC6dz+NIgn1wADLAUVQApegDCqAgDF4AE/g2bq3Hq0X63XWmrDmM7vgB6y3TxImlNw=</latexit>treatment Control Age
✓Z y1(x)p(x|t = 1)dx Z y0(x)p(x|t = 0)dx ◆ 6= Z y1(x) y0(x)
- p(x)dx
Simpson’s Paradox
- Why concluding causality from purely associational measures, i.e.
correlation, can be very wrong (not just neutral): “It would have better not to make any statements!”
Causal Inference in Statistics, Pearl (2016)
Simpson’s Paradox
- Why concluding causality from purely associational measures, i.e.
correlation, can be very wrong (not just neutral): “It would have better not to make any statements!”
Causal Inference in Statistics, Pearl (2016)
Potential Outcomes Assumptions (Rubin)
- Consistency: The observed outcome is independent of how the
treatment is assigned
- Unconfoundedness: Treatment assignment is random, given
covariants X
- Positivity: Every individual has a non-zero chance of receiving
the treatment/control
p(t = 1|x) ∈ (0, 1) if P(x) > 0
<latexit sha1_base64="fG9hr1wb+mxWBc3sl+X7Dh9wJ0=">ACEHicdZDNSgMxFIUz/tb6V3XpJljEClIytdq6UEQ3LitYFTqlZNKMhmYyQ3JHWsY+ghtfxY0LRdy6dOfbmGoFb0Q+DjnXm7u8WMpDBDy5oyMjo1PTGamstMzs3PzuYXFUxMlmvE6i2Skz31quBSK10GA5Oex5jT0JT/zO4cD/+yKayMidQK9mDdDeqFEIBgFK7Vya3EBdt3r7jr2hMIFsuFaAt6F1Mi8HAf1wrd9T3SyuVJkZByZtgCzvVarViobRFyoRg1qDyqNh1Vq5V68dsSTkCpikxjRcEkMzpRoEk7yf9RLDY8o69I3LCoactNMPw7q41WrtHEQafsU4A/1+0RKQ2N6oW87QwqX5rc3EP/yGgkE1WYqVJwAV+xzUZBIDBEepIPbQnMGsmeBMi3sXzG7pJoysBlmbQhfl+L/4bRUdDeLpeNyfv9gGEcGLaMVEAuqB9dIRqI4YukF36AE9OrfOvfPkPH+2jDmSX0o5yXd/K0mg=</latexit>T Y X
Average treatment effect:
τ = ˆ E[τ (i)] = ˆ E[y(i)
1
− y(i)
0 ] = 1
N
N
X
i=0
⇣ y(i)
1
− y(i) ⌘
<latexit sha1_base64="WbTWK3I0Vx5XeT+HXSWxJjiZj3c=">ACanicdVFdSxwxFM2MbdW12lVBKb6EboX1oUtmXd31QZCWQp+Kha4KO+OQyWZ2g5kPkjvCEObBv+ibv6Av/RFm3K3Yai8ETs65h5t7EuVSaCDkznEXr1+s7i03Fh5u7r2rm+cazQjE+ZJnM1EVENZci5UMQIPlFrjhNIsnPo6svtX5+zZUWfoTypwHCZ2kIhaMgqXC5o0PtDj2pxSMn1CYRpH5WlWjmr0bFXBc/FMvRm2qcyJI9dsaLMeJX5Xvm6SEIjkl1Wd8kj6H9gsdXYjKFvbDZIh1Cev1Dgi04GgwGfQu6B6RHCPasVFcLzes0bN764wVCU+BSar1yCM5BIYqEzyquEXmueUXdEJH1mY0oTrwDxEVeFdy4xnCl7UsAP7FOHoYnWZRLZznpj/a9Wky9powLiQWBEmhfAUzYbFBcSQ4br3PFYKM5AlhZQpoR9K2ZTajMD+zsNG8KfTfH/wVm34+13uj96rZP8ziW0A76gNrIQ310gr6hUzREDP1yVp0tZ9v57W64792dWavrzD2b6K9yP94DMpK7Lg=</latexit>Causal Inference
Overview of the course
Causal Effect Estimation Casual Discovery
Obsv confounders
Unobsv confounders
Regression Adjustment Propensity score IV Front- door criterion Constraint- based Score- based FCM
Rubin Rubin, Pearl
- Estimating causal effects
- Randomised trial vs observational data
Modern ML
Causal inference with observed confounders
T Y X
Regression Adjustment
- X is a sufficient set of confounders if conditioning on X, there would
be no confounding bias
- For individual (i) there is only one observed outcome:
- Would like to estimate (infer) counterfactual:
- Using a design matrix, fit:
- Assumptions: Overlap and additivity
T = 1 1 .. .. 1 1
<latexit sha1_base64="V+Taq8YIMQSZsckmKN+RsAVF0eI=">ACMnicdZBNSwMxEIazflu/qh69BIviaUlasXoQRC96q2Ct0C0lm05rMJtdkqxYlv4mL/4SwYMeFPHqjzCtVaroQJiHeWfIzBsmUhLyKM3Nj4xOTU9M5ubm19YXMovr5ybONUcqjyWsb4ImQEpFStsBIuEg0sCiXUwqujvl67Bm1ErM5sN4FGxDpKtAVn1pWa+ZOz/SCEjlBZEjGrxU0vR/EmJkHwnX3fge87Ig7oSA5Atb7nmvkC8QnZIztl7GCvVNwuOSjSHVqmDqpHwU0jEozfx+0Yp5GoCyXzJg6JYltZExbwSX0ckFqIGH8inWg7lCxCEwjG5zcwxu0sLtWLunLB5URycyFhnTjULX6fa7NL+1fvEvrZ7a9m4jEypJLSj+VE7ldjGuO8fbgkN3MquA8a1cLtifsk049a5nHMmfF2K/4fzok9LfvF0u3BwOLRjBq2hdbSFKCqjA3SMKqiKOLpFD+gZvXh3pP36r19to5w5lV9CO89w+J0aRT</latexit>Ctrl Drug
X = 1 1 .. .. 1 1
<latexit sha1_base64="IrSAICuzpW+6WAEf9p0DoZSw78=">ACMnicdVBNaxsxENUmaZNsP+Kmx1xETEtPi2QbOzkUQntJbynEH+A1RiuPbRGtdpG0pWbxb8olv6TQ3toKLnmR3T8kdKadkDM4715aOYluVbOM/Yt2NrefR4d28/fPL02fODyovDjsK6EtM53ZXiIcaGWg7ZX0MstiDTR0E2u3i/07iewTmXm0s9yGKRiYtRYSeGRGlY+9N7GCUyUKfNUeKs+z0NOX1MWxyHDzrFHEYIoQrSpxGBGv3DSpVFjJ2yZosiOK3XGnUENd7kLU45SouqknVdDCtf4lEmixSMl1o41+cs94NSWK+khnkYFw5yIa/EBPoIjUjBDcrlyXP6CpkRHWcWn/F0yf7pKEXq3CxNcBL3m7pNbUH+S+sXfnwyKJXJCw9Grj4aF5r6jC7yoyNlQXo9QyCkVbgrlVNhfSYcoghPFxK/w86tYjXo9rHRvXs3TqOPXJEjskbwkmLnJFzckHaRJr8pX8ILfBTfA9+BncrUa3grXnJfmrgvtfkQGkVw=</latexit>Young Old
Y = XX + T T + ✏
<latexit sha1_base64="lWIDvoFMuna1T+j4f7ZbYD08=">ACXicdVDLSgMxFM3UV62vqks3wSIhTLTltYuhKIblxX6GlLyaS3bWjmQZIRyjBbN/6KGxeKuPUP3Pk3ZtoKnogcDjnXG7ucQLOpDLNDyO1srq2vpHezGxt7+zuZfcP2tIPBYUW9bkvbIdI4MyDlmKgx0IK7DoeNMLxO/cwtCMt9rqlkAfZeMPTZilCgtDbL45rzngCKDyI7t/JI242a+B4FkPInkzIJp1sxKFWtSKxXLJU2KVsWqWtjSVoIcWqIxyL73hj4NXfAU5UTKrmUGqh8RoRjlEGd6oYSA0CkZQ1dTj7g+9H8khifaGWIR7Qz1N4rn6fiIgr5cx1dNIlaiJ/e4n4l9cN1eisHzEvCBV4dLFoFHKsfJzUgodMAFV8pgmhgum/YjohglCly8voEr4uxf+TdrFglQrF63KufrGsI42O0DE6Raqojq6Qg3UQhTdoQf0hJ6Ne+PReDFeF9GUsZw5RD9gvH0CvjmaZA=</latexit>τ = ˆ E[τ (i)] = ˆ E[y(i)
1
− y(i)
0 ] = 1
N
N
X
i=0
⇣ y(i)
1
− y(i) ⌘
<latexit sha1_base64="WbTWK3I0Vx5XeT+HXSWxJjiZj3c=">ACanicdVFdSxwxFM2MbdW12lVBKb6EboX1oUtmXd31QZCWQp+Kha4KO+OQyWZ2g5kPkjvCEObBv+ibv6Av/RFm3K3Yai8ETs65h5t7EuVSaCDkznEXr1+s7i03Fh5u7r2rm+cazQjE+ZJnM1EVENZci5UMQIPlFrjhNIsnPo6svtX5+zZUWfoTypwHCZ2kIhaMgqXC5o0PtDj2pxSMn1CYRpH5WlWjmr0bFXBc/FMvRm2qcyJI9dsaLMeJX5Xvm6SEIjkl1Wd8kj6H9gsdXYjKFvbDZIh1Cev1Dgi04GgwGfQu6B6RHCPasVFcLzes0bN764wVCU+BSar1yCM5BIYqEzyquEXmueUXdEJH1mY0oTrwDxEVeFdy4xnCl7UsAP7FOHoYnWZRLZznpj/a9Wky9powLiQWBEmhfAUzYbFBcSQ4br3PFYKM5AlhZQpoR9K2ZTajMD+zsNG8KfTfH/wVm34+13uj96rZP8ziW0A76gNrIQ310gr6hUzREDP1yVp0tZ9v57W64792dWavrzD2b6K9yP94DMpK7Lg=</latexit> <latexit sha1_base64="XcTopXEfSETh7FOlB7N0af2vQ=">ACAnicdVDLSgMxFM3UV62vUVfiJliEuikzbW3trujGZQX7gLYOmTRtQzOZIckIwzC48VfcuFDErV/hzr8x01ZQ0QMXDufcm9x73IBRqSzrw8gsLa+srmXcxubW9s75u5eW/qhwKSFfeaLroskYZSTlqKkW4gCPJcRjru9CL1O7dESOrzaxUFZOChMacjipHSkmMeRDdxgZ4kTtwP+ZCI9J1YJQ5NHDNvFS2rblVrUJN6uVQpa1Kyq3bNhra2UuTBAk3HfO8PfRx6hCvMkJQ92wrUIEZCUcxIkuHkgQIT9GY9DTlyCNyEM9OSOCxVoZw5AtdXMGZ+n0iRp6UkefqTg+pifztpeJfXi9Uo7NBTHkQKsLx/KNRyKDyYZoHFJBsGKRJgLqneFeIEwkqnltMhfF0K/yftUtE+LVpXlXzjfBFHFhyCI1ANqiBrgETdACGNyB/AEno1749F4MV7nrRljMbMPfsB4+wRrRZgc</latexit>y(i)
ti
<latexit sha1_base64="9odU2tzoPjKIqzEobcp4SjaO6NY=">ACLXicdVDLSsNAFJ34rPVdelmsAgKWpL6qF0I4gNcKlgVmlgm0k7OHkwcyOGmB9y46+I4EIRt/6Gk7aCih4YOJxzLnfucSPBFZjmizE0PDI6Nl6YKE5OTc/Mlubmz1UYS8oaNBShvHSJYoIHrAEcBLuMJCO+K9iFe32Q+xc3TCoeBmeQRMzxSfgHqcEtNQqHdpdAmSXaUrfDVrpdY6ZLs9zfYJdF03PcoyWzAPmk/c6cja7d9bkve6YLTKpXNimnWze0a1qS+Ud3c0KRqbVs1C1vaylFGA5y0Sk92O6SxzwKgijVtMwInJRI4FSwrGjHikWEXpMOa2oaEJ8pJ+1dm+FlrbSxF0r9AsA9ftESnylEt/VyfwE9dvLxb+8ZgzejpPyIqBbS/yIsFhDn1eE2l4yCSDQhVHL9V0y7RBIKuCiLuHrUvw/Oa9WrK2KebpZ3tsf1FAi2gJrSAL1dAeOkYnqIEoukeP6AW9Gg/Gs/FmvPejQ8ZgZgH9gPHxCWKVqXA=</latexit>ˆ y(i)
1−t = ˆ
E h y(i)|1 − t, x(i)i
<latexit sha1_base64="d6/PjYT/vtO/n2H7zIpejOPMdpM=">AC8XicjVJNb9QwEHXCR8vy0S09crFYIbVCRM62dOlhpapcOFVFYtKm2XlON6tVceO7AnaKMq/4NIDqOLKv+HGv8FJU0qrghjJ0pt5b8YzY8eZFBYI+en5d+7eu7+0/KDz8NHjJyvd1aeHVueG8RHTUpvjmFouheIjECD5cWY4TWPJj+LTtzV/9IkbK7T6AEXGJymdKzETjILTVe9pSjmc6HKLKVgxKLqFB/L9XCjiqIG9RsUBK27/+qK2t+oOhFXyVXq8GYt5wOdljAkFX6JW28xjIAvoCx0ruZVXe2fMi2TRoSbJn5Lw/+reJvsuK15qfdHgkI2SHbA+zAzmZ/a9OBfrgdDkIcOq2HmrtYNr9ESWa5SlXwCS1dhySDCYlNSCY5G723PKMslM652MHFU25nZTNi1X4hYskeKaNOwpwE/0zo6SptUaO6Xr78Te5Orgbdw4h9mbSlUlgNX7OKiWS4xaFw/P06E4Qxk4QBlRrheMTuhjJwn6TjlnA5Kf47OwH4euAvN/q7e6161hGz9BztI5CNEC76B06QCPEPOV9r54X3rn/n/rcLqe+1OWvomvnfwHrYezS</latexit> y(1) y(2) .. y(N−1) y(N) = βt=0 + βx=young βt=0 + βx=old .. βt=1 + βx=young βt=1 + βx=old
ML aside: Improving estimate via ensemble learning
- Do we need the additivity assumption?
- In fact, ignoring covariate-treatment interaction can be a source of bias
- Data driven approach:
- V-fold cross-validation using an ensemble learning, e.g. super-learner
- Appropriate choice of loss function, e.g., L1 for conditional median,
L2 for conditional mean, log loss for binary outcome, …
<latexit sha1_base64="iGKFOf9X5YGM5TqZab+sef4rxb0=">ACLnicdVBda9swFJW7buyr3R73ItoGHR0BDmt2+yhUFYKe2zBaT3iYGTlOhGVbCNdF4KX9SX/ZXtYdCWsdf9jMpNBtvYDgdzrkX6Zy0VNIiY1feyr3V+w8erj1qPX7y9Nnz9vqLU1tURsBAFKowUcotKJnDACUqiEoDXKcKztLzw8Y/uwBjZGHOCthpPkl5kUHJ2UtI9izXGapvXRPGxgw3P34K30axkZMpvqH7NE4BecK2FndEoyULw614wrXmNAqTdod1GQu2d/vUkXe9fj9wpOcHO9sB9Z3VoEOWOE7aX+NxISoNOQrFrR36rMRzQ1KoWDeisLJRfnfAJDR3OuwY7qu7hz+topY5oVxp0c6Z36+0bNtbUznbrJpz92vEf3nDCrP+qJZ5WSHkYvFQVimKBW26o2NpQKCaOcKFke6vVEy54QJdwy1Xwq+k9P/ktNf1gy472ekcvF/WsUZekQ2ySXyRw7IB3JMBkSQS/KFXJMb7P3zfvu/ViMrnjLnZfkD3g/bwF+CaeW</latexit>E0 (Y |T, X) = β0 + βXX + βT T + γXT <latexit sha1_base64="zBGxKSGX4WGwCHXoDIHqltUQ0=">ACQ3icdVBda9RAFJ3Ur7p+beujL4OLUFlZJtumXR+Eog+Vsi20c02TGZvdofOJGHmRlji/re+9A/45h/wxQdFfBWcdCOo6IFhDuecy8w9amkRcY+ehtXrl67fmPzZufW7Tt373W3to9tURkBY1GowkQpt6BkDmOUqCAqDXCdKjhJz140/sk7MFYWeYjLEqaz3OZScHRSUn3baw5LtK0frlKWKwgw50378MnUWzkfIGP6TMap4A8Yf31HdGoZWHYj+dca06jkPbWHQal0ZqoNHpMOn2ICxYHd/RB15OhyNAkeGfrC3G1DfWQ16pMVR0v0QzwpRachRKG7txGclTmtuUAoFq05cWSi5ONzmDiacw12Wl92sKPnDKjWHcyZFeqr9P1Fxbu9SpSzYb27+9RvyXN6kwG01rmZcVQi7WD2WVoljQplA6kwYEqUjXBjp/krFghsu0NXecSX82pT+nxwPB34wYK/3eofP2zo2yQPykOwQnxyQ/KHJExEeScfCJfyFfvwvsfO+r6MbXjtzn/wB78dPX2evYQ=</latexit>E0 (Y |T, X) = β0 + βXX + βT T + γXT + β0
XX2
<latexit sha1_base64="3UxCkw6fZMaC4wPzjleB2lxZ3E=">ACWHicdVFda9swFJW9dm2zj2bt417EwqAjI9hpvWYPg9Ix2GMHTusRp0ZWrhNRyTbSdSF4+ZOFPax/ZS+TEw+6sR0QOpxzLpKO0lIKg573w3EfbW0/3tnd6zx5+uz5fvfFwaUpKs1hzAtZ6ChlBqTIYwCJUSlBqZSCVfpzcfGv7oFbUSRh7gsYarYPBeZ4AytlHSLWDFcpGn9aZV4sYQMj75+C9GsRbzBb6hH2icArLE62/2iEYtC8N+PGdKMRqFtN/Gou41EIBja6HjbgOPNDCpNvzBp4XHL8bUveD0ejwJKhH5wcB9S3VoMeaXGRdO/iWcErBTlyYyZ+F6J05pFzCqhNXBkrGb9gcJpbmTIGZ1utiVvS1VWY0K7RdOdK1+nCiZsqYpUptsqnB/O014r+8SYXZaFqLvKwQcr45KskxYI2LdOZ0MBRLi1hXAt7V8oXTDO9i86toTfL6X/J5fDgR8MvC8nvbPzto5d8pK8IkfEJ6fkjHwmF2RMOPlOfjpbzrZz7xJ3x93bRF2nTkf8A9+AUf9bE2</latexit>E0 (Y |T, X) = β0 + βXX + βT T + γXT + β0
XX2 + γ0X2T
Discrete Super Learner
Eric Polley, Mark van der Laan, Sherri Rose 2011
+ verify goodness-of-fit
Matching
- Idea: Blind ourselves to the outcomes, try to get as similar to a randomised
experiment as possible (‘correct for confounding’)
- Reveals lack of overlap in treatment vs control distributions: individuals in the
treatment group that have no chance of having an ‘equivalent’ in control group, ie, parts of the distribution with:
- Mahalanobis distance: Difference scaled by variance
- Issues: Outliers. Use a calliper: maximum acceptable distance, to avoid violating
the positivity (strong ignorability) assumption. But the populations becomes harder to define.
- See papers on anomaly detection: When in fact, we are interested in the outliers
p(t = 1|x) = 0, p(t = 0|x) = 0
<latexit sha1_base64="EdvDUKDSIMkvGy/hM+uTvI5FSbs=">ACAXicdVDLSgMxFM34rPU16kZwEyxCBSmZaWntolB047KCfUA7lEyatqGZB0lGLGPd+CtuXCji1r9w59+YaSuo6IHAuefcy809bsiZVAh9GAuLS8srq6m19PrG5ta2ubPbkEkCK2TgAei5WJOfNpXTHFaSsUFHsup013dJ74zWsqJAv8KzUOqePhgc/6jGClpa65H2ZVxbq9Oa6gE9iBSYWmVdfMoBxCZVQsQU3KebuQ18S2ilbJgpa2EmTAHLWu+d7pBSTyqK8Ix1K2LRQqJ8ZCMcLpJN2JA0xGeEBbWvqY49KJ5eMIFHWunBfiD08xWcqt8nYuxJOfZc3elhNZS/vUT8y2tHqn/qxMwPI0V9MlvUjzhUAUzigD0mKF8rAkmgum/QjLEAhOlQ0vrEL4uhf+Thp2z8jn7spCpns3jSIEDcAiywAIlUAUXoAbqgIA78ACewLNxbzwaL8brHXBmM/sgR8w3j4B+DmUqw=</latexit> <latexit sha1_base64="tLXvye3o6VOqMDJLS4x/WZPvTk=">ACVnicfVFdT9swFHXSMbrugwCPe7FWITUSVElhFB6Q0NjDHpmgUKlpK8e9aT2cD+wbRBXlT7IX+Cm8oDlQpG0aO5Llo3POle3jMJNCo+fdWXbt1dLr5fqbxt37z+sOKtrZzrNFYceT2Wq+iHTIEUCPRQoZ8pYHEo4Ty8OKr8ytQWqTJKc4zGMZsmohIcIZGjvx19b1qGgJt9ys9h9u6R4E+lJhEUiI8NncWpiBEtMZuqPTk1Gx5Zf/y5SbNKAnBwHCNRZH6VXZ6rtjp+m1PW/f2+1SQ/a3OzvbhnT8Xb/rU9YFZpkgeOxcxNMUp7HkCXTOuB72U4LJhCwSWUjSDXkDF+waYwMDRhMeh8VhLSTeMqFRqsxKkD6qv08ULNZ6HocmGTOc6b+9SvyXN8gx2hsWIslyhIQ/HRTlkmJKq47pRCjgKOeGMK6EuSvlM6YR/MTDVPC80vpy+Ss0/Y/t73vO83DL4s6uQj+URaxCdcki+kWPSI5z8JPeWbdWsW+vBXrKXn6K2tZhZJ3/Adn4BUDWyew=</latexit>D(x(i), x(j)) = q x(i) − x(j)T S−1 x(i) − x(j) , S = Cov(X)
Propensity Score
- In a randomised trial: p(t=1|x)=p(t=1)=0.5
- In an observational study, p(t=1|x) can be estimated, since it
involves observational data at a t and x (hence identifiable).
- A balancing score is any function b(x) such that:
i.e., distribution of confounders is independent of treatment given b(x):
x ⊥ ⊥ t|b(x)
<latexit sha1_base64="xN7E0ZzSLwCorE6Q/PM7JUEgOXE=">ACBXicdVDLTgIxFO3gC/E16lIXVWKCGzIzEJAd0Y1LTOSRMBPSKR1o6DzSdgxkZOPGX3HjQmPc+g/u/Bs7gIkaPU2T03Puze09bsSokIbxoWldW17LruY3Nre0dfXevJcKY9LEIQt5x0WCMBqQpqSkU7ECfJdRtru6CL12zeECxoG13ISEcdHg4B6FCOpJ5+OLYjwiNoH6UHzh8S3kK3MD7t6XmjaBg1o1KFitRKVrmkiGVWzKoJTWlyIMFGj393e6HOPZJIDFDQnRNI5JOgrikmJFpzo4FiRAeoQHpKhognwgnmW0xhSdK6UMv5OoGEs7U7x0J8oWY+K6q9JEcit9eKv7ldWPpnTkJDaJYkgDPB3kxgzKEaSwTznBk0UQZhT9VeIh4gjLFVwORXC16bwf9KyimapaF2V8/XzRxZcACOQGYoArq4BI0QBNgcAcewBN41u61R+1Fe52XZrRFz74Ae3tE8NoltQ=</latexit>p(X = x|b(x), t = 1) = p(X = x|b(x), t = 0)
<latexit sha1_base64="A983CR49ngqnA3yvJqbq4Dwp/8=">ACHicdVBNSwJBGJ61L7OvrY4dGpJAIWR3Fc2DIHXpaJAfoIvMjrM6OPvBzGwo5rFLf6VLhyK69hO69W+aVYOMeuCFZ57nfZn3fZyQUSEN41NLrKyurW8kN1Nb2zu7e/r+QUMEcekjgMW8JaDBGHUJ3VJSOtkBPkOYw0neFl7DdvCRc08G/kOCS2h/o+dSlGUkld/TjMtCqjOyczyp7JipmtL2NbFdPGznDKBvFElSknLcKeUs2iWTGgqK0YaLFDr6h+dXoAj/gSMyRE2zRCaU8QlxQzMk1IkFChIeoT9qK+sgjwp7MDpnCU6X0oBtwVb6EM/XnxAR5Qow9R3V6SA7Eby8W/LakXTP7Qn1w0gSH8/ciMGZQDjVGCPcoIlGyuCMKdqV4gHiCMsVXYpFcL3pfB/0rByZj5nXRfS1YtFHElwBE5ABpigBKrgCtRAHWBwDx7BM3jRHrQn7V7m7cmtMXMIViC9v4F84KXZA=</latexit>Rosenbaum, Rubin 1983
Propensity Score
- Candidate b(x) = x, trivially satisfies:
- b(x) = x is the finest such function: OK for e.g. binary confounders,
but only gives point estimates for (almost) continuous confounders!
- Propensity score is the coarsest such function (i.e. more data
points, leading to better estimates):
p(X = x|x, t = 1) = p(X = x|x, t = 0) = 1
<latexit sha1_base64="tvEnhEB0qxNC8CTwrOjLmJNXhs=">ACBHicdVDLSgMxFM3UV62vqstugkVoQcpkWlq7KBTduKxgH9AOJZOmbWjmQZKRlrELN/6KGxeKuPUj3Pk3pg/xgR64cHLOveTe4wScSWa70ZsZXVtfSO+mdja3tndS+4fNKQfCkLrxOe+aDlYUs48WldMcdoKBMWuw2nTGZ3P/OY1FZL53pWaBNR28cBjfUaw0lI3mQoyrcr4ZnyiKihb+XqY2QrqJtNmzjTLZrENSnrUJeEwsVUQlBpK0Z0mCJWjf51un5JHSpwjHUraRGSg7wkIxwuk0QklDTAZ4QFta+phl0o7mh8xhcda6cG+L3R5Cs7V7xMRdqWcuI7udLEayt/eTPzLa4eqf2pHzAtCRT2y+Kgfcqh8OEsE9pigRPGJpgIpneFZIgFJkrnltAhfF4K/ycNK4fyOeuykK6eLeOIgxQ4AhmAQAlUwQWogTog4Bbcg0fwZNwZD8az8bJojRnLmUPwA8brB8ealkQ=</latexit>Gender Treatment F x=M t=0 t=1
e(x) = p(t = 1|x)
<latexit sha1_base64="WIlUbaYz3qTQq4z/eipX0bOCUBE=">AB/nicdVBNS0JBFJ1nX2ZfVrRqMySBbuS9p2guBKlNS4PUQB8yb7zq4LwPZuaF8hL6K21aFNG239Guf9P4EVTUgQuHc+7l3nvckDOpTPDSKysrq1vJDdTW9s7u3vp/YOmDCJBoUEDHogbl0jgzIeGYorDTSiAeC6Hlju6mPmtWxCSBf61moTgeGTgsz6jRGmpmz7quMEYejFkx7lqmFV626cm3bTGTNvmhWzVMaVAp2saCJbZWsoUtbc2QUvUu+n3Ti+gkQe+opxI2bMUDkxEYpRDtNUJ5IQEjoiA2hr6hMPpBPz5/iU630cD8QunyF5+r3iZh4Uk48V3d6RA3lb28m/uW1I9U/c2Lmh5ECny4W9SOVYBnWeAeE0AVn2hCqGD6VkyHRBCqdGIpHcLXp/h/0rTzViFvXxUztfNlHEl0jE5QFlmojGroEtVRA1EUowf0hJ6Ne+PReDFeF60JYzlziH7AePsEbRKVJg=</latexit>b(x) Treatment t=0 t=1 x=Age Treatment t=0 t=1 x=Age
<latexit sha1_base64="wQRKEyZONbxUVKLD059lJta9DSw=">AB+HicdZDLSgMxFIYzXmu9dNSlm2AR6qbMtB2tu6IblxXsBdqhZNIzbWjmYpIRa+mTuHGhiFsfxZ1vY6atoKIHAh/fw7n5PdizqSyrA9jaXldW09s5Hd3NreyZm7e0ZJYJCg0Y8Em2PSOAshIZikM7FkACj0PLG12kfusWhGReK3GMbgBGYTMZ5QoLfXMnNXlcIOhcHecgt0z81bRspzySRVrOCtVq46Gku1Uyg62tZVWHi2q3jPfu/2IJgGEinIiZce2YuVOiFCMcphmu4mEmNARGUBHY0gCkO5kdvgUH2mlj/1I6BcqPFO/T0xIOU48HRnQNRQ/vZS8S+vkyi/6k5YGCcKQjpf5CcqwinKeA+E0AVH2sgVDB9K6ZDIghVOqusDuHrp/h/aJaKtlO0rir52vkijgw6QIeogGx0imroEtVRA1GUoAf0hJ6Ne+PReDFe561LxmJmH/0o4+0TsFCSeg=</latexit>0 ≤ e(x) ≤ 11-dimensional
Propensity Score Matching
- Let the distribution of covariates follow an exponential family of
distributions (Pt*(x) polynomial of degree k):
- Estimate propensity score e(x)=p(t=1|x):
- If we consider k=1, linear exponential family (e.g. Bernoulli),
- Fit parameters by maximising log-likelihood:
Rosenbaum, Rubin 1983
p(x|t = t∗) = h(X) exp(Pt∗(x)) , for t = 0 or 1
<latexit sha1_base64="PEVxWJHFksgG/4x9FJoLCTfK9U=">ACLHicdZBNaxpRFIbvpB+xNm1MuzmUimMIciMBo0LQeImSwvVCI6RO9cz8ZI7H9x7JihTf1A3/SuFkEUkdNvf0TtqoA3NWb087zmc14/kUKj46ysnRcvX73eLbwpvt17936/dHA40HGqOPR5LGM19JkGKSLo0AJw0QBC30JF/51N/cvbkBpEUdfcZHAOGRXkQgEZ2jQpNRN7Pk3bOPlUaU9s4cVD+aJ3ZtkBizteaVCPXrsUQ9hjlkQK7qk2HboI1kDd1IqO1XHaTmNJjWiVa+d1I2ouQ236VLXWHmVybZ6k9KtN415GkKEXDKtR6T4DhjCgWXsCx6qYaE8Wt2BSMjIxaCHmfrZ5f0syFTmt8SxBHSNf17ImOh1ovQN50hw5l+6uXwf94oxeB0nIkoSREivlkUpJiTPk6FQo4CgXRjCuhLmV8hlTjKPJt2hCePyUPi8Gtapbr9a+nJQ7Z9s4CuQj+URs4pIm6ZBz0iN9wsl38pPck5X1w7qzHqxfm9YdazvzgfxT1u8/mPOlbA=</latexit>log ✓ e(x) 1 − e(x) ◆ = log ✓p(t = 1|x) p(t = 0|x) ◆ = log ✓p(x|t = 1)p(t = 1) p(x|t = 0)p(t = 0) ◆ = log ✓p(t = 1) p(t = 0) ◆ + P1(x) − P0(x)
<latexit sha1_base64="Dy9cieT2ktI5IbcJOcN+HzPvAuM=">ACpnicfZHbhMxEIa9y6mEQwNcmMRIW2EGtlJ1dCLShXcwA0KhzSVstHidWYTq96D7FnUaJNH4yW423wblMplMNIlv+Z+X7ZHseFVhYZ+n5t27fuXtv737rwcNHj/fbT56e2bw0EsYy17k5j4UFrTIYo0IN54UBkcYaJvHF27o/+QbGqjz7gqsCZqlYZCpRUqArRe3voc4XoYEgzAxQlYQXHY3FT9o9tCoxRK7JzehIsATvq7BWrH1Dtr6k71cO7rbeBpHnbMmZ/1XfO73KtRxN3NDkYRc1vU7rAeY8fsaEidOB70DwdO9PkRH3LKXauODtnGKGr/COe5LFPIUGph7ZSzAmeVMKikhk0rLC0UQl6IBUydzEQKdlY1Y97Ql64yp0lu3MqQNtVdRyVSa1dp7MhU4NLe7NXFv/WmJSavZ5XKihIhk1cHJaWmNP6z+hcGZCoV04IaZS7K5VL4YaE7mdbgjXL6X/Fmf9Hh/0+h8PO6dvtuPYI8/JCxIQTobklLwjIzIm0ut471P3mc/8D/4Y39yhfre1vOM/Bb+198+MzJ</latexit>log ✓ e(x) 1 − e(x) ◆ = wx + w0 ⇒ e(x) = 1 1 + e−wx−w0
<latexit sha1_base64="yV/PxGtsufGRlKwFUy7OSFRINuY=">ACQXicdVA9axsxGNYl/UjcLycZu4iagkOwOdkhToZAaJeMakTg81Ovk9W0R3OqT3Ypvj/lqW/INu3btkSClZs1Rnu9CW9gWhR8Hkp4wVdKi73/1tYfPX7ydGOz8uz5i5evqlvb51ZnRkBXaKVNL+QWlEygixIV9FIDPA4VXISX70v94gqMlTr5hPMUBjEfJzKSgqOjhtVeoPQ4UBhPYgMFznUZ7tFzhqLPTByPMHd4+lsbzr0aUCDjyXBjdFTdyo9x8sYc5k9+Jw3prOGsxbFsFrzm75/5B90qANH7dZ+24EWO2AdRpmTyqmR1ZwNq1+CkRZDAkKxa3tMz/FQc4NSqGgqASZhZSLSz6GvoMJj8EO8kUDBX3rmBGNtHErQbpgf0/kPLZ2HofOGXOc2L+1kvyX1s8wOhzkMkzhEQsL4oyRVHTsk46kgYEqrkDXBjp3krFhLtG0JVecSX8+in9PzhvNVm72fqwXzt5t6pjg7wmb0idMNIhJ+SUnJEuEeSafCN35Lt34916P7z7pXNW2V2yB/jPfwEJS6vcg=</latexit>LL = 1 N
N
X
i=0
log p(t(i)|x(i))
<latexit sha1_base64="zgGyo3NDbTyYUt5TSEP+PSgFfjE=">ACHicdVDLSgMxFM34rPVdekmWIR2U2Y6pbWLQtGNiyIV7AP6IpNm2tDMgyQjlnE+xI2/4saFIm5cCP6NmbaCih4IOTnXm7usXxGhdT1D21peWV1bT2xkdzc2t7ZTe3tN4UXcEwa2GMeb1tIEZd0pBUMtL2OUGOxUjLmpzFfuacE90pOfdJz0MilNsVIKmQMmu1StfmCIdGF5EXRE4g5BW9Kgfv5g3gn5G9sMzUa3N/M7O0il9Zyul/ViCSpSNvMFU5G8UTRKBjSUFSMNFqgPUm/doYcDh7gSMyREx9B92QsRlxQzEiW7gSA+whM0Ih1FXeQ0Qtny0XwWClDaHtcHVfCmfq9I0SOEFPHUpUOkmPx24vFv7xOIO2TXkhdP5DExfNBdsCg9GCcFBxSTrBkU0UQ5lT9FeIxUlFJlWdShfC1KfyfNPM5w8zlLwvp6ukijgQ4BEcgAwxQAlVwDuqgATC4Aw/gCTxr9qj9qK9zkuXtEXPAfgB7f0Tl0+hrw=</latexit>Propensity Score Matching Algorithms
- Match control and treatment individuals based on their propensity score
- Greedy matching:
- Randomly order list of control and treated.
- Start with the first individual from e.g. treated and match to control
with the smallest distance (i.e. obtains the local minimum)
- Remove individuals from control and matched treated
- Move to the next treated subject
Treatment 40 65 Control 50 25
Propensity Score Matching Algorithms
- Match control and treatment individuals based on their propensity score
- Greedy matching:
- Randomly order list of control and treated.
- Start with the first individual from e.g. treated and match to control
with the smallest distance (i.e. obtains the local minimum)
- Remove individuals from control and matched treated
- Move to the next treated subject
Treatment 40 65 Control 50 25
Propensity Score Matching Algorithms
- Match control and treatment individuals based on their propensity score
- Greedy matching:
- Randomly order list of control and treated.
- Start with the first individual from e.g. treated and match to control
with the smallest distance (i.e. obtains the local minimum)
- Remove individuals from control and matched treated
- Move to the next treated subject
Treatment 40 65 Control 50 25 Treatment 40 65 Control 50 25 Total diff: 50 Total diff: 30
Propensity Score Matching Algorithms
- Match control and treatment individuals based on their propensity score
- Greedy matching:
- Randomly order list of control and treated.
- Start with the first individual from e.g. treated and match to control
with the smallest distance (i.e. obtains the local minimum)
- Remove individuals from control and matched treated
- Move to the next treated subject
- Optimal matching: Minimises the global distance, computationally
demanding
- ATE:
τ = ˆ E[τ (i)] = ˆ E[y(i)
1
− y(i)
0 ] = 1
N
N
X
i=0
⇣ y(i)
1
− y(i) ⌘
<latexit sha1_base64="WbTWK3I0Vx5XeT+HXSWxJjiZj3c=">ACanicdVFdSxwxFM2MbdW12lVBKb6EboX1oUtmXd31QZCWQp+Kha4KO+OQyWZ2g5kPkjvCEObBv+ibv6Av/RFm3K3Yai8ETs65h5t7EuVSaCDkznEXr1+s7i03Fh5u7r2rm+cazQjE+ZJnM1EVENZci5UMQIPlFrjhNIsnPo6svtX5+zZUWfoTypwHCZ2kIhaMgqXC5o0PtDj2pxSMn1CYRpH5WlWjmr0bFXBc/FMvRm2qcyJI9dsaLMeJX5Xvm6SEIjkl1Wd8kj6H9gsdXYjKFvbDZIh1Cev1Dgi04GgwGfQu6B6RHCPasVFcLzes0bN764wVCU+BSar1yCM5BIYqEzyquEXmueUXdEJH1mY0oTrwDxEVeFdy4xnCl7UsAP7FOHoYnWZRLZznpj/a9Wky9powLiQWBEmhfAUzYbFBcSQ4br3PFYKM5AlhZQpoR9K2ZTajMD+zsNG8KfTfH/wVm34+13uj96rZP8ziW0A76gNrIQ310gr6hUzREDP1yVp0tZ9v57W64792dWavrzD2b6K9yP94DMpK7Lg=</latexit>Inverse Probability of Treatment Weighting (IPTW)
- Inflate the weight for under represented-subjects due to missing data
- Based on propensity score
- Weight: inverse probability of receiving observed treatment, for
individual i with covariate x:
- Example: Suppose individual (i) has a large e(x), i.e., their probability of
receiving treatment is high.
- If then (typical behaviour: most with are treated)
- If then (underrepresented: boost weight for rare event)
Rosenbaum 1987
e(x) = p(t = 1|x)
<latexit sha1_base64="WIlUbaYz3qTQq4z/eipX0bOCUBE=">AB/nicdVBNS0JBFJ1nX2ZfVrRqMySBbuS9p2guBKlNS4PUQB8yb7zq4LwPZuaF8hL6K21aFNG239Guf9P4EVTUgQuHc+7l3nvckDOpTPDSKysrq1vJDdTW9s7u3vp/YOmDCJBoUEDHogbl0jgzIeGYorDTSiAeC6Hlju6mPmtWxCSBf61moTgeGTgsz6jRGmpmz7quMEYejFkx7lqmFV626cm3bTGTNvmhWzVMaVAp2saCJbZWsoUtbc2QUvUu+n3Ti+gkQe+opxI2bMUDkxEYpRDtNUJ5IQEjoiA2hr6hMPpBPz5/iU630cD8QunyF5+r3iZh4Uk48V3d6RA3lb28m/uW1I9U/c2Lmh5ECny4W9SOVYBnWeAeE0AVn2hCqGD6VkyHRBCqdGIpHcLXp/h/0rTzViFvXxUztfNlHEl0jE5QFlmojGroEtVRA1EUowf0hJ6Ne+PReDFeF60JYzlziH7AePsEbRKVJg=</latexit> <latexit sha1_base64="+jQcKULbtcqi84Vz3/VzBQZo5g=">AB7HicdVBNSwMxEM3Wr1q/qh69BIvgqSRtbe1BKHrxWMGthXYp2TbhmazS5IVytLf4MWDIl79Qd78N2bCir6YODx3gwz8/xYcG0Q+nByK6tr6xv5zcLW9s7uXnH/oKOjRFHm0khEqusTzQSXzDXcCNaNFSOhL9idP7nK/Lt7pjSP5K2ZxswLyUjygFNirOSaAb/Ag2IJlRFqonoDWtKsVmpVSyq4jhsYmtlKIEl2oPie38Y0SRk0lBtO5hFBsvJcpwKtis0E80iwmdkBHrWSpJyLSXzo+dwROrDGEQKVvSwLn6fSIlodbT0LedITFj/dvLxL+8XmKCcy/lMk4Mk3SxKEgENBHMPodDrhg1YmoJoYrbWyEdE0WosfkUbAhfn8L/SadSxmdldFMrtS6XceTBETgGpwCDBmiBa9AGLqCAgwfwBJ4d6Tw6L87rojXnLGcOwQ84b5+na46X</latexit>ti = 1 <latexit sha1_base64="GvOcYz5KVAp86HbNJAsSqDdkOI=">AB9HicdVDLTgIxFO3gC/GFunTSExckSkgyI7oxiUm8khgQjqlQEOnHdsOSiZ8hxsXGuPWj3Hn39gBTNToSW5ycs69ufceP+RMG9f9cFIrq2vrG+nNzNb2zu5edv+gqWkCG0QyaVq+1hTzgRtGY4bYeK4sDntOWPLxO/NaFKMyluzDSkXoCHg0YwcZK3l2PwS4OQyXvIeplc27edatuQItqRYLpaIlBVRGFQSRtRLkwBL1Xva925ckCqgwhGOtO8gNjRdjZRjhdJbpRpqGmIzxkHYsFTig2ovnR8/giVX6cCVLWHgXP0+EeNA62ng284Am5H+7SXiX14nMoNzL2YijAwVZLFoEHFoJEwSgH2mKDF8agkmitlbIRlhYmxOWVsCF+fwv9Js5BHZ3n3upSrXSzjSIMjcAxOAQIVUANXoA4agIBb8ACewLMzcR6dF+d10ZpyljOH4Aect09usJHj</latexit>wi ≈ 1 <latexit 
sha1_base64="idzmofVRyDLO5wKwkGIeTtaWY=">AB7HicdVBNSwMxEJ2tX7V+VT16CRbBU8m2tbUHoejFYwW3FtqlZNsG5rNLklWKW/wYsHRbz6g7z5b0w/BV9MPB4b4aZeUEiuDYfziZldW19Y3sZm5re2d3L79/0NJxqijzaCxi1Q6IZoJL5hluBGsnipEoEOwuGF3N/Lt7pjSP5a0ZJ8yPyEDykFNirOSZHr/AvXwBFzGu42oNWVIvlyplS0pu1a25yLXWDAVYotnLv3f7MU0jJg0VROuOixPjT4gynAo2zXVTzRJCR2TAOpZKEjHtT+bHTtGJVfojJUtadBc/T4xIZHW4yiwnRExQ/3bm4l/eZ3UhOf+hMskNUzSxaIwFcjEaPY56nPFqBFjSwhV3N6K6JAoQo3NJ2dD+PoU/U9apaJ7VsQ3lULjchlHFo7gGE7BhRo04Bqa4AEFDg/wBM+OdB6dF+d10ZpxljOH8APO2yel546W</latexit>ti = 0 <latexit sha1_base64="YFrVRfWCLe2KcvAUrf7GcjsTn/Y=">AB8HicdVDLTgIxFO3gC/GFunTSExckSkgyI7oxiUm8jAwIZ3SGRrazqTtaAjhK9y40Bi3fo47/8YOYKJGT3KTk3Puzb3+DFn2rjuh5NZWV1b38hu5ra2d3b38vsHbR0litAWiXikuj7WlDNJW4YZTruxolj4nHb8WXqd+6o0iySN2YSU0/gULKAEWysdHs/YLAfhAN8gW36Lp1t1qDltTLpUrZkhKqohqCyFopCmCJ5iD/3h9GJBFUGsKx1j3kxsabYmUY4XSW6yeaxpiMcUh7lkosqPam84Nn8MQqQxhEypY0cK5+n5hiofVE+LZTYDPSv71U/MvrJSY496ZMxomhkiwWBQmHJoLp93DIFCWGTyzBRDF7KyQjrDAxNqOcDeHrU/g/aZeK6KzoXlcKjYtlHFlwBI7BKUCgBhrgCjRBCxAgwAN4As+Och6dF+d10ZpxljOH4Aect08REo/v</latexit>wi 1 <latexit sha1_base64="IbZsoW3LUe4fUItAlwZdfXFs3AE=">AB6nicdVBNS8NAEJ3Ur1q/qh69LBbBU0na2tpb0YvHivYD2lA2027dLMJuxuxhP4ELx4U8eov8ua/cdNWUNEHA4/3ZpiZ50WcKW3bH1ZmZXVtfSO7mdva3tndy+8ftFUYS0JbJOSh7HpYUc4EbWmOe1GkuLA47TjTS5Tv3NHpWKhuNXTiLoBHgnmM4K1kW7uB2yQL9hF267b1RoypF4uVcqGlJyqU3OQY6wUBViOci/94chiQMqNOFYqZ5jR9pNsNSMcDrL9WNFI0wmeER7hgocUOUm81Nn6MQoQ+SH0pTQaK5+n0hwoNQ08ExngPVY/fZS8S+vF2v/3E2YiGJNBVks8mOdIjSv9GQSUo0nxqCiWTmVkTGWGKiTo5E8LXp+h/0i4VnbOifV0pNC6WcWThCI7hFByoQOuoAktIDCB3iCZ4tbj9aL9bpozVjLmUP4AevtE7o1jhk=</latexit>xi <latexit 
sha1_base64="YHRIfK+Pzb8NlVopV/G/OUEqZE=">ACVnicdVFNb9QwEHVSsvyFcqRy4gVqBxY2dvSpYdKFVw4FoltK21WkeOdbK06TmRPoKsof7Jc4KdwQXjbrfgeydLTe2/G4+e8NtoT51+jeO3W+u2NzTu9u/fuP3iYPNo69lXjFI5VZSp3mkuPRlsckyaDp7VDWeYGT/Lzt0v95CM6ryv7gRY1Tks5t7rQSlKgsqT8lOmDNMe5tq0Kc3zXSwsnVSu6FrcvMv2iA3ieEl5QqwvogDINByAgTX8axcsb69ODina2Wp2lvT5gPN9vjeCAPZ3hrs7AQzFnhgJEFaVp+t6ihLtNZpZoSLSkjvZ8IXtO0lY60Mh2bTzWUp3LOU4CtLJEP2vYungWBmUFQuHEtwxf7a0crS+0WZB2cp6cz/qS3Jf2mThorX01buiG06vqiojFAFSwzhpl2qMgsApDK6bArqDMZwqLwE70Qws1L4f/geDgQrwb8/W7/8M0qjk32hD1l20ywETtk79gRGzPFPrNvURytRV+i7/F6vHFtjaNVz2P2W8XJD6yMsg=</latexit>wi = (
1 e(xi)
if ti = 1
1 1−e(xi)
if ti = 0
Inverse Probability of Treatment Weighting (IPTW)
- Inflate the weight for under represented-subjects due to missing data
- Based on propensity score
- Weight: inverse probability of receiving observed treatment, for
individual i with covariate x:
Rosenbaum 1987
e(x) = p(t = 1|x)
<latexit sha1_base64="WIlUbaYz3qTQq4z/eipX0bOCUBE=">AB/nicdVBNS0JBFJ1nX2ZfVrRqMySBbuS9p2guBKlNS4PUQB8yb7zq4LwPZuaF8hL6K21aFNG239Guf9P4EVTUgQuHc+7l3nvckDOpTPDSKysrq1vJDdTW9s7u3vp/YOmDCJBoUEDHogbl0jgzIeGYorDTSiAeC6Hlju6mPmtWxCSBf61moTgeGTgsz6jRGmpmz7quMEYejFkx7lqmFV626cm3bTGTNvmhWzVMaVAp2saCJbZWsoUtbc2QUvUu+n3Ti+gkQe+opxI2bMUDkxEYpRDtNUJ5IQEjoiA2hr6hMPpBPz5/iU630cD8QunyF5+r3iZh4Uk48V3d6RA3lb28m/uW1I9U/c2Lmh5ECny4W9SOVYBnWeAeE0AVn2hCqGD6VkyHRBCqdGIpHcLXp/h/0rTzViFvXxUztfNlHEl0jE5QFlmojGroEtVRA1EUowf0hJ6Ne+PReDFeF60JYzlziH7AePsEbRKVJg=</latexit> <latexit sha1_base64="YHRIfK+Pzb8NlVopV/G/OUEqZE=">ACVnicdVFNb9QwEHVSsvyFcqRy4gVqBxY2dvSpYdKFVw4FoltK21WkeOdbK06TmRPoKsof7Jc4KdwQXjbrfgeydLTe2/G4+e8NtoT51+jeO3W+u2NzTu9u/fuP3iYPNo69lXjFI5VZSp3mkuPRlsckyaDp7VDWeYGT/Lzt0v95CM6ryv7gRY1Tks5t7rQSlKgsqT8lOmDNMe5tq0Kc3zXSwsnVSu6FrcvMv2iA3ieEl5QqwvogDINByAgTX8axcsb69ODina2Wp2lvT5gPN9vjeCAPZ3hrs7AQzFnhgJEFaVp+t6ihLtNZpZoSLSkjvZ8IXtO0lY60Mh2bTzWUp3LOU4CtLJEP2vYungWBmUFQuHEtwxf7a0crS+0WZB2cp6cz/qS3Jf2mThorX01buiG06vqiojFAFSwzhpl2qMgsApDK6bArqDMZwqLwE70Qws1L4f/geDgQrwb8/W7/8M0qjk32hD1l20ywETtk79gRGzPFPrNvURytRV+i7/F6vHFtjaNVz2P2W8XJD6yMsg=</latexit>wi = (
1 e(xi)
if ti = 1
1 1−e(xi)
if ti = 0
Treated O OOOO Not treated OOOOOOOOO O X=0 X=1 O O
<latexit sha1_base64="C39esRshPfcb3nuUmvPDNRymXvU=">AB/HicdVDLSgMxFM3UV62v0S7dBIvgqmb67qJQdOygn1AW0omzbShmQdJRhmG+ituXCji1g9x59+YaSuo6IELh3Pu5d57IAzqRD6MFJr6xubW+ntzM7u3v6BeXjUkX4oCG0Tn/uiZ2NJOfNoWzHFaS8QFLs2p17dpn43VsqJPO9GxUFdOjicRrDS0sjM3jUGjsAktuYxytfmjfJ5aWTmUB6hOqpUoSb1YqFU1KRgVayqBS1tJciBFVoj830w9knoUk8RjqXsWyhQwxgLxQin8wglDTAZIYntK+ph10qh/Hi+Dk81coYOr7Q5Sm4UL9PxNiVMnJt3eliNZW/vUT8y+uHyqkNY+YFoaIeWS5yQg6VD5Mk4JgJShSPNMFEMH0rJFOs1A6r4wO4etT+D/pFPJWOY+uS7nmxSqONDgGJ+AMWKAKmuAKtEAbEBCB/AEno1749F4MV6XrSljNZMFP2C8fQJHLpPi</latexit>w = 1 0.8 = 5/4
<latexit sha1_base64="NFeIZrSH9pNP/Mh86q0gVj/yumg=">AB9XicdVDLSgMxFM3UV62vqks3wSLUzZiZTpl2USi6cVnBPqAdSyZN29DMgySjltL/cONCEbf+izv/xvQhqOiBC4dz7uXe/yYM6kQ+jBSK6tr6xvpzczW9s7uXnb/oCGjRBaJxGPRMvHknIW0rpitNWLCgOfE6b/uhi5jdvqZAsCq/VOKZegAch6zOClZuaP7+Fagc1asILPUzeaQWXSsm1BZCIXuW5Bk4Lt2k4ZWiaIweWqHWz751eRJKAhopwLGXbQrHyJlgoRjidZjqJpDEmIzygbU1DHFDpTeZXT+GJVnqwHwldoYJz9fvEBAdSjgNfdwZYDeVvbyb+5bUT1S95ExbGiaIhWSzqJxyqCM4igD0mKF8rAkmgulbIRligYnSQWV0CF+fwv9JwzatomunFz1fBlHGhyBY5AHFnBFVyCGqgDAgR4AE/g2bgzHo0X43XRmjKWM4fgB4y3Tz7rkGk=</latexit>e(x) = 4/5 = 0.8 <latexit sha1_base64="17zf4NQkezXe3HRzcjU3kv2z4=">AB+HicdVDLTgIxFO3gC/EB6tJNIzHBzdgiCxIiG5cYiKPBAjplA40dB5pO0ac8CVuXGiMWz/FnX9jeZio0ZPc5OSce3PvPU4ouNIfViJldW19Y3kZmpre2c3ndnb6ogkpQ1aCAC2XaIYoL7rKG5FqwdSkY8R7CWM76c+a1bJhUP/Bs9CVnPI0Ofu5wSbaR+Js1ydyewCvEpRlVk434mi2xUxMVKCSL7rFTJo7IhCJ2XCghiQ2bIgiXq/cx7dxDQyGO+poIo1cEo1L2YSM2pYNUN1IsJHRMhqxjqE8pnrx/PApPDbKALqBNOVrOFe/T8TEU2riOabTI3qkfnsz8S+vE2m3Iu5H0a+XSxyI0E1AGcpQAHXDKqxcQiU3t0I6IpJQbJKmRC+PoX/k2bexkUbXReytYtlHElwCI5ADmBQAjVwBeqgASiIwAN4As/WvfVovVivi9aEtZw5AD9gvX0CEomQxg=</latexit>e(x) = 1/10 = 0.1 <latexit sha1_base64="oRWbq/t3ak69iE28Rt7NS0xA8uA=">ACnicdVDLSgMxFM3UV62vqks30SK4cUj6diGIblwq2Ae0pWTSTBuaeZBklDLM2o2/4saFIm79Anf+jelDfKAHLpycy+59zih4Eoj9G6l5uYXFpfSy5mV1bX1jezmVl0FkaSsRgMRyKZDFBPcZzXNtWDNUDLiOYI1nOHZ2G9cM6l4F/pUcg6Hun73OWUaCN1s7s3x21XEhrjJMaHyK4mX29k5PjUjebQzZCR6hcgYcFfLFgiF5XMYVDLGxsiBGS62bd2L6CRx3xNBVGqhVGoOzGRmlPBkw7UiwkdEj6rGWoTzymOvHklATuG6UH3UCa8jWcqN8nYuIpNfIc0+kRPVC/vbH4l9eKtFvtxNwPI818Ov3IjQTUARznAntcMqrFyBCJTe7QjogJglt0suYED4vhf+Tet7GJRtdFnMnp7M40mAH7IEDgEFnIBzcAFqgIJbcA8ewZN1Zz1Yz9bLtDVlzWa2wQ9Yrx8v5JlP</latexit>w = 1 1 − 0.8 = 1 0.2 = 5
Inverse Probability of Treatment Weighting (IPTW)
- ATE:
- Weights may be inaccurate/unstable for subjects with a very low
probability of receiving the observed treatment
- Other variations to stabilise the above
Rosenbaum 1987
<latexit sha1_base64="KRV4/yAgzURtyZYauzdBFTdZqns=">AB9HicdVDLSgMxFM3UV62vqks3wSIQsmU1o4LoejGValgH9AOJZNm2tBMZkwyhTL0O9y4UMStH+POvzHTVlDRA/dyOdecnO8iDOlEfqwMiura+sb2c3c1vbO7l5+/6ClwlgS2iQhD2XHw4pyJmhTM81pJ5IUBx6nbW98nfrtCZWKheJOTyPqBngomM8I1kZy6/AS1vs2PDMd9fMFVESoXD1H0JALx3GqhpQqIwQtI2VogCWaPTz71BSOKACk04Vqpro0i7CZaEU5nuV6saITJGA9p1CBA6rcZH70DJ4YZQD9UJoSGs7V7xsJDpSaBp6ZDLAeqd9eKv7ldWPtO27CRBRrKsjiIT/mUIcwTQAOmKRE86khmEhmboVkhCUm2uSUMyF8/RT+T1qlol0potyoXa1jCMLjsAxOAU2qIauAEN0AQE3IMH8ASerYn1aL1Yr4vRjLXcOQ/YL19AtZtkDU=</latexit>N = N1 + N0 <latexit sha1_base64="0ICHyOSo0dgfEJWSFsLN2maZ7g=">ACf3icdVFNbxMxEPUuX234CnDkYohAdTIrgJbxVcOFVFIqRSNqy83tnWqte7smdRI2t/Bn+MG7+FS71tqAKUkSw9vzfzxp7Jaq0cMvYzim/cvHX7ztZ27+69+w8e9h89/uKqxkqYyUpX9igTDrQyMEOFGo5qC6LMNMyz0w+dPv8G1qnKfMZVDctSHBtVKCkwUGn/e1JYIT1v/UHK28Q1ZognKH4IKQt6vUJ43JwXYdQl71Q/Vq/aqDIZnabjvbPiwTR9TIb3ei/3jxXfWbml/wEaMjSfvGA1gbzqdTgLYfcvGjFEepC4GZB2Haf9HkleyKcGg1MK5BWc1Lr2wqKSGtpc0DmohT8UxLAI0ogS39Bfja+mLwOS0qGw4BukFu1nhRencqsxCZinwxP2tdeR12qLBYr0ytQNgpGXjYpGU6xotwuaKwsS9SoAIa0Kb6XyRIRhYNhYLwzh90/p/8F8d8THI84/jQf79fz2CJPyXMyJxMyD75SA7JjEjyK3oWvY7exCR+GY9idpkaR+uaJ+SPiPfOAVuCxYI=</latexit> 1N1 X
treated
y(i)
1
1 e(xi) − 1 N0 X
not treated
y(i) 1 1 − e(xi)
Sensitivity Analysis
- Randomised trials are unconfounded by design (flipping a coin)
- Observational data may have possible hidden bias/unobserved
confounder that is not controlled for
- No guarantee that matching leads to balance on variables we did
not match for!
- People who look comparable may differ
- Violates ignorability (unconfoundedness) assumption
- Unconfoundedness is fundamentally (directly) unverifiable
Rosenbaum Design of Observational Studies, Springer, 2010
Sensitivity Analysis
- “This difference in the unobserved covariate u, the critic
continues, is the real reason outcomes differ in the treated and control groups: it is not an effect caused by the treatment, but rather a failure on the part of the investigators to measure and control imbalances in u. Although not strictly necessary, the critic is usually aided by an air of superiority: “This would never happen in my laboratory.””
- “It is important to recognize at the outset that our critic may be,
but need not be, on the side of the angels. The tobacco industry and its (sometimes distinguished) consultants criticized, in precisely this way, observational studies linking smoking with lung cancer.”
Rosenbaum Design of Observational Studies, Springer, 2010
Sensitivity Analysis
- If there is hidden bias, how severe is it:
- Does the conclusion change from statistically significant to not?
- Does it change the direction of effect?
- i.e., how sensitive are our conclusions to minor violations of our
key assumptions
- If very sensitive: change strategy (see Causal Inference with
Unobserved Confounders)
Sensitivity Analysis
- Take individuals (i) and (j) such that their observed covariates
are the same, $X^{(i)} = X^{(j)}$; hence, with no hidden bias, $e^{(i)} = e^{(j)}$
- Consider e.g., the odds ratio:
$\frac{1}{\Gamma} \le \frac{e^{(i)} / (1 - e^{(i)})}{e^{(j)} / (1 - e^{(j)})} \le \Gamma$
with $\Gamma \approx 1$ when there is no hidden bias
- Otherwise, if there is a hidden bias, e.g., $\Gamma = 2$, one subject is
twice as likely to receive treatment because of an unobserved pre-treatment feature
- $\Gamma$ quantifies the degree of bias.
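The odds-ratio bound can be checked numerically for a given pair of treatment probabilities. The sketch below is illustrative only: the function names (`odds`, `odds_ratio`, `within_gamma`) and the example probabilities are assumptions, not part of the lecture.

```python
def odds(p):
    """Odds p / (1 - p) of receiving treatment, for a probability 0 < p < 1."""
    return p / (1.0 - p)


def odds_ratio(e_i, e_j):
    """Ratio of the treatment odds of subject i to those of subject j."""
    return odds(e_i) / odds(e_j)


def within_gamma(e_i, e_j, gamma):
    """True if the pair satisfies 1/gamma <= odds ratio <= gamma."""
    r = odds_ratio(e_i, e_j)
    return 1.0 / gamma <= r <= gamma


# Two subjects with identical observed covariates (hypothetical probabilities).
print(odds_ratio(0.5, 0.5))         # equal probabilities: no hidden bias
print(within_gamma(0.6, 0.5, 2.0))  # odds ratio 1.5 lies inside [1/2, 2]
print(within_gamma(0.9, 0.5, 2.0))  # odds ratio 9 violates the bound for gamma = 2
```

Note that the bound is on the odds of treatment, so a pair such as 0.9 versus 0.5 needs a much larger value of gamma than the ratio of the probabilities alone would suggest.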
Sensitivity Analysis Computations: An example
- S pairs, s = 1,...,S of two subjects, one treated, one control,
matched for observed covariates
- Statistical test: Wilcoxon’s signed rank test (non-parametric),
W is the sum of the ranks of the absolute treated-minus-control differences, taken over the pairs where the difference is positive
- In a moderately large randomized experiment, under the null
hypothesis of no effect, W is approximately normally distributed
$E[W] = S(S+1)/4$, $\mathrm{Var}[W] = S(S+1)(2S+1)/24$
Rosenbaum, Sensitivity Analysis in Observational Studies, 2005
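The signed rank statistic and its normal approximation under the null can be sketched as follows. This is a minimal illustration, assuming no tied absolute differences and using made-up outcome values; the function names are hypothetical. A library routine such as `scipy.stats.wilcoxon` would normally be preferred in practice.

```python
import math

import numpy as np


def signed_rank_statistic(treated, control):
    """Return (W, S): W sums the ranks of |d| over pairs with d > 0."""
    d = np.asarray(treated, dtype=float) - np.asarray(control, dtype=float)
    d = d[d != 0]  # drop zero differences (a common convention)
    # Ranks 1..S of the absolute differences; ties are assumed absent.
    ranks = np.argsort(np.argsort(np.abs(d))) + 1
    return float(ranks[d > 0].sum()), int(d.size)


def null_moments(S):
    """Mean and variance of W under the null in a randomised experiment."""
    return S * (S + 1) / 4, S * (S + 1) * (2 * S + 1) / 24


def upper_tail(W, S):
    """Normal-approximation deviate and one-sided p-value for W."""
    mean, var = null_moments(S)
    z = (W - mean) / math.sqrt(var)
    return z, 0.5 * math.erfc(z / math.sqrt(2))


# Four hypothetical matched pairs: differences are 1, -2, 4, 5.
W, S = signed_rank_statistic([5, 3, 8, 6], [4, 5, 4, 1])
print(W, S)
print(null_moments(S))
```

With only four pairs the normal approximation is not reliable; the toy data are there to show how W is formed, not to be tested.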
Sensitivity Analysis Computations: An example
- Example: W=300, S=25 pairs in a randomised experiment
- In a randomised experiment ($\Gamma \approx 1$, well-matched):
$E[W] = 162.5$, $\mathrm{Var}[W] = 1381.25$, deviate $Z = (300 - 162.5)/\sqrt{1381.25} = 3.70$
- Compared to a normal distribution: p-value = 0.0001
- In a moderately large observational study, under the null
hypothesis of no effect, the distribution of W is approximately bounded between two Normal distributions:
$\mu_{\max} = \lambda S(S+1)/2$, $\mu_{\min} = (1 - \lambda) S(S+1)/2$
$\sigma^2 = \lambda (1 - \lambda) S(S+1)(2S+1)/6$, $\lambda = \Gamma/(1 + \Gamma)$
Notice: for $\Gamma = 1$, $\lambda = 1/2$, so both bounds reduce to the randomised-experiment mean and variance.
Rosenbaum, Sensitivity Analysis in Observational Studies, 2005
Sensitivity Analysis Computations: An example
- Example: W=300, S=25 pairs in a randomised experiment
- For $\Gamma = 2$, $\lambda = \Gamma/(1 + \Gamma) = 2/3$:
$\mu_{\max} = \lambda S(S+1)/2 = 216.67$, $\mu_{\min} = (1 - \lambda) S(S+1)/2 = 108.33$
$\sigma^2 = \lambda (1 - \lambda) S(S+1)(2S+1)/6 = 1227.78$
$Z_1 = (W - \mu_{\min})/\sigma = 5.47 \Rightarrow p = 0.00000002$
$Z_2 = (W - \mu_{\max})/\sigma = 2.38 \Rightarrow p = 0.009$
still significant, even with $\Gamma = 2$
- For the tobacco and lung cancer example, $\Gamma = 6$.
Notice: There are two sources of uncertainty:
1) Due to the causal statistical estimates
2) Due to sensitivity analysis (of unobserved variables, bias)
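The sensitivity bounds for the worked example (W = 300, S = 25) can be reproduced with a few lines of code. This is a sketch of the normal-approximation calculation stated on the slides; the function name `sensitivity_bounds` and the dictionary keys are hypothetical choices made for illustration.

```python
import math


def sensitivity_bounds(W, S, gamma):
    """Bounding normal distributions for the signed rank statistic W."""
    lam = gamma / (1.0 + gamma)
    mu_max = lam * S * (S + 1) / 2
    mu_min = (1.0 - lam) * S * (S + 1) / 2
    var = lam * (1.0 - lam) * S * (S + 1) * (2 * S + 1) / 6
    sd = math.sqrt(var)

    def upper_p(z):
        # One-sided upper-tail probability of a standard normal deviate.
        return 0.5 * math.erfc(z / math.sqrt(2))

    z1 = (W - mu_min) / sd  # most favourable case: smallest p-value
    z2 = (W - mu_max) / sd  # least favourable case: largest p-value
    return {
        "mu_max": mu_max, "mu_min": mu_min, "var": var,
        "z1": z1, "z2": z2, "p_min": upper_p(z1), "p_max": upper_p(z2),
    }


for gamma in (1.0, 2.0):
    b = sensitivity_bounds(300, 25, gamma)
    print(gamma, {k: round(v, 4) for k, v in b.items()})
```

The conclusion is judged by the larger of the two p-values (`p_max`): if it stays below the chosen significance level, the finding is insensitive to hidden bias of that magnitude. Re-running the function over a grid of gamma values shows where the conclusion would start to change.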
Causal Inference
Overview of the course
Causal Effect Estimation
- Observed confounders (Rubin): Regression Adjustment, Propensity score
- Unobserved confounders (Rubin, Pearl): IV, Front-door criterion
Causal Discovery
- Constraint-based, Score-based, FCM
- Estimating causal effects
- Randomised trial vs observational data