Title: Adjusting for selection bias due to missing data in EHR-based research
Authors: Sebastien Haneuse - Harvard TH Chan School of Public Health (United States) [presenting]
Sarah Peskoe - Harvard TH Chan School of Public Health (United States)
David Arterburn - Kaiser Permanente Washington Health Research Institute (United States)
Michael Daniels - University of Florida (United States)
Abstract: Electronic health records (EHR) data provide unique opportunities for medical research, but they also pose numerous challenges. Among these, selection bias due to missing data is under-appreciated. Although standard missing data methods are often applied in the EHR context, they will, in general, fail to capture the complexity of the data, so that residual selection bias may remain. Building on a recently proposed framework for characterizing how data arise in EHR-based studies, we develop and evaluate a statistical framework for regression modeling based on inverse probability weighting that adjusts for selection bias in the complex setting of EHR-based research. We show that the resulting estimator is consistent and asymptotically Normal, and derive the form of its asymptotic variance. Plug-in estimators for the latter are proposed. We use simulations to: (i) highlight the potential for bias in EHR studies when standard approaches are used to account for selection bias, and (ii) evaluate the small-sample operating characteristics of the proposed framework. Finally, the methods are illustrated using data from an ongoing EHR-based study of the effect of bariatric surgery on BMI.
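
As a rough illustration of the general idea behind inverse probability weighting (IPW), the sketch below simulates a regression in which the outcome is observed only for a selected subset, and compares a complete-case fit with a weighted fit. It is a generic, single-mechanism IPW example and not the framework proposed in the abstract, which targets the more complex way data arise in EHR-based studies. All names (`fit_logistic`, `weighted_ols`, the auxiliary variable `z`) and the simulated data-generating model are assumptions made for illustration only.

```python
import numpy as np

def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - p)                        # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def weighted_ols(X, y, w):
    """Solve the weighted least-squares normal equations."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

# Hypothetical simulated data (assumption): z is an auxiliary variable that is
# observed for everyone, predicts the outcome, and drives selection.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)   # E[y | x] = 1 + 2x

# Outcome is observed (r = 1) with a probability depending on x and z.
p_obs = 1.0 / (1.0 + np.exp(-(0.3 * x + 1.0 * z)))
r = rng.binomial(1, p_obs)

# Step 1: model the probability of being observed, using fully observed variables.
S = np.column_stack([np.ones(n), x, z])
p_hat = 1.0 / (1.0 + np.exp(-S @ fit_logistic(S, r)))

# Step 2: fit the outcome regression among observed subjects.
obs = r == 1
X = np.column_stack([np.ones(n), x])
beta_cc = weighted_ols(X[obs], y[obs], np.ones(obs.sum()))   # complete-case fit
beta_ipw = weighted_ols(X[obs], y[obs], 1.0 / p_hat[obs])    # IPW fit

print("complete-case:", beta_cc)
print("IPW:          ", beta_ipw)
```

In this simulated setting the complete-case intercept is pulled away from its true value of 1 because subjects with larger `z` (and hence larger outcomes) are more likely to be observed, whereas weighting each observed subject by the inverse of the estimated observation probability brings the estimates back toward the target. The sketch does not address variance estimation, which the abstract handles through a derived asymptotic variance and plug-in estimators.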