CMStatistics 2020
B0341
Title: Accelerating health research with electronic health records data
Authors: Rebecca Hubbard - University of Pennsylvania (United States) [presenting]
Abstract: Using data generated as a by-product of electronic interactions, including electronic health records (EHR), social media data, and data from wearable devices, has the potential to vastly accelerate research on health and healthcare. Statistical insights on sampling and inference are key to drawing valid conclusions based on these messy and incomplete data sources. We will use previous research on EHR-based phenotyping to motivate a discussion of the roles of informatics, statistics, and data science in the process of learning from EHR data. EHR-based phenotyping is hampered by complex missing data patterns and heterogeneity across patients and healthcare systems, features that have been largely ignored by existing phenotyping methods. As a result, not only are EHR-derived phenotypes imperfect, but they often feature exposure-dependent differential misclassification, which can bias results towards or away from the null. We will discuss novel and existing approaches to EHR-based phenotyping, as well as statistical methods to correct for phenotyping error in analyses. The overall goal is to use the example of phenotyping to illustrate the unique contribution of statistics to the process of generating evidence from modern data sources.