B1009
Title: Semi-supervised learning with electronic health records
Authors: Jessica Gronsbell - University of Toronto (Canada) [presenting]
Abstract: The adoption of electronic health records (EHRs) has generated massive amounts of routinely collected medical data with the potential to improve our understanding of healthcare delivery and disease processes. However, the analysis of EHR data remains both practically and methodologically challenging because the data are recorded as a byproduct of billing and clinical care rather than for research purposes. We will discuss methods that bridge classical statistical theory and modern machine learning tools to extract insight from EHR data efficiently and reliably. We will focus primarily on (i) the challenges in obtaining annotated outcome data, such as the presence of a disease or clinical condition, from patient records and (ii) how to reduce the annotation burden by leveraging unlabeled data in model estimation and evaluation.