CMStatistics 2022
B0546
Title: Automated harmonization of multi-institutional electronic health records data
Authors: Xu Shi - University of Michigan (United States) [presenting]
Abstract: Current practice for electronic health records (EHR) data harmonization involves standardizing data elements via a common data model, a critical step that unifies the medical coding "vocabulary" across study sites. However, even with a common vocabulary, the coding "dialect" (i.e., the use and interpretation of codes for a particular clinical procedure or diagnosis) may differ across data partners due to heterogeneity in care practice and financial drivers. As health systems and coding systems grow more diverse, so does the potential for variation in how a clinical concept is coded. Existing manually curated ontologies and mappings intended to reduce heterogeneity and harmonize data are error-prone and do not scale. Data-sharing constraints bring additional challenges to statistical analysis across institutions. We will present data-driven, privacy-preserving statistical methods for detecting and reducing coding differences between healthcare systems, and share findings from a case study of EHR data harmonization between two healthcare institutions.