CMStatistics 2017
Title: Bayesian functional clustering for laboratory data from electronic medical records
Authors: Jason Roy - Rutgers University (United States) [presenting]
Bret Zeldow - University of Pennsylvania (United States)
Abstract: A Bayesian semiparametric mixed model is proposed for longitudinal data by using an enriched Dirichlet process (EDP) prior. To account for nonlinearities in the outcome over time, we use splines to model the time effect. The nonparametric EDP prior is placed on the regression and spline coefficients, the error variance, and the parameters governing the predictor space. The goal is to predict the outcome at unobserved time points, both for subjects with outcome data at other time points and for completely new subjects. We find improved prediction over mixed models with Dirichlet process (DP) priors when the number of predictors is large. Our method is demonstrated with electronic health record data on new initiators of second-generation antipsychotics, which are known to increase the risk of diabetes. We use our model to predict laboratory values indicative of diabetes for each individual and assess the incidence of suspected diabetes from the predicted dataset.
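The nested clustering that an enriched Dirichlet process prior induces can be illustrated with a small sketch. This is not the authors' implementation. It assumes a truncated stick-breaking representation in which outer clusters carry the outcome-model parameters (regression and spline coefficients, error variance) and each outer cluster has its own inner clusters for the predictor parameters. The function names, concentration parameters, and truncation levels are hypothetical and chosen only for illustration.

```python
import numpy as np

def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    betas = rng.beta(1.0, alpha, size=truncation)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    # Length of stick remaining before each break: 1, (1-b0), (1-b0)(1-b1), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

def sample_edp_weights(alpha_outer, alpha_inner, n_outer, n_inner, seed=0):
    """Draw nested cluster weights (illustrative assumption, not the paper's code).

    outer[j]    : weight of outcome-model cluster j
    inner[j, l] : weight of predictor sub-cluster l within outer cluster j
    joint[j, l] : outer[j] * inner[j, l], the weight of the (j, l) pair
    """
    rng = np.random.default_rng(seed)
    outer = stick_breaking(alpha_outer, n_outer, rng)
    inner = np.stack(
        [stick_breaking(alpha_inner, n_inner, rng) for _ in range(n_outer)]
    )
    joint = outer[:, None] * inner
    return outer, inner, joint

if __name__ == "__main__":
    outer, inner, joint = sample_edp_weights(1.0, 2.0, n_outer=5, n_inner=4)
    print("outer weights:", np.round(outer, 3))
    print("joint weights sum:", joint.sum())
```

Because each outer cluster has its own set of inner weights, subjects can share an outcome trajectory while differing in predictor profile. An ordinary DP prior on the joint parameters does not allow this separation, which is one plausible reading of why the abstract reports a benefit when there are many predictors.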