Title: Modelling big data to predict trajectories of repeated binary outcomes
Authors: Rafiqul Chowdhury - University of Prince Edward Island (Canada)
Tariqul Hasan - University of New Brunswick (Canada) [presenting]
Abstract: In the environmental, health, social and biological sciences, the amount of longitudinal or repeated-measures data being captured and stored is increasing significantly, owing to technological advances and the lower cost of data acquisition. Adequate modelling of such big data is useful for reducing cost and time, developing new products and strategies, and making optimal decisions. Because the literature on analysing big data in a longitudinal or repeated-measures setting is limited, there is significant interest in developing a big data modelling approach for naturally correlated longitudinal data. We develop a joint modelling approach to predict the trajectory risks of a sequence of repeated outcomes of interest. Trajectory prediction from big data collected longitudinally requires a unique modelling approach to overcome additional levels of complexity. The proposed methodology will be compared with various machine learning algorithms, including decision trees, random forests, support vector machines and neural networks. The performance of the proposed model will also be demonstrated using a longitudinally collected fine particulate matter (PM2.5) data set.