CMStatistics 2017
Title: A multivariate mixed-effects selection model for batch-processed proteomics data with non-ignorable missingness
Authors: Lin Chen - University of Chicago (United States) [presenting]
Abstract: In quantitative proteomics, mass tag labeling techniques, such as isobaric tags for relative and absolute quantitation, have been widely adopted in mass spectrometry experiments. These techniques allow peptides/proteins from the multiple samples in a batch to be quantified in a single experiment, and as such greatly improve the efficiency of protein quantitation. However, batch processing of samples also results in severe batch effects and in non-ignorable missing data occurring at the batch level. We developed a multivariate MIxed-effects SElection model framework (mvMISE) to jointly analyze multiple correlated genomic features in labeled proteomics data while accounting for both the batch effects and the non-ignorable missingness. We proposed tailored models to account for different correlation structures among specific high-dimensional features. To model the high correlations among multiple peptides, each of which is a shorter fragment digested from the same protein, we employed a factor-analytic random-effects structure. To model sparse biological dependence among multiple proteins in a functional pathway, we introduced a graphical lasso penalty on the error precision matrix. We developed estimation algorithms for the models. We applied the proposed methods to the breast cancer proteomic data from the Clinical Proteomic Tumor Analysis Consortium, and identified phosphoproteins/pathways showing differential abundances in triple negative breast tumors versus.
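
To illustrate why batch-level non-ignorable missingness matters, the following is a minimal simulation sketch, not the mvMISE model itself. It assumes a random batch effect plus independent noise for each sample, and it assumes that a whole batch goes missing with a probability that follows a logistic function of the batch's own mean abundance. The function name, the logistic form, and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np


def simulate_batch_missingness(n_batches=2000, batch_size=4, seed=0):
    """Simulate abundances with a batch effect and batch-level missingness.

    Hypothetical setup (not taken from the abstract): a whole batch is
    missing with a probability that decreases with its mean abundance,
    so the missingness depends on the unobserved values themselves.
    """
    rng = np.random.default_rng(seed)

    # Random batch effect shared by all samples in a batch.
    batch_effect = rng.normal(0.0, 1.0, size=n_batches)
    # Independent sample-level noise.
    noise = rng.normal(0.0, 0.5, size=(n_batches, batch_size))
    abundance = batch_effect[:, None] + noise

    # Logistic missingness: low-abundance batches are more likely missing.
    batch_mean = abundance.mean(axis=1)
    p_missing = 1.0 / (1.0 + np.exp(2.0 * batch_mean))
    batch_missing = rng.random(n_batches) < p_missing

    # Only samples from non-missing batches are observed.
    observed = abundance[~batch_missing]
    return abundance.mean(), observed.mean(), batch_missing.mean()


true_mean, observed_mean, missing_fraction = simulate_batch_missingness()
print(f"true mean:        {true_mean:.3f}")
print(f"observed mean:    {observed_mean:.3f}")
print(f"missing fraction: {missing_fraction:.3f}")
```

Under these assumptions the mean of the observed values lies above the mean of the complete data, because low-abundance batches drop out preferentially. A selection model addresses this kind of bias by modeling the missingness mechanism jointly with the outcome instead of analyzing the observed values alone.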