CMStatistics 2022
B1487
Title: Gene selection using generalized linear measurement error models
Authors: Hajoung Lee - Sungkyunkwan University (Korea, South) [presenting]
Jaejik Kim - Sungkyunkwan University (Korea, South)
Abstract: Gene expression data are obtained by measuring how much of the genetic information in DNA is expressed through each gene. Gene expression is involved in protein production, which is important for cell function. Many studies have sought to select significant genes from such data in order to understand the causes of disease and to contribute to the development of medications and therapies. However, measurement errors are inevitable when tens of thousands of genes are measured simultaneously with high-throughput equipment, and gene selection methods that account for them are uncommon, because the unclear sources of these errors make them difficult to quantify in practice. Ignoring measurement errors in gene selection may increase the number of falsely discovered genes. To alleviate this problem, a gene selection method based on generalized linear measurement error models (GLMEMs) is proposed. Furthermore, to handle ultra-high dimensionality, an iterative gene screening algorithm is developed that repeats filtering and regularization within the GLMEM framework. The proposed method can reduce the number of falsely discovered genes and provide stable gene selection results under measurement errors. These results are verified through simulation studies and real data analysis.
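To illustrate why ignoring measurement error can distort gene-level effect estimates, the following is a minimal sketch of the classical additive-error setting for a single gene. It is not the GLMEM method or the screening algorithm described in the abstract. The function name `naive_and_corrected_slope`, the linear outcome model, the true effect `beta`, and the assumption that the error variance `error_var` is known are all hypothetical choices made for illustration.

```python
import numpy as np


def naive_and_corrected_slope(w, y, error_var):
    """Return (naive OLS slope, attenuation-corrected slope) for y regressed on noisy w.

    Assumes the classical additive model W = X + U with known Var(U) = error_var.
    """
    w_c = w - w.mean()
    y_c = y - y.mean()
    var_w = np.mean(w_c ** 2)
    naive = np.mean(w_c * y_c) / var_w
    # Reliability ratio: share of Var(W) that comes from the true signal X.
    reliability = (var_w - error_var) / var_w
    return naive, naive / reliability


rng = np.random.default_rng(0)
n = 200_000
beta = 2.0       # hypothetical true effect of the gene
error_var = 1.0  # assumed known measurement-error variance

x = rng.normal(0.0, 1.0, n)                     # true expression (unobserved)
w = x + rng.normal(0.0, np.sqrt(error_var), n)  # observed, error-prone expression
y = beta * x + rng.normal(0.0, 1.0, n)          # outcome depends on the true expression

naive, corrected = naive_and_corrected_slope(w, y, error_var)
print(f"naive slope: {naive:.2f}, corrected slope: {corrected:.2f}")
```

Under these assumed settings the signal and error variances are equal, so the naive slope is attenuated to roughly half of `beta`, while the corrected slope recovers a value close to `beta`. In practice the error variance is rarely known, which is the difficulty the abstract points to.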