Title: On the number of principal components in high dimensions
Authors: Sungkyu Jung - Seoul National University (Korea, South) [presenting]
Abstract: Modern big-data problems call for the study of settings in which the dimension grows while the sample size remains limited. While high-dimension, low-sample-size asymptotics have been a powerful tool for understanding the success and failure of some linear multivariate methods, current tools also exhibit limitations. We will discuss some of these limitations and potential solutions. In particular, we will discuss in detail the problem of how many components to retain when applying principal component analysis in settings where the dimension is much higher than the number of observations. The proposed strategy for estimating the number of components is to sequentially test the skewness of the squared lengths of the residual scores obtained by removing leading principal components. The residual lengths are asymptotically left-skewed if all principal components with diverging variances are removed, and right-skewed otherwise. Some asymptotic properties of the proposed estimator will be discussed; specifically, the estimator is shown to be consistent. The proposed estimator performs well in high-dimensional simulation studies and provides reasonable estimates in a number of real data examples.
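The sequential idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify the test statistic or its critical values, so the sketch assumes a simple sign rule in place of a formal skewness test, stopping at the first k for which the sample skewness of the residual squared lengths is no longer positive. The function names (`residual_sq_lengths`, `sample_skewness`, `estimate_num_components`) and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def residual_sq_lengths(X, k):
    """Squared lengths of each observation's residual after removing
    the k leading sample principal components (X is n x d)."""
    Xc = X - X.mean(axis=0)  # center each variable
    # Thin SVD: columns of U scaled by s are the principal component scores.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s
    return (scores[:, k:] ** 2).sum(axis=1)


def sample_skewness(v):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    c = np.asarray(v, dtype=float) - np.mean(v)
    m2 = np.mean(c ** 2)
    m3 = np.mean(c ** 3)
    return m3 / m2 ** 1.5


def estimate_num_components(X, max_k, threshold=0.0):
    """Return the smallest k in 0..max_k whose residual squared lengths
    are not right-skewed (skewness <= threshold).

    ASSUMPTION: a plain sign rule stands in for the formal sequential
    test mentioned in the abstract. If no k qualifies, max_k is returned
    as a cap, not as a meaningful estimate.
    """
    n, d = X.shape
    # Centered data have rank at most n - 1, so keep at least one
    # non-degenerate residual component.
    if not 0 <= max_k < min(n - 1, d):
        raise ValueError("max_k must satisfy 0 <= max_k < min(n - 1, d)")
    for k in range(max_k + 1):
        if sample_skewness(residual_sq_lengths(X, k)) <= threshold:
            return k
    return max_k


if __name__ == "__main__":
    # Simulated data (illustrative only): n = 30 observations, d = 500
    # variables, two directions with much larger variance than the rest.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((30, 500))
    X[:, 0] += 40.0 * rng.standard_normal(30)
    X[:, 1] += 25.0 * rng.standard_normal(30)
    print("estimated number of components:", estimate_num_components(X, max_k=10))
```

A formal version would replace the sign rule with a skewness test at a chosen significance level; the sketch only shows the order of operations: remove k components, compute residual squared lengths, check skewness, and increase k while the residuals remain right-skewed.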