Title: The multirank likelihood for semiparametric CCA
Authors: Jordan Bryan - Duke University (United States) [presenting]
Peter Hoff - Duke University (United States)
Abstract: In the analysis of multivariate data, it is often of interest to evaluate the dependence between two or more sets of variables rather than the dependence between individual variables. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, while the dependence between variable sets is still described by a parametric model that provides a low-dimensional summary of dependence. This approach generalizes the Gaussian copula model to the case of vector-valued margins. Because maximum likelihood estimation in the proposed model is intractable, we develop a novel MCMC algorithm, cyclically monotone Monte Carlo (CMMC), that yields estimates and confidence regions for the between-set dependence parameters. The algorithm is based on a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We illustrate the proposed inference procedure on nutrient data from the USDA.
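For readers less familiar with the classical technique the abstract starts from, the following is a minimal sketch of standard sample CCA in NumPy. It is not the multirank likelihood or the CMMC algorithm, whose details are not given here. The function name `canonical_correlations` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def canonical_correlations(Y1, Y2):
    """Sample canonical correlations between two variable sets.

    Y1 is an (n, p1) array and Y2 is an (n, p2) array observed on the
    same n units. Returns min(p1, p2) values in descending order.
    """
    # Center each variable set.
    Y1 = Y1 - Y1.mean(axis=0)
    Y2 = Y2 - Y2.mean(axis=0)
    n = Y1.shape[0]

    # Within-set and between-set sample covariance blocks.
    S11 = Y1.T @ Y1 / (n - 1)
    S22 = Y2.T @ Y2 / (n - 1)
    S12 = Y1.T @ Y2 / (n - 1)

    # Whiten each set with a Cholesky factor: S = L L^T.
    L1 = np.linalg.cholesky(S11)
    L2 = np.linalg.cholesky(S22)

    # M = L1^{-1} S12 L2^{-T}; its singular values are the
    # canonical correlations.
    M = np.linalg.solve(L1, S12)
    M = np.linalg.solve(L2, M.T).T
    return np.linalg.svd(M, compute_uv=False)


# Illustrative synthetic data (an assumption, not the USDA nutrient data):
# both sets share one latent factor, so one canonical correlation is large.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 1))
set_a = latent + 0.5 * rng.standard_normal((500, 3))
set_b = latent + 0.5 * rng.standard_normal((500, 2))
print(canonical_correlations(set_a, set_b))
```

The resulting estimates summarize between-set dependence, but inference based on them leans on the joint normality assumption noted in the abstract, which is the restriction the semiparametric approach is designed to relax.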