Title: Partial least squares and interesting directions in data
Authors: John Kent - University of Leeds (United Kingdom) [presenting]
Abstract: Consider the usual multiple linear regression of a response random variable $y$ on a $p$-dimensional vector of explanatory random variables $x$. Ordinary least squares (OLS) estimation finds the linear combination of $x$ that has the highest correlation with $y$. In contrast, partial least squares (PLS) is an iterative method; the first iteration finds the standardized linear combination of $x$ that has the highest covariance with $y$. The restriction to standardized linear combinations makes PLS a ``regularized'' method of regression analysis. Higher-order iterations yield linear combinations of $x$ that have no ``direct'' correlation with $y$, but instead have an ``indirect'' correlation through their correlations with earlier linear combinations of $x$. PLS has close links to envelope models and Krylov matrix decompositions. The sequence of optimal linear combinations identified by PLS can be viewed as a sequence of random variables that is dual to a one-dimensional Gaussian Markov chain; indirect correlation in the regression setting corresponds to conditional dependence in the Markov chain setting. These connections provide some novel insights into the behavior of PLS.
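
The contrast between OLS and the first PLS iteration, and the link to Krylov decompositions, can be illustrated numerically. The following is a minimal sketch, not the author's formulation: it assumes that "standardized" means a unit-length coefficient vector, it works with sample (rather than population) moments, and it uses the textbook NIPALS-style PLS1 recursion. The function names `pls1_weights` and `krylov_basis` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def pls1_weights(X, y, n_components):
    """Unit-length PLS1 weight vectors as columns (NIPALS-style sketch).

    Assumption: 'standardized' is taken to mean a coefficient vector of
    Euclidean length one; X and y are centred internally.
    """
    Xk = X - X.mean(axis=0)          # centred explanatory variables
    yc = y - y.mean()                # centred response
    weights = []
    for _ in range(n_components):
        w = Xk.T @ yc                # proportional to sample cov(x, y) on deflated X
        w = w / np.linalg.norm(w)    # rescale to unit length
        t = Xk @ w                   # scores for this component
        p = Xk.T @ t / (t @ t)       # loadings: regress columns of Xk on t
        Xk = Xk - np.outer(t, p)     # deflate X before the next iteration
        weights.append(w)
    return np.column_stack(weights)


def krylov_basis(X, y, n_components):
    """Normalized Krylov vectors s, S s, S^2 s, ... with S = X'X, s = X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    S = Xc.T @ Xc
    v = Xc.T @ yc
    cols = []
    for _ in range(n_components):
        v = v / np.linalg.norm(v)    # rescale for numerical stability only
        cols.append(v)
        v = S @ v                    # next Krylov direction
    return np.column_stack(cols)


if __name__ == "__main__":
    # Synthetic data, generated purely for illustration.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
    y = X @ rng.normal(size=5) + rng.normal(size=200)

    W = pls1_weights(X, y, 3)
    K = krylov_basis(X, y, 3)
    # Part of each Krylov vector lying outside the span of the PLS weights.
    residual = K - W @ (W.T @ K)
    print("max residual outside span(W):", np.abs(residual).max())
```

Under these assumptions the first weight vector is proportional to the sample covariance vector between $x$ and $y$, whereas the OLS coefficient vector additionally premultiplies by the inverse sample covariance matrix of $x$. The residual printed at the end measures how far the Krylov vectors lie outside the span of the PLS weights; it should be zero up to rounding error, which is one concrete form of the Krylov connection mentioned in the abstract.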