View Submission - CMStatistics

B0199
**Title:** Elemental estimates, influence, and algorithmic leveraging
**Authors:** Keith Knight - University of Toronto (Canada) **[presenting]**

**Abstract:** It is well known that the ordinary least squares estimate can be expressed as a weighted sum of elemental estimates based on subsets of $p$ observations, where $p$ is the dimension of the parameter vector. These weights can be viewed as a probability distribution on subsets of size $p$ of the $n$ observations. We derive its lower-dimensional distributions and define a measure of potential influence for subsets of observations, analogous to the diagonal elements of the hat matrix for single observations. This theory is then applied to algorithmic leveraging, a method for approximating the ordinary least squares estimate when both $n$ and $p$ are large. The method draws a sample of size $m \ll n$ from the observations, sampling high-leverage observations (according to the diagonals of the hat matrix) with higher probability; the regression parameters can then be estimated by either ordinary (unweighted) or weighted least squares on the sampled observations. In particular, we provide a theoretical justification, complementing the empirical evidence, that unweighted estimation generally outperforms weighted estimation.
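The leveraging scheme described in the abstract can be sketched as follows. This is an illustrative implementation, not the authors' code: the simulated data, sample sizes, and the choice of sampling with replacement are assumptions; the weighted variant uses the standard $1/(m\pi_i)$ reweighting from the algorithmic-leveraging literature.

```python
# Illustrative sketch of algorithmic leveraging (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative only).
n, p, m = 1000, 5, 100
X = rng.standard_normal((n, p))
beta = np.arange(1.0, p + 1.0)
y = X @ beta + rng.standard_normal(n)

# Leverage scores: diagonals of the hat matrix H = X (X'X)^{-1} X',
# computed stably via the thin QR decomposition (h_i = ||row i of Q||^2).
Q, _ = np.linalg.qr(X)
h = np.sum(Q**2, axis=1)          # note: sum(h) == p
probs = h / h.sum()

# Draw a subsample of size m << n with probabilities proportional to leverage.
idx = rng.choice(n, size=m, replace=True, p=probs)
Xs, ys = X[idx], y[idx]

# Unweighted (ordinary) least squares on the subsample.
beta_unw = np.linalg.lstsq(Xs, ys, rcond=None)[0]

# Weighted least squares with weights 1/(m * pi_i), which makes the
# subsampled normal equations unbiased for the full-sample ones.
w = 1.0 / (m * probs[idx])
sw = np.sqrt(w)
beta_w = np.linalg.lstsq(Xs * sw[:, None], ys * sw, rcond=None)[0]

# Full-sample OLS for comparison.
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
```

Both subsample estimates approximate the full-sample OLS fit; the abstract's result concerns why the unweighted version tends to perform better.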