View Submission - CMStatistics

B0546
**Title:** On correlation in a simple Gaussian sample that cannot be identified from data
**Authors:** Christian Hennig - University of Bologna (Italy) **[presenting]**

**Abstract:** It is shown that an i.i.d. Gaussian sample $X_1,\ldots,X_n$ cannot be distinguished, based on the observed data, from Gaussian data with constant correlation $r(X_i,X_j)=\rho\neq 0$ for all $i\neq j$. In particular, this violation of the i.i.d. assumption in Gaussian sampling cannot be checked, even though it can have quite a strong impact on inference about the mean and variance parameters. A general definition of identifiability from data is introduced, and how it differs from standard identifiability is discussed. Other parameters that are generally identifiable but not identifiable from data are the cluster membership parameters in a spherical Gaussian fixed partition model, as estimated by $k$-means clustering. There is, however, a different model for $k$-means in which these parameters are identifiable from data.
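
As an informal illustration of the first claim (a sketch for the special case $0 \le \rho < 1$, not necessarily the argument used in the talk), equicorrelated Gaussian data can be written as $X_i = \mu + \sigma(\sqrt{\rho}\,Z_0 + \sqrt{1-\rho}\,Z_i)$ with independent standard Gaussian $Z_0, Z_1, \ldots, Z_n$. Given the single shared draw $Z_0$, one observed data set then looks exactly like an i.i.d. sample from $N(\mu + \sigma\sqrt{\rho}\,Z_0,\ (1-\rho)\sigma^2)$, while the sample mean varies far more across data sets than the i.i.d. model suggests. The function name `equicorrelated_sample` and all parameter values below are hypothetical choices made for this illustration.

```python
import math
import random
import statistics


def equicorrelated_sample(n, mu, sigma, rho, rng):
    """Draw X_1..X_n, Gaussian with mean mu, variance sigma^2 and
    pairwise correlation rho, via a shared-factor construction.
    Assumption of this sketch: 0 <= rho < 1."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("this construction needs 0 <= rho < 1")
    shared = rng.gauss(0.0, 1.0)  # one draw common to all n observations
    return [
        mu + sigma * (math.sqrt(rho) * shared
                      + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(n)
    ]


rng = random.Random(0)
mu, sigma, rho = 0.0, 1.0, 0.5

# Within ONE data set the spread is that of an i.i.d. sample with
# variance (1 - rho) * sigma^2; rho leaves no separate trace.
one_dataset = equicorrelated_sample(2000, mu, sigma, rho, rng)
within_var = statistics.variance(one_dataset)

# ACROSS data sets the sample mean has variance
# sigma^2 * (rho + (1 - rho) / n), not sigma^2 / n as under i.i.d.
n = 200
means = [
    statistics.fmean(equicorrelated_sample(n, mu, sigma, rho, rng))
    for _ in range(2000)
]
var_of_mean = statistics.variance(means)
theory_var_of_mean = sigma**2 * (rho + (1.0 - rho) / n)
iid_var_of_mean = sigma**2 / n

print("within-sample variance:", within_var)
print("variance of sample mean:", var_of_mean)
print("theory (correlated):", theory_var_of_mean, "theory (i.i.d.):", iid_var_of_mean)
```

Because only one data set is observed in practice, the across-data-set variability in the second part is exactly what cannot be seen, which is why the correlation can distort inference about the mean without being detectable.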