CMStatistics 2018
Title: Machine learning methods for initial orthonormal basis selection for functional data
Authors: Heddy Bellout - Lund University (Sweden) [presenting]
Krzysztof Podgorski - Lund University (Sweden)
Abstract: In most current implementations of functional data methods, the effect of the initial choice of the orthonormal basis used to analyze the data has typically not been studied. As a result, trigonometric (Fourier), wavelet, or polynomial bases are most commonly used by default. No formal criteria have been developed to indicate to a researcher which of these bases is preferable for the initial transformation of the data. On the other hand, it is well known that the choice of basis affects the efficiency of retrieving the stochastic structure of the model under study. A classical result in this context is the Karhunen-Loève expansion of the covariance. The basis associated with this expansion is optimal in the total mean square error sense. We will propose quantitative criteria, in terms of computational efficiency and mean square error, that will allow the performance of different bases to be compared in a given problem. The convenience of an a priori chosen orthonormal basis is mostly mathematical; for a given functional data set, however, it may typically be computationally more effective to work with a data-driven basis. We plan to implement machine learning algorithms that choose a basis uniformly for all samples and to study their efficiency against an arbitrary choice of basis. An optimality criterion such as the total mean square error discussed above would be used both in the learning algorithms and in the comparison studies.
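To make the mean square error comparison concrete, the following is a minimal sketch and not the authors' method. It assumes curves observed on a common equispaced grid, uses simulated Brownian-motion-like paths as stand-in data, and treats the plain Euclidean inner product on the grid as a proxy for the L2 inner product. The function names (`cosine_basis`, `empirical_kl_basis`, `truncation_mse`) are hypothetical and chosen only for illustration. It compares a fixed cosine basis with the empirical Karhunen-Loève basis when each curve is truncated to its first k coefficients.

```python
import numpy as np


def cosine_basis(n):
    """Orthonormal discrete cosine (DCT-II) basis, one basis vector per column."""
    t = (np.arange(n) + 0.5) / n
    B = np.cos(np.pi * np.outer(t, np.arange(n)))
    B[:, 0] *= np.sqrt(1.0 / n)   # constant vector
    B[:, 1:] *= np.sqrt(2.0 / n)  # oscillating vectors
    return B


def empirical_kl_basis(X):
    """Eigenvectors of the sample covariance of centered curves X (rows),
    ordered by decreasing eigenvalue."""
    S = X.T @ X / X.shape[0]
    _, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
    return eigvecs[:, ::-1]


def truncation_mse(X, basis, k):
    """Average squared error per curve after projecting onto the first k basis vectors."""
    B = basis[:, :k]
    residual = X - X @ B @ B.T
    return float(np.mean(np.sum(residual ** 2, axis=1)))


# Assumed toy data: 200 random-walk paths on a 64-point grid, centered pointwise.
rng = np.random.default_rng(0)
n_curves, n_points = 200, 64
steps = rng.normal(scale=np.sqrt(1.0 / n_points), size=(n_curves, n_points))
X = np.cumsum(steps, axis=1)
X = X - X.mean(axis=0)

kl = empirical_kl_basis(X)
cos = cosine_basis(n_points)
for k in (1, 3, 8):
    print(k, truncation_mse(X, kl, k), truncation_mse(X, cos, k))
```

On the sample used to estimate it, the empirical Karhunen-Loève basis cannot do worse than any other orthonormal basis at a given truncation level k, which is the discrete counterpart of the optimality property mentioned in the abstract. A fair comparison of the kind the abstract proposes would also need held-out curves and a measure of computational cost, neither of which this sketch covers.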