CMStatistics 2022
B1672
Title: Principal nested shape subspace analysis of molecular data
Authors: Ian Dryden - Florida International University (United States) [presenting]
Abstract: Molecular dynamics simulations produce huge datasets of temporal sequences of molecules. It is of interest to summarize the shape evolution of the molecules in a succinct, low-dimensional representation. However, Euclidean techniques such as principal components analysis (PCA) can be problematic as the data may lie far from a flat manifold. Principal nested spheres give a fundamentally different decomposition of data from the usual Euclidean subspace-based PCA. Subspaces of successively lower dimensions are fitted to the data in a backwards manner with the aim of retaining signal and dispensing with noise at each stage. We adapt the methodology to 3D shape subspaces and provide some practical fitting algorithms. The methodology is applied to cluster analysis of peptides, where different states of the molecules can be identified. Also, the temporal transitions between cluster states are explored. Further molecular modelling tasks include resolution matching, where coarse-resolution models are back-mapped into high-resolution (atomistic) structures.
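To make the backwards fitting idea concrete, the sketch below illustrates a single step of ordinary principal nested spheres on a unit sphere: given an axis v and a geodesic radius r, it computes the residual of a point from the subsphere and projects the point onto it. This is only an illustration of the standard spherical case, not the 3D shape-subspace adaptation or the fitting algorithms described in the abstract. The function names are hypothetical, and v and r are assumed to be already chosen (in practice they would be estimated, for example by minimising the sum of squared residuals).

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def geodesic(x, v):
    """Great-circle distance between unit vectors x and v."""
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, dot(x, v))))


def residual(x, v, r):
    """Signed geodesic deviation of x from the subsphere {y : d(y, v) = r}."""
    return geodesic(x, v) - r


def project_to_subsphere(x, v, r):
    """Move x along the geodesic through v until it lies at distance r from v."""
    rho = geodesic(x, v)
    s = math.sin(rho)
    if abs(s) < 1e-12:
        # x coincides with v or its antipode: the projection is not unique.
        raise ValueError("projection undefined for x = +/- v")
    a = math.sin(r) / s
    b = math.sin(rho - r) / s
    return [a * xi + b * vi for xi, vi in zip(x, v)]


# Illustrative values only (not data from the abstract).
v = [0.0, 0.0, 1.0]                      # assumed axis of the subsphere
r = 0.5                                  # assumed geodesic radius
x = [math.sin(1.0), 0.0, math.cos(1.0)]  # a point at distance 1.0 from v
p = project_to_subsphere(x, v, r)
```

In a full backwards procedure this step would be repeated: the projected points are mapped to a sphere of one dimension lower, a new subsphere is fitted there, and the residuals from each stage serve as the low-dimensional scores.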