Title: Statistical approaches for networks and trees arising from natural language data
Authors: Simon Preston - University of Nottingham (United Kingdom) [presenting]
Katie Severn - University of Nottingham (United Kingdom)
Ian Dryden - University of Nottingham (United Kingdom)
Abstract: Networks and trees arise as natural representations of text data, for example in characterising word-pair co-occurrence or the syntactic structure of individual sentences. We discuss networks and trees as "object data", in which each network or tree is treated as the statistical unit of observation and the sample space is non-Euclidean, and we outline some statistical approaches we have developed for regression, two-sample testing and classification.