Title: Evaluation of similarity for contexts on association rule based extraction
Authors: Ken Nittono - Hosei University (Japan) [presenting]
Abstract: Text mining approaches that extract expressions with particular features and systematically gather the corresponding parts of documents have become increasingly important in recent years. Here, contexts regarded as particular expressions are represented as combinations of terms, and association rules are used to extract them. Association rules make it possible to find essential combinations of terms that are valued throughout a large collection of target documents. The combinations of terms extracted by the association rules serve as pointers to specific parts of the original documents. Latent semantic analysis is applied so that the model relates combinations of terms to contexts. Similarities between the combinations of terms and the contexts are measured in a concept space generated by latent semantic analysis, and these similarities determine which contexts across the whole document collection have particular meanings. Conditions such as the influence of the dimensionality of the concept space on similarity and the composition of cosine values as a similarity measure are evaluated. Collecting contexts with significant similarity leads to the generation of abstracted documents, and accumulating them makes it possible, for instance, to construct a text database that is reusable as knowledge.
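
The similarity measurement described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name `lsa_similarities`, the toy term-by-context matrix, and the choice to project both the term combination and the contexts onto the top-k left singular vectors are all assumptions made for illustration. The parameter `k` stands for the dimensionality of the concept space.

```python
import numpy as np

def lsa_similarities(term_context, combo_terms, k):
    """Cosine similarity between a term combination and every context,
    measured in a k-dimensional LSA concept space (illustrative sketch)."""
    # Rows are terms, columns are contexts (e.g. sentences or passages).
    U, s, Vt = np.linalg.svd(term_context, full_matrices=False)
    U_k = U[:, :k]

    # Context coordinates: projection of each column onto the top-k
    # left singular vectors, which equals V_k * s_k row by row.
    contexts_k = Vt[:k, :].T * s[:k]          # shape (n_contexts, k)

    # Treat the term combination (hypothetically, the item set of an
    # association rule) as a binary pseudo-context, projected the same way.
    q = np.zeros(term_context.shape[0])
    q[list(combo_terms)] = 1.0
    q_k = q @ U_k                              # shape (k,)

    # Cosine values, guarding against vectors that vanish after projection.
    norms = np.linalg.norm(contexts_k, axis=1) * np.linalg.norm(q_k)
    safe = np.where(norms > 1e-12, norms, 1.0)
    return np.where(norms > 1e-12, contexts_k @ q_k / safe, 0.0)

# Toy data (assumed): terms 0 and 1 occur in contexts 0 and 1,
# terms 2 and 3 occur only in context 2.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
sims = lsa_similarities(A, combo_terms=[0, 1], k=2)
```

With this toy matrix, the combination of terms 0 and 1 scores close to 1 against the first two contexts and close to 0 against the third. Repeating the call for several values of `k` is one simple way to examine how the dimensionality of the concept space affects the cosine values.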