Title: Statistics for statisticians
Authors: Jiashun Jin - Carnegie Mellon University (United States) [presenting]
Abstract: A data set of statisticians' publications has been collected and cleaned. For each paper it records the title, authors, author affiliations, abstract, MSC numbers, keywords, references, and citations. It consists of 83,661 papers published in 36 journals in statistics, probability, and related fields, spanning 41 years. The data set motivates an array of interesting problems. We will discuss paper counts, most cited authors and papers, journal ranking, text mining, network analysis, and citation prediction. For text mining, we use the paper abstracts in our data set as the text documents and focus on how to use the estimated topic weights to study the research patterns of individual authors. For network analysis, we focus on hierarchical community detection, membership estimation, and especially on how to characterize the research trajectories of a selected subset of statisticians over the years.