CMStatistics 2022
B1039
Title: Disentangling homophily, community structure and triadic closure in networks
Authors: Tiago Peixoto - CEU GmbH (Austria) [presenting]
Abstract: One of the most typical properties of network data is the presence of homophily, i.e. the increased tendency of an edge to exist between two nodes if they share the same underlying characteristic, such as a social parameter, location, etc. Another pervasive pattern is transitivity, i.e. the increased tendency to observe an edge between two nodes if they share a common neighbor. Although these patterns are indicative of two distinct mechanisms of network formation, namely homophily and triadic closure, respectively, they are generically conflated in non-longitudinal data. This is because both processes can result in the same kinds of observation: (1) the presence of triangles, and (2) the formation of community structure. This conflation means we cannot reliably infer the underlying mechanisms of network formation merely from the abundance of triangles or the observed community structure in network data. We present a solution to this problem, consisting of a principled method to disentangle homophily and community structure from triadic closure in network data. This is achieved by formulating a generative model that first places edges according to community structure and then applies an iterated process of triadic closure. Based on this model, we develop a Bayesian inference algorithm that is capable of identifying which edges are more likely due to community structure and which to triadic closure, in addition to the underlying community structure itself.
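The two-stage generative process described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or inference algorithm: it assumes a simple planted-partition first stage with hypothetical parameters `p_in` and `p_out`, and a single global closure probability `p_close` applied to every open triad in each generation. All function and parameter names are illustrative only.

```python
import itertools
import random


def sbm_edges(blocks, p_in, p_out, rng):
    """Stage 1 (assumed form): place each edge with probability p_in if the
    two nodes share a group label, and p_out otherwise."""
    edges = set()
    for u, v in itertools.combinations(range(len(blocks)), 2):
        p = p_in if blocks[u] == blocks[v] else p_out
        if rng.random() < p:
            edges.add((u, v))  # edges are stored as (small, large) tuples
    return edges


def build_adjacency(n, edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def close_triads(n, edges, p_close, rng):
    """Stage 2, one generation: every open triad u-w-v (u and v not yet
    adjacent) proposes the edge (u, v) with probability p_close."""
    adj = build_adjacency(n, edges)
    new_edges = set()
    for w in range(n):
        for u, v in itertools.combinations(sorted(adj[w]), 2):
            if v not in adj[u] and rng.random() < p_close:
                new_edges.add((u, v))
    return new_edges


def generate(blocks, p_in, p_out, p_close, generations, seed=0):
    """Run stage 1, then iterate stage 2; return the two edge sets separately
    so that each edge keeps the label of the mechanism that created it."""
    rng = random.Random(seed)
    n = len(blocks)
    seminal = sbm_edges(blocks, p_in, p_out, rng)
    edges = set(seminal)
    closure = set()
    for _ in range(generations):
        added = close_triads(n, edges, p_close, rng) - edges
        closure |= added
        edges |= added
    return seminal, closure


def count_triangles(n, edges):
    """Count triangles; each one is seen once per edge, hence the division."""
    adj = build_adjacency(n, edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


if __name__ == "__main__":
    blocks = [i // 20 for i in range(60)]  # three groups of 20 nodes
    seminal, closure = generate(blocks, 0.15, 0.01, 0.1, generations=2)
    print("community edges:", len(seminal))
    print("closure edges:", len(closure))
    print("triangles:", count_triangles(len(blocks), seminal | closure))
```

In this sketch the generator knows which mechanism produced each edge, whereas an observer of the final graph does not. Recovering that labelling, together with the group memberships, is the inference problem the abstract addresses.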