Wedge Sampling for Computing Clustering Coefficients and Triangle Counts on Large Graphs

Abstract

Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Despite the importance of these triadic measures, algorithms to compute them can be extremely expensive. We discuss the method of wedge sampling. This versatile technique allows for the fast and accurate approximation of various types of clustering coefficients and triangle counts. Furthermore, these techniques are extensible to counting directed triangles in digraphs. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state of the art, while providing nearly the accuracy of full enumeration.

Publication
Statistical Analysis and Data Mining
Date
Citation
C. Seshadhri, A. Pinar, T. G. Kolda. Wedge Sampling for Computing Clustering Coefficients and Triangle Counts on Large Graphs. Statistical Analysis and Data Mining, Vol. 7, No. 4, pp. 294-307, 2014. https://doi.org/10.1002/sam.11224

BibTeX

@article{SePiKo14,  
author = {C. Seshadhri and Ali Pinar and Tamara G. Kolda}, 
title = {Wedge Sampling for Computing Clustering Coefficients and Triangle Counts on Large Graphs}, 
journal = {Statistical Analysis and Data Mining}, 
volume = {7}, 
number = {4}, 
pages = {294-307}, 
month = {August}, 
year = {2014},
doi = {10.1002/sam.11224},
eprint = {1309.3321},
}