A Scalable Generative Graph Model with Community Structure

Abstract

Network data is ubiquitous and growing, yet we lack realistic generative models that can be calibrated to match real-world data. The recently proposed Block Two-Level Erdős-Rényi (BTER) model can be tuned to capture two fundamental properties: degree distribution and clustering coefficients. The latter is particularly important for reproducing graphs with community structure, such as social networks. In this paper, we compare BTER to other scalable models and show that it gives a better fit to real data. We provide a scalable implementation that requires only $O(d_{\max})$ storage where $d_{\max}$ is the maximum number of neighbors for a single node. The generator is trivially parallelizable, and we show results for a Hadoop implementation for a modeling a real-world web graph with over 4.6 billion edges. We propose that the BTER model can be used as a graph generator for benchmarking purposes and provide idealized degree distributions and clustering coefficient profiles that can be tuned for user specifications.

Publication
SIAM Journal on Scientific Computing
Date
Tags
Citation
T. G. Kolda, A. Pinar, T. Plantenga, C. Seshadhri. A Scalable Generative Graph Model with Community Structure. SIAM Journal on Scientific Computing, Vol. 36, No. 5, pp. C424-C452, 2014. https://doi.org/10.1137/130914218

Keywords

graph generator, network data, block two-level Erd\H{o}s-R\‘enyi (BTER) model, large-scale graph benchmarks

BibTeX

@article{KoPiPlSe14,  
author = {Tamara G. Kolda and Ali Pinar and Todd Plantenga and C. Seshadhri}, 
title = {A Scalable Generative Graph Model with Community Structure}, 
journal = {SIAM Journal on Scientific Computing}, 
volume = {36}, 
number = {5}, 
pages = {C424--C452}, 
month = {September}, 
year = {2014},
doi = {10.1137/130914218},
}