Sampling and Estimation on Large Online Social Networks

Date

2016-12

Abstract

Studying the structural characteristics of online social networks (OSNs) provides useful information about the underlying OSN applications. Because these networks are very large, sampling is a commonly used approach. One of the challenges in such studies is to develop proper statistical estimators under the limitations imposed by the OSN service providers. This thesis explores various sampling schemes for estimating structural properties of OSNs. First, we empirically show that the performance of the estimators depends strongly on the studied characteristic and the underlying structure of the graph. Second, we propose estimators for the network size and the average degree under a very limited data access model, which we call the random neighbor access (RNA) model. The motivation is to understand the performance of the estimators when the OSN service providers significantly limit access to their data. Third, we propose various estimators for the average degree under ego-centric sampling. Each estimator utilizes different information in the sampled ego-networks. Finally, we propose estimators for clustering coefficient measures by combining the Metropolis-Hastings random walk with wedge sampling.

Keywords

Sampling (Statistics), Estimation theory, Online social networks, Design-unbiased, Graph theory--Data processing, Experimental design

Rights

Copyright ©2016 is held by the author. Digital access to this material is made possible by the Eugene McDermott Library. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
