Sampling and Estimation on Large Online Social Networks
Studying the structural characteristics of online social networks (OSNs) provides useful information about the underlying OSN applications. Because these networks are very large, sampling is a commonly used approach. One of the challenges in such studies is to develop proper statistical estimators under the limitations imposed by the OSN service providers. This thesis explores various sampling schemes for estimating structural properties of OSNs. First, we show empirically that the performance of the estimators depends strongly on the characteristic being studied and on the underlying structure of the graph. Second, we propose estimators for the network size and the average degree under a very limited data access model, which we call the random neighbor access (RNA) model. The motivation is to understand how the estimators perform when the OSN service providers significantly limit access to their data. Third, we propose various estimators for the average degree under ego-centric sampling. Each estimator utilizes different information in the sampled ego-networks. Finally, we propose estimators for clustering coefficient measures by combining the Metropolis-Hastings random walk with wedge sampling.
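The abstract does not give the estimators' formulas, so the following is only a minimal Python sketch of the general idea behind the last contribution, not the thesis's method. It assumes an undirected, connected graph held in memory as adjacency lists. A Metropolis-Hastings random walk is used to visit nodes approximately uniformly, and at each visited node one wedge (a pair of its neighbors) is sampled and checked for closure. All names here (`toy_graph`, `mh_random_walk`, `estimate_avg_degree`, `estimate_avg_clustering`) and the burn-in length are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}


def mh_random_walk(adj, start, steps, rng):
    """Yield the nodes visited by a Metropolis-Hastings random walk.

    On a connected undirected graph, the stationary distribution of this
    walk is uniform over nodes.
    """
    current = start
    for _ in range(steps):
        proposal = rng.choice(adj[current])
        # Accept the move with probability min(1, deg(current) / deg(proposal));
        # otherwise stay at the current node.
        if rng.random() < len(adj[current]) / len(adj[proposal]):
            current = proposal
        yield current


def estimate_avg_degree(adj, start, steps, burn_in, rng):
    """Average the degrees of the visited nodes, skipping the burn-in."""
    degrees = [
        len(adj[v])
        for i, v in enumerate(mh_random_walk(adj, start, steps, rng))
        if i >= burn_in
    ]
    return sum(degrees) / len(degrees)


def estimate_avg_clustering(adj, start, steps, burn_in, rng):
    """Estimate the average local clustering coefficient.

    At each visited node, sample one wedge (two distinct neighbors) and
    record whether the two neighbors are themselves connected. Nodes with
    fewer than two neighbors contribute zero.
    """
    closed = 0
    total = 0
    for i, v in enumerate(mh_random_walk(adj, start, steps, rng)):
        if i < burn_in:
            continue
        total += 1
        if len(adj[v]) >= 2:
            a, b = rng.sample(adj[v], 2)
            if b in adj[a]:
                closed += 1
    return closed / total
```

On the toy graph the true average degree is 14/6 and the true average local clustering coefficient is 7/9, so a call such as `estimate_avg_degree(toy_graph, 0, 20000, 1000, random.Random(0))` should land close to the former. On a real OSN the adjacency lookups would be replaced by API calls, and the rate limits on those calls are exactly the kind of access restriction the abstract describes.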