Issue No.03 - March (2013 vol.25)
pp: 662-676
Manos Papagelis, University of Toronto, Toronto
Gautam Das, University of Texas at Arlington, Arlington
Nick Koudas, University of Toronto, Toronto
ABSTRACT
As online social networking grows, there has been increased interest in using the underlying network structure, as well as the information available about social peers, to better serve the information needs of a user. In this paper, we focus on improving the performance of information collection from the neighborhood of a user in a dynamic social network. We introduce sampling-based algorithms that efficiently explore a user's social network while respecting its structure and that quickly approximate quantities of interest. We introduce and analyze variants of the basic sampling scheme exploring correlations across our samples. Models of centralized and distributed social networks are considered. We show that our algorithms can be used to rank items in the neighborhood of a user, assuming that information for each user in the network is available. Using real and synthetic data sets, we validate the results of our analysis and demonstrate the efficiency of our algorithms in approximating quantities of interest. The methods we describe are general and can probably be adopted with little effort in a variety of strategies that aim to collect information efficiently from a social graph.
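To make the idea of approximating a quantity over a user's neighborhood by sampling concrete, here is a minimal sketch of a Knuth-style random-probe estimator, in the spirit of the tree-size estimation work cited in the references. It is an illustration, not the paper's algorithm. It assumes the neighborhood has already been arranged as a tree rooted at the user (a real social graph has cycles and shared friends, which this sketch ignores). The names `probe`, `estimate_sum`, `tree`, and `value` are hypothetical.

```python
import random

def probe(tree, root, value, max_depth, rng):
    """Follow one random root-to-leaf path, at most max_depth hops deep.

    Each visited node is weighted by the product of the branching
    factors above it, which makes the returned total an unbiased
    estimate of sum(value(v)) over all nodes within max_depth hops.
    Assumption: `tree` maps a node to its list of children (a tree,
    not a general graph).
    """
    total = 0.0
    weight = 1.0
    node = root
    for depth in range(max_depth + 1):
        total += weight * value(node)
        children = tree.get(node, [])
        if not children or depth == max_depth:
            break
        weight *= len(children)      # this node stands in for its siblings
        node = rng.choice(children)  # uniform choice among children
    return total

def estimate_sum(tree, root, value, max_depth, probes, seed=0):
    """Average several independent probes to reduce variance."""
    rng = random.Random(seed)
    return sum(probe(tree, root, value, max_depth, rng)
               for _ in range(probes)) / probes

# Hypothetical neighborhood of user "u": two friends, one of whom has three friends.
tree = {"u": ["a", "b"], "a": ["c", "d", "e"], "b": []}
liked = {"b", "c"}  # hypothetical set of users who endorsed some item
approx_count = estimate_sum(tree, "u", lambda v: 1.0 if v in liked else 0.0,
                            max_depth=2, probes=20000)
```

Each probe touches at most `max_depth + 1` users instead of the whole neighborhood. The price is variance, which grows when branching factors differ widely between users, so the number of probes has to be chosen with the network's degree distribution in mind.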
INDEX TERMS
Social network services, Peer to peer computing, Information technology, Algorithm design and analysis, Performance evaluation, Search engines, performance evaluation of algorithms and systems, Information networks, search process, query processing
CITATION
Manos Papagelis, Gautam Das, Nick Koudas, "Sampling Online Social Networks", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 3, pp. 662-676, March 2013, doi:10.1109/TKDE.2011.254