Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)
Mining and Predicting Duplication over Peer-to-Peer Query Streams
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2702-7
Shicong Meng, Shanghai Jiao Tong University, Shanghai, 200030, P.R.China.
Yifeng Shao, Shanghai Jiao Tong University, Shanghai, 200030, P.R.China.
Cong Shi, Shanghai Jiao Tong University, Shanghai, 200030, P.R.China.
Dingyi Han, Shanghai Jiao Tong University, Shanghai, 200030, P.R.China.
Yong Yu, Shanghai Jiao Tong University, Shanghai, 200030, P.R.China.
Many previous studies that mine user queries in Peer-to-Peer systems have focused on the distribution of query contents. However, little has been done to better understand the time series distribution of these queries, which is vital for system performance. To remedy this situation, this paper mines query streams using automatic time series analysis to evaluate different linear models (Box-Jenkins models and some simple windowed-mean models) for predicting the number of duplicated queries from 10 minutes to 2 hours into the future. Both the predictive power and the computational costs of these models are evaluated over 318,942,450 real-world Gnutella queries collected over 3 months. We find that the number of duplicated queries is consistently predictable. Simple, practical models such as AR perform well on prediction.
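As a rough illustration of the two model families named in the abstract, the sketch below contrasts a windowed-mean predictor with a first-order autoregressive (AR) predictor fitted by ordinary least squares. It is not the authors' implementation: the function names, the choice of order 1, and the sample counts are assumptions made purely for illustration, and the abstract does not state which model orders or fitting procedure were used.

```python
from statistics import mean


def windowed_mean_forecast(series, window):
    """Predict the next value as the mean of the last `window` observations."""
    if window <= 0 or len(series) < window:
        raise ValueError("need at least `window` observations")
    return mean(series[-window:])


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    if len(x_prev) < 2:
        raise ValueError("need at least three observations")
    mean_prev, mean_next = mean(x_prev), mean(x_next)
    variance = sum((a - mean_prev) ** 2 for a in x_prev)
    if variance == 0:
        # Constant history: fall back to predicting the mean.
        return mean_next, 0.0
    covariance = sum(
        (a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next)
    )
    phi = covariance / variance
    return mean_next - phi * mean_prev, phi


def ar1_forecast(series, steps=1):
    """Iterate the fitted AR(1) recurrence `steps` intervals ahead."""
    c, phi = fit_ar1(series)
    value = series[-1]
    for _ in range(steps):
        value = c + phi * value
    return value


# Hypothetical duplicated-query counts per 10-minute interval (made-up data).
counts = [120, 135, 128, 140, 150, 146, 155]
print(windowed_mean_forecast(counts, window=3))
print(ar1_forecast(counts, steps=1))   # 10 minutes ahead
print(ar1_forecast(counts, steps=12))  # 2 hours ahead, assuming 10-minute bins
```

With 10-minute bins, the 10-minute to 2-hour horizons mentioned in the abstract would correspond to 1 to 12 steps ahead; the bin width is likewise an assumption of this sketch.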
Citation:
Shicong Meng, Yifeng Shao, Cong Shi, Dingyi Han, Yong Yu, "Mining and Predicting Duplication over Peer-to-Peer Query Streams," icdmw, pp.648-652, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), 2006