Proceedings 2001 IEEE International Conference on Data Mining (2001)
San Jose, California
Nov. 29, 2001 to Dec. 2, 2001
We present a collective approach to mining Bayesian networks from distributed, heterogeneous web-log data streams. In this approach, we first learn a local Bayesian network at each site using the local data. Each site then identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learnt at the central site using the data transmitted from the local sites. The local and central Bayesian networks are combined to obtain a collective Bayesian network that models the entire data. We applied this technique to mine multiple data streams where data centralization is difficult because of large response time and scalability issues. Experimental results and theoretical justification that demonstrate the feasibility of our approach are presented.
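The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: a Laplace-smoothed joint table stands in for each learned local Bayesian network, low likelihood under the local model is used as the signal for "evidence of coupling", only observations flagged by both sites are joined at the central site, and a mutual-information threshold stands in for learning the central network. All function names are hypothetical.

```python
import math
from collections import Counter

def fit_joint(rows):
    """Laplace-smoothed joint over binary tuples (assumed stand-in for a local network)."""
    counts = Counter(rows)
    denom = len(rows) + 2 ** len(rows[0])
    return lambda row: (counts[row] + 1) / denom

def least_likely(rows, prob, k):
    """Indices of the k observations with the lowest probability under the local model."""
    order = sorted(range(len(rows)), key=lambda i: (prob(rows[i]), i))
    return set(order[:k])

def mutual_information(pairs):
    """Empirical mutual information (in nats) between two discrete variables."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def cross_links(site_a, site_b, k, threshold):
    """Central step: join the transmitted rows by index and propose cross-site links."""
    flagged_a = least_likely(site_a, fit_joint(site_a), k)
    flagged_b = least_likely(site_b, fit_joint(site_b), k)
    sent = sorted(flagged_a & flagged_b)  # assumption: keep rows flagged by both sites
    links = []
    for i in range(len(site_a[0])):
        for j in range(len(site_b[0])):
            pairs = [(site_a[t][i], site_b[t][j]) for t in sent]
            if mutual_information(pairs) > threshold:
                links.append((i, j))  # (variable at site A, variable at site B)
    return links

# Toy data: row t at each site describes the same underlying observation t.
site_a = [(0, 0)] * 6 + [(1, 0), (1, 0), (0, 1), (0, 1)]
site_b = [(0, 0)] * 6 + [(1, 0), (1, 1), (0, 0), (0, 1)]
print(cross_links(site_a, site_b, k=4, threshold=0.5))
```

In this sketch the collective model would be the union of each site's local structure and the returned cross-site links; only a fraction k of the observations ever leaves a site, which is the point of the approach when centralizing the full streams is impractical.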
R. Chen, H. Kargupta and K. Sivakumar, "Distributed Web Mining Using Bayesian Networks from Multiple Data Streams," Proceedings 2001 IEEE International Conference on Data Mining (ICDM), San Jose, California, 2001, pp. 75.