Web Information Systems Engineering, International Conference on (2001)
Dec. 3, 2001 to Dec. 6, 2001
F. Masseglia, LIRMM
M. Teisseire, LIRMM
P. Poncelet, LIRMM
The behaviour of a Web site's users may change so quickly that making predictions from the frequent patterns found in an access log file becomes challenging. To keep the behavioural patterns from becoming obsolete, the ideal method would provide frequent patterns in real time, making the result available immediately. In this paper we propose a method that finds frequent behavioural patterns in real time, whatever the number of connected users. Given how quickly frequent behaviour patterns can change after the last analysis of the access log file, this result provides navigation schemas that are fully adapted to predicting user behaviour. Based on a distributed heuristic, our method also addresses several problems studied within the data mining framework: discovering "interesting zones" (a large number of frequent patterns concentrated over a period of time, or the discovery of "super-frequent" patterns), discovering very long sequential patterns, and interactive data mining ("on the fly" modification of the minimum support).
real time, interactive data mining, zone mining, heuristic, distributed, long sequential patterns.
M. Teisseire, P. Poncelet and F. Masseglia, "Real-Time Web Usage Mining: A Heuristic Based Distributed Miner," Proceedings of 2nd International Conference on Web Information Systems Engineering (WISE), Kyoto, Japan, 2001, pp. 0288.