17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05)
A Robust Approach to Sequence Classification
Hong Kong, China
November 14-16, 2005
ISBN: 0-7695-2488-5
We report results for the classification of representations of music, spoken words, and text documents. In experimental comparisons, our approach improves on other state-of-the-art algorithms on all three tasks. We use a Support Vector Machine (SVM) as the classifier in all experiments; it is driven by a kernel matrix of similarity measures between the sequences. Our similarity measure is based on n-grams of varying length (multi-grams), weighted to reflect their discrimination ability. Because the number of candidate features grows exponentially with n, we use a modified LZ78 algorithm [1] to guide feature selection. The method performs well across the three widely differing tasks reported here and is computationally efficient, so it may be useful in real-time applications.
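To make the pipeline concrete, the following is a minimal sketch of how an LZ78 parse might select variable-length n-grams and how a kernel matrix could be built from them. It rests on several assumptions that are not taken from the paper: sequences are plain Python strings, the parse is the standard LZ78 dictionary construction rather than the modified version cited as [1], the features are unweighted occurrence counts (the discrimination weighting is omitted), and the similarity is a cosine-normalised dot product. The function names `lz78_phrases`, `multigram_counts`, and `kernel_matrix` are hypothetical.

```python
from collections import Counter
from math import sqrt


def lz78_phrases(seq):
    """Return the set of phrases produced by a plain LZ78 parse of seq.

    Assumption: standard LZ78, not the modified variant cited as [1].
    """
    phrases = set()
    current = ""
    for symbol in seq:
        current += symbol
        if current not in phrases:
            # A new phrase ends here; start the next one from scratch.
            phrases.add(current)
            current = ""
    return phrases


def multigram_counts(seq, vocabulary):
    """Count overlapping occurrences of each vocabulary phrase in seq."""
    counts = Counter()
    for phrase in vocabulary:
        n = len(phrase)
        counts[phrase] = sum(
            1 for i in range(len(seq) - n + 1) if seq[i:i + n] == phrase
        )
    return counts


def kernel_matrix(sequences):
    """Cosine-normalised linear kernel over LZ78-selected multigram counts.

    Assumption: unweighted counts; the paper's weighting is not modelled.
    """
    # The feature set is the union of phrases found in any sequence.
    vocabulary = set()
    for seq in sequences:
        vocabulary |= lz78_phrases(seq)

    vectors = [multigram_counts(seq, vocabulary) for seq in sequences]
    norms = [sqrt(sum(c * c for c in v.values())) for v in vectors]

    size = len(sequences)
    K = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            dot = sum(vectors[i][p] * vectors[j][p] for p in vocabulary)
            if norms[i] and norms[j]:
                K[i][j] = dot / (norms[i] * norms[j])
    return K


K = kernel_matrix(["abababab", "abababab", "cccccccc"])
```

Because the dictionary holds at most one phrase per input symbol, the feature set stays linear in the sequence length instead of growing exponentially with n. A matrix such as `K` could then be handed to any SVM implementation that accepts a precomputed kernel.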