Los Angeles, CA
March 31, 2009 to April 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.346
This paper presents an efficient text classification algorithm for repeated text information on e-commerce sites; it automatically classifies and sorts similar strings. The algorithm greatly increases the efficiency and accuracy of information auditing. All tests show that the algorithm is very efficient when the number of information items is between 100 and 1000, and that comparing 1000 text items (strings) can be kept within two seconds. When the number of items exceeds 1000, the computation time increases significantly. Precision can be tuned by adjusting the relevant parameters of the algorithm, such as the number of identical substrings required in the comparison results and the length of the split strings. For very short items, such as those of fewer than 10 words, the algorithm can be combined with the Levenshtein algorithm to improve text-search flexibility.
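The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: split strings into fixed-length substrings, count the substrings two strings share, and fall back to Levenshtein distance for short texts. The function names, the sliding-window splitting, the greedy grouping, and the parameter defaults (`chunk_len`, `min_shared`, `short_words`, `max_edits`) are all assumptions made for illustration, not the authors' implementation.

```python
def split_chunks(text, length):
    """Return the set of all substrings of `length` characters (sliding window).

    Assumption: the paper's "split string" step is modelled as a sliding window.
    """
    return {text[i:i + length] for i in range(len(text) - length + 1)}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def is_similar(a, b, chunk_len=4, min_shared=3, short_words=10, max_edits=3):
    """Decide whether two texts count as similar (all thresholds are hypothetical)."""
    # Short texts (fewer than `short_words` words) use edit distance instead.
    if len(a.split()) < short_words or len(b.split()) < short_words:
        return levenshtein(a, b) <= max_edits
    shared = split_chunks(a, chunk_len) & split_chunks(b, chunk_len)
    return len(shared) >= min_shared


def group_similar(texts, **params):
    """Greedily put each text into the first group whose first member is similar."""
    groups = []
    for text in texts:
        for group in groups:
            if is_similar(group[0], text, **params):
                group.append(text)
                break
        else:
            groups.append([text])
    return groups
```

In this sketch, raising `min_shared` or `chunk_len` makes matching stricter, which mirrors the abstract's remark that precision depends on the number of identical substrings and the split length. The pairwise comparison against group representatives also suggests why running time would grow quickly as the number of items increases.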
text classification algorithm, e-commerce, text similarity
Wu Da-sheng, Yu Qin-fen, Liu Li-juan, "An Efficient Text Classification Algorithm in E-commerce Application", 2009 WRI World Congress on Computer Science and Information Engineering (CSIE 2009), pp. 458-461, doi:10.1109/CSIE.2009.346