2018 IEEE International Conference on Big Data and Smart Computing (BigComp) (2018)
Jan 15, 2018 to Jan 17, 2018
We applied three machine learning models to classify Chinese news articles into a set of classes, following two schemes. The first scheme converts the texts into TF-IDF matrices before training support vector machine (SVM) and maximum entropy (MAXENT) models; the second uses an embedding layer in a convolutional neural network (CNN) to learn features during training. We then compared the models in terms of overall accuracy, precision, recall, and F-score. The MAXENT model performed best, with an overall accuracy of 93.71%. The CNN model performed worse than both the MAXENT and SVM models, with an overall accuracy of about 73.58%. This result was unexpected, and we conclude with some considerations about the CNN design and possible future improvements.
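As an illustration of the first scheme, the following is a minimal sketch of how a TF-IDF matrix can be built from news texts. It is not the authors' implementation: the abstract does not state which word segmenter or TF-IDF weighting variant was used, so the sketch assumes the texts are already segmented into words and uses relative term frequency multiplied by log(N / df). The function name `tfidf_matrix` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a dense TF-IDF matrix from pre-segmented documents.

    docs: list of token lists (assumed already word-segmented).
    Returns (vocab, rows), where rows[i][j] is the weight of vocab[j]
    in document i.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)

    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))

    # Assumed IDF variant: log(N / df), with no smoothing.
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        rows.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, rows


# Hypothetical toy corpus: two finance items and one sports item.
docs = [
    ["股市", "上涨", "投资"],
    ["球队", "比赛", "胜利"],
    ["股市", "下跌", "投资"],
]
vocab, matrix = tfidf_matrix(docs)
```

Each row of `matrix` could then serve as the feature vector for one article in an SVM or maximum entropy (multinomial logistic regression) classifier. Terms shared across classes, such as 股市 here, receive lower weights than terms unique to a single document.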
entropy, feedforward neural nets, information resources, learning (artificial intelligence), pattern classification, support vector machines
D. Cecchini and L. Na, "Chinese News Classification," 2018 IEEE International Conference on Big Data and Smart Computing (BigComp), Shanghai, China, 2018, pp. 681-684.