Issue No. 04 - July/August (2003 vol. 18)
ISSN: 1541-1672
pp: 34-42
Hans van Halteren, University of Nijmegen
<p>Machine learning feature sets that were originally developed for authorship attribution can be used for summarization by sentence extraction. In the author's pilot experiment, these feature sets distinguished significantly better between extract and nonextract sentences than a random baseline classifier, but they had to be carefully combined with other features to outperform a positional baseline classifier. In the DUC 2002 competition, a combination system trained on 400-word single-document extracts was one of the best performers in the 200- and 400-word multidocument extraction tasks. Further experiments showed that this system could be improved significantly with training material that better reflected the intended task.</p>
summarization, sentence extraction, machine learning, style recognition
Hans van Halteren, "New Feature Sets for Summarization by Sentence Extraction", IEEE Intelligent Systems, vol. 18, no. 4, pp. 34-42, July/August 2003, doi:10.1109/MIS.2003.1217626