2012 IEEE 12th International Conference on Data Mining Workshops (2012)
Brussels, Belgium
Dec. 10, 2012
ISBN: 978-1-4673-5164-5
pp: 865-868
This paper presents a technical description of a solution for the International Conference on Data Mining 2012 Contest -- Consumer Products number 1. The contest provided a dataset including thousands of text items, a product catalog with over fifteen million products, and hundreds of manually annotated product mentions to support data-driven approaches. The task was to identify product mentions within a large corpus of user-generated web text and to disambiguate those mentions against the product catalog. The solution is an ensemble-based algorithm for processing textual content. It uses Conditional Random Fields and a special approach that recognizes product mentions. The solution finished in third place in the contest.
Catalogs, Consumer products, Lead, Algorithm design and analysis, Conferences, Training data, Prediction algorithms, ICDM Contest, Conditional Random Field, Sequence Tagging, Named Entity Recognition
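
The abstract frames product mention recognition as sequence tagging with Conditional Random Fields. As a rough illustration of what that framing usually involves, the sketch below converts annotated mention spans into per-token labels and back. The BIO labeling scheme, the `PROD` label name, the token-index span format, and the helper names `bio_encode` and `bio_decode` are all assumptions made for this example; the abstract does not state which label scheme or features the authors used.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed format)


def bio_encode(tokens: List[str], mentions: List[Span]) -> List[str]:
    """Turn mention spans into one B-PROD / I-PROD / O label per token."""
    labels = ["O"] * len(tokens)
    for start, end in mentions:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span {(start, end)} is outside the token list")
        labels[start] = "B-PROD"          # first token of the mention
        for i in range(start + 1, end):
            labels[i] = "I-PROD"          # remaining tokens of the mention
    return labels


def bio_decode(labels: List[str]) -> List[Span]:
    """Recover mention spans from a label sequence, e.g. a tagger's output."""
    spans: List[Span] = []
    start = None
    for i, label in enumerate(labels):
        if label == "B-PROD":
            if start is not None:         # a new mention closes the open one
                spans.append((start, i))
            start = i
        elif label == "I-PROD":
            if start is None:             # tolerate an I- tag with no B- tag
                start = i
        else:
            if start is not None:         # an O tag closes the open mention
                spans.append((start, i))
                start = None
    if start is not None:                 # mention running to the last token
        spans.append((start, len(labels)))
    return spans


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the contest data.
    tokens = "I bought a Canon EOS 600D yesterday".split()
    labels = bio_encode(tokens, [(3, 6)])
    print(list(zip(tokens, labels)))
    print(bio_decode(labels))
```

In a pipeline of this kind, the encoded labels would serve as training targets for the tagger, and the decoded spans would be passed on to the step that matches each mention against the product catalog.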

