2012 IEEE 12th International Conference on Data Mining Workshops
An NER-based Product Identification and Lucene-based Product Linking Approach to CPROD1 Challenge: Description of Submission System to CPROD1 Challenge
Brussels, Belgium
December 10, 2012
ISBN: 978-1-4673-5164-5
This paper presents our methodology for the CPROD1 Challenge, whose task is to identify product mentions in text and then link each mention to entries in the catalog file. Our solution follows two steps. First, we use processing pipelines to extract product mentions by combining multiple techniques, including traditional named entity recognition (NER), regular expression rules, and gazetteer-based string matching. Second, we cast the product linking task as an information retrieval (IR) problem, in which the catalog file of product descriptions is populated into a database. Each product mention then acts as a search query, and the results returned from the catalog entry database serve as the links. The F1 scores of our submission on the public and private test data are 24.82% and 16.04%, respectively.
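The two steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the trained NER component is omitted, a small TF-IDF term-overlap scorer stands in for the Lucene index, and the catalog entries, gazetteer, regular expression, and function names (`extract_mentions`, `build_index`, `link`) are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data; the actual challenge catalog and text are not shown here.
CATALOG = {
    "p1": "Canon EOS 550D digital SLR camera",
    "p2": "Nikon D90 digital SLR camera body",
    "p3": "Apple iPod Touch 32GB",
}
GAZETTEER = ["canon eos 550d", "ipod touch"]  # assumed list of known product names
# Assumed rule: uppercase letters followed by digits, e.g. "D90".
MODEL_PATTERN = re.compile(r"\b[A-Z]+\d+[A-Z0-9]*\b")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_mentions(text):
    """Step 1: collect (start, end, surface) spans from gazetteer and regex matches."""
    spans = set()
    for name in GAZETTEER:
        for m in re.finditer(re.escape(name), text, re.IGNORECASE):
            spans.add((m.start(), m.end(), m.group()))
    for m in MODEL_PATTERN.finditer(text):
        spans.add((m.start(), m.end(), m.group()))
    return sorted(spans)


def build_index(catalog):
    """Compute per-entry term counts and inverse document frequencies."""
    docs = {pid: Counter(tokenize(desc)) for pid, desc in catalog.items()}
    df = Counter(term for counts in docs.values() for term in counts)
    idf = {term: math.log(len(docs) / n) + 1.0 for term, n in df.items()}
    return docs, idf


def link(mention, docs, idf, top_k=1):
    """Step 2: treat the mention as a query and return the best-scoring catalog ids."""
    query_terms = set(tokenize(mention))
    scores = {}
    for pid, counts in docs.items():
        score = sum(counts[t] * idf.get(t, 0.0) for t in query_terms)
        if score > 0:
            scores[pid] = score
    return sorted(scores, key=lambda pid: (-scores[pid], pid))[:top_k]


if __name__ == "__main__":
    docs, idf = build_index(CATALOG)
    sentence = "I bought a Canon EOS 550D and compared it with the D90."
    for _, _, surface in extract_mentions(sentence):
        print(surface, "->", link(surface, docs, idf))
```

Overlapping spans from the two matchers are kept as-is in this sketch; a real pipeline would need a policy for merging or ranking them, which the abstract does not specify.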
Index Terms:
Catalogs, Training data, Joining processes, Data mining, Information retrieval, Indexing, Feature extraction, named entity recognition, product disambiguation, product identification, product linking
Citation:
Zhiqiang Toh, Wenting Wang, Man Lan, Xiaoli Li, "An NER-based Product Identification and Lucene-based Product Linking Approach to CPROD1 Challenge: Description of Submission System to CPROD1 Challenge," icdmw, pp.869-871, 2012 IEEE 12th International Conference on Data Mining Workshops, 2012