18th International Conference on Database and Expert Systems Applications (DEXA 2007)
Generating a Topic Hierarchy from Dialect Texts
Regensburg, Germany
September 3-7, 2007
ISBN: 0-7695-2932-1
Win De Smet, K.U. Leuven, Belgium
Marie-Francine Moens, K.U. Leuven, Belgium
We built a system that automatically creates a text-based topic hierarchy, meant to be used in a geographically defined community. This poses two main problems. First, the texts mix standard language with a community-related dialect, so dialect words should be corrected to standard words as far as possible. Second, the texts must be clustered hierarchically by topic without manual intervention.

We correct dialect words by performing a nearest neighbor search over a dynamic set of known words, using a set of transition rules from dialect to standard words that are learned from a parallel corpus. We solve the clustering problem with a hierarchical co-clustering algorithm that automatically generates a topic hierarchy of the collection while simultaneously grouping documents and words into clusters.
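
The dialect correction step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain Levenshtein distance, the `max_distance` threshold, and the toy rule and vocabulary are all assumptions made for illustration. In the system described above the transition rules are learned from a parallel corpus, whereas here a single rule is hard-coded.

```python
from typing import Iterable, Set, Tuple


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidates(word: str, rules: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the word itself plus every form obtained by applying one
    (dialect, standard) rule at one position."""
    forms = {word}
    for dialect, standard in rules:
        start = word.find(dialect)
        while start != -1:
            forms.add(word[:start] + standard + word[start + len(dialect):])
            start = word.find(dialect, start + 1)
    return forms


def correct(word: str, known_words: Set[str],
            rules: Iterable[Tuple[str, str]], max_distance: int = 1) -> str:
    """Map a word to its nearest known word, or leave it unchanged when no
    known word lies within max_distance of any candidate form."""
    if word in known_words:
        return word
    scored = [(edit_distance(form, known), known)
              for form in candidates(word, rules)
              for known in known_words]
    if not scored:
        return word
    best_dist, best_word = min(scored)  # ties resolved alphabetically
    return best_word if best_dist <= max_distance else word


# Hypothetical toy data: one hand-written rule and a tiny vocabulary.
rules = [("oo", "ou")]
known_words = {"house", "mouse", "street"}
print(correct("hoose", known_words, rules))
print(correct("xyzzy", known_words, rules))
```

Because `known_words` is an ordinary set, it can grow as new standard words are encountered, which loosely mirrors the dynamic set of known words mentioned above. The exhaustive scan over the vocabulary is only suitable for small word lists.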

Citation:
Win De Smet, Marie-Francine Moens, "Generating a Topic Hierarchy from Dialect Texts," dexa, pp.249-253, 18th International Conference on Database and Expert Systems Applications (DEXA 2007), 2007