2008 19th International Conference on Database and Expert Systems Application
Segmentation of Legislative Documents Using a Domain-Specific Lexicon
September 01-September 05
ISBN: 978-0-7695-3299-8
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/DEXA.2008.45
The amount of legal information is continuously growing. New legislative documents appear every day on the Web. Legal documents are produced on a daily basis in briefing format and contain changes to current legislation, notifications, decisions, resolutions, etc. The scope of these documents includes countries, states, provinces and even city councils. This legal information is produced in a semi-structured format and distributed daily on official web sites; however, the huge amount of published information makes it difficult for a user to find a specific issue. Lawyers, who need to access these sources regularly, are probably the most representative example. This motivates the need for legislative information search engines. Standard general-purpose web search engines return full documents (typically web pages) to the user, within hundreds of pages. As users expect only the relevant part of the document, techniques that recognise and extract these relevant parts are needed to offer quick and effective results. In this paper we present a method to perform segmentation based on domain-specific lexicon information. Our method was tested with a manually tagged data set drawn from different sources of Spanish legislative documents. Results show that this technique is suitable for the task, achieving 97.85% recall and 95.99% precision.
Index Terms:
Legislative documents, domain lexicon, segmentation
Citation:
Ismael Hasan, Javier Parapar, Roi Blanco, "Segmentation of Legislative Documents Using a Domain-Specific Lexicon," dexa, pp.665-669, 2008 19th International Conference on Database and Expert Systems Application, 2008