2011 44th Hawaii International Conference on System Sciences (2011)
Kauai, HI
Jan. 4, 2011 to Jan. 7, 2011
ISSN: 1530-1605
ISBN: 978-1-4244-9618-1
pp: 1-5
ABSTRACT
From a management perspective, understanding what information exists on a network and how it is distributed provides a critical advantage. This work explores topic modeling as a way to automatically determine the classes of information that exist on an organization's network. The resulting topics then serve as centroid vectors for classifying individual documents, which reveals how information topics are distributed across the enterprise network. The approach is tested using the 20 Newsgroups dataset.
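The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that each topic is already available as a term-weight vector and that a document is assigned to the topic whose centroid is closest by cosine similarity. The abstract does not name the similarity measure, so cosine similarity, the whitespace tokenizer, the function names, and the two toy centroids below are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(doc_text, centroids):
    """Return the name of the topic centroid most similar to the document."""
    # Assumption: raw term counts from a lowercase whitespace split.
    doc_vec = Counter(doc_text.lower().split())
    return max(centroids, key=lambda name: cosine(doc_vec, centroids[name]))


def topic_distribution(docs, centroids):
    """Count how many documents fall under each topic."""
    return Counter(classify(d, centroids) for d in docs)


# Hypothetical centroids; in practice these would come from a topic model.
centroids = {
    "space": {"orbit": 0.6, "launch": 0.5, "nasa": 0.7},
    "hockey": {"goal": 0.6, "team": 0.4, "puck": 0.7},
}

docs = [
    "nasa plans a new launch into orbit",
    "the team scored a late goal",
]
distribution = topic_distribution(docs, centroids)
```

Running topic_distribution over the documents held at each network location would give a per-location count of topics, which is one way to read "distribution of information topics across the enterprise network".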
INDEX TERMS
business data processing, distributed processing, document handling, pattern classification, sampling methods
CITATION

R. M. Patton, J. M. Beaver and T. E. Potok, "Classification of Distributed Data Using Topic Modeling and Maximum Variation Sampling," 2011 44th Hawaii International Conference on System Sciences (HICSS), Kauai, Hawaii USA, 2011, pp. 1-5.
doi:10.1109/HICSS.2011.101