19th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2011)
Ayia Napa, Cyprus
Feb. 9, 2011 to Feb. 11, 2011
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PDP.2011.18
In current distributed systems, such as Grids, Clouds, or P2P systems, the amount of information to be handled influences how the system is managed. In P2P systems containing large quantities of data, or in Grid systems containing a large number of (often heterogeneous) resources, information about data or resources must be spread through the system efficiently so that they can be found. This paper presents an information discovery technique based on data summarization via clustering. The resulting summaries can be used to classify information, giving users greater insight into documents or computing resources than raw data does. Meta-schedulers or brokers would also benefit from the proposed technique, because they would have to handle less data from resources, which improves the scalability of the system. An evaluation of the approach identifies the impact of choosing particular parameters for inclusion in the summary.
summary creation, classification, information discovery, distributed systems
C. Carrión, A. C. Caminero, O. Rana, E. Huedo, I. M. Llorente and B. Caminero, "Summary Creation for Information Discovery in Distributed Systems," 19th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2011), Ayia Napa, Cyprus, 2011, pp. 167-171.