2012 IEEE 8th International Conference on E-Science (e-Science) (2012)
Chicago, IL
Oct. 8, 2012 to Oct. 12, 2012
ISBN: 978-1-4673-4467-8
pp: 1-8
Peng Chen , School of Informatics and Computing, Indiana University
Beth Plale , School of Informatics and Computing, Indiana University
Mehmet S. Aktas , Information Technologies Institute, Tubitak
ABSTRACT
Provenance of digital scientific data is an important piece of the metadata of a data object. It can, however, grow voluminous quickly because the granularity level of capture can be high. It can also be quite feature rich. We propose a representation of the provenance data based on logical time that reduces the feature space. Creating time- and frequency-domain representations of the provenance, we apply clustering, classification, and association rule mining to the abstract representations to determine the usefulness of the temporal representation. We evaluate the temporal representation using an existing 10 GB database of provenance captured from a range of scientific workflows.
INDEX TERMS
abstract data types, data mining, frequency-domain analysis, meta data, pattern classification, pattern clustering, scientific information systems, temporal databases, time-domain analysis, workflow management software
CITATION

P. Chen, B. Plale and M. S. Aktas, "Temporal representation for scientific data provenance," 2012 IEEE 8th International Conference on E-Science (e-Science), Chicago, IL, 2013, pp. 1-8.
doi:10.1109/eScience.2012.6404477