Issue No. 03 - May/June (2008 vol. 12)
ISSN: 1089-7801
pp: 78-82
Craig W. Thompson, University of Arkansas
Josh Eno, University of Arkansas
ABSTRACT
Synthetic data sets can be useful for repeatable regression testing and for providing realistic (but not real) data to third parties for testing new software. In some cases, it is desirable that the synthetic data set be realistic, preserving various properties of the original data. Several synthetic data generators produce data that superficially matches known characteristics of the original data. This paper shows how to generate data that exhibits some of the same hidden patterns that data mining algorithms can discover, in particular decision tree patterns.
INDEX TERMS
Synthetic data generation, data mining, decision trees
CITATION
Craig W. Thompson, Josh Eno, "Generating Synthetic Data to Match Data Mining Patterns", IEEE Internet Computing, vol. 12, no. 3, pp. 78-82, May/June 2008, doi:10.1109/MIC.2008.55