International Conference on Computing: Theory and Applications (ICCTA'07)
Discretization Using Clustering and Rough Set Theory
Kolkata, India
March 05-March 07
ISBN: 0-7695-2770-1
Most data mining algorithms are applied to data described by discrete or nominal attributes. To apply these algorithms effectively to a dataset, its continuous attributes must first be discretized. This paper presents a discretization approach based on clustering and Rough Set Theory (RST). Experiments are performed on four datasets from the UCI Machine Learning Repository. The proposed approach is compared with some common discretization methods on two parameters: the number of intervals and the Class-Attribute Interdependence Redundancy (CAIR) value. The results show that the proposed method achieves a satisfactory trade-off between the number of intervals and the information loss due to discretization.
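To make the two evaluation parameters concrete, the following is a minimal sketch of how a discretization could be scored. It is not the paper's clustering and RST method: equal-width binning is used only as a simple stand-in discretizer, and CAIR is assumed to follow its commonly used definition, the mutual information between class and discretized attribute divided by their joint entropy. The function names `equal_width_bins` and `cair` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def equal_width_bins(values, k):
    """Map each continuous value to one of k equal-width interval indices.

    Stand-in discretizer (assumption); the paper uses clustering and RST instead.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def cair(classes, intervals):
    """Assumed definition: CAIR = I(C; F) / H(C, F), from empirical frequencies."""
    n = len(classes)
    joint = Counter(zip(classes, intervals))
    class_counts = Counter(classes)
    interval_counts = Counter(intervals)

    mutual_info = 0.0
    joint_entropy = 0.0
    for (c, f), count in joint.items():
        p = count / n
        p_c = class_counts[c] / n
        p_f = interval_counts[f] / n
        mutual_info += p * math.log2(p / (p_c * p_f))
        joint_entropy -= p * math.log2(p)

    # A single (class, interval) cell gives zero joint entropy; report 0.0 then.
    return mutual_info / joint_entropy if joint_entropy > 0 else 0.0


# Toy data (made up for illustration, not taken from the UCI datasets).
values = [1.0, 1.2, 1.1, 5.0, 5.2, 5.1]
labels = ["a", "a", "a", "b", "b", "b"]
bins = equal_width_bins(values, 2)
score = cair(labels, bins)
```

Under this definition a higher CAIR means the intervals preserve more of the class information, so a discretizer would be judged on keeping CAIR high while using few intervals.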