
Issue No.03 - March (2010 vol.22)

pp: 305-317

Neil Mac Parthaláin, Aberystwyth University, Wales

Qiang Shen, Aberystwyth University, Wales

Richard Jensen, Aberystwyth University, Wales

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.119

ABSTRACT

Feature selection (FS), or attribute reduction, techniques are employed for dimensionality reduction and aim to select a subset of the original features of a data set that is rich in the most useful information. The benefits of employing FS techniques include improved data visualization and transparency, a reduction in training and utilization times, and, potentially, improved prediction performance. Many approaches based on rough set theory have, up to now, employed the dependency function, which is based on lower approximations, as an evaluation step in the FS process. However, by examining only the information that is considered certain and ignoring the boundary region, or region of uncertainty, much useful information is lost. This paper examines a rough set FS technique that uses the information gathered from both the lower approximation dependency value and a distance metric that considers the number of objects in the boundary region and the distance of those objects from the lower approximation. The use of this measure in rough set feature selection can result in smaller subset sizes than those obtained using the dependency function alone. This demonstrates that there is much valuable information to be extracted from the boundary region. Experimental results are presented for both crisp and real-valued data and compared with two other FS techniques in terms of subset size, runtimes, and classification accuracy.
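To make the terms in the abstract concrete, the following is a minimal sketch of the standard crisp rough set quantities it mentions: lower and upper approximations, the dependency degree, and the boundary region. It is not the paper's method; in particular, the distance metric is not specified in the abstract and is not reproduced here. The toy decision table and all function names (`partition`, `approximations`, `dependency_and_boundary`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def partition(table, attrs):
    """Group object indices into equivalence classes by their values on attrs."""
    classes = defaultdict(set)
    for i, row in enumerate(table):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())

def approximations(table, attrs, target):
    """Return (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper

def dependency_and_boundary(table, attrs, decision):
    """Dependency degree |POS| / |U| and the union of all boundary regions."""
    positive, boundary = set(), set()
    for concept in partition(table, [decision]):
        lower, upper = approximations(table, attrs, concept)
        positive |= lower
        boundary |= upper - lower
    return len(positive) / len(table), boundary

# Hypothetical toy decision table: conditional attributes a, b; decision d.
TABLE = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "no"},
]

gamma_ab, boundary_ab = dependency_and_boundary(TABLE, ["a", "b"], "d")
gamma_b, boundary_b = dependency_and_boundary(TABLE, ["b"], "d")
```

In this toy table, objects 0 and 1 agree on both `a` and `b` but differ on `d`, so they fall in the boundary region for the subset `{a, b}`. The dependency degree counts only the four objects in the positive region and ignores those two. The abstract argues that this discarded boundary information is worth measuring.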

INDEX TERMS

Rough sets, fuzzy sets, attribute reduction, boundary region, classification.

CITATION

Neil Mac Parthaláin, Qiang Shen, Richard Jensen, "A Distance Measure Approach to Exploring the Rough Set Boundary Region for Attribute Reduction",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 22, no. 3, pp. 305-317, March 2010, doi:10.1109/TKDE.2009.119