Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses
January/February 2003 (vol. 15, no. 1), pp. 86-102

Abstract: A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it gives decision makers easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates good or bad sales performance remains unclear. From the decision makers' point of view, the semantics that convey the meaning of the data matter more than the raw numbers. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow flexible queries against such enhanced data warehouses.
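
To make the three layers concrete, the following is a minimal Python sketch of how a numeric aggregate might be turned into linguistic terms. It is an illustration only, not the paper's operators or SQL extension: the function names, the labels "poor", "average", and "good", the breakpoints (500, 1000, 1500), the quantifier "most" with its thresholds (0.3, 0.8), and the sample store data are all assumptions chosen for the example. The quantifier layer uses the relative sigma-count style of evaluation (mean membership passed through a quantifier function), which may differ from the definitions in the paper.

```python
def falling(x, lo, hi):
    """Membership that is 1 up to lo, 0 from hi on, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership that is 0 up to lo, 1 from hi on, linear in between."""
    return 1.0 - falling(x, lo, hi)


def triangle(x, lo, peak, hi):
    """Triangular membership: 0 at lo and hi, 1 at peak."""
    return min(rising(x, lo, peak), falling(x, peak, hi))


# Hypothetical linguistic labels for "total sales" (breakpoints are assumed).
LABELS = {
    "poor": lambda x: falling(x, 500, 1000),
    "average": lambda x: triangle(x, 500, 1000, 1500),
    "good": lambda x: rising(x, 1000, 1500),
}


def quantitative(sales_by_store):
    """Layer 1: numerical summarization (total items sold per store)."""
    return {store: sum(items) for store, items in sales_by_store.items()}


def qualitative(total):
    """Layer 2: categorical summarization (degree of each label for a total)."""
    return {label: mu(total) for label, mu in LABELS.items()}


def most(proportion):
    """Hypothetical fuzzy quantifier 'most' over a proportion in [0, 1]."""
    return rising(proportion, 0.3, 0.8)


def quantifier(totals, label):
    """Layer 3: degree to which 'most stores are <label>' holds."""
    degrees = [LABELS[label](t) for t in totals.values()]
    return most(sum(degrees) / len(degrees))


if __name__ == "__main__":
    # Made-up sample data: items sold per period for three stores.
    sales = {"A": [400, 500, 300], "B": [300, 200, 250], "C": [600, 700, 200]}
    totals = quantitative(sales)
    for store, total in totals.items():
        print(store, total, qualitative(total))
    print("most stores have good sales:", round(quantifier(totals, "good"), 3))
```

Under these assumed breakpoints, a total of 1,000 items belongs fully to "average" and not at all to "good" or "poor", which is the kind of answer the raw number alone does not give. The piecewise-linear shapes are a common default; any other membership functions could be substituted without changing the layering.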

Index Terms:
Data warehouse, semantic model, algebraic operator, extended SQL, fuzzy set, membership function.
Citation:
Ling Feng, Tharam S. Dillon, "Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 1, pp. 86-102, Jan.-Feb. 2003, doi:10.1109/TKDE.2003.1161584