Evaluating Aggregate Operations Over Imprecise Data
April 1996 (vol. 8 no. 2)
pp. 273-284

Abstract: Imprecise data in databases were originally represented by null values, which mean "value unknown at present." More generally, a partial value is a finite set of possible values for an attribute, exactly one of which is the "true" value. In this paper, we define a set of extended aggregate operations (sum, average, count, maximum, and minimum) that can be applied to an attribute containing partial values. Two types of aggregate operators are considered: scalar aggregates and aggregate functions. We study the properties of these aggregate operations and develop efficient algorithms for count, maximum, and minimum. For sum and average, however, we point out that the computation takes exponential time in general.
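
As an illustration only, the sketch below models a partial value as a Python set of candidate values and contrasts the two cases the abstract distinguishes. It is not the paper's algorithm: the function names (possible_sums, min_bounds, max_bounds) and the sample data are hypothetical, and the bounds shown for minimum and maximum are a simple interval argument rather than the extended operators the paper defines. The sum is obtained by brute-force enumeration of every combination of candidates, which is where the exponential growth comes from.

```python
from itertools import product

def possible_sums(partial_values):
    """Brute force: pick one candidate per partial value, collect every sum.

    The number of combinations is the product of the set sizes, so this
    grows exponentially with the number of partial values.
    """
    return {sum(world) for world in product(*partial_values)}

def min_bounds(partial_values):
    """Interval that must contain the true minimum (one pass over the data)."""
    lower = min(min(pv) for pv in partial_values)  # smallest candidate anywhere
    upper = min(max(pv) for pv in partial_values)  # true min cannot exceed any set's largest candidate
    return lower, upper

def max_bounds(partial_values):
    """Interval that must contain the true maximum (one pass over the data)."""
    lower = max(min(pv) for pv in partial_values)  # true max is at least every set's smallest candidate
    upper = max(max(pv) for pv in partial_values)  # largest candidate anywhere
    return lower, upper

# Hypothetical attribute column: three tuples, two of them imprecise.
salaries = [{30, 40}, {50}, {20, 60}]

print(sorted(possible_sums(salaries)))
print(min_bounds(salaries))
print(max_bounds(salaries))
```

In this sketch the bounds for minimum and maximum need only one pass over the column, while the set of possible sums requires visiting every combination of candidates.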

Index Terms:
Relational databases, null values, partial values, scalar aggregates, aggregate functions, graph theory.
Citation:
Arbee L.P. Chen, Jui-Shang Chiu, Frank S.C. Tseng, "Evaluating Aggregate Operations Over Imprecise Data," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 2, pp. 273-284, April 1996, doi:10.1109/69.494166