The Management of Probabilistic Data
October 1992 (vol. 4 no. 5)
pp. 487-502

It is often desirable to represent in a database entities whose properties cannot be classified deterministically. The authors develop a data model that includes probabilities associated with the values of the attributes. The notion of missing probabilities is introduced for partially specified probability distributions. This model offers a richer descriptive language, allowing the database to reflect the uncertain real world more accurately. Probabilistic analogs of the basic relational operators are defined, and their correctness is studied. A set of operators that have no counterpart in conventional relational systems is also presented.
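
As a rough illustration of the ideas in the abstract, the following minimal Python sketch shows one way an attribute could carry a partially specified probability distribution, with the unassigned ("missing") probability kept separately. It is not the authors' formal model or their operator definitions: the `WILDCARD` marker, the function names, the threshold-based `select`, and the sample data are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical marker for probability mass not assigned to any concrete value.
WILDCARD = "*"


def make_distribution(known):
    """Build a distribution from partially specified probabilities.

    `known` maps attribute values to probabilities (given as strings so that
    Fraction parses them exactly). Any unassigned mass is stored under WILDCARD.
    """
    dist = {value: Fraction(p) for value, p in known.items()}
    total = sum(dist.values())
    if total > 1 or any(p < 0 for p in dist.values()):
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    missing = 1 - total
    if missing > 0:
        dist[WILDCARD] = missing
    return dist


def probability_bounds(dist, value):
    """Return (lower, upper) bounds on the probability that the attribute equals `value`.

    The lower bound is the explicitly assigned mass; the upper bound also
    counts the missing mass, which could all belong to `value`.
    """
    lower = dist.get(value, Fraction(0))
    upper = lower + dist.get(WILDCARD, Fraction(0))
    return lower, upper


def select(relation, attribute, value, threshold):
    """Illustrative selection: keys of rows whose lower bound meets the threshold."""
    threshold = Fraction(threshold)
    return [
        key
        for key, row in relation.items()
        if probability_bounds(row[attribute], value)[0] >= threshold
    ]


# Made-up sample data: two entities with an uncertain "outlook" attribute.
relation = {
    "acme": {"outlook": make_distribution({"good": "0.6", "bad": "0.3"})},
    "zenith": {"outlook": make_distribution({"good": "0.2", "bad": "0.8"})},
}

print(probability_bounds(relation["acme"]["outlook"], "good"))
print(select(relation, "outlook", "good", "0.5"))
```

In this sketch, "acme" leaves 0.1 of its probability unassigned, so the probability that its outlook is "good" lies between 0.6 and 0.7. Selecting on the lower bound is only one possible design choice; a selection on the upper bound would instead return every row that could possibly satisfy the condition.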

Index Terms:
probabilistic data management; database representation; data model; missing probabilities; partially specified probability distributions; descriptive language; relational operators; database theory; relational algebra; relational databases
Citation:
D. Barbará, H. Garcia-Molina, D. Porter, "The Management of Probabilistic Data," IEEE Transactions on Knowledge and Data Engineering, vol. 4, no. 5, pp. 487-502, Oct. 1992, doi:10.1109/69.166990