Issue No.02 - February (2012 vol.24)
pp: 251-264
Jiuyong Li, University of South Australia, Mawson Lakes
Jixue Liu, University of South Australia, Mawson Lakes
Yongfeng Chen, Xian University of Architecture and Technology, Xian
ABSTRACT
Functional and inclusion dependency discovery is important to knowledge discovery, database semantics analysis, database design, and data quality assessment. Motivated by this importance, this paper reviews methods for discovering functional dependencies, conditional functional dependencies, approximate functional dependencies, and inclusion dependencies in relational databases, as well as a method for discovering XML functional dependencies.
INDEX TERMS
Integrity constraint, functional dependencies, inclusion dependencies, conditional functional dependencies, XML, knowledge discovery, data quality.
CITATION
Jiuyong Li, Jixue Liu, Yongfeng Chen, "Discover Dependencies from Data—A Review", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 2, pp. 251-264, February 2012, doi:10.1109/TKDE.2010.197