Issue No.02 - Feb. (2013 vol.25)
pp: 419-432
Danushka Bollegala, The University of Tokyo, Tokyo
Yutaka Matsuo, The University of Tokyo, Tokyo
Mitsuru Ishizuka, The University of Tokyo, Tokyo
ABSTRACT
The World Wide Web includes semantic relations of numerous types that exist among different entities. Extracting the relations that exist between two entities is an important step in various Web-related tasks such as information retrieval (IR), information extraction, and social network extraction. A supervised relation extraction system that is trained to extract a particular relation type (source relation) might not accurately extract a new type of relation (target relation) for which it has not been trained. However, it is costly to create training data manually for every new relation type that one might want to extract. We propose a method to adapt an existing relation extraction system to extract new relation types with minimal supervision. Our proposed method comprises two stages: learning a lower-dimensional projection between different relations, and learning a relational classifier for the target relation type with instance sampling. First, to represent a semantic relation that exists between two entities, we extract lexical and syntactic patterns from contexts in which those two entities co-occur. Then, we construct a bipartite graph between relation-specific (RS) and relation-independent (RI) patterns. Spectral clustering is performed on the bipartite graph to compute a lower-dimensional projection. Second, we train a classifier for the target relation type using a small number of labeled instances. To account for the lack of target relation training instances, we present a one-sided undersampling method. We evaluate the proposed method using a data set that contains 2,000 instances for 20 different relation types. Our experimental results show that the proposed method achieves a statistically significant macro-average F-score of 62.77. Moreover, the proposed method outperforms numerous baselines and a previously proposed weakly supervised relation extraction method.
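The first stage described above (a bipartite graph between RS and RI patterns, followed by a spectral step that yields a lower-dimensional projection) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not specify the graph weighting or the Laplacian variant, so the sketch assumes a generic symmetric normalized Laplacian, and the function name `spectral_projection` and the toy co-occurrence matrix `M` are hypothetical.

```python
import numpy as np


def spectral_projection(M, k):
    """Embed the nodes of a bipartite pattern graph in k dimensions.

    M is an (n_rs, n_ri) matrix of non-negative co-occurrence weights
    between relation-specific (rows) and relation-independent (columns)
    patterns. Assumption: a generic normalized-Laplacian embedding is
    used here; the paper's exact formulation may differ.
    """
    n_rs, n_ri = M.shape
    n = n_rs + n_ri

    # Affinity matrix of the bipartite graph: edges only run RS <-> RI.
    A = np.zeros((n, n))
    A[:n_rs, n_rs:] = M
    A[n_rs:, :n_rs] = M.T

    # D^{-1/2}, leaving isolated nodes (degree 0) at zero.
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    L = np.eye(n) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so the first k
    # eigenvectors correspond to the k smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]


# Hypothetical toy data: 4 RS patterns x 4 RI patterns, forming two
# disconnected groups of patterns.
M = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
])

# Rows 0-3 of emb are the RS patterns, rows 4-7 the RI patterns.
emb = spectral_projection(M, k=2)
```

In this toy graph, patterns from the same group receive embedding rows that point in the same direction, while patterns from different groups receive orthogonal rows. A downstream classifier could then use such coordinates as additional features, which is the general idea behind projecting source and target relations into a shared space.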
INDEX TERMS
Context, Semantics, Syntactics, Bipartite graph, Data mining, Feature extraction, Companies, Web mining, Relation extraction, domain adaptation
CITATION
Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka, "Minimally Supervised Novel Relation Extraction Using a Latent Relational Mapping", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 2, pp. 419-432, Feb. 2013, doi:10.1109/TKDE.2011.250