Issue No.05 - Sept.-Oct. (2013 vol.28)
pp: 44-48
Paul Groth, Network Institute, VU University Amsterdam
The Web has made it possible to construct massive knowledge bases that support both intelligent applications and complex analyses. Constructing these knowledge bases invariably involves combining multiple sources through a stack of human and automated decisions, which makes the process opaque and hampers both transparency and repurposing. This article formulates the intricacies of this problem concretely and outlines potential research directions for addressing it.
data science, intelligent systems, natural language processing, knowledge bases, knowledge acquisition systems, remix, Web science
Paul Groth, "The Knowledge-Remixing Bottleneck", IEEE Intelligent Systems, vol.28, no. 5, pp. 44-48, Sept.-Oct. 2013, doi:10.1109/MIS.2013.138