1541-1672/09/$31.00 © 2009 IEEE
Published by the IEEE Computer Society
The Emerging Field of Semantic Scientific Knowledge Integration
As science informatics and e-Science blossom around the world, teams of collaborating researchers are finding that they need next-generation cyberinfrastructure, along with knowledge and tool support, for data-intensive scientific research.
Many geosciences researchers are taking advantage of the emergence of virtual repositories and observatories, such as those in astronomy, heliophysics, environmental science, hydrology, and solar-terrestrial physics, where distributed and often heterogeneous collections of scientific data are made available transparently. 2
But geoscience is only one of many fields leading the way, as is evident in this issue. Such efforts strive to present a scientific research environment that provides software tools and interfaces to interoperating data archives and associated services. The initial goals for these efforts focused more on traditional database considerations and relatively simple uses of AI techniques. Recently, there has been substantial adoption of mainstream AI techniques, and the medium- and long-range goals for these efforts call for full-scale semantic integration of scientific data and associated knowledge. Consequently, they present interesting motivations for and tests of existing and emerging AI techniques.
Alongside next-generation information technology for science is a growing demand for semantic technologies. Knowledge representation languages and environments continue to evolve. Importantly for science applications, several of the languages have become stable international standards or recommendations. As a result, a number of academic and commercial tools are now available for use from new as well as established organizations and companies. The e-Science community has been an early adopter of many of these tools, and in turn has provided essential feedback to semantic-technology developers, driving infrastructure evolution. As e-Science projects mature, many new challenges are appearing for both AI and IT, stimulating new research directions that in turn are motivating development of the next generation of scientific cyberinfrastructure.
To date, growth in e-Science and in semantic technologies has been largely independent, but the need to coalesce the two communities is becoming increasingly apparent. One such cross-disciplinary forum was the Semantic Scientific Knowledge Integration Workshop held in March 2008 (www.ksl.stanford.edu/people/dlm/sss08) as part of the AAAI's Spring Symposium Series. One goal of that workshop 3 and subsequently for this special issue was to help identify and catalyze the semantic scientific knowledge integration community into collaborative efforts. That workshop, this special issue, and the emerging community provide a forum for collecting and sharing e-Science experiences with representation languages, modeling techniques and tools, reasoning techniques and tools, and reusable resources such as science ontologies and Web services.
Emerging themes cover a wide range of topics and include the following:
• foundational aspects of science knowledge representation and integration,
• evaluation of science ontologies for use in knowledge integration,
• semantically enabled information architectures and infrastructure supporting scientific research,
• methodologies for integrating semantics within existing scientific research infrastructure,
• semantically enabled science applications involving knowledge integration (aimed at readership beyond a single discipline),
• semantic environments for building and integrating science applications,
• use case studies for semantic interdisciplinary science applications,
• ontology-enhanced search and integration of scientific information,
• ontology-enhanced science workflow tools involving knowledge integration,
• provenance-aware semantic e-Science tools and applications,
• explanation services for semantic e-Science applications, and
• Semantic Web services supporting knowledge integration in e-Science.
The fact that these themes span the range from pure research to practical application is also driving the nature of collaboration that is required to make effective progress. As an informatics field of study, knowledge integration driven by use cases is immediately interdisciplinary. Thus, the themes are also driving the need for very different communities to come together and exchange research ideas, methods, and solutions as well as spur new research from application experience and evaluation.
Special Issue Contributions
This special issue aims to report on the state of the art in semantic e-Science. The contributions include original research articles that bridge the semantic-technologies community with the scientific community in the area of knowledge integration.
David Poole, Clinton Smyth, and Rita Sharma discuss ontology design for scientific theories. They highlight the need to represent and reason with probabilistic information, and they provide a multi-dimensional design paradigm with an OWL representation.
Hajo Rijgersberg, Jan Top, and Marcel Meinders introduce their Ontology of Quantitative Research (OQR). They discuss and demonstrate the requirements and use of the OQR using an example from the area of quantitative food research.
Kwok Cheung, Jane Hunter, and John Drennan introduce an ontology-based search environment for materials scientists. The MatSeek system includes a materials-science ontology that enables integration and access across multiple disparate data sources relevant to materials scientists. The authors describe how this system leverages semantic technologies and moves toward a more complete materials informatics workbench.
Daniel Rubin, Pattanasak Mongkolwat, Vladimir Kleper, Kaustubh Supekar, and David Channin introduce their work on Annotation and Image Markup (AIM). They describe AIM's image annotation ontology, along with its annotation tool and serialization module.
Finally, Boyan Brodaric and Florian Probst discuss their work integrating geoscience ontologies with the Dolce foundational ontology. In particular, they focus on the Semantic Web Earth and Environmental (Sweet) ontology and the GeoSciML schema as motivated by their work on groundwater pollution estimation. They show how Dolce as a foundational ontology can and does play an important role in cross-disciplinary e-Science.
We hope this special issue is just one of the early collections of the growing literature in the emerging interdisciplinary field of semantic e-Science.