Issue No. 05 - May (2006 vol. 7)
ISSN: 1541-4922
pp: 1
Andreas Eberhart , HP Germany
Steffen Staab , University of Koblenz-Landau
Daniel Oberle , SAP Research
Application servers and Web services offer many possibilities for developing Web applications, but they also present new challenges. For instance, managing component dependencies, versions, and licenses is a typical problem in an ever-growing repository of programming libraries. Similarly, Web services let developers reap advantages roughly similar to those of application servers. However, developers and administrators must cope with local management issues as well as incorporate external, possibly varying dynamic information about third-party services.
Specifying issues such as transactions, session management, and user rights in an application-independent way facilitates managing application servers and Web services. Developers achieve this by configuring generic software with the help of administration tools and corresponding XML configuration files. This lets you flexibly develop and administrate a distributed application; however, because configuration files lack a coherent formal model, they don't provide a high level of abstraction to facilitate management, even if they're more or less human-readable XML. So, it's difficult or often impossible to query a system for conclusions that come from integrating several descriptions.
We remedy the problem by applying semantic technology—that is, ontologies and inference engines—in the middleware solutions. We contribute a state-of-the-art management ontology and propose a design to integrate the ontology infrastructure in existing application servers. We implemented a prototype of our proposed scheme in KAON SERVER, an amalgamation of JBoss and the KAON ontology tool suite.
Our motivation: Management problems
Of the several use cases we've identified in previous work, 1,2 we use two in this section to show the difficulties of managing application servers and Web services. We don't intend to show how to solve these specific use cases. Rather, we demonstrate how seemingly trivial problems that involve handling the source code and configuration files of different software components and Web services lead to complex situations that are hard for software developers or administrators to understand.
Application servers
The first use case commonly occurs when you must link legacy components. It deals with indirect permissions due to context switches in application servers (see figure 1). Suppose a customer, identified by the user account dob, logs into a Web shop via HTTP basic authentication. The script on this page (say, a servlet) might connect to the CustomerEntityBean, an Enterprise JavaBean, which in turn accesses the Customer table in a legacy database. The legacy database defines its own set of user accounts, which differs from the user accounts in the J2EE (Java 2 Platform, Enterprise Edition) realm. We assume that only the dbuser (typically the administrator) can access the Customer table. So, the EJB must perform an explicit context switch (frequently called the run as paradigm 3). The call succeeds because dbuser's credentials are propagated.

Figure 1. An example of indirect permission.

J2EE application servers define context switches independent of applications. This means that the responsibility shifts from coding to deployment. Although reducing the amount of source code you must write is always a good idea, deployment can be tricky. The J2EE specification describes the structure of XML deployment metadata (deployment descriptors). In our example, the developer would have to analyze two different deployment descriptors as well as the source code to configure the context switch. First, the configuration file (web.xml) of the servlet container states that only authenticated users can access the WebShopServlet (see figure 2).

Figure 2. A relevant snippet of the servlet container's web.xml deployment descriptor.

Second, the WebShopServlet accesses the CustomerEntityBean (see figure 3). The servlet's doGet() method serves the incoming HTTP requests. In our case, it queries user account information from the Customer table by means of the bean to display it to the user. After retrieving a handle to the bean via the Home interface, the servlet invokes the bean's getCustomerName() method.

Figure 3. A relevant snippet of WebShopServlet.java showing the invocation of the CustomerEntityBean.

Third, the CustomerEntityBean's deployment descriptor, called ejb-jar.xml, states that the bean performs a context switch via the <run-as-specified-identity> tag. It thus accesses the Customer table with dbuser's credentials (see figure 4).

Figure 4. A relevant snippet of the ejb-jar.xml deployment descriptor showing the tags for the context switch.

This case isn't a bug but a common way of integrating legacy components. The developer must align two disjoint sets of user accounts via deployment descriptors. J2EE implementations such as JBoss provide tools to help developers generate such deployment descriptors. However, the tools act merely as an input mask, which generates the specific XML syntax for the developer. This is a nice feature; however, the developer must fully understand the complicated concepts behind the options for the context switch. The deployment tools don't help avoid or even repair configurations that might cause harmful system behavior. Even worse, this problem could be duplicated because a plethora of deployment descriptors exists for different kinds of components (servlets, EJBs, and managed beans) and aspects (security, transactions, and so on).
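To make the cross-referencing burden concrete, here is a minimal Python sketch that parses two hypothetical descriptor fragments and extracts the two facts the developer must align by hand. The fragments are illustrative assumptions, not reproductions of figures 2 and 4: the role name customer, the URL pattern, and the nesting around the run-as-specified-identity element are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical web.xml fragment: only authenticated users in the role
# "customer" may reach the servlet. Values are assumptions.
WEB_XML = """
<web-app>
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>WebShopServlet</web-resource-name>
      <url-pattern>/shop/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>customer</role-name>
    </auth-constraint>
  </security-constraint>
</web-app>
"""

# Hypothetical ejb-jar.xml fragment: the bean switches to dbuser.
# The element nesting is an assumption.
EJB_JAR_XML = """
<ejb-jar>
  <enterprise-beans>
    <entity>
      <ejb-name>CustomerEntityBean</ejb-name>
      <security-identity>
        <run-as-specified-identity>
          <role-name>dbuser</role-name>
        </run-as-specified-identity>
      </security-identity>
    </entity>
  </enterprise-beans>
</ejb-jar>
"""

def protected_resources(web_xml):
    """Map each protected web resource name to the roles allowed to call it."""
    root = ET.fromstring(web_xml)
    result = {}
    for constraint in root.iter("security-constraint"):
        roles = [r.text for r in constraint.findall("auth-constraint/role-name")]
        for name in constraint.findall("web-resource-collection/web-resource-name"):
            result[name.text] = roles
    return result

def run_as_identities(ejb_jar_xml):
    """Map each bean name to the identity it runs as, if it declares one."""
    root = ET.fromstring(ejb_jar_xml)
    result = {}
    for bean in root.iter("entity"):
        identity = bean.find("security-identity/run-as-specified-identity/role-name")
        if identity is not None:
            result[bean.findtext("ejb-name")] = identity.text
    return result

web_facts = protected_resources(WEB_XML)
ejb_facts = run_as_identities(EJB_JAR_XML)
print(web_facts)
print(ejb_facts)
```

Neither fragment mentions the other. The link between them (the servlet calls the bean) exists only in the source code, which is why the developer has to read all three artifacts to see the indirect permission.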
Web services
Similar to deployment descriptors in application servers, WS* descriptions manage orthogonal aspects in an application-independent way. By WS*, we mean Web service specifications, such as WSDL (Web Services Description Language), WS-Security, or WS-Policy (see www-128.ibm.com/developerworks/views/webservices/libraryview.jsp?type_by=Standards). WS* descriptions are XML files that declaratively describe how developers should deploy and configure Web services. So, WS* descriptions are exchangeable, and developers might use different implementations for the same Web service description. WS* descriptions' disadvantages, however, are also visible. Although the different standards are complementary, you might produce models composed of different WS* descriptions that are inconsistent but don't easily reveal their inconsistencies. This happens because no coherent formal model of WS* descriptions exists, so it's hard to query the system for conclusions that come from integrating several WS* descriptions.
As an example of a trivial conclusion derived from both a WS-BPEL (Web Services Business Process Execution Language) and a WS-Policy description, consider the following case. Let's return to the Web shop example and assume we've realized it with internal and external Web services composed and managed by a WS-BPEL engine. After a customer submits an order, we must check the customer's credit card for validity, depending on the credit card type (VISA, MasterCard, and so on). We assume that credit card providers offer this functionality via Web services. The corresponding WS-BPEL process checkAccount invokes the provider's Web services.
Figure 5 shows a snippet of the WS-BPEL process definition.

Figure 5. Snippet of the WS-BPEL document showing the checkAccount process.

Suppose that a credit card provider's Web service accepts only authenticated invocations conforming to Kerberos or X.509. It states such policies in a corresponding WS-Policy document, such as the one in figure 6. The invocation will fail unless the developer ensures that it meets the policies. The developer must check the policies manually at development time or implement this functionality to react to policies at runtime, assuming that no policy-matching engine is in place.
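The manual check described above can be sketched in a few lines of Python. This is a deliberate simplification under stated assumptions: a policy is reduced to a set of accepted token types, and the token name UsernameToken and the function name are hypothetical. A real WS-Policy document has a richer structure of alternatives and assertions.

```python
# Token types the provider's service accepts, as stated in the text.
# Reducing a policy to a flat set of alternatives is an assumption.
ACCEPTED_TOKENS = {"Kerberos", "X509"}

def invocation_satisfies_policy(client_tokens, accepted_tokens=ACCEPTED_TOKENS):
    """Return True if the client can present at least one accepted token type."""
    return bool(set(client_tokens) & set(accepted_tokens))

# A client offering only a username token would be rejected.
print(invocation_satisfies_policy({"UsernameToken"}))
# A client that can also present an X.509 certificate would be accepted.
print(invocation_satisfies_policy({"X509", "UsernameToken"}))
```

Without a policy-matching engine, the developer has to write and maintain a check of this kind for every invoked service.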

Figure 6. The MasterCard service's WS-Policy document.

Several tools are available to define WS-Security and WS-Policy descriptors. However, as with deployment descriptors, they act as an input mask that generates the specific XML syntax for the developer.
Summary
Both deployment descriptors in application servers and WS* descriptors for Web services provide convenience and flexibility, but management remains cumbersome. The conceptual models underlying the configurations are implicitly encoded in different XML-DTDs or schemas. So, software developers must manually discover conclusions (that is, by reading and analyzing the descriptor files) that derive from the integration of such descriptions, with little to no help from formal machinery. The use cases we presented would be easy to solve in isolation using dedicated tools. However, a generic, ontology-based semantic-management approach would let us solve a plethora of use cases in a common framework.
Ontology
As a solution to the missing coherent formal model, we propose an ontology covering aspects from the heterogeneous deployment and WS* descriptors. An ontology is a conceptual model with formal logic-based semantics. Ontologies formalize concepts and concept relationships (associations) similar to conceptual database schemas or UML class diagrams. 4 However, ontologies differ from existing methods and technologies in several ways:

• Ontologies aim to enable agreement on the meaning of specific vocabulary terms and so facilitate information integration across applications.

• Ontologies are formalized in logic-based representation languages. Their semantics are thus specified unambiguously.

• Ontology representation languages come with executable calculi enabling querying and reasoning.

A concept hierarchy, or taxonomy, forms an ontology's backbone. Associations define relationships between concepts and can be instantiated accordingly.
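As a rough illustration of these two building blocks, the Python sketch below encodes a tiny taxonomy and one association, then checks whether two individuals may instantiate that association. All concept and individual names are hypothetical and chosen to echo the Web shop example; they are not taken from the ontology modules described later.

```python
# Hypothetical taxonomy: child concept -> parent concept.
SUBCONCEPT_OF = {
    "Servlet": "SoftwareComponent",
    "EnterpriseJavaBean": "SoftwareComponent",
    "SoftwareComponent": "Software",
}

def is_subconcept(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the taxonomy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCONCEPT_OF.get(concept)
    return False

# An association is declared between concepts (domain, range) ...
ASSOCIATIONS = {"invokes": ("Software", "Software")}
# ... and instantiated between individuals of matching concepts.
INSTANCE_OF = {
    "WebShopServlet": "Servlet",
    "CustomerEntityBean": "EnterpriseJavaBean",
}

def may_instantiate(association, subject, obj):
    """Check that both individuals fit the association's domain and range."""
    domain, range_ = ASSOCIATIONS[association]
    return (is_subconcept(INSTANCE_OF[subject], domain)
            and is_subconcept(INSTANCE_OF[obj], range_))

print(may_instantiate("invokes", "WebShopServlet", "CustomerEntityBean"))
```

A logic-based ontology language gives this kind of check formal semantics and delegates it to an inference engine instead of hand-written code.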
Design
Semantic management of software components in application servers and of Web services comprises two layers. First, you formally specify an ontology of components and services—that is, you formalize what components and services are made of. You need only build the ontology once, although the ontology for the application domain (for example, shopping for books on the Web) might vary. Second, you formally specify a concrete set of components or services and their properties by providing semantic metadata (also called instances in common literature) aligned to the ontologies from the first layer. The semantic metadata formalizes a particular instantiation of a distributed application.
For the first layer, we used a foundational ontology as a starting point. Foundational ontologies capture insights from philosophy, logics, and software engineering to prescribe good ontology engineering practice at an upper level of the ontology. For example, our ontologies distinguish carefully between subconcepts and functional roles; for instance, data might play the role of Web service input or output, but the concept "data" is a subconcept of neither. The foundational ontology provides the basis for modeling relevant aspects of components and services. Because modeling is usually time consuming, we generally strive to reuse existing ontologies. We could have reused several recently proposed Web service ontologies (such as OWL-S 5 or WSMO 6). However, these ontologies have several shortcomings 7 that lead to conceptual ambiguity and inferior design. So, we decided against reuse and created our own ontologies for components and services.
Correspondingly, we've pursued a modularized, layered approach that adds ontological commitment in a piecewise manner to maximize ontology reuse at all layers. Figure 7 depicts our ontology modules (see http://cos.ontoware.org).

Figure 7. Reused and contributed ontology modules as a UML package diagram. Packages represent ontology modules; dashed lines represent dependencies between modules. An ontology module A depends on B if it specializes concepts of B, has associations with domains and ranges to B, or reuses its axioms.

Reused ontology modules. Dolce (Descriptive Ontology for Linguistic and Cognitive Engineering) 8 is a typical foundational ontology with a rich axiomatization of generic (domain-independent) concepts, explicit construction principles, and careful reference to interdisciplinary literature.
Several additional theories exist for Dolce that come in the form of ontology modules. For example, Descriptions & Situations (D&S) can be considered an ontology design pattern for structuring (or restructuring) application ontologies that require contextualization. The domain we want to model, namely that of software components and Web services, requires an ontological formalization of context. The most prominent examples of the need for context modeling are the different views that can exist on data. Data can play the role of both input and output, depending on the context. Aldo Gangemi and Peter Mika describe D&S in detail. 9
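The point about context can be pictured with a small Python sketch. It is an informal simplification, not the D&S pattern itself: each tuple records that, within one context, a data item plays a role. The context and data names are hypothetical.

```python
# Each tuple reads: in this context, this data item plays this role.
# Names are hypothetical and only loosely follow the Web shop example.
PLAYS = {
    ("checkAccount", "creditCardNumber", "Input"),
    ("checkAccount", "validityFlag", "Output"),
    ("submitOrder", "validityFlag", "Input"),
}

def roles_of(data_item):
    """Return {context: role} for one data item."""
    return {ctx: role for ctx, item, role in PLAYS if item == data_item}

print(roles_of("validityFlag"))
```

The same data item is an output in one context and an input in another, so modeling "input" and "output" as subconcepts of data would force a single, context-free classification.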
Another requirement is the possibility to model workflow information between software components or between Web services. One Dolce module, the Ontology of Plans, generically formalizes a theory of plans 10 that you can use to model workflow information. The Ontology of Plans applies the D&S ontology design pattern to characterize planning concepts. The module's intended use is to specify plans at an abstract level independent from existing calculi.
Finally, the Ontology of Information Objects provides a semiotic ontology design pattern centered around information objects. 10 It lets us concisely model the relationship between entities in an information system and the real world. This is required for modeling Web services later on.
Contributed ontology modules. We've applied the reused ontology modules to formalize a Core Software Ontology, a Core Ontology of Software Components, and a Core Ontology of Services. Core ontologies define concepts that are generic across a set of domains; they're situated between the extremes of generic and domain ontologies. Their goal is to facilitate reuse in specific settings and platforms. The borderline between generic and core ontologies isn't clearly defined because there's no exhaustive enumeration of fields and their conceptualizations. However, the distinction offers intuitively meaningful and useful information for building libraries.
The Core Software Ontology formalizes the most fundamental concepts for modeling both software components and Web services, such as software, data, users, access rights, or interfaces. It formalizes such concepts by reusing the modeling basis of Dolce, D&S, the Ontology of Plans, and the Ontology of Information Objects. We've defined the fundamental concepts in the Core Software Ontology, separately from the Core Ontology of Software Components, to facilitate reuse.
The Core Ontology of Software Components is based on the Core Software Ontology to formalize our understanding of the term "software component." This term requires special attention because you can interpret it differently, leading to ambiguity. We also put libraries and licenses in this core ontology. The ComponentProfile is the core concept and aggregates a component's relevant aspects. It allows for browsing and querying the various component properties.
The Core Ontology of Services is also based on the Core Software Ontology. It formalizes our understanding of the term "Web service" and introduces the notion of ServiceProfiles. You can access more information on all our core ontologies elsewhere. 11
Domain ontology modules. Domain ontologies enrich the core ontologies with additional domain-dependent knowledge. We specialize the concepts and associations of both core ontologies for the particular needs of the infrastructure in our prototype. Several idiosyncrasies of the underlying platform come into play and are formalized accordingly.
Examples
Our ontologies can formalize the use cases we discussed in the "Application servers" and "Web services" sections.
Application server example. Returning to our indirect-permission example, we now show how using our ontology can support developers. The Core Ontology of Software Components introduces concepts such as Resource, User, RequestContext, or AccessRight. Associations hold between concepts: grantedForUser between AccessRight and User, definedOnResource between AccessRight and Resource, invokes between one Resource and another, and so on.
The association below, namely permission, holds when a User is directly granted an AccessRight on a Resource. It's defined intensionally, that is, in the form of a rule (the notation resembles Prolog syntax):
(1) permission(u,r) <- User(u) and Resource(r) and AccessRight(ar) and grantedForUser(ar,u) and definedOnResource(ar,r).
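Rule 1 can be read as a join over facts. The Python sketch below evaluates it over a small fact base whose sets are named after the rule's predicates. The access-right identifiers ar1 and ar2 and the grants themselves are assumptions for illustration, and the sketch does not reflect how the inference engine evaluates rules internally.

```python
# Facts, named after the predicates in rule 1. The access rights ar1 and
# ar2 and their assignments are hypothetical.
User = {"dob", "dbuser"}
Resource = {"WebShopServlet", "CustomerTable"}
AccessRight = {"ar1", "ar2"}
grantedForUser = {("ar1", "dob"), ("ar2", "dbuser")}
definedOnResource = {("ar1", "WebShopServlet"), ("ar2", "CustomerTable")}

def direct_permissions():
    """Rule 1: join grantedForUser and definedOnResource on the access right."""
    return {
        (u, r)
        for (ar, u) in grantedForUser
        for (ar2, r) in definedOnResource
        if ar == ar2 and ar in AccessRight and u in User and r in Resource
    }

print(sorted(direct_permissions()))
```

With these facts, dob is permitted on the servlet and dbuser on the table, but nothing yet connects dob to the table.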
We extend the definition with a second rule to also capture indirect permission. Rule 2 states that a User u1 is also granted permission on a Resource r2 if u1 is granted permission on another Resource r1 that executes an Invocation accessing r2, provided that the invocation's RequestContext switches to a User u2 who is granted permission on r2.
(2) permission(u1,r2) <- permission(u1,r1) and executes(r1,i) and Invocation(i) and accesses(i,r2) and Resource(r2) and associatedWith(i,ct) and RequestContext(ct) and contextUser(ct,u2) and permission(u2,r2).
We modeled the invocation as a concept to attach context information. In rule 2, the Invocation i is associatedWith a RequestContext. The latter holds all kinds of information, such as security or transaction context settings. In our case, contextUser models the context switch to another user u2.
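Because rule 2 is recursive, evaluating it means applying it repeatedly until nothing new is derived. The Python sketch below does this naively over a small fact base. The direct permissions stand in for the output of rule 1; the account ejbcaller and the identifiers i1, i2, ct1, and ct2 are hypothetical, introduced only so that the servlet-to-bean and bean-to-table hops each have a context user.

```python
# Base facts. The account ejbcaller and the invocation and context
# identifiers are hypothetical.
direct = {
    ("dob", "WebShopServlet"),
    ("ejbcaller", "CustomerEntityBean"),
    ("dbuser", "CustomerTable"),
}
executes = {("WebShopServlet", "i1"), ("CustomerEntityBean", "i2")}
accesses = {("i1", "CustomerEntityBean"), ("i2", "CustomerTable")}
associatedWith = {("i1", "ct1"), ("i2", "ct2")}
contextUser = {("ct1", "ejbcaller"), ("ct2", "dbuser")}

def all_permissions():
    """Apply rule 2 repeatedly until no new (user, resource) pair appears."""
    permission = set(direct)
    while True:
        derived = {
            (u1, r2)
            for (u1, r1) in permission
            for (r1b, i) in executes if r1b == r1
            for (ib, r2) in accesses if ib == i
            for (ic, ct) in associatedWith if ic == i
            for (ctb, u2) in contextUser if ctb == ct
            if (u2, r2) in permission
        }
        if derived <= permission:
            return permission
        permission |= derived

indirect = all_permissions() - direct
print(sorted(indirect))
```

In this fact base, dob reaches the Customer table in two steps: first the bean through the servlet's invocation, then the table through the bean's context switch to dbuser.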
Applying an inference engine lets us query for user-resource pairs that these rules deduce at runtime (as we'll demonstrate later in the article).
Web services example. The Core Ontology of Services provides the basic means to capture WS-BPEL and WS-Policy information coherently. This coherence makes it simple to reason with and query such information. In our case, Rule 3 infers all pairs of Web services x and y where x invokes y and y is attached with a policy:
(3) invokesWebServiceWithPolicy(x,y) <- invokes(x,y) and WebService(x) and WebService(y) and describes(sp,y) and ServiceProfile(sp) and profiles(sp,pd) and PolicyDescription(pd).
The ServiceProfile is the key to obtaining this knowledge. First, it contains a Plan to represent the workflow information obtained from a WS-BPEL file, where the invokes association represents a WS-BPEL partnerLink—that is, an invocation of an external service. Second, it contains a PolicyDescription. In our example, it suffices to model a policy's existence rather than the policy itself.
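Rule 3 can likewise be sketched as a join in Python. The service and profile names below are assumptions: in particular, VisaService is given no policy description purely to show that the rule filters it out.

```python
# Facts named after the predicates in rule 3. All individuals are
# hypothetical; VisaService deliberately has no policy attached.
WebService = {"checkAccount", "MasterCardService", "VisaService"}
invokes = {("checkAccount", "MasterCardService"), ("checkAccount", "VisaService")}
ServiceProfile = {"spMasterCard", "spVisa"}
describes = {("spMasterCard", "MasterCardService"), ("spVisa", "VisaService")}
PolicyDescription = {"pdMasterCard"}
profiles = {("spMasterCard", "pdMasterCard")}

def invokes_web_service_with_policy():
    """Rule 3: pairs (x, y) where x invokes y and y's profile has a policy."""
    return {
        (x, y)
        for (x, y) in invokes
        if x in WebService and y in WebService
        for (sp, y2) in describes
        if y2 == y and sp in ServiceProfile
        for (sp2, pd) in profiles
        if sp2 == sp and pd in PolicyDescription
    }

print(invokes_web_service_with_policy())
```

As in the rule, the sketch only tests that a policy exists. It does not inspect what the policy requires.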
KAON SERVER
We propose semantic management of software components and Web services in an application server to simplify distributed applications' development and management. We want to leverage our ontology to support developers and administrators in their cumbersome tasks during development, deployment, and runtime. We achieve this by integrating ontology infrastructure in the application server, allowing for reasoning and querying at runtime. We implemented a prototype of our scheme in KAON SERVER, which we created using an existing tool suite and open source application server. 12
Architecture

Figure 8. Our system's overall architecture.

In the indirect-permission example, the component metadata collector obtains semantic metadata from web.xml and ejb-jar.xml files and information from the source code, as well as metadata from the database regarding the Customer table. The component metadata collector integrates the information into the ontology. The developer or administrator can now use the administration console to query for indirect permission as we sketched in the "Application server example" section.
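One way to picture the collector's job is as a translation from per-artifact findings into facts for the predicates used in the rules. The Python sketch below is hypothetical throughout: the function name, the shape of its inputs, and the generated identifiers are assumptions, and it does not show KAON SERVER's actual interfaces.

```python
def collect_facts(protected, run_as, calls):
    """Translate descriptor and source-code findings into predicate facts.

    protected: {resource: [users allowed]}, as might be derived from web.xml
    run_as:    {resource: user}, as might be derived from ejb-jar.xml
    calls:     [(caller, callee)], from source code and table metadata
    All three input shapes are assumptions made for this sketch.
    """
    facts = set()
    for resource, users in protected.items():
        for user in users:
            facts.add(("permission", user, resource))
    for n, (caller, callee) in enumerate(calls, start=1):
        invocation, context = f"i{n}", f"ct{n}"
        facts.add(("executes", caller, invocation))
        facts.add(("accesses", invocation, callee))
        facts.add(("associatedWith", invocation, context))
        if caller in run_as:
            # The caller declares a context switch to another user.
            facts.add(("contextUser", context, run_as[caller]))
    return facts

facts = collect_facts(
    protected={"WebShopServlet": ["dob"]},
    run_as={"CustomerEntityBean": "dbuser"},
    calls=[("WebShopServlet", "CustomerEntityBean"),
           ("CustomerEntityBean", "CustomerTable")],
)
for fact in sorted(facts):
    print(fact)
```

Once all sources feed one fact base, a single query over the permission association can span what were previously three separate artifacts.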
Regarding the semantic management of Web services, we assume a Web application built of software components (servlets or EJBs) that in turn invoke a number of Web services; that is, they contain the "wiring" application logic. A service metadata collector parses the Web services' WSDL (Web Services Description Language), WS-BPEL, or WS-Policy documents and integrates this information as instances in the ontology. We might also leverage WS-MetadataExchange (WS-MEX), a proposal that enriches each Web service with standardized methods for retrieving WSDL, XML-Schema, and WS-Policy information. The service metadata collector integrates the data into the ontology similarly to the component metadata collector. The service metadata collector can integrate runtime information stemming from, for example, load monitors as well. The developer can query the inference engine using the administration console. We take into account both WSDL and WS-BPEL documents, whose URLs the administrator must provide to the service metadata collector by hand, leaving other documents and the exploitation of WS-MEX or runtime information for future work. 2
In our Web service example, the service metadata collector obtains semantic metadata from the WS-BPEL and WS-Policy files and integrates them into the ontology. The developer or administrator can now use the administration console to query for any externally invoked Web services with attached policies at development time.
Implementation
Our intention wasn't to develop an application server from scratch but to reuse existing ontology tools in an existing application server. So, we used tools from KAON (our own Karlsruhe Ontology and Semantic Web Tool suite, http://kaon.semanticweb.org) in the open-source application server JBoss. 13 JBoss provides a flexible and extensible infrastructure based on the Java Management Extensions (JMX, http://java.sun.com/products/JavaManagement/). JMX makes it possible to configure, manage, and monitor Java applications at runtime and to partition applications into exchangeable components. Basically, JMX defines interfaces of managed beans (MBeans), which are JavaBeans that represent JMX manageable resources. MBeans are hosted by an MBeanServer, which allows their runtime deployment and manipulation. All management operations are performed on the MBeans through interfaces on the MBeanServer. We deploy our KAON inference engine and ontology store as an additional MBean and augment the existing component loader and dependency management to exploit the inference engine. You can use our ontology editor to browse, query, and reason with this MBean at runtime. 14
Demonstration
We demonstrate only the indirect-permission example because we're still implementing semantic management of Web services. We use the KAON Toolsuite's ontology editor to browse and query the ontology at runtime. The developer or administrator can execute a query to reveal all of an authenticated user's indirect permissions (the user dob belongs to this group). Click on figure 9 for an audiovisual multimedia demonstration. You can download the code from http://kaon.semanticweb.org/server.

Figure 9. Browsing the ontology at runtime.

Benefits
Demonstrating that we've achieved our goal of facilitating the management of application servers and Web services proves to be problematic for two reasons. First, application servers' complexity makes it difficult to single out, measure, and evaluate improvements of any kind. An application server consists of many interwoven parts. Often an application server subsumes several other types of middleware in one product.
Second, it's usually difficult to numerically substantiate the advantages of ontology-based applications. The best way to demonstrate their benefits is to perform a controlled experiment on a modularized application. Modules providing the same functionality with and without the use of ontologies must be applied and the application evaluated each time. Such experiments are difficult to set up, and in many cases, the nature of the application makes them impossible. This is the case with semantic management because its usage is spread throughout the target platform. Furthermore, the developer and administrator must familiarize themselves with ontologies and semantic technology, much like they have or had to familiarize themselves with deployment and WS* descriptors. In both cases, they would have to compare this familiarization effort with the management effort the system would save them. In addition, we must consider the ontology maintenance effort, because ontologies typically evolve over time. Because finding measures for a sensible comparison is difficult, we take a qualitative approach to assessing the benefits of semantic management.
Our fundamental observation is that semantic management doesn't come for free. We encounter a trade-off between expending efforts for management and expending efforts for semantic modeling. The objective of automating all management tasks by semantic modeling requires fine-grained, detailed modeling of all aspects, essentially everything that a human must know to be able to manage the middleware. Savings in other places would hardly justify such detailed modeling. So, our approach strives to facilitate the management of some tasks with justifiable modeling efforts. In essence, we compare management and modeling efforts with and without semantic management.
Application server example. Table 1 compares the management and modeling efforts of the indirect-permission example, putting aside the initial efforts of creating the ontology and integrating the ontology infrastructure in the application server. Without semantic management, detecting an indirect permission requires understanding and checking i servlet web.xml files and their code, j ejb-jar.xml files, and k table metadata. Using semantic management, the component metadata collector automatically integrates all the information in the ontology, so we don't have to expend additional modeling effort. In this example, one query for the permission predicate fulfills the management task. The effort required to formulate and execute the query is negligible. In an ideal case, the query would be hidden behind a button in a semantic-management console.

Table 1. Effort comparison for our indirect-permission example.

Management effort without semantic management: O(i*j*k); that is, for i servlets, j Enterprise JavaBeans, and k tables, compare i web.xml files and code, j ejb-jar.xml files, and k table metadata.
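One reading of the O(i*j*k) bound is that every servlet-bean-table chain is a candidate for an indirect permission that someone has to rule in or out. The Python sketch below counts those candidates for small, arbitrarily chosen values of i, j, and k. The component names and the counts are assumptions for illustration.

```python
from itertools import product

servlets = [f"servlet{n}" for n in range(3)]   # i = 3
beans = [f"bean{n}" for n in range(4)]         # j = 4
tables = [f"table{n}" for n in range(5)]       # k = 5

# Every servlet-bean-table chain is a candidate the developer may have to
# check by reading descriptors, code, and table metadata.
candidate_chains = list(product(servlets, beans, tables))
print(len(candidate_chains))  # 3 * 4 * 5 = 60
```

With semantic management, the same question is a single query for the permission predicate, regardless of how many components are deployed.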
Web service example. Table 2 compares the management and modeling efforts of our Web service example. As we showed earlier, hardly any formal machinery supports discovering external service invocations with attached policies. Using semantic management, we can simply query for invokesWebServiceWithPolicy with no additional modeling effort.

Table 2. Effort comparison for our Web service example.

Ontologies and their corresponding semantic technology provide great practical benefits for handling middleware environments (as demonstrated by several related approaches). We plan to extend the implementation of our proposed KAON SERVER design, which is only a prototype. It isn't surprising that KAON SERVER is only a prototype because the breadth and depth of the presented use cases are large. Also, the semantic management of Web services requires considering many more details. Although the problems are similar to the semantic management of software components, this situation is more complex because of the mere fact of distribution, which entails network delays as well as reliability, trust, and other security issues. Finally, we base our assessment of semantic management's benefits on a qualitative comparison between modeling and management efforts with and without semantic management. However, a full assessment will need to include further crucial factors, such as

• application requirements (the number and size of applications, the frequency of changes, the required service level, and so on),

• organizational factors (the number of developers and administrators, their skills, their turnover, the learning curves for using deployment descriptors or for using semantic technology, and so on), and

• service characteristics (the number of Web services, as well as their diversity and complexity).

So, future research should strive for a comprehensive economic analysis of the semantic-management approach. We've taken the first steps in this direction. 15
The SmartWeb Project (sponsored by the German Federal Ministry for Education and Research) under grant number 01IMD01A, ASG (Adaptive Services Grid), EU IST project FP6-004617, and EU IST project DIP 507483 partially funded research for this article.


Daniel Oberle is a senior researcher at SAP Research. He received his PhD from the University of Karlsruhe, Institute AIFB. His thesis discussed applying Semantic Web technologies in middleware solutions—such as application servers and Web services—to facilitate developers' and administrators' daily tasks. He's the author of Semantic Management of Middleware (Springer, 2006). Contact him at SAP Research, CEC Karlsruhe, SAP AG, Vincenz-Priessnitz-Str. 1, 76131 Karlsruhe, Germany; d.oberle@sap.com; www.aifb.uni-karlsruhe.de/WBS/dob.

Steffen Staab is a professor of databases and information systems in the University of Koblenz-Landau's Institute for Computer Science. He heads the research group on information systems and the Semantic Web (ISWeb). His research interests include the Semantic Web, text mining, ontologies, peer-to-peer computing, and service management with semantic descriptions. He coedited The Handbook on Ontologies (Springer, 2004). He is a member of the German Informatics Association (GI e.V.) and the German society for knowledge management. Contact him at University of Koblenz-Landau, 56016 Koblenz, Germany; staab@uni-koblenz.de; http://isweb.uni-koblenz.de.

Andreas Eberhart is a software architect for virtualization and grid technologies at HP Germany. Before joining HP, he worked for Informix Software, the International University in Germany, and the University of Karlsruhe, where he led several Semantic Web research projects. Contact him at Essentials Software Organization, Hewlett-Packard GmbH, Altrottstr. 31, D-69190 Walldorf; eberhart@cs.pdx.edu.