Pages: pp. 32-37
If you're like most of us, these days you're hearing about agents in virtually every computer publication you read. Even the Wall Street Journal and the New York Times are talking about "intelligent agents" that will revolutionize the Internet. Is all this hype, or is there a serious core to agent research that deserves all the attention? To be honest, these days, I'm not really sure. What I do know is that many aspects of agents—the communities using the term, the communication languages that support the research, and the functionalities that different so-called agents provide—have confused terminologies and multiple threads that all blend together. Although there's no way an editorial such as this one can unravel the tapestry, perhaps I can shed light on some of the confusions and point out some of the differences.
I've been giving talks about intelligent agents for more than a decade, but increasingly in the past two years I've been asked, "But what do you mean by 'agents'?" Once upon a time I knew what the term meant, and my community seemed to agree with me. In my first journal publication on the topic, my coauthors and I had a clear idea in mind: we were working on a planning system that interacted with a changing environment in real time [1]. Our dynamic-reaction model described how an intelligent agent (that is, a planning system embedded in a dynamic world) could change behaviors and avoid catastrophic results in a manner very different from the underlying mechanisms used in Strips, Nonlin, and other generative planning systems. This idea was then very much in vogue, and many common aspects of what is now called agent technology grew from these roots. The belief-desire-intention architecture, the dynamic planning agent, the key ideas in softbots, and many other such ideas have entered the vocabulary through this intelligent agents literature. (If this were a scientific article, not an editorial, I would feel compelled to properly cite the dozens of papers in this and other areas discussed in this article. Claiming editorial license, I'm going to recommend the Web resources listed in the related sidebar, which will take the interested reader to many of the papers that I would otherwise cite.)
Unfortunately, at the same time my research colleagues and I had this clear view of intelligent agency, so did several other AI communities. For example, work in decision making, going all the way back to early economic and philosophical literatures, discussed the decision making of a "rational" being, an entity that could be used in explaining many behaviors. As computer scientists looked at these behaviors, the term agent emerged as a convenient way to describe the decision-making entity exhibiting them. For example, a "bounded rational agent" had decision-making capabilities that were restricted in terms of time or computational resources. Similarly, the economic models are convenient for predicting the behavior of large sets of entities (aka agents) that bid for contracts using limited economic resources. This idea resonates particularly well with e-commerce approaches, reinforcing these models as an important part of the agents literature.
Meanwhile, both the embedding of an agent in an environment and the notion of a decision-making agent come to the forefront when you start working on mobile robotics, particularly autonomous mobile robots (aka agents) that operate in natural environments. As the capabilities of such robots improved (that is, as they seemed more intelligent), more agent terminology entered the robotics literature. As researchers built multiple robots that solved tasks cooperatively, the term "multiple agents" also joined this literature. Thus roboticists also started using "agent" to discuss their work. In fact, a roboticist colleague of mine, David Miller (then at NASA's Jet Propulsion Laboratory, now the Technical Director of the KISS Institute for Practical Robotics, www.kipr.org), was fond of telling government agencies that they should stop thinking about AI (artificial intelligence) and start funding IA (intelligent agents). Many current agent architecture ideas, particularly those of multiple-level architectures and emergent behaviors, grew largely from this sort of robotics activity.
While the above examples by no means exhaust AI work using the term agent, they were some of the major strains. To see this work's impact in the field, simply consider the past few winners of the Computers and Thought Award of the International Joint Conference on Artificial Intelligence, AI's top prize for young researchers. Such work in robotics was recognized by the presentation of the award to Rodney Brooks in 1991. That year, the award's corecipient was Martha Pollack, who worked on planning systems and promoted research on that sort of intelligent agent. In 1995, the joint recipients were Stuart Russell, for his work in bounded rational agents, and Sarit Kraus, who worked in multiagent negotiation. In 1997, Leslie Kaelbling won the award, acknowledging her work that combined all three of the strands described above. In short, the field recognized all these strands as critical, and many of the top young researchers in the field worked on them.
If only several strands of the AI community used the term "agents," that would be bad enough. But several other communities were also using it, which drastically complicated the situation. The Internet's growth was giving rise to a new model of software—one that took encapsulation beyond the object level and gave software modules a "temporal" extent. That is, it created subroutinelike entities that could be called separately from the individual programs that created them. These "software agents" became particularly valuable when coupled with mobile code—either in the form of Java applets or as "mobile agents" that could move around a network.
Given such mobile agents, why not consider them as workers who could do tasks for users? Thus, a "search agent" could help a user find things on the Net, a "routing agent" could help packets move faster, and other agents could perform various other tasks for their users—analogous to the way travel agents help us plan trips in the noncyber world. One important aspect of this work was the design of "information-filtering agents," which could look for particular conditions in a network or data resource, and alert a user or take other actions. Thus, ideas from the database community, particularly the notion of standing queries, also started to have an impact on the growing panoply of agent designers.
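To make the standing-query idea concrete, here is a minimal sketch of an information-filtering agent. Every name in it (`StandingQuery`, `FilteringAgent`, `cheap_roses`, the record fields) is hypothetical and chosen only for illustration; it assumes the data source can be treated as a stream of dictionary records and that "alerting the user" can be modeled as appending to a list.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


@dataclass
class StandingQuery:
    """A named condition that stays registered and is checked against each new record."""
    name: str
    condition: Callable[[Record], bool]


@dataclass
class FilteringAgent:
    """Hypothetical agent that watches a data source on a user's behalf."""
    queries: List[StandingQuery] = field(default_factory=list)
    alerts: List[Tuple[str, Record]] = field(default_factory=list)

    def register(self, query: StandingQuery) -> None:
        self.queries.append(query)

    def observe(self, record: Record) -> None:
        # Evaluate every standing query against the newly arrived record.
        for query in self.queries:
            if query.condition(record):
                self.alerts.append((query.name, record))


def cheap_roses(record: Record) -> bool:
    # Illustrative condition: roses offered below a price of 20.
    return record.get("item") == "roses" and record.get("price", float("inf")) < 20


agent = FilteringAgent()
agent.register(StandingQuery("cheap roses", cheap_roses))
for offer in [{"item": "roses", "price": 25},
              {"item": "roses", "price": 15},
              {"item": "tulips", "price": 5}]:
    agent.observe(offer)

print(agent.alerts)
```

Unlike an ordinary query, which runs once against the data that exists now, the condition here persists and is re-evaluated as each new record arrives; only the second offer triggers an alert.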
User-interface researchers were also taken by the term agents. In an effort to explain new interfaces in a way that would distinguish them from direct-manipulation approaches—interfaces that required the user to be in control at all times—the phrase "autonomous agent" emerged. The notion of autonomy and mobile networks allows for a model of "interface agents" that, using the software-agent ideas above, lets the user dispatch tasks. Similarly, the entities on the screen could look more lifelike (not just like that annoying little Microsoft paper clip, but really lifelike, including VRML and other 3D models of humanlike forms), so you could relate to them as agents with which you would interact.
By this point, it should be clear that a lot of exciting research areas, only some of which related to each other, were emerging—all using the term agents. Because of all this attention, both commercial and government entities decided to use the new technologies (lumping them all together as "agents"). This, of course, led to investment, which led to the current feeding frenzy. Pretty much anyone who is implementing anything, especially if it is distributed or network-based, is willing to call it an agent. Thus, the term agent, even limited to an information technology context, refers to many different software concepts (see Figure 1).
Figure 1. A list of some agent technologies.
I wish I could say the situation is getting better, but I'm afraid it isn't. The only consoling thought is that if history is any predictor, in a few years people will be frustrated with "agents," funding will dry up, and only the core uses of the term—mobile software, autonomous systems (software and robotic), and interfaces—will likely survive. We can but hope....
Just as the term agent has been used very ambiguously, one of the key underlying ideas, that of agent communication languages (ACLs), also has covered a number of different areas. These areas are often conflated, thereby adding to the confusion. While one ACL group might worry solely about mobility, a second might be discussing auction bidding mechanisms and ignoring mobility altogether. A third might be concerned with the content exchange between distributed planning systems, worrying about neither of the other two.
Roughly speaking, researchers have seemed to use "ACL" to refer to four different key components: the performative, service, content, and control add-ins levels (see Figure 2).
Figure 2. The four levels of agent communication languages.
The performative level is where much of the action has been. By focusing on a small set of terms that we can widely use to distinguish one agent's information requirements from another's, we can design standard mechanisms that let agents negotiate an agreement during cooperative problem solving. Essentially, this level provides the protocols with which agents can establish some form of knowledge interchange. Many current proposals for standard ways for agents to communicate (for example, KQML and FIPA's ACL) focus primarily on the design of unambiguous performative terms.
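As a rough illustration of what a performative-level message might look like, the sketch below wraps some content in a small fixed vocabulary of performatives. The performative names are borrowed from KQML, but the `Message` class, the `reply` helper, the rendered syntax, and the flower-price content are assumptions made for this example and do not reproduce any standard faithfully.

```python
from dataclasses import dataclass
from typing import Dict

# A deliberately tiny vocabulary; real proposals define many more performatives.
PERFORMATIVES = {"ask-one", "tell", "advertise", "sorry"}


@dataclass(frozen=True)
class Message:
    performative: str
    sender: str
    receiver: str
    content: str

    def __post_init__(self) -> None:
        # Reject anything outside the agreed vocabulary.
        if self.performative not in PERFORMATIVES:
            raise ValueError(f"unknown performative: {self.performative}")

    def render(self) -> str:
        # Parenthesized keyword syntax loosely modeled on KQML messages.
        return (f"({self.performative} :sender {self.sender} "
                f":receiver {self.receiver} :content \"{self.content}\")")


def reply(request: Message, known: Dict[str, object]) -> Message:
    """Answer an ask-one with tell when the answer is known, otherwise with sorry."""
    if request.performative != "ask-one":
        raise ValueError("this sketch only answers ask-one requests")
    if request.content in known:
        answer = f"{request.content} = {known[request.content]}"
        return Message("tell", request.receiver, request.sender, answer)
    return Message("sorry", request.receiver, request.sender, request.content)


ask = Message("ask-one", "buyer", "florist", "price(roses)")
print(ask.render())
print(reply(ask, {"price(roses)": 18}).render())
```

The point of the sketch is that the performative tells the receiver what kind of exchange is under way (a question, an assertion, a refusal) without either side having to interpret the content first.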
However, those models that assume that agents must find each other across a distributed network might need a second level. For agents to find each other, they must be able to advertise their capabilities, find agents that can provide the capabilities they cannot, or both. To use a human analogy, if you wish to buy flowers, you need to find someone who can sell them. This can be harder than it seems, because either agreement must exist within the community as to what terms will be used for advertising capabilities, or else some sort of complex thesaurus must exist to provide matchmaking services. Thus, a flower buyer might want to interact with an agent that can "sell roses."
This service level of communication does require agreement on terminology or at least a common thesaurus, but the level of ontology needed to support this activity might not be extensive. For instance, when you try to buy the flowers, the two agents brokering the deal might need to know a lot about flowers, prices, delivery addresses, and more. However, finding the other agent might only require those terms that define major classes and activities. There is an open debate as to how much information is needed to provide matchmaking for agent applications, but the need for some sort of yellow pages is manifest. (These yellow pages are not the same as what Internet services call yellow pages. Currently, such services essentially take a host name, for example cs.umd.edu, and produce a corresponding machine address, 128.8.28.whatever. This is more similar to white pages in the real world. Here, we refer to real yellow pages, where you can look up florists or other such services.)
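The matchmaking described above can be sketched as a small yellow-pages registry. This is only one possible reading: the `YellowPages` class, its methods, and the toy thesaurus (a mapping from a specific term to its broader class) are hypothetical, and a real matchmaker would need a far richer treatment of terminology.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Optional


class YellowPages:
    """Hypothetical registry where agents advertise an activity on a class of items."""

    def __init__(self, broader_terms: Dict[str, str]) -> None:
        # Toy thesaurus: maps a term to its broader class, e.g. "roses" -> "flowers".
        self.broader_terms = dict(broader_terms)
        self.listings = defaultdict(set)  # (activity, item) -> set of agent names

    def _generalize(self, item: str) -> Iterator[str]:
        # Yield the item itself, then each broader class above it (cycle-safe).
        seen = set()
        current: Optional[str] = item
        while current is not None and current not in seen:
            seen.add(current)
            yield current
            current = self.broader_terms.get(current)

    def advertise(self, agent: str, activity: str, item: str) -> None:
        self.listings[(activity, item)].add(agent)

    def find(self, activity: str, item: str) -> List[str]:
        """Return agents advertising the activity for the item or any broader class."""
        matches = set()
        for term in self._generalize(item):
            matches |= self.listings.get((activity, term), set())
        return sorted(matches)


pages = YellowPages({"roses": "flowers", "flowers": "plants"})
pages.advertise("florist-1", "sell", "flowers")
pages.advertise("rose-shop", "sell", "roses")
pages.advertise("hardware-1", "sell", "tools")

print(pages.find("sell", "roses"))
```

A buyer looking for an agent that can "sell roses" finds both the rose shop and the general florist, because the thesaurus records that roses are a kind of flower. Note how little the registry needs to know: only major classes and activities, not prices or delivery details.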
However, as I just pointed out, once agents are communicating (using performative terms and having found each other through the service advertisements), they do need a deep level of problem-solving knowledge (often domain-dependent). This is the content level. Research in deep ontologies, such as those supported by the DARPA High Performance Knowledge Bases program (see www.teknowledge.com:80/HPKB/), focuses on providing the sort of complex knowledge of the entities and actions in a domain that such problem solvers need. These "content" languages are another key need of complex, distributed multiagent systems, and so are themselves referred to, at times, as "agent languages."
Finally, other sorts of language features might be necessary as multiple agents try to coordinate their activities, or even as single agents are created to allow more complex functionality on distributed systems. These features constitute the control add-ins level. Where multiple agents might vie for resources, they need mechanisms that let them represent and reason about their own computational needs or those of other agents. Further information might be needed to determine computational aspects of agents, such as whether they are enabled with mobile code, whether they are authorized for various functions, and what languages they communicate with or are encoded in. These sorts of features tend to be unnecessary in non-agent-based systems, so they too are a part of agent communications.
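One way to picture the control add-ins is as extra metadata that each agent carries about itself, which other agents or a controller can reason over. The sketch below is an assumption-laden illustration: the `AgentProfile` fields, the `can_talk` and `admit` helpers, and the first-come resource policy are all invented for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical self-description an agent publishes for control purposes."""
    name: str
    languages: FrozenSet[str]       # languages the agent can communicate in
    authorized_for: FrozenSet[str]  # functions the agent may perform
    mobile: bool                    # whether the agent can move across the network
    cpu_units: int                  # declared computational need


def can_talk(a: AgentProfile, b: AgentProfile) -> bool:
    # Two agents can communicate directly only if they share a language.
    return bool(a.languages & b.languages)


def admit(profiles: List[AgentProfile], function: str, capacity: int) -> List[str]:
    """Admit authorized agents in the order given while declared needs fit the capacity."""
    admitted = []
    remaining = capacity
    for profile in profiles:
        if function in profile.authorized_for and profile.cpu_units <= remaining:
            admitted.append(profile.name)
            remaining -= profile.cpu_units
    return admitted


a = AgentProfile("bidder-a", frozenset({"KQML"}), frozenset({"bid"}), False, 3)
b = AgentProfile("observer-b", frozenset({"XML"}), frozenset({"observe"}), True, 1)
c = AgentProfile("bidder-c", frozenset({"KQML", "XML"}), frozenset({"bid"}), True, 3)

print(can_talk(a, b), can_talk(b, c))
print(admit([a, b, c], "bid", capacity=5))
```

None of this metadata concerns the problem being solved; it exists only so that agents competing for shared resources can be checked, authorized, and scheduled.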
These four levels are often confused when people discuss agent communications. A gray area does exist between these levels, but keeping them in mind can help you understand some of the differences between competing systems and standards.
Another kind of confusion arises when we look at what agents can do using the approaches and language levels described previously. Although agents have been deployed in many domains, the functions that researchers are exploring have some commonalities. I divide these into roughly four categories: problem-solving, user-centric, control, and translation agents (see Figure 3). As with so much else with agents, these capabilities are often conflated, making it hard to determine what kind of agent approach is coupled with what functionality.
Figure 3. The four categories of agent functionality.
In many cases, researchers working on various agent approaches are looking at complex problem-solving behaviors delivered by network-based software. These problem-solving agents are often derived from AI planning research (and might also include hardware robots). These agents often resemble expert systems, in that they encode domain-specific information to achieve functionality—sometimes using rule-based approaches. The key difference between these agents and the more traditional expert systems is that the agent-based approaches generally focus on agents that provide capabilities for the user—that is, the ability to gather information, to access (and download) Web resources or data from databases, and so on. Colloquially speaking, these agents are like expert systems with hands and feet—abilities to manipulate the information (or physical) world on the users' behalf. A common use of such agents is to gather or filter information. Thus, a user might use an agent to monitor a particular information source and provide an alert if some set of conditions holds.
A second breed of agent worries less about problem solving and more about interaction with a user. Such an agent might include an animated interface tool that can engage the user in an interaction (perhaps aimed at eliciting and solving user goals, perhaps simply for entertainment). This user agent might also focus more on serving as the user's tool for interacting with the network or other applications. This latter situation typically involves interaction with some sort of problem-solving agent, either combined in a single program or through a mechanism that lets two agents cooperatively solve a problem, with one gathering the user's goals and the other solving the problem.
The control agent, which is exclusive to multiagent systems, primarily provides control services to other agents. Thus, the matchmaking function we discussed earlier (the yellow pages for agents to find each other) could be considered a control agent. Such an agent's problem-solving behavior is not tied to a particular application domain. Rather, the agent is a program that helps other agents function together—to find each other, perhaps to control resource use, and otherwise to allow larger-scale multiagent systems to function. In e-commerce applications, for example, where a great many very simple agents compete for bids or the like, agents are clearly needed that can help manage auctions, project how to value resources, and control the distributed system's overall functioning.
Finally, translation agents, although not often seen in current applications, are becoming increasingly important. Consider two systems with differing data standards. Although they might be able to find each other through matchmaking, they're not designed to be compatible. In some cases, a translation agent could provide a bridge between these services. In a distributed system, messages between two agents could be routed through the translation agent, thus letting them communicate without having to be aware of the incompatibility. As the Internet and the different, often competing, agent standards evolve, such agents will be needed for distributed systems to cope with the increasing heterogeneity that results.
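The routing arrangement described above might look something like the following sketch. The `TranslationAgent` class, the format labels, and the date-format mismatch are hypothetical examples; the sketch assumes each agent declares its data format when it registers and that a converter exists for each pair of formats that needs bridging.

```python
from typing import Callable, Dict, List, Tuple

Payload = Dict[str, object]


class TranslationAgent:
    """Hypothetical intermediary that converts messages between incompatible agents."""

    def __init__(self) -> None:
        self.formats: Dict[str, str] = {}  # agent name -> declared data format
        self.converters: Dict[Tuple[str, str], Callable[[Payload], Payload]] = {}
        self.inboxes: Dict[str, List[Payload]] = {}

    def register(self, agent: str, data_format: str) -> None:
        self.formats[agent] = data_format
        self.inboxes[agent] = []

    def add_converter(self, src: str, dst: str,
                      convert: Callable[[Payload], Payload]) -> None:
        self.converters[(src, dst)] = convert

    def route(self, sender: str, receiver: str, payload: Payload) -> None:
        # Convert only when the two agents declared different formats.
        src, dst = self.formats[sender], self.formats[receiver]
        if src != dst:
            payload = self.converters[(src, dst)](payload)
        self.inboxes[receiver].append(payload)


def us_to_iso(message: Payload) -> Payload:
    # Rewrite a MM/DD/YYYY date as YYYY-MM-DD, leaving other fields untouched.
    month, day, year = str(message["date"]).split("/")
    return {**message, "date": f"{year}-{month}-{day}"}


bridge = TranslationAgent()
bridge.register("orders", "us-date")
bridge.register("billing", "iso-date")
bridge.add_converter("us-date", "iso-date", us_to_iso)
bridge.route("orders", "billing", {"date": "12/31/1999", "total": 40})

print(bridge.inboxes["billing"])
```

Neither the sender nor the receiver contains any conversion logic, so each can remain unaware of the incompatibility, which is the property the paragraph above emphasizes.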
As with the different levels of ACLs, the distinctions among these types of agents are often blurred. The implementation styles across these types also have many similarities, which adds to the terminological confusion but enables common research and standards development among the communities doing the different work. (For a look at one initiative that attempts to combine these four types of agents, see the sidebar, "Putting it all together: DARPA's CoABS program.")
It should be clear by now that the so-called field of agent-based systems is really not one field at all. Just as agents conferences present papers ranging from Internet routers to large-scale planning and scheduling systems (as well as robots and search programs), so too does the scientific literature often confuse these issues. I hope this short article helps you understand the articles in this, and many other, publications.
As with most hot topics on the Web, many sites describe work in agent-based systems. These links might help you learn more about this research:
These three sites describe particularly interesting research:
As with all other Web pages, the quality of these pages is variable and changing.
The accompanying article describes four kinds of agents: problem-solving, user-centric, control, and translation agents. Large-scale, cooperative teams comprising interacting agents from all four groups could offer new capabilities that are now beyond the realm of software designers. An infrastructure that provides these capabilities would let software developers design smaller pieces of code that would primarily solve problems by interacting with each other, rather than by each trying to duplicate functions provided by others.
In such a world, heterogeneous systems, separately developed, could be integrated into compound systems at runtime, based on the needs of the particular problems being solved. Finding these code pieces would be enabled by yellow-pages servers using taxonomies of common functionalities. Where gaps exist between the agents, functions such as translation services could provide greater interoperability by seamlessly filling in those gaps. Users would be able to program and interact with their agents using interface tools that let them set preferences and describe needs without specifying algorithmic details. In addition, brokering agents could manage the entire "grid" of cooperating agents—helping to manage the efficient flow of information across the grid. These agents could also provide tools for access control and information security, and provide a database allowing post hoc analysis of problem solving and other grid-management services.
Such a system would provide many capabilities that cannot be realized with any but the most state-of-the-art tools. Users would be able to set up queries to search and filter large knowledge bases, to search through the Net or other information sources, or to find computational resources needed for the problems they were trying to solve—all without needing to know the details of the underlying systems or information repositories. Current legacy systems could be brought to the grid through software wrappers and service descriptions, allowing their functionality to be tapped without major recoding. In addition, the cooperative nature of the problem solving, using existing software components, would let both military and industrial users develop large-scale applications without large-scale software development.
This vision of the software of the future drives the Control of Agent-Based Systems (CoABS) program, supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). CoABS provides funding to almost two dozen universities, companies, and research institutes to cooperate in developing the underlying science for building and controlling such an agents grid. In addition, CoABS is developing software components for validating this concept. Through iterative development and evaluation, the CoABS researchers are providing the necessary software tools and techniques to realize this vision.
For more information about CoABS and the researchers it supports, see our Web site, www.globalinfotek.com.
This article was written before my arrival at DARPA and does not necessarily represent the opinions of DARPA, the ISO, the CoABS program, or any other US government agency. Figure 2 is based on terms in the CYC ontology created by Douglas Lenat and Cycorp.