Issue No. 03 - July-September (2009 vol. 8)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MPRV.2009.58
Roy Want, Intel Research
In computer science, we've been able to invent and explore a rich set of alternate realities. The short list includes virtual reality, augmented reality, embodied virtuality (ubiquitous computing), cross-reality, and mixed reality (some combination of the others). Given that the topic of this special issue is cross-reality, it's useful to review these related concepts and their distinguishing features (see Figure 1).
The first of these alternate realities to be widely written about is virtual reality (VR): the creation of a new world that exists solely within the data structures of a computer. Such systems allow a user to participate in a virtual world through sensory immersion, using a head-mounted display and a body-worn sensor system (often simplified to a sensor glove). VR has been used successfully in the development of games, and you might consider most of the first-person 3D games on the market a form of VR, loosely coupling the gamer to the virtual world through a screen and keyboard. Here, the graphical perspective is rendered from the viewpoint of the game character. Although this isn't total immersion in the VR sense, the quality of the game graphics and storyline is still good enough to suspend disbelief, making the interaction very compelling for users even though it's through a narrow window into the VR world.
Augmented reality overlays information onto the real world. It's most effective for vision, but can be extended to other sensory input such as sound and touch, although smell and taste are more challenging. The use of a heads-up display capable of mixing in text and graphical overlays in specific regions corresponding to objects in the real world is a fundamental capability of this approach. An engineering challenge facing augmented reality is to be able to accurately register the overlay information onto a view of the world in real-time. Applications include maintenance engineering, enabling less skilled workers to perform advanced maintenance procedures; and navigation, enabling a person who's unfamiliar with a location to find their way around. A common form of augmented reality being sold for use in automobiles today is the encapsulated GPS system. Not only does it speak directions, telling you when and where to turn, but modern implementations provide a perspective view of the world. Admittedly, this is a simple graphical view, but sufficient in detail to make the direction choices quite clear. A driver can glance between the GPS perspective display and the car window to create an overlay in their mind's eye.
Embodied virtuality is a less well-known term, and probably the only reason I'm familiar with it is that I worked with Mark Weiser in the 1990s. His 1991 Scientific American article, "The Computer for the 21st Century," describes his interpretation of the term, and I remember that he toyed with the idea of using it as the title of the finished article. His notion of ubiquitous computing or ubicomp (and hence pervasive computing) was essentially the opposite of VR. Instead of users working with virtual representations of data on a PC (for example, desktop icons representing documents, printers, and trash cans), in the ubicomp vision computers and their data were destined to be reintegrated into the world, embodied in the objects they were designed to enhance. For example, Post-it notes, notebooks, and whiteboards were part of the vision that PARC worked on in the early '90s, an exploration facilitated by Tab, Pad, and Liveboard computers taking on these physical forms. In the article Weiser wrote:
"Indeed, the opposition between the notion of virtual reality and ubiquitous, invisible computing is so strong that some of us use the term 'embodied virtuality' to refer to the process of drawing computers out of their electronic shells. The 'virtuality' of computer-readable data—all the different ways in which it can be altered, processed and analyzed—is brought into the physical world."
I'll leave an in-depth discussion of cross-reality to the guest editors and the featured papers, but to summarize, cross-reality augments a virtual world in a similar way to how augmented reality augments the real world. Thus, system designers can link sensors in a virtual world to sensors in the real world. By moving through the virtual world, users can effortlessly monitor what's happening in a corresponding area of the physical world. There are many advantages that result from such a system. First, the task is unimpeded by the environmental conditions—for example, the weather might make it difficult to perform the task in the real world. Second, it provides a spatial metaphor for representing many types of sensor reading; that is, it provides a semantic link between a sensor and the location it's monitoring. Last, other users in the same virtual space can monitor, share, and discuss the data, even though they might be widely distributed in the real world.
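As a rough illustration of the linkage described above, the following minimal sketch ties real-world sensors to regions of a virtual world, so that a position in the virtual space retrieves the readings for the corresponding physical area. All names here (`Region`, `SensorLink`, `CrossRealityMap`, the sensor identifiers, and the sample values) are hypothetical and chosen only for illustration; they are not taken from any system discussed in this issue.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in virtual-world coordinates (hypothetical)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


@dataclass
class SensorLink:
    """Ties one real-world sensor to the virtual region mirroring its location."""
    sensor_id: str
    quantity: str
    region: Region
    latest_value: Optional[float] = None


class CrossRealityMap:
    """Spatial index from virtual-world positions to real-world readings."""

    def __init__(self) -> None:
        self._links: List[SensorLink] = []

    def link(self, sensor_id: str, quantity: str, region: Region) -> None:
        self._links.append(SensorLink(sensor_id, quantity, region))

    def update(self, sensor_id: str, value: float) -> None:
        # In a real deployment this would be driven by the sensor network.
        for link in self._links:
            if link.sensor_id == sensor_id:
                link.latest_value = value

    def readings_at(self, x: float, y: float) -> Dict[str, float]:
        """Readings a user would see when standing at (x, y) in the virtual world."""
        return {
            link.quantity: link.latest_value
            for link in self._links
            if link.latest_value is not None and link.region.contains(x, y)
        }


if __name__ == "__main__":
    world = CrossRealityMap()
    world.link("t1", "temperature_c", Region("lab", 0, 0, 10, 10))
    world.link("w1", "wind_speed_mps", Region("roof", 20, 0, 30, 10))
    world.update("t1", 21.5)
    world.update("w1", 7.0)
    print(world.readings_at(5, 5))   # a user standing in the virtual lab
    print(world.readings_at(25, 5))  # a user standing on the virtual roof
```

The region lookup is what supplies the spatial metaphor: a reading is found by where the user stands rather than by a sensor name. Because the map is shared state, several distributed users could query the same position and discuss the same data.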
To some degree, in these systems we're changing how our senses perceive reality to provide a more effective interface with the world. In recent times, computers and computer networks have enabled us to do this on a scale that's unprecedented in history, which raises the question: what makes an effective augmented- or cross-reality? Even before computers, there were analog equivalents to this type of transformation. Consider the following examples:
• looking through tinted sunglasses, we see the world darker than it is; and
• looking at specimens under a microscope, biologists can resolve minute details in a world that's too small for our eyes to resolve naturally.
Both of these mechanisms change our perception in a useful way. The latter example provides further illustration, as a biologist will sometimes use dye to stain cells in a specimen and create a contrast between features under investigation and the surrounding tissue. The result can be a dramatic change in the visible detail, but the picture no longer reflects the original image of the cell. On the other hand, the result is far more useful to the biologist.
Now that digital photography has replaced film-based solutions, it's common practice to modify pictures after they've been taken: removing red-eye and changing the contrast or color balance to make them more visually pleasing. Are these pictures now fake, or just a different kind of representation?
Computers are accelerating our ability to distort reality, and it isn't clear where the limits should be placed. For example, what transformations can we make to a photograph and still consider it the same photograph? Part of the answer might be to consider whether the operation is applied uniformly or as a localized change. Adding a new object to an image clearly changes the composition semantics. However, changing the contrast affects everything equally. The picture changes its appearance, but the image semantics do not. The examples of photographic red-eye reduction and staining a cell are more problematic because they have localized results, the process affecting specific regions more than others. On the other hand, a stain affects all instances of features that absorb the stain, and red-eye reduction applies to all eyes in the picture, so these techniques still preserve the underlying composition of the viewed image.
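One way to make the uniform-versus-localized distinction concrete is sketched below, assuming a grayscale image stored as rows of 0..255 integers. The functions `stretch_contrast`, `paste_patch`, and `is_uniform` are illustrative names, and the uniformity check is only a crude proxy: it asks whether every input pixel value maps to a single output value wherever it occurs.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale rows of 0..255 values (assumed layout)


def stretch_contrast(image: Image, gain: float = 1.5, pivot: int = 128) -> Image:
    """Global change: every pixel passes through the same value-to-value curve."""
    def curve(v: int) -> int:
        return max(0, min(255, round(pivot + gain * (v - pivot))))
    return [[curve(v) for v in row] for row in image]


def paste_patch(image: Image, top: int, left: int, patch: Image) -> Image:
    """Localized change: overwrite a rectangle, as when adding a new object."""
    out = [row[:] for row in image]
    for dy, patch_row in enumerate(patch):
        for dx, v in enumerate(patch_row):
            out[top + dy][left + dx] = v
    return out


def is_uniform(before: Image, after: Image) -> bool:
    """True if each input value maps to exactly one output value everywhere."""
    mapping: Dict[int, int] = {}
    for row_before, row_after in zip(before, after):
        for b, a in zip(row_before, row_after):
            # setdefault records the first mapping seen for b, then compares.
            if mapping.setdefault(b, a) != a:
                return False
    return True


if __name__ == "__main__":
    picture = [[10, 10, 200],
               [10, 200, 200],
               [10, 10, 10]]
    print(is_uniform(picture, stretch_contrast(picture)))
    print(is_uniform(picture, paste_patch(picture, 0, 0, [[255]])))
```

Under this check a contrast stretch counts as uniform and a pasted object does not. Red-eye reduction and staining would fall in between: they act only on certain regions, yet the rule choosing those regions is applied consistently across the whole image. A purely value-based test like this one cannot capture that.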
However, computers have the ability to make arbitrary non-linear, local or global changes to any data that's presented to a user, and thus can augment our perception of reality, hide it, or falsify it. From a user's perspective, we now have a problem: data presented to us in a cross-reality could belong in any of these categories, and we can't tell which one. This isn't just an issue for systems created to mislead us; it can also occur as the result of inexperience or design error. Programming any kind of system has the potential to introduce bugs, and complexity accentuates the problem. For example, focusing an image digitally after capture is likely to involve complex transforms that result in a clearer photograph, but how does a programmer know when the code is working properly? If a few test images work fine, many programmers will consider the job done, but transformations of some alternate images could result in undesirable artifacts.
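The gap between "a few test images work fine" and "the transform is correct" can be illustrated with a small sketch. It assumes a one-dimensional scan line of 0..255 values and uses a simple unsharp mask as a stand-in for a real refocusing transform, which would be far more complex. The names `sharpen`, `in_range`, and `find_counterexample` are hypothetical, and the missing clamp in `sharpen` is a deliberate bug.

```python
import random
from typing import Callable, List, Optional, Sequence


def sharpen(row: List[int], amount: float = 1.0) -> List[float]:
    """Unsharp mask on one scan line; deliberately lacks clamping to 0..255."""
    n = len(row)
    out = []
    for i, v in enumerate(row):
        left = row[max(i - 1, 0)]        # repeat the edge pixel at the borders
        right = row[min(i + 1, n - 1)]
        blurred = (left + v + right) / 3
        out.append(v + amount * (v - blurred))
    return out


def in_range(row: Sequence[float]) -> bool:
    """Property every displayable scan line should satisfy."""
    return all(0 <= v <= 255 for v in row)


def find_counterexample(
    transform: Callable[[List[int]], Sequence[float]],
    trials: int = 1000,
    width: int = 8,
    seed: int = 0,
) -> Optional[List[int]]:
    """Try many random inputs and return the first one that breaks the property."""
    rng = random.Random(seed)
    for _ in range(trials):
        row = [rng.randint(0, 255) for _ in range(width)]
        if not in_range(transform(row)):
            return row
    return None


if __name__ == "__main__":
    # Two hand-picked "test images" that look fine.
    print(in_range(sharpen([100, 110, 120, 130])))
    print(in_range(sharpen([50] * 8)))
    # A broader search over random inputs.
    print(find_counterexample(sharpen))
```

The smooth gradient and the flat line stay within range, so a programmer relying on them alone might consider the job done. A hard edge such as `[0, 0, 255, 255]` overshoots the valid range, and a randomized search surfaces inputs of that kind without anyone having to think of them in advance.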
A challenge in changing our reality in a pervasive computing world is to ensure that we benefit from the advantages of a well-designed transform rather than suffer the disadvantages of a badly designed one. In nature, evolution has produced many different kinds of sensory systems in animals that need to survive in diverse environments. Presumably, mutations led to these successes, but there must have been many failures along the way that were removed from the gene pool. The analogy with cross-reality may be the failure, or success, of applications that attempt to present information in a specialized form, resulting either in systems that are unusable or in systems that prove to be indispensable tools for specific types of work practice. As with many of our explorations in Pervasive Computing, it's all a big adventure, and you never know where a new method of interacting with our world will lead us.