Issue No. 06 - June (2006 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MDSO.2006.41
Heinz Stockinger, Swiss Institute of Bioinformatics
Grid computing has gained tremendous popularity in the last five years and remains a major topic. However, with this popularity has come a lot of criticism. Some people see in the Grid the next-generation Internet and a potentially huge impact on our future lives. Others call it hype that hasn't fulfilled what it promised to do. Still others argue that there's no real distinction between Grid computing and Web services.
In this article, I analyze the roots of Grid computing, revisit its goals and characteristics, and present a critical discussion of Grid technologies. In addition, I give my view on where the Grid is positioned within computer science and, to a lesser extent, in industry. My main aim is to review the Grid's current state and whether it's applicable in a business environment. These questions emerged in the BIG (Business In the Grid, http://www.cs.univie.ac.at/big) project at the University of Vienna, where new Grid business models were studied. 1
The Grid's background
Grid computing has roots in parallel and distributed computing, dating back to the '80s and '90s. At the time, supercomputers dealt with problems including Grand Challenges 2 (such as climate modeling) and computing-intensive simulations in weather forecasting.
A main aim of the supercomputer era was to deal with the large tasks that typically appear in computational science and engineering. A characteristic of such supercomputers was their homogeneous, typically expensive hardware. Several centers worldwide provided computing power by means of supercomputers. Interconnectivity among processors within a supercomputer, and even within a computer center, was usually good. However, wide-area network connectivity among computer centers was often poor. What's more, a homogeneous programming and security model to connect these computing resources wasn't in place: technologies such as MPI (Message Passing Interface) were mainly designed to run on a single machine where CPUs and networks have similar hardware and performance characteristics. The Distributed European Infrastructure for Supercomputing Applications (http://www.deisa.org) now aims to close this gap by connecting national supercomputing centers.
Because supercomputing technology became too expensive, cluster computing 3 began to gain ground. Its application domain is similar to the original supercomputing domain. The major difference and breakthrough is that computing power is gained by connecting several standard, off-the-shelf hardware components into a single machine. Exchanging individual components such as processors, disks, or network cards then becomes simple and inexpensive.
Within the supercomputing community, several projects focused on fast data access for mainly parallel, out-of-core computation. A special research community dealt with such parallel I/O systems. 4 These storage activities were also direct predecessors of modern data grids.
Another important development is the Internet and its protocols, which enable the Web and Web services. Well-established protocols such as TCP/IP and HTTP form the basis for newer protocols such as SOAP. Protocols are one of the main building blocks that let Grids and other modern distributed computing technologies work together on different platforms. In particular, HTTP (which Tim Berners-Lee invented while working at CERN) enabled a great breakthrough, first in academia but soon in the commercial world as well. Today, the Web is part of our daily lives.
These three domains (supercomputing, cluster computing, and the Internet) strongly influence Grid computing. Additionally, they influence each other such that it's often impossible to clearly separate them. I discuss this in more detail later.
Grid characteristics and definitions
The term Grid computing, or simply Grid, was introduced in 1998. 5 The driving force was the need to carry out computing-intensive tasks. In response, the Grid was developed to provide computing power on demand to everybody who needed it.
Originally, Grid computing mainly targeted users of supercomputing applications. Grid applications can be even more computing intensive than conventional supercomputing applications: if a single supercomputer or cluster can't manage the application workload, several supercomputers or clusters connected via a wide-area network can provide more CPU capacity. More specifically, distributed Grid computing distinguishes high-performance computing (also referred to as supercomputing) from high-throughput computing. In the former case, a single application is processed as quickly as possible to optimize its execution performance; in the latter, many user applications are executed so as to maximize the number of applications completed over a given period. High-throughput applications are typically also more loosely coupled than high-performance applications, making it easier to distribute them across a heterogeneous Grid environment.
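The high-throughput case can be illustrated with a small sketch. It is only an illustration under stated assumptions: the node names, speeds, and job sizes are hypothetical, and the greedy placement rule is a textbook heuristic rather than the scheduler of any particular Grid middleware. It shows why independent, loosely coupled jobs spread easily over heterogeneous resources, and that the figure of interest is jobs completed per unit of time rather than the run time of one job.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str      # hypothetical resource name
    speed: float   # work units processed per hour (assumed value)


def schedule_independent_jobs(job_sizes, nodes):
    """Greedy list scheduling: place each job on the node that would finish it earliest.

    Jobs are assumed to be independent (no communication between them),
    which is what makes them easy to spread over heterogeneous nodes.
    """
    finish_time = {n.name: 0.0 for n in nodes}
    assignment = {n.name: [] for n in nodes}
    # Placing the largest jobs first is a common heuristic for balancing load.
    for size in sorted(job_sizes, reverse=True):
        best = min(nodes, key=lambda n: finish_time[n.name] + size / n.speed)
        finish_time[best.name] += size / best.speed
        assignment[best.name].append(size)
    makespan = max(finish_time.values())       # time until the last job is done
    throughput = len(job_sizes) / makespan     # jobs completed per hour
    return assignment, makespan, throughput


# Hypothetical heterogeneous resources and seven equally sized jobs.
nodes = [Node("cluster-a", 4.0), Node("cluster-b", 2.0), Node("desktop-c", 1.0)]
jobs = [8.0] * 7
assignment, makespan, throughput = schedule_independent_jobs(jobs, nodes)
print(assignment, makespan, throughput)
```

A tightly coupled high-performance application couldn't be treated this way, because its parts would have to exchange data while running and the slowest node or network link would hold back the whole computation.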
Another important aspect of Grid computing is that it provides large storage space. Just as CPU cycles can be shared, several data centers can be connected to make large amounts of storage available. This is also known as a Data Grid. A driving force in the scientific domain is CERN's Large Hadron Collider (LHC) program with its huge storage requirements: several petabytes will be created every year for about a decade. This program's importance became evident in the early Grid computing conferences in 2000, when several researchers mentioned LHC in keynotes and discussions as a sample Data Grid application. Data Grids, with their storage and data management capabilities, have become integral to Grid computing.
Beyond scientific computing?
Looking at application domains, Grid computing covers mainly the research sector. Many people believe that in the future, the Grid will change computing and possibly even everyday life. If we consider the percentage of people who work in science, however, this belief seems unrealistic unless the application domain changes. In other words, a nonscientist's daily life doesn't often include large computing or storage requirements. This rules out the use, and therefore the attractiveness, of Grids for the average citizen. You could argue that the same was true when the Web was introduced, but the Grid seems to have been designed with the idea of creating a big market and influencing our lives. For other emerging technologies, we haven't seen as much marketing.
To reach the aims of pervasive and ubiquitous computing, the Grid must attract more people by being able to solve day-to-day computing problems. Furthermore, the Grid should use simple concepts that many people can understand. The system's complexity must be hidden by easy-to-use tools, interfaces, and high-level applications. One step in this direction seems to be service orientation and the use of Web services, a concept that's already widely used in the computing world. However, further work is required. This became particularly evident within the scope of the BIG project (http://www.cs.univie.ac.at/big), which aimed to discover new business models.
Here's a practical example. If you decide to promote Grid technology for enterprises and business customers, the core Grid application domain is limited unless you write new services that connect distributed sites. For instance, Grid technologies can be used efficiently in the automotive sector, where computing-intensive simulations are required in the design phase. 6 On the other hand, a small company might not need to apply Grid technology unless it opens a new market with either customers or business partners.
Another important factor is stability. Grid technology must stabilize and not change every few months. We've seen many major changes in Grid systems in the scientific domain. Sometimes even early adopters don't want to continue changing their systems. The scientific community is less reluctant to accept system upgrades, but business users must have continuous and guaranteed service operation.
Most importantly, the Grid provides connectivity and therefore access to distributed resources. For the day-to-day user, transparent access to distributed resources is much more important than the mainstream domains of computing- or data-intensive applications, so promotion of the Grid shouldn't focus on those domains alone. This access can be simple, such as connecting to an image database via a cell phone. Keeping in mind this connectivity-driven thinking, it's easier to define new applications that can be built with standard Grid or Web service technology.
Grid, Web, and distributed computing
Several discussions exist on the difference between Grid, Web, and distributed computing. Given the development of the three as I pointed out earlier, it's clear that both the Grid and the Web are special forms of distributed computing. In particular, both typically deal with wide-area distributed computing where computing and storage resources are connected via at least one wide-area network link.
The difference between Grid and Web computing has become less evident with the introduction of the Open Grid Services Architecture (OGSA) and the definition of a Grid service. In OGSA as well as in the Web Services Resource Framework, a Grid service is basically a Web service with some additions to make it persistent (that is, it's able to keep state information at the server beyond the lifetime of a single request rather than only transiently). You might still claim that Grid services have different goals from pure Web services, but a clear separation seems neither useful nor feasible.
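The distinction between a plain request and a service that keeps state across requests can be sketched as follows. This is a minimal illustration, not the OGSA or Web Services Resource Framework API: the class, its method names, and the in-memory dictionary standing in for server-side storage are all hypothetical.

```python
import itertools


def stateless_convert(celsius):
    """A stateless operation: the answer depends only on the request itself."""
    return celsius * 9 / 5 + 32


class StatefulJobService:
    """Hypothetical service that keeps per-resource state between requests.

    Each created resource gets a key that the client passes along in later
    requests, so the server can find the state again.
    """

    def __init__(self):
        self._resources = {}             # stand-in for server-side storage
        self._ids = itertools.count(1)

    def create(self, description):
        key = f"job-{next(self._ids)}"
        self._resources[key] = {"description": description, "status": "submitted"}
        return key

    def set_status(self, key, status):
        self._resources[key]["status"] = status

    def get_status(self, key):
        return self._resources[key]["status"]

    def destroy(self, key):
        # Explicit lifetime management: the state lives until it is destroyed.
        del self._resources[key]


service = StatefulJobService()
job_key = service.create("climate simulation run")   # first request
service.set_status(job_key, "running")               # a later request
status_seen_later = service.get_status(job_key)      # state outlived both requests
service.destroy(job_key)
```

Apart from the key that ties later requests to earlier ones and the explicit lifetime management, the two look alike to a client, which is in line with the argument that a sharp separation is hard to draw.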
The term "Grid" isn't clearly defined and allows for a lot of interpretation. This is a common problem in computer science in general; exact definitions of terms don't always exist as they do in, say, physics or mathematics. For example, velocity, acceleration, weight, and mass are all clearly defined, and physicists understand them unambiguously. Without such a precise definition, we can at least observe that the Grid and the Web are becoming more alike in their service capabilities. Both usually include a combination of programming and data.
As a bottom line, we can say that Grid technology frequently uses Internet as well as Web service technology. It provides service add-ons and therefore extends the field of Web computing.
Applying Grid technologies
I have gained a lot of experience in Grid development, 7 deployment, 8 coordination, and teaching 9 within several projects in Europe, mainly the EU DataGrid (EDG) Project (http://www.cern.ch/edg), the LHC Computing Grid (LCG) Project (http://cern.ch/lcg), the Enabling Grids for E-Science (EGEE) Project (http://www.eu-egee.org), and the BIG project. Making this technology applicable for Grid users is of major importance. This section focuses on how to apply the Grid, from introducing the concept and forming first ideas to planning a computing infrastructure and then actually installing and deploying the system. I point out the main things to keep in mind and the issues to prepare for.
Before we can think of applying Grid technologies, we should ask if the technology is ready for efficient deployment in the scientific community or even in a business or an enterprise environment. In my experience, several projects provide promising tools that already have specific goals and therefore clear application areas. However, several general questions arise.
Can I use a Grid computing model?
Recently, developers have looked to Grid technology to solve several computing problems. However, you can't apply Grid computing to just any problem. As with any other computer technology, the problem must have certain clearly identifiable characteristics. Among other things, typical Grid applications need access to distributed resources and are computing- and data-intensive.
Which software should I use?
If you want to start a new project, the first question is which Grid software solution to use. The Globus Alliance (http://www.globus.org) provides the most widely known Grid middleware toolkit. However, it has been in a transition phase for several years, so it's neither 100 percent stable nor completely ready for intensive production use. Other major projects, such as EDG, LCG, and EGEE (which all share the same code base), offer a more high-level, integrated set of services that's already in use at more than 200 sites worldwide, although not commercially. A longer list of Grid products is available at www.gridcomputing.com. However, to the best of my knowledge, none of these products has been in extensive commercial use, except perhaps Avaki (http://www.avaki.com).
How do I install the system?
Once you've selected a solution, installation and deployment questions arise. Because Grid solutions often interact closely with the underlying system (except pure Grid services), installation and configuration aren't easy yet. For instance, there's no simple installation and configuration method that lets a user without knowledge of the software install and use it. Almost anybody knows how to download and install a Web browser, but Grid systems haven't yet come this far.
Who will provide long-term support?
The answer to this question is closely related to the short lifetime of current Grid solutions. On one hand, the Grid community is rather new; on the other hand, the technology changes so quickly that software solutions tend to be short-term rather than long-term. Several companies support their own solutions; Avaki and Univa (http://www.univa.com) are two prominent ones. We can expect short- and middle-term support, but a long-term support strategy will depend heavily on how much market share Grid technologies gain within the next decade.
What about upgrades to new versions?
Upgrading and versioning of existing Grid systems is another essential factor that closely relates to the previous items. Until now, new technology trends have typically introduced software versions that weren't compatible with previous systems. A clear strategy must be provided.
Many of these issues aren't yet fully solved, making it difficult to use Grid in business environments. In addition, many technical issues are still open. 10
Many problems must be solved to make Grid technology broadly attractive and usable, particularly in a business or enterprise environment. I've mainly discussed the software side of Grid computing, but we must distinguish between the software and hardware infrastructure. Both must be provided to make the Grid usable.
Software technologies (middleware)
First, you must have the software (middleware) to actually build a Grid. Several different, mainly layered architecture models exist. Often, Grid software is an add-on to the operating system and serves as an interface to the end-user application code. You might question whether this is the right model for all applications. For instance, should Grid software be part of the operating system, as in Apple's Xgrid (http://www.apple.com/acg/xgrid), or should it be an easy plug-in for the browser? To make it interesting for a broad community of users, it should be easily accessible and available.
Software alone doesn't build a Grid. You must have a hardware infrastructure where the Grid software is running. Often, the actual resource sharing isn't trivial because issues such as ownership for resources and usage costs arise.
Who provides the infrastructure? Projects such as EGEE and Open Science Grid (http://opensciencegrid.org) provide such an infrastructure, mainly for research and academia. Currently, resource use is free for members, partly because accounting systems aren't yet fully deployed. On the other hand, industrial initiatives such as Megagrid (http://www.oracle.com/technologies/grid/megagrid.html) build a Grid infrastructure for business partners, where resource access isn't free.
Companies must determine if they have the in-house resources (hardware, software maintenance team, and so on) to build their own enterprise Grid or if they want to use and access an existing infrastructure. This issue hasn't been discussed much, but it will become more important once companies are ready to use Grids. The Grid itself actually provides for the latter with its notion of a virtual organization. However, security systems, accounting, and billing must be in place to make it useful in the commercial world.
Consequently, the equivalent to the electric power grid first needs to be built. One might even imagine that commercial Grid service providers will take over the role of current Grid infrastructure builders—that is, similar to providing a wireless telephone network, Grid service providers might one day provide a commercial Grid infrastructure for business users.
And then there's desktop computing
So far, I've discussed the main trend in Grid computing, where services run on dedicated machines, usually for a rather long time. However, there's also the domain of desktop Grids, which take advantage of CPU time only when the CPU is idle and not busy with other tasks.
Desktop Grids can attract a large amount of resources, but they can't guarantee resource availability and stability. Therefore, they're only useful for certain applications where losing data or computing results is acceptable or where lost results can easily be reproduced.
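The requirement that lost results be easy to reproduce can be made concrete with a small sketch. It assumes a hypothetical `worker_available` hook that reports whether a donated desktop stayed idle long enough to finish a work unit; the function names and the failure pattern are invented for illustration and don't correspond to any particular desktop Grid system.

```python
from collections import deque


def run_desktop_grid(work_units, compute, worker_available):
    """Process independent work units on unreliable donated machines.

    worker_available(attempt) is a hypothetical hook: it returns False when
    the donor machine was reclaimed by its owner and the result was lost.
    A lost unit is simply queued again, which only works because each unit
    can be recomputed from scratch. If no worker is ever available, this
    loop doesn't terminate.
    """
    pending = deque(work_units)
    results = {}
    attempts = 0
    while pending:
        unit = pending.popleft()
        attempts += 1
        if worker_available(attempts):
            results[unit] = compute(unit)
        else:
            pending.append(unit)   # result lost: schedule the unit again
    return results, attempts


# Assumed failure pattern for the illustration: every third attempt is lost.
results, attempts = run_desktop_grid(
    range(5),
    compute=lambda x: x * x,
    worker_available=lambda attempt: attempt % 3 != 0,
)
print(results, attempts)
```

All five results arrive in the end, but only at the price of repeated attempts, so work that can't be cheaply redone is a poor fit for this model.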
Not many companies rely on Grid technologies, and several improvements must be made for more companies to integrate these technologies. I'd be happy to see a counter article to this one in upcoming years that provides positive and constructive answers to the questions and concerns I've raised.
Thank you to Flavia Donno and Helmut Wanek for valuable comments on the article. Project number 10547 of the OeNB (Oesterreichische Nationalbank) Anniversary Fund supported this work. The BIG project was funded for two years, from 1 January 2004 to 31 December 2005.
Heinz Stockinger is a scientist and Grid specialist at the Swiss Institute of Bioinformatics. He also holds a teaching appointment at the Swiss Federal Institute of Technology (EPFL) in Lausanne. He received his PhD in computer science and business administration from the University of Vienna, Austria. He worked many years for CERN within the EU DataGrid and the EGEE projects, where his research work focused on Grid data and resource management. In 2005, he was the Program Chair of the 1st International Conference on e-Science and Grid Technologies and Program Vice Chair for the International Workshop on Grid Computing. Contact him at the Swiss Inst. of Bioinformatics, CH-1015 Lausanne, Switzerland; email@example.com.