David Alan Grier is a writer and scholar on computing technologies and was President of the IEEE Computer Society in 2013. He writes for Computer magazine. You can find videos of his writings at video.dagrier.net. He has served as editor in chief of IEEE Annals of the History of Computing, as chair of the Magazine Operations Committee and as an editorial board member of Computer. Grier formerly wrote the monthly column The Known World. He is an associate professor of science and technology policy at George Washington University in Washington, DC, with a particular interest in policy regarding digital technology and professional societies. He can be reached at email@example.com.
Out With the Old and in With the New
Computer scientists and computing engineers don't deal with obsolescence well. We quickly abandon old forms of technology as soon as new ones show their promise. We claim that old software is useless and call the people who still use old systems "dinosaurs" or some other term that suggests that they are no longer productive members of the community. Yet when we discard old technologies, we often fail to see how one technology builds upon another and how old ideas reappear in new solutions.
The problem becomes worse when we start talking about new fields of research. Over the past 12 to 18 months, the IEEE Computer Society leadership has been discussing three new fields of research and wondering if they are about to replace older subjects that were used for categorizing knowledge. Has big data replaced data mining and knowledge engineering? Is the Internet of Things really anything more than the combination of electronic sensors with network technology? Is cloud computing anything more than a novel name for services computing or an application of the ideas of distributed and parallel computing?
The answers to these questions are important for two reasons. First, they determine how the Computer Society will allocate its resources. Quite recently, the Computer Society decided that big data did indeed represent an important new field of computing, a field that included database design, real-time computing, data mining, statistics, knowledge engineering, and high-performance computing. As a result, we are devoting some of our publication staff and our funds to a new journal on the subject.
The second reason these answers are important is more personal. It involves the identities of our members. When we decide that we are going to invest in one field and not in another, some members feel that we are passing judgment on them. When we discussed cloud computing at our meetings, some members argued that we had many researchers who were doing cloud computing research but called their work distributed computing. By changing the name of the field, these members claimed, we would be dismissing the contributions of those researchers.
Of course, good technical contributions will always be important, no matter how they may be described or how they might be named. A rose by any other name will still smell as sweet. Yet names are important and they do not die easily. A friend of mine became deeply involved in grid computing and has been sad to see the field become eclipsed by cloud computing. She started working on grid computing problems in 2008, she told me, because she thought it was going to be the next important field in computer science. "And then," she said, "2009 became the year of the smartphone." That year was followed by the year of the tablet computer, which probably lasted for two years. The year of the tablet was followed by the year of cloud computing in 2012, and we are now in the year of big data, 2013.
Of course, grid computing is not technically a form of cloud computing, though it is often presented as a form of the cloud. Some commentators view it as a form of Infrastructure-as-a-Service or Platform-as-a-Service. In fact, the two have different origins and different goals. Grid computing grew out of the academic world and embraces the power of complexity. It borrows from parallel computing, cluster computers, and virtual machines.
Cloud computing came from the commercial world and strives to present a simple interface to the user. Though it draws from some of the same sources as grid computing, it is also based on ideas about common resources that could be found in Web hosting, timesharing systems, and even data center management.
Grid computing clearly had its first big successes in the scientific community during the early 1990s, as the Internet began to grow and cheap workstations were starting to become more cost effective than the big supercomputers. CERN, the European Nuclear Research Center in Geneva, helped promote the ideas that became central to the grid when it did some early experiments with networks of Apollo workstations in 1990.
Another highly visible experiment in grid computing was SETI@home, which was conducted by the Search for Extraterrestrial Intelligence in California. The project created a network of personal computers to process radio signals and see whether those signals showed any sign of being produced by intelligent life. These computers were owned by private individuals who volunteered their machines for the project. The SETI@home software would use these computers when they were not being used by their owners.
Because grid computing was developed as a scientific tool, it initially lacked many of the services that would be needed by a commercial firm. The early versions of the grid were rarely able to log all the jobs being processed, determine the cost of the computation, or make the best use of grid resources. By contrast, cloud computing came from the commercial world and included those features from the very start. "Cloud computing enables the use of computational capacity as a utility," explained a recent paper on the subject. If you are going to provide computing power as a utility, you need to monitor it, manage it, and bill for it as if it were a utility.
However, if we look closely at the origins of cloud computing, we find that the start of the cloud is not in any specific technology but in the fact that computing centers have always been centers of expertise. You cannot use a computer without expertise. If you do not have much expertise yourself, you have to rely on the expertise of others. Furthermore, some expertise you only need on rare occasions – the expertise of building a new database, for example, or of developing a cybersecurity strategy. You will usually get the best expertise for these specialized circumstances if you can share it with others.
The customers of IBM were probably the first group to recognize that computing expertise should be shared. In the mid-1950s, they created an organization to share the expertise of their employees. At the time, there were only 18 organizations that used IBM's scientific computer, and they basically knew all the skilled programmers in the world.
From this humble beginning came computing service bureaus, central timesharing, Web hosting companies, and, eventually, cloud computing organizations. All of them provided computing services but, more importantly, they gave all of their users access to expertise, to people who knew how to operate computers well.
Cloud computing, of course, has become an important service business that demands new technologies. The Chinese Academy of Sciences estimates that Chinese businesses spend ¥3.5 billion (RMB) on cloud computing services, which includes ¥2.8 billion for the most common form of cloud computing, hosted software or Software-as-a-Service. It also estimates that this market will grow by 70 percent over the next three years and may represent about 2.5 percent of the global cloud computing market.
As the Computer Society has debated how to treat cloud computing, we often returned to the idea that the cloud does not seem to be a unique form of technology but instead seems to borrow many different forms of technology, both new and old. Recently, we began to realize that cloud computing may quickly become tightly connected to the problems of big data because big data demands special expertise that many companies will not be able to find.
Big data will involve large, distributed databases, high-performance computing, and specialized analytical techniques. Many companies will find it difficult to hire people with these skills and so will want to turn to organizations that can provide them. The logical provider of these skills will likely be the companies that offer cloud computing services.
At least some of the skills needed for big data will be derived from grid computing. In particular, the grid projects have learned how to handle massive distributed databases with existing computer resources. So even though we have declared cloud computing to be an important topic and have decided to invest our resources in the cloud, we may find that we are actually promoting ideas from other fields, including grid computing, that are no longer getting the attention that they once received.