Eucalyptus: Delivering Private Cloud Software

Dejan Milojicic, HP Labs
Rich Wolski, Eucalyptus Systems

In this interview conducted by Dejan Milojicic, editor in chief of Computing Now, Rich Wolski, CTO of Eucalyptus Systems, discusses the services and applications his company provides. A shorter version of this interview appears in print, in the April 2011 issue of Computer magazine, and the audio is available through Computing Now.

Dejan Milojicic: I’d like to ask you a few questions about this whole area and the hot topic of cloud computing, but first of all, what was your original motivation to start Eucalyptus?

Rich Wolski: Eucalyptus started in my research lab at the University of California, Santa Barbara. I’m a computer science professor there on leave to do the commercialized version of it.

It was originally designed to solve a distributed computing problem in which we were attempting to link together the National Science Foundation’s supercomputer centers and Amazon’s AWS and several university sites. We had a legacy, large-scale HPC science code that we were trying to run in Amazon that had already been ported to the supercomputer centers, and we needed to find a quick way to get the same version of the code to work at a number of different disparate university sites.

The fastest way that we could think of to do it was to build an emulator for Amazon’s AWS so that the version that was running inside Amazon could quickly be ported to other nonstandard university environments. That emulation layer — that set of Web services that could fool the science code into believing it was Amazon — became Eucalyptus.
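As an illustration of the emulation idea (a minimal sketch, not Eucalyptus code): when an on-premise layer answers the same Web-service calls as the public cloud, a client only has to change the endpoint it talks to. The host name `cloud.example.edu`, the port and path of the private endpoint, and the helpers `build_query_url` and `same_request` are assumptions made for this example. Request signing, which real AWS-style query requests require, is left out.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# One public endpoint and one hypothetical on-premise endpoint that
# emulates the same query-style API (host, port, and path are assumed).
PUBLIC_ENDPOINT = "https://ec2.amazonaws.com/"
PRIVATE_ENDPOINT = "http://cloud.example.edu:8773/services/Eucalyptus"


def build_query_url(endpoint, action, **params):
    """Build an unsigned query-style request URL for the given endpoint."""
    query = {"Action": action, **params}
    return f"{endpoint}?{urlencode(sorted(query.items()))}"


def same_request(url_a, url_b):
    """Return True when two URLs carry identical query parameters."""
    return parse_qs(urlsplit(url_a).query) == parse_qs(urlsplit(url_b).query)


# The client code is identical; only the endpoint differs.
public_url = build_query_url(PUBLIC_ENDPOINT, "DescribeInstances")
private_url = build_query_url(PRIVATE_ENDPOINT, "DescribeInstances")
```

Here `same_request(public_url, private_url)` is true: the request itself does not change, which is what lets code written against one cloud run against the other.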

Milojicic: So compatibility with Amazon Web Services was one of the key design choices. What were the other choices and the trade-offs that you had to make while you were designing and implementing the system?

Wolski: Absolutely right. The science was driven by this notion of doing a hybrid. The other design criteria were portability to a variety of infrastructures that had different levels of management and perhaps different technology life cycles.

As you can imagine, university datacenters that are available to researchers vary widely. They vary widely in the equipment that’s available. They vary widely in the availability of the system administration support, in operating system choice, in vintage. And in order to run the experiment, we needed Eucalyptus to be able to exist in all of those environments with equal functionality.

We designed it with a great deal of attention to portability and to what we call agnosticism for the underlying technologies. We tried to take as much of the existing implementation or installation that we found in each datacenter as being part of the infrastructure, including the version of Linux that’s there, the systems libraries that were there, the compilers that were there, [and] the storage architectures. All of the things that make a datacenter a datacenter, we took as being necessary for Eucalyptus to be able to manipulate. That really made the design of the system very particular, because we couldn’t depend on any specific functionality being in place at any one time.

The other design trade-offs centered around the actual model. We spent a lot of time studying how it was that Amazon’s cloud implemented the nature of its abstractions. We don’t actually know how the abstractions are implemented, but we studied the semantics of the abstractions very carefully. What we found is that they were very much driven by the e-commerce model, the notion that users would interact with the system transactionally and that service would be asynchronous. If you think about the way Amazon sells books, there’s a transaction that takes place to purchase the book, and then the book arrives sometime later. There’s an asynchronous delivery, and that’s very true for cloud abstractions as well. So, what we knew going in was that it had to be an e-commerce style of interaction and that it had to be utterly and completely vanilla with respect to the infrastructure it was going to run on.
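The transactional request with asynchronous delivery described above can be sketched as a toy model. This is an illustration only, not how Amazon or Eucalyptus implements it. The class `ToyCloud`, its method names, and the `pending` and `running` state labels are assumptions for the example.

```python
import itertools


class ToyCloud:
    """Toy model: a request is accepted at once and fulfilled later."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._state = {}

    def run_instance(self):
        """The 'transaction': record the request and return immediately."""
        instance_id = f"i-{next(self._ids):04d}"
        self._state[instance_id] = "pending"
        return instance_id

    def tick(self):
        """Stand-in for background provisioning work completing."""
        for instance_id, state in self._state.items():
            if state == "pending":
                self._state[instance_id] = "running"

    def describe(self, instance_id):
        """Report the current state of a previously requested instance."""
        return self._state[instance_id]


cloud = ToyCloud()
order = cloud.run_instance()               # purchase: returns right away
state_at_purchase = cloud.describe(order)  # not delivered yet
cloud.tick()                               # time passes
state_on_delivery = cloud.describe(order)  # the "book" has arrived
```

The caller gets an identifier back immediately and checks on it later, which mirrors buying a book and having it arrive sometime afterward.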

Milojicic: You said you started with high-performance computing types of applications and HPC users, but today, you have a large variety of users that go outside of HPC. Can you tell us a little bit more about the kind of users you have?

Wolski: Most of the [people who] have used Eucalyptus since the science project reached its conclusion have been commercially- or open source-focused users who are interested in clouds, but their use cases are varied. Many of them use Eucalyptus because of the cloud paradigm, because if you’re building a network-facing application today, the ability to quickly provision or unprovision infrastructure resources is a new capability — it’s really a new application development paradigm. So a lot of users are using Eucalyptus to build new applications or to site new applications that are increasingly taking advantage of the cloud provisioning semantics that are present.

We see a lot of commercial users adopting a self-service, scalable, transactional model as a datacenter management technology. There are lots of ways to manage datacenters, and the e-commerce self-service nature of cloud computing is very appealing for IT management and IT operations, so we see production users gravitating toward Eucalyptus for that reason.

Milojicic: Can you give us any examples of specific cloud services and applications that are running on Eucalyptus?

Wolski: The commercial users are obviously a little bit nervous about talking about how their infrastructure works. Eucalyptus currently runs USAspending.gov, which is a government website. It tracks how the stimulus money is being spent, where it’s being spent, and so on and so forth, so it’s a fairly large-scale site. The Web services that are serving that traffic are running inside Eucalyptus VMs. That’s probably the best example that we can talk about, and it’s a good example of what the early applications looked like: very network-facing, very Web-service oriented. It’s not all HTML. There are lots of websites today that are more complicated than just images. It’s a more interactive application than you might think of when you think of a website.

So, that use case is fairly common, where you have interactive content service that’s network-facing. The reason Eucalyptus is useful for that is because you can dynamically change the infrastructure footprint in response to the load. If you wish to run a legacy version and then upgrade, you can run them side by side. The cloud is a very powerful model for deploying Web services.
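Changing the infrastructure footprint in response to load comes down to a small calculation, sketched here under stated assumptions. The function `desired_instances`, the per-instance capacity figure, and the minimum and maximum bounds are hypothetical and do not describe any Eucalyptus feature.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Return how many instances the current load calls for, within bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Never drop below the floor or grow past the ceiling.
    return max(min_instances, min(max_instances, needed))
```

With an assumed capacity of 100 requests per second per instance, a load of 450 requests per second calls for five instances, and the count falls back to the minimum when traffic disappears.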

Milojicic: Eucalyptus has achieved quite a bit of popularity, but are there any alternatives to Eucalyptus? Who do you consider your competition? Who do you consider partners, enablers?

Wolski: Our competition is primarily VMware. VMware, which was previously a virtualization company, has begun to move into the cloud space, and [it has had] a substantive commercial uptake for its virtualization technology. So, we’ve talked to customers, and they’re trying to decide between VMware’s approach and the Eucalyptus approach. It’s a valid set of questions to consider.

On the partnership side, the enabler side, we have a number of partners. Canonical, the people who essentially manage the Ubuntu Linux distribution, has been a long-term partner of ours. We’ve released Eucalyptus in Ubuntu for a couple of years now. Canonical has its own sort of technology ecosystem around Eucalyptus called the UEC, and we work very closely with it to make sure that the ecosystem and platform is production quality, and it has been for some time. We also partner with Red Hat. You can see that there’s a lot of back and forth with the Linux community. Red Hat and Eucalyptus just established a partnership, and we will be doing some technology development there as well.

There are also components of the cloud space that we get from the AWS ecosystem, and a good example of that is RightScale, [which] is a cloud dashboard. What this means is that it gives you a much higher-level and visual interface to the cloud, and it provides additional higher-level programming semantics or provisioning semantics above what the base platform does. It’s very powerful. It’s beautiful. The technology is really beautiful, and it was built for Amazon’s AWS.

It turns out that Eucalyptus’s implementation of the AWS API is good enough so that RightScale can control Eucalyptus as if it’s controlling AWS. We think of that as being very valuable. RightScale customers who are using AWS today can very quickly build hybrid clouds by simply pointing their RightScale installation to a Eucalyptus installation that runs on their premises.

Lastly, we have partnerships in the datacenter management space, and a good example of that is Puppet. Puppet is a really, really great system for managing Linux — for distributing collections of Linux boxes inside datacenters — and it’s cloud compatible, so a lot of our customers talk to us about our compatibility with Puppet and vice versa, I suspect.

So, yes, we have a lot of partners by virtue of the fact that Eucalyptus is really one of the first, if not the first, on-premise private cloud platforms that were available, and because the quality of the implementation is so high.

Milojicic: What’s next for Eucalyptus? Do you have a roadmap of R&D productization?

Wolski: Yes, near-term, we’re going to focus on providing more enterprise-quality features in the platform. For example, the current open source version of Eucalyptus doesn’t support hot failover of an internal component. If you lose a critical component of Eucalyptus while it’s running, it will stop. It doesn’t crash, the VMs keep running or whatever, but it can no longer take requests and service them. We’re going to fix that. We’re going to have an HA version of Eucalyptus that does hot failover of internal components.
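The idea of hot failover can be illustrated with a toy example in which a front end sends each request to the first replica that responds. This is a sketch under assumptions, not the Eucalyptus HA design. The classes `Component` and `FailoverFrontEnd` are hypothetical, and a real system would also have to keep replica state in sync, which is omitted here.

```python
class Component:
    """A toy internal service that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverFrontEnd:
    """Send each request to the first replica that responds."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def handle(self, request):
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ConnectionError:
                continue  # this replica is down; try the next one
        raise ConnectionError("no replica available")


primary, standby = Component("primary"), Component("standby")
front = FailoverFrontEnd([primary, standby])
before = front.handle("run-instance")
primary.alive = False  # simulate losing the critical component
after = front.handle("run-instance")
```

After the primary is lost, requests are still accepted and serviced by the standby instead of the platform ceasing to take requests.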

We’re going to enhance the accounting system. The original version of Eucalyptus didn’t have an accounting system at all. We were just using it for research, but in order for it to be useful in an enterprise, clearly you need to be able to do quotas, and you need to be able to do hierarchical kinds of account delegation and integration with existing identity management systems. So that’s coming.
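Quotas with hierarchical delegation can be sketched as follows. This is a minimal illustration, not the Eucalyptus accounting system. The class `Account`, the method `try_allocate`, and the example account names and limits are all assumptions. An allocation succeeds only if the requesting account and every ancestor above it stay within quota.

```python
class Account:
    """An account with an optional parent and a cap on running instances."""

    def __init__(self, name, quota, parent=None):
        self.name = name
        self.quota = quota
        self.parent = parent
        self.used = 0

    def _chain(self):
        """Yield this account and every ancestor up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.parent

    def try_allocate(self, count):
        """Allocate only if this account and all ancestors stay within quota."""
        chain = list(self._chain())
        if any(acct.used + count > acct.quota for acct in chain):
            return False
        for acct in chain:
            acct.used += count
        return True


org = Account("org", quota=10)
team_a = Account("team-a", quota=8, parent=org)
team_b = Account("team-b", quota=8, parent=org)
```

If `team_a` allocates six instances, a request from `team_b` for six more is refused even though it is within the team's own quota, because the organization's limit of ten would be exceeded. A request for four succeeds.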

Amazon has recently released something called IAM, which is a user group kind of abstraction, a very scalable user group abstraction, and we’ll be supporting that. That’ll be out in the productization track in the near-term.

Longer-term R&D, we’re really thinking that hybrid clouds are the future. We hear a lot about interest in hybrid clouds. People are just trying to figure out how to use an on-premise cloud and a public cloud either separately or together, but it’s coming. It really is going to be the future, the combination of the on-premise platform and the public platform. So we’re trying to get ready for that, and a lot of our R&D is really focused on that eventuality, on the ability to link Eucalyptus on premise with a public cloud to make the seamless use of both as easy as possible.

Milojicic: In your opinion, what were the key enablers for the success of Eucalyptus? And if you had a magic wand, what would be the one or two things that you would wish for Eucalyptus?

Wolski: Key enablers were high-quality open source e-commerce technologies. We looked at a lot of other technologies (grid technologies, datacenter virtualization technologies in open source) as being platforms, and what we found is (a) those technologies were really built for another purpose and (b) the maturity of open source e-commerce technologies was just astounding.

It’s not an area I had worked on substantively before the Eucalyptus project, and I was really impressed, personally, by how high-quality and really enterprise-ready a lot of these open source Web 2.0, or whatever you want to call them, technologies were. That was absolutely a key enabler.

The magic wand I would wave would bring about a happy marriage between the Linux packaging and distribution rules or conventions and the way Java-based e-commerce technologies are packaged and distributed. You can tell when you start looking at them closely that they were designed during different eras, and they were designed for really different models of distribution, installation, and maintenance. Neither is wrong. It’s important not to take what I’m saying as an endorsement of one style over the other, but in many ways, they’re incompatible. And that incompatibility makes it very difficult for us to package and distribute Eucalyptus as a Linux platform, a Linux-support platform, given that it’s really built from open source e-commerce technology. I would love for that tension to be resolved.

Milojicic: So open source appears to be one of the crucial factors for the success of Eucalyptus.

Wolski: Yes.

Milojicic: Going forward, how do you see it happening? Are you going to feed certain features back into the open source version? And then there’s also the community. It’s almost an ecosystem that’s sensitive to perturbations. How are you going to gently manage it?

Wolski: Up to this point we have put out new releases of open source Eucalyptus with additional features, and we will continue to do so going forward.

Our policy so far has been to essentially put out features in open source that are really endemic to the cloud abstraction. If we see a feature that really must be part of the platform, one that is a cloud abstraction, is scalable, is transactional, these kinds of things, then that absolutely goes in open source.

The things that we’ve been trying to put into a small amount of proprietary software are things that take that open source and deliver it in an enterprise setting in a way that allows it to be managed according to best practices for enterprise infrastructure management. The thinking there is that the functionality is really something that should be open source. It will stimulate experimentation. It will stimulate new application development. It will stimulate new contributions. The sort of hardcore engineering but not necessarily innovation that’s required to make it compatible with best practices for infrastructure management, that’s the sort of thing that we think we should be able to monetize.

Community is very important. We get a tremendous amount out of our community. Code contributions are absolutely something we get, but they’re not the only valuable contribution. We get QA: a lot of our community tests the code that they contribute, that others contribute, and that we put out to make sure that it works. They also do a much better job than we do of describing what the system does, how to install it, and the tips and tricks. The community really does enhance the open source and, in fact, the product offering tremendously.

Milojicic: If you could afford to redesign Eucalyptus from scratch, start it all over, would you do it the same way? Are there any mistakes that you learned from that you can pass on to our listeners?

Wolski: The methodology would be the same, and that was to first study the paradigm at hand and really try to understand what differentiates it from previous distributed computing paradigms. My research group had worked in distributed computing for many years, and what we wanted to fundamentally understand before we did any implementation was, what was new? Why was this not — or was it, in fact — a recast of an existing technology approach? And I would do that again the same way.

What I wouldn’t do — and this is more about the tension between an open source project and a research project — is use the research development life cycle to drive an open source project. What I mean by that is, for the researchers, we had a release schedule and sort of a feature roadmap that we designed to make the research project go. We knew at what point certain other researchers on the project needed various functionalities, and we were incrementally putting those things out; at the same time, we were making that same code available as open source.

What very quickly happened was a tension between the extremely goal- and deadline-driven needs of the research project and the very legitimate requests and complaints that the open source community had about the code that was being produced for this research project. I think I would have sequenced those. I would’ve finished the research project before putting out any code and then taken a look at what was, in retrospect, a good decision for us. [I might] say, “now that we’ve gotten the project completed, we’re going to reverse or shift to an open source project stance where we’ll be able to focus on much more of the maintenance and support activities without development being so critical during that timeframe.”

Milojicic: Are there any architectural components that are missing from today’s cloud solutions?

Wolski: One of the things that hasn’t come about yet, and it’s not entirely clear whether it can, is the cloud file system.

Clouds have several different storage abstractions, and new ones are being proposed all the time, but the one that really hasn’t happened yet is an honest to goodness file system. Now, there are scaling questions, right? Clouds operate at scales where file systems fear to tread, but I think that’s a very interesting research question to begin with.

Secondarily, if you look at cloud abstractions, what’s interesting about them is that they’re designed sometimes to operate at different scales. So, there’s usually a storage abstraction that’s planetary scale, but it’s not a file system, and then there’s a storage abstraction that might be regional scale, and then there’s a storage abstraction that might be datacenter scale, these kinds of things, and those abstractions have different semantics.

It’s conceivable that a file system would fit into that abstraction hierarchy in terms of scale at some point and be a natural fit, and it’s kind of curious to me as to why that hasn’t emerged yet, because there are really viable distributed file system technologies out there. It just may be that the abstractions that are there are so good that the semantics of a regular file system (a shared file system) aren’t necessary, but I suspect not. I suspect that there is going to be a cloud file system at some point.

Milojicic: If we talk about Eucalyptus specifically, are there any pieces of hardware virtualization, for example, or operating system or middleware that Eucalyptus could significantly benefit from today and in the future?

Wolski: I think the big piece that all clouds and certainly Eucalyptus could benefit from is better support for virtualized I/O. Right now, the hypervisors are excellent at virtualizing memory and CPU, but they vary from version to version in their vintage and their ability to provide virtualized I/O.

For network, it’s excellent. For storage, it’s pretty good. For other devices — PCI devices like NVIDIA cards, GPUs — there are standards that are emerging. But I think true hardware support for virtualized I/O, which I imagine is coming, and access to that support through the hypervisors, would be a huge benefit.

Milojicic: You mentioned standards. What are the most critical standards needed in cloud computing, especially now that there are new cloud providers emerging?

Wolski: I think there are two ways to look at this. A lot of people spend time worrying about whether there should be a cloud API standard, and my view is slightly different. I think if there are going to be standards, they should center around cloud federation. It should be cloud interoperability and not API that the standard should focus on. Now, it might be that if we had cloud interoperability standards, APIs would gravitate toward [them], and in fact, there would just be one API. But where standards seem to make the most sense is where we’re trying to interconnect things, and I’m just taking a page out of the Internet, right?

The standardization efforts for the Internet have been very successful, and what they’re really doing is telling how to have two disparate computer systems communicate, interoperate. I think that’s going to be true for clouds. I think that independent of what the local APIs are, where the standardization effort is going to have the most impact is where we take two different clouds or two different sets of abstractions and find ways for them to interoperate.

I also think that standards will be most successful when users and not technology providers join the standardization efforts in earnest. Right now, if I look at the standardization efforts, several technology providers like Eucalyptus are eager to participate, and I understand why. There’s a tremendous incentive for us at Eucalyptus to try to influence standards to be more congruent with our technological approach, but we don’t do this. We don’t participate. Not because we don’t want to, but because we feel as open source players it’s not fair for us to use our footprint and the ubiquity of Eucalyptus as a way to drive business toward us through a standardization effort.

However, we do feel like users of clouds — people who actually are trying to get something done with clouds and not people who are trying to sell a cloud — do have something to say. So when those users join the standardization efforts, I think they will become far more successful.

Milojicic: We started with a mention of high-performance computing; maybe we can close with it. Do you think that the cloud community and the high-performance computing community are converging? There’s a lot of usage of cloud in high-performance computing. It may never be true for the very high-end systems, but is the bar getting lower for the entrance of HPC?

Wolski: Yes, absolutely, I see them converging in the future. There will always be special-purpose, bare-metal machines, although as virtualization technology improves, I think the case for not using virtualization will gradually become less and less viable.

But there’s no reason that the cloud paradigm could not be used to service an HPC paradigm. In fact, Eucalyptus was born with that realization. We were happily running HPC programs on Eucalyptus and in the cloud as well as on batch-controlled supercomputers. There are management styles and user interfaces that have to be homogenized, but as underlying technologies, clouds and HPC will converge.

Milojicic: Is there anything else that you would like to communicate to our community that I haven’t asked you about?

Wolski: There’s one thing that I think is worth knowing. At least, this is my opinion. Having done distributed systems for a long time, both for HPC and now for cloud, I can say this is a very exciting time for that kind of research.

Distributed systems have come of age with cloud computing, and that’s a good thing. It really is the next-generation large-scale platform, and I urge students and other researchers who are interested in systems research or systems work to take it very seriously, because I think this is going to be with us for some time.

Milojicic: Thank you very much, Rich. I’m sure our audience will enjoy this as much as I have.

Wolski: Thank you so very much.

Dejan Milojicic is a senior researcher and director of the Open Cirrus Cloud Computing testbed at HP Labs. He has worked in the areas of operating systems, distributed systems, and service management for more than 20 years and is an IEEE fellow, ACM distinguished engineer, and member of Usenix. Milojicic has a PhD from the University of Kaiserslautern. Contact him at dejan.milojicic@hp.com.

Rich Wolski is the CTO and cofounder of Eucalyptus Systems and a professor of computer science at the University of California, Santa Barbara. He’s also a strategic advisor to the San Diego Supercomputer Center and an adjunct faculty member at the Lawrence Berkeley National Laboratory. Wolski has a PhD from the University of California, Davis. Contact him at rich@eucalyptus.com.