Businesses Turn to Object Storage to Handle Growing Amounts of Data

by Sixto Ortiz Jr.

Big Data has gotten so big that traditional, hierarchical file systems are straining to keep up with today's exponential information growth.

As businesses collect more and more information — particularly unstructured data such as multimedia files — administrators are having trouble managing, indexing, accessing, and securing the material.

The challenge with traditional file systems is maintaining their hierarchical organization and central data indices as the number of files and the amount of unstructured information grow.

In response, companies are turning to object storage, which stores data as variable-size objects rather than fixed-size blocks.

Rather than housing information that can only be found somewhere in a hierarchical system, object storage uses unique identifier addresses to locate and identify data objects, explained Russ Kennedy, vice president for product strategy, marketing, and customer relations at object-storage vendor Cleversafe.

Object stores have nonhierarchical, near-infinite address spaces, said Mike Matchett, a senior analyst with the Taneja Group, a market-research firm.

Thus, even as the amount of data grows, the storing and finding of information doesn't become more complicated.

Nonetheless, widespread object-storage use faces several challenges.

The Storage Crisis

Traditional storage systems house data in fixed-size blocks in directories, folders, and files. There is a limit to how many files can be housed in this hierarchical system, said Jeff Lundberg, Hitachi Data Systems' senior product marketing manager for file, content, and cloud.

Users can't go directly to information but instead must work via a central index, noted Janae Stow Lee, senior vice president of the File System and Archive Product Group at storage vendor Quantum Corp.

A complicating factor is that because of the increase in multimedia, Kennedy noted, file sizes are growing to the gigabyte and even terabyte range.

As the amount of data and number of files have grown, current storage systems have become very large, explained Tom Leyden, director of alliances and marketing for object-storage vendor Amplidata. This makes finding information in their huge hierarchies increasingly difficult, he said.

The difficulty of trying to find data via increasingly large indices limits the number of files and amount of data traditional storage systems can work with, said the Taneja Group's Matchett.

And as traditional systems store more data, they become more likely to experience mechanical drive failure. Administrators then must copy data to additional systems to guarantee reliability and availability, which could be cost prohibitive for some organizations.
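The link between system size and drive failure can be illustrated with a little arithmetic. The following is a minimal sketch, assuming each drive fails independently with the same probability over some period; the function name and the 2 percent figure are illustrative assumptions, not numbers from any vendor quoted here.

```python
def prob_any_drive_fails(num_drives: int, per_drive_failure_prob: float) -> float:
    """Probability that at least one of num_drives fails.

    Assumes failures are independent and equally likely for every drive,
    which is a simplification of real-world behavior.
    """
    # Chance that every drive survives, then take the complement.
    prob_all_survive = (1.0 - per_drive_failure_prob) ** num_drives
    return 1.0 - prob_all_survive


# Hypothetical per-drive failure probability of 2 percent over the period.
for drives in (1, 10, 100, 1000):
    print(drives, round(prob_any_drive_fails(drives, 0.02), 4))
```

Under these assumptions, the chance of seeing at least one failure climbs quickly as drives are added, which is why larger systems need more protection against data loss.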

As information volumes grow, traditional file systems' data-replication approaches become too expensive and time-consuming to use, added Quantum's Stow Lee.

Data backups also become costly, which could create serious problems for organizations that need timely recovery points, noted Tad Hunt, chief technology officer of storage vendor Exablox.

Many companies are using storage-area networks and network-attached storage to cope with spiking data volumes, but these approaches typically use hierarchical file systems and thus are also beginning to experience problems, noted Ross Turk, vice president of community at storage consultancy Inktank.

Object Storage Steps Up

Work on object-storage technology began in 1994 at Carnegie Mellon University and has been supported over the years by the National Storage Industry Consortium and the Storage Networking Industry Association.

However, there was no big need for object storage until recently.

Numerous vendors — such as Amplidata, Caringo, Cleversafe, DataDirect Networks, Exablox, and Quantum — are now developing and selling object-storage products.

How it works

Searching for specific content in a large traditional file hierarchy requires analysis of the entire index and the reading of long lists of nodes and their contents, explained Dustin Kirkland, chief technology officer at Gazzang, a security and operations diagnostics company.

This process can consume considerable time and CPU resources, he noted.

Many organizations are thus turning to object storage, which uses the same types of hardware systems as the traditional approach but stores files as objects, which are self-contained groups of logically related data. The information is stored nonhierarchically, with an object identifier and metadata that provides descriptive attributes about the information.

Applications that interface with object-storage systems use identifiers to access objects easily and directly, wherever they are. The objects thus aren't tied to a physical location on a disk or predefined organizational structure. To applications, all of the information appears as one big pool of data.
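The idea of a flat pool of objects addressed by identifier can be sketched in a few lines of Python. This is a toy in-memory illustration, not any vendor's product or API; the class name FlatObjectStore, its methods, and the sample metadata fields are all assumptions made for the example.

```python
import uuid


class FlatObjectStore:
    """Toy object store: one flat map from identifier to object, no directories."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        """Store data with its metadata and return a new unique identifier."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = {"data": data, "metadata": dict(metadata)}
        return object_id

    def get(self, object_id: str) -> bytes:
        """Fetch the data directly by identifier; no path traversal is involved."""
        return self._objects[object_id]["data"]

    def metadata(self, object_id: str) -> dict:
        """Return a copy of the descriptive attributes stored with the object."""
        return dict(self._objects[object_id]["metadata"])


store = FlatObjectStore()
clip_id = store.put(b"raw video bytes", {"content_type": "video/mp4", "owner": "marketing"})
print(clip_id, store.metadata(clip_id))
```

The application holds only the identifier. Where the object physically lives, and how the store spreads objects across hardware, stays hidden behind the put and get calls.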

There is no large central index that users must work through to access data. Such indices act as a bottleneck in traditional storage systems, noted Quantum's Stow Lee. Without a central index, object-based systems can add storage hardware and scale well.

Objects in these systems carry more metadata than files in traditional storage systems do. According to Amplidata's Leyden, this makes finding data much easier for searchers.

This also lets companies apply detailed policies — such as file-access controls — to objects for more efficient and automated management.
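A metadata-driven search and an access policy might look something like the sketch below. The catalog contents, the field names (such as read_groups), and both helper functions are hypothetical, and a real system would likely index metadata rather than scan every object as this example does.

```python
def matches(metadata: dict, criteria: dict) -> bool:
    """True if every requested attribute has the requested value."""
    return all(metadata.get(key) == value for key, value in criteria.items())


def can_read(metadata: dict, user_groups: list) -> bool:
    """Simple access policy: the user must belong to one of the object's read groups."""
    allowed = set(metadata.get("read_groups", []))
    return bool(allowed & set(user_groups))


# Hypothetical metadata catalog keyed by object identifier.
catalog = {
    "a1": {"content_type": "video/mp4", "department": "marketing",
           "read_groups": ["marketing", "exec"]},
    "b2": {"content_type": "image/jpeg", "department": "legal",
           "read_groups": ["legal"]},
    "c3": {"content_type": "video/mp4", "department": "legal",
           "read_groups": ["legal", "exec"]},
}

# Find all videos, then keep only those a marketing user may read.
video_ids = [oid for oid, md in catalog.items()
             if matches(md, {"content_type": "video/mp4"})]
readable = [oid for oid in video_ids if can_read(catalog[oid], ["marketing"])]
print(video_ids, readable)
```

Because the policy reads only each object's own metadata, it can be applied automatically to every object without knowing anything about where the object is stored.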

Object storage also simplifies data management and use because administrators don't have to organize and manage hierarchies, according to Cleversafe's Kennedy.

And, he said, the systems are less expensive to set up and operate because they are less complex and highly scalable, and also require fewer administrators.

Object storage — which enables easier, quicker data access than traditional systems — saves money because it can work with slower, less expensive drives without losing performance.

Object-based systems typically secure information via Kerberos, Simple Authentication and Security Layer, or some other Lightweight Directory Access Protocol-based authentication mechanism, Kennedy noted.

Object storage systems' scalability; suitability for use with lower-cost, high-capacity hard drives; and improved automation make the approach good for cloud computing, he said.

How it's used

Because it is highly scalable and enables easy information access even from large data collections, object storage is best for large unstructured files such as those containing multimedia.

The approach is good for unstructured data also because this type of information doesn't always fit easily into the hierarchical systems that traditional storage houses.

Currently, Leyden said, object storage is used mostly in cloud applications like Dropbox, Amazon's Simple Storage Service, and Google's Picasa photo-storage program. These applications form the basis for cloud-based services such as file sharing, backups, and archiving.

Object Lesson

The widespread use of object storage faces several challenges.

Some companies have to rewrite their application interfaces to use object-storage APIs natively, said Quantum's Stow Lee.

The security and privacy of data in object-storage systems are important issues, said Gazzang's Kirkland.

He explained, "All information should, without question, be encrypted before being written to disk. Object storage without comprehensive encryption should be as unfathomable in 2012 as a minivan without seat belts."

According to Matchett, object-storage use is already spreading, particularly in public and private cloud implementations.

By contrast, Jeffrey Bolden, managing partner at IT consultancy Blue Lotus SIDC, said object storage will remain a niche technology.

He noted that traditional file systems enforce relational integrity — which ensures that relationships between tables remain consistent despite any changes that may be made to information in the database — while object storage doesn't.

Quantum's Stow Lee said object storage will be a niche application at first, primarily for customers needing at least 500 terabytes of storage, but then will be widely adopted as the technology improves and cloud services grow in popularity.

However, she added, no storage technology is best for all uses, so traditional file systems will still be around.