Object storage

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Leoinspace (talk | contribs) at 05:23, 19 September 2013 (Created object storage article). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Object storage (also known as object-based storage) is a storage architecture that explicitly separates metadata from the data itself to enable native capabilities not typically addressed by other storage architectures such as file systems and block storage. These capabilities include higher-level interfaces, a namespace that can span multiple instances of physical hardware, and data management functions such as data replication and data distribution at object-level granularity.

History

Origins

It is unclear where the first object storage technology was developed. Some point to the Coda File System project at Carnegie Mellon, which started in 1987[1] and spawned the Lustre file system.[2] Others point to the OceanStore project at UC Berkeley,[3] which started in 1999.[4] One of the earliest and best-known object storage products, EMC's Centera, debuted in 2002.[5] However, development of Centera's technology started even earlier, at a company called FilePool (which was acquired by EMC in 2001).

Development

Industry investment in object storage technology has been sustained for over a decade. Between 1999 and 2013, at least $300 million of venture financing went into object storage.[6] This figure does not include the millions of dollars of private engineering investment by vendors such as EMC (Centera, Atmos, ViPR), IBM, HP (OpenStack), HDS (HCP), Amazon (AWS S3), Microsoft (Microsoft Azure) and Google (Google Cloud Storage), or the many man-years of open source development on Lustre, OpenStack (Swift) and Ceph.

Architecture

Abstraction

One of the design principles of object storage is to abstract some of the lower layers of storage away from administrators and applications. Data is therefore exposed and managed as objects instead of files or blocks. Objects carry additional descriptive properties that can be used for better indexing or management. Administrators do not have to perform lower-level storage functions such as constructing and managing logical volumes to utilize disk capacity, or setting RAID levels to deal with disk failure.

Object storage also allows the addressing and identification of individual objects by more than just file name and file path. Object storage adds a unique identifier within a bucket, or across the entire system, to support much larger namespaces and eliminate name collisions.
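The addressing model described above can be illustrated with a minimal in-memory sketch. It is not modeled on any particular product: the class names, the `put`/`get`/`find` methods and the use of random UUIDs as identifiers are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: the data plus descriptive properties (hypothetical layout)."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy store: buckets hold objects keyed by a generated unique identifier."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {object ID: StoredObject}

    def put(self, bucket, data, **metadata):
        # The store, not the caller, assigns the identifier, so two objects
        # with the same human-readable name cannot collide.
        object_id = uuid.uuid4().hex
        self._buckets.setdefault(bucket, {})[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, bucket, object_id):
        return self._buckets[bucket][object_id]

    def find(self, bucket, **criteria):
        # Objects can also be located by their descriptive properties.
        return [
            object_id
            for object_id, obj in self._buckets.get(bucket, {}).items()
            if all(obj.metadata.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
first = store.put("photos", b"cat", name="img.jpg", owner="alice")
second = store.put("photos", b"dog", name="img.jpg", owner="bob")
```

Both objects share the name `img.jpg`, yet each is retrievable through its own identifier, and `store.find("photos", owner="bob")` locates the second object through its metadata rather than through a path.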

Metadata/data

Object storage explicitly separates file metadata from data to support additional capabilities:

  • Additional metadata to capture application-specific or user-specific information for better indexing purposes
  • Additional metadata to support data management policies (e.g. a policy to drive object movement from one storage tier to another)
  • Independent scale of metadata nodes and data nodes
  • Unified access to data across many distributed nodes and clusters
  • Centralized management of storage across many individual nodes and clusters
  • Optimization of metadata storage (e.g. a database) separately from data storage (e.g. high-capacity SAS drives)
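As a rough illustration of how separate metadata can drive a data management policy, the sketch below keeps metadata in one structure and object data in per-tier structures, then moves idle objects from one tier to another. The tier names, the `days_since_access` field and the `apply_policy` method are hypothetical, and a real system would place metadata and data on separate nodes rather than in dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    tier: str               # which storage tier currently holds the data
    days_since_access: int  # hypothetical attribute consulted by the policy


class TieredStore:
    """Toy store that keeps metadata apart from the data it describes."""

    def __init__(self):
        self.metadata = {}                        # stands in for the metadata nodes
        self.tiers = {"fast": {}, "archive": {}}  # stands in for the data nodes

    def put(self, object_id, data, days_since_access=0):
        self.tiers["fast"][object_id] = data
        self.metadata[object_id] = Metadata("fast", days_since_access)

    def get(self, object_id):
        # The metadata lookup tells the caller which tier holds the data.
        return self.tiers[self.metadata[object_id].tier][object_id]

    def apply_policy(self, max_idle_days):
        """Move objects idle for longer than max_idle_days to the archive tier."""
        moved = []
        for object_id, meta in self.metadata.items():
            if meta.tier == "fast" and meta.days_since_access > max_idle_days:
                self.tiers["archive"][object_id] = self.tiers["fast"].pop(object_id)
                meta.tier = "archive"
                moved.append(object_id)
        return moved


tiered = TieredStore()
tiered.put("a", b"recent report", days_since_access=5)
tiered.put("b", b"old backup", days_since_access=90)
moved = tiered.apply_policy(max_idle_days=30)
```

The policy decision consults only the metadata, and callers continue to read an object by its identifier without knowing which tier it occupies.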

Implementation

Archive storage

Early incarnations of object storage were used almost exclusively for archiving, as implementations were optimized for data services, not performance. EMC Centera and Hitachi HCP (formerly known as HCAP) are two commonly cited object storage products for archiving.

Cloud storage

The vast majority of cloud storage available in the market leverages an object storage architecture. Two notable examples of cloud storage services are Amazon Web Services S3 and Rackspace Cloud Files. AWS S3 debuted in 2006 and has since become synonymous with cloud storage services. Other major cloud storage services include Microsoft Azure and Google Cloud Storage.

"Captive" object storage

Some large internet companies developed their own software when object storage products were not commercially available or when their use cases were very specific. Facebook famously built its own object storage software, code-named Haystack, to handle its massive-scale photo management needs efficiently.[7]

Object storage systems

More general-purpose object storage systems came to market around 2008. Spurred by the rapid growth of "captive" storage systems within web applications like Yahoo Mail and by the early success of cloud storage, these systems promised the scale and capabilities of cloud storage while allowing deployment within an enterprise or at an aspiring cloud storage service provider. Notable examples of object storage systems include EMC Atmos and OpenStack Swift.

Market adoption

As the storage backend to many popular applications such as SmugMug and Dropbox, AWS S3 has grown to massive scale, citing over 2 trillion objects stored in April 2013.[8] Two months later, Microsoft claimed to store even more objects in Azure, at 8.5 trillion.[9]

Object storage systems have also gained some traction, particularly around very large custom applications such as eBay's auction site, where EMC Atmos is used to manage over 500 million objects a day.[10] Hitachi's HCP product also claims many petabyte-scale customers.[11]

References

  1. ^ "Coda File System". Retrieved 17 September 2013.
  2. ^ Braam, Peter. "Lustre: The intergalactic file system" (PDF). Retrieved 17 September 2013.
  3. ^ "OceanStore". Retrieved 18 September 2013.
  4. ^ Kubiatowicz, John; et al. (November 2000). "OceanStore: An Architecture for Global-Scale Persistent Storage" (PDF). Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2000). Retrieved 18 September 2013.
  5. ^ "EMC Unveils Low-Cost Data-Storage Product". LA Times. April 30, 2002. Retrieved 17 September 2013.
  6. ^ Leung, Leo (16 September 2013). "After 10 years, object storage investment continues and begins to bear significant fruit". Retrieved 17 September 2013.
  7. ^ Vajgel, Peter. "Needle in a haystack: efficient storage of billions of photos". Retrieved 17 September 2013.
  8. ^ Harris, Derrick (18 April 2013). "Amazon S3 goes exponential, now stores 2 trillion objects". Gigaom. Retrieved 17 September 2013.
  9. ^ Wilhelm, Alex (27 June 2013). "Microsoft: Azure powers 299M Skype users, 50M Office Web Apps users, stores 8.5T objects". thenextweb.com. Retrieved 18 September 2013.
  10. ^ Robb, Drew (11 May 2011). "EMC World Continues Focus on Big Data, Cloud and Flash". Infostor. Retrieved 19 September 2013.
  11. ^ "Hitachi Content Platform Supports Multiple Petabytes, Billions of Objects". Techvalidate.com. Retrieved 19 September 2013.