Voldemort (distributed data store)

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Sae1962 (talk | contribs) at 14:28, 5 April 2011 (Added a new article stub). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Voldemort is a distributed key-value data store used by LinkedIn for highly scalable storage.[1]

Voldemort is still under development. It is neither an object database nor a relational database. It does not try to satisfy arbitrary relations or the ACID properties; rather, it is a big, distributed, fault-tolerant, persistent hash table.[2]

Advantages

Voldemort offers a number of advantages over other databases:[2]

  • It combines in-memory caching with the storage system, so a separate caching tier is not required (the storage system itself is simply fast)
  • The storage layer is completely mockable, so it can be emulated. This makes development and unit testing easy, as both can be done against a throw-away in-memory storage system without the need for a real cluster or real storage system
  • Reads and writes scale horizontally
  • Simple API: The API decides data replication and placement and accommodates a wide range of application-specific strategies
  • Transparent data partitioning: This allows for cluster expansion without rebalancing all data
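
The mockable storage layer mentioned above can be illustrated with a minimal sketch. The names `StorageEngine`, `InMemoryStorageEngine`, and `rename_member` are hypothetical and chosen only for this example; they are not Voldemort's actual API. The idea is that application logic written against a small get/put/delete interface can be unit-tested with a throw-away dictionary-backed store instead of a real cluster.

```python
from typing import Dict, Optional, Protocol


class StorageEngine(Protocol):
    """Hypothetical minimal storage interface (not Voldemort's real API)."""

    def get(self, key: bytes) -> Optional[bytes]: ...
    def put(self, key: bytes, value: bytes) -> None: ...
    def delete(self, key: bytes) -> bool: ...


class InMemoryStorageEngine:
    """Throw-away engine backed by a dict, intended for unit tests only."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def get(self, key: bytes) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: bytes) -> bool:
        # True if the key existed and was removed, False otherwise.
        return self._data.pop(key, None) is not None


def rename_member(store: StorageEngine, member_id: bytes, new_name: bytes) -> bool:
    """Example application logic: update a name only if the member exists."""
    if store.get(member_id) is None:
        return False
    store.put(member_id, new_name)
    return True
```

In a test, `rename_member` would be called with an `InMemoryStorageEngine` instance; in production, the same function would receive an object that talks to the real store, provided it offers the same three methods.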

Properties

The Voldemort distributed data store has the following properties:[1]

  • Data placement: Support for pluggable data placement strategies exists to support things like distribution across data centers that are far apart.
  • Data replication: The data is automatically replicated over a large number of servers.
  • Data partitioning: The data is automatically partitioned so that each server contains only a subset of the total data
  • Good single-node performance: 10,000 to 20,000 operations per second can be expected, depending on the machines, the network, the disk system, and the data replication factor
  • Node independence: Each node is independent of other nodes with no central point of failure or coordination
  • Pluggable serialization: This allows rich keys and values including lists and tuples with named fields, as well as integration with common serialization frameworks. Examples of such frameworks are Avro, Java Serialization, Protocol Buffers, and Thrift
  • Transparent failures: Server failures are handled transparently so that the user doesn't see such problems
  • Versioning: The data items are versioned to maximize data integrity in case of failure without compromising availability of the system
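
The partitioning and replication properties listed above can be sketched with a hash ring, one common way of assigning keys to servers without a central coordinator. This is an illustration under stated assumptions, not a description of Voldemort's actual routing code: the class `HashRing`, its parameters `partitions_per_node` and `replication_factor`, and the node names are all hypothetical. Each node owns several positions on the ring; a key is stored on the first `replication_factor` distinct nodes found when walking the ring clockwise from the key's hash.

```python
import hashlib
from bisect import bisect_right
from typing import List


def _ring_position(value: str) -> int:
    """Map a string to a stable integer position (MD5 is used only to spread keys)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical hash ring that maps each key to a list of replica nodes."""

    def __init__(self, nodes: List[str], partitions_per_node: int = 8,
                 replication_factor: int = 2) -> None:
        if partitions_per_node < 1:
            raise ValueError("partitions_per_node must be at least 1")
        if not 1 <= replication_factor <= len(set(nodes)):
            raise ValueError("replication_factor must be between 1 and the node count")
        self.replication_factor = replication_factor
        # Every node contributes several partitions, each at a pseudo-random position.
        self._ring = sorted(
            (_ring_position(f"{node}#{i}"), node)
            for node in set(nodes)
            for i in range(partitions_per_node)
        )
        self._positions = [position for position, _ in self._ring]

    def nodes_for(self, key: str) -> List[str]:
        """Return the distinct nodes that should hold replicas of the key."""
        start = bisect_right(self._positions, _ring_position(key))
        owners: List[str] = []
        # Walk clockwise (wrapping around) until enough distinct nodes are found.
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == self.replication_factor:
                    break
        return owners
```

For example, `HashRing(["node-a", "node-b", "node-c"]).nodes_for("member:42")` returns two distinct node names, and the same key always maps to the same nodes. Because every client can compute this mapping locally, no central lookup service is needed, and each server ends up holding only a subset of the keys.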

References

  1. "Voldemort is a distributed key-value storage system". Project Voldemort - A distributed database. http://project-voldemort.com/. Retrieved 2011-04-05.
  2. "Comparison to relational databases". Project Voldemort - A distributed database. http://project-voldemort.com/. Retrieved 2011-04-05.

See also