User:Ingenthr/draft on membase

From Wikipedia, the free encyclopedia
name                   = Membase
logo                   =
screenshot             =
caption                =
developer              = NorthScale, Zynga, NHN (corporation)
latest release version = 1.6.0 beta2
latest release date    = July 21, 2010 (2010-07-21)
operating system       = Cross-platform
programming language   = C++
genre                  = distributed memory caching system
license                = Apache License 2.0
website                = http://membase.org/


Membase (pronunciation: mem-base) is an open source (Apache 2.0 license), distributed key-value database management system optimized for storing data behind interactive web applications. These applications must serve many concurrent users while creating, storing, retrieving, aggregating, manipulating and presenting data. To support these needs, membase is designed to provide simple, fast, easy-to-scale key-value data operations with low latency and high sustained throughput. It is designed to be clustered, from a single machine up to very large deployments.

For those familiar with memcached, membase provides on-the-wire protocol compatibility, but is designed to add disk persistence (with hierarchical storage management), data replication, live cluster reconfiguration and rebalancing, and multi-tenancy with data partitioning.
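
As an illustration of what on-the-wire compatibility means for a client, the sketch below frames and parses messages in the memcached text protocol without opening a network connection. The helper names (build_set, build_get, parse_get_response) are hypothetical and chosen for this example; they are not part of membase or of any memcached client library.

```python
# Minimal sketch of memcached text-protocol framing (no network I/O).
# Helper names are hypothetical; only the wire format comes from memcached.


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a `set` command: a header line, then the data block, each ending in CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a single-key `get` command."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> dict:
    """Parse `VALUE <key> <flags> <bytes>` blocks terminated by `END` into {key: data}."""
    items = {}
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        line = raw[pos:eol]
        pos = eol + 2
        if line == b"END":
            return items
        parts = line.split(b" ")
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError(f"unexpected response line: {line!r}")
        size = int(parts[3])
        items[parts[1].decode("ascii")] = raw[pos:pos + size]
        pos += size + 2  # skip the data block and its trailing CRLF


if __name__ == "__main__":
    print(build_set("user:1", b"hello"))
    print(parse_get_response(b"VALUE user:1 0 5\r\nhello\r\nEND\r\n"))
```

Because the framing is the same, an application that already sends these bytes to memcached would not need a different client to talk to a protocol-compatible server.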

In the parlance of Eric Brewer’s CAP theorem, membase is a CA-type system.

History


Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, expressly to meet the need for a key-value database that kept the simplicity, speed, and scalability of memcached while also providing the storage, persistence and querying capabilities of a database. The original membase source code was contributed by NorthScale and project co-sponsors Zynga and NHN (corporation) to a new project on membase.org in June 2010.


Design Drivers


Membase design decisions are weighed against three non-negotiable requirements: by design, membase is simple, fast, and elastic[1].

Simple. Membase is easy to manage and simple to develop against. Every node in a membase cluster is alike: clone a node, join it to the cluster and press the rebalance button to automatically rebalance data to it. Membase enjoys the widest language and application framework support of any NoSQL database technology due to its on-the-wire protocol compatibility with memcached; in fact, membase directly incorporates memcached “front end” source code, leveraging the memcached engine interface, which is intended to preserve compatibility today and into the future.

Fast. Membase distributes data and data operation I/O across commodity servers (or VMs), replicates data for high availability, transparently caches data in main memory, and persists data using a multi-tier storage management design (planned to support solid-state drive and hard disk drive media). It processes data operations with consistently low latency and high throughput. It is multi-threaded with low lock contention, automatically de-duplicates writes, and is internally asynchronous wherever possible.

Elastic. Membase scales elastically, with linear cost. Servers can be added to, or removed from, a running cluster with no application downtime. Employing commodity servers, virtual machines or cloud machine instances, data management resources can be dynamically matched to the needs of an application with little effort.
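
To make the idea of rebalancing partitioned data concrete, here is a minimal sketch of a partition map that is adjusted when a server joins or leaves. It is not membase's actual algorithm: the partition count, the CRC32-based key hashing, and the function names (partition_for, initial_map, rebalance) are all assumptions made for illustration.

```python
import zlib

NUM_PARTITIONS = 16  # hypothetical value, kept small for illustration


def partition_for(key: str) -> int:
    """Map a key to a fixed partition; the mapping never changes when nodes change."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def initial_map(nodes):
    """Assign partitions to nodes round-robin."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]


def rebalance(pmap, nodes):
    """Return a new map in which no node owns more than ceil(P / N) partitions.

    Partitions stay where they are unless their owner left the cluster or
    already holds its full share; only those surplus partitions are moved.
    """
    new_map = list(pmap)
    load = {n: 0 for n in nodes}
    pool = []  # partitions that need a new owner
    ceiling = -(-len(pmap) // len(nodes))  # ceiling division
    for p, owner in enumerate(pmap):
        if owner in load and load[owner] < ceiling:
            load[owner] += 1
        else:
            pool.append(p)
    for p in pool:
        target = min(nodes, key=lambda n: load[n])  # least-loaded node
        new_map[p] = target
        load[target] += 1
    return new_map


if __name__ == "__main__":
    before = initial_map(["a", "b"])
    after = rebalance(before, ["a", "b", "c"])
    moved = sum(1 for old, new in zip(before, after) if old != new)
    print(f"partitions moved when adding node c: {moved}")
```

The design point the sketch tries to show is that keys hash to partitions rather than to servers, so adding or removing a server only reassigns some partitions and leaves the key-to-partition mapping untouched.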


Data model


Key Features (persistence, replication/failover, scalability/performance)

Persistence

  • Asynchronously writes data to disk after acknowledging the write to the client.
    • Tunables define item ages that affect when data is persisted[2].
  • Supports a working set larger than the memory quota of a "node" or "bucket".
    • Tunables control the maximum memory and how migration from main memory to disk is handled[3].
  • Configurable “tap” interface: External systems can subscribe to filtered data streams – supporting, for example, full text search indexing, data analytics or archiving.
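
The asynchronous persistence and write de-duplication described above can be sketched as a write-behind queue. This is a simplified illustration, not membase's implementation: the class name WriteBehindStore, the in-memory dict standing in for a disk, and the explicit flush() call (which a real system would run in the background) are all assumptions.

```python
from collections import OrderedDict


class WriteBehindStore:
    """Hypothetical write-behind store: acknowledge first, persist later."""

    def __init__(self):
        self.memory = {}            # authoritative in-memory copy
        self.dirty = OrderedDict()  # keys awaiting persistence, one entry per key
        self.disk = {}              # stand-in for durable storage
        self.disk_writes = 0

    def set(self, key, value):
        self.memory[key] = value
        # Re-setting an already-dirty key adds no new queue entry, so several
        # updates to one key collapse into a single disk write.
        self.dirty[key] = None
        return "STORED"  # acknowledged before anything reaches disk

    def flush(self):
        """Persist the latest value of every dirty key, oldest-queued first."""
        while self.dirty:
            key, _ = self.dirty.popitem(last=False)
            self.disk[key] = self.memory[key]
            self.disk_writes += 1


if __name__ == "__main__":
    store = WriteBehindStore()
    for n in range(3):
        store.set("counter", n)
    store.set("other", "x")
    store.flush()
    print(store.disk, store.disk_writes)
```

The trade-off this makes visible is that a write acknowledged to the client exists only in memory until the next flush, which is why replication matters for availability.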


Replication and Failover

  • Multi-model replication support: Peer-to-peer replication, with an underlying architecture that also supports master-slave replication.
  • Configurable replication count: Balance resource utilization with availability requirements
  • High-speed failover: Fast failover to replicated items based upon request
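
One way to picture a configurable replication count and failover is a chain of nodes per data partition, where the first entry serves requests and the rest hold replicas. The sketch below is an assumption-laden illustration rather than membase's actual mechanism; build_chains and fail_over are hypothetical names.

```python
def build_chains(nodes, num_partitions, replicas):
    """For each partition, list the active node followed by `replicas` replica nodes."""
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to place copies on distinct nodes")
    return [
        [nodes[(p + i) % len(nodes)] for i in range(replicas + 1)]
        for p in range(num_partitions)
    ]


def fail_over(chains, failed):
    """Drop the failed node from every chain; the next surviving copy becomes active."""
    return [[node for node in chain if node != failed] for chain in chains]


if __name__ == "__main__":
    chains = build_chains(["a", "b", "c"], num_partitions=6, replicas=1)
    print("before:", chains)
    print("after a fails:", fail_over(chains, "a"))
```

A higher replica count lets more nodes fail before any partition loses its last copy, at the cost of more memory and network traffic per write.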


Scalability and Performance

  • Distributed object store: Easily store and retrieve large volumes of data from any application, using any language or application framework
  • Dynamic cluster resizing and rebalancing: Effortlessly grow or shrink a membase cluster, adapting to changing data management requirements of an application
  • Guaranteed data consistency: Never grapple with consistency issues in your application – no quorum reads required
  • High sustained throughput
  • Low, predictable latency: When operating out of memory, most operations complete in far less than 1 ms (assuming gigabit Ethernet).


Prominent users

  • Zynga - membase is the key-value database behind FarmVille[4]
  • NHN[5]

See also


References




Commercially Supported Distributions



Category:Open source database management systems
Category:Distributed computing architecture
Category:NoSQL
Category:Cross-platform software
Category:Structured storage