
Continuous data protection

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Networkengine (talk | contribs) at 22:51, 29 January 2008. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Continuous data protection (CDP), also called continuous backup, refers to backup of computer data by automatically saving a copy of every change made to that data, essentially capturing every version of the data that the user saves. It allows the user or administrator to restore data to any point in time.

CDP is a service that captures changes to data and stores them in a separate storage location. There are multiple methods for capturing the continuous changes, involving different technologies that serve different needs. CDP-based solutions can restore objects at fine granularity, ranging from crash-consistent images to logical objects such as files, mailboxes, messages, and database files and logs.

Differences from traditional backup

Continuous data protection differs from traditional backup in that the point in time to recover to does not have to be specified until a restore is performed. Traditional backups can only restore data to the point at which the backup was taken. With continuous data protection, there are no backup schedules. When data is written to disk, it is also asynchronously written to a second location, usually another computer over the network. This introduces some overhead to disk-write operations but eliminates the need for nightly scheduled backups.
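The idea of recording every write in a second location and choosing the restore point only at restore time can be sketched as follows. This is a minimal in-memory illustration, not any product's implementation: the names ChangeJournal, WriteRecord, capture, and restore are hypothetical, writes are assumed to arrive in timestamp order, and a real system would send the records asynchronously over the network rather than append them to a list.

```python
from dataclasses import dataclass, field


@dataclass
class WriteRecord:
    timestamp: float  # when the write happened
    offset: int       # byte offset within the protected volume
    data: bytes       # bytes written at that offset


@dataclass
class ChangeJournal:
    """Hypothetical second location that keeps every write, in order."""
    size: int
    records: list = field(default_factory=list)

    def capture(self, timestamp, offset, data):
        # Called for each write to the primary copy; appends, never overwrites.
        self.records.append(WriteRecord(timestamp, offset, bytes(data)))

    def restore(self, point_in_time):
        # Replay every write up to and including the chosen moment.
        # Assumes records were captured in timestamp order.
        image = bytearray(self.size)
        for rec in self.records:
            if rec.timestamp > point_in_time:
                break
            image[rec.offset:rec.offset + len(rec.data)] = rec.data
        return bytes(image)


journal = ChangeJournal(size=8)
journal.capture(1.0, 0, b"AAAA")
journal.capture(2.0, 4, b"BBBB")
journal.capture(3.0, 0, b"XX")

# The restore point is chosen only now, not when the writes were captured.
print(journal.restore(2.0))  # state after the second write
print(journal.restore(3.0))  # state after the third write
```

Because the journal keeps each write rather than a periodic copy, any timestamp can serve as a restore point, including moments between writes.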

Some solutions may be marketed as continuous data protection, but they may only allow restoring to fixed intervals, such as 1 hour or 24 hours ago. Some do not consider this to be true continuous data protection, because data cannot be restored to any point in time. Such solutions are often termed "snapshot-based". There is some debate in the industry as to whether the granularity of backup needs to be "every write" in order to be considered CDP, or whether a solution that captures the data every few seconds is good enough. Some argue that capturing data every few seconds is better described as near-continuous backup. The debate hinges on the use of the term "continuous": whether only the backup process needs to be continuous, which is sufficient to achieve the benefits cited above, or whether the ability to restore from the backup also has to be continuous. The Storage Networking Industry Association (SNIA) uses the "every write" definition.
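The practical difference between the two granularities can be illustrated with a small sketch. The function recovery_point and its parameters are hypothetical, times are plain integers in seconds, and snapshots are assumed to be taken at exact multiples of the interval starting from time 0.

```python
def recovery_point(write_times, target, interval=None):
    """Return the time of the newest write that a restore to `target` can include.

    interval=None models "every write" capture; an integer models a
    snapshot-based scheme that only keeps states at multiples of `interval`.
    """
    if interval is not None:
        # Only the most recent snapshot boundary at or before the target exists.
        target = (target // interval) * interval
    eligible = [t for t in write_times if t <= target]
    return max(eligible) if eligible else None


writes = [5, 65, 3590, 3610, 7100]  # seconds since protection started

every_write = recovery_point(writes, target=7000)
hourly = recovery_point(writes, target=7000, interval=3600)
print(every_write, hourly)
```

With these sample values, the every-write scheme can recover the write made at 3610 seconds, while the hourly scheme falls back to the snapshot taken at 3600 seconds, whose newest write is the one made at 3590 seconds.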

Differences from RAID/replication/mirroring

Continuous data protection differs from RAID, replication, or mirroring in that these technologies only protect against a storage hardware failure by protecting the most recent copy of the data. If a software problem corrupts the data, these technologies will simply protect the corrupt data. Continuous data protection will protect against some effects of data corruption by allowing an installation to restore a previous, uncorrupted version of the data. (Transactions that took place between the corrupting event and the restoration will be lost, however. They must be recovered through other means, such as journaling.)
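A toy comparison may make the distinction concrete. The classes Mirror and History below are hypothetical stand-ins, not real replication or CDP software: the mirror keeps only the latest state, while the history keeps the state after every change. The sample data and the "#garbage#" value are invented for illustration.

```python
class Mirror:
    """Keeps only the most recent state, as a mirror or replica would."""

    def __init__(self):
        self.state = {}

    def apply(self, key, value):
        self.state[key] = value


class History:
    """Keeps the state after every change (a simplified stand-in for CDP)."""

    def __init__(self):
        self.states = [{}]  # states[n] is the state after n changes

    def apply(self, key, value):
        new_state = dict(self.states[-1])
        new_state[key] = value
        self.states.append(new_state)

    def restore(self, change_count):
        return dict(self.states[change_count])


changes = [
    ("alice", 100),
    ("bob", 50),
    ("alice", "#garbage#"),  # a software fault corrupts this record
    ("carol", 75),           # a legitimate change made after the corruption
]

mirror, history = Mirror(), History()
for key, value in changes:
    mirror.apply(key, value)
    history.apply(key, value)

clean = history.restore(2)  # the state just before the corrupting change
print(mirror.state)
print(clean)
```

The mirror faithfully holds the corrupted record. The restored state is clean but lacks the later change to "carol", which matches the caveat above that transactions made after the corrupting event must be recovered by other means.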

Backup disk size

In some situations, continuous data protection will require less space on backup media (usually disk) than traditional backup. Most continuous data protection solutions save byte- or block-level differences rather than file-level differences. This means that if one byte of a 100 GB file changes, only the changed byte or block is backed up. Traditional incremental and differential backups copy each changed file in its entirety.
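Block-level change detection can be sketched as follows. The 4096-byte block size and the function name changed_blocks are assumptions chosen for illustration, and the sketch compares two full in-memory copies for simplicity, whereas a real CDP product would typically intercept writes as they happen.

```python
BLOCK_SIZE = 4096  # assumed block size; real systems vary


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return {block_index: new_block_bytes} for every block that differs."""
    changed = {}
    length = max(len(old), len(new))
    for start in range(0, length, block_size):
        old_block = old[start:start + block_size]
        new_block = new[start:start + block_size]
        if old_block != new_block:
            changed[start // block_size] = new_block
    return changed


old = bytes(10 * BLOCK_SIZE)   # a 10-block "file" of zero bytes
modified = bytearray(old)
modified[5000] = 0xFF          # change a single byte, which lies in block 1
new = bytes(modified)

delta = changed_blocks(old, new)
stored = sum(len(block) for block in delta.values())
print(sorted(delta), stored, len(new))
```

Only the one block containing the changed byte is stored (4096 bytes here), whereas a file-level copy would store all 40960 bytes. The same proportion applies to the 100 GB example above.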

See also