Oracle Zero Data Loss Recovery Appliance

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by KS bnr (talk | contribs) at 01:13, 30 January 2019 (Backup and Recovery Challenges: Added additional details and links to sources). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Oracle Recovery Appliance
Original author(s): Oracle Corporation
Initial release: September 2014
Operating system: Oracle Linux
Platform: Zero Data Loss Recovery Appliance
License: Commercial
Website: www.oracle.com/zdlra


Figure: Recovery Appliance X7

The Oracle Zero Data Loss Recovery Appliance[1] (Recovery Appliance or ZDLRA) is a computing platform that includes Oracle Corporation (Oracle) hardware and software built for backup and recovery of the Oracle Database. The Recovery Appliance validates backups, automatically corrects many issues, and provides alerts when backups fail validation.[2][3][4]

Oracle's Recovery Appliance is designed for use with the Oracle Database and does not work with non-Oracle databases. It takes the place of third-party backup and recovery products. Although it is limited to Oracle database backup, it provides more capability than general-purpose backup and recovery solutions.[5][6][7][8] Industry analyst firm ESG[9] (via ESG Lab) reviewed the Oracle Recovery Appliance and noted that it meets the needs of the financial services industry, providing what ESG calls "Fiduciary Class Data Recovery"[10][11][12] to meet the high level of trust required by financial institutions.

The Recovery Appliance was introduced in 2014 as part of Oracle Corporation's family of Engineered Systems[13] and shares components with the Oracle Exadata Database Machine, with an additional layer of software that provides the specific features for backup, recovery, replication, monitoring and management.

The Recovery Appliance uses an elastic configuration starting with a "Base Rack" that can be incrementally increased to a "Full Rack" or to larger "multi-rack" configurations. A Base Rack is capable of managing over 100 terabytes of backup data, while a Full Rack can manage over 700 terabytes. Multi-rack configurations of up to 18 racks can manage more than 13 petabytes of data.[1] Because the Recovery Appliance only needs to store data that has changed, the databases it protects can be many times larger than its own storage capacity.[14]

Backup and Recovery of Databases

Backing up databases is a standard practice of database administrators, and virtually all databases use some form of backup. A backup is a separate copy of data that can be restored and used in place of a damaged or unavailable database. Recovery is the main goal; backups are simply the mechanism that enables recovery. If a backup copy is itself corrupt, it cannot be used for recovery. If the backup data is not current, some or all of the changes made to the database since the backup could be lost.[15]

The value of being able to recover data varies greatly, depending on the nature of the database. Financial transactions, medical records, and national security information are examples of databases that require the greatest level of data protection. The average cost of downtime for such databases is estimated at millions of U.S. dollars per hour[2].

Due to the high cost of database downtime, critical databases are usually protected with a standby database that is closely synchronized with the primary database using a "data replication"[16] mechanism. If the primary database becomes unavailable, the applications can switch to the standby (now the primary) and continue to operate. However, until the original primary is restored and synchronized, the organization is exposed if the new primary also becomes unavailable. Backups are, therefore, the ultimate failsafe for data protection. The more important the database, the more important the backup and its recoverability.

Backup and Recovery Steps

The general process of database backup and recovery includes the following steps:

Backup

  1. On a regular schedule, run a backup utility that makes a copy of the full database or just the incremental changes since the last backup. Typically, a full backup of the database is made once a week, followed by daily incremental backups.
  2. In between backups, periodically save (archive) changes in the form of transaction logs.

Recovery

  1. Restore the last full database backup, then apply all subsequent incremental backups in order, plus any subsequently archived transaction logs.
  2. If possible, recover completed transactions that occurred after the last archived log. This usually requires re-entry based on a separate journal maintained by the applications. If not possible, data loss has occurred.
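The recovery steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular backup utility works: the `Backup` record, the `recovery_plan` function, and the integer hour timestamps are all hypothetical names introduced here. The sketch assumes each archived log covers changes up to the time it was archived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str      # "full", "incremental", or "log" (hypothetical labels)
    taken_at: int  # hypothetical timestamp, in hours since an arbitrary start


def recovery_plan(backups, target_time):
    """Return the ordered pieces to restore and apply to reach target_time."""
    # Only backups taken at or before the target time can be used.
    usable = sorted(
        (b for b in backups if b.taken_at <= target_time),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available before the target time")

    # Step 1a: restore the last full backup.
    base = fulls[-1]
    # Step 1b: apply every later incremental backup, in order.
    incrementals = [
        b for b in usable
        if b.kind == "incremental" and b.taken_at > base.taken_at
    ]
    # Step 1c: apply archived logs written after the last piece applied so far.
    last_applied = incrementals[-1].taken_at if incrementals else base.taken_at
    logs = [b for b in usable if b.kind == "log" and b.taken_at > last_applied]
    return [base, *incrementals, *logs]


# Weekly full at hour 0, daily incrementals, and archived logs in between.
backups = [
    Backup("full", 0), Backup("incremental", 24), Backup("log", 30),
    Backup("incremental", 48), Backup("log", 54), Backup("log", 60),
    Backup("full", 168),
]
plan = recovery_plan(backups, target_time=62)
```

In this example the plan ends with the log archived at hour 60, so any transactions committed between hour 60 and hour 62 fall under step 2: they must be re-entered from another source or they are lost.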

Backup and Recovery Challenges

Oracle designed the Recovery Appliance to overcome problems that commonly plague backup and recovery of critical databases:[1][17]

  • Long-running backups - Critical databases are often the largest, and a weekly full backup can take a long time to complete. The longer the backup runs, the more computing and backup resources it requires and the higher the likelihood of a failure. Some vendors encourage full backups[18] in order to highlight the ability of their products to deduplicate data[19]. Full backups are also faster to recover from than a series of incremental backups, but repeating full backups and then deduplicating them wastes CPU and I/O resources on the production database and increases backup network traffic.
    • Recovery Appliance - After an initial full RMAN backup, the Recovery Appliance uses an "incremental forever" strategy and backs up only the changes, minimizing the time and resources required. The database already tracks data file block changes via RMAN block change tracking, so changed blocks can be sent efficiently to the Recovery Appliance. "Virtual full" backups are automatically compiled in the background on the Recovery Appliance to provide the fast recovery of a full backup.
  • Application slow-down during backups - Most backup processes consume significant computing resources on the system where the database resides, particularly if data compression and deduplication are performed there on full backups. This impacts the performance of running applications and may force other applications to be turned off until backups complete.
    • Recovery Appliance - With the incremental forever approach, backups run for the shortest time possible and have minimal impact on other workloads. All post processing of backups is then performed on the Recovery Appliance and not on the production systems.
  • Excessive backup storage - The cost of storing backups can grow quickly, depending on how many backup copies are maintained, whether they are full or incremental backups, and how much deduplication is applied. Most databases have a small percentage of daily changes, but may be fully backed up to simplify recovery.
    • Recovery Appliance - Only changed data is stored for backups, which is the same as deduplicating all databases at their source without any of the processing overhead associated with deduplication on the production system. Not having to perform full backups can easily reduce the backup storage requirements by 75% or more, depending upon database size and the rate of change.[14]
  • Corrupt backups - If a backup has corrupt or missing data, it cannot be used for recovery. Data can be corrupted in many ways as it moves from the primary database through the computing infrastructure and across the network to the backup destination, and many hardware and software elements along the way could cause corruption. Data corruption is difficult to detect ahead of time and often goes unnoticed until a recovery is already in progress. Operational mistakes can also produce an incomplete backup, and incomplete backups can result in unrecoverable databases.
    • Recovery Appliance - The Recovery Appliance understands internal Oracle database block formats, which enables deep levels of data validation. All backup data is validated at various times in the Recovery Appliance to ensure that recovery operations will always restore valid data. If corruption or missing data is detected, the Recovery Appliance is able to repair it automatically or alert the administrator.
  • Lost transactions - Ensuring no data loss is one of the biggest challenges with database recovery, since backups occur periodically on a set schedule, whereas the database is constantly changing. If backups are daily, it is possible to lose an entire day of changes during recovery, which is unacceptable for critical databases. A common practice is to periodically capture changes in the form of transaction logs maintained by the database. This practice reduces, but does not eliminate, the likelihood of lost transactions. Continuous data protection is the process of capturing changes as they occur in between backups and adding them to the recovery process to eliminate most or all lost transactions.
    • Recovery Appliance - Redo logging is the fundamental means of implementing transactional changes within the Oracle database. In between backups, Oracle databases can continuously send redo directly from inside the database to the Recovery Appliance. This provides real-time continuous data protection, so that databases are protected as transactions happen, up to the last sub-second.
  • Error-prone recovery processes - Data protection is a multi-step process, with recovery occurring under stress. If any data is corrupt or missing or a recovery step is skipped or out of sequence, the recovery will fail. Seldom does an organization practice recovery or place high value on the certainty of recovery, as it is rarely required. This large gap between the cost of downtime and the perceived need for recovery often leads to underinvestment, until it is too late.
    • Recovery Appliance - The Recovery Appliance introduces the concept of protection policies, which define recovery windows that are enforced on a per-database basis. All backups (data files and archived logs) are validated for recoverability on a continual basis, to any point in time within the specified recovery window. Using protection policies, databases can be grouped by recovery service tier, with a new database inheriting the policies of the tier to which it is added. When a failure occurs and a database must be restored, the Recovery Appliance automates the steps to recover the database[20] either to the point of failure or to a specified point in time.
  • Consistent restore performance - Once a restore is needed, it is very important to understand how long it will take to recover the database. Traditional deduplication systems have to process and hydrate data before it can be used for recovery. The time this takes can be unpredictable during database recovery, which makes estimating recovery time very difficult.
    • Recovery Appliance - The Recovery Appliance understands the database format and optimizes the backup storage so that the latest backups are reordered contiguously, which minimizes fragmentation and produces predictable restore performance.[21]
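The "incremental forever" and "virtual full" ideas described in the list above can be illustrated with a toy model. The sketch below is not Oracle's implementation: it assumes a database is just a mapping from block number to block contents, and the names `virtual_full` and `blocks_stored` are hypothetical. It shows only that one initial full backup plus a stream of changed blocks is enough to reconstruct a full image for any backed-up day, while storing far fewer blocks than repeated full backups would.

```python
def virtual_full(initial_full, incrementals, as_of):
    """Rebuild a full image as of a given day by overlaying changed blocks.

    initial_full: dict of block number -> contents from the one full backup
    incrementals: dict of day -> dict of changed blocks backed up that day
    """
    image = dict(initial_full)
    for day, changed_blocks in sorted(incrementals.items()):
        if day > as_of:
            break
        # Later versions of a block replace earlier ones.
        image.update(changed_blocks)
    return image


def blocks_stored(initial_full, incrementals):
    """Count blocks kept: the initial full plus every changed block."""
    return len(initial_full) + sum(len(c) for c in incrementals.values())


# A four-block database with one or two changed blocks per day.
initial_full = {0: "a0", 1: "b0", 2: "c0", 3: "d0"}
incrementals = {
    1: {1: "b1"},
    2: {3: "d2"},
    3: {1: "b3"},
}

day2_image = virtual_full(initial_full, incrementals, as_of=2)
incremental_forever_blocks = blocks_stored(initial_full, incrementals)
# Four separate full backups (day 0 through day 3) would store every block each time.
repeated_full_blocks = len(initial_full) * (1 + len(incrementals))
```

Here the day 2 image contains the day 1 version of block 1 and the day 2 version of block 3, and the incremental-forever approach stores 7 blocks where four full backups would store 16. Actual savings depend on database size and rate of change, as noted above.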

Recovery Appliance History and Architecture

The Recovery Appliance debuted in 2014 and does not use a traditional model number. Instead, it uses generations, which are based on the hardware technology available at the time of release. Recovery Appliance generations are interoperable and elastic, so expansion is not limited to a single generation of hardware.

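ZDLRA Generation | Release Date | Base Rack Capacity | Full Rack Capacity | Full Rack Backup & Restore
-----------------|--------------|--------------------|--------------------|---------------------------
X4[22]           | 2014         | 37 TB*             | 224 TB             | 12 TB/hour
X5[23]           | 2015         | 50 TB              | 340 TB             | 12 TB/hour
X6               | 2016         | 94 TB              | 580 TB             | 12 TB/hour
X7               | 2017         | 119 TB             | 729 TB             | 24 TB/hour

*TB = terabyte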
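As a rough illustration of what the backup and restore rates in the table imply, the following sketch divides a data volume by a rated throughput. The function name `hours_at_rated_throughput` is hypothetical, and the result is an idealized lower bound that assumes the full-rack rate is sustained for the whole operation; real restore times depend on network, database, and workload factors not modeled here.

```python
def hours_at_rated_throughput(data_tb, rate_tb_per_hour):
    """Idealized transfer time in hours for data_tb terabytes at a fixed rate."""
    if rate_tb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return data_tb / rate_tb_per_hour


# The same 60 TB restore at the X4-X6 rate and at the X7 rate from the table.
hours_x6 = hours_at_rated_throughput(60, 12)
hours_x7 = hours_at_rated_throughput(60, 24)
```

Under these assumptions, moving 60 TB takes 5 hours at 12 TB/hour and 2.5 hours at 24 TB/hour.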


The Recovery Appliance also uses a full stack of Oracle software technology:


References

  1. Various authors (2017). "Oracle Zero Data Loss Recovery Appliance X7 Technical Data Sheet" (PDF). Oracle Corporation. Retrieved July 30, 2018.
  2. Moore, Fred (June 1, 2015). "White Paper: Implementing a Modern Backup Architecture: Oracle's Tiered Data Protection Strategy". Horison.com. Retrieved July 30, 2018.
  3. "CIO Magazine White Paper: Extreme Protection That Eliminates Data Loss for All of Your Oracle Databases" (PDF). Oracle Corporation. Retrieved July 30, 2018.
  4. "Video: Resume Business Faster With Engineered Database Recovery". Oracle Corporation. Retrieved July 30, 2018.
  5. Vellante, David (October 22, 2015). "Oracle Backup and Recovery Strategies: Moving to Data-Protection-as-a-Service". Wikibon. Retrieved July 30, 2018.
  6. Floyer, David (September 2, 2016). "Real-time Recovery Architecture as a Service". Wikibon. Retrieved July 30, 2018.
  7. Goodwin, Phil (November 1, 2016). "Oracle's Zero Data Loss Recovery Appliance: A Transaction DVR for the Enterprise" (PDF). IDC (International Data Corp). Retrieved July 30, 2018.
  8. "Taneja Group Whitepaper: FULL DATABASE PROTECTION WITHOUT THE FULL BACKUP PAIN" (PDF). Taneja Group. Retrieved July 30, 2018.
  9. "Enterprise Strategy Group". Enterprise Strategy Group. Retrieved July 30, 2018.
  10. Choinski, Sr, Vinny (October 1, 2016). "ESG Lab Validation: Zero Data Loss Recovery Appliance from Oracle". Enterprise Strategy Group. Retrieved July 30, 2018.
  11. Peters, Mark (November 3, 2016). "Better Business Protection – Fiduciary Class Data Recovery". Enterprise Strategy Group. Retrieved July 30, 2018.
  12. Peters, Mark (2016). "Understanding Oracle Engineered Storage: Oracle Zero Data Loss Recovery Appliance". Oracle Corporation. Retrieved July 30, 2018.
  13. Hollis, Chuck (September 9, 2015). "Chuck's Blog: Grown-up IT for Grown-Up Applications". typepad. Retrieved July 30, 2018.
  14. Craft, Chris (February 21, 2018). "De-Duplication in ZDLRA". Wordpress. Retrieved July 30, 2018.
  15. Miller, Lawrence (2017). Database Protection for Dummies. John Wiley & Sons, Inc. pp. 1–35. ISBN 978-1-119-37957-7. https://go.oracle.com/LP=48477?elqCampaignId=49435&src1=ad:pas:go:dg:stor&src2=wwmk160606p00067c0001&SC=sckw=WWMK160606P00067C0001&mkwid=sGGgtXNTW%7cpcrid%7c238844513031%7cpkw%7cdatabase%20backup%20and%20recovery%7cpmt%7ce%7cpdv%7cc%7csckw=srch:database%20backup%20and%20recovery&gclid=CjwKCAjwkYDbBRB6EiwAR0T_-via_PHAabILjjVjJPczumLzdLKeDLCvmV-iLULWRskzHbQ6ahfoLhoChNwQAvD_BwE&gclsrc=aw.ds
  16. Rouse, Margaret (June 1, 2018). "What is Database Replication". Search Data Management. Retrieved July 31, 2018.
  17. "Database Recovery Without the Drama". Oracle Corporation. Retrieved July 31, 2018.
  18. Walker, Tilman. "How to avoid the 8 deadly misconceptions about data deduplication". TechBeacon. Retrieved January 30, 2019.
  19. "Data deduplication". Wikipedia. January 20, 2019. Retrieved January 30, 2019.
  20. Vellante, David (November 11, 2018). "Oracle's Recovery Appliance Reduces Complexity Through Automation" (PDF). Oracle.
  21. Various authors. "Zero Data Loss Recovery Appliance Deep Dive: Direct from Development" (PDF). Oracle Corporation. Retrieved January 29, 2019.
  22. Bradley, Marcie (September 29, 2014). "Oracle Reinvents Database Protection with Zero Data Loss Recovery Appliance". Oracle Corporation Press Release. Retrieved July 30, 2018.
  23. Whitaker, Teri (January 21, 2015). "Oracle Tackles Data Center Cost and Complexity with Next-Generation Engineered Systems". Oracle Corporation Press Release.