Data contract
In data management, a data contract is a link between data producers and data consumers. It is also a link between the business (the logical representation of the data) and technology (its physical implementation). A data contract also describes advanced metadata, such as data quality rules, service-level agreements (SLAs), and behavior.
The Linux Foundation project Bitol has published a data contract standard called the Open Data Contract Standard (ODCS).[1] Its current version is 3.0.1.
History
In May 2023, PayPal open-sourced its Data Contract Template.[2]
In June 2023, Andrew Jones published Driving Data Quality with Data Contracts: A comprehensive guide to building reliable, trusted, and effective data platforms,[3] which is, to date, the only published book on this topic.
In November 2023, Bitol, a Linux Foundation project, released the first version of the Open Data Contract Standard (ODCS), a compatible fork of the PayPal template.[4]
In October 2024, Bitol released ODCS v3.0.0 with enhanced support for data quality.[5]
Implementation
Data contracts are divided into several sections:
Best Practices
Usually, a data contract is created by one data producer for one or many data consumers.
A data contract is designed to be enhanced iteratively. Data engineers can start with a few elements in the header and the schema. Data engineers and owners can then add more information, such as data quality rules and SLAs, over time.
Most data contracts are implemented as a YAML file, which is both human- and computer-readable, as well as language-agnostic.
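The iterative approach described above can be illustrated with a minimal sketch. It is written in Python with a plain dictionary standing in for the parsed YAML file, so that it runs without a YAML library. All field names (id, version, schema, quality, sla, freshness_hours) and both helper functions are hypothetical and chosen for illustration only; they are not taken from ODCS or from any other published standard.

```python
# Hypothetical contract, first iteration: header and schema only.
# Field names are illustrative assumptions, not part of any standard.
contract = {
    "id": "orders-v1",
    "version": "0.1.0",
    "schema": [
        {"name": "order_id", "type": "string", "required": True},
        {"name": "amount", "type": "number", "required": True},
    ],
}

# Maps the sketch's type names to Python types.
PYTHON_TYPES = {"string": str, "number": (int, float)}


def add_section(contract, key, value):
    """Return a copy of the contract with one more top-level section."""
    enriched = dict(contract)
    enriched[key] = value
    return enriched


def validate_row(contract, row):
    """Return a list of violations of the contract for one record."""
    errors = []
    for column in contract["schema"]:
        name = column["name"]
        if row.get(name) is None:
            if column.get("required"):
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(row[name], PYTHON_TYPES[column["type"]]):
            errors.append(f"wrong type for {name}")
    # Quality rules are optional: early iterations may not define any.
    for rule in contract.get("quality", []):
        value = row.get(rule["field"])
        if isinstance(value, (int, float)) and value < rule["min"]:
            errors.append(f"{rule['field']} below minimum {rule['min']}")
    return errors


# Second iteration: a data quality rule and an SLA are added later.
contract_v2 = add_section(contract, "quality", [{"field": "amount", "min": 0}])
contract_v2 = add_section(contract_v2, "sla", {"freshness_hours": 24})
```

In this sketch, a record with a negative amount passes the first iteration of the contract, which only constrains the schema, and is flagged once the quality rule has been added in the second iteration.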
References
- ^ "Open Data Contract Standard (ODCS)". GitHub. Retrieved 18 March 2025.
- ^ "Data Contract Template". GitHub. PayPal. Retrieved 18 March 2025.
- ^ Jones, Andrew (30 June 2023). Driving Data Quality with Data Contracts (1st ed.). Packt. p. 206. ISBN 9781837635009. Retrieved 18 March 2025.
- ^ "Open Data Contract Standard". GitHub. Bitol. Retrieved 18 March 2025.
- ^ "ODCS Version 3.0.0". GitHub. Bitol. Retrieved 18 March 2025.