Data definition specification
In computing, a data definition specification (DDS) is a guideline to ensure comprehensive and consistent data definition. It represents the attributes required to quantify data definition. A comprehensive data definition specification encompasses enterprise data, the hierarchy of data management, prescribed guidance enforcement and criteria to determine compliance.
Overview
A data definition specification may be developed for any organization or specialized field, improving the quality of its products through consistency and transparency. It eliminates redundancy (since all contributing areas are referencing the same specification) and provides standardization, making it easier and more efficient to create, modify, verify, analyze and share information across the enterprise.[1]
To understand how a data definition specification works in an enterprise, we must look at the elements of a DDS. Writing data definitions, that is, defining business terms (or rules) in the context of a particular environment, provides structure for an organization’s data architecture. In developing these definitions, the words used must be traceable to clearly defined data.
A data definition specification may be used in the following activities to provide consistency and clarity between departments supporting the activity:[2]
- Business intelligence
- Business process modeling
- Business rules management
- Data analysis and modeling
- Information architecture
- Metadata modeling
- Report generation
Criteria
A data definition specification requires data definitions to be:
- Atomic – Singular, describing only one concept. Commonly used and ambiguous terms should be defined.[2] While a term refers to one concept, a term may consist of several words:
  - File – A concept identifiable with one word
  - File extension – A concept identifiable with more than one word
- Traceable – Mapped to a specific data element. In business, a term may be traced to an entity (for example, a customer) or an attribute (such as a customer's name). A term may be a value in a data set (such as gender), or designate the data set itself. Traceability indicates relationships in the data hierarchy.
- Consistent – Used in a standard syntax; if used in a specific context, the context is noted
- Accurate – Precise, correct and unambiguous, stating what the term is and is not[3]
- Clear – Readily understood by the reader
- Complete – Including the term, its description and contextual references
- Concise – Brief, avoiding circular references
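Some of these criteria can be checked mechanically. The following Python sketch is illustrative only: the `DataDefinition` record, its field names and the `check_definition` function are assumptions made for this example and are not part of any published specification. It tests completeness, traceability and one crude form of circularity; accuracy and clarity still require human review.

```python
from dataclasses import dataclass
import re


@dataclass
class DataDefinition:
    """One glossary entry; the field names are hypothetical."""
    term: str
    description: str
    traces_to: str = ""  # entity or attribute the term maps to, e.g. "customer.name"
    context: str = ""    # environment in which the definition applies


def check_definition(d: DataDefinition) -> list:
    """Return the criteria that the definition appears to violate."""
    problems = []
    term = d.term.strip()
    if not term or not d.description.strip():
        problems.append("complete: term and description are both required")
    if not d.traces_to.strip():
        problems.append("traceable: no data element is mapped to the term")
    # Crude circularity heuristic: the description repeats the whole term.
    if term and re.search(r"\b" + re.escape(term) + r"\b",
                          d.description, re.IGNORECASE):
        problems.append("concise: description repeats the term (circular)")
    return problems


good = DataDefinition(
    term="File extension",
    description="Suffix of a file name that indicates the file's format.",
    traces_to="file.extension",
)
circular = DataDefinition(
    term="Customer",
    description="A customer is a customer of the company.",
)

print(check_definition(good))      # no problems reported
print(check_definition(circular))  # flags the traceable and concise criteria
```

A check of this kind could run whenever a definition is added to a shared glossary, so that incomplete or circular entries are flagged before other departments rely on them.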
Applications
Enterprise data
A data definition specification was produced by the Open Mobile Alliance to document charging data.[4] The document, the centralized catalog of data elements defined for interfaces, specifies the mapping of these data elements to protocol fields in the interfaces.
Created for the exchange of financial data, Market Data Definition Language (MDDL) is an XML specification designed "to enable the interchange of information necessary to account, to analyze, and to trade financial instruments of the world's markets. It defines an XML-based interchange format and common data dictionary on the fields needed to describe: (1) financial instruments, (2) corporate events affecting value and tradability, and (3) market-related, economic and industrial indicators. The principal function of MDDL is to allow entities to exchange market data by standardizing formats and definitions. MDDL provides a common format for market data so that it can be efficiently passed from one processing system to another and provides a common understanding of market data content by standardizing terminology and by normalizing the relationships of various data elements to one another ... From the user perspective, the goal of MDDL is to enable users to integrate data from multiple sources by standardizing both the input feeds used for data warehousing (i.e., define what's being provided by vendors) and the output methods by which client applications request the data (i.e., ensure compatibility on how to get data in and out of applications)."[5]
Clinical submissions
The Clinical Data Interchange Standards Consortium (CDISC), a global, multidisciplinary, non-profit organization, has established standards to support the acquisition, exchange, submission and archiving of clinical research data and metadata. CDISC standards are vendor-neutral, platform-independent and freely available via the CDISC website. The Case Report Tabulation Data Definition Specification (define.xml), now in draft version 2.0, is the most mature of the data definition specifications. The specification is part of the evolution from the 1999 FDA electronic submission (eSub) guidance and the electronic Common Technical Document (eCTD) documents, which specify that a document describing the content and structure of the included data should be provided within a submission. Define.xml was developed to help automate the review process by providing a means to generate a Data Definition Document in machine-readable format. It has significantly improved the review process by standardizing the numerous submissions to the FDA and by making data interchange and regulatory submission more efficient. As a standard for transmitting Study Data Tabulation Model (SDTM), Standard for the Exchange of Non-clinical Data (SEND) and Analysis Data Model (ADaM) metadata, the Define XML standard has reduced review cycle times from over two years to months.[6]
Archival data
A Data Definition Specification (DDS) forms the foundation for building the metadata within the schema for data archiving. While not a DDS, the Metadata Encoding & Transmission Standard (METS) applies some of the same principles, notably the consistent use of well-defined terms. METS uses these key terms to catalog digital objects for global use. The METS schema provides a flexible mechanism for encoding descriptive, administrative, and structural metadata for a digital library object, and for expressing the complex links between these various forms of metadata. It can therefore provide a useful standard for the exchange of digital library objects between repositories.[7]
A similar effort is underway to preserve the complex data associated with the archiving of video games. Preserving Virtual Worlds sought to address the deficiencies in current archival formats, citing the absence of suitable ways of documenting interactive fiction and games at the bit level: specifically, the formats failed to provide the “representation information” needed to map the raw bits into higher-level data constructs.[8] Preserving Virtual Worlds 2 is the ongoing research project that continues and expands upon the initial effort in this field.[9]
References
- Gouin, Deborah & Corcoran, Charmane K. (2008). Developing the MSU Enterprise Data Definition Standard. Michigan State University Web site: http://eis.msu.edu/uploads/---University%20EIS%20Working%20Committee%20Meetings/05%20August%202008/Enterprise%20Data%20Definition%20Standard%20Presentation082708.pdf
- Thomas, Gwen. (2008). Writing Enterprise-Quality Data Definitions: Tips for Creating Terms and Definitions. Data Governance Institute Web site: http://www.datagovernance.com/dgi_wp_writing_enterprise-quality_data_definitions.pdf
- International Organization for Standardization JTC1/SC32 Committee. (2004). ISO 11179-4. http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
- "Charging Data" (PDF). Open Mobile Alliance. 1 February 2011. pp. 6, 35. Archived from the original (PDF) on 6 October 2013. Retrieved 12 March 2014.
- "Market Data Definition Language (MDDL)". Cover Pages. December 26, 2002. Archived from the original on December 14, 2013. Retrieved March 12, 2014.
- “Define.XML”. (2012). Clinical Data Interchange Standards Consortium (CDISC) Web site: http://www.cdisc.org/define-xml
- Metadata Encoding & Transmission Standard (METS) Web site. The Library of Congress, Standards: http://www.loc.gov/standards/mets/
- “Meta Data Schema Development”. (2008). Preserving Virtual Worlds Web site: http://pvw.illinois.edu/pvw/?page_id=25
- Preserving Virtual Worlds 2: Researching best practices for videogame preservation. (2012). http://pvw.illinois.edu/pvw2/