Data ecosystem

A data ecosystem is the complex environment of co-dependent networks and actors that contribute to data collection, transfer and use [1]. Data ecosystems can span sectors, such as healthcare and finance, so that practices in one sector inform those in another [2]. A data ecosystem often consists of numerous data assemblages (see below) [3]. Research into data ecosystems has developed in response to the rapid proliferation and availability of information through the web, which has contributed to the commodification of data [1].
What Are Data?
In today's highly computerized environment, data (singular: datum) refers to digitized information that is compressed for efficient transmission [4]. Data consists of binary values, expressed as 1 or 0, which allow complex thoughts, images, videos and more to be abstracted [4]. Data production and exchange have grown rapidly in recent decades, with governments and public agencies freely publishing vast amounts of data, particularly in environmental, cultural, scientific and statistical fields [1]. This growth has also produced a highly profitable industry of companies that collect, categorize and disseminate data as a tradable resource and operate within the newly defined data ecosystems [1].
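As a rough illustration of the binary representation described above, the following Python sketch converts a short piece of text into 1s and 0s and back again. The function names `to_bits` and `from_bits` are hypothetical, and UTF-8 is assumed as the text encoding; the cited source does not prescribe either.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Reverse `to_bits`: parse each 8-bit group and decode the bytes as UTF-8."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = to_bits("data")
print(encoded)             # 01100100 01100001 01110100 01100001
print(from_bits(encoded))  # data
```

Images, video and other media are abstracted in the same way, only with different encodings in place of UTF-8.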
Defining a Data Ecosystem
In nature, an ecosystem denotes a symbiotic relationship between elements. Describing a data environment as an ecosystem therefore implies a co-constitutive relationship among its parts. The primary purpose of a data ecosystem is to create, manage and sustain the sharing of data across platforms and disciplines [1]. Key to this are data intermediaries, which facilitate access to the data and are categorized into seven types, including data trusts, data exchanges and data platforms [2][5]. A data ecosystem also comprises data providers and data consumers who, as their titles denote, provide and consume the data through the intermediaries [3].
A common example of a data ecosystem is found in web browsing. A third-party tracking app on a website (commonly implemented with cookies) acts as an intermediary by collecting and organizing data. The web browser is the data provider, as it shares a user's information as they navigate between websites. The websites themselves are consumers, as they use the tracking information to tailor content to user behaviour.
As mentioned, data ecosystems can span sectors. For example, a client's medical data may be shared with an insurance company to calculate a premium. The point of an ecosystem is that all actors within the shared environment are contributing to a common resource or knowledge base [1].
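The three roles in the browsing example can be sketched in Python as follows. This is a minimal toy model, not a description of how real trackers work: the class names `Browser`, `Tracker` and `Website`, their methods, and the idea of tailoring content to the most visited topic are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """Intermediary: collects visit records and organizes them per user."""
    visits: dict = field(default_factory=lambda: defaultdict(Counter))

    def collect(self, user_id: str, topic: str) -> None:
        self.visits[user_id][topic] += 1

    def top_topic(self, user_id: str):
        # .get avoids creating an empty entry for users never seen before
        counts = self.visits.get(user_id)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


@dataclass
class Browser:
    """Provider: shares the user's activity as they navigate."""
    user_id: str

    def visit(self, tracker: Tracker, topic: str) -> None:
        tracker.collect(self.user_id, topic)


@dataclass
class Website:
    """Consumer: uses the tracking information to tailor content."""
    default_topic: str = "general"

    def recommend(self, tracker: Tracker, user_id: str) -> str:
        return tracker.top_topic(user_id) or self.default_topic


tracker = Tracker()
browser = Browser("u1")
for topic in ["cycling", "cycling", "cooking"]:
    browser.visit(tracker, topic)

site = Website()
print(site.recommend(tracker, "u1"))  # cycling
print(site.recommend(tracker, "u2"))  # general
```

The consumer never talks to the provider directly; all data passes through the intermediary, which mirrors the structure described above.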

Mapping a Data Ecosystem
Data ecosystems possess three major characteristics: network, platform, and co-evolution [1]. The network loosely refers to the groups of data and technology developers, providers, and resellers [1]. The platform is the service or tool that is collaboratively used by the network of actors [1]. It provides the interface through which the actors produce their shared product or service [1]. Co-evolution refers to how the different actors and the platform enable one another to evolve or improve [1]. The metaphorical use of the term ecosystem implies that all parties involved benefit from their engagement. Each party improves its own functioning, which leads to positive outcomes for the larger ecosystem. To return to the web browsing example, the third-party tracking app collects data to help websites evolve their content strategies, and those websites in turn provide more accurate user data to third-party trackers in a continuous feedback loop.
Data Assemblages
Within the broad landscape of a data ecosystem are numerous data assemblages. An assemblage is a set of interconnected socio-technical systems that work in tandem for a common purpose [3]. These systems encompass the technologies, politics, finances and practices that sustain the collection, transfer and dissemination of data [6]. The table below shows the common elements of a data assemblage, which facilitate and govern datafication.

A data ecosystem contains numerous data assemblages, as each actor within the system has its own set of tangible and intangible elements for its operation. A web browser, as a data provider, has its own assemblage of hardware, software, servers, finances, infrastructure and practices. Each website that consumes the data, and the broader company that it represents, similarly presents an assemblage of systems. The intermediary trackers that collect and sell the data also operate according to an assemblage. Different assemblages may share elements within the broader ecosystem, or have individual elements, such as incompatible hardware or platforms, that come into conflict [8].
Big Data
The rise of data ecosystems is part and parcel of the development of big data. Big data has been characterized as being:
- huge in volume, consisting of terabytes or petabytes of data;
- high in velocity, being created in or near real-time;
- diverse in variety, being structured and unstructured in nature;
- exhaustive in scope, striving to capture entire populations or systems (n=all);
- fine-grained in resolution and uniquely indexical in identification;
- relational in nature, containing common fields that enable the conjoining of different data sets;
- flexible, holding the traits of extensionality (can add new fields easily) and scalability (can expand in size rapidly). (boyd and Crawford 2012; Dodge and Kitchin 2005; Laney 2001; Marz and Warren 2012; Mayer-Schonberger and Cukier 2013; Zikopoulos et al., 2012) [6]
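The relational trait in the list above, where a common field allows different data sets to be conjoined, can be sketched in Python as a simple join. The helper `join_on`, the field name `client_id` and the records themselves are made up for illustration, loosely following the earlier medical and insurance example; they do not come from the cited sources.

```python
def join_on(left, right, key):
    """Combine rows from two data sets wherever they share the same `key` value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [
        {**l_row, **r_row}
        for l_row in left
        if l_row[key] in index
        for r_row in index[l_row[key]]
    ]


# Two hypothetical data sets held by different actors, sharing one field.
medical = [
    {"client_id": 1, "condition": "asthma"},
    {"client_id": 2, "condition": "none"},
]
policies = [
    {"client_id": 1, "premium_band": "B"},
    {"client_id": 3, "premium_band": "A"},
]

joined = join_on(medical, policies, "client_id")
print(joined)
# [{'client_id': 1, 'condition': 'asthma', 'premium_band': 'B'}]
```

Only client 1 appears in both data sets, so only that record is conjoined; without the shared field the two sets could not be linked at all.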

Open Data
References
- Oliveira, Marcelo Iury S.; Lóscio, Bernadette Farias (2018-05-30). "What is a data ecosystem?". Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age. dg.o '18. New York, NY, USA: Association for Computing Machinery: 1–9. doi:10.1145/3209281.3209335. ISBN 978-1-4503-6526-0.
- Abdulla, Ahmed (March 8, 2021). "Data ecosystems made simple". McKinsey Digital.
- Kitchin, Rob (2022). The data revolution: a critical analysis of big data, open data & data infrastructures (2nd ed.). Los Angeles, CA: Sage Publications Ltd. ISBN 978-1-5297-3375-4. OCLC 1285687714.
- Vaughan, Jack (July 2019). "data". TechTarget.
- Massey, Joe (August 18, 2022). "Data Institutions". Open Data Institute. Retrieved November 20, 2022.
- Kitchin, Rob; Lauriault, Tracey P. (2014-07-27). Towards critical data studies: Charting and unpacking data assemblages and their work. The Programmable City Working Paper 2. Programmable City. OCLC 1291151213.
- Kitchin, Rob (2022). The data revolution: a critical analysis of big data, open data & data infrastructures (2nd ed.). Los Angeles, CA: Sage Publications Ltd. ISBN 978-1-5297-3375-4. OCLC 1285687714.
- Cui, Yesheng; Kara, Sami; Chan, Ka C. (April 2020). "Manufacturing big data ecosystem: A systematic literature review". Robotics and Computer-Integrated Manufacturing. 62: 101861. doi:10.1016/j.rcim.2019.101861. ISSN 0736-5845.