User:Gbg06/Big data ethics
Article Draft
Regulation and relevant laws
Several relevant laws and acts define individuals' right to control how their data is collected and used. In the United States, acts such as the Health Insurance Portability and Accountability Act (HIPAA) and the Fair Credit Reporting Act (FCRA) are intended to protect the privacy of personal health data and credit data, respectively. These acts are intended to provide safeguards for sensitive data, particularly for marginalized populations who may have limited knowledge or resources to protect themselves. The FCRA has helped individuals dispute inaccurate credit data[1] and has helped reduce unfair or predatory lending practices, protections that can be especially critical for lower-income individuals who are more vulnerable to system errors and predatory lending. However, neither HIPAA nor FCRA addresses ethical algorithm design, so an algorithm can comply with both laws while still perpetuating biases introduced during data collection.
Transaction transparency
Concerns have been raised about how biases, whether conscious or unconscious, can be built into algorithm design and result in systematic oppression.[2] Such biases often stem from the data itself, the design of the algorithm, or the underlying goals of the organization deploying it. One major cause of algorithmic bias is that algorithms learn from historical data, which may perpetuate existing inequities. In many cases, algorithms exhibit reduced accuracy when applied to individuals from marginalized or underrepresented communities. A notable example is pulse oximetry, which has shown reduced reliability for certain demographic groups because those populations were insufficiently tested or represented in the underlying data.[3] Additionally, many algorithms are designed to maximize specific metrics, such as engagement or profit, without adequately considering ethical implications. For instance, companies like Facebook and Twitter have been criticized for providing anonymity to harassers and for allowing racist content disguised as humor to proliferate, as such content often increases engagement.[4] These challenges are compounded by the fact that many algorithms operate as "black boxes" for proprietary reasons, meaning that the reasoning behind their outputs is not fully understood by users. This opacity makes it more difficult to identify and address algorithmic bias.
In terms of governance, big data ethics is concerned with which types of inferences and predictions should be made using big data technologies such as algorithms.[5]
Privacy
Privacy has been presented as a limitation on data usage, a restriction that could itself be considered unethical.[6] For example, the sharing of healthcare data can shed light on the causes of diseases and the effects of treatments, and can allow for analyses tailored to individuals' needs.[6] This is of ethical significance in the big data ethics field because, while many people value privacy, the affordances of data sharing are also quite valuable, even though they may contradict one's conception of privacy. Attitudes against data sharing may be based on a perceived loss of control over data and a fear of the exploitation of personal data.[6] However, it is possible to extract the value of data without compromising privacy.
Government surveillance has the potential to undermine individual privacy by collecting and storing data on phone calls, internet activity, and geolocation, among other things. For example, the NSA's collection of metadata, exposed in the global surveillance disclosures, raised concerns about whether privacy was adequately protected even when the content of communications was not analyzed. The right to privacy is often complicated by legal frameworks that grant governments broad authority over data collection for "national security" purposes. In the United States, the Supreme Court has not recognized a general right to "informational privacy," or control over personal information, though legislators have addressed the issue selectively through specific statutes.[7] From an equity perspective, government surveillance and privacy violations tend to disproportionately harm marginalized communities. Historically, activists involved in the civil rights movement were frequently targets of government surveillance because they were perceived as subversive. Programs such as COINTELPRO exemplified this pattern, involving espionage against civil rights leaders. The pattern persists today, with evidence of ongoing surveillance of activists and organizations.[8]
Additionally, the use of algorithms by governments to act on data obtained without consent introduces significant concerns about algorithmic bias. Predictive policing tools, for example, use historical crime data to predict "risky" areas or individuals, but these tools have been shown to disproportionately target minority communities.[9] The COMPAS system is a notable example: Black defendants are twice as likely as white defendants to be misclassified as high risk, and Hispanic defendants are likewise more likely to be classified as high risk than their white counterparts.[10] Marginalized communities often lack the resources or education needed to challenge these privacy violations or protect their data from nonconsensual use. There is also a psychological toll, known as the "chilling effect": the constant awareness of being surveilled disproportionately affects communities already facing societal discrimination and can deter individuals from engaging in legal but potentially "risky" activities, such as protesting or seeking legal assistance, further limiting their freedoms and exacerbating existing inequities.
Some scholars, such as Jonathan H. King and Neil M. Richards, are redefining the traditional meaning of privacy, while others question whether privacy still exists.[5] In a 2014 article for the Wake Forest Law Review, King and Richards argue that privacy in the digital age can be understood not in terms of secrecy but in terms of regulations that govern and control the use of personal information.[5] In the European Union, the right to be forgotten entitles EU countries to force the removal or de-linking of personal data from databases at an individual's request if the information is deemed irrelevant or out of date.[11] According to Andrew Hoskins, this law demonstrates the moral panic of EU members over the perceived loss of privacy and the ability to govern personal data in the digital age.[12] In the United States, citizens have the right to delete voluntarily submitted data.[11] This differs from the right to be forgotten because much of the data produced using big data technologies and platforms is not voluntarily submitted.[11] These differing legal frameworks illustrate how the EU and the US are grappling with privacy concerns and regulation in the context of big data.[13]
Examples of ethical violations
Snowden disclosures
The fallout from Edward Snowden's disclosures in 2013 significantly reshaped public discourse around data collection and the privacy principle of big data ethics. The case revealed that governments controlled and possessed far more information about civilians than previously understood, violating the principle of ownership, particularly in ways that disproportionately affected disadvantaged communities. For instance, activists were frequently targeted, including members of movements such as Occupy Wall Street and Black Lives Matter.[11] These revelations prompted governments and organizations to revisit data collection and storage practices to better protect individual privacy while also addressing national security concerns. The case also exposed widespread online surveillance of other countries and their citizens, raising important questions about data sovereignty and ownership. In response, some countries, such as Brazil and Germany, took action to push back against these practices.[11] However, many developing nations lacked the technological independence necessary to resist such surveillance, leaving them at a disadvantage in addressing these concerns.[11]
Cambridge Analytica scandal
The Cambridge Analytica scandal highlighted significant ethical concerns in the use of big data. Data was harvested from approximately 87 million Facebook users without their explicit consent and used to display targeted political advertisements. This violated the currency principle of big data ethics, as individuals were initially unaware of how their data was being exploited. The scandal showed how data collected for one purpose can be repurposed for entirely different uses, bypassing users' consent and underscoring the need for explicit and informed consent in data usage.[14] Additionally, the algorithms used for ad delivery were opaque, challenging the principles of transaction transparency and openness. In some cases, the political ads spread misinformation,[15] often disproportionately targeting disadvantaged groups and contributing to knowledge gaps. Marginalized communities and individuals with lower digital literacy were disproportionately affected, as they were less likely to recognize or act against exploitation. In contrast, users with more resources or digital literacy could better safeguard their data, exacerbating existing power imbalances.
References
[edit]- ^ "Section 319 of the Fair and Accurate Credit Transactions Act of 2003: Fifth interim report to Congress" (PDF). Federal Trade Commission. Federal Trade Commission. Retrieved 11 December 2024.
- ^ O'Neil, Cathy (2016). Weapons of Math Destruction. Crown Books. ISBN 978-0553418811.
- ^ Buolamwini, Joy; Gebru, Timnit (2018). "Gender shades: Intersectional accuracy disparities in commercial gender classification" (PDF). Proceedings of the Conference on Fairness, Accountability, and Transparency. 81: 1–15. Retrieved 11 December 2024.
- ^ Farkas, Johan; Matamoros-Fernandez, Ariadna (22 January 2021). "Racism, Hate Speech, and Social Media: A Systematic Review and Critique". Television & New Media. 22 (2): 205–224. Retrieved 11 December 2024.
- ^ a b c Richards, Neil M.; King, Jonathan H. (2014). "Big Data Ethics". Wake Forest Law Review. 49: 393.
- ^ a b c Kostkova, Patty; Brewer, Helen; de Lusignan, Simon; Fottrell, Edward; Goldacre, Ben; Hart, Graham; Koczan, Phil; Knight, Peter; Marsolier, Corinne; McKendry, Rachel A.; Ross, Emma; Sasse, Angela; Sullivan, Ralph; Chaytor, Sarah; Stevenson, Olivia; Velho, Raquel; Tooke, John (17 February 2016). "Who Owns the Data? Open Data for Healthcare". Frontiers in Public Health. 4: 7. doi:10.3389/fpubh.2016.00007. PMC 4756607. PMID 26925395.
- ^ Gellman, Barton; Adler-Bell, Sam. "The Disparate Impact of Surveillance". The Century Foundation. Retrieved 11 December 2024.
- ^ Von Solms, Sune; Van Heerden, Renier (2015). "The consequences of Edward Snowden NSA related information disclosures". Proceedings of the 10th International Conference on Cyber Warfare and Security, ICCWS 2015: 358–368. Retrieved 11 December 2024.
- ^ Larson, Jeff; Mattu, Surya; Kirchner, Lauren; Angwin, Julia. "How We Analyzed the COMPAS Recidivism Algorithm". ProPublica. Retrieved 11 December 2024.
- ^ Hamilton, Melissa (2019). "The biased algorithm: Evidence of disparate impact on Hispanics" (PDF). American Criminal Law Review. 56 (4).
- ^ a b c d e f Walker, R. K. (2012). "The Right to be Forgotten". Hastings Law Journal. 64: 257–261.
- ^ Hoskins, Andrew (November 4, 2014). "Digital Memory Studies". memorystudies-frankfurt.com. Retrieved 28 November 2017.
- ^ "ERRATUM". Ethics & Human Research. 44 (1): 17. January 2022. doi:10.1002/eahr.500113. ISSN 2578-2355. PMID 34910377.
- ^ Isaak, Jim; Hanna, Mina J. (14 August 2018). "User Data Privacy: Facebook, Cambridge Analytica, and Privacy Protection". Computer. 51 (8): 56–59.
- ^ Cite error: The named reference :14 was invoked but never defined (see the help page).