Acknowledgements
This is an advising report prepared for the course “COM481 - Online Communities” (Autumn 2024) at the University of Washington. It addresses the current considerations surrounding acceptable uses of AI, or “neural network language models”, on Wikipedia. Further information about these considerations can be found in the original article[1].
The points discussed below were evaluated with respect to the missions of the Wikimedia Foundation and of Wikipedia as a community- and volunteer-based platform, one where the people who contribute are valued and empowered through their efforts to disseminate educational, encyclopedic content effectively and internationally.
Advising Report: Potential Use of AI on Wikipedia
The use of AI tools has seen a noticeable spike since these tools became publicly available[2]. Considering this widespread utilization across many diverse settings[3], I can see why a global online platform such as Wikipedia would consider integrating AI into its inner workings: these systems are, after all, built to work with large amounts of data!
On the other hand, that very same reason is why we should be cautious and intentional about how much we let AI do to Wikipedia content, because there is so much at stake. I agree that each proposed AI contribution needs a thorough assessment to determine whether it adds value or does damage.
Technical shortcomings
Wikipedia has been subject to scrutiny over its credibility as a platform since its early days[4], so it is interesting to see how normalized AI systems have become, considering that their training data consists of (almost) everything that is out there[5], yet also much that is not, or rather a mix of both, often with no way to trace sources or authors. A language model cannot decline to respond when given a prompt[5], and because of this, it also cannot guarantee credibility[6].
Therefore, the use of AI to generate ready-to-publish Wikipedia content is, once again, unacceptable. Below are two examples we will examine more closely to see why.
Translation: Translation was proposed as a potential use, with the requirement that the output be reviewed by a human before publishing, and I would agree for the most part. However, this use is limited by the fact that AI training data is not refreshed fast enough to keep up with new developments; for example, GPT-4o mini’s training data only extends to October 2023, so it is not current. Likewise, if the text is predominantly cultural and the target language’s training data does not yet contain the relevant terminology or concepts, the translation will not be dependable. AI translations are also often too literal to capture the emotional register of an artistic text. Similar dynamics apply to all the proposals made, such as copyediting.
Planning an article: This proposal shares the time-related concerns of the previous one. While it may be useful in most cases to create a custom framework for constructing or expanding an article, it will not be of much use when asked about the scope of a very recent event, or what is relevant to that event. Still, of all the proposals made, this might be the most promising in terms of risk, since an outline never reaches readers directly.
Overall, no matter what use we find for AI, I believe human supervision and approval should be what finalizes things, at least until (and if) we all agree that AI is credible enough. I would not permit AI any autonomy, especially for tasks beyond copyediting, and even there I agree with exercising due diligence[1]. Alternatively, the safest approach would be to use AI only to moderate and troubleshoot, without taking action on articles unless a Wikipedian gives approval or real-time input (which could also help preserve or increase user participation rates[7]).
Community
Speaking of which, we should also consider: what about the community? When implementing any kind of new system or regulation, it is imperative to take into account the people affected by the change. As a reminder, we do not want to replace humans; we seek to assist them through this implementation.
If we consider the cost and benefit of time once more, AI automation will technically shorten the time needed to assess an incomplete article’s needs, but if a human can still only review at the same rate as before, this will simply create a new backlog of articles flagged as "needs verification/approval". Are there enough existing, active volunteers to balance that output? Probably not! This is why it is good to be a little cautious before getting too excited about automation; a rough sketch of the arithmetic follows below.
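As a minimal sketch of that bottleneck (the daily rates here are invented for illustration, not measured Wikipedia figures), the review backlog grows linearly for as long as AI flagging outpaces human review:

```python
# Back-of-the-envelope backlog model. Both rates are hypothetical
# assumptions for illustration only.

ai_flagged_per_day = 500     # articles AI flags for fixes each day (assumed)
human_reviews_per_day = 200  # articles volunteers can sign off each day (assumed)

backlog = 0
for day in range(30):
    backlog += ai_flagged_per_day - human_reviews_per_day

print(f"After 30 days: {backlog} articles stuck awaiting human approval")
# -> 9000: the queue grows without bound while flagging outpaces review capacity.
```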
Additionally, people can be very particular about how they like to do things and about whether a task is worth their commitment[7], and AI use can make some people feel that they are no longer needed, or that they have been entered into a competition they never signed up for[7]. It is not hard to foresee this leading to a further decrease in user participation, which is already lower than desired. In other words, AI utilization might erode the intrinsic incentives of human authors and editors even further.
Even if contributors welcomed AI use, would automation necessarily motivate them to participate? Not necessarily. Might it instead become more of a chore and an annoyance to maintain after the fact? Highly likely. If AI performed every copyedit automatically, it might not only threaten the role of talk pages but also revive the early-2000s Scunthorpe problem (think "clbuttic", the result of a filter naively replacing "ass" with "butt" inside "classic"). The feedback and discussion dynamics that Wikipedians engage in through talk pages are what make the process special: it is "human", with humans in mind. If things were automated, how could those talk pages be used as much as they are now?
In short, participant demotivation[7] is already a serious threat to Wikipedia's well-being as a volunteer-based platform that depends on human contributions, and a careless implementation of AI tools just might drive most of those contributors away.
Potential uses for AI
On the bright side, there might be *some* sweet middle ground for certain uses of AI.
Let's think about Grammarly. A similar plug-in could be programmed specifically to comply with Wikipedia's rules and style, offering users grammar and tone suggestions, rather than a bot that auto-edits articles. The upside is that this not only preserves and respects the autonomy of authors and editors, it could also play a big part in minimizing the chaotic, unordered edits often made by newcomers[7]; a sketch of this suggestion-only design is shown below.
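As a minimal sketch of that suggestion-only behavior (the two rules are simplified stand-ins for Wikipedia's Manual of Style guidance on contractions and first-person voice, and every name here is hypothetical), the checker returns advice but never rewrites the text:

```python
import re

# Hypothetical suggestion-only style checker. A real plug-in would need
# the full Manual of Style; these two rules are illustrative stand-ins.

STYLE_RULES = [
    (re.compile(r"\b\w+n't\b"),
     "avoid contractions in article prose; spell the words out"),
    (re.compile(r"\b(I|[Ww]e|[Oo]ur)\b"),
     "avoid first-person voice; rephrase neutrally"),
]

def suggest(text: str) -> list[str]:
    """Return advice for the editor; the article text is never modified."""
    notes = []
    for pattern, advice in STYLE_RULES:
        for match in pattern.finditer(text):
            notes.append(f"'{match.group()}': {advice}")
    return notes

for note in suggest("We can't verify this claim."):
    print(note)
# 'can't': avoid contractions in article prose; spell the words out
# 'We': avoid first-person voice; rephrase neutrally
```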
At the article level, I would suggest implementing special disclaimers/tags that mark articles currently flagged by AI, carrying critical AI-suggested edits, written with AI assistance, and so on. These would be distinct from the current Wikipedia article tags; to differentiate the two, AI-indicating badges could be placed within those tags (similar to article-class badges). This would create a sense of "informed consent" for readers of AI-involved material, and a means of moderation for Wikipedians; a small sketch of such a labeling scheme follows.
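As a minimal sketch of the labeling idea (the level names and banner wording are invented for illustration and are not existing Wikipedia templates), each article would carry at most a few explicit AI-involvement labels:

```python
from enum import Enum

# Hypothetical AI-involvement labels; wording invented for illustration.
class AIInvolvement(Enum):
    FLAGGED_BY_AI = "This article was flagged for attention by an AI tool."
    AI_EDITS_PENDING = "This article has AI-suggested edits awaiting human review."
    AI_ASSISTED = "Parts of this article were written with AI assistance."

def banner(level: AIInvolvement) -> str:
    """Render a maintenance-style banner carrying an 'AI' badge, by analogy
    with the class badges shown inside existing article tags."""
    return f"[AI] {level.value}"

print(banner(AIInvolvement.AI_ASSISTED))
# [AI] Parts of this article were written with AI assistance.
```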
Finally, if we really want AI to come close to having automated article-edit permissions, I can offer the idea of "AI sandbox" tabs for previewing AI-suggested edits pending approval. If implemented, an AI sandbox tab would be created automatically for each article, much like its talk page, and the edits would be published live only upon a Wikipedian's review and approval, as sketched below.
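As a minimal sketch of that approval flow (all class and method names are invented for illustration), AI proposals are staged per article, and nothing reaches the live text without a human decision:

```python
from dataclasses import dataclass, field

@dataclass
class AISandbox:
    """Per-article staging area, analogous to a talk page. Hypothetical design."""
    article_title: str
    live_text: str
    pending: list[str] = field(default_factory=list)  # AI-proposed revisions

    def propose(self, revised_text: str) -> None:
        """AI may only stage a revision; it cannot publish."""
        self.pending.append(revised_text)

    def review(self, index: int, approve: bool, reviewer: str) -> None:
        """A Wikipedian's decision is the only path to the live article."""
        proposal = self.pending.pop(index)
        if approve:
            self.live_text = proposal
            print(f"Published after review by {reviewer}.")
        else:
            print(f"Rejected by {reviewer}; live text unchanged.")

sandbox = AISandbox("Example article", "Original text.")
sandbox.propose("Original text, copyedited by AI.")
sandbox.review(0, approve=True, reviewer="SomeWikipedian")
```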
References
1. "Wikipedia:Using neural network language models on Wikipedia", Wikipedia, 2024-06-04. Retrieved 2024-11-10.
2. U.S. Government Accountability Office (2024-06-06). "Artificial Intelligence's Use and Rapid Growth Highlight Its Possibilities and Perils". www.gao.gov. Retrieved 2024-11-10.
3. "Applications of artificial intelligence (AI)". Google Cloud. Retrieved 2024-11-10.
4. "Reliability of Wikipedia", Wikipedia, 2024-11-05. Retrieved 2024-11-11.
5. Coldewey, Devin (2024-06-01). "WTF is AI?". TechCrunch. Retrieved 2024-11-10.
6. Shaw, Ben. "Artificial Intelligence (AI) and Information Literacy: What does AI get wrong?". Research Guides, lib.guides.umd.edu. Retrieved 2024-11-10.
7. Kraut, Robert E.; Resnick, Paul; Kiesler, Sara; Burke, Moira; Chen, Yan; Kittur, Niki; Konstan, Joseph; Ren, Yuqing; Riedl, John (2011). Building Successful Online Communities: Evidence-Based Social Design. The MIT Press. ISBN 978-0-262-01657-5.