Wikipedia talk:Large language models/Archive 7
This is an archive of past discussions on Wikipedia:Large language models. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
LLM Experiment - Sources and unsourced information helper
I conducted an experiment to see if LLMs (Chat-GPT4) could identify sourced and unsourced information in a Wikipedia article. The experiment can be viewed step-by-step here. The Weather Event Writer (Talk Page) 03:05, 21 September 2023 (UTC)
- Hello WeatherWriter and thanks for sharing your experiment. I think there is some potential in using LLMs to help with verification. One note regarding your experiment: the amount of history of which ChatGPT is aware is limited. This is the case even if you tell it to "Keep track of this source". So as the chat grows longer, it will forget the earlier steps in the exchange, which contain the text of the sources. But it does not tell you this and may attempt to answer your question nonetheless, probably with hallucinations. This issue becomes more serious with very long sources or very many sources. One way to solve this problem is to restrict oneself to the verification of one claim and one source at a time and start a new chat for each new claim/source. Another issue is that your process only tells you which sources verify a claim but does not cite the passage that verifies it. So you would have to trust that it is not a hallucination instead of being able to check it for yourself.
- For a user script that implements a similar idea, see User:Phlsph7/SourceVerificationAIAssistant. Phlsph7 (talk) 08:17, 21 September 2023 (UTC)
- Great work! I did some follow-up work with the 32k version of GPT-4. I could fit the texts of all of the RS in and pose a single prompt to find unsupported material. See User_talk:WeatherWriter/LLM_Experiment_1 for more details. Nowa (talk) 07:38, 22 September 2023 (UTC)
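(A minimal sketch of the one-claim-per-chat verification workflow described above, for anyone who wants to script it rather than work in the chat interface. It assumes the official openai Python package; the model name and prompt wording are placeholder assumptions, not a tested tool, and any quoted passage it returns still has to be checked against the source by a human.)

# Hypothetical sketch: verify one claim against one source per fresh chat,
# so the source text never falls out of the model's context window.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Claim:\n{claim}\n\n"
    "Source text:\n{source}\n\n"
    "Does the source verify the claim? Answer YES or NO, then quote the exact "
    "passage that supports your answer."
)

def verify_claim(claim, source_text):
    # Each call is a brand-new conversation: one claim, one source, no history.
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": PROMPT.format(claim=claim, source=source_text)}],
    )
    return response.choices[0].message.content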
"Wikipedia:CHATGPT" listed at Redirects for discussion
The redirect Wikipedia:CHATGPT has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 October 2 § Wikipedia:CHATGPT until a consensus is reached. - CHAMPION (talk) (contributions) (logs) 10:22, 2 October 2023 (UTC)
Information page
Based on the discussion above, #RfC: Is this proposal ready to be promoted?, it is clear that this page will not be promoted either to policy or to guideline. However, the RfC didn't really specify what would happen to the page if it isn't promoted. In the absence of such clarity, I am considering swapping out the {{Proposal}} template with {{Information page}}. (An information page is a kind of project-space page that "intends to describe some aspect(s) of Wikipedia's norms, customs, technicalities, or practices", and it doesn't require much consensus for a page to be labelled as such.) Would there be any objection to that? It would be a softer template than something like {{failed}}, and I'm not sure if we have enough consensus to use {{supplement}}. Mz7 (talk) 03:43, 7 October 2023 (UTC)
- I would object because this page doesn't describe any aspect of Wikipedia's norms, customs, etc. I would suggest it be labeled an essay (not a failed proposal), because it conveys the opinion of some editors on an issue relating to Wikipedia, and that's what an essay is. Levivich (talk) 05:34, 7 October 2023 (UTC)
- I have no objection to {{essay}}. Mz7 (talk) 19:53, 7 October 2023 (UTC)
Done: switched {{proposal}} to {{essay}} in the absence of objections. [1] The "essay" label doesn't really require much consensus to use, anyway, and as indicated above, it doesn't seem like there is consensus to use a different label for this page. Mz7 (talk) 21:37, 9 October 2023 (UTC)
- Essay should be fine. Cheers, · · · Peter Southwood (talk): 12:11, 12 October 2023 (UTC)
- I would prefer {{failed proposal}} to essay as long as the page remains substantially unchanged, since it was drafted as policy, is worded like one, and risks being brought up as semi-authoritative in the same way many essays are. If someone decides to substantially rewrite it into a more typical essay format, no objection to the essay tag being brought back. DFlhb (talk) 19:54, 14 October 2023 (UTC)
- I took a first step to make it more essay-like by using less authoritative language. I tried to keep the changes to a minimum: most changes involve replacing expressions like "must" with "should". Phlsph7 (talk) 07:52, 15 October 2023 (UTC)
Paper
Good paper on the topic of LLM-assisted writing and the kind of tools we might like to create, if there are any software folk here. https://arxiv.org/pdf/2309.15337.pdf Talpedia 14:56, 9 November 2023 (UTC)
- Copyright on that seems to be 2018, btw. - Dank (push to talk) 15:07, 9 November 2023 (UTC)
- Hmm... odd the paper says "7 Sep 2023" in the margin and is using gpt-4 (which was released this year). Talpedia 18:02, 9 November 2023 (UTC)
AI being used by reliable sources
@ActivelyDisinterested has suggested a list in the Wikipedia: project namespace to collect notes on sources that we have considered reliable in the past, but which seem to be using LLM-generated content. Please see Wikipedia talk:Reliable sources#Use of AI content generation by previously reliable sources if you're interested in this subject and/or would like to help compile the list. WhatamIdoing (talk) 22:29, 1 December 2023 (UTC)
Discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:OpenAI
You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:OpenAI. –Novem Linguae (talk) 05:32, 4 December 2023 (UTC)
Discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:AI-generated notification
You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:AI-generated notification. –Novem Linguae (talk) 05:32, 4 December 2023 (UTC)
Drafts for 15,616 genes, only 9% wrong
- As a demonstration of the potential for AI to impact scientific practice, we use WikiCrow to generate draft articles for the 15,616 human protein-coding genes that currently lack Wikipedia articles, or that have article stubs. WikiCrow creates articles in 8 minutes, is much more consistent than human editors at citing its sources, and makes incorrect inferences or statements about 9% of the time, a number that we expect to improve as we mature our systems.
https://www.futurehouse.org/wikicrow
Welp. Sandizer (talk) 18:41, 13 December 2023 (UTC)
Estimating risk
Raw ChatGPT assist
Question to ChatGPT: "write a list of activities that can be assisted by LLMs, to write encyclopedias, in order of riskiness". ChatGPT's response appeared to have errors, e.g. drafting is likely more risky than language refinement, but ChatGPT said the opposite. "Assisting in the generation of citations and references" is almost hallucinatory? ChatGPT response:
Sure, it's important to note that language models like GPT-3.5, often referred to as large language models (LLMs), can be powerful tools but may not always provide accurate or up-to-date information. The riskiness of using LLMs for certain activities can vary, and it's crucial to verify the information obtained from these models. Here's a list of activities related to writing encyclopedias, ordered from lower to higher perceived risk:
- Gathering Basic Information: Summarizing established facts and information. Compiling lists of well-known data.
- Creating Outlines: Generating initial outlines for encyclopedia articles. Providing structure and key points for various topics.
- Generating Drafts: Assisting in drafting sections of encyclopedia entries. Offering ideas and context for historical events or scientific concepts.
- Language Refinement: Improving the readability and coherence of drafted content. Suggesting alternative phrasing for clarity.
- Explaining Concepts: Providing explanations for complex concepts in simple terms. Offering background information on specific subjects.
- Citing Sources and References: Assisting in the generation of citations and references. Helping with proper attribution of information.
- Fact-Checking: Cross-referencing information with reliable sources. Verifying the accuracy of historical or scientific details.
- In-Depth Research Assistance: Assisting in more comprehensive research on specific topics. Offering insights into less well-known or niche subjects.
- Handling Controversial Topics: Providing information on controversial issues with caution. Giving balanced perspectives and highlighting multiple viewpoints.
- Ethical Considerations: Assessing and addressing ethical concerns in content creation. Ensuring that information presented is unbiased and fair.
- Up-to-Date Information: Verifying the recency of information to ensure it is current. Handling topics that may evolve rapidly with care.
- Legal and Sensitivity Issues: Avoiding the generation of content that may violate laws or ethical guidelines. Exercising caution when dealing with sensitive or potentially harmful topics.
Remember that while LLMs can be valuable tools, they are not infallible, and human oversight is essential. Critical evaluation, fact-checking, and ensuring compliance with ethical and legal standards should always be part of the content creation process.
Corrections are welcome. The list below, bot-assisted but revised by a human, estimates riskiness from least to most.
- For talk pages
- Gathering Basic Information. Summarizing established facts and information. Compiling lists of well-known data.
- Generating outlines for articles. Providing structure and key points.
- Offering ideas and context for historical events or scientific concepts.
- In-Depth Research Assistance: Assisting in more comprehensive research on specific topics. Offering insights into less well-known or niche subjects.
- Main space
- Spell-checking
- Language Refinement: Improving readability and coherence of drafted content. Suggesting alternative phrasing for clarity. Copyedit, reformulate, simplify.
- Explaining Concepts: Providing explanations for complex concepts in simple terms. Offering background information on specific subjects.
- Suggest images
- Suggest wikilinks
- Generating summaries e.g. lead summaries or other summaries
- Suggest expansion
- Higher risk:
- Generating Drafts: Assisting in drafting sections of entries.
- Fact-Checking: Cross-referencing information with reliable sources. Verifying the accuracy of historical or scientific details.
- Up-to-Date Information: Verifying the recency of information to ensure it is current. Handling topics that may evolve rapidly with care.
- Handling Controversial Topics: Providing information on controversial issues with caution. Giving balanced perspectives and highlighting multiple viewpoints.
- Citing Sources and References: Assisting in the generation of citations and references. Helping with proper attribution of information.
- Ethical Considerations: Assessing and addressing ethical concerns in content creation. Ensuring that information presented is unbiased and fair.
- Legal and Sensitivity Issues
Tom B (talk) 16:23, 20 December 2023 (UTC)
Discussion at Wikipedia talk:WikiProject AI Cleanup § Proposal: adopting WP:LLM as this WikiProject's WP:ADVICEPAGE
You are invited to join the discussion at Wikipedia talk:WikiProject AI Cleanup § Proposal: adopting WP:LLM as this WikiProject's WP:ADVICEPAGE. QueenofHearts 21:37, 9 January 2024 (UTC)
Someone implemented a full-on article generator, and Anthropic gave it an award
Blog post description, GitHub repo, based on this Stanford work, which also has a repo, and a live working demo.
Have people noticed those kinds of articles? The outline structure is more distinctive and flamboyant than we usually see from human editors. 141.239.252.245 (talk) 05:43, 26 April 2024 (UTC)
- Just saw this. Taking a look now. Thanks for posting. Nowa (talk) 12:07, 9 November 2024 (UTC)
- I took a look at the tool and, in its current state, it is unsuitable for drafting Wikipedia articles. The big problem is that the text in the drafted articles is not necessarily supported by the cited references. Here is an example. See my comments at the bottom of the article. Another problem is that the LLM engages in wp:synth, where it takes two facts stated in the references and extrapolates a conclusion from them. You can see an example of synth here. Again, scroll to my comments at the bottom. Nowa (talk) 23:24, 10 November 2024 (UTC)
- Using the app a bit more, it's actually a helpful "search engine". The references are relevant to the topic of concern. So it might be useful for getting references for a Wikipedia article. Here is an example. Nowa (talk) 21:48, 11 November 2024 (UTC)
Opening sentence
There seems to be something wrong with the opening sentence. I can't parse it:
- "While large language models (often known as "chatbots") are very useful, machine-generated text (like human-generated) often contains errors, is useless, whilst seeming accurate."
Am I missing something? Or does it need editing? AndyJones (talk) 12:35, 28 August 2024 (UTC)
- Yeah it was made not-very-sensical in Special:Diff/1230068490
—Alalch E. 14:36, 28 August 2024 (UTC)
− While [[large language model]]s (colloquially termed "AI chatbots" in some contexts) can be very useful, machine-generated text (much like human-generated text) can contain errors or flaws, or be outright useless.
+ While [[large language model]]s (often known as "chatbots") are very useful, machine-generated text (like human-generated) often contains errors, is useless, whilst seeming accurate.
- I have reverted the above mentioned change. —Alalch E. 21:37, 29 August 2024 (UTC)
Ban it. Full stop.
I was shocked and horrified to see a banner on an article announcing that it may contain "hallucinations" due to use of LLM content. A more accurate word for that template would be "lies", "fabrications", or "misinformation". So I would like that change to be made. But even more so, I firmly believe that using LLM-generated content goes directly against the ethos of the Wikipedia project and as such, that material needs to be completely, unambiguously banned from addition. Where can we go about enacting such a policy? Matt Gies (talk) 15:08, 2 November 2024 (UTC)
- Wikipedia:Village pump (policy) is the place to do it. Make sure you read the previous discussions about this topic before, or you'll get nowhere. The main ones are linked at the top of this talk page. Tercer (talk) 15:39, 2 November 2024 (UTC)
Using Claude 3.5 to summarize reference content for Wikipedia
I've been evaluating some of the more improved LLMs to see if they do a better job of summarizing reference content for Wikipedia without undue copyright infringement. Claude 3.5 seems to do a reasonably good job. See Feral_pig#Canada. I fed the content of a National Geographic article into Claude and asked it to summarize the content in Wikipedia format, including the citation.
To check for copyvio, I used Microsoft Word's "compare" function to see how the wiki draft compared to the original reference. I posted the results of the comparison on my google drive here.
As far as I can tell:
- The wiki draft content looks faithful to the reference.
- There are no hallucinations.
- The longest extracted phrase from the original text was "...throughout western and central Canada, from British Columbia to Manitoba.."
- There is no wp:synth.
Did I miss anything? Is there a more challenging task we should give it? Nowa (talk) 12:38, 21 November 2024 (UTC)
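(A rough script-level stand-in for the Word "compare" check above, for anyone without Word: it reports the longest run of words the draft shares verbatim with the source, which is what the "longest extracted phrase" observation measures. difflib is in the Python standard library; the file names are placeholders.)

from difflib import SequenceMatcher

def longest_shared_phrase(source, draft):
    # Compare word sequences rather than characters to find verbatim runs.
    src_words = source.split()
    drf_words = draft.split()
    matcher = SequenceMatcher(None, src_words, drf_words, autojunk=False)
    match = matcher.find_longest_match(0, len(src_words), 0, len(drf_words))
    return " ".join(src_words[match.a:match.a + match.size])

# Placeholder file names for the reference text and the LLM draft.
source_text = open("original_article.txt", encoding="utf-8").read()
draft_text = open("llm_draft.txt", encoding="utf-8").read()
print(longest_shared_phrase(source_text, draft_text))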
- Simply put, I see no value in using these systems to "summarize" a citation. — The Hand That Feeds You:Bite 17:00, 21 November 2024 (UTC)
- Seems awfully like close paraphrasing anyway, when you change
- "In the late 1980s and early 1990s, some Canadian farmers imported wild boars from Europe to raise for meat. But as wild boars are wont to do, some of them escaped, either digging under fences or barreling through them. Others were set free once the boar meat market cooled. At first, it didn’t seem like a big problem; many thought they couldn’t survive Canada’s long winters. "
into
- "Canadian farmers began importing wild boars from Europe in the late 1980s and early 1990s for meat production. Some of these animals escaped from farms by digging under or breaking through fences, while others were intentionally released when the boar meat market declined. Initially dismissed as a minor concern due to assumptions about their inability to survive Canadian winters" Fram (talk) 17:18, 21 November 2024 (UTC)
@Fram Interesting. I thought I would see how the paraphrasing of a human-drafted new section based on a reference compares to an LLM-drafted new section based on the same reference. I went back to when the section "Attacks on Humans" was introduced into the feral pig article here. I then asked Claude to draft a two-sentence section based on the same NYT reference. Here's how the human-drafted and Claude-drafted sections compare:
NYT reference excerpts | Human-drafted section | Claude-drafted section
---|---|---
"Christine Rollins, 59, was feet away from the front door when she was killed. It was only the fifth documented fatal wild hog attack in the country since 1825... it just a few steps away from the front door... where she worked as a caretaker for an older couple. she was attacked by a herd of feral hogs... There were about 100 documented attacks by feral hogs on humans in the United States between 1825 and 2012" | "Feral pigs in the living in the United States have been known to attack and fatally injure human beings. A 59 year old Texas woman named Christine Rollins was attacked and killed only a few feet away from the front door of her workplace by a herd of feral pigs in the Texas town of Anahuac, which is 50 miles east of Houston. This incident was the fifth documented fatal feral pig attack in the United States since 1825." | "While feral hog attacks on humans are rare, with only 100 documented attacks in the United States between 1825 and 2012, they can be fatal as demonstrated by the 2019 death of Christine Rollins in Anahuac, Texas, who was killed by a herd of feral hogs outside a home where she worked as a caretaker. This was only the fifth documented fatal wild hog attack in the United States since 1825."
I agree that the Claude-drafted section is a closer paraphrase of the reference than the human-drafted section.
Any other comparisons worth noting? Nowa (talk) 16:29, 22 November 2024 (UTC)
- or we could get a person to do it Skeletons are the axiom (talk) 15:01, 31 January 2025 (UTC)
For an LLM tester
I wonder if this very boring task could be done by an LLM.
We now have Wikipedia:Missing Wikipedians#D and Wikipedia:Missing Wikipedians/2000–2009#D. (I've manually split the first three sections). The goal is to evaluate each line of wikitext. Any line that mentions *only* years before 2010 goes in one list. Any line that mentions any years after 2009 goes in another list. (Any line that does not mention any years should go in a third list, but I don't think there are any of these.)
Is anyone willing to try it for me? WhatamIdoing (talk) 02:25, 14 December 2024 (UTC)
- Wouldn't this be a better task for a script than an LLM? Since LLMs by design predict text based on processed input, rather than actually comprehending said input, isn't there every chance that plugging this wikitext into an LLM wouldn't actually reproduce the content but instead produce an invented approximation of it? (Like fake usernames 'prompted' by the real ones?) Hydrangeans (she/her | talk | edits) 18:48, 29 January 2025 (UTC)
- In this case, that should be obvious from the diff that you would (obviously, right?) check before posting the change.
- I didn't think it was worth someone writing a specialized script for one-time use if there were alternatives. If it could be done with grep, then so much the better. WhatamIdoing (talk) 18:53, 29 January 2025 (UTC)
- I tried doing the removal with ChatGPT, but messages are limited in length, so I had to manually copy and paste sections at a time, and ChatGPT quickly got overwhelmed as I reached the daily limit, so it switched to an older version, which didn't work so well and made a fair amount of errors. Ultimately I just copied and pasted the list into a text file, uploaded it into R as a tsv named Missing, ran Missing <- Missing[!grepl("^\\*", Missing$V1) | grepl("201[0-9]|202[0-5]", Missing$V1), ], then exported it, and copied and pasted the text back into the article. Photos of Japan (talk) 04:41, 30 January 2025 (UTC)
- a script would be better imo Skeletons are the axiom (talk) 15:05, 31 January 2025 (UTC)
- Doing it in R seems to have worked, with only a handful of false positives (all due to oddities in the text). The good news is that it's done for now, until we need to split off the next decade. WhatamIdoing (talk) 16:52, 31 January 2025 (UTC)
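(For future splits, here is a throwaway-script version of the sorting described above, which avoids the LLM context-length problems mentioned earlier: lines mentioning any year from 2010 on go to one list, lines whose years are all pre-2010 go to another, and lines with no year go to a third. The file name is a placeholder for wikitext saved locally.)

import re

recent, old, undated = [], [], []
with open("missing_wikipedians.txt", encoding="utf-8") as f:
    for line in f:
        # Pull out every four-digit year on the line.
        years = [int(y) for y in re.findall(r"\b(19\d{2}|20\d{2})\b", line)]
        if not years:
            undated.append(line)
        elif any(y >= 2010 for y in years):
            recent.append(line)
        else:
            old.append(line)

print(len(old), "pre-2010 lines;", len(recent), "lines with a 2010+ year;", len(undated), "undated lines")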
Copyright of LLM output
Generative AI output is widely held to be ineligible for copyright (except under UK and Hong Kong law), but just in case, the terms of use for ChatGPT and Claude contain provisions assigning any copyrights to outputs to the user, meaning that the user would have the authority to license those outputs under CC-BY-SA. Meta AI's terms contain no such provision. I suggest that editors who contribute AI-generated text to Wikipedia only use GenAI services that grant them copyright ownership of the output. Qzekrom (she/her • talk) 18:01, 27 December 2024 (UTC)
LLM-generated content
I've noticed links containing "utm_source=chatgpt.com" in some articles (for example in Lucid_Motors). It's quite likely that the content supported by these sources was written by LLMs. While there are currently only a few such articles, maybe it would make sense to create some kind of filter or alert? Alaexis¿question? 23:28, 28 December 2024 (UTC)
- If I understand it correctly, this is a UTM parameter to track where the user came from. For example, the article Lucid Motors gives one source as https://marketrealist.com/p/who-makes-lucid-motors-battery/?utm_source=chatgpt.com. If my interpretation is correct, this means that the user clicked on a link on chatgpt.com, which brought them to the website marketrealist.com. In that case, it doesn't say anything about whether the website marketrealist.com was created by an LLM. Phlsph7 (talk) 09:50, 29 December 2024 (UTC)
- Probably you're right. Alaexis¿question? 20:58, 30 December 2024 (UTC)
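(A tiny illustration of the point above: the tracking parameter can be stripped without changing which page the link points to. A standard-library sketch, using the URL from the example above:)

from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def strip_utm(url):
    # Drop any utm_* tracking parameters; everything else is left untouched.
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if not k.startswith("utm_")]
    return urlunparse(parts._replace(query=urlencode(query)))

print(strip_utm("https://marketrealist.com/p/who-makes-lucid-motors-battery/?utm_source=chatgpt.com"))
# -> https://marketrealist.com/p/who-makes-lucid-motors-battery/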
- A filter might still be useful as someone could be having ChatGPT writing a section for them and then citing the sources. E.g. "ChatGPT, write me a summary of X with sources", and then copying and pasting the ChatGPT summary along with the sources. Photos of Japan (talk) 21:56, 30 December 2024 (UTC)
- @Photos of Japan: to confirm that I'm accurately understanding what you're describing...
- Are you imagining a user experience where Wikipedia is able to detect when someone is copying and pasting content that was "written" by service likes ChatGPT, Claude, Gemini, etc.? If so, what can you imagine happening next?
- For context, I ask the above as a product manager at the Wikimedia Foundation who is helping to develop a system that is meant to offer the kind of "in-the-moment" guidance/feedback/awareness I understand you to be describing here. Further, we're in the midst of building a version of what we're calling "Paste Check" as I type this! In fact, as part of this work, we investigated whether we could reliably detect pastes from popular LLMs. PPelberg (WMF) (talk) 22:10, 27 January 2025 (UTC)
- @User:PPelberg (WMF) If a link includes "utm_source=chatgpt.com" then that means it was copied from ChatGPT, but that doesn't mean any other associated text added alongside the link was also copied from ChatGPT, but it makes it more likely that it was. I would propose a filter flagging such edits so that a user patrolling recent changes who is interested in reviewing content added by LLMs (such as potentially a member of the WP:WikiProject AI Cleanup) is able to review it. Photos of Japan (talk) 00:15, 28 January 2025 (UTC)
- @Photos of Japan I don't think it makes it especially more likely (as ChatGPT would most likely link to content it is familiar with from its training data, rather than content someone just created with ChatGPT). To clarify, "utm_source=chatgpt.com" isn't part of the address itself: https://en.wikipedia.org/wiki/Wikipedia_talk:Large_language_models and https://en.wikipedia.org/wiki/Wikipedia_talk:Large_language_models?utm_source=chatgpt.com are the exact same page. However, ChatGPT might not be ideal in terms of recommending links (as it hasn't necessarily read the content of the pages), and such a link is good evidence that the edit should be reviewed by a human. I'm pinging @Queen of Hearts (WPAIC's edit filter manager) on this one, and cross-posting this just in case. Chaotic Enby (talk · contribs) 11:22, 31 January 2025 (UTC)
- Here's a list of articles that currently have a "utm_source=chatgpt.com" link. Nobody (talk) 12:06, 31 January 2025 (UTC)
- Oof, that's a lot. Checking out the first one, here. The source chat found is about the topic, but doesn't support all of its assigned content (which presumably comes from somewhere else in its knowledge base), labelled the link as dead for some reason, and got the publisher wrong. CMD (talk) 12:20, 31 January 2025 (UTC)
- As someone who regularly uses large language models, but not for the purpose of contributing to any Wikimedia project, I have noticed that large language models can, and often do, pick unreliable sources. Google's AI overviews often list Wikipedia and blogs in its sources. I have noticed that Google Gemini sometimes lists Wikipedia as a source when I use it. If Wikipedia imposes a duty to use reliable sources, in my opinion, we should create an edit filter, and warn, if not disallow, any citations that involve links to large language models, such as ChatGPT, Google Gemini, Claude, and Microsoft Copilot. Wikipedia already lists some AI-generated articles as generally not reliable. See Wikipedia:Reliable sources/Perennial sources#Large language models. Z. Patterson (talk) 22:49, 1 February 2025 (UTC)
- page_namespace == 0 & added_lines rlike "utm_source=chatgpt\.com"
Chaotic Enby (talk · contribs) 23:18, 1 February 2025 (UTC)
- @Chaotic Enby: I think this will work. I also suggest writing this.
equals_to_any(page_namespace, 0, 10, 118) & ( llmurl := "\b(chatgpt|copilot\.microsoft|gemini\.google|groq)\.\w{2,3}\b"; added_lines irlike (llmurl) & !(removed_lines irlike (llmurl)) & !(summary irlike "^(?:revert|restore|rv|undid)|AFCH|speedy deletion|reFill") & !(added_lines irlike "\{\{(db[\-\|]|delete\||sd\||speedy deletion|(subst:)?copyvio|copypaste|close paraphrasing)|\.pdf") )
- An edit filter manager can add more LLM links, if necessary. It would work similarly to Special:AbuseFilter/1045. Z. Patterson (talk) 23:43, 1 February 2025 (UTC)
- Do we currently have an edit filter for references that cite Wikipedia? There's about 1,500 articles that link to wikipedia.org and once you get past our articles about ourselves most of them are inappropriate. Almost all of our 172 articles linking to the mobile version of our site are inappropriate as well. Photos of Japan (talk) 05:47, 2 February 2025 (UTC)
- Filter 1057 (hist · log) Nobody (talk) 08:05, 2 February 2025 (UTC)
- Thanks. I noticed that it doesn't include any other Wikimedia-related sites such as Wiktionary, to which we appear to have a few hundred citations. Adding a citation referencing Wiktionary doesn't appear to trigger any filter, and Citebot will also add dates to it, which I find curious given that WP:UPSD highlights it as a 'Wikimedia-related website'. Photos of Japan (talk) 07:28, 3 February 2025 (UTC)
- The filter was pretty much created by accident when modifying an LTA filter. (Here) They kept it because it seemed useful for dealing with CIRCULAR issues. There wasn't really a discussion about whether references to, for example, Wikibooks, Wiktionary, Wikivoyage, Wikiquote, Wikidata, or Wikisource should be on the filter too as CIRCULAR or SPS sources. Nobody (talk) 08:33, 3 February 2025 (UTC)
- I see. There's few enough of them that I could probably manually remove most of them some day. Photos of Japan (talk) 03:52, 4 February 2025 (UTC)
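(For editors doing this kind of cleanup offline rather than via an edit filter, a hedged sketch of a scan like the list linked above: it looks through saved article wikitext for links carrying an LLM referrer parameter. The directory name and the extra domains are assumptions; only chatgpt.com has actually been observed in this discussion.)

import re
from pathlib import Path

# Domains beyond chatgpt.com are speculative additions, included only as examples.
LLM_REFERRERS = re.compile(
    r"utm_source=(chatgpt\.com|copilot\.microsoft\.com|gemini\.google\.com)", re.I
)

for path in Path("article_wikitext").glob("*.txt"):  # placeholder dump directory
    hits = LLM_REFERRERS.findall(path.read_text(encoding="utf-8"))
    if hits:
        print(f"{path.name}: {len(hits)} LLM-referrer link(s)")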
This section appears to be AI-generated and could use attention from someone more familiar with fixing or removing AI-generated output. Skyerise (talk) 13:56, 29 January 2025 (UTC)