
Wikipedia talk:Large language models/Archive 4


Creating drafts with AI before verification

I want your thoughts on the practice of creating a draft article with AI as a starting point before verifying all points in it. I see this as a potentially useful strategy for producing a well-flowing starting point to edit. Immanuelle ❤️💚💙 (talk to the cutest Wikipedian) 18:13, 3 April 2023 (UTC)

Hello, Immanuelle. This is an automated technique for writing an article the wrong way. The best practice is to identify several reliable, independent sources that devote significant coverage to the topic. AI tends to be indiscriminate about sources. Please read Wikipedia:Writing Wikipedia articles backward. Cullen328 (talk) 18:33, 3 April 2023 (UTC)
Aside from the issue identified by Cullen, which I completely agree with, there's the possibility that another editor might come across an abandoned draft and assume that it just needs to be copyedited before moving to mainspace. This is particularly concerning when an article contains fabricated facts and fabricated sources, since WP:AGF would lead an editor to assume that the content is legitimate and that the sources are simply difficult to find and access. –dlthewave 20:33, 3 April 2023 (UTC)
Personally, I feel that writing exercises with unvetted information aren't suitable for submission to any page on Wikipedia. I think the collaborative process is better served when the content being shared has undergone some degree of review by the editor with respect to accuracy and relevance. Otherwise it's indistinguishable from anything made up. isaacl (talk) 21:55, 3 April 2023 (UTC)

Noticeboard for AI-generated things

AI-generated articles show up a lot on ANI. I think it might be helpful to add a dedicated noticeboard for this stuff. IHaveAVest talk 02:01, 4 April 2023 (UTC)

Applicable TOS

The Terms of use for ChatGPT[1] say: "As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms."

In threads above there is a link to a Sharing & publication policy[2] with attribution requirements. It's not clear to me whether this is generally in force. I think it may be meant for invited research collaborations on products that aren't yet publicly available. Sennalen (talk) 16:47, 7 April 2023 (UTC)

Outright falsification

This doesn't go far enough: "LLM-generated content can be biased, non-verifiable, may constitute original research, may libel living people, and may violate copyrights." LLMs also blatantly falsify both citations (creating plausible-looking cites to non-existent sources) and quotations (making up fake quotes from real sources).  — SMcCandlish ¢ 😼  09:28, 13 April 2023 (UTC)

Added something. —Alalch E. 10:09, 13 April 2023 (UTC)

Attribution section

This section might be the part I have the most issues with. It uses "in-text attribution" incorrectly, and it requires the use of {{OpenAI}} for OpenAI LLMs, which blurs the line between OpenAI TOS and Wikipedia policy (and doesn't comply with OpenAI's ToS anyway). I also don't think we've reached a consensus on how to attribute, or resolved the outstanding issues involving inline attribution, whatever we go with. DFlhb (talk) 03:31, 1 April 2023 (UTC)

You are right about the lack of consensus. Since we have already discussed the issue at length, it would probably be best to do an RfC on this issue, as suggested previously by dlthewave. We need to clarify whether or in what cases the following are needed: attribution using a template (top or bottom of the page), in-line attribution, in-text attribution, and edit summaries. Maybe split it up into 2 RfCs: one for the type of attribution on the page and one for edit summaries. Phlsph7 (talk) 06:15, 1 April 2023 (UTC)
RfC yes, but held at WP:VPP, with a pre-RfC at WP:VPI first, because we've become too insular and we need outside input on this, rather than something only we (and WP:FRS) will see. I also think we don't need to bother with an RfC on an edit summary requirement, since that has a low chance of gaining consensus; we should come up with something else.
The question is: do we think it's wise to hold the VPI discussion now, or do we want to wait a little, to give the community time to get used to LLM misconduct and organically experiment with solutions? And secondly, before we go to VPI, would it be worthwhile for each of us here to post a short one/two sentence summary of our positions on attribution, so we clarify lines of agreement and disagreement? I'll admit I lost track, and don't really know where anyone stands. DFlhb (talk) 06:39, 1 April 2023 (UTC)
I'm not sure what the best process is here in terms of WP:VPP and WP:VPI. To simplify the process, it might be a good idea to get as many options as possible off the list so it's more likely that some kind of consensus is reached.
Attribution makes the use of LLMs transparent to readers and makes it easier to detect (and forestall) misuse, but it also makes appropriate use more difficult. The decision should probably be based on striking some kind of balance between these points. I advocated the use of in-text attribution earlier, but this may make appropriate use too difficult, so, as far as I'm concerned, we could take it off the list.
Concerning edit summaries: the discussion you mentioned is about edit tags (like the tag Mobile edit), not edit summaries, and the consensus (or lack thereof) is disputed in the discussion. If we decide against in-line attribution, edit summaries would be important to track which contents were produced by LLMs since a general attribution template only indicates that they were used somewhere on the page.
Besides the type of attribution, we would also need to narrow down the conditions under which it (and edit summaries) is necessary. Some of the relevant options would be:
  • no attribution is required in any case
  • attribution is required for all changes to articles and drafts
  • attribution is required for all non-trivial changes to articles and drafts (however we want to define "non-trivial" here, maybe as non-minor changes?)
  • attribution is required for all changes to articles and drafts if the LLM produced new information (this would be particularly difficult to check)
    • or alternatively: attribution is required for all changes to articles and drafts that add new claims (i.e. excluding copyedits and summaries of contents already present in the article)
I'm not sure about the best option myself but I would tend to require an attribution template at the bottom for non-trivial changes or for introducing new claims, together with an edit summary. Phlsph7 (talk) 08:55, 1 April 2023 (UTC)
I'd support requiring attribution in all cases; this ties the editor to their responsibility. If minor changes were made with AI, by definition, they could have been made without AI. Iseult Δx parlez moi 17:17, 1 April 2023 (UTC)
To take an example: grammar checkers are examples of programs using language models, and these days I assume the better ones have been built with AI-training techniques. Having a blanket rule would essentially mean anyone using a grammar checker would be required to make a disclosure, even for one being applied automatically by your browser. I think any potential value of disclosures is at risk of being lost in this situation. isaacl (talk) 17:55, 1 April 2023 (UTC)
I'm unclear on what additional value AI-guided grammar checkers have over older ones. It's also unclear which checkers, aside from Grammarly, use AI. My checker in Word 2012 certainly doesn't. In any case, I oppose giving AI latitude as a rule; however, given that checkers tend to require individual checkoffs by the user, which reduces them to the role of drawing attention to potential errors or improvements, this doesn't fall under the greater concern of AI-generated content. Iseult Δx parlez moi 18:41, 1 April 2023 (UTC)
I think we should use the term "disclosure" in any broader conversation. Sources are attributed using citations, but I believe there is agreement that for now the output of these types of programs is being treated as just generated text that must be cited as necessary to suitable sources.
I think the key points to discuss are: at what point does disclosure become desirable, and for whom is the disclosure targeted? I think neither editors nor readers are interested in having the use of grammar checkers disclosed. Readers and editors may be interested when substantial portions of the text have come from a program, as they might feel dubious about the amount of editorial oversight. Editors might be concerned as well with copy editing, in order to verify the fidelity of the changes (though that's not a concern limited to copyediting done by programs). Personally, I think the relevant consideration is more about how a writing tool is used rather than the nature of the writing tool. Unfortunately, that's a fairly nebulous thing for anyone other than the author of the change to label, as I suspect numerical thresholds such as a changed-word percentage are going to have a large zone of uncertainty. isaacl (talk) 17:46, 1 April 2023 (UTC)
Phlsph7, isaacl, that isn't what I meant by "short". We're gonna keep going around in circles if we keep veering into minutiæ. We don't need to care about grammar checkers or search engines, because people will use common sense to interpret this policy. DFlhb (talk) 17:59, 1 April 2023 (UTC)
I wasn't summarizing my position on disclosure. I was raising what I think should be considered in planning a broader discussion on disclosure. I feel we need to think about the considerations that editors will raise, to try to figure out a concise question, or set of options to initially provide. isaacl (talk) 18:41, 1 April 2023 (UTC)
I was not under the impression that I was going in circles but I apologize if I was. My position is explained in the last sentence of my previous post. The other sentences concern the different options to be presented at the RfC(s). One RfC could be about the conditions under which attribution is required. If we present the options as a continuum we increase the chances that some kind of consensus is reached. The continuum could be something like the following: attribution for LLM-assisted edits is required...
  1. for all changes, including usage on talk pages
  2. for all changes to articles and drafts
  3. for all non-trivial/non-minor changes to articles and drafts
  4. for all changes that add substantial material with new claims to articles and drafts
  5. for no changes
Are there any other relevant options to be added? An alternative would be to just ask an open-ended question like "Under what circumstances is attribution for LLM-assisted edits required?" and see what everyone comes up with. The danger of this approach is that it may be much harder to reach a consensus. This would be better for brainstorming than for reaching a consensus.
Once we have this issue pinned down, we could have a 2nd RfC about what type of attribution and/or edit summaries are required. Phlsph7 (talk) 18:44, 1 April 2023 (UTC)
I am not sure what to make of this, but it's worth noting that of the two current uses of this template, one is crediting "[Bing] by OpenAI" - the Bing chat tool does not seem to have the same kind of TOS as the OpenAI one does. Andrew Gray (talk) 14:59, 1 April 2023 (UTC)
  • My opinion is that in the interest of transparency and accountability, AI-generated content should be disclosed to the reader where they'll see it. This means using some sort of in-text notice or template at the section level. AI-assisted minor edits (copyediting etc) can be disclosed in an edit summary. This should be a Wikipedia policy that stands on its own regardless of the LLM's requirements.
This seems to be the direction that publications are leaning toward: Medium, Science, Robert J Gates and NIEHS have good explanations of why they suggest or require disclosure of AI use. –dlthewave 16:19, 1 April 2023 (UTC)
Template at the section level isn't appropriate because templates aren't meant to be permanent features of articles. They are used for things that are meant to be fixed or changed, allowing the template to be removed in the future. In this case, such text would be intended as a permanent addition. Instead of a template, either a note at the top of the references list (such as an [a] note) stating that the text was partially made with an LLM (similar to what we do when text is used wholesale from public domain works) or just an icon tag at the top right of the article, such as what we use for protection tags, would suffice. SilverserenC 17:59, 1 April 2023 (UTC)

If we are going to attribute content added with the assistance of an LLM, there are two things to keep in mind:

  • the requirements of the Terms of Use of the LLM provider, and
  • the requirements of the Terms of Use of Wikimedia.

As there are legal implications to both of these, regardless of our discussions here or what anyone's opinion or preference here is, the ultimate output of any policy designed here must be based on those two pillars and include them. Put another way: no amount of agreement or thousand-to-one consensus here, or in any forum in Wikipedia, can exclude or override any part of the Terms of Use of either party, period. We may make attribution requirements stricter, but not more lax than the ToU lays out. (In particular, neither WP:IAR nor WP:CONS can override the ToU, which has legal implications.)

Complying with Terms of Use at both ends

I'm more familiar with Wikimedia's ToU than ChatGPT's (which is nevertheless quite easy to understand). The page WP:CWW interprets the ToU for English Wikipedia users; it is based on Wikimedia's wmf:Terms of use, section 7. Licensing of Content, sub-sections b) Attribution, and c) Importing text. There's some legalese, but it's not that hard to understand, and amounts to this: the attribution must state the source of the content, and must 1) link to it, and 2) be present in the edit summary. The WP:CWW interpretation offers some suggested boilerplate attribution (e.g., Content in this edit was copied from [[FOO]]; see that article's history for attribution.) for sister projects, and for outside content with compatible licenses. (One upshot of this is that *if* LLM attribution becomes necessary, suggestions such as one I've seen on the project page to use an article-bottom template will not fly.)

Absent any update to the WMF ToU regarding LLM content, we are restricted only by the LLM ToU at the moment. The flip side of this is that one has to suspect or assume that WMF is currently considering LLM usage and attribution, and if and when they update the ToU, the section in any proposed new LLM policy may have to be rewritten. The best approach for an attribution section now, in my opinion, is to keep it as short as possible, so it may be amended easily if and when WMF updates its ToU for LLMs. In my view, the attribution section of our proposed policy should be short and inclusive, without adding other frills for now, something like this:

Any content added to Wikipedia based wholly or in part on LLM output must comply with:
  • the Terms of Use of the LLM provider, and
  • Wikipedia's attribution policy for copied content, in particular, an attribution statement in the edit summary containing a link to the LLM provider's Terms of Use.
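For example (the wording here is illustrative only, not a settled requirement), such an edit summary might read: Content in this edit was generated with the assistance of ChatGPT; see [link to the LLM provider's Terms of Use] for attribution and licensing.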

Once WMF addresses LLMs, we could modify this to be more specific. (I'll go ask them and find out, and link back here.)

We may also need to expand and modify it for each flavor of LLM. ChatGPT's sharing/publication policy is quite easy to read and understand. There are four bullets and some suggested "stock language". I'd like to address this later, after having a chat with WMF.

Note that it's perfectly possible that WMF may decide that attribution to non-human agents is not needed, in which case we will be bound only by the LLM's ToU; but in that case, I'd advocate for stricter standards on our side; however, it's hard to discuss that productively until we know what WMF's intentions are. (If I had to guess, I would bet that there are discussions or debates going on right now at WMF legal about the meaning of "creative content", which is a key concept underlying the current ToU, and if they decide to punt on any new ToU, they will just be pushing the decision about what constitutes "creative content" downstream onto the 'Pedias, which would be disastrous, imho; but I'm predicting they won't do that.) I'll report back if I find anything out. Mathglot (talk) 03:51, 10 April 2023 (UTC)

Wikipedia:Copying within Wikipedia is about copying content from one Wikipedia page to another, and so doesn't apply with respect to how editors incorporate content derived from external sources. The Licensing of Content section in the terms of use does have a subsection c, "Importing text", which states: "...you warrant that the text is available under terms that are compatible with the CC BY-SA 3.0 license (or, as explained above, another license when exceptionally required by the Project edition or feature)("CC BY-SA"). ... You agree that, if you import text under a CC BY-SA license that requires attribution, you must credit the author(s) in a reasonable fashion." It gives attribution in the edit summary as an example for copying within Wikimedia projects, but doesn't prescribe this as the only reasonable fashion. Specifically regarding OpenAI, though, its terms of use assign all rights to the user. So even if the U.S. courts one day ruled that a program could hold authorship rights, attribution from a copyright perspective is not required. OpenAI's sharing and publication policy, though, requires that "The role of AI in formulating the content is clearly disclosed in a way that no reader could possibly miss, and that a typical reader would find sufficiently easy to understand."
Wikipedia terms of use section 7, subsection c further states: "The attribution requirements are sometimes too intrusive for particular circumstances (regardless of the license), and there may be instances where the Wikimedia community decides that imported text cannot be used for that reason." In a similar manner, it may be the case that the community decides that enabling editors to satisfy the disclosure requirement of the OpenAI sharing and publication policy is too intrusive. isaacl (talk) 04:52, 10 April 2023 (UTC)
Yes indeed; CWW is only about copying content from one Wikipedia page to another (or among any Wikimedia property, or other compatibly licensed project), because it is an interpretation of the ToU for en-wiki. The point I was trying to make, not very clearly perhaps, is that the wording at CWW offers us a model of what we might want to say about LLMs, as long as we take into consideration their ToU, as well as whatever ends up happening (if anything) with the WMF ToU (which, by the way, is scheduled for an update; discussion is underway now and feedback is open until 27 April to anyone who wishes to contribute). Mathglot (talk) 08:27, 10 April 2023 (UTC)
You proposed following Wikipedia:Copying within Wikipedia for attribution, which in essence means extending it to cover more than its original purpose of ensuring compliance with Wikipedia's copyright licensing (CC BY-SA and GFDL). Personally I think it would be better to preserve the current scope of the copying within Wikipedia guideline, both to keep it simpler and to avoid conflating disclosure requirements with copyright licensing issues. isaacl (talk) 16:12, 10 April 2023 (UTC)
I like your proposed phrasing. True that we need the WMF's input; they published meta:Wikilegal/Copyright Analysis of ChatGPT a few weeks back, but it seems non-committal. As for the ToS of other LLM providers, Bing Chat only allows use for "personal, non-commercial purpose", so it's straightforwardly not compatible. DFlhb (talk) 10:00, 10 April 2023 (UTC)

Content about attribution was not good and I've removed it

DFlhb had removed this content as part of their reverted trim, and now I've removed it again. This topic is covered in wmf:Terms of Use/en. No useful specific guidance was provided here. There's no agreement that a policy needs to require use of Template:OpenAI, as it is not obviously compatible with OpenAI's ToS requirements. Editors advocating specific guidance about requiring attribution on this page should get consensus for the concrete text they are committed to and want to see become Wikipedia policy. —Alalch E. 11:41, 14 April 2023 (UTC)

LLMs on Wikisource

It has been proposed to me on Wikisource that LLMs would be useful for predicting and proposing fixes to transcription errors. Is there a place to discuss how such a thing might technically be implemented? BD2412 T 23:00, 15 April 2023 (UTC)

@BD2412: Do you know about LangChain? It's by far the most serious platform for building apps from LLMs in an open non-proprietary way. Although the guy behind it is on Twitter, he and others are far more responsive on their Discord server. Good luck! Sandizer (talk) 13:16, 16 April 2023 (UTC)
Great, thanks! BD2412 T 13:27, 16 April 2023 (UTC)
It was me that asked the OP in this thread, based on their OCR cleanup edits at English Wikisource. Naturally, any LLM-based OCR/scan error finder would have to be approved by the applicable community process before widespread use. ShakespeareFan00 (talk) 17:16, 16 April 2023 (UTC)
I believe that today's commercial OCR software does include language models for error correction; while they are not "large" as in LLMs, I believe they are substantially larger than typical autocorrect systems. A very good correction system involving LangChain and Pywikibot should be possible to build from open ~7B-size models (e.g. Dolly; see Ars Technica's summary), which typically run fairly fast on four ordinary server CPU cores. It should be possible for project communities to thoroughly test such a system at a sufficiently large scale to find any issues which might cause serious problems. I suspect that corrections can be automatically classified into those which should require human review and those which most probably don't. Sandizer (talk) 17:29, 16 April 2023 (UTC)
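For illustration, here is a minimal sketch of such a correction helper, assuming LangChain's early-2023 Python interface with an OpenAI-backed model; the prompt wording and the review-routing heuristic are placeholder assumptions, not a tested design:

```python
# Sketch only: propose a fix for one OCR-transcribed line and decide
# whether the fix can be queued automatically or needs human review.
# Model choice, prompt wording, and the review heuristic are all
# illustrative assumptions, not a vetted pipeline.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)  # deterministic output suits proofreading

prompt = PromptTemplate(
    input_variables=["line"],
    template=(
        "The following line comes from an OCR transcription and may "
        "contain scan errors. Return only the corrected line, with no "
        "commentary. If the line is already correct, return it "
        "unchanged.\n\nLine: {line}\nCorrected:"
    ),
)
chain = LLMChain(llm=llm, prompt=prompt)

def propose_fix(line: str) -> tuple[str, bool]:
    """Return (suggested_line, needs_human_review)."""
    suggestion = chain.run(line=line).strip()
    # Crude routing heuristic: a single changed word (e.g. 'Tlie' -> 'The')
    # goes to a lightly reviewed queue; larger rewrites always go to a human.
    original_words = line.split()
    suggested_words = suggestion.split()
    changed = sum(a != b for a, b in zip(original_words, suggested_words))
    needs_review = changed > 1 or len(original_words) != len(suggested_words)
    return suggestion, needs_review

if __name__ == "__main__":
    fixed, review = propose_fix("Tlie quick brown fox jumps over the lazy dog.")
    print(fixed, "(human review)" if review else "(auto queue)")
```

A local ~7B model such as Dolly could be swapped in via LangChain's HuggingFacePipeline wrapper, and the heuristic would need tuning against real Wikisource scan errors before any community-approved deployment.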
I think we ought to focus more on specific uses of software rather than the implementation, because that's not always published by the provider and can change rapidly. If there is concern regarding using OCR programs, that should be addressed regardless of how the programs are implemented. isaacl (talk) 21:05, 16 April 2023 (UTC)

BTW I created an article for LangChain, which has a couple of good starter resources. Sandizer (talk) 04:36, 18 April 2023 (UTC)