Wikipedia talk:Large language models/Archive 4
This is an archive of past discussions on Wikipedia:Large language models. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Archive 1 | Archive 2 | Archive 3 | Archive 4 | Archive 5 | Archive 6 | Archive 7
Creating drafts with AI before verification
I want your thoughts on the practice of creating a draft article with AI as a starting point before verifying all points in it. I see this as a potentially useful strategy for producing a well-flowing starting point to edit. Immanuelle β€οΈππ (talk to the cutest Wikipedian) 18:13, 3 April 2023 (UTC)
- Hello, Immanuelle. This is an automated technique for writing an article the wrong way. The best practice is to identify several reliable, independent sources that devote significant coverage to the topic. AI tends to be indiscriminate about sources. Please read Wikipedia:Writing Wikipedia articles backward. Cullen328 (talk) 18:33, 3 April 2023 (UTC)
- Aside from the issue identified by Cullen, which I completely agree with, there's the possibility that another editor might come across an abandoned draft and assume that it just needs to be copyedited before moving to mainspace. This is particularly concerning when an article contains fabricated facts and fabricated sources, since WP:AGF would lead an editor to assume that the content is legitimate and that the sources are simply difficult to find and access. βdlthewave β 20:33, 3 April 2023 (UTC)
- Personally, I feel that writing exercises with unvetted information aren't suitable for submission to any page on Wikipedia. I think the collaborative process is better served when the content being shared has undergone some degree of review by the editor with respect to accuracy and relevance. Otherwise it's indistinguishable from anything made up. isaacl (talk) 21:55, 3 April 2023 (UTC)
Noticeboard for AI generated things
AI-generated articles show up a lot on ANI. I think it might be helpful to add a dedicated noticeboard for this stuff. IHaveAVest talk 02:01, 4 April 2023 (UTC)
Applicable TOS
The Terms of use for ChatGPT[1] say: "As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms."
In threads above there is a link to a Sharing & publication policy[2] with attribution requirements. It's not clear to me whether this is generally in force. I think it may be meant for invited research collaboration on products that aren't yet publicly available. Sennalen (talk) 16:47, 7 April 2023 (UTC)
Outright falsification
This doesn't go far enough: "LLM-generated content can be biased, non-verifiable, may constitute original research, may libel living people, and may violate copyrights." LLMs also blatantly falsify both citations (creating plausible-looking cites to non-existent sources) and quotations (making up fake quotes from real sources). βββ―SMcCandlish β Β’βπΌβ 09:28, 13 April 2023 (UTC)
- added somethingβAlalch E. 10:09, 13 April 2023 (UTC)
Attribution section
This section might be the part I have the most issues with. It uses "in-text attribution" incorrectly, and it requires the use of {{OpenAI}} for OpenAI LLMs, which blurs the line between OpenAI TOS and Wikipedia policy (and doesn't comply with OpenAI's ToS anyway). I also don't think we've reached a consensus on how to attribute, or resolved the outstanding issues involving inline attribution, whatever we go with. DFlhb (talk) 03:31, 1 April 2023 (UTC)
- You are right about the lack of consensus. Since we have already discussed the issue at length, it would probably be best to do an RfC on this issue, as suggested previously by dlthewave. We need to clarify whether or in what cases the following are needed: attribution using a template (top or bottom of the page), in-line attribution, in-text attribution, and edit summaries. Maybe split it up into 2 RfCs: one for the type of attribution on the page and one for edit summaries. Phlsph7 (talk) 06:15, 1 April 2023 (UTC)
- RfC yes, but held at WP:VPP, with a pre-RfC at WP:VPI first, because we've become too insular and we need outside input on this, rather than something only we (and WP:FRS) will see. I also think we don't need to bother with an RfC on an edit summary requirement, since that has a low chance of gaining consensus; we should come up with something else.
- The question is: do we think it's wise to hold the VPI discussion now, or do we want to wait a little, to give the community time to get used to LLM misconduct and organically experiment with solutions? And secondly, before we go to VPI, would it be worthwhile for each of us here to post a short one/two sentence summary of our positions on attribution, so we clarify lines of agreement and disagreement? I'll admit I lost track, and don't really know where anyone stands. DFlhb (talk) 06:39, 1 April 2023 (UTC)
- I'm not sure what is the best process here in terms of WP:VPP and WP:VPI. To simplify the process, it might be a good idea to get as many options as possible off the list so it's more likely that some kind of consensus is reached.
- Attribution makes the use of LLMs transparent to readers and makes it easier to detect (and forestall) misuse but it also makes appropriate use more difficult. The decision should probably be based on arriving at some kind of balance between these points. I advocated the use of in-text attribution earlier but this may make appropriate use too difficult so, as far as I'm concerned, we could take it off the list.
- Concerning edit summaries: the discussion you mentioned is about edit tags (like the tag Mobile edit), not edit summaries, and the consensus (or lack thereof) is disputed in the discussion. If we decide against in-line attribution, edit summaries would be important to track which contents were produced by LLMs since a general attribution template only indicates that they were used somewhere on the page.
- Besides the type of attribution, we would also need to narrow down the conditions under which it (and edit summaries) is necessary. Some of the relevant options would be:
- no attribution is required in any case
- attribution is required for all changes to articles and drafts
- attribution is required for all non-trivial changes to articles and drafts (however we want to define "non-trivial" here, maybe as non-minor changes?)
- attribution is required for all changes to articles and drafts if the LLM produced new information (this would be particularly difficult to check)
- or alternatively: attribution is required for all changes to articles and drafts that add new claims (i.e. excluding copyedits and summaries of contents already present in the article)
- I'm not sure about the best option myself but I would tend to require an attribution template at the bottom for non-trivial changes or for introducing new claims, together with an edit summary. Phlsph7 (talk) 08:55, 1 April 2023 (UTC)
- I'd support requiring attribution in all cases; this ties the editor to their responsibility. If minor changes were made with AI, by definition, they could have been made without AI. Iseult Ξx parlez moi 17:17, 1 April 2023 (UTC)
- To take an example: grammar checkers are examples of programs using language models, and these days I assume the better ones have been built with AI-training techniques. Having a blanket rule would essentially mean anyone using a grammar checker would be required to make a disclosure, even for one being applied automatically by your browser. I think any potential value of disclosures is at risk of being lost in this situation. isaacl (talk) 17:55, 1 April 2023 (UTC)
- I'm unclear on what additional value AI-guided grammar checkers have over older ones. It's also unclear which checkers, aside from Grammarly, use AI. My checker in Word 2012 certainly doesn't. In any case, I oppose giving AI latitude as a rule; however, given that checkers tend to require individual checkoffs by the user, which reduces them to the role of drawing attention to potential errors or improvements, this doesn't fall under the greater concern of AI-generated content. Iseult Ξx parlez moi 18:41, 1 April 2023 (UTC)
- I think we should use the term "disclosure" in any broader conversation. Sources are attributed using citations, but I believe there is agreement that for now the output of these types of programs is being treated as just generated text that must be cited as necessary to suitable sources.
- I think key points to discuss are at what point does disclosure become desirable, and for whom is the disclosure targeted? I think neither editors nor readers are interested in having the use of grammar checkers disclosed. Readers and editors may be interested when substantial portions of the text have come from a program, as they might feel dubious about the amount of editorial oversight. Editors might be concerned as well with copy editing, in order to verify fidelity of the changes (though that's not a concern limited to copyediting done by programs). Personally I think the relevant consideration is more about how a writing tool is used, rather than the nature of the writing tools. Unfortunately, that's a fairly nebulous thing for anyone other than the author of the change to label, as I suspect numerical thresholds such as a changed-word percentage are going to have a large zone of uncertainty. isaacl (talk) 17:46, 1 April 2023 (UTC)
- Phlsph7, isaacl, that isn't what I meant by "short". We're gonna keep going around in circles if we keep veering into minutiæ. We don't need to care about grammar checkers or search engines, because people will use common sense to interpret this policy. DFlhb (talk) 17:59, 1 April 2023 (UTC)
- I wasn't summarizing my position on disclosure. I was raising what I think should be considered in planning a broader discussion on disclosure. I feel we need to think about the considerations that editors will raise, to try to figure out a concise question, or set of options to initially provide. isaacl (talk) 18:41, 1 April 2023 (UTC)
- I was not under the impression that I was going in circles but I apologize if I was. My position is explained in the last sentence of my previous post. The other sentences concern the different options to be presented at the RfC(s). One RfC could be about the conditions under which attribution is required. If we present the options as a continuum we increase the chances that some kind of consensus is reached. The continuum could be something like the following: attribution for LLM-assisted edits is required...
- for all changes, including usage on talk pages
- for all changes to articles and drafts
- for all non-trivial/non-minor changes to articles and drafts
- for all changes that add substantial material with new claims to articles and drafts
- for no changes
- Are there any other relevant options to be added? An alternative would be to just ask an open-ended question like "Under what circumstances is attribution for LLM-assisted edits required?" and see what everyone comes up with. The danger of this approach is that it may be much harder to reach a consensus. This would be better for brainstorming than for reaching a consensus.
- Once we have this issue pinned down, we could have a 2nd RfC about what type of attribution and/or edit summaries are required. Phlsph7 (talk) 18:44, 1 April 2023 (UTC)
- I am not sure what to make of this, but it's worth noting that of the two current uses of this template, one is crediting "[Bing] by OpenAI" - the Bing chat tool does not seem to have the same kind of TOS as the OpenAI one does. Andrew Gray (talk) 14:59, 1 April 2023 (UTC)
- My opinion is that in the interest of transparency and accountability, AI-generated content should be disclosed to the reader where they'll see it. This means using some sort of in-text notice or template at the section level. AI-assisted minor edits (copyediting etc) can be disclosed in an edit summary. This should be a Wikipedia policy that stands on its own regardless of the LLM's requirements.
- This seems to be the direction that publications are leaning toward: Medium, Science, Robert J Gates and NIEHS have good explanations of why they suggest or require disclosure of AI use. βdlthewave β 16:19, 1 April 2023 (UTC)
- Template at the section level isn't appropriate because templates aren't meant to be permanent features of articles. They are used for things that are meant to be fixed and changed to allow for the template to be removed in the future. In this case, the usage for such text would be intended to be a permanent addition. Instead of a template, either a note at the top of the references list (such as an [a] note) that states the text was partially made with an LLM (similar to what we do when text is used wholesale from public domain works) or just an icon tag at the top right of the article, such as what we use for protection tags, would suffice. SilverserenC 17:59, 1 April 2023 (UTC)
If we are going to attribute content added with the assistance of LLM, there are two things to keep in mind:
- the requirements of the Terms of Use of the LLM provider, and
- the requirements of the Terms of Use of Wikimedia.
As there are legal implications to both of these, that means that regardless of our discussions here or what anyone's opinion or preference here is, the ultimate output of any policy designed here, must be based on those two pillars, and include them. Put another way: no amount of agreement or thousand-to-one consensus here, or in any forum in Wikipedia, can exclude or override any part of the Terms of Use of either party, period. We may make attribution requirements stricter, but not more lax than the ToU lays out. (In particular, neither WP:IAR nor WP:CONS can override ToU, which has legal implications.)
Complying with Terms of Use at both ends
I'm more familiar with Wikimedia's ToU than ChatGPT's (which is nevertheless quite easy to understand). The page WP:CWW interprets the ToU for English Wikipedia users; it is based on Wikimedia's wmf:Terms of use, section 7. Licensing of Content, sub-sections b) Attribution, and c) Importing text. There's some legalese, but it's not that hard to understand, and amounts to this: the attribution must state the source of the content, and must 1) link to it, and 2) be present in the edit summary. The WP:CWW page interpretation offers some suggested boilerplate attribution (e.g., "Content in this edit was copied from [[FOO]]; see that article's history for attribution.") for sister projects, and for outside content with compatible licenses. (One upshot of this is that *if* LLM attribution becomes necessary, suggestions such as one I've seen on the project page to use an article-bottom template will not fly.)
Absent any update to the WMF ToU regarding LLM content, we are restricted only by the LLM ToU at the moment. The flip side of this is that one has to suspect or assume that WMF is currently considering LLM usage and attribution, and if and when they update the ToU, the section in any proposed new LLM policy may have to be rewritten. The best approach for an attribution section now, in my opinion, is to keep it as short as possible, so it may be amended easily if and when WMF updates its ToU for LLMs. In my view, the attribution section of our proposed policy should be short and inclusive, without adding other frills for now, something like this:
- Any content added to Wikipedia based wholly or in part on LLM output must comply with:
- the Terms of Use of the LLM provider, and
- Wikipedia's attribution policy for copied content, in particular, an attribution statement in the edit summary containing a link to the LLM provider's Terms of Use.
Once WMF addresses LLMs, we could modify this to be more specific. (I'll go ask them and find out, and link back here.)
We may also need to expand and modify it, for each flavor of LLM. ChatGPT's sharing/publication policy is quite easy to read and understand. There are four bullets, and some suggested "stock language". I'd like to address this later, after having a chat with WMF.
Note that it's perfectly possible that WMF may decide that attribution to non-human agents is not needed, in which case we will be bound only by the LLM's ToU; but in that case, I'd advocate for stricter standards on our side; however, it's hard to discuss that productively until we know what WMF's intentions are. (If I had to guess, I would bet that there are discussions or debates going on right now at WMF legal about the meaning of "creative content", which is a key concept underlying the current ToU, and if they decide to punt on any new ToU, they will just be pushing the decision about what constitutes "creative content" downstream onto the 'Pedias, which would be disastrous, imho; but I'm predicting they won't do that.) I'll report back if I find anything out. Mathglot (talk) 03:51, 10 April 2023 (UTC)
- Wikipedia:Copying within Wikipedia is about copying content from one Wikipedia page to another, and so doesn't apply with respect to how editors incorporate content derived from external sources. The Licensing of Content section in the terms of use does have a subsection c, "Importing text", which states:
...you warrant that the text is available under terms that are compatible with the CC BY-SA 3.0 license (or, as explained above, another license when exceptionally required by the Project edition or feature)("CC BY-SA"). ... You agree that, if you import text under a CC BY-SA license that requires attribution, you must credit the author(s) in a reasonable fashion.
It gives attribution in the edit summary as an example for copying within Wikimedia projects, but doesn't prescribe this as the only reasonable fashion. Specifically regarding OpenAI, though, based on its terms of use, it assigns all rights to the user. So even if the U.S. courts one day ruled that a program could hold authorship rights, attribution from a copyright perspective is not required. OpenAI's sharing and publication policy, though, requires that "The role of AI in formulating the content is clearly disclosed in a way that no reader could possibly miss, and that a typical reader would find sufficiently easy to understand."
- Wikipedia terms of use section 7, subsection c further states
The attribution requirements are sometimes too intrusive for particular circumstances (regardless of the license), and there may be instances where the Wikimedia community decides that imported text cannot be used for that reason.
In a similar manner, it may be the case that the community decides that enabling editors to satisfy the disclosure requirement of the OpenAI sharing and publication policy is too intrusive. isaacl (talk) 04:52, 10 April 2023 (UTC)
- Yes indeed; CWW is only about copying content from one Wikipedia page to another (or among any Wikimedia property, or other compatibly licensed project), because it is an interpretation of the ToU for en-wiki. The point I was trying to make, not very clearly perhaps, is that the wording at CWW offers us a model of what we might want to say about LLMs, as long as we take into consideration their ToU, as well as whatever ends up happening (if anything) with the wmf ToU (which, by the way, is scheduled for an update; discussion is underway now and feedback is open until 27 April to anyone who wishes to contribute). Mathglot (talk) 08:27, 10 April 2023 (UTC)
- You proposed following Wikipedia:Copying within Wikipedia for attribution, which in essence means extending it to cover more than its original purpose of ensuring compliance with Wikipedia's copyright licensing (CC BY-SA and GFDL). Personally I think it would be better to preserve the current scope of the copying within Wikipedia guideline, both to keep it simpler and to avoid conflating disclosure requirements with copyright licensing issues. isaacl (talk) 16:12, 10 April 2023 (UTC)
- I like your proposed phrasing. True that we need the WMF's input; they published meta:Wikilegal/Copyright Analysis of ChatGPT a few weeks back, but it seems non-committal. As for the ToS of other LLM providers, Bing Chat only allows use for "personal, non-commercial purpose", so it's straightforwardly not compatible. DFlhb (talk) 10:00, 10 April 2023 (UTC)
Content about attribution was not good and I've removed it
DFlhb had removed this as part of their reverted trim, and now I've removed it again. This topic is covered in wmf:Terms of Use/en. No useful specific guidance was provided here. There's no agreement that a policy needs to require use of Template:OpenAI, as it is not obviously compatible with OpenAI ToS requirements. Editors advocating to include specific guidance about requiring attribution on this page should get consensus for the concrete version of text that they are committed to and want to see become Wikipedia policy. βAlalch E. 11:41, 14 April 2023 (UTC)
LLMs on Wikisource
It has been proposed to me on Wikisource that LLMs would be useful for predicting and proposing fixes to transcription errors. Is there a place to discuss how such a thing might technically be implemented? BD2412 T 23:00, 15 April 2023 (UTC)
- @BD2412: Do you know about LangChain? It's by far the most serious platform for building apps from LLMs in an open non-proprietary way. Although the guy behind it is on Twitter, he and others are far more responsive on their Discord server. Good luck! Sandizer (talk) 13:16, 16 April 2023 (UTC)
- Great, thanks! BD2412 T 13:27, 16 April 2023 (UTC)
- It was me who asked the OP in this thread, based on their OCR cleanup edits at English Wikisource. Naturally, any developed LLM OCR/scan error finder would of course have to be approved by the applicable community process before widespread use. ShakespeareFan00 (talk) 17:16, 16 April 2023 (UTC)
- I believe that today's commercial OCR software does include language models for error correction, but while they are not "large" as in LLMs, I believe they are substantially larger than typical autocorrect systems. A very good correction system involving LangChain and Pywikibot should be possible to make from open ~7B size models (e.g. Dolly, see Ars Technica's summary) which run fairly fast on typically four ordinary server CPU cores. It should be possible for project communities to thoroughly test such a system at a sufficiently large scale to find any issues which might cause serious problems. I suspect that corrections can be automatically classified into those which should require human review, and those which most probably don't need it. Sandizer (talk) 17:29, 16 April 2023 (UTC)
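For illustration only, here is a minimal sketch of how such a pipeline might be wired together, assuming LangChain's HuggingFacePipeline wrapper around a locally hosted open model and Pywikibot for page access. The model id, prompt wording, page title, and review threshold are placeholder assumptions, not a tested or community-approved setup:

```python
# Hypothetical sketch: LangChain + Pywikibot OCR-correction helper.
# Model id, prompt, page title, and the review heuristic are illustrative
# assumptions only, not a tested or approved configuration.
import difflib

import pywikibot
from langchain.chains import LLMChain
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate

# Load a small open model locally (placeholder model id).
llm = HuggingFacePipeline.from_model_id(
    model_id="databricks/dolly-v2-7b",
    task="text-generation",
)

prompt = PromptTemplate(
    input_variables=["page_text"],
    template=(
        "The following text was produced by OCR and may contain scan errors. "
        "Return the same text with only obvious OCR errors corrected; do not "
        "rephrase or add anything.\n\n{page_text}"
    ),
)
chain = LLMChain(llm=llm, prompt=prompt)

def needs_human_review(original: str, corrected: str) -> bool:
    """Crude heuristic: flag the edit if whole words were added or removed,
    rather than only characters changed within existing words."""
    matcher = difflib.SequenceMatcher(None, original.split(), corrected.split())
    return matcher.ratio() < 0.98  # threshold is an arbitrary placeholder

site = pywikibot.Site("en", "wikisource")
page = pywikibot.Page(site, "Page:Example scan.djvu/42")  # hypothetical page
corrected = chain.run(page_text=page.text)

if needs_human_review(page.text, corrected):
    print("Proposed correction flagged for human review; not saving.")
else:
    page.text = corrected
    page.save(summary="Semi-automated OCR cleanup (LLM-assisted); see talk")
```

Any heuristic like the word-level similarity check above would need tuning against real Wikisource scans, and as noted above, actual deployment would still require approval through the applicable community process.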
- I think we ought to focus more on specific uses of software rather than the implementation, because that's not always published by the provider and can change rapidly. If there is concern regarding using OCR programs, that should be addressed regardless of how the programs are implemented. isaacl (talk) 21:05, 16 April 2023 (UTC)
BTW I created an article for LangChain which has a couple good starter resources. Sandizer (talk) 04:36, 18 April 2023 (UTC)
Talk pages / Non-article content
The current text starts by granting the permission to use LLM text as a basis for discussion on Talk pages. But how is this ever going to be appropriate as part of the process of building an encyclopedia (as opposed to FORUM-esque discussion)? The prohibition was for using LLMs for 'arguing your case', but the problem I'm seeing is not just this, but people using them for random[3] contributions; and if they're used for closing RfCs/AfDs etc.? Arghh. I have tried to clarify. Bon courage (talk) 02:06, 7 April 2023 (UTC)
- The idea was to allow people to quote from LLM outputs as part of discussions of how appropriate LLMs are for Wikipedia (i.e. here, or at ANI). Not at all to allow these kinds of junk contributions. Agree that it's so unclear that it's almost counterproductive. DFlhb (talk) 10:36, 8 April 2023 (UTC)
- When it comes to what's between meta LLM discussions and junk comments, the change from
you should not use LLMs to "argue your case for you" in talk page discussions
to "you must not use LLMs to write your comments"
was in my opinion a pretty significant change in meaning here. To be clear, if a less-than-confident English speaker has a good argument, an argument of their own construction, should they be allowed to use an LLM to work out the phrasing and essentially have it "write their comment"? Or do we just say that competence is required and that editors who can't phrase their own arguments should not be on talk pages to begin with? PopoDameron β talk 10:46, 8 April 2023 (UTC)
- Personally, I'd rather read poorly-written human comments than "LLM-assisted" comments where it's unclear how much "assistance" the LLM gave. DFlhb (talk) 11:28, 8 April 2023 (UTC)
- Same. βAlalch E. 11:30, 8 April 2023 (UTC)
- Yup, if they're not competent in English they won't be competent to assess whether the automatic content accurately represents their thought anyway. Bon courage (talk) 12:14, 8 April 2023 (UTC)
- Good point. Probably for the best (not that we can truly enforce that, but just in terms of policy). PopoDameron β talk 19:12, 8 April 2023 (UTC)
- I would simply remove this entire line:
while you may include an LLM's raw outputs as examples in order to discuss them or to illustrate a point
- I feel it's redundant and just opens the door to people throwing LLM-generated "arguments" into a conversation, then claiming it's to illustrate a point as a get-out-of-jail-free card. If someone does need to illustrate a point with generated text, a simple "Would someone object if I pasted an example here?" on the Talk page should be good enough to get a yes/no out of the participants. β The Hand That Feeds You:Bite 12:47, 8 April 2023 (UTC)
- Agree. With the current text ("you must not use LLMs to write your comments") the door is already open to use LLMs, so long as it was clear you weren't trying to pass the stuff off as your own comment. Bon courage (talk) 12:58, 8 April 2023 (UTC)
- I did the trimming. βAlalch E. 17:58, 8 April 2023 (UTC)
- What about using it for grammar improvement? For example, I gave it my text and asked ChatGPT to improve the grammar; is that allowed?--Shrike (talk) 18:08, 8 April 2023 (UTC)
- I personally wouldn't trust these systems to just improve the grammar, because they may change the entire meaning of the sentence. You'd be better off using a dedicated online grammer-checking tool. β The Hand That Feeds You:Bite 18:32, 8 April 2023 (UTC)
- I think that I, as an editor, can see when the meaning is changed and when it's not, and I take full responsibility for the edit, but you may have a point that this should be allowed only for experienced editors Shrike (talk) 19:03, 8 April 2023 (UTC)
- For this I will simply repeat what I said earlier: talk page comments generated via LLMs are a fundamentally different thing from presenting to our readers quasi-factual LLM statements (which can be objectively wrong) or LLM-assisted manipulations of data (which can contain objective errors, and thus be objectively wrong). That is the risk, not sentimental ideas about "real human communication." For instance, most of the AfD contributions linked above would be just as irrelevant if they were bespoke artisanal emanations from a real human being's soul, and others, such as Wikipedia:Articles for deletion/Uzair Aziz, are basically no different from the routine boilerplate "non-notable, delete" rationales posted by humans. The problem here is the pattern of disruptive edits to AfDs -- it seems like there may be some kind of personal dispute involved on the editor's part, based on their contribs -- not the way they were written.
- Furthermore, this is simply unenforceable. "AI-checking tools" are not infallible and will quickly become outdated as models change, and there is no clean line between "bad" LLMs and "acceptable" "online grammar-checking tools", since many grammar-checking tools have already incorporated LLMs, including those from major companies like Microsoft. Gnomingstuff (talk) 16:32, 11 April 2023 (UTC)
Proposed merge
This proposal was made by a sockpuppet of a blocked user, and it doesn't look like it's going anywhere. βDavid Eppstein (talk) 01:39, 13 April 2023 (UTC)
The following discussion has been closed. Please do not modify it.
I'm proposing merging Wikipedia:Using neural network language models on Wikipedia into Wikipedia:Large language models. I think the sections of the former would fit well into WP:LLM, and it would ease the process of creating a new guideline on ChatGPT. β CityUrbanism π© π 18:48, 10 April 2023 (UTC)
WP:MEATBOT and WP:BRFA
AI seems like a semiautomated (bot-like) tool to me, and I am missing any mention of already existing policies like WP:MEATBOT or WP:BOTUSE on this page. I believe AI is a good thing for Wikipedia; I imagine AI crawling through the stubs and expanding the ones that are expandable. What I do not think is good for Wikipedia is if AI is used to generate thousands of stubs. Paradise Chronicle (talk) 05:55, 21 April 2023 (UTC)
- There was a very prominent mention, but various changes and copyedits removed the explicit reference. MEATBOT is piped in this paragraph:
You must not use LLMs for unapproved bot-like editing, or anything even approaching bot-like editing. Using LLMs to assist high-speed editing in article space is always taken to fail the standards of responsible use, as it is impossible to rigorously scrutinize content for compliance with all applicable policies in such a scenario.
βAlalch E. 11:37, 22 April 2023
- This was asked at the Bot Noticeboard in December 2022. The response was "it depends", based on speed/scale of editing and whether it becomes disruptive. This opinion seems to differ from WP:BOTDEF, which puts all bot-related activity (including lower-speed assisted editing) under the purview of BAG. I think it would make sense to add an explicit requirement that all LLM use be approved by BAG βdlthewave β 02:27, 23 April 2023 (UTC)
For editors who don't explicitly admit they use LLMs, the whole "policy" is useless. I'd prefer that editing patterns which appear to fall under LLM use be the focus of the policy. If editors refuse to admit that they use AI, it will be similar to MEATBOT or MASSCREATE, which are also hardly applied, and even defining who actually falls under those two policies is a problem. Paradise Chronicle (talk) 08:33, 23 April 2023 (UTC)
- This just seems to be an extension of the argument in the previous section. β The Hand That Feeds You:Bite 11:34, 23 April 2023 (UTC)
- Re-reading this, I believe HandThatFeeds is correct and I therefore removed the section header. Paradise Chronicle (talk) 13:08, 24 April 2023 (UTC)