Wikipedia talk:Version 1.0 Editorial Team/Index/Archive 8
This is an archive of past discussions on Wikipedia talk:Version 1.0 Editorial Team/Index. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Stub criteria for the bot
Hi. Can you expand on how the count of the number of Stubs in, for example, this table is computed by the bot? Does the bot check all articles in a specific project for stub template(s) that can be set by the editors, or does the bot use some pre-defined stub criteria (minimum number of characters, minimum number of words in the main body of the article, etc.)?
Some background: in the WMF Research team, we are working on a research proposal that can help with the expansion of stubs. The very first step for us is to understand what constitutes a stub, and how your bot counts stubs can be one input that can help us get closer to finding a common definition of a stub article. There is a conversation on wiki-research-l about this topic as well, if you're interested in chiming in there. Thank you! --LZia (WMF) (talk) 15:04, 21 September 2016 (UTC)
- @LZia (WMF): Stub status (as far as this bot's reports are concerned) is determined by the presence of a WikiProject banner template on the talk page with the "class" parameter set to "stub". If there are no WikiProject banners, or if there are banners but the class parameter is blank, the article won't show up as a stub in the statistics table. The bot doesn't count words or look for stub templates at the bottom of the article. Furthermore, the talk page banner assessments are often not updated when the article is expanded. There are certainly some lengthy articles that show up as stubs in this report solely because the assessment on the talk page is out of date. Because of this, 2,897,069 is an overestimate of the number of stubs that have been assessed for a WikiProject (but when you factor in the articles that haven't been assessed for any WikiProject, the true number of stubs is surely over 3 million). Plantdrew (talk) 17:10, 21 September 2016 (UTC)
- Thanks for the explanation, Plantdrew. It's all clear now. :) One question that comes up from reading your response: Would it be useful for this bot or other bots/efforts if we build a model that can predict (with low false-positive and false-negative) if an article is a stub, and offer this information (via API, for example) as an input to the bot? Such an input can help updating the talk pages you refer to, for example. (We are in very early stages of this research, and I'm just trying to understand if there are specific outputs that can be helpful for a bot like this one.) --LZia (WMF) (talk) 18:26, 22 September 2016 (UTC)
- @LZia (WMF): Some sort of automated stub detection could be useful. A problem is that the editor base interested in labelling articles as stubs is rather small. For false-negatives (stubs not labelled as such), there are 520k articles that have a WikiProject Banner with no class parameter set. That's pretty much solely due to lack of editors interested in setting the class, as these articles are fairly highly visible on almost any WikiProject page. There are also false-negatives that don't have any WikiProject Banner at all; these are harder to find, and many editors don't know about the tools that can be used to find them (e.g. PetScan); it still comes down to a small editor base that's interested in this work and that knows how to do it. For false-positives (articles labelled as stubs that aren't), being able to identify them in the first place is a much bigger problem than lack of editors. I'd be happy just to have a report that gave article size in bytes for articles tagged as stubs with a particular WikiProject banner or stub template (then I could examine the larger articles to determine if stub status was warranted). Plantdrew (talk) 20:28, 22 September 2016 (UTC)
- I see. Let me look into this a bit more and report back here. One thing that I was reminded of off-thread is that ORES does the stub prediction, but it will probably suffer from the false-negative issue as the training set for ORES is built based on what editors have currently labelled as stub. I'll look into that, too. --LZia (WMF) (talk) 21:30, 22 September 2016 (UTC)
- (edit conflict) @LZia (WMF): In my experience, the bot, in effect, checks the class parameter of the WikiProject template on the article's talk page. The article is a stub if an editor has set class=stub. There's no checking for stub templates or evaluation of the article against criteria. Where there are multiple WikiProjects, and they rate the article differently (e.g. Talk:Indo-Pakistani War of 1965), for tables that combine WikiProjects, such as Wikipedia:Version 1.0 Editorial Team/Statistics, the bot seems to report the highest class rating. I believe the English Wikipedia is the only one to use WikiProject templates with ratings, so this is not a language-independent method of identifying stubs. --Worldbruce (talk) 17:40, 21 September 2016 (UTC)
- Understood, Worldbruce. Thank you! --LZia (WMF) (talk) 18:28, 22 September 2016 (UTC)
- LZia (WMF): Thanks for posting here. (You might also consider posting at WT:STUB.) The answers above are good technical answers, but I think you're left with the question of "what is the best way to identify a stub" (as opposed to "what ways are used by this existing tool"). Stub templates might indeed be a good indicator to consider. You might, for instance, look for the presence of class=stub in one or more WikiProject banners, check whether there is a conflicting designation in other WikiProject banners, and also check whether there's a stub template on the article itself. In cases of conflict, you might consider the most recent relevant change to be the most authoritative. And...you might also set some kind of threshold (like 5 paragraphs, or 10 citations, or something) that overrides a "stub" designation, to reduce the number of false positives (catching articles that have been incrementally and significantly expanded, without being reassessed). Lots of judgment calls, but I'd urge you to take an approach along those lines if possible, rather than simply reimplementing an existing automated process's approach. -Pete (talk) 00:16, 22 September 2016 (UTC)
- Pete, thanks for your note. I agree with you. Two things to share: 1) We want to make sure that we don't start working on something that has a clear answer if you ask editors. From the different conversations happening here and in the wiki-research-l list, I'm leaning towards agreeing with you, i.e., we need to build a model that predicts if an article is a stub, and that model will have, as its inputs, many of the features you mentioned (I'm doing some literature review at the moment, cuz that model may have already been built by someone). 2) One aspect we haven't touched so far here is the issue of false-negatives: all the articles that are stubs but are not labelled as such. I don't have a good sense of whether this is a serious problem, especially in other languages, but if it is, building a prediction model should hopefully address that, too. --LZia (WMF) (talk) 18:35, 22 September 2016 (UTC)
- The actual criteria for assessing stubs are in the table at WP:1.0/A:
"A very basic description of the topic. However, all very-bad-quality articles will fall into this category. The article is either a very short article or a rough collection of information that will need much work to become a meaningful article. It is usually very short; but, if the material is irrelevant or incomprehensible, an article of any length falls into this category. Although Stub-class articles are the lowest class of the normal classes, they are adequate enough to be an accepted article, though they do have risks of being dropped from being an article all together."
See also WP:STUB § How big is too big?. Depending on your use case, it might be easier to come up with a new term that roughly correlates with "stub", but with a more precise definition. You may also be interested in SuggestBot, which uses algorithms to give articles a 1-, 2-, or 3-star rating - Evad37 [talk] 00:38, 22 September 2016 (UTC)
- Thanks a lot, Evad37. Very helpful. Do you have examples of articles that are a collection of information in need of more work to become meaningful, and as a result they're currently labelled as stubs? I have not run into them, at least frequently, and looking into a collection of such articles would be valuable for us, to identify other features in the article that can help us predict whether an article is a stub more reliably. --LZia (WMF) (talk) 18:51, 22 September 2016 (UTC)
- @LZia (WMF): WikiProject U.S. Roads uses that sort of assessment model, focused more around structure/organisation. See WP:USRD/A for their criteria, and see Category:Stub-Class U.S. road transport articles for the articles. - Evad37 [talk] 00:32, 23 September 2016 (UTC)
- @Evad37: Thanks for the links. Just to make sure I get this right: I see Enchanted Circle Scenic Byway in Category:Stub-Class U.S. road transport articles listed as a stub. Is the article really a stub or is this one of those cases that the talk page template is not updated?
- Personally I think that looks like it was not updated. But I say take away a couple of the prose sections and you would be left with mainly the lists and then you have that large stub talked about earlier. Agathoclea (talk) 15:52, 10 October 2016 (UTC)
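For readers who want to see the approach described in this thread in concrete terms, here is a minimal Python sketch of how a stub determination from talk page banners might look. It is not the bot's actual code. The function names, the regular expressions, and the restriction to seven quality classes are assumptions made for illustration; the "highest rating wins" rule for combined tables follows Worldbruce's observation above, and the relative order of A and GA follows the standard scale. The simple pattern does not handle banners that contain nested templates.

```python
import re

# Quality classes from lowest to highest (assumed subset of the standard scale).
CLASS_ORDER = ["stub", "start", "c", "b", "ga", "a", "fa"]

# Hypothetical patterns: a flat {{WikiProject ...}} banner and its class parameter.
BANNER_RE = re.compile(r"\{\{\s*WikiProject[^{}]*\}\}", re.IGNORECASE)
CLASS_RE = re.compile(r"\|\s*class\s*=\s*([^|}]*)", re.IGNORECASE)


def banner_classes(talk_wikitext):
    """Return the recognised class ratings found in WikiProject banners."""
    classes = []
    for banner in BANNER_RE.findall(talk_wikitext):
        match = CLASS_RE.search(banner)
        value = match.group(1).strip().lower() if match else ""
        if value in CLASS_ORDER:  # blank or unknown classes are ignored
            classes.append(value)
    return classes


def combined_class(talk_wikitext):
    """Highest rating across all banners, or None if nothing is rated."""
    classes = banner_classes(talk_wikitext)
    if not classes:
        return None
    return max(classes, key=CLASS_ORDER.index)


def counts_as_stub(talk_wikitext):
    """True only if the highest banner rating is 'stub'."""
    return combined_class(talk_wikitext) == "stub"
```

Under these assumptions, a talk page with no banner, or with a banner whose class parameter is blank, is never counted as a stub, which matches the false-negative cases discussed above. Nothing in the sketch inspects the article itself, so an out-of-date rating still yields a false positive.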
Project quality listings not updating
Hi, I destubbed some mid-importance stubs on WP:MICRO, the Microbiology project. The number of those articles was 90 when I started. After I finished, I ran the bot on the project to update the list, but the number of mid-importance stubs (90) stayed the same, even though it should have changed. I did change the classes on the respective talk pages from stub class to start class. Does the bot in fact change this number? Icebob99 (talk) 16:04, 2 December 2016 (UTC)
- I see 86 mid-importance stubs, you may want to purge the project page. I've had this problem before when updating 'manually'...Jokulhlaup (talk) 17:04, 2 December 2016 (UTC)
- Thanks, it also refreshes automatically after 24-48 hours. Icebob99 (talk) 03:14, 6 December 2016 (UTC)
A class versus GA class
Hi, I'm still settling into editing at a WikiProject, so please excuse any apparent ignorance. About the scale for average WikiWork: the steps are calculated in respect to distance from FA, of course, so that means that both A class and GA class are counted along the way; however, I think it would be more accurate to count either A class or GA class as one step. My reason: GA class and A class are often evaluated at the same standards, with any differences being more academic than practical. In fact, I haven't seen too many A class articles, but I have seen many GA class articles in various WikiProjects and it does seem like GA is more of a priority compared to A class. Thus, it makes sense that the purpose of A class is to rank up articles without going through the GA nomination process. How about adjusting the WikiWork calculation to consider A class and GA class as the same rank? That would mean the scale of average WikiWork would be out of 5 and every average WikiWork value for every project would be decreased by 1. I think this change would make the amount of work shown by WikiWork to be more accurate; after all, once an article is A or GA class it can practically pass for the other as well. Thoughts? Icebob99 (talk) 03:13, 6 December 2016 (UTC)
- A-Class is above GA-Class on the standard scale. A-Class requires a minimum of two reviewers to award in the default method, while GA-Class only requires one. The criteria for A-Class are also a bit more stringent than GA-Class. Projects that offer their own A-Class Review (ACR) may have additional requirements; some projects may require an article to be listed as a GA before it can be nominated at that ACR. Imzadi 1979 → 03:25, 6 December 2016 (UTC)
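To make the arithmetic in this proposal easier to follow, here is a small Python sketch of an average WikiWork calculation under the current scale and under the proposed merged scale. The step values are an assumption based on the description above (steps away from FA, with Stub at 6 today and at 5 if A and GA shared a rank); the function name and the sample counts are hypothetical.

```python
# Steps away from FA for each class (assumed weights, per the discussion above).
CURRENT_STEPS = {"FA": 0, "A": 1, "GA": 2, "B": 3, "C": 4, "Start": 5, "Stub": 6}

# Proposed variant: A and GA share one rank, so everything below GA moves up a step.
PROPOSED_STEPS = {"FA": 0, "A": 1, "GA": 1, "B": 2, "C": 3, "Start": 4, "Stub": 5}


def average_wikiwork(counts, steps):
    """Average number of steps to FA across all assessed articles."""
    total_articles = sum(counts.values())
    if total_articles == 0:
        return 0.0
    total_steps = sum(steps[cls] * n for cls, n in counts.items())
    return total_steps / total_articles


# Hypothetical project with 40 assessed articles.
counts = {"FA": 1, "A": 1, "GA": 2, "B": 4, "C": 6, "Start": 10, "Stub": 16}
current = average_wikiwork(counts, CURRENT_STEPS)    # 187 / 40
proposed = average_wikiwork(counts, PROPOSED_STEPS)  # 149 / 40
```

With these sample counts the average drops by 0.95 rather than exactly 1, because FA and A articles keep their step values. The drop would be a full step only for a project with no FA-class or A-class articles.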
Wikiwork factors
Hi any TPS for this bot's owner, can you give a direction as to how the bot can be fixed to update the wikiwork factor parameters in the WikiProject tables? —IB [ Poke ] 07:00, 22 November 2016 (UTC)
- Anyone? —IB [ Poke ] 05:31, 13 December 2016 (UTC)
- They have not been working for a loooong time, at least not for projects that I know. If it is not possible to fix the wikiwork factors, they should be removed altogether. No information is better than misinformation. Micromesistius (talk) 09:14, 13 December 2016 (UTC)
- @IndianBio and Micromesistius: You might try posting a request at WP:Bot requests. --Izno (talk) 12:42, 13 December 2016 (UTC)
Pie chart is off
Unless I am missing something, the pie chart at Wikipedia:Version 1.0 Editorial Team/Statistics is inconsistent with the information in the table. Could someone take a look? Thanks, Mz7 (talk) 20:53, 26 December 2016 (UTC)
- The pie chart should probably either be changed, removed or get an explanation of what it actually displays. It doesn't count articles but the number of WikiProject categories for the corresponding class. See Template talk:Articles by Quality Pie Graph#Completely broken. PrimeHunter (talk) 16:21, 12 January 2017 (UTC)
Search is down
The search which should run when clicking on one of the cells in a WikiProject articles by quality and importance table is down, and has been down for more than a week. People have asked about what's going on with it on a couple WikiProject talk pages that I have on my watchlist. I'm mentioning it here in hopes of getting it fixed. Plantdrew (talk) 17:46, 30 January 2017 (UTC)
- I was wondering about that. Hope it gets fixed soon, I'm not able to assess any articles right now. Icebob99 (talk) 01:20, 31 January 2017 (UTC)
- @Plantdrew and Icebob99: The service has been restarted and is back up. --Bamyers99 (talk) 02:12, 31 January 2017 (UTC)
- Much appreciated! Icebob99 (talk) 02:29, 31 January 2017 (UTC)
Thank you DennisPietras (talk) 04:07, 31 January 2017 (UTC)
- Cheers! — JoeHebda • (talk) 07:54, 31 January 2017 (UTC)
wikiproject Medicine
I'm not certain if this is the correct place to post, however our [1] isn't working because we assessed[2] several articles and it is not moving start and stub from "???" column, thank you for any help,--Ozzie10aaaa (talk) 01:59, 18 February 2017 (UTC)
- See section "WP1.0 bot not producing many logs for WikiProjects" above. Keith D (talk) 02:32, 18 February 2017 (UTC)
Amiga and importance articles
I came across Wikipedia:Version 1.0 Editorial Team/Amiga and importance articles by quality statistics whilst looking through red link categories, which are discouraged by WP:REDNOT. It looks like gibberish and hasn't been updated in years - could it be deleted? Le Deluge (talk) 20:15, 26 February 2017 (UTC)
Björk and Pokémon
The bot is changing Björk to Björk and Pokémon to Pokémon. I know it has done this many times before and has been fixed but it started again. The history of each page shows how regularly this happens. Can it be fixed for good, please? anemoneprojectors 12:03, 15 February 2017 (UTC)
- Same thing happened for the Beyonce project here. Have reverted the bot for now. —IB [ Poke ] 10:10, 7 March 2017 (UTC)
No Stub Class?
WikiProject articles by quality and importance tables generated by User:WP 1.0 bot don't seem to include Stub class articles. I'm looking at User:WP 1.0 bot/Tables/Project/Addiction and recovery, which I just manually regenerated using https://tools.wmflabs.org/enwp10/cgi-bin/update.fcgi. I may have to switch back to using {{ArticlesByQuality}} over this issue. Bummer. Sondra.kinsey (talk) 20:20, 17 March 2017 (UTC)
- On closer comparison of Wikipedia:WikiProject Addictions and recovery/Assessment#Current status and Category:Addictions and recovery articles by quality, it appears the table may not have updated after all, in which case this bug report is moot. I'll keep you posted. Sondra.kinsey (talk) 20:23, 17 March 2017 (UTC)
6 Unassessed... assessed articles.
Shows 6 unassessed, but they all have Importance and Class ratings. SEMMENDINGER (talk) 21:58, 27 March 2017 (UTC)
- This is because the bot that updates the tables and logs hasn't been running properly, as mentioned in the thread above. You can manually force an update here (which I've just done for now, but you might want to do it in the future if the bot still isn't working).
- Perfect, thanks for that link. Didn't see anywhere else to "purge" it :) SEMMENDINGER (talk) 00:05, 28 March 2017 (UTC)