
Wikipedia:Page Curation/Suggested improvements/Archive 1

This is an old revision of this page, as edited by DannyS712 (talk | contribs) at 23:57, 11 August 2019 (archived #8). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

1. Page curation tool: 'No problems'

Done

This request has not been resolved as of 15 June 2017.

In the "Page info" section of the toolbar, there is a subheading "Possible issues" which contains the text "No problems have been found for this page so far." if there are no maintenance templates currently on the page. This is misleading and has no apparent function. Either list the problems that have already been automatically identified and listed on the New Page Feed, or just remove it from the Page Curation flyout entirely. Originally brought up by Kudpung in August 2012. Some talk, not done.--Kudpung กุดผึ้ง (talk) 00:59, 8 September 2016 (UTC)

I've edited the first sentence in your explanation for clarity, to explain where the confusing message is found and what it does. Quiddity (WMF) (talk) 18:36, 3 October 2016 (UTC)
@Quiddity (WMF) and Kaldari: still not done although listed at Phab. Perhaps they don't understand what is required.--Kudpung กุดผึ้ง (talk) 08:40, 26 November 2016 (UTC)
The request is easy to understand. Kaldari (talk) 22:41, 28 November 2016 (UTC)
This has been completed and will be deployed on June 20 MusikAnimal talk 03:01, 15 June 2017 (UTC)

3. Feed symbols

 Done

Pages tagged for deletion by PROD, BLPPROD, CSD, or AfC, and pages tagged for COPYVIO and NOTENGLISH, should be shown with the dustbin (AE: trash can) icon, and not the green 'checked' icon. It should be obvious that this would enable admins who are patrolling the quality of the patrollers themselves rather than new pages to either immediately delete those pages, or - just as importantly - revert any tags that have been inappropriately or erroneously applied, and then use the 'unreview' button, which should then send the 'unreviewed' message automatically to the patroller, using a dropdown list of canned reasons. Kudpung กุดผึ้ง (talk) 04:39, 4 September 2016 (UTC)

  • Support - This is very important. In fact, there should also be an indication of the type of deletion nomination (rather than a generic trash can) and the user name of the patroller who tagged it as such. I will propose this separately.- MrX 13:08, 16 September 2016 (UTC)
  • Comment - I have come to feel that placing an AfD control in the NPP system is counterproductive. If a page is totally inappropriate for Wikipedia, it should be tagged for speedy deletion and kicked straight over to an admin to think about. If it's debatable, we should tell people to message the creator saying that they think the page has problems before they do anything, or draftify it. That could help avoid debacles like some dumb deletion listings I made when I was starting to do NPP, or more recently this one where the page creator just hadn't finished writing the article when it was sent to AfD. (Obviously the flip side to that is that there would probably be more CSD nominations, but they'd get rescinded immediately if inappropriate rather than becoming a deletion discussion that drags out for weeks.) Blythwood (talk) 03:05, 7 October 2016 (UTC)

5. Very short article

 Done

Any new pages that contain only one body section with fewer than 100 words or 700 characters (both parameters debatable here), or that contain only an infobox and/or an image, should be displayed with a red alert alongside those for 'no citations' etc., as a stub. Kudpung กุดผึ้ง (talk) 05:03, 4 September 2016 (UTC)

7. Recreations

done

New pages feed should display if the new page is a recreation of a previously deleted page. First suggested by Carcharoth in 2012, the reasons for this alert should be obvious but previous suggestions for this were not considered by the Foundation to be important. Kudpung กุดผึ้ง (talk) 21:54, 7 September 2016 (UTC)


8. No Index until patrolled

 Done
No indexing for search engines until an article has been successfully patrolled as appropriate for the encyclopedia. This is not the same as not allowing articles to be published such as at AfC. First suggested by WereSpielChequers and discussed in relative depth in 2012 here, it gained serious support, but possibly due to the community believing this to be a fait accompli, it received no further attention. This is possibly a policy issue and may need consensus from the broader community, although as a cross-Wiki critical issue it could be implemented by the Foundation if at least they could be persuaded of the necessity - especially today. Kudpung กุดผึ้ง (talk) 22:19, 7 September 2016 (UTC)

There was in fact a RfC on this which gained consensus at [[1]]. As with many WMF promises to do it, they never did. --Kudpung กุดผึ้ง (talk) 10:45, 8 September 2016 (UTC)

Thanks, Kaldari. It would be good if you could get this done fairly quickly. Kudpung กุดผึ้ง (talk) 11:38, 5 October 2016 (UTC)
@Kudpung: I've submitted a patch to resolve this (and also the other part of Wikipedia:Requests for comment/NOINDEX, which was to noindex articles containing certain templates, such as speedy deletion templates). Once the patch is reviewed, merged, and live, I'll let you know and help set up the on-wiki configuration for the noindexing templates. Kaldari (talk) 06:38, 6 October 2016 (UTC)
Great, thanks Ryan.--Ymblanter (talk) 07:06, 6 October 2016 (UTC)
I've reviewed and merged Kaldari's patch. It'll be deployed as part of the weekly deployment on Thursday October 13th (in the evening hours UTC). --Roan Kattouw (WMF) (talk) 00:29, 7 October 2016 (UTC)
Also: Kaldari mentioned the "noindex template" feature, which his patch also reactivates. The way this works is you'll be able to add a list of template names, separated by | (sorry) to MediaWiki:Noindex templates, and any page that contains any one of those templates will be noindexed. When you first set this up, it won't work retroactively, as pages only become noindexed when they are created, edited or (un)patrolled or purged; so any pages that already had one of these templates when the feature was activated won't immediately be noindexed. For the main use case that it was suggested for (speedy deletion tags) that shouldn't be too much of a problem, because those templates tend to have high turnover.
Similarly, the "noindex unreviewed new articles" feature won't immediately apply to unreviewed new articles created before the feature was activated, but only when something happens to the article. --Roan Kattouw (WMF) (talk) 01:18, 7 October 2016 (UTC)
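The matching rule described above (a pipe-separated list of template names, with a page noindexed if it contains any one of them) can be sketched roughly as follows. This is only an illustration in Python, not the extension's actual code: the function names `normalize`, `parse_noindex_list` and `should_noindex` are hypothetical, and the name normalization (underscores as spaces, first letter case-insensitive) is an assumption about how template names would be compared.

```python
def normalize(name):
    # Assumption: underscores are equivalent to spaces and the first
    # letter of a template name is case-insensitive.
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def parse_noindex_list(raw):
    """Split a pipe-separated list such as 'Db-g1|Db-g2|Copyvio' into a set of names."""
    return {normalize(name) for name in raw.split("|") if name.strip()}


def should_noindex(page_templates, raw_list):
    """Return True if the page transcludes at least one listed template."""
    listed = parse_noindex_list(raw_list)
    return any(normalize(template) in listed for template in page_templates)
```

For example, `should_noindex(["db-g1", "Infobox person"], "Db-g1|Db-g2|Copyvio")` would be true under these assumptions, while a page transcluding only `Infobox person` would not match.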
MediaWiki:Noindex templates seems like a mistake. Shouldn't noindex templates be handled community-side by just adding __NOINDEX__ to the template itself?
BTW this raises a question. What happens when an article has both __INDEX__ and __NOINDEX__? Alsee (talk) 12:01, 8 October 2016 (UTC)
Good points, I'll play with putting __NOINDEX__ in a template and see what that does. I'll also investigate the index+noindex conflict situation you mentioned. --Roan Kattouw (WMF) (talk) 20:40, 8 October 2016 (UTC)
Looks like __NOINDEX__ does not work when transcluded. We could look into making that work (which I agree would be a better solution than MediaWiki:Noindex templates), but we'd have to make sure that doesn't have unintended side effects for templates whose authors relied on this behavior (i.e. put a __NOINDEX__ directive in the template without bothering to protect it with <noinclude>). --Roan Kattouw (WMF) (talk)
@Roan Kattouw (WMF), it is extremely surprising that __NOINDEX__ doesn't work when transcluded. I just did a search[2] for it in EnWiki template namespace. The community clearly believes this does work, with many templates deliberately including NOINDEX onto "problem" pages that we don't want indexed. Several templates have explicit logic to only apply NOINDEX in certain namespaces. We even have a template {{NOINDEX}} for this sole purpose, with nearly 200,000(!) transclusions. I skimmed the 136 search hits in template namespace and I only found one pair of templates that was mistakenly including NOINDEX onto a cluster of 10 article pages. The template-pair was created as an editor's first attempt at template-editing. I fixed the problem.
This definitely should be fixed. BTW, if you know why it doesn't work, or if you find out, I'd love to hear why. I haven't done much with templates, but everything I do know screams to me that this should work.... unless there's code explicitly (and mysteriously) filtering it out for some reason. Alsee (talk) 19:48, 10 October 2016 (UTC)
@Alsee: Interesting! Yeah, the existence of Template:NOINDEX is a pretty clear indication that it's expected to work. It clearly doesn't though: I sampled Special:WhatLinksHere/Template:NOINDEX and most of them are noindexed, but that's because they're not in the main namespace (and so are noindexed by default); not one of the main namespace pages on that list is noindexed. I'll see if I can discover why this doesn't work, but I imagine double-underscore directives may not be transcludable in general. --Roan Kattouw (WMF) (talk) 16:36, 11 October 2016 (UTC)
Circling back here from WP:VPT: it turns out there is a setting that completely disables __INDEX__/__NOINDEX__ in the main namespace. Mystery solved. We could disable that for the main namespace on English Wikipedia and put __NOINDEX__ in deletion templates, but we'd have to proceed carefully because we could accidentally noindex lots of pages that shouldn't be by lifting that restriction. --Roan Kattouw (WMF) (talk) 17:28, 12 October 2016 (UTC)
  • Roan Kattouw (WMF), I'm not sure what's going on here, but the requirement is clear that new pages are NOT indexed until they are patrolled as OK without glaring issues. Pages tagged for any deletion process or for COI or COPYVIO will also remain NO INDEX until they are either deleted or allowed to stay. We volunteer editors are not necessarily interested in the technicalities of how the software is addressed to get these features finally implemented. What is absolutely clear, however, is that these features must be fully automated - we do not have manpower enough to manually Index/NOINDEX pages as well as all the other controls we have to do. Kudpung กุดผึ้ง (talk) 14:09, 8 October 2016 (UTC)
    • @Kudpung: Sorry, let me clarify: once the feature is deployed on Thursday, you won't have to do any manual work to noindex pages that were created or tagged after Thursday, they'll be noindexed automatically. The only thing that won't be automatic is noindexing of pages that were created or tagged before Thursday. I expect the "tagged before Thursday" case is going to be low-impact, because as I understand it, the main use case is speedy deletion templates and those don't stay for long; but if I'm missing something here (e.g. longer-term templates), please explain and I'll see what can be done about it. As for the "created before Thursday" case, that will be annoying at first (although again, something we might be able to work around if it's a problem), but since the feature only works for 90 days that'll eventually be moot in January. (Reviewedness data is lost 90 days after creation, so we can't know if a page was reviewed past that point and have to stop noindexing it.) --Roan Kattouw (WMF) (talk) 20:38, 8 October 2016 (UTC)
Roan Kattouw (WMF), thanks for the clarification. This is an extremely important feature because for one thing, it will also act as a strong deterrent to Orangemoody style of attacks on the encyclopedia. There is nothing that can be done about pre-Thursday creations because they are already indexed by search engines and those SE listings can't be reverted, so it's too late for them unless of course a script can be devised to search the site for pages that still have the 'patrol this page' or 'unpatrolled' on them and then NOINDEX can be added - not that it would help much, though. What we have to ensure, however, for the future, is that NO INDEX really works because a Google bot indexes our new pages within a few milliseconds.

The pages to be NOINDEX are:

  • All new mainspace pages except those from WP:Autopatrolled users and admins
  • All new mainspace pages moved to mainspace from other namespaces except those from WP:Autopatrolled users and admins.
  • New Drafts, user pages and sub pages are, I believe, NOINDEX by default, but please check that this is so and is working.

Most important: new pages that may be shown as patrolled but where the patroller has added tags for:

  • CSD all criteria
  • COPYVIO (there are several different templates being used for this ranging from CSD to close-paraphrasing) including the ones applied automatically by the various duplication/copyvio detection bots.
  • Articles not in English (such articles might have already been patrolled by patrollers who see no harm in foreign language articles and will pass them as patrolled). This might need some kind of language detector. The main offending languages are Arabic and Persian which often contain offensive political or religious propaganda, and Chinese, and occasionally articles in Cyrillic.

I think if we can get all this done, we'll see a significant reduction in the number of unwanted articles and it will give the patrollers more breathing space, and us more time to investigate ways of recruiting new, more qualified patrollers.Kudpung กุดผึ้ง (talk) 02:37, 9 October 2016 (UTC)

@Kudpung: I think we can do almost all of those things with the code that will ship on Thursday. Thanks for pointing out that the ship has already sailed on pre-Thursday creations because they've already been indexed anyway, I hadn't quite thought about it that way. As for your list:
  • New page creations after Thursday should be noindexed automatically until they are reviewed (if created by autopatrollers, they become reviewed immediately). This'll be easy to verify on Friday by inspecting an unpatrolled page (the easiest way is to right click, "view source" and search for "noindex").
  • Pages moved into the main namespace: from #9 (Pages moved to mainspace from other namespaces), it sounds like those should work just like page creations. This could also be verified on Friday but someone may have to deliberately create a test case, I don't know if these occur commonly enough that you can just easily find an existing case.
  • User pages and subpages appear to be noindexed (already, right now), but global user pages like mine aren't (probably a bug). Draft pages appear to already be noindexed as well.
  • CSD, copyvio, etc.: please compile a list of templates that should trigger noindexing and add them to MediaWiki:Noindex templates, separated by pipes like so: Db-g1|Db-g2|Copyvio. You can do this at any time, you don't have to wait until after Thursday (although the noindex feature itself won't work until Thursday), and you can also add templates to this list later (but then only pages that are tagged with those templates after you've added them will be noindexed).
As for articles in other languages: if patrollers are incorrectly approving those, then I think that's more of a social problem than a technical one. You may be able to get someone to write a bot/script to use a language detector to find such pages and tag them though. --Roan Kattouw (WMF) (talk) 00:55, 10 October 2016 (UTC)
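The "view source and search for noindex" check mentioned above could also be scripted. The following is a minimal sketch in Python, assuming the noindex directive appears as a standard `<meta name="robots">` tag in the page HTML; the class and function names are hypothetical, and fetching the page is left out so that the example only inspects an HTML string.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex,nofollow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindexed(html):
    """Return True if the HTML carries a robots meta tag containing 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Feeding it the saved source of an unpatrolled page and of a patrolled page should give different answers if the feature behaves as described.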
That all sounds very good, thank you Roan. Pages moved into the main namespace: from #9 should work just like page creations. These are very common and are fairly high on our list of priorities - we want to prevent the hundreds of Orangemoodyists from slipping their paid spam through. Don't worry too much about the pages with deletion tags - as new pages they will already be NOINDEX. Pages that are 'Not English' are more of a problem because lazy patrollers simply tag them for deletion as gibberish, which is of course completely wrong; however, if they are new pages, they will at least be NOINDEX. Kudpung กุดผึ้ง (talk) 21:22, 10 October 2016 (UTC)

NOINDEX is disabled in mainspace due to the risk of abuse, as pointed out by User:TheDJ at VPT. A list of templates that noindex would be relatively easy to abuse: it would be easy for a vandal to transclude one of them in a template with a high transclusion count or itself transcluded in high-value target pages, in a way that neither displays the template nor categorizes. It is likely to result in the pages being noindexed for quite some time in the wrong circumstances (even after revert of the vandalism on the causative page). So e.g. Barack Obama and other high profile articles could quite easily be noindexed for some time. We should develop safeguards before going ahead with such a system. IMO, the noindex until patrolled system is enough. Pages older than 3 months appropriately tagged for speedy deletion as attack pages, vandalism or copyright violations are quite rare, and having already been indexed for more than 3 months, the extra minutes they will remain indexed until deletion aren't what we should worry about. Cenarium (talk) 12:10, 14 October 2016 (UTC)

I think you're pointing out something that has already been covered. Later tagging for deletion of a page that has already been passed as reviewed can occasionally happen, but it's too late to get it unindexed in search engines. Kudpung กุดผึ้ง (talk) 12:35, 14 October 2016 (UTC)
So we won't be using MediaWiki:Noindex templates then. Cenarium (talk) 13:07, 14 October 2016 (UTC)
Good points, and thanks for pointing out that we need to be careful. Kaldari just redid the noindex templates feature and changed it to a config setting instead of an on-wiki list. So if we decide we still want to have certain templates trigger noindex (and it sounds like we might not), then we can configure it there. Apologies for the confusion; this sounded like a good idea until people pointed out what it really meant. --Roan Kattouw (WMF) (talk) 21:51, 14 October 2016 (UTC)

 Done It looks like this is working now. At the time of this writing List of Petz Club episodes was unpatrolled and noindexed, and 1936 Victorian Sporting Car Club Trophy had just been patrolled and was not noindexed. --Roan Kattouw (WMF) (talk) 21:53, 14 October 2016 (UTC)

So here's the current status of this feature: Right now any templates listed at MediaWiki:Noindex templates will trigger noindexing. However, after deploying this I realized that this system isn't very efficient. Every time a page is viewed (unless it is viewed from cache), it has to grab the contents of MediaWiki:Noindex templates, parse out all the template names, and compare them with the templates transcluded in the page. This potentially slows down page render for every page on Wikipedia. If all that is desired is to noindex pages tagged for speedy deletion, there is only one template that needs to be considered: {{db-meta}}. If we're only going to look for 1 or 2 templates (or possibly none per Cenarium), having an on-wiki configuration system is overkill (and an unnecessary performance hit). We should just decide which templates to check, set them in a server-side config variable, and leave it at that. I've already written some new code to do this, so let me know what you think. Kaldari (talk) 22:12, 14 October 2016 (UTC)
  • @Kaldari:, @Roan Kattouw (WMF):, NOINDEX is not working. If all new pages are not indexed until they are officially patrolled, then the bug is that when they are tagged, the tag indexes them, Google indexes them, then an admin deletes the page, leaving Google with an entry to a deleted page. The purpose of NOINDEX is to avoid pages being indexed at all. Kudpung กุดผึ้ง (talk) 11:15, 15 October 2016 (UTC)
    Maybe we could enable template noindexing but only for pages not older than 90 days, which would solve this issue but prevent abuse. Cenarium (talk) 12:57, 15 October 2016 (UTC)
    @Kudpung and Cenarium: The feature to noindex based on templates is currently in place. I had originally configured it to noindex any article transcluding the {{db-meta}} template (which means all articles marked for speedy deletion). I removed this, however, based on Cenarium's concerns above. As soon as there is consensus for how to use this feature, I'll be happy to configure it however you want. We can also add a 90 day threshold if that seems like a good idea. Kaldari (talk) 20:38, 15 October 2016 (UTC)
@Kaldari, Roan Kattouw (WMF), and Cenarium:, What I pointed out was that while all new pages created in or moved to mainspace should not be indexed (and AFAICS we never discussed for how long), tagging them for CSD certainly indexes them for a split second, long enough for Google to grab them, leaving the deletion (not the deletED) page in Google. For older pages that were already indexed when later tagged for deletion, it's too late - we will have to live with that. Kudpung กุดผึ้ง (talk) 21:07, 15 October 2016 (UTC)
@Kudpung: I think we're all aware of that. Cenarium, however, objected to the use of this feature in the discussion above due to the potential for abuse. (This was also echoed by TheDJ and PrimeHunter in the village pump discussion.) I did some testing and confirmed that the feature could be abused by inserting something like <div style="display:none;">{{db-hoax|nocat=true}}</div> into an article. It sounds though like Cenarium would be willing to accept the potential for abuse if it were limited to articles less than 90 days old. What do you think of this idea? Kaldari (talk) 21:56, 15 October 2016 (UTC)
@Kaldari, Roan Kattouw (WMF), and Cenarium:, There's something I don't quite understand here, but that's quite possibly because it's nearly 6am here and I've been up all night preparing other stuff in anticipation of the consensus of the final RfC on the New Page Reviewer right. As Roan already understood, the ship will have sailed for any articles that have already been passed as 'patrolled OK'. Retro tagging with tags that add NOINDEX doesn't remove the listings by the search engines. In any case it is nowadays impossible to get anything removed from a search engine almost under any conditions other than an expensive legal takedown order. All that needs to be ensured is that from now on all new pages created in, moved to, or otherwise posted to mainspace and draft space are NOINDEX until patrolled by an authorised reviewer (which means removing the 'mark this page as patrolled' link from the sight of non-approved patrollers) and that no deletion tags or maintenance tags for serious issues automatically mark a new page as 'patrolled'; userspace, AFAIK is by default (and we hope) NOINDEX, because users use their user pages and sub-pages, and even their talk pages for posting artspam and their commercial links. Perhaps Cenarium's concerns about <div style="display:none;">{{db-hoax|nocat=true}}</div> can be addressed by a local filter. How realistic are your concerns, Cenarium, what do you expect to be the frequency? Kudpung กุดผึ้ง (talk) 23:13, 15 October 2016 (UTC)
I don't think a filter would be a nice solution, not only it adds to the condition limit (which is a perennial concern of edit filter managers), but it can be evaded. We just had a fixed position vandalism that is supposed to be prevented by Special:AbuseFilter/139. In terms of frequency, if it does happen, I don't think we'd let it happen a second time, but I'm pretty sure it's bound to happen within a few months if enabled.
Regarding noindex, it isn't "once indexed, always indexed", because we can modify the html of our pages (and Google respects noindex). Google checks the page html for a noindex tag, so if the page suddenly gets a template that causes it to have a noindex tag, it will no longer be indexed by Google the next time it crawls the page (which happens quickly for wikipedia). The potential for abuse is precisely in some established, high-profile pages getting silently noindexed by a template. Cenarium (talk) 01:05, 16 October 2016 (UTC)
@Kudpung: The problem is that there isn't any such distinction as "patrolled OK" or "patrolled not OK". An article is "patrolled" (as far as the software is concerned) as soon as a reviewer is finished reviewing it, regardless of whether the final decision is to keep the article or not. Changing how that works would require rewriting a lot of software and it isn't clear how such an approval-based system would work anyway. (For example, there is no reliable mechanism for the software to detect when an AfD or PROD has been closed other than looking for templates or magic words, which brings us back to the original abuse problem.) Personally, I think we should either abandon the noindex template feature or implement Cenarium's suggestion. Using an abuse filter might help, but the abuse filter system is already under heavy strain (as mentioned by Cenarium) and I doubt it would be 100% effective. Plus, it would require every wiki using PageTriage to write their own abuse filters, and not all wikis have administrators that know how to do this. (Eventually PageTriage is going to be deployed to other wikis, so we have to make sure it's easy to use and hard to abuse.) Kaldari (talk) 01:41, 16 October 2016 (UTC)
Sorry, @Kaldari:, it's me confusing you. I just don't think like a software programmer, I think as a linguist ;) I know there is no distinction. A page is either 'patrolled' or it isn't - but it can be made 'unpatrolled' again if a patroller screwed up, thus leaving it open in the list for a more experienced patroller to review. The down side to this is that Google will already have snatched it. We'll have to live with that. An article is "patrolled" (as far as the software is concerned) as soon as a reviewer is finished reviewing it, regardless of whether the final decision is to keep the article or not. - absolutely correct, bearing in mind that adding tags of any kind other than deletion tags does make an article 'patrolled'. If you want to help me understand this more (and I think I need to), don't hesitate to Skype me now. Kudpung กุดผึ้ง (talk) 02:04, 16 October 2016 (UTC)
@Kudpung: Currently, adding a deletion tag via the page curation toolbar does mark an article as patrolled. This is to remove it from the backlog and let other patrollers know that it doesn't need to be reviewed again. I'm still interested to know if you think limiting the noindex template feature to articles created in the past 90 days is a good idea or not. What are your thoughts on that? Kaldari (talk) 20:45, 17 October 2016 (UTC)
@Kaldari:, we obviously don't want pages tagged for deletion to be indexed by Google - that would be counter to the whole concept of control of new content. I'm not sure, however, what is even meant by the 90-day suggestion. I hadn't heard of it until you mentioned it recently. Is it supposed to mean that pages not reviewed will automatically become free for Google to index after 90 days? If so, is there any reason why they can't remain NOINDEX indefinitely? I would be ready to understand an argument that we should perhaps nevertheless instill some form of urgency on reviewers and that this could be one way to do it. We could also move such pages to Draft where they could then be semi-automatically deleted after 6 months if not improved.
@Kudpung: The main danger of the noindex template feature (as explained by Cenarium) is that someone will use it to noindex a high profile article that they don't like. With the current system, this could be done in a way that is almost completely undetectable. For example, you could create a seemingly innocent template, add it to the Hillary Clinton article, and then the day before the U.S. election add <div style="display:none;">{{db-hoax|nocat=true}}</div> to the template that you embedded weeks before in the Clinton article. Nothing noticeable would change on the Clinton article except that it would suddenly stop being indexed by Google. This is potentially a much worse problem than Wikipedia spam getting indexed by Google. Cenarium's suggestion was to limit the noindex template feature to only affect articles that are less than 90 days old. That would dramatically limit the abuse potential, while still giving new page patrollers and administrators sufficient time to remove the obvious spam and attack pages. And as you mentioned, this might also give NPPers extra incentive to keep the backlog under 90 days old. This seems like a good solution to me, but I would like to know if you support the idea as well. Kaldari (talk) 19:05, 18 October 2016 (UTC)
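The 90-day limit proposed here can be illustrated with a small decision function. This is a hedged sketch in Python, not how PageTriage implements it: the function name `should_noindex`, its parameters, and the way the two rules (unreviewed new articles, and trigger templates) are combined are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the discussion; everything else is hypothetical.
NOINDEX_WINDOW = timedelta(days=90)


def should_noindex(created, now, reviewed, has_trigger_template):
    """Decide whether an article would carry a noindex tag.

    Assumed rules:
    - articles 90 days old or more are never noindexed by this feature,
      so a hidden deletion template cannot silently noindex an
      established, high-profile article;
    - younger articles are noindexed while unreviewed, or while they
      transclude a trigger template such as a speedy deletion tag.
    """
    if now - created >= NOINDEX_WINDOW:
        return False
    return (not reviewed) or has_trigger_template


# A long-established article with a hidden deletion template stays indexed.
old_article = datetime(2008, 1, 1, tzinfo=timezone.utc)
today = datetime(2016, 10, 18, tzinfo=timezone.utc)
print(should_noindex(old_article, today, reviewed=True, has_trigger_template=True))
```

Under these assumptions the abuse scenario described above would have no effect on an article created years earlier, while a recently created page tagged for speedy deletion would still be noindexed.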
@Kaldari:, Cenarium. Sorry for having been so slow to figure this. Now that I have understood, yes, of course I support it :) Kudpung กุดผึ้ง (talk) 19:24, 18 October 2016 (UTC)
This'll create a backlog, probably. Adotchar (talk) 09:35, 26 October 2016 (UTC)
It won't, Adotchar, and as you've been asked to refrain from patrolling anyway, you need not worry about it. Kudpung กุดผึ้ง (talk) 14:15, 26 October 2016 (UTC)
Good point. Adotchar (talk) 14:32, 26 October 2016 (UTC)
  • In order for this to be of real use the Curation Tool needs to be set to where it does not automark pages as patrolled on every action. For instance if some basic tags (or even CSD, AfD, BLPPROD) are placed the page is marked patrolled and as I understand it, it becomes indexed even if you immediately unpatrol it again.

    This is also an issue for reviewers who are just tagging low hanging fruit. Right now the practice is to immediately 'unpatrol' the page but this will still end up with the page indexed when it should not be.

    There is an option in TW to mark pages patrolled when tagged but for deletion tags it should never do so. JbhTalk 15:51, 26 October 2016 (UTC)
    • @Jbhunley and Kudpung: We originally implemented the feature as you describe: You had to explicitly indicate that you wanted to mark the article as reviewed. However, users complained that this was tedious and they wanted it to happen automatically, so we changed it way back in 2012. Also, it is not accurate that an article is immediately indexed between the time that you tag it and "unreview" it. Unless people are massively linking to it from outside Wikipedia (for example, during a news event), it will take Google several hours to find and index an article after it is reviewed. I've tested this myself. And to dispel one of Kudpung's earlier concerns, it is also not accurate that once Google finds an indexable article it will hold on to it forever. As soon as Google revisits an article that is deleted (or unreviewed) it will see the 404 or noindex tag and immediately remove the article from its index. The chances of a CSD-tagged article getting indexed by Google are now extremely low, and even if it does get indexed, it won't stay on Google for more than a day after it's deleted. The actual risk of damage here is extremely low, especially compared with how much regular vandalism is indexed by Google every day. Kaldari (talk) 05:49, 27 October 2016 (UTC)
      @Kaldari and Jbhunley:, I'm relieved to hear this, although up until recently, new pages were indexed by Google even quicker than our page reviewers were able to load the New Pages Feed. As these improvements to Page Curation are concomitant with the introduction of the new user right for new page reviewers, which as you know goes into effect at 22:00 UTC tonight (14 hours from now) we need to know that these things will be working because they are partly the reasons why both the right and the wrong people have drifted back to using the old feed and Twinkle. The reasons for the changes are to ensure that no one without the right can tag an unpatrolled page for maintenance or deletion either through Twinkle or Page Curation. This is to avoid new users being bitten - and we've lost a few in the last few days. The other equally important objective is to ensure that inappropriate new pages do not get patrolled and released into the encyclopedia. The underlying major problem is that we have exhausted all admin capacity for constantly having to monitor the work of the patrollers, and locate the paid spammers who are gaming the system with their sleeper accounts. Kudpung กุดผึ้ง (talk) 06:18, 27 October 2016 (UTC)
      • @Kudpung: Yes, I believe you're correct that somehow Google was aggressively indexing new articles prior to the noindex fix. I'm not sure exactly how they were doing this, but it may have been through monitoring RCStream for new pages. Luckily or unluckily (depending on your point of view) Google is not as good at catching when articles become indexable after being reviewed (at least from my tests), so I think we're actually in pretty good shape now as far as noindexing. If anyone notices issues though, please let me know. Kaldari (talk) 07:01, 27 October 2016 (UTC)
        @Kaldari: So does 'unpatrolling' reinstate the NOINDEX? My concern is that doing initial tagging will remove the NOINDEX before the review is complete. For instance if, on first read, I tag {{orphan}}, {{deadend}}, {{unsourced}} or similar low-hanging fruit but I am not familiar enough with the subject to complete the review, I will 'unreview' so someone who knows the topic can make the final call. In that case would the article be in the state of unpatrolled and __INDEX__ simultaneously?

        A related question is whether {{blpprod}} articles remain NOINDEXed. Now BLPPRODed articles are automarked patrolled but I would think we would want to ensure unsourced BLPs are not indexed. Maybe that is a separate issue...

        JbhTalk 14:38, 27 October 2016 (UTC)
Jbhunley, Kaldari, FYI Roan Kattouw (WMF), Cenarium: The effort with the re-implementation of the NOINDEX feature is to ensure that no articles, tagged or otherwise, will be indexed by search engines until finally approved by a reviewer.
The technical issues surrounding deletion templates must be resolved because:
  • Many creators remove CSD, PROD, and BLPPROD templates contrary to instructions and policy. Allowing such articles to become indexed would defeat the entire purpose of patrolling new pages.
  • Many patrollers use the wrong deletion templates (which have to be reverted), or some pages are wrongly templated for deletion but may certainly not be ready for publication (i.e. indexing by search engines).
We need some fast updates on this because the new user right goes live on Monday. Kudpung กุดผึ้ง (talk) 22:37, 29 October 2016 (UTC)
@Kudpung and Cenarium: I've submitted a new patch for PageTriage that limits the functionality of the noindex template feature to articles less than 90 days old. Right now, there are no active noindex templates, but I think the best course of action will be to designate the existing {{NOINDEX}} template as a noindex template and add a special tracking category (that can't be disabled via a template parameter) that tracks all main namespace articles that transclude the template. The community can then embed the {{NOINDEX}} template in whichever deletion templates they want it to apply to. (It's already embedded in the {{Db-g11}} spam template.) To prepare for this, I've already added the tracking category to the template and set up Category:Noindexed articles. Once the new patch is merged and deployed (which may take a week or two), I can add {{NOINDEX}} as an official noindex template and it will start actually noindexing the articles in Category:Noindexed articles that are less than 90 days old. This approach should be fairly resistant to abuse since you won't be able to disable the category inclusion and it will be limited to new articles. Kaldari (talk) 03:20, 31 October 2016 (UTC)
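The rule described above can be sketched roughly as follows. This is only an illustration of the logic as stated in this thread, not the actual PageTriage code: the `Article` class, its field names, and `should_noindex` are hypothetical, and treating every unreviewed article as noindexed regardless of age is a simplifying assumption.

```python
from dataclasses import dataclass

# 90-day cutoff as described in the comment above; the constant name is hypothetical.
MAX_AGE_DAYS = 90


@dataclass
class Article:
    """Hypothetical stand-in for the page state relevant to indexing."""
    age_days: int               # days since the article was created
    reviewed: bool              # marked as reviewed (patrolled)
    has_noindex_template: bool  # transcludes a designated noindex template


def should_noindex(article: Article, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """Return True if the article should carry a noindex tag.

    Assumption: unreviewed articles are always noindexed. Reviewed articles
    are noindexed only if they transclude a noindex template AND are newer
    than the cutoff, which is what limits abuse on old articles.
    """
    if not article.reviewed:
        return True
    return article.has_noindex_template and article.age_days < max_age_days


if __name__ == "__main__":
    # A reviewed, recently created article tagged with a noindex template.
    print(should_noindex(Article(age_days=5, reviewed=True, has_noindex_template=True)))
    # An old article tagged for deletion: the template has no effect.
    print(should_noindex(Article(age_days=400, reviewed=True, has_noindex_template=True)))
```

Under these assumptions, adding a noindex template to a long-established article changes nothing, which is the abuse resistance the 90-day limit is meant to provide.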
@Kaldari: Thanks for writing that patch, I've merged it. Also thanks to Cenarium for reviewing it (twice) and providing good feedback. @Kudpung: This patch would normally be scheduled to be deployed on Thursday November 10th around 20:00-22:00 UTC (merged on a Tuesday + going to enwiki = longest possible wait of 9 days), but if there's a reason why it's needed earlier, tell me (and tell me what the reason is), and I can expedite it to Thursday November 3rd around 23:00-00:00 UTC. (Sorry to have been away from this discussion for a while, and then come back and ask a relatively basic question; I had a busy time followed by a family emergency and today is my first day back.) --Roan Kattouw (WMF) (talk) 02:04, 2 November 2016 (UTC)
Thanks Roan - I also know all about family emergencies and I hope all is well. I've just flown back to Thailand from the UK where my father passed away last week. We've been without the required NOINDEX feature for so long that I guess a few more days won't hurt (around 25% of all the 5.5 million articles on the en.Wiki probably shouldn't be there). One of the main objectives for no-indexing is to dissuade the paid SEO spammers from thinking that getting their client on Wikipedia with its associated top-of-the-results at Google will do them any good. In today's climate of Orangemoody it becomes rather critical and we volunteers don't get paid for our work, or for writing the core code. Kudpung กุดผึ้ง (talk) 02:18, 2 November 2016 (UTC)
I'm sorry for your loss. Mine was quite similar: I've just returned to California after an unplanned trip to Europe to attend my grandfather's funeral. It sounds like you're happy to wait until the 10th, which saves me the work of expediting this change. I did just realize that we have to explicitly enable the noindex template feature after Kaldari's patch is deployed (it's currently disabled because of the abuse concerns from this thread), so I wrote that patch too and scheduled it for Friday November 11 00:00-01:00 UTC (i.e. Thursday afternoon US time, a couple hours later than what I said before). --Roan Kattouw (WMF) (talk) 16:45, 2 November 2016 (UTC)

 Done The {{NOINDEX}} template change was just deployed, and I verified that it works. --Roan Kattouw (WMF) (talk) 00:19, 11 November 2016 (UTC)

@Kudpung: This is working now. Any templates that you want to be noindex templates, just transclude {{NOINDEX}} within them. See for example, {{Db-g11}} (the spam deletion template), which already has this. Now pages like Collibra and Youssif Isa are noindexed, even though they've already been reviewed. Keep in mind this only works for new articles though, not old articles that have been marked for deletion. Kaldari (talk) 02:25, 11 November 2016 (UTC)
Thank you everyone, Roan Kattouw (WMF), Kaldari, Cenarium, et al for all your hard work on this. The next stages for getting Page Curation and its feed up to date in order to get people using them with their new user right are all documented on this page. I shall shortly be posting the community's short wish list of priorities which will encourage the holders of the new New Page Reviewer right to use it. Thanks again. (FYI: xaosflux). Kudpung กุดผึ้ง (talk) 03:11, 11 November 2016 (UTC)