Wikipedia:Bots/Requests for approval/DYKToolsAdminBot
Operator: RoySmith
Time filed: 15:34, Wednesday, March 1, 2023 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available: https://github.com/roysmith/dyk-tools/tree/main/dyk_tools/bot
Function overview: Applies move protection to DYK articles which are on the main page or in the queue to be placed on the main page soon.
Links to relevant discussions (where appropriate): Wikipedia talk:Did you know/Archive 188#Move protection
Edit period(s): Continuous
Estimated number of pages affected: 10 per day
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): No
Function details:
Definitions
Nomination: A nomination template, i.e. a subpage of Template:Did you know nominations.
Hook: A string starting with "..." and ending with "?". Optionally includes a tag such as "ALT1".
Target: An article referenced from a hook using a bolded wikilink. All hooks have one or more targets.
Hookset: A template containing a collection of hooks along with other metadata. One of
- ... that the Rani of Jhansi (statue pictured) has been described as "the greatest heroine of Indian history"?
- ... that American-born Freddie Lish called playing for the Thai national basketball team, "my dream ever since I found out my mom was born in Thailand"?
- ... that Pakistani prime minister Mohammad Ali Bogra opposed the 1955 election to protect his political position?
- ... that nine days after his heart transplant, J. C. Walter Jr. merged his company Houston Oil & Minerals with Tenneco, then retired to his ranch and shortly after founded Walter Oil & Gas?
- ... that during his final known voyage, Lope Martín overthrew two captains before being marooned on Ujelang Atoll?
- ... that a Japanese governor wore a pregnancy simulation suit to encourage men to help out their wives at home?
- ... that the shooting of Eileen Quinn was referenced in the poetry of W. B. Yeats?
- ... that Victor L. King employed only black people from the South in 1917 at his new chemical plant in New Jersey to "prevent the entrance into the organization of any enemy aliens" during World War I?
- ... that Cory Booker delivered the longest recorded individual speech in United States Senate history while protesting the second presidency of Donald Trump?
(i.e. the current hookset), the 7 numerically named subpages of Template:Did you know/Queue, or the 7 numerically named preparation areas (Template:Did you know/Preparation area 1, etc.).
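To make the hook and target definitions concrete, here is a minimal sketch of pulling targets out of a hook's wikitext. It is not the bot's actual code. It assumes targets are written as `'''[[Title]]'''` or `'''[[Title|label]]'''`; the regular expression, the `extract_targets` name, and the example hook are all illustrative assumptions.

```python
import re

# Assumption: a target is written as '''[[Title]]''' or '''[[Title|label]]''',
# optionally with a #section. Links generated by templates are not handled.
BOLD_LINK = re.compile(
    r"'''\s*\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]\s*'''"
)


def extract_targets(hook: str) -> list[str]:
    """Return the titles of the bolded wikilinks in a hook, in order."""
    return [match.group(1).strip() for match in BOLD_LINK.finditer(hook)]


# Hypothetical hook text, for illustration only.
example_hook = (
    "... that '''[[Example article|an example]]''' and '''[[Second example]]''' "
    "are both targets, while [[Plain link]] is not?"
)
example_targets = extract_targets(example_hook)
```

A regular expression like this only covers the simple cases. A real parser would likely need a proper wikitext library to deal with nesting and unusual markup.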
DYKToolsBot is already approved for a different task, but does not have admin rights. This new account (DYKToolsAdminBot) will handle tasks that require admin rights. They share the same code.
There are two distinct tasks proposed here, protect and unprotect. Both tasks are run as scheduled toolforge jobs. Currently both tasks run every 10 minutes, offset by a few minutes. The exact timing is not critical.
The protect task does:
Parses the main page and queue hooksets, extracting all the hooks. From the hooks, it extracts the targets which need protecting ("protectable targets"). These titles are indicated by wikilinks set in bold. There is typically one target per hook, but there can be more than one. For each protectable target, indefinite move=sysop protection will be applied. The page protection log messages will include a link to a page in the bot's userspace explaining the process.
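As a rough illustration of the protection step, the sketch below builds MediaWiki Action API parameters for `action=protect`, one request per distinct target. It is not the bot's implementation and does not send anything. The function names are hypothetical, and the reason text points at a placeholder userspace page, since the real page name isn't given here.

```python
def build_protect_params(title: str, reason: str, csrf_token: str) -> dict[str, str]:
    """Build Action API parameters for indefinite move=sysop protection."""
    return {
        "action": "protect",
        "title": title,
        # Caution: the API documents that actions not listed here have their
        # restrictions removed, so a real implementation would need to carry
        # over any existing edit protection.
        "protections": "move=sysop",
        "expiry": "infinite",
        "reason": reason,
        "token": csrf_token,
        "format": "json",
    }


def plan_protect_requests(
    targets: list[str], reason: str, csrf_token: str
) -> list[dict[str, str]]:
    """Return one protect request per distinct target, preserving order."""
    seen: set[str] = set()
    requests = []
    for title in targets:
        if title not in seen:
            seen.add(title)
            requests.append(build_protect_params(title, reason, csrf_token))
    return requests


# Hypothetical values, for illustration only.
example_requests = plan_protect_requests(
    ["Example article", "Second example", "Example article"],
    reason="DYK move protection, see [[User:ExampleBot/Explanation]]",
    csrf_token="dummy-token",
)
```

De-duplicating first matters because the same target can appear in more than one hookset while hooks are being shuffled.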
The unprotect task does:
Queries the bot's user log with type=protect for the previous N days, where N is long enough to cover any hooks which have progressed through the normal promotion process, plus extra time to allow for intra-queue hook swapping. N is currently set to 9, but might need to be increased; the exact value is not critical. The pages found in these log entries are the "unprotectable targets". The current list of protectable targets is acquired as in the protect task. Any targets in the unprotectable set which are not also in the protectable set are unprotected.
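The selection logic amounts to a time-window filter followed by a set difference. The sketch below assumes the log has already been fetched into `(title, timestamp)` pairs; the function names and sample data are hypothetical, and only the window of 9 days comes from the description above.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 9  # the currently configured value of N


def recently_protected(log_entries, now, window_days=WINDOW_DAYS):
    """Titles the bot protected within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    return {title for title, timestamp in log_entries if timestamp >= cutoff}


def targets_to_unprotect(log_entries, protectable, now, window_days=WINDOW_DAYS):
    """Unprotectable targets that are no longer protectable."""
    return recently_protected(log_entries, now, window_days) - set(protectable)


# Hypothetical log data, for illustration only.
now = datetime(2023, 3, 1, 15, 34, tzinfo=timezone.utc)
log_entries = [
    ("Old article", now - timedelta(days=20)),     # outside the window
    ("Left the queue", now - timedelta(days=3)),   # no longer in any hookset
    ("Still queued", now - timedelta(days=1)),     # still protectable
]
to_unprotect = targets_to_unprotect(log_entries, {"Still queued"}, now)
```

In this example "Old article" is never unprotected because it falls outside the window, which is the failure mode a too-small N would produce.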
I considered computing an expiration date and only protecting until then. The problem is that the expiration date is a moving target. Hooks often get shuffled around when problems are discovered. Sometimes hooks get unpromoted entirely after hitting a queue (or even when they're on the main page). Sometimes the queue processing schedule is disrupted by failure of the bot which manages that process (this has happened a couple of times in the past few weeks). A few times a year, queue processing toggles between 1 per day and 2 per day. Keeping track of all these possibilities and updating the expiration time would add significant complexity for no benefit. It's far simpler to use a declarative approach, in the style of Puppet; periodically figure out what state each target should be in right now and make it so, regardless of history.
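The declarative idea can be sketched as a reconciliation step: compare the desired state with the current state and act only on the difference. This is an illustration under simplifying assumptions, not the bot's code; `reconcile`, `run_once`, and the page names are hypothetical, and "managed" stands for the pages the bot currently has protected.

```python
def reconcile(desired: set[str], managed: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_protect, to_unprotect) needed to reach the desired state."""
    return desired - managed, managed - desired


def run_once(desired: set[str], managed: set[str]) -> set[str]:
    """Apply one reconciliation pass and return the new managed set."""
    to_protect, to_unprotect = reconcile(desired, managed)
    return (managed | to_protect) - to_unprotect


# Hypothetical scenario: "A" was unpromoted and "C" was swapped in.
managed = {"A", "B"}
desired = {"B", "C"}

first_plan = reconcile(desired, managed)
managed = run_once(desired, managed)

# A second pass with the same desired state has nothing left to do.
second_plan = reconcile(desired, managed)
```

Because each pass depends only on the current and desired states, it makes no difference how a hook got where it is, and repeating a run is harmless.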
Known problems
On rare occasions, hook targets are written as templates such as one of the (many) Template:Ship variants. The current code does not recognize these properly. This happens infrequently enough, and it's difficult enough to do correctly (it requires a call out to Parsoid), and the consequence is mild enough (a page doesn't get the move protection it should), that I'm not going to make it a blocker for an initial deployment.
If a target was already move protected before entering the DYK pipeline, it will have that protection removed when it transitions out of DYK. The probability of this happening is so low, I'm going to ignore it. The alternative would be to maintain a database of pre-existing protections so they could be restored properly, which seems like more trouble than it's worth.
If enough protection log history isn't examined, it's possible to miss unprotecting a target which spent an abnormally long time in the DYK queues. If it happens, the target can be manually unprotected and the history window size increased.