User:Monkbot/Task 2: CS1 deprecated coauthor parameters
In CS1 citations replace:
|coauthor=First Coauthor; Second Coauthor; and Third Coauthor
with
|author2=First Coauthor
|author3=Second Coauthor
|author4=Third Coauthor
The script works for 2–9 coauthors listed in |coauthor= or |coauthors=.
This script is intended to pluck the low-hanging fruit from Category:Pages containing cite templates with deprecated parameters. Editors often place multiple coauthors in either |coauthor= or |coauthors= (these two parameters are aliases of each other, so hereafter |coauthor=) and separate the coauthors with a semicolon, with or without a following space. This script replaces |coauthor= with an appropriate number of |authorn= parameters beginning with |author2=.
The individual authors listed in |coauthor= must be separated by a semicolon. Because it is possible that |coauthor= is used in a citation with already existing |authorn= parameters, this script requires that |coauthor= come after |author=, |author1=, |last=, |last1=, |first=, or |first1=. The presumption here is that editors are not likely to place |authorn= (or the |lastn= / |firstn= equivalents) after |coauthor=. There is no guarantee that this will be the case in all CS1 citations.
|coauthor= must not end with a semicolon.
The last coauthor may be separated from preceding coauthors with "; and" or "; &", or simply a semicolon.
The script detects |coauthor= with a single author name and locates and removes empty |coauthor= parameters.
The script does not evaluate or validate the content of |coauthor=.
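The splitting rules above can be sketched in a few lines of Python. This is only an illustration, not the AWB find-and-replace rules the script actually uses; the helper names `split_coauthors` and `to_author_params` are hypothetical, and unlike the script the sketch is case-sensitive when it looks for "and".

```python
import re

def split_coauthors(value):
    """Split a |coauthor= value on semicolons (hypothetical helper)."""
    names = [part.strip() for part in value.split(";")]
    if len(names) > 1:
        # Drop a leading "and" or "&" from the last name.
        # The word boundary keeps a name such as "andrew" intact.
        names[-1] = re.sub(r"^(?:and\b|&)\s*", "", names[-1])
    return names

def to_author_params(value, start=2):
    """Build |authorN= parameters, numbering from |author2= (hypothetical helper)."""
    return "".join(
        "|author{}={}".format(n, name)
        for n, name in enumerate(split_coauthors(value), start)
    )

print(to_author_params("First Coauthor; Second Coauthor; and Third Coauthor"))
```

The sketch does not check that a first-author parameter precedes |coauthor=, which the script requires.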
Known shortcomings
- If an editor separates a coauthor in a list of coauthors with a comma instead of a semicolon, the coauthor following the comma is grouped with the author preceding the comma:
|coauthor=First Coauthor, Second Coauthor; and Third Coauthor
becomes:
|author2=First Coauthor, Second Coauthor
|author3=Third Coauthor
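The grouping happens because the name character class used by the script accepts commas. A minimal Python sketch of the two-coauthor case shows the effect; the trailing lookahead is a simplification standing in for the script's capture of the rest of the citation, and the names `NAME` and `TWO_COAUTHORS` are hypothetical.

```python
import re

# Character class copied from the script; note that it includes the comma.
NAME = r"[\w\s\.,'\[\]-]+?"

# Two-coauthor shape of the rule. The lookahead at the end is an assumption
# made for this sketch, not part of the script.
TWO_COAUTHORS = re.compile(
    r"\|\s*coauthors?\s*=\s*(" + NAME + r");(?:\s*and\b|\s*&)?(\s*\b" + NAME + r")(?=\s*[|}])",
    re.IGNORECASE,
)

text = "|coauthor=First Coauthor, Second Coauthor; and Third Coauthor |title=X}}"
match = TWO_COAUTHORS.search(text)
author2 = match.group(1).strip()
author3 = match.group(2).strip()
print(author2)  # both comma-separated names land in one parameter
print(author3)
```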
To do
- Fix single coauthor detection so that it doesn't treat multiple comma-separated coauthors as a single author
- Do not do the replacement if the citation contains |ref=harv because such edits will break existing {{sfn}} or {{harv}} links
- Remove {{citation}} from Capture $1 because it automatically sets |ref=harv
Regex description
- Capture $1
- The several Module:Citation/CS1 based templates that detect deprecated |coauthor=; {{cite AV media}} but not {{cite AV media notes}}, {{cite news}} but not {{cite newsgroup}}:
({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)
[^}]*? keeps the regex from running off the end of the citation
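The negative lookaheads in this capture can be tried out in Python, whose `re` module supports the same `(?!...)` syntax. The sketch below checks only the template-name part of Capture $1; the function name `is_supported` is hypothetical.

```python
import re

# Template-name portion of Capture $1, with the opening braces escaped.
TEMPLATE = re.compile(
    r"\{\{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal"
    r"|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)"
)

def is_supported(text):
    """Return True when text starts with one of the template names listed above."""
    return TEMPLATE.match(text) is not None

print(is_supported("{{cite news |title=X}}"))
print(is_supported("{{cite newsgroup |title=X}}"))
```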
- Capture $2
- |coauthor= or |coauthors= must be preceded by any of |author=, |author1=, |last=, |last1=, |first=, or |first1=, which must not be empty. It is common for these first author parameters to be followed by |authorlink= (often empty), so we also capture that parameter and value if present.
(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)
[^\|}] – match any character up to the next pipe or end of cite template
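A short Python sketch of Capture $2 followed by the |coauthor= name shows what is and is not accepted. The pattern is copied from the description above; the wrapper `first_author_capture` is a hypothetical helper, and the sample citations are made up for illustration.

```python
import re

FIRST_AUTHOR = re.compile(
    r"(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?"
    r"(?:\|\s*authorlink\s*=[^}]*?)?)"
    r"\|\s*coauthors?\s*=",
    re.IGNORECASE,
)

def first_author_capture(citation):
    """Return the text of Capture $2, or None when the rule would not apply."""
    match = FIRST_AUTHOR.search(citation)
    return match.group(1) if match else None

# A non-empty first author followed by an empty |authorlink= is captured whole.
print(repr(first_author_capture("{{cite book |last=Smith |authorlink= |coauthor=A; B}}")))
# An empty first author parameter does not satisfy the rule.
print(first_author_capture("{{cite book |last= |coauthor=A; B}}"))
```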
- Capture $3–$(n-1)
- These captures make up the bulk of the regex. For this script, coauthor names must be followed by a semicolon (;).
([\w\s\.,'\[\]-]+)
– word characters, spaces, periods, commas, apostrophes (for names like O'Brian), hyphens, and (presumably) wikilink markup
- Last coauthor separation
- Quite often, editors separate the last coauthor with "; and " or "; & ". This non-capture accommodates that. The \b word boundary prevents this regex from improperly matching "and" in "andrew", etc.
(?:\s*and\b|\s*&)?
– when loading this script into AWB from a file, the symbol & must be written &amp;
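The effect of the word boundary can be checked with a small Python sketch. The pattern is the non-capture group from the description; `matched_separator` is a hypothetical helper, and `re.IGNORECASE` mirrors the IgnoreCase option set in the script.

```python
import re

SEPARATOR = re.compile(r"(?:\s*and\b|\s*&)?", re.IGNORECASE)

def matched_separator(text_after_semicolon):
    """Return the part of the text consumed by the optional separator."""
    return SEPARATOR.match(text_after_semicolon).group(0)

print(repr(matched_separator(" and Third Coauthor")))
print(repr(matched_separator(" & Third Coauthor")))
# "And" in "Andrew" is not followed by a word boundary, so nothing is consumed.
print(repr(matched_separator(" Andrew Smith")))
```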
- Last coauthor capture $n
- Same as captures $3–$(n-1) except that the \b word boundary prevents the script from adding an empty |authorn= when |coauthor= ends with a semicolon.
(\s*\b[\w\s\.,'\[\]-]+?)
- Capture $(n+1)
- Captures the remainder of the citation through the first closing }}. If the last parameter is |ref=harv, the match fails and the replacements do not occur.
(\s*\|[^}]*(?<!\s*\|\s*ref\s*=\s*harv\s*)}})
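The lookbehind in this capture is variable-width, which the .NET regex engine used by AWB accepts but Python's `re` module does not. As a rough equivalent, the sketch below performs the same check as a separate test; `may_rewrite` is a hypothetical helper and the sample citations are made up.

```python
import re

# Matches only when |ref=harv is the last parameter before the closing braces.
ENDS_WITH_REF_HARV = re.compile(r"\|\s*ref\s*=\s*harv\s*\}\}\s*$", re.IGNORECASE)

def may_rewrite(citation):
    """Return False when the citation ends with |ref=harv, mirroring the lookbehind."""
    return ENDS_WITH_REF_HARV.search(citation) is None

print(may_rewrite("{{cite book |last=A |coauthor=B; C |ref=harv}}"))
# As in the script, |ref=harv is only detected when it is the last parameter.
print(may_rewrite("{{cite book |last=A |coauthor=B; C |ref=harv |year=2000}}"))
```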
Script
<?xml version="1.0"?>
<AutoWikiBrowserPreferences xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xml:space="preserve" Version="5.5.2.3">
<Project>wikipedia</Project>
<LanguageCode>en</LanguageCode>
<CustomProject />
<Protocol>http://</Protocol>
<LoginDomain />
<List>
<ListSource>Category:Pages containing cite templates with deprecated parameters</ListSource>
<SelectedProvider>CategoryListProvider</SelectedProvider>
<ArticleList />
</List>
<FindAndReplace>
<Enabled>true</Enabled>
<IgnoreSomeText>false</IgnoreSomeText>
<IgnoreMoreText>false</IgnoreMoreText>
<AppendSummary>false</AppendSummary>
<Replacements>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=\s*([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6|author6=$7|author7=$8|author8=$9|author9=$10|author10=$11$12</Replace>
<Comment>author2; author3; author4; author5; author6; author7; author8; author9;[ &amp;| and] author10</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=\s*([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6|author6=$7|author7=$8|author8=$9|author9=$10$11</Replace>
<Comment>author2; author3; author4; author5; author6; author7; author8;[ &amp;| and] author9</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=\s*([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6|author6=$7|author7=$8|author8=$9$10</Replace>
<Comment>author2; author3; author4; author5; author6; author7;[ &amp;| and] author8</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=\s*([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);\s?([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6|author6=$7|author7=$8$9</Replace>
<Comment>author2; author3; author4; author5; author6;[ &amp;| and] author7</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=\s*([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6|author6=$7$8</Replace>
<Comment>author2; author3; author4; author5;[ &amp;| and] author6</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5|author5=$6$7</Replace>
<Comment>author2 author3 author4[;|&amp;|; and] author5</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=([\w\s\.,'\[\]-]+?);([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4|author4=$5$6</Replace>
<Comment>author2 author3[;|&amp;|; and] author4</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=\s*\w+[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=([\w\s\.,'\[\]-]+?);(?:\s*and\b|\s*&amp;)?(\s*\b[\w\s\.,'\[\]-]+?)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3|author3=$4$5</Replace>
<Comment>author2[;|&amp;|; and] author3</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation[^}]+\s*)[^}]*?)\|\s*coauthors?\s*=\s*(\|[^}]*)</Find>
<Replace>$1$2</Replace>
<Comment>Empty</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
<Replacement>
<Find>({{\s*(?:[Cc]ite (?:(?:AV media(?! notes))|book|conference|encyclopedia|journal|(?:news(?!group))|press release|sign|techreport|thesis|web)|[Cc]itation)[^}]*?)(\|\s*(?:author1?|last1?|first1?)\s*=[^\|}]+?(?:\|\s*authorlink\s*=[^}]*?)?)\|\s*coauthors?\s*=(?:\s*and\b\s*|\s*&amp;\s*)?([\w\.,-]*\s?[\w\.-]+\s[\w\.-]+)(\s*\|[^}]*(?&lt;!\s*\|\s*ref\s*=\s*harv\s*)}})</Find>
<Replace>$1$2|author2=$3$4</Replace>
<Comment>author2</Comment>
<IsRegex>true</IsRegex>
<Enabled>true</Enabled>
<Minor>false</Minor>
<BeforeOrAfter>false</BeforeOrAfter>
<RegularExpressionOptions>IgnoreCase</RegularExpressionOptions>
</Replacement>
</Replacements>
<AdvancedReps />
<SubstTemplates />
<IncludeComments>false</IncludeComments>
<ExpandRecursively>true</ExpandRecursively>
<IgnoreUnformatted>false</IgnoreUnformatted>
</FindAndReplace>
<Editprefs>
<GeneralFixes>false</GeneralFixes>
<Tagger>false</Tagger>
<Unicodify>false</Unicodify>
<Recategorisation>0</Recategorisation>
<NewCategory />
<NewCategory2 />
<ReImage>0</ReImage>
<ImageFind />
<Replace />
<SkipIfNoCatChange>false</SkipIfNoCatChange>
<RemoveSortKey>false</RemoveSortKey>
<SkipIfNoImgChange>false</SkipIfNoImgChange>
<AppendText>false</AppendText>
<AppendTextMetaDataSort>false</AppendTextMetaDataSort>
<Append>true</Append>
<Text />
<Newlines>2</Newlines>
<AutoDelay>5</AutoDelay>
<BotMaxEdits>500</BotMaxEdits>
<SupressTag>true</SupressTag>
<RegexTypoFix>false</RegexTypoFix>
</Editprefs>
<General>
<AutoSaveEdit>
<Enabled>false</Enabled>
<SavePeriod>30</SavePeriod>
<SaveFile />
</AutoSaveEdit>
<SelectedSummary>Fix [[Help:CS1_errors#deprecated_params|CS1 deprecated coauthor parameter errors]]</SelectedSummary>
<Summaries>
<string>clean up</string>
<string>re-categorisation per [[WP:CFD|CFD]]</string>
<string>clean up and re-categorisation per [[WP:CFD|CFD]]</string>
<string>removing category per [[WP:CFD|CFD]]</string>
<string>[[Wikipedia:Template substitution|subst:'ing]]</string>
<string>[[Wikipedia:WikiProject Stub sorting|stub sorting]]</string>
<string>[[WP:AWB/T|Typo fixing]]</string>
<string>bad link repair</string>
<string>Fixing [[Wikipedia:Disambiguation pages with links|links to disambiguation pages]]</string>
<string>Unicodifying</string>
<string>Fix [[Help:CS1_errors#deprecated_params|CS1 deprecated coauthor parameter errors]]</string>
</Summaries>
<PasteMore>
<string />
<string />
<string />
<string />
<string />
<string />
<string />
<string />
<string />
<string />
</PasteMore>
<FindText>\|\s*ref\s*=\s*harv</FindText>
<FindRegex>true</FindRegex>
<FindCaseSensitive>false</FindCaseSensitive>
<WordWrap>true</WordWrap>
<ToolBarEnabled>false</ToolBarEnabled>
<BypassRedirect>true</BypassRedirect>
<AutoSaveSettings>false</AutoSaveSettings>
<noSectionEditSummary>false</noSectionEditSummary>
<restrictDefaultsortAddition>true</restrictDefaultsortAddition>
<restrictOrphanTagging>true</restrictOrphanTagging>
<noMOSComplianceFixes>false</noMOSComplianceFixes>
<syntaxHighlightEditBox>false</syntaxHighlightEditBox>
<highlightAllFind>false</highlightAllFind>
<PreParseMode>false</PreParseMode>
<NoAutoChanges>false</NoAutoChanges>
<OnLoadAction>0</OnLoadAction>
<DiffInBotMode>false</DiffInBotMode>
<Minor>true</Minor>
<AddToWatchlist>2</AddToWatchlist>
<TimerEnabled>false</TimerEnabled>
<SortListAlphabetically>false</SortListAlphabetically>
<AddIgnoredToLog>false</AddIgnoredToLog>
<EditToolbarEnabled>true</EditToolbarEnabled>
<filterNonMainSpace>false</filterNonMainSpace>
<AutoFilterDuplicates>false</AutoFilterDuplicates>
<FocusAtEndOfEditBox>false</FocusAtEndOfEditBox>
<scrollToUnbalancedBrackets>false</scrollToUnbalancedBrackets>
<TextBoxSize>10</TextBoxSize>
<TextBoxFont>Courier New</TextBoxFont>
<LowThreadPriority>false</LowThreadPriority>
<Beep>false</Beep>
<Flash>false</Flash>
<Minimize>false</Minimize>
<LockSummary>false</LockSummary>
<SaveArticleList>true</SaveArticleList>
<SuppressUsingAWB>false</SuppressUsingAWB>
<AddUsingAWBToActionSummaries>false</AddUsingAWBToActionSummaries>
<IgnoreNoBots>false</IgnoreNoBots>
<ClearPageListOnProjectChange>false</ClearPageListOnProjectChange>
<SortInterWikiOrder>true</SortInterWikiOrder>
<ReplaceReferenceTags>true</ReplaceReferenceTags>
<LoggingEnabled>true</LoggingEnabled>
<AlertPreferences />
</General>
<SkipOptions>
<SkipNonexistent>true</SkipNonexistent>
<Skipexistent>false</Skipexistent>
<SkipWhenNoChanges>false</SkipWhenNoChanges>
<SkipSpamFilterBlocked>true</SkipSpamFilterBlocked>
<SkipInuse>true</SkipInuse>
<SkipWhenOnlyWhitespaceChanged>false</SkipWhenOnlyWhitespaceChanged>
<SkipOnlyGeneralFixChanges>true</SkipOnlyGeneralFixChanges>
<SkipOnlyMinorGeneralFixChanges>false</SkipOnlyMinorGeneralFixChanges>
<SkipOnlyCasingChanged>false</SkipOnlyCasingChanged>
<SkipIfRedirect>false</SkipIfRedirect>
<SkipIfNoAlerts>false</SkipIfNoAlerts>
<SkipDoes>false</SkipDoes>
<SkipDoesNot>false</SkipDoesNot>
<SkipDoesText />
<SkipDoesNotText />
<Regex>false</Regex>
<CaseSensitive>false</CaseSensitive>
<AfterProcessing>false</AfterProcessing>
<SkipNoFindAndReplace>true</SkipNoFindAndReplace>
<SkipMinorFindAndReplace>false</SkipMinorFindAndReplace>
<SkipNoRegexTypoFix>false</SkipNoRegexTypoFix>
<SkipNoDisambiguation>false</SkipNoDisambiguation>
<SkipNoLinksOnPage>false</SkipNoLinksOnPage>
<GeneralSkipList />
</SkipOptions>
<Module>
<Enabled>false</Enabled>
<Language>C# 2.0</Language>
<Code> public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
{
Skip = false;
Summary = "test";
ArticleText = "test \r\n\r\n" + ArticleText;
return ArticleText;
}</Code>
</Module>
<ExternalProgram>
<Enabled>false</Enabled>
<Skip>false</Skip>
<Program />
<Parameters />
<PassAsFile>true</PassAsFile>
<OutputFile />
</ExternalProgram>
<Disambiguation>
<Enabled>false</Enabled>
<Link />
<Variants />
<ContextChars>20</ContextChars>
</Disambiguation>
<Special>
<namespaceValues>
<int>0</int>
</namespaceValues>
<remDupes>true</remDupes>
<sortAZ>true</sortAZ>
<filterTitlesThatContain>false</filterTitlesThatContain>
<filterTitlesThatContainText />
<filterTitlesThatDontContain>false</filterTitlesThatDontContain>
<filterTitlesThatDontContainText />
<areRegex>false</areRegex>
<opType>0</opType>
<remove />
</Special>
<Tool>
<ListComparerUseCurrentArticleList>0</ListComparerUseCurrentArticleList>
<ListSplitterUseCurrentArticleList>0</ListSplitterUseCurrentArticleList>
<DatabaseScannerUseCurrentArticleList>0</DatabaseScannerUseCurrentArticleList>
</Tool>
<Plugin />
</AutoWikiBrowserPreferences>