
Wikipedia talk:Date formattings/script/MOSNUM dates

This is an old revision of this page, as edited by Ohconfucius (talk | contribs) at 09:22, 6 June 2012 (Line 1073). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Some questions about changes by Ohconfucius before 12 May

This changeset contains changes to Ohconfucius' test script that I didn't fully understand. Let's review them and then incorporate them into the main script if everything is OK :)

Line 741

- regex(/\[\[(AD|BC|CE|BCE)([\s_]?)(\d{1,4})\]\]/gi, '$3$2$1');
+ regex(/\[\[(AD|BC|CE|BCE)([\s_]?)(\d{1,4})\]\]/gi, '$1$2$3');

Why don't we reorder dates into a consistent format like 123 AD after unlinking? The change removes this feature. 1exec1 (talk) 21:11, 3 June 2012 (UTC)

Agree. Though I think it's worth removing this rule altogether, because it would only match false positives. I've already removed all links that point to the year pages. 1exec1 (talk) 16:06, 4 June 2012 (UTC)

 Removed
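
For illustration, the two replacement strings in this diff differ only in capture-group order. A minimal sketch, assuming plain String.prototype.replace stands in for the script's regex() helper (whose internals aren't shown here) and using an invented sample sentence:

```javascript
// Pattern copied from the diff above; the sample sentence is made up.
const linkedEra = /\[\[(AD|BC|CE|BCE)([\s_]?)(\d{1,4})\]\]/gi;
const eraSample = "The fort was built in [[AD 123]].";

// Old replacement: unlinks and moves the year in front of the era.
const reordered = eraSample.replace(linkedEra, "$3$2$1");

// New replacement: only removes the brackets, keeping the original order.
const unlinkedOnly = eraSample.replace(linkedEra, "$1$2$3");

console.log(reordered);    // The fort was built in 123 AD.
console.log(unlinkedOnly); // The fort was built in AD 123.
```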

Line 762

- ohc_regex(/\[\[@day[\s_](?:of[\s_])?@month\|([^\]]{1,30})\]\]/gi, "$1");
+ ohc_regex(/\[\[@day[\s_](?:st|nd|rd|th|)[\s_](?:of[\s_])?@month\|([^\]]{1,30})\]\]/gi, "$1");

Wouldn't @th? be sufficient instead of (?:st|nd|rd|th|)? 1exec1 (talk) 21:11, 3 June 2012 (UTC)

 Done

Line 762

-    ohc_regex(/\[\[@Day\]\]/gi, "@Day");
+ // ohc_regex(/\[\[@Day\]\]/gi, "@Day");

Is there something bad with unlinking days? 1exec1 (talk) 21:11, 3 June 2012 (UTC)

  • Er, not as such, but I didn't see the utility in unlinking lone 2-digit numbers only in the day range (i.e. 1-31). All numbers up to at least 2020 should be unlinked. I already have the following in the script:

    regex(/\[\[([12]\d{3}|\d{1,3})\]\]/gi, '$1');

    --Ohconfucius ¡digame! 02:47, 4 June 2012 (UTC)

OK, then it's probably worth removing it. 1exec1 (talk) 16:11, 4 June 2012 (UTC)

 Removed
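
To make the overlap concrete, here is a small sketch of what the general number-unlinking rule quoted above already covers. It assumes plain String.prototype.replace in place of the script's regex() helper, and the sample text is invented:

```javascript
// Pattern copied from the comment above: a four-digit number starting with 1 or 2,
// or any number of one to three digits, wrapped in [[ ]].
const linkedNumber = /\[\[([12]\d{3}|\d{1,3})\]\]/gi;
const numberSample = "On [[5]] May [[1999]], in [[987]] and [[2012]], but not [[3000]].";

const numbersUnlinked = numberSample.replace(linkedNumber, "$1");

// Day numbers (1-31) are a subset of \d{1,3}, so they are unlinked as well.
console.log(numbersUnlinked);
// On 5 May 1999, in 987 and 2012, but not [[3000]].
```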

Lines 875 and 881

+  ohc_regex(/(\()@YYYY[-–]@MM[-–]@DD(\))/gi, '$1@Day @Month @YYYY$2');
...
+  ohc_regex(/(\()@YYYY[-–]@MM[-–]@DD(\))/gi, '$1@Month @Day, @YYYY$2');

What's the reason for these additions? Was it that multiple dates within the same citation weren't converted? If so, then this change isn't necessary, as I've added a proper fix for this problem. 1exec1 (talk) 21:11, 3 June 2012 (UTC)

But this regex would convert ISO dates not only in citations, but everywhere. Is that intentional? If not, the regexes within ohc_ISO_to_dmy_in_references() should already handle all ISO dates within references. Are there any cases where they don't work? 1exec1 (talk) 16:23, 4 June 2012 (UTC)
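
A rough sketch of the concern, not the script's actual code: the @YYYY, @MM and @DD placeholders are replaced here by plain digit groups (an assumption about what they expand to), and isoToDmy and MONTHS are hypothetical names. Because the pattern only looks for parentheses, it converts dates in running text as well as inside references:

```javascript
// Hypothetical stand-in for the dmy rule in the diff above.
const MONTHS = ["January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December"];

const isoInParens = /(\()(\d{4})[-–](\d{2})[-–](\d{2})(\))/g;

function isoToDmy(text) {
  return text.replace(isoInParens, (whole, open, year, month, day, close) => {
    const monthName = MONTHS[Number(month) - 1];
    // Only the month is range-checked; an impossible month leaves the text untouched.
    if (!monthName) return whole;
    return `${open}${Number(day)} ${monthName} ${year}${close}`;
  });
}

const isoSample = "The album (2012-06-04) charted. <ref>Smith (2011-11-05)</ref>";
const isoConverted = isoToDmy(isoSample);

console.log(isoConverted);
// The album (4 June 2012) charted. <ref>Smith (5 November 2011)</ref>
```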

Line 898

-    ohc_regex(/\{\{date\|(@yyyy-@zm-@zd)(?:\|dmy|)\}\}/gi, "$1"); //Template:date converts to dmy by default
+ // ohc_regex(/\{\{date\|(@yyyy-@zm-@zd)(?:\|dmy|)\}\}/gi, "$1"); //Template:date converts to dmy by default

What was the problem with the date template? Are there cases where the template doesn't result in dmy yet it is matched by the regex? 1exec1 (talk) 16:37, 4 June 2012 (UTC)

Agreed. 1exec1 (talk) 16:55, 5 June 2012 (UTC)

 Removed

Lines 898 and 909

+ regex(/([^\d\w\/\-%,])@YYYY-@MM-@DD(<\s?\/ref.*?>)/g, '$1@Day @Month @Year$2');
...
+ regex(/([^\d\w\/\-%,])@YYYY-@MM-@DD(<\s?\/ref.*?>)/g, '$1@Month @Day, @Year$2');

(Note to myself): These aren't necessary; it's better to update the regexes in ohc_ISO_to_dmy_in_references(). 1exec1 (talk) 16:37, 4 June 2012 (UTC)

Line 955

-  ohc_regex(/([^\d][^\w\d])@Month((?:\s@day?,?){1,6}),?(\/|\s?(?:[-–]|&ndash;)\s?|(?:[ _]|&nbsp;)(?:and|&|to|or)(?:[ _]|&nbsp;)+?)@Day,?\s(?:of\s)?(\d{3,4}[^\w\d][^\d])/gi, "$1$2$3@Day @LMonth $4");
+  ohc_regex(/([^\d][^\w\d])@Month((?:\s@day?,?){1,6}),?(\/|\s?(?:[-–]|&ndash;)\s?|(?:[ _]|&nbsp;)(?:and|&|to|or)(?:[ _]|&nbsp;)+?)@Day,?\s(?:of\s)?([12]\d{3}[^\w\d][^\d])/gi, "$1$2$3@Day @LMonth $4");

Are there any problems with 3-digit years? Also, it's probably better to use @yyyy instead of [12]\d{3} as we're in ohc_regex anyway. 1exec1 (talk) 16:54, 4 June 2012 (UTC)

 Done
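
As a quick check of what the year sub-pattern change affects, the sketch below compares only the two year fragments from the diff. The ^ and $ anchors and the sample years are additions for the demonstration; the real rule matches the year inside a longer pattern:

```javascript
// Year fragments taken from the old and new versions of the rule.
const oldYearPattern = /^\d{3,4}$/;
const newYearPattern = /^[12]\d{3}$/;

const candidateYears = ["987", "1066", "2012", "3050"];

const acceptedByOld = candidateYears.filter((y) => oldYearPattern.test(y));
const acceptedByNew = candidateYears.filter((y) => newYearPattern.test(y));

console.log(acceptedByOld); // [ '987', '1066', '2012', '3050' ]
console.log(acceptedByNew); // [ '1066', '2012' ]
```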

Lines 955 and 998

- ohc_regex(/([^\d][^\w\d])@Day\s@Month\s@YYYY(?=[^\w\d][^\d]|\b)/gi, "$1@Day @LMonth @Year");
+ ohc_regex(/([^\d][^\w\d])@Day\s@Month\s@YYYY(?=[^\w\d][^\d]|\b)/gi, "$1@DD @LMonth @Year");
- ohc_regex(/([^\d][^\w\d])@Day\s@Month(?=[^\w\d][^\d]|\b)/gi, "$1@Day @LMonth");
+ ohc_regex(/([^\d][^\w\d])@Day\s@Month(?=[^\w\d][^\d]|\b)/gi, "$1@DD @LMonth");
...
- ohc_regex(/([^\d][^\w\d])@Month\s@Day,?\s@YYYY(?=[^\w\d][^\d]|\b)/gi, "$1@LMonth @Day, @YYYY");
+ ohc_regex(/([^\d][^\w\d])@Month\s@Day,?\s@YYYY(?=[^\w\d][^\d]|\b)/gi, "$1@LMonth @SD, @YYYY");
- ohc_regex(/([^\d][^\w\d])@Month\s@Day(?=[^\w\d][^\d]|\b)/gi, "$1@LMonth @Day");
+ ohc_regex(/([^\d][^\w\d])@Month\s@Day(?=[^\w\d][^\d]|\b)/gi, "$1@LMonth @SD");

The first one must be a bug: @DD is the day with a leading zero. As for the second one, we already use mostly @Day for days without leading zeroes, so it makes sense to leave this as is for consistency. 1exec1 (talk) 16:54, 4 June 2012 (UTC)

 Done

Line 1073

- regex(/([\|\{]\s*(?:file|image\d?|image location\d?|img|pic|title|quote|journal|url|work|doi)\s*=)([^\|\}]*)([\|\}])/gi, protect_function);
+ regex(/((?:file|image\d?|image location\d?|img|pic|title|quote|journal|url|work|doi)\s*=)([^\|\}]*)([\|\}])/gi, protect_function);

I think it's worth keeping the check for a preceding { or | symbol, as it somewhat ensures that we're in a template. Were there any cases where this broke anything? 1exec1 (talk) 16:54, 4 June 2012 (UTC)

The current code already matches either a template name or a parameter name (both | and { characters are matched). Thus preserving the check doesn't reduce the usefulness of the regex. 1exec1 (talk) 16:41, 5 June 2012 (UTC)
  • One problem with reinserting the ([\|\{]\s* regex string is that parameters will be evaluated by the regex in relation to adjacent parameters. As a result, the parameter adjacent to a protected one will not be acted upon, viz:

|url=http://pandora.nla.gov.au/pan/23790/20091105-0000/issue1026.pdf|title=The ARIA Report: Week Commencing 26th October 2009|work=Pandora Archive|issue=1026|format=pdf|accessdate=November 21, 2009}}

becomes

|url=⍌17⍍|title=The ARIA Report: Week Commencing 26th October 2009|work=⍌18⍍|issue=1026|format=pdf|accessdate=November 21, 2009

We've already established that lookaheads do not work in this context; the contents of |title=, which ought to be protected, aren't. Do you have any suggestions to fix this? --Ohconfucius ¡digame! 09:19, 6 June 2012 (UTC)
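
The behaviour described above can be reproduced in isolation. This is a minimal sketch under stated assumptions: protectWith is a hypothetical stand-in for the script's protect_function, plain String.prototype.replace is used instead of regex(), and the citation text is shortened with an invented URL. With the leading [\|\{] check, each match consumes the trailing | that the next parameter would need as its own leading delimiter, so the neighbouring parameter is skipped:

```javascript
// Both patterns are copied from the diff for line 1073.
const withLeadingDelimiter =
  /([\|\{]\s*(?:file|image\d?|image location\d?|img|pic|title|quote|journal|url|work|doi)\s*=)([^\|\}]*)([\|\}])/gi;
const withoutLeadingDelimiter =
  /((?:file|image\d?|image location\d?|img|pic|title|quote|journal|url|work|doi)\s*=)([^\|\}]*)([\|\}])/gi;

// Hypothetical protection step: store the value, leave a numbered token behind.
function protectWith(pattern, text) {
  const stored = [];
  const result = text.replace(pattern, (whole, name, value, end) => {
    stored.push(value);
    return `${name}⍌${stored.length - 1}⍍${end}`;
  });
  return { result, stored };
}

const citation =
  "|url=http://example.org/a.pdf|title=Week Commencing 26th October 2009|work=Pandora Archive|issue=1026}}";

const strict = protectWith(withLeadingDelimiter, citation);
const loose = protectWith(withoutLeadingDelimiter, citation);

console.log(strict.result);
// |url=⍌0⍍|title=Week Commencing 26th October 2009|work=⍌1⍍|issue=1026}}
console.log(loose.result);
// |url=⍌0⍍|title=⍌1⍍|work=⍌2⍍|issue=1026}}
```

In the first result the |title= value is left exposed to later date rules, which matches the output quoted above; in the second, every listed parameter is protected, at the cost of no longer checking that the name follows a | or {.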

...

I'll add more later. 1exec1 (talk) 21:11, 3 June 2012 (UTC)

customization points

The customize* functions are there to be overridden by the user in his own .js file. It doesn't make sense to put anything there; we should just make another function. 1exec1 (talk) 16:59, 4 June 2012 (UTC)

  • OK. Didn't realise what it was for. I needed an extra code loop and it was convenient. But these are my own customisations to a degree, and I don't expect anyone else to use them. I forgot to remove the customisations in the production script. These are used by me to 'correct' Reflinks (which inserts a lot of yyyy-mm-dd dates). --Ohconfucius ¡digame! 14:45, 5 June 2012 (UTC)

Testsuite

I think it's worth combining the known tests into a common testsuite so that this knowledge becomes more "public". I propose the following:

Any opinions? 1exec1 (talk) 17:26, 4 June 2012 (UTC)