Module talk:Citation/CS1/Archive 11

This is an old revision of this page, as edited by ClueBot III (talk | contribs) at 00:20, 12 January 2015 (Archiving 1 discussion from Module talk:Citation/CS1. (BOT)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

http/ftp etc

Maybe |url= should be checked to see whether it starts with http/ftp etc.? Because if there is |url=www.example.com or |url=example.com, the link won't be clickable. --Edgars2007 (talk/contribs) 18:01, 25 September 2014 (UTC)

There is. See Help:CS1_errors#Check_.7Curl.3D_scheme. This error message isn't hidden so you should see it. Here is a very simple {{cite web}}:
{{cite web |url=www.example.com}} [www.example.com www.example.com]. {{cite web}}: Check |url= value (help); Missing or empty |title= (help)
Do you not see the error messages? (there are two, one for the malformed url and the other for a missing title)
Trappist the monk (talk) 18:13, 25 September 2014 (UTC)
Oh, sorry. Didn't think to test it :) --Edgars2007 (talk/contribs) 21:08, 25 September 2014 (UTC)
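
The kind of check discussed in this thread can be sketched outside the module. This is a minimal Python illustration, not the module's own Lua code; the function name `has_url_scheme` and the choice to accept protocol-relative URLs (`//example.com`) are assumptions made for the example.

```python
import re

# A scheme name (letter, then letters, digits, "+", "-" or ".") followed by "://".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def has_url_scheme(url: str) -> bool:
    """Return True if url starts with a scheme or is protocol-relative."""
    url = url.strip()
    # Assumption: protocol-relative URLs such as //example.com count as valid.
    return url.startswith("//") or bool(_SCHEME.match(url))


for value in ("http://www.example.com", "//example.com", "www.example.com", "example.com"):
    print(value, has_url_scheme(value))
```

Requiring `://` keeps values such as `example.com:8080` from being mistaken for a scheme, at the cost of rejecting schemes like `mailto:` that have no authority part.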

COinS safe

I updated {{COinS safe}} to add Category:Templates not safe for use in citation templates. Is it worth doing a check for any of these templates? --  Gadget850 talk 20:13, 25 September 2014 (UTC)

I'm not sure I understand what it is that you're asking. By the time the module gets to the content of a citation's various parameters, any templates will have already been expanded. So, all the module sees is all of the template cruft intermingled with the important stuff that CS1 wants.
If you mean should someone run an AWB script or some such that would strip citations of these particular templates, then, yeah, someone should. I have it in mind to do that for {{nihongo}} and {{asiantitle}} after the initial implementation of |script-title= so that editors who work on Asian topics can see the results, and we'll get cleaner COinS.
Trappist the monk (talk) 21:22, 25 September 2014 (UTC)
That is what I meant. --  Gadget850 talk 22:37, 25 September 2014 (UTC)
@Gadget850: Thanks for the info. Some questions/comments:
  1. Should the category documentation state that {{COinS safe}} should be added to the template documentation to add the category?
  2. Should redirects such as {{aut}} and {{sm}} (both redirect to {{smallcaps}}) also be included in the category?
  3. Members of Wikipedia:WikiProject Mesoamerica commonly use {{aut}} in citations (see Wikipedia:WikiProject Mesoamerica/Citations). You might want to talk with them before mass-removing {{aut}}.
  4. Should documentation pages such as Template:Abbr/doc be excluded from the category?
  5. Should sandbox pages such as Template:Abbr/sandbox be excluded from the category?
  6. I just added {{COinS safe|n}} to {{date}} and {{dts}}.
  7. BattyBot already removes {{date}}, {{dts}}, {{nowrap}} and {{start date}} from citation date fields to remove articles from Category:CS1 errors: dates. If there are variations that BattyBot isn't removing or additional templates it could remove, please let me know.
Thanks! GoingBatty (talk) 00:48, 26 September 2014 (UTC)
Re {{aut}}, you can find quite a few of them in the remaining articles in the deprecated parameters category, since Monkbot is not programmed to fix |coauthors= when templates are present in citations, as far as I know.
Re {{date}} and its ilk, the following appear to be redirects to those four templates: {{Css1date}}, {{Cssdate}}, {{Date start}}, {{Datesort}}, {{DATEtoMOS}}, {{FormatDate}}, {{Foundation date}}, {{Initial release}}, {{ISOtodmymdy}}, {{ISOtoMOS}}, {{J}}, {{Launch date}}, {{No break}}, {{No wrap}}, {{Nobr}}, {{Nobreak}}, {{Release date}}, {{Sbd}}, {{Sortdate}}, {{SortDate}}, {{Start Date}}, {{Startdate}}, and {{Starting date}}. – Jonesey95 (talk) 03:49, 26 September 2014 (UTC)
@Jonesey95: Closer inspection of the code showed that BattyBot already removed {{Nobreak}} and {{Startdate}} as well. It now also removes {{Nobr}}, {{No break}}, {{No wrap}}, and {{Start Date}} (although there weren't any CS1 date errors due to these templates). If anyone has examples of the other templates in use in CS1 citations, I'll add them too. Thanks! GoingBatty (talk) 23:54, 26 September 2014 (UTC)
@Jonesey95: {{nbsp}} is another redirect that could be in the category too. BattyBot is now removing it from templates. GoingBatty (talk) 02:53, 27 September 2014 (UTC)
I found an instance of {{nobreak}} causing a citation error in Krkonose / Karkonosze and another in Nazi crimes against the Polish nation. I used catscan to search the date category for all of the above templates, and that's all I found. – Jonesey95 (talk) 04:28, 27 September 2014 (UTC)
@Jonesey95: BattyBot didn't fix those articles because {{nobreak}} was also being used in an earlier parameter, so I fixed them manually. Thanks! GoingBatty (talk) 21:04, 27 September 2014 (UTC)
GoingBatty, {{dts}} was left in Compiz after Battybot visited. I don't see other templates in the citations in question. There is another one in PCSX-Reloaded. – Jonesey95 (talk) 04:20, 27 September 2014 (UTC)
@Jonesey95:  Done - thanks! GoingBatty (talk) 21:04, 27 September 2014 (UTC)
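
The kind of cleanup discussed in this thread can be illustrated with a small Python sketch. It is not BattyBot's actual code; the function name `unwrap_date` is hypothetical, and only the no-wrap family of templates named above is handled, because those simply wrap their first parameter. Case-insensitive matching of the whole template name is a simplification.

```python
import re

# Wrapper templates named in the discussion whose first parameter is plain text.
WRAPPERS = ("nowrap", "no wrap", "nobr", "nobreak", "no break")

_PATTERN = re.compile(
    r"\{\{\s*(?:" + "|".join(re.escape(name) for name in WRAPPERS) + r")\s*\|([^{}|=]*)\}\}",
    re.IGNORECASE,
)


def unwrap_date(value: str) -> str:
    """Replace {{nowrap|text}} and its redirects with the bare text."""
    return _PATTERN.sub(lambda m: m.group(1).strip(), value)


print(unwrap_date("{{nowrap|12 January 2015}}"))
```

Values with nested templates or named parameters are left untouched so that a person can review them, which matches the cases above that had to be fixed manually.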

Where does this break come from?

A few months ago, {{cite doi}} was deprecated. Today, it looks like some edit enforced that deprecation by breaking the connection from {{sfn}} using an individual {cite doi} template. But without article cleanup (by a bot, I expect), {sfn} now produces an error (see helium). I do not know which page was edited to cause this break. Does anyone have an idea? (With that knowledge, I'll ask for a revert on the right talk page.) -DePiep (talk) 09:26, 13 October 2014 (UTC)

Do you mean this?

Markup
{{Cite journal|title = Probing the interior of fullerenes by <sup>3</sup>He NMR spectroscopy of endohedral <sup>3</sup>He@C<sub>60</sub> and <sup>3</sup>He@C<sub>70</sub> |author = Saunders, M. ''et al.''|journal = Nature |volume = 367|issue = 6460|pages = 256–258 |year = 1994 |doi = 10.1038/367256a0|bibcode = 1994Natur.367..256S|first2 = Hugo A.|first3 = R. James|first4 = Stanley|first5 = Darón I.|first6 = Frank A. L. }}
Renders as
Saunders, M.; et al. (1994). "Probing the interior of fullerenes by 3He NMR spectroscopy of endohedral 3He@C60 and 3He@C70". Nature. 367 (6460): 256–258. Bibcode:1994Natur.367..256S. doi:10.1038/367256a0. {{cite journal}}: |first2= missing |last2= (help); |first3= missing |last3= (help); |first4= missing |last4= (help); |first5= missing |last5= (help); |first6= missing |last6= (help); Explicit use of et al. in: |author= (help)

If so, the error is because the last name fields are missing. --  Gadget850 talk 10:31, 13 October 2014 (UTC)
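
The rule behind this error can be sketched in a few lines of Python. This is an illustration only, not the module's Lua implementation; the function name `firsts_without_lasts` and the treatment of |authorN= as an alias of |lastN= are assumptions made for the example.

```python
import re


def firsts_without_lasts(params: dict[str, str]) -> list[int]:
    """Return the author positions that have |firstN= but no |lastN=."""
    missing = []
    for name, value in params.items():
        match = re.fullmatch(r"first(\d*)", name)
        if not match or not value.strip():
            continue
        n = match.group(1)  # "" for |first=, "2" for |first2=
        # Assumption: |authorN= is accepted as an alias of |lastN=.
        aliases = [f"last{n}", f"author{n}"]
        if n in ("", "1"):
            aliases += ["last", "last1", "author", "author1"]
        if not any(params.get(alias, "").strip() for alias in aliases):
            missing.append(int(n or 1))
    return sorted(missing)


citation = {
    "author": "Saunders, M. ''et al.''",
    "first2": "Hugo A.", "first3": "R. James", "first4": "Stanley",
    "first5": "Darón I.", "first6": "Frank A. L.",
}
print(firsts_without_lasts(citation))  # [2, 3, 4, 5, 6]
```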

Yes, these names are missing, producing error texts (helium now has four such references). The point is, a week ago they were not. The names were not deleted, but somehow a connection was removed (Mirokado noted this). I don't know if the breaking edit was done in citation/CS1 or elsewhere. If a "bug removal" in /CS1 is the cause, I'd say that removal failed and should be revisited. But it could also be caused by another edit, someone enforcing the {{cite doi}} deprecation. -DePiep (talk) 12:02, 13 October 2014 (UTC)
Helium doesn't use {{sfn}} --Redrose64 (talk) 12:26, 13 October 2014 (UTC)
So we can rule out {{sfn}} then. See below for the gadget850 explanation. -DePiep (talk) 13:20, 13 October 2014 (UTC)
I was trying to figure that out as well.
Author detection was fixed in the last update. The citation listed above has |first= without |last= which is certainly not proper. As I see it, this error is proper.
The breaking of the "citation trick" has been discussed elsewhere. See Wikipedia talk:WikiProject Elements#The citation trick has stopped working. --  Gadget850 talk 12:44, 13 October 2014 (UTC)
The template is correct to flag these errors in Helium. I've just fixed a couple. Aa77zz (talk) 12:49, 13 October 2014 (UTC)
re Aa77zz: nice, but they were mass-created, and I don't want to be forced to manually edit this way. Problem is so far nobody knows where the bad edit was made. -DePiep (talk) 13:05, 13 October 2014 (UTC)
re Gadget850 "... the error is proper": Could be, but I'd like to know where the mass-creating stems from. -DePiep (talk) 13:08, 13 October 2014 (UTC)
If, by "mass-creating", you mean that many citations have lists of first names with no corresponding last names, Citation Bot created many citations like that; it was a bug that was eventually fixed. Running Citation Bot on the page again may add the last names. You will want to inspect the resulting edit, however; Citation Bot is a useful tool, but it is never free of bugs. – Jonesey95 (talk) 05:41, 14 October 2014 (UTC)
The bot action you propose still involves manual editing. I think all 17000 {cite doi} transclusions are suspected, and that would be too much for manual check.
I meant to say it was "mass-created" by one single recent edit (exposing old edits at once as being 'wrong'; likely an edit in /CS1 as we know by now). When I asked for a bot cleanup, I was thinking of a new bot task, running once, that cleans this up. However, I don't know whether this is feasible; I have lost the topic after the code discussion below. Also, the code discussion below could produce another solution, making a bot action unneeded. So bots can wait till that option is fleshed out. -DePiep (talk) 07:24, 14 October 2014 (UTC)

(copied useful Gadget850 post from WT:ELEMENTS):

  • The citation hack consists of stuffing {{cite doi}} into the |title= parameter of {{citation}} and adding markup to undo the italic markup. Currently the apostrophes are added to the title in the COinS metadata.
  • Module:Citation/CS1 was updated to prevent apostrophe markup from being passed into the COinS metadata. A bug was found and this update was reverted.
  • And the change to Fluorine is a real hack. Let's come up with a better solution before proliferating this. --  Gadget850 talk 12:36, 13 October 2014 (UTC) (copy/pasted here -DePiep (talk) 13:15, 13 October 2014 (UTC))
If this explains it, I think we better ask a bot to get author names from {cite doi/xxx} into the straight {cite} parameters (in article page that is). {{cite doi}} has 17000+ transclusions. -DePiep (talk) 13:24, 13 October 2014 (UTC)
Those first names without last names were added by our old friend CitationBot.[1] --  Gadget850 talk 13:29, 13 October 2014 (UTC)
The "real hack" used by Mirokado in this edit was the right idea but the wrong method. The old code was e.g.
{{citation|title=''{{cite doi|10.1039/c0em00373e|noedit}}''|ref={{harvid|Ahrens|2011}}}}
and this was altered to
<span id="{{harvid|Ahrens|2011}}" class="citation">{{cite doi|10.1039/c0em00373e|noedit}}</span>
but by using {{wikicite}} it could have been simpler:
{{wikicite|reference={{cite doi|10.1039/c0em00373e|noedit}} |ref={{harvid|Ahrens|2011}}}}
It is never a good idea to misuse templates, but this is the primary purpose of {{wikicite}}. --Redrose64 (talk) 14:26, 13 October 2014 (UTC)

It is wrongheaded to think that the citation hack ever worked as this citation from Fluorine shows:

  • {{citation/new|title=''{{cite doi|10.1039/c0em00373e|noedit}}''|ref={{harvid|Ahrens|2011}}}}
    • Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1039/c0em00373e instead.
      • '"`UNIQ--templatestyles-00000014-QINU`"'<cite id="CITEREFAhrens2011" class="citation cs2">''<span></span>''<span class="error">Attention: This template (<span class="nowrap">&#123;&#123;</span>[[Template:cite doi|cite doi]]<span class="nowrap">&#125;&#125;</span>) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use <span class="nowrap">&#123;&#123;</span>[[Template:cite journal|cite journal]]<span class="nowrap">&#125;&#125;</span> (if it was published in a bona fide academic journal, otherwise <span class="nowrap">&#123;&#123;</span>[[Template:cite report|cite report]]<span class="nowrap">&#125;&#125;</span> with <code class="tpl-para" style="word-break:break-word; ">&#124;doi&#61;10.1039/c0em00373e</code> instead.</span>''<span></span>''</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=%3Cspan+class%3D%22error%22%3EAttention%3A+This+template+%28%3Cspan+class%3D%22nowrap%22%3E%26%23123%3B%26%23123%3B%3C%2Fspan%3Ecite+doi%3Cspan+class%3D%22nowrap%22%3E%26%23125%3B%26%23125%3B%3C%2Fspan%3E%29+is+deprecated.+To+cite+the+publication+identified+by+doi%3A10.1039%2Fc0em00373e%2C+please+use+%3Cspan+class%3D%22nowrap%22%3E%26%23123%3B%26%23123%3B%3C%2Fspan%3Ecite+journal%3Cspan+class%3D%22nowrap%22%3E%26%23125%3B%26%23125%3B%3C%2Fspan%3E+%28if+it+was+published+in+a+bona+fide+academic+journal%2C+otherwise+%3Cspan+class%3D%22nowrap%22%3E%26%23123%3B%26%23123%3B%3C%2Fspan%3Ecite+report%3Cspan+class%3D%22nowrap%22%3E%26%23125%3B%26%23125%3B%3C%2Fspan%3E+with+%3Ccode+class%3D%22tpl-para%22+style%3D%22word-break%3Abreak-word%3B+%22%3E%26%23124%3Bdoi%26%2361%3B10.1039%2Fc0em00373e%3C%2Fcode%3E+instead.%3C%2Fspan%3E&rfr_id=info%3Asid%2Fen.wikipedia.org%3AModule+talk%3ACitation%2FCS1%2FArchive+11" class="Z3988"></span>

There are two copies of the citation's metadata in that mass of stuff.

This form from Editor Mirokado is better because it produces only one copy of the metadata, but is more difficult to type:

  • <span id="{{harvid|Ahrens|2011}}" class="citation">{{cite doi|10.1039/c0em00373e|noedit}}</span>
    • Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1039/c0em00373e instead.
      • <span id="CITEREFAhrens2011" class="citation"><span class="error">Attention: This template (<span class="nowrap">&#123;&#123;</span>[[Template:cite doi|cite doi]]<span class="nowrap">&#125;&#125;</span>) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use <span class="nowrap">&#123;&#123;</span>[[Template:cite journal|cite journal]]<span class="nowrap">&#125;&#125;</span> (if it was published in a bona fide academic journal, otherwise <span class="nowrap">&#123;&#123;</span>[[Template:cite report|cite report]]<span class="nowrap">&#125;&#125;</span> with <code class="tpl-para" style="word-break:break-word; ">&#124;doi&#61;10.1039/c0em00373e</code> instead.</span></span>

This form from Editor Redrose64 is similar but adds an extra <span>...</span>:

  • {{wikicite|reference={{cite doi|10.1039/c0em00373e|noedit}} |ref={{harvid|Ahrens|2011}}}}
    • Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1039/c0em00373e instead.
      • <cite class="citation wikicite" id=CITEREFAhrens2011><span class="error">Attention: This template (<span class="nowrap">&#123;&#123;</span>[[Template:cite doi|cite doi]]<span class="nowrap">&#125;&#125;</span>) is deprecated. To cite the publication identified by doi:10.1039/c0em00373e, please use <span class="nowrap">&#123;&#123;</span>[[Template:cite journal|cite journal]]<span class="nowrap">&#125;&#125;</span> (if it was published in a bona fide academic journal, otherwise <span class="nowrap">&#123;&#123;</span>[[Template:cite report|cite report]]<span class="nowrap">&#125;&#125;</span> with <code class="tpl-para" style="word-break:break-word; ">&#124;doi&#61;10.1039/c0em00373e</code> instead.</span></cite>

If this sort of hacking is necessary, choose either of the latter two. Don't use the first.

Trappist the monk (talk) 14:39, 13 October 2014 (UTC)

My suggestion adds an extra <span>...</span>, yes; but only compared to the original. When compared to Mirokado's technique, it's the same number of spans. The only technical difference is that one has the wikicite class, the other doesn't. --Redrose64 (talk) 15:20, 13 October 2014 (UTC)

Titles in COinS data for books

For books, the template places the book title in rft.atitle and the chapter title in rft.btitle, which seems the wrong way round.[2] Kanguole 14:20, 5 November 2014 (UTC)

I think you're right, it has been wrong forever. Thanks for that. Fixed in the sandbox.
<cite class="citation book cs1">"Chapter". ''Title''.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=Chapter&rft.btitle=Title&rfr_id=info%3Asid%2Fen.wikipedia.org%3AModule+talk%3ACitation%2FCS1%2FArchive+11" class="Z3988"></span>
Trappist the monk (talk) 14:32, 5 November 2014 (UTC)
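
The corrected mapping shown in the sandbox output above (chapter in rft.atitle, book title in rft.btitle) can be sketched as follows. This is a simplified Python illustration rather than the module's code; the function name `coins_for_book` is hypothetical and the rfr_id field is left out for brevity.

```python
from urllib.parse import urlencode


def coins_for_book(title: str, chapter: str = "") -> str:
    """Build a COinS key/value string for a book or a chapter within a book."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.genre", "bookitem" if chapter else "book"),
    ]
    if chapter:
        fields.append(("rft.atitle", chapter))  # title of the part (chapter)
    fields.append(("rft.btitle", title))        # title of the whole book
    return urlencode(fields)


print(coins_for_book("Title", chapter="Chapter"))
```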

non-italic titles

Books published in Chinese have titles in Chinese characters, romanized titles in pinyin, and translated titles in English, e.g.

  • Wang, Li (1985). Hànyǔ Yǔyīn Shǐ 汉语语音史 (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

While the romanized title should be italicized, it is recommended at WP:MOS-ZH that characters not be italicized. However {{noitalic}} is not allowed within citation templates. A common expedient is an extra set of italic quotes, e.g. Hànyǔ Yǔyīn Shǐ ''汉语语音史'' in the above, but this seems brittle. So could these templates have an additional parameter, say |noitalic_title= or something, for a title in a non-roman script that should not be italicized? This would be in addition to |title=, which could be used for the romanized form (which would be italicized), and |trans_title=, for the English translation. Kanguole 10:57, 10 July 2014 (UTC)

But the '''' workaround you suggested seems to work. For instance, it doesn't break things like |url=:
So why not just stick to using that? It Is Me Here t / c 10:33, 10 September 2014 (UTC)

Since the title is italicized, the nested italics are affecting the rendered markup. The span is not properly closed:

  • Without nested italics: <span class="citation book">''Hànyǔ Yǔyīn Shǐ 汉语语音史''.</span>
  • With nested italics: <span class="citation book">''Hànyǔ Yǔyīn Shǐ ''汉语语音史''<span />''

This is probably HTML Tidy at work. This can cause issues if the span connects to another.

And the COinS metadata now includes the encoded apostrophes (%27):

  • H%C3%A0ny%C7%94+Y%C7%94y%C4%ABn+Sh%C7%90+%27%27%E6%B1%89%E8%AF%AD%E8%AF%AD%E9%9F%B3%E5%8F%B2%27%27

We should resolve this in the module. Thoughts? --  Gadget850 talk 11:15, 10 September 2014 (UTC)
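
One way to keep the apostrophes out of the metadata is to strip wiki italic and bold markup before encoding. The Python sketch below is only an illustration of that idea, not what the module does; the function name `coins_title` is hypothetical, and it assumes that any run of two or more apostrophes is markup.

```python
import re
from urllib.parse import quote_plus


def coins_title(title: str) -> str:
    """Strip wiki italic/bold apostrophes from title, then percent-encode it."""
    plain = re.sub(r"'{2,}", "", title)  # runs of 2+ apostrophes are wiki markup
    return quote_plus(plain)


marked_up = "Hànyǔ Yǔyīn Shǐ ''汉语语音史''"
print(quote_plus(marked_up))   # encoded as-is, apostrophes become %27
print(coins_title(marked_up))  # markup removed before encoding
```

A single apostrophe, as in "Don't", is left alone because it is part of the title text.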

This appears to be related to Help talk:Citation Style 1/Archive 6#Untitled work which also seeks a mechanism to disable italics. It would be best if we could have one conversation in one place?
Trappist the monk (talk) 11:45, 10 September 2014 (UTC)
The two are semantically different: that discussion is about works with no title but a description, while this one is about works with two titles, one of which is in a non-roman script. Kanguole 11:59, 10 September 2014 (UTC)
They are related because they both impact how Module:Citation/CS1 renders the value in |title=; clearly this is an under-the-hood relationship, but for those of us who might attempt a workable solution, keeping the two conversations separate is no benefit.
Trappist the monk (talk) 12:14, 10 September 2014 (UTC)
We already have a way to remove italics from titles of articles using {{Italic title}}, so it would be best to stay with a similar parameter name for consistency. How hard would it be to implement an optional |italic-title= (using the dash, as recently specified for multi-word parameters) with a default value of "yes" and an optional value of "no"? – Jonesey95 (talk) 14:47, 10 September 2014 (UTC)
But in Editor Kanguole's example, half of the title is italicized while the other half is not. Therein lies the problem. Which leads me to the question, why are both Chinese logograms and pinyin used in |title=? Are both required?
Trappist the monk (talk) 15:05, 10 September 2014 (UTC)
Some library catalogues will use the original title (in Chinese characters), while others will use the pinyin rendering of the title. The latter is more convenient for some purposes, but loses information. Kanguole 15:35, 10 September 2014 (UTC)
I think in that case, I would put the "lossy" rendering(s), in this case both Pinyin and English, in |trans-title=, something like this:
|italic-title=no could be used to straighten the original Chinese title. I think that is clear enough to show a reader what's going on. – Jonesey95 (talk) 19:05, 10 September 2014 (UTC)
It would be pretty weird to put the Chinese pinyin in the English translation. Publications I've seen put it before the characters, formatted approximately as above. Kanguole 16:41, 11 September 2014 (UTC)
The template {{asiantitle}} also renders the logogram title first followed by a transliteration in parentheses. {{Asiantitle}} should not be used in CS1 citation because of the COinS metadata problem that Editor Gadget850 describes above and addresses at Template talk:Asiantitle#CS1 templates.
Trappist the monk (talk) 13:18, 12 September 2014 (UTC)

At the moment, I'm not sure how to solve this problem. Likely it will involve some sort of new parameter whose value would be concatenated to the value in |title= after italics have been applied. So the process for a title with both pinyin and logograms would be:

  1. COinS title = <logogram title> <pinyin title> -- create COinS metadata before formatting
  2. Title = <pinyin title>
  3. Title = ''<pinyin title>'' -- italics applied: pinyin title
  4. Title = <logogram title> ''<pinyin title>'' -- concatenate logogram title: logogram title pinyin title
  5. Title = [<url> <logogram title> ''<pinyin title>''] -- add external link (wikilink is similar)
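
The five steps above can be sketched in Python. This is a rough illustration of the outlined process only; the function name `build_title` and its parameter names are hypothetical, and the output is wikitext rather than final HTML.

```python
def build_title(pinyin: str, logogram: str, url: str = "") -> tuple[str, str]:
    """Return (COinS title, rendered wikitext title) following the five steps."""
    # Step 1: create the metadata title before any formatting is applied.
    coins = f"{logogram} {pinyin}".strip()
    # Steps 2-3: italicize only the romanized title.
    title = f"''{pinyin}''" if pinyin else ""
    # Step 4: concatenate the upright logogram title.
    title = f"{logogram} {title}".strip()
    # Step 5: wrap in an external link when a URL is given.
    if url:
        title = f"[{url} {title}]"
    return coins, title


print(build_title("Hànyǔ Yǔyīn Shǐ", "汉语语音史", "//example.com"))
```

Because the metadata string is built first, the italic markup added in step 3 never reaches COinS.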

Do we have a list of all languages that must not be italicized? Presumably Japanese and Korean are on that list. What others? Do these languages have transcribed equivalents as pinyin is to Chinese?

If we invent a new parameter, what do we call it? What restrictions apply to its use?

Trappist the monk (talk) 17:56, 11 September 2014 (UTC)

I think, that to be safe, it's better to specify the languages where italicisation is sensible, rather than those where it isn't. Any language which uses a Latin, Greek or Cyrillic script should be OK, anything else should be treated with caution. --Redrose64 (talk) 19:13, 11 September 2014 (UTC)
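
The allow-list approach suggested here could look something like the Python sketch below. It is a heuristic illustration, not module code: the function name `italics_are_safe` is hypothetical, and it relies on the assumption that a character's Unicode name starts with its script name.

```python
import unicodedata

# Scripts named in the comment above as generally safe to italicize.
ITALIC_SAFE_SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def italics_are_safe(text: str) -> bool:
    """Return True if every letter in text belongs to an italic-safe script."""
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        # Unicode names start with the script, e.g. "LATIN SMALL LETTER A".
        name = unicodedata.name(ch, "")
        if not name.startswith(ITALIC_SAFE_SCRIPTS):
            return False
    return True


print(italics_are_safe("Hànyǔ Yǔyīn Shǐ"))
print(italics_are_safe("汉语语音史"))
```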
Point. I've tweaked my pseudo-process outline above to make more sense – it was actually completely wrong in its original state.
Trappist the monk (talk) 19:59, 11 September 2014 (UTC)
Perhaps for a parameter name we could use |translit-title= or |xlit-title=. I think I prefer the latter because |translit-title= is a bit close to |trans-title=.
Trappist the monk (talk) 13:18, 12 September 2014 (UTC)
I was thinking of putting the romanized title (to be italicized, e.g. pinyin) in |title= and the original non-roman, not-to-be italicized title in a new field. Are you proposing to do it the other way round? I guess that would follow {{asiantitle}}, though that doesn't seem to be widely used. I also think it's a bit more complicated: it would have the font of |title= varying depending on whether this other field was given a value. Kanguole 14:01, 12 September 2014 (UTC)
Ah, you're right. And now I wonder if this ought not also apply to Hebrew and Arabic scripts as well. With them we also have the right-to-left issue. Argh! But let's set those aside and just think about the Asian titles. Clearly we could create a parameter |asian-title= but then when we do get round to Hebrew or Arabic |asian-title= is not quite right. So something more generic: |logogram-title=, |logo-title=, |lg-title= or some such? |title= would be assigned the romanized title.
Trappist the monk (talk) 14:49, 12 September 2014 (UTC)
Logogram is probably the simplest. Do we need this for chapter and author? As to directionality, we should be able to wrap it in <bdi>...</bdi> to isolate it from the title text-direction settings; by default <bdi> sets dir=auto. --  Gadget850 talk 15:37, 12 September 2014 (UTC)
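
The <bdi> wrapping suggested here is simple to sketch. The Python below is only an illustration; the function name `isolate_script_title` is hypothetical, and escaping the text first is an added precaution rather than something stated in the discussion.

```python
from html import escape


def isolate_script_title(text: str) -> str:
    """Wrap text in <bdi> so its direction does not affect the surrounding citation."""
    return f"<bdi>{escape(text)}</bdi>"


# "\u05e9\u05dc\u05d5\u05dd" is a short Hebrew word, written with escapes to keep the source left-to-right.
print(isolate_script_title("\u05e9\u05dc\u05d5\u05dd"))
```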
If, according to MOS, chapter titles are supposed to be rendered in quotes, not italics, then no, we don't need to do this for |chapter=. But, there is a peculiarity in the module such that when all three of |work=, |title=, and |chapter= are used, the formatting for |chapter= and |title= swap. This to me seems wrong and is properly the topic of another conversation. I see no need to add this functionality to |author= because that is not rendered in italics, right?
Trappist the monk (talk) 16:15, 12 September 2014 (UTC)
The common practice with author names is |last=Wang|first=Li 王力, which seems adequate. "logogram" is a bit narrow: hangul and kana don't fit, and nor do the non-Latin alphabets of Southeast and South Asia, which presumably also shouldn't be italicized. Kanguole 16:49, 12 September 2014 (UTC)
Have you got a better name for the parameter?
Trappist the monk (talk) 17:11, 12 September 2014 (UTC)

So the initial hack was easier than I thought it would be:

{{cite book/new | title = Hànyǔ Yǔyīn Shǐ |script-title=汉语语音史 | trans_title = History of Chinese Phonetics | last = Wang | first = Li | author-link = Wang Li (linguist) | location = Beijing | publisher = China Social Sciences Press | year = 1985 | isbn = 978-7-100-05390-7 | language = Chinese |url=//example.com}}

produces:

Wang, Li (1985). Hànyǔ Yǔyīn Shǐ 汉语语音史 (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Invalid |script-title=: missing prefix (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

without |title=:

Wang, Li (1985). 汉语语音史 (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Invalid |script-title=: missing prefix (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

without |title= and without |trans-title=:

Wang, Li (1985). 汉语语音史 (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Invalid |script-title=: missing prefix (help)

I wonder if we should follow the example set by {{asiantitle}} and wrap the transliterated title in parentheses?

Trappist the monk (talk) 17:09, 12 September 2014 (UTC)

Couple more for completeness: without |trans-title=:

Wang, Li (1985). Hànyǔ Yǔyīn Shǐ 汉语语音史 (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Invalid |script-title=: missing prefix (help)

without |script-title=:

Wang, Li (1985). Hànyǔ Yǔyīn Shǐ (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7. {{cite book}}: Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

without |script-title= and without |trans-title=:

Wang, Li (1985). Hànyǔ Yǔyīn Shǐ (in Chinese). Beijing: China Social Sciences Press. ISBN 978-7-100-05390-7.

Trappist the monk (talk) 17:17, 12 September 2014 (UTC)

The formatting of {{asiantitle}} is a bit idiosyncratic. I just checked four English-language books I have to hand that include both titles in their bibliographies, and they all put the pinyin title before the character title. As for the name, |nonlatin-title= or |nonitalic-title=? Kanguole 17:43, 12 September 2014 (UTC)
|logogram= looks reasonable for some Asian languages. I am not a linguist, so all I know about logograms, graphemes, and phonograms comes from the WP articles about them. If there is demand after implementation, it should be straightforward to create aliases to |logogram= that respect and align with the linguistic and cultural differences between Chinese, Japanese, Arabic, Hebrew, Korean, and other languages that do not use variations on Roman alphabet letters for their written language. – Jonesey95 (talk) 17:39, 12 September 2014 (UTC)
'nonlatin' doesn't work either, since Greek, Cyrillic and some others have italic variants. 'nonitalic' would be abused by some who want some oddball formatting or just misunderstand the parameter because they don't read the documentation. --  Gadget850 talk 17:55, 12 September 2014 (UTC)
The name of the parameter should have "title" in it, since this is only for titles. As for the rest, "nonitalic" is at least fairly straightforward about what is meant, as "scripts-that-lack-italics" is a bit long. Kanguole 10:35, 13 September 2014 (UTC)
|non-italic-script-title= with an alias |nis-title=?
Should this functionality apply to the periodical parameters |journal=, |encyclopedia=, |work=, etc?
Trappist the monk (talk) 11:28, 13 September 2014 (UTC)
I expect so – I've seen it with journal titles. Kanguole 15:18, 13 September 2014 (UTC)

According to the logic and context of citations which include Mandarin, Japanese, Korean titles:

  • |title=
  • |not_italic_title= (alias: |no_i_title=) —yes/no flag value only
  • |En_cover_title= —optional extra but also better combined into a general |other_title= with free formatting, in other words non–essential and of much less priority for coding, because these English–'spin' type subtitles most commonly get published only on the covers for marketing purposes, not in the colophon (page) nor the official citation such as with the ISBN registering authority and often do not accurately translate fully and properly the actual primary title that is published in Mandarin, Japanese, Korean, etc..
  • |title_transliteration= (alias: |title_xlit=)
  • |title_trans= (title translation) —currently exists as the |trans_title= parameter.

—and the equivalents for titles of journals, encyclopaedias, works, series, conferences, newspapers, etc..

The key point to remember please: the proper title of the document, eg. a book, has been set by the author and publisher and authoritatively published in the book’s colophon (page) and authoritatively registered with the ISBN granting authority. In proper terms, in reality, for books, we WP editors do not have to decide which is the title, the choice was already made in the event of publishing; we have to read the colophon (page) of the book or document, and cross-check with the front cover and the ISBN granting library or authority. Hacking the title (mostly only us geeks doing so) for superficial purposes of rendering appearance is an entirely separate topic and direction to go in.

For example, most books published in Mandarin, Japanese or Korean have their actual proper titles in Hanzi, Kanji–Hiragana–Katakana and Hangul respectively. These books may or may not have been published with secondary English translation subtitles on the cover, usually only for marketing purposes, such as appeal to some Chinese, Japanese and Korean particular people who’ve adopted westernisation sensibilities. Etcetera. For more information of a professional standard refer for example to Extended Citation Style Language and its implementation in Multilingual Zotero, etc. --Macropneuma 04:42, 14 September 2014 (UTC) —correct parameter names, small revision edit—--Macropneuma 02:23, 15 September 2014 (UTC)

I did not find any Wikipedia style rules or advice regarding the use of italics for titles in any non-Latin alphabets and script other than Chinese. For example, I didn't find any rules or advice on how to render a book title in Greek or Cyrillic letters. The Chicago Manual of Style 14th Edition has little to say on this either. About all the advice found there is for Russian, where it says (9.113, p. 347): "In the Cyrillic originals of these citations the author's name and the title are both set in ordinary type (called in Russian pryamoy, "upright"); the author's name, however, is letterspaced. The Cyrillic kursiv is used more sparingly than our italic—never for book titles." This leads me to two ideas:

  1. Somewhere under WP:MOS there should be advice on the use of italics with non-Latin alphabets and scripts, in titles and perhaps elsewhere.
  2. I think the policy should probably be to avoid italics for titles in any non-Latin alphabet and script; however, both the translations and transliterations of titles should be italicized whenever an English title would be italicized, e.g. names of books, magazines, films, long poems, etc.

In any event, I encourage this discussion to consider non-Latin-alphabet languages in general. —Anomalocaris (talk) 06:05, 14 September 2014 (UTC)

With your evidence check based reasoning and point of considering all "non–Latin–alphabet languages in general" i agree. My previous message was coming from my experience and long consideration with those east Asian languages, not at all to exclude the more general point you put forward. --Macropneuma 07:07, 14 September 2014 (UTC)
Wikipedia talk:Citing sources#Italics and non-Latin languages in titles
Trappist the monk (talk) 13:02, 14 September 2014 (UTC)
I agree that Japanese characters (kanji and kana) should not be italicized. But if they aren't, don't we need another way to indicate that it's a title? {{Asiantitle}} has parameters for this. You can specify "j" for double corner brackets around a book title, or "jsgl" for single corner brackets around a chapter title. That is the normal way this is done in Japan and agrees with the MOS at Japanese Wikipedia (著作物名). --Margin1522 (talk) 19:13, 14 September 2014 (UTC)
Yes, i agree, those are the proper ways in those languages, thus adding another level of consideration here. Once decided upon coding implementation of each of these language specific types of brackets per se is simpler than the more complex interactions of italics and non–italics with the titles, title_translations, title_transliterations and so on. Thanks Margin1522 for reminding us of these language specific types of brackets, i was on one step at a time here with my previous talk post. Instead of |not_italic_title= (—yes/no flag value only), we can use for example |non_italic_title= (alias: |non_i_title=) with the code flags for each different language or yes for simple non italics or no for explicitly specifying plain italics (if necessary for some reason, whatever). That’s a start for this next step here. No worries, many ways we can do this. --Macropneuma 22:53, 14 September 2014 (UTC)
I just checked the way the Japanese cite templates handle this. They have a |和書 (washo, meaning "Japanese book") parameter that takes no value. For example {{cite journal|和書|... will put the single brackets around the article title and the double brackets around the journal name. If the parameter is absent, then it is handled just like the English template. --Margin1522 (talk) 23:08, 14 September 2014 (UTC)
Are you-all not describing what amounts to a style parameter? If we have a parameter (|logogram= but perhaps |script-title= is better because that can encompass a variety of non-Latin languages) we might have another parameter |script-title-style= that takes an ISO639-1 language code as its value. The language codes are unique and already defined so for each script we can have a 'style' that the module applies for the citation's title.
|title= – Usually Latin font title or a transcription
|script-title= – non-Latin title could be Hebrew, Chinese, Cyrillic, Greek, Malay, whatever
|script-title-style= – ISO639-1 language code that selects a predefined style for |script-title=; if empty or omitted no special style applied; if invalid value throws an error
Editor Margin1522 asked: don't we need another way to indicate that it's a title? Perhaps an initial way to do that might be to underscore |script-title= – you know, the way we did titles with typewriters before word processors: |script-title=汉语语音史 → 汉语语音史 (underlined)
Trappist the monk (talk) 00:38, 15 September 2014 (UTC)
We shouldn't base our styling on the conventions of Japanese-language (or Chinese-language) publications. The convention in the four English-language books I checked was to write titles of Chinese books and journals in the form Hànyǔ yǔyīn shǐ 汉语语音史 [History of Chinese phonetics]. Indeed the style sheet in Chinese History: A New Manual specifies this form. The italicized pinyin serves to mark the following characters as the title. Also, underlining can make the characters harder to read. Kanguole 01:42, 15 September 2014 (UTC)
That's a good point. We should check the Monumenta Nipponica style guide to see what they recommend for Japanese titles in English publications.
Initially the idea of an ISO language code was the first thing that occurred to me. From a quick check at the Chinese Wikipedia, it seems (maybe) that they just write the title straight up, no text decoration and no extra characters. And of course no pinyin. The template has no field for that (or for a translation), and in footnotes we're supposed to be able to simply quote the original language. Under current policy anyway. But about ISO, other scripts might have preferred styles. I have no idea what titles look like in Hebrew or Thai. The Japanese tend to avoid all kinds of text decoration for any purpose and use extra characters instead. The one exception is underlines. But I can't recall seeing underlines to indicate titles, so I'm not sure if readers would understand it. --Margin1522 (talk) 02:04, 15 September 2014 (UTC)
Sorry, I take that back about the translation field. We do have the trans_title= field. --Margin1522 (talk) 02:15, 15 September 2014 (UTC)

I checked the Monumenta Nipponica style guide, and they recommend the following, which is similar to the Chinese style sheet quoted above.

  • Murasaki shikibu nikki. 紫式部日記. In NKBT 19.

They give the romanized title, italicized, and add no-italic kanji after it. If we did this for titles, it would essentially be Kanguole's suggestion for an "asiantitle" field. Or whatever the field is named. One question: would it be possible to write a function to detect whether the title string contains any Asian characters? If so, then we could just not italicize that title and the immediate problem (italicized kanji) is solved without adding an extra parameter. Too much processing just for Asian characters? --Margin1522 (talk) 04:09, 15 September 2014 (UTC)

Italicized transliteration may identify the following characters as the title, but we may not always have a transliteration to use as a delimiter. To those of us who do not read Arabic or Thai or whatever, an author name, which in CS1 precedes the title, is essentially indistinguishable from the title when they are in the same script. I think that it is fairly common for English-only readers to understand that an underscore indicates title. That is somewhat reinforced here at enwiki by the way wikilinks work – hover the mouse pointer over a wikilink and you get underscored text, often the title of an article. True, non-English languages may eschew text decoration, but our purpose here is to serve the readers of enwiki.

I thought about auto-detecting characters and then acting accordingly. The immediate problem that my thought experiment couldn't easily overcome was: what happens when there is a mix of Latin and non-Latin script? Probably better to leave it to editors who (presumably) know what they are doing rather than rely on code to make the right determination for every case.
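To make the mixed-script difficulty concrete, here is a minimal sketch of what such auto-detection might look like. The function name `classify_script` and the cut-off at U+024F (end of Latin Extended-B, as suggested elsewhere in this thread) are assumptions for illustration only; nothing like this is claimed to exist in Module:Citation/CS1. It uses plain Lua byte handling and assumes well-formed UTF-8.

```lua
-- Hypothetical helper, not part of Module:Citation/CS1.
-- Classifies a UTF-8 title as "latin", "non-latin", or "mixed".
-- Assumption: code points above U+024F count as non-Latin. This is a
-- rough cut-off (it misclassifies, e.g., Latin Extended Additional).
local function classify_script(title)
    local has_latin, has_other = false, false
    local i = 1
    while i <= #title do
        local b = title:byte(i)
        local cp, len
        -- Decode one UTF-8 sequence starting at byte i.
        if b < 0x80 then
            cp, len = b, 1
        elseif b < 0xE0 then
            cp, len = b % 0x20, 2
        elseif b < 0xF0 then
            cp, len = b % 0x10, 3
        else
            cp, len = b % 0x08, 4
        end
        for j = i + 1, i + len - 1 do
            cp = cp * 64 + (title:byte(j) or 0x80) % 0x40
        end
        if cp > 0x024F then
            has_other = true
        elseif (cp >= 0x41 and cp <= 0x5A)      -- A-Z
            or (cp >= 0x61 and cp <= 0x7A)      -- a-z
            or cp >= 0xC0 then                  -- accented Latin letters
            has_latin = true
        end
        -- Digits, spaces and ASCII punctuation are treated as neutral.
        i = i + len
    end
    if has_latin and has_other then
        return "mixed"
    elseif has_other then
        return "non-latin"
    end
    return "latin"
end
```

A title such as `Hànyǔ yǔyīn shǐ 汉语语音史` comes back as "mixed", which is exactly the case where code alone cannot decide how to style the string and the editor has to.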

Because this is not just for Asian-language-only titles but for all non-Latin script titles, I think that we should place the script title (presumably the original title) ahead of transliterated and translated titles because the latter two derive from the original language title.

Trappist the monk (talk) 10:06, 15 September 2014 (UTC)

The author name is separated from the title by the year and a period. I would also argue that a romanized title should be very strongly recommended, because of the opacity of the non-Latin title to most readers. (Of course they may not know what the pinyin or romaji title means either, but at least they'll have an idea of its pronunciation, and may recognize some words.) Regarding the ordering, you propose to ignore the common treatment of Chinese and Japanese titles in English-language publications – are there any publications that use your preferred ordering? Kanguole 12:52, 15 September 2014 (UTC)
Since dates aren't required, they aren't always present so the normal terminal punctuation may not be sufficient especially when titles heretofore have been distinctly styled either by italics or quotes. Yep, strong recommendation for transcribed titles is a must; that part goes in the template documentation.
CS1 is a general purpose tool and this feature, if it is ever implemented, must apply to more than just Asian-script titles. So, yeah, I am ignoring the common treatment you describe in favor of what I perceive to be a treatment that, while not perfect in all cases, is acceptable. It is entirely possible that as time progresses and we gain experience with the feature we can provide more nuanced language specific treatment. We aren't there yet.
I would like to define a minimal implementation solution that can be taken live. We can then gauge how editors react to it. There are too few who watch this page to make any assessment regarding this feature's use or acceptance in the broader community.
Trappist the monk (talk) 13:52, 15 September 2014 (UTC)
Chinese and Japanese titles are the principal use cases for this extension. Because titles in alphabetic scripts can usually be deduced from their standard romanization, style guides such as the Chicago Manual of Style recommend that non-specialist works use just the romanized title for titles in non-Latin scripts. But they make an exception for these two scripts (CMOS, 16th ed, 11.110): "Chinese and Japanese characters, immediately following the romanized version of the item they represent, are sometimes necessary to help readers identify references cited or terms used." We already place the characters after the romanization for authors' names; it would be consistent to do that for titles too. Kanguole 23:30, 15 September 2014 (UTC)
So it looks like the simplest way to go is to provide a field where people can write a non-roman title and not italicise it. That is, everything above "Latin Ext-B" in UTF-8#Codepage_layout is straightup? As far as CJK goes that would be better than italics.
About the order of Romanization/Kanji, most of the Japanese on En Wikipedia uses {{nihongo}}, which has 2 main options: order (A) "English (Kanji Romanization)" and order (B) "Romanization (Kanji)". There is no option (C) for "Kanji (Romanization)". A typical real-life example on Wikipedia might be Eiichiro Oda. It has both (A) and (B) in the titles in the Works section, but not (C). So it looks like romanization should come first.
That said, I wonder whether editors will actually provide the romanization. We can certainly encourage it, but it is extra work. And sometimes you don't know. If a Japanese title contains a proper name it can take up to 1/2 hour of research on the Internet to discover how it's pronounced. Editors aren't going to do that. They'll just write the kanji, and I think that's OK too. --Margin1522 (talk) 06:44, 16 September 2014 (UTC)
Here is a comparison of the four versions of {{nihongo}} and {{asiantitle}}:
  1. {{Nihongo|Tokyo Tower|東京タワー|Tōkyō tawā}} → Tokyo Tower (東京タワー, Tōkyō tawā) – translation, original language, transliteration
  2. {{Nihongo2|東京タワー}} → 東京タワー – only original language so not applicable
  3. {{Nihongo3|Tokyo Tower|東京タワー|Tōkyō tawā}} → Tōkyō tawā (東京タワー, Tokyo Tower) – transliteration, original language, translation
  4. {{Nihongo4|Tokyo Tower|東京タワー|Tōkyō tawā}} → Tokyo Tower (東京タワー, Tōkyō tawā) – translation, original language, transliteration
  5. {{Asiantitle|東京タワー|Tōkyō tawā|Tokyo Tower|j}} → Template:Asiantitle – original language, transliteration, translation
In CS1, translation (|trans-title=) always follows the title. Ignoring the translations in the above templates, except for {{nihongo3}} original language precedes transliteration in the rendering.
Here's another interesting data point. Transclusion counts of the templates might be taken as an indication of preference by editors who use the templates. Yeah, I know they weren't created simultaneously so {{nihongo}} from 2006 has a big advantage over the others:
  • {{nihongo}} (2006) → 70515
  • {{nihongo3}} (2008) → 718
  • {{nihongo4}} (2012) → 186
  • {{asiantitle}} (2010) → 157
I have no idea if editors will provide romanization. Agreed that we should encourage it in the documentation; MOS:ROMANIZATION and the like.
So it appears that there is something of a conflict. Chicago Manual of Style vs. common usage on Wikipedia. CS1 is loosely based on CMOS and other published style guides but is ultimately driven by its users.
Trappist the monk (talk) 11:55, 16 September 2014 (UTC)
WP:MOSCHINA says to use pinyin (汉字) in running text and pinyin 汉字 for titles in citations, both of which are in line with CMOS. WP:MOSJAPAN has no guidance on citations. The usage of {{nihongo}} gives little information about citations, as most are used for the bolded title of articles (e.g. the above example in Tokyo Tower – Chinese articles use the {{zh}} template for a similar purpose). Most of the other uses of the template don't use all three parameters and few are in citations. I also noticed a lot of cases where people are putting romaji in the first parameter instead of the third. For example in Fist of the North Star we find
*{{cite book|title={{nihongo|Hokuto no Ken Kanzen Tokuhon|北斗の拳 完全読本||"The Complete Guide to Fist of the North Star"}}|ISBN=978-4-7966-5856-0}}
to obtain something similar to the style recommended by CMOS:
  • Hokuto no Ken Kanzen Tokuhon (北斗の拳 完全読本, "The Complete Guide to Fist of the North Star"). ISBN 978-4-7966-5856-0.
So I don't think there's evidence of a widely-used Wikipedia style that conflicts with CMOS. Kanguole 17:28, 16 September 2014 (UTC)
Very well. I've changed |logogram= to |script-title= and Module:Citation/CS1/sandbox so that the rendered order is |title=transliteration, |script-title=original language |trans-title=translation. This change can be seen in the six citations above.
Trappist the monk (talk) 18:13, 16 September 2014 (UTC)
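The rendered order described above (transliteration, then original script, then translation) can be sketched roughly as follows. The function `render_title` is hypothetical and is not the sandbox code; it only assembles wikitext in that order, italicizing the transliteration, leaving the script title upright, and bracketing the translation. The underlining that the sandbox applied at this point in the discussion is deliberately left out.

```lua
-- Hypothetical sketch, not the Module:Citation/CS1/sandbox implementation.
-- Joins |title=, |script-title= and |trans-title= in the agreed order.
local function render_title(title, script_title, trans_title)
    local parts = {}
    if title and title ~= "" then
        parts[#parts + 1] = "''" .. title .. "''"      -- italic wikimarkup
    end
    if script_title and script_title ~= "" then
        parts[#parts + 1] = script_title               -- upright, no markup
    end
    if trans_title and trans_title ~= "" then
        parts[#parts + 1] = "[" .. trans_title .. "]"  -- bracketed translation
    end
    return table.concat(parts, " ")
end
```

Called with the example title from earlier in the thread, this yields the form `''Hànyǔ yǔyīn shǐ'' 汉语语音史 [History of Chinese phonetics]`; any of the three parts may be omitted.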
The six citations look good to me (except for the underline, I'm still a bit dubious about that). My version of CMOS is older and doesn't have that passage. Do they give any long examples? Maybe we need to consult a bibliography with Chinese titles in a book edited by the U. of Chicago Press. That should settle any questions about how they handle it. --Margin1522 (talk) 01:48, 18 September 2014 (UTC)
They include an example journal article citation in each language. That section of CMOS is reproduced here, wrapped in some commentary by the library. Elsewhere CMOS has examples of Arabic (11.100) and Russian (11.120), but for those languages they recommend using the transliterated title only. I'm also dubious about adding underlining, and when the title itself is a link, as in the above examples, one gets two underlines on mouseover. Kanguole 09:18, 18 September 2014 (UTC)
Thanks, that looks good. If citations could be made to look like that I would be happy. --Margin1522 (talk) 19:30, 18 September 2014 (UTC)

At Help talk:Citation Style 1/Archive 6#Wikimarkup and COinS metadata I describe a proposed change to Module:Citation/CS1 that strips the common apostrophe markup from |title= so that it doesn't corrupt the COinS metadata. The change will allow editors to add this basic styling to titles or to 'undo' the default italic style of some CS1 templates.

Trappist the monk (talk) 12:54, 18 September 2014 (UTC)

So then we could just ask editors to write this?
|title=Tōkyō tawā ''東京タワー'' |first=....
Tōkyō tawā would be italicized in the normal way and 東京タワー would be straightup. This looks like the simplest solution yet, and the result would be just as good as the extra field. If this happens I volunteer to add an explanation to WP:MOS-JP to encourage it, and discourage putting {{nihongo}} in citations.
The question is whether editors will actually do it. Many of them don't read the MOS, e.g. the editors coming over from the Japanese wiki to add information about current events. Perhaps we could lobby the author of AWB to add this to the list of things it checks. There are a lot of editors who use AWB to patrol new posts. We could ask them to fix any kanji in (new or existing) title strings that lack the double apostrophes. AWB already checks citations for invalid parameters. --Margin1522 (talk) 19:30, 18 September 2014 (UTC)
Not a complete solution, I think. Simply allowing the use of wikimarkup in |title= values, doesn't address right-to-left language scripts nor does it enforce consistent transcription-script-translation order.
Trappist the monk (talk) 12:34, 19 September 2014 (UTC)
I was just going to make the same comment on language isolation. This could be resolved by having the editor wrap the Asian text in <bdi> and having the template strip it out, but that puts more on the editor and continues the slippery slope of using HTML inside the templates. --  Gadget850 talk 12:41, 19 September 2014 (UTC)

This is unrelated to Asian, but is this the policy?


cite web

  • "Hongkongs Zukunft entscheidet sich auf der Straße". Die Zeit (in German). 30 September 2014. {{cite web}}: Missing or empty |url= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

cite book

  • Hongkongs Zukunft entscheidet sich auf der Straße (in German). 2014. {{cite book}}: Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

According to CMOS14 (15.118), the translated title of a book should be upright. – Margin1522 (talk) 18:32, 30 September 2014 (UTC)

Policy? I don't know about that but it has been ever thus:

Cite web comparison
  • Wikitext: {{cite web|date=30 September 2014|language=German|title=Hongkongs Zukunft entscheidet sich auf der Straße|trans_title=The future of Hong Kong will be decided on the street|work=Die Zeit}}
  • Live: "Hongkongs Zukunft entscheidet sich auf der Straße". Die Zeit (in German). 30 September 2014. {{cite web}}: Missing or empty |url= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
  • Sandbox: "Hongkongs Zukunft entscheidet sich auf der Straße". Die Zeit (in German). 30 September 2014. {{cite web}}: Missing or empty |url= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
Cite book comparison
  • Wikitext: {{cite book|language=German|title=Hongkongs Zukunft entscheidet sich auf der Straße|trans_title=Hong Kong's future will be decided on the street|year=2014}}
  • Live: Hongkongs Zukunft entscheidet sich auf der Straße (in German). 2014. {{cite book}}: Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
  • Sandbox: Hongkongs Zukunft entscheidet sich auf der Straße (in German). 2014. {{cite book}}: Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)

Trappist the monk (talk) 21:07, 30 September 2014 (UTC)

I see. If possible, could we consider no italics for translated titles, when the original title is given? That's what CMOS recommends, and I prefer it that way, also outside of Wikipedia. It also goes for capitalization. According to CMOS the translated title is just for information, so only the first word gets capitalized. (Although most editors seem to show a strong preference for treating the translated title as a real title, with title caps. So maybe they would prefer the italics too.) – Margin1522 (talk) 23:43, 30 September 2014 (UTC)
If it is important to you, I think that you should raise the issue at some other venue than this backwater talk page; there are 51 others watching us. Some of what I think you are asking for is beyond the abilities of Module:Citation/CS1. In your examples, for instance, CS1 would have to know that Hong Kong should be capitalized so the choice of title case or sentence case will need to be left to the editor. Rendering |trans-title= as undecorated text can be done.
Trappist the monk (talk) 00:33, 1 October 2014 (UTC)

Protected edit request on 13 October 2014

Please revert the recent changes to the above 4 modules (to the pre-October 11 version), as it's causing a malfunction on a large number of pages, such as Mary Elizabeth Braddon, Hinduism, and Language deprivation experiments. Jackmcbarn (talk) 00:04, 13 October 2014 (UTC)

Attention Trappist the monk. Malformed |title= parameter values, including those with italic markup in citations, are causing Lua errors. Examples:
{{cite book |last=Beller |first=Anne-Marie|title=''[http://www.mcfarlandpub.com/book-2.php?id=978-0-7864-3667-5%20 ''Mary Elizabeth Braddon: A Companion to the Mystery Fiction'']''|location=Jefferson, NC |publisher=McFarland |year=2012}}
{{Citation|last= Walker|first=Benjamin|year=1968|title=The Hindu world: an encyclopedic survey of Hinduism
  • (Lua error, and a few subsequent citations hidden, because this citation template was not closed)
{{cite book |last=Shattuck |first=Roger |title=''[http://books.google.com/books?vid=ISBN1568360487&id=9COPTtX16IIC&pg=PP1&lpg=PP1&ots=TKQrin2p4P&dq=%22forbidden+experiment%22&sig=FFI91sIU9yzIyP8yvBxO-1xATgU The Forbidden Experiment: The Story of the Wild Boy of Aveyron]'' |origyear=1980 |year=1994|publisher=Kodansha International |isbn=1-56836-048-7 }}


In case you can't see the error, it is "Lua error in Module:Citation/CS1 at line 741: invalid capture index." One might argue that these citations are malformed, but presumably they were displaying more or less correctly before the updates. – Jonesey95 (talk) 00:34, 13 October 2014 (UTC)
I have disabled two functions: strip_apostrophe_markup() and make_coins_title() which uses it.
Trappist the monk (talk) 00:49, 13 October 2014 (UTC)

I think I have fixed the problem. Once again, patterns in string.gsub() have bitten me, though more accurately, this time it is Lua magic characters in the original string when that string is used as a replacement. strip_apostrophe_markup() gets all text between matching italic or bold wikimarkup. It then replaces the original string (which includes the markup) with the text found inside the markup:

original string: ''test%20string'' – an italicized string; the %20 is a space character used in urls
the found string: test%20string
now, look in the original string for this pattern: ''test%20string'' and replace it with this: test%20string

In the originally released code, the pattern was not modified to escape the Lua magic characters: ^$().[]*+-?%. These characters have properties similar to those used in regular expressions. I fixed part of the problem after the 11 October 2014 update to prevent these magic characters from inappropriate action that was causing similar problems to those that caused the protected edit request. I didn't go far enough. I failed to realize that the replacement is also treated as a Lua pattern and not as a literal string. In my little example above, the substring %2 is treated as a capture identifier in the replacement. There isn't a capture 2, hence the big red error message.

I have created a function escape_lua_magic_chars() that causes string.gsub() to treat the magic characters in these pattern and replacement strings as literal characters so now, the example broken citations work as intended:

  • {{Citation/new|last= Walker|first=Benjamin|year=1968|title=The Hindu world: an encyclopedic survey of Hinduism
  • Beller, Anne-Marie (2012). Mary Elizabeth Braddon: A Companion to the Mystery Fiction. Jefferson, NC: McFarland. p. 100, 102–105. {{cite book}}: External link in |title= (help)

To the Beller cite I added a linked |page= parameter because get_coins_pages() now uses escape_lua_magic_chars().
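For readers who want to reproduce the failure, the sketch below replays the `''test%20string''` example with plain `string.gsub()`. The body of `escape_lua_magic_chars()` shown here is an assumption written for illustration; only the function's name and purpose come from this thread, and the module's real code may differ. The relevant Lua behaviour is that, in a replacement string, `%` followed by a digit refers to a capture, so a literal `%20` has to be written as `%%20`.

```lua
-- Illustrative reimplementation; the module's actual function may differ.
-- Prefixes every Lua magic character ( ^$().[]*+-?% ) with '%' so that
-- string.gsub() treats the text literally, both as a pattern and as a
-- replacement string.
local function escape_lua_magic_chars(s)
    return (s:gsub("[%^%$%(%)%.%[%]%*%+%-%?%%]", "%%%0"))
end

local original = "''test%20string''"   -- italic markup around a url fragment
local found = "test%20string"          -- text found inside the markup

-- Pattern escaped, replacement NOT escaped: '%2' in the replacement is
-- read as capture 2, which does not exist, so gsub raises an error.
local ok, err = pcall(string.gsub, original,
    escape_lua_magic_chars(original), found)

-- Pattern and replacement both escaped: the markup is stripped cleanly.
local fixed = original:gsub(escape_lua_magic_chars(original),
    escape_lua_magic_chars(found))
```

Here `ok` is false and `err` holds the "invalid capture index" message, while `fixed` is the literal text `test%20string`. An alternative that avoids escaping the replacement altogether is to pass a function as the third argument to `string.gsub()`, since a function's return value is inserted as-is.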

This is a crude test that includes all of the Lua magic characters:

  • '''''^Bold$''' (italic).'' [title] and%20''italic*'', '''+bold''', '''bold-again''', last ''italic?'' → "^Bold$ (italic). [title] and%20italic*, +bold, bold-again, last italic?".

Trappist the monk (talk) 12:53, 13 October 2014 (UTC)

@Trappist the monk: The following still causes errors: {{cite web|url=https://example.com|title='''Foo'''''Bar''}} yields "FooBar".. Jackmcbarn (talk) 00:42, 16 October 2014 (UTC)

Thanks. Fixed I think; and the slightly different version: {{cite web|url=https://example.com|title=''Foo'''''Bar'''}} → "FooBar".
Hate it when I miss obvious stuff like that. Glad that you and others are out there keeping an eye on things.
Trappist the monk (talk) 03:13, 16 October 2014 (UTC)

@Trappist the monk: Another bug: {{cite web|title='''''foo}} causes an infinite loop and uses up all of the page's allowed Lua time. Jackmcbarn (talk) 02:36, 22 October 2014 (UTC)

Thank you. A completely revised, much, much, much simpler version is now in the sandbox. Tested against the problematic cites in this thread and the test cites at Help talk:Citation Style 1/Archive 6#Wikimarkup and COinS metadata.
Unless there comes a flood of similarly broken cites this change will be made with the next update to the live module. I've fixed the one instance you found.
Trappist the monk (talk) 10:51, 22 October 2014 (UTC)
@Trappist the monk: I found another case in the wild at Sankowskya. It wasn't recently edited, so I expect that we'll keep finding more as the job queue catches up. I think we should deploy the fix now. Jackmcbarn (talk) 14:09, 24 October 2014 (UTC)
Thank you. It is done.
Trappist the monk (talk) 14:37, 24 October 2014 (UTC)
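The thread does not show the revised code, so the following is only a guess at what a "much simpler" approach could look like, not the version that was deployed. It removes every run of two or more apostrophes in a single `string.gsub()` pass. Because there is no loop that searches for matching open and close markup, unbalanced input such as `'''''foo` cannot spin forever, and because the replacement is the empty string, magic characters in the title are never reinterpreted. One known simplification: a run of four apostrophes, which wikitext reads as an apostrophe plus bold, is removed entirely.

```lua
-- Hypothetical single-pass stripper, not the deployed module code.
-- Removes italic ('') and bold (''') wikimarkup by deleting any run of
-- two or more apostrophes; a lone apostrophe (as in "it's") is kept.
local function strip_apostrophe_markup(title)
    return (title:gsub("''+", ""))
end
```

Applied to the problem cases reported above, `'''Foo'''''Bar''` and `''Foo'''''Bar'''` both reduce to `FooBar`, and `'''''foo` reduces to `foo`.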

Utm detection

A question from a non-professional (in terms of all the utm thing): shouldn't all the Google Analytics things (like in the example) be tracked in some category so that they can be fixed later? I think nobody knows that they should be removed when used in a citation. Or is that not needed at all? You are going to ask how to track them? I would say that a check could be added for strings like ?utm_source, &utm_medium, &utm_campaign etc. --Edgars2007 (talk/contribs) 18:33, 26 November 2014 (UTC)

I remember reading somewhere that this kind of tracking information should be stripped from |url= values. Just where that was, I no longer recall, nor can I find it in the places that I looked. Still, detecting and categorizing pages with these kinds of urls shouldn't be too difficult if we decide that we should do it. Are there other tracking strings from sites other than Google (Amazon, etc) – large on-line vendors would seem to be the prime candidates?
Trappist the monk (talk) 19:04, 26 November 2014 (UTC)
I think WP:VPT would be a good place to gather some opinions about this and to get answer about other tracking strings. --Edgars2007 (talk/contribs) 19:20, 26 November 2014 (UTC)
From Template:Cite journal#URL: "Remove spurious tracking parameters from URLs, e.g. #ixzz2rBr3aO94 or ?utm_source=google&utm_medium=...&utm_term=...&utm_campaign=.... Do not link to any commercial booksellers, such as Amazon.com." – Jonesey95 (talk) 23:01, 26 November 2014 (UTC)
Ohconfucius' script to fix sources removes link tracking such as #ixzz2rBr3aO94 from URLs. GoingBatty (talk) 23:13, 26 November 2014 (UTC)
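As a rough idea of how cheap such a check could be, here is a minimal sketch that flags the two kinds of tracking residue named in this thread: `utm_*` query parameters and `#ixzz` fragments. The function name `has_tracking_params` is invented for this example, no such check is claimed to exist in the module, and the list of trackers is limited to what was mentioned above.

```lua
-- Hypothetical check, not part of Module:Citation/CS1.
-- Returns true when a url carries utm_* query parameters or an
-- #ixzz... fragment, the two tracker forms mentioned in this thread.
local function has_tracking_params(url)
    -- '?' or '&' followed by utm_<name>=
    if url:find("[?&]utm_%w+=") then
        return true
    end
    -- plain (non-pattern) search for the #ixzz fragment prefix
    return url:find("#ixzz", 1, true) ~= nil
end
```

A module could use a test like this to add a maintenance category rather than a red error message, leaving the actual clean-up to a script or bot.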

Category:CS1 maint: Date and year

I was looking at Michael Foot and spotted Category:CS1 maint: Date and year on the article but no red error message to show which reference was in error. There are other red errors for accessdate & chapter problems. Keith D (talk) 15:51, 30 November 2014 (UTC)

@Keith D: I opened the article in edit mode and searched for "year", and found reference #25 has the error. GoingBatty (talk) 15:58, 30 November 2014 (UTC)
Category:CS1 maint: Date and year is not an error category so pages that land in it are not flagged with error messages. This is because use of both was a requirement under {{citation/core}} to create correctly disambiguated CITEREF anchors for authors who published multiple works in the same year. Because separate |year= and |date= parameters are no longer required to support CITEREF anchor disambiguation, |year=, while still supported, can be removed when the value in |year= is the same as the year portion of |date= (|date=January 2008 |year=2008). The year portions being the same, if the disambiguation character is moved to |date=, then too, |year= may be removed.
Trappist the monk (talk) 16:20, 30 November 2014 (UTC)
Thanks for explanation. Though without some sort of flag then you have to search through each of the refs to find which one is causing the problem. Keith D (talk) 16:54, 30 November 2014 (UTC)
True, but this is the sort of problem that can be easily attacked by an AWB script (which I have in the works) so that editors need not worry over much about 'fixing'.
Trappist the monk (talk) 16:58, 30 November 2014 (UTC)
(edit conflict)We have had complaints about the red error messages, especially when they highlight a condition that may or may not be an actual error. I wonder if Frietjes could work up a javascript similar to User:Frietjes/findargdups to help editors find the redundant year/date parameters. – Jonesey95 (talk) 17:00, 30 November 2014 (UTC)
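The redundancy rule explained above (|year= may go when it matches the year portion of |date=) is simple enough to sketch. The helper names below are hypothetical and the date handling is deliberately naive: it only looks for a four-digit year at the end of the value (optionally followed by a CITEREF disambiguation letter) or at the start of a YYYY-MM-DD date.

```lua
-- Hypothetical helpers, not taken from Module:Citation/CS1 or any script.
-- Extracts the four-digit year from values such as "January 2008",
-- "2008a" or "2008-01-15"; returns nil when no year is found.
local function year_portion(value)
    if not value or value == "" then
        return nil
    end
    return value:match("(%d%d%d%d)%a?$") or value:match("^(%d%d%d%d)%-")
end

-- True when |year= repeats the year already present in |date=.
local function year_is_redundant(date, year)
    local y = year_portion(year)
    return y ~= nil and y == year_portion(date)
end
```

Note that when |year= carries a disambiguation letter (for example `2008a`), the letter has to be moved to |date= before |year= is removed, as described above; this sketch only reports that the year portions match.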

JSTOR parameter encoding

I have come across this paper which is cited here and here on en-wp. The jstor value of 10.1086/515973 looks like a DOI but fails to resolve as such, so I set |jstor=10.1086/515973. This gets URL-encoded, turning / into %2F, which then means the link doesn't work (seems it would work if that encoding were not applied). So two questions please: is URL encoding of |jstor= deliberate/somehow required by other logic? I suppose it could be that JSTOR have failed to register a set of DOIs (other articles in same issue of journal don't resolve either), anybody know anything about these 10.1086 values? Thanks Rjwilmsi 19:17, 26 November 2014 (UTC)

Can't say that I know anything about those identifiers. There is a flag set in Module:Citation/CS1/Configuration that for JSTOR is :encode = true. I don't know why it is that way. Documented code? What's that? Most JSTOR identifiers are simply a string of digits so uri encoding would seem to be unnecessary. As a test, I have changed the flag to false:
"Title". JSTOR 10.1086/515973. {{cite journal}}: Cite journal requires |journal= (help)
This does what it should do. Will it always work? Don't know. So here's the question: Do I revert this little change or do we leave it and see if anything breaks when I update the live module at week's end?
Trappist the monk (talk) 20:06, 26 November 2014 (UTC)
There are a couple of references at Serial Item and Contribution Identifier#References that might explain JSTOR identifiers, but irritatingly both are dead links. Browsing in JSTOR suggested to me that a high proportion of identifiers (for books at least a third) have the form "publisher_id/item_id" ("10.1086" in the example appears to be a publisher id) rather than just "item_id". The only non-alphanumeric characters in either part of the JSTOR id that I found were "." and "_". So (a) it's essential not to encode "/"; (b) it's probably ok not to encode any part of a JSTOR id. Peter coxhead (talk) 21:58, 26 November 2014 (UTC)
I found the first dead link at archive.today here. It doesn't say much, only that JSTOR uses SICI. But about the JSTOR implementation, I did find this. See page 4 "SICIs in Active Use" for an explanation of how JSTOR URLs use SICI. – Margin1522 (talk) 08:56, 6 December 2014 (UTC)
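
To make the encoding problem discussed above concrete, here is a small Python sketch of what happens to the identifier with and without percent-encoding. It is not the module's Lua code: the `jstor_url` function and the URL prefix are assumptions made for illustration, and the all-digit identifier is an invented example.

```python
from urllib.parse import quote

# Assumed prefix for illustration; the live module's prefix may differ.
JSTOR_PREFIX = "https://www.jstor.org/stable/"

def jstor_url(identifier, encode):
    """Build a JSTOR link, optionally percent-encoding the identifier."""
    if encode:
        # safe="" makes quote() encode "/" too, which mimics the behaviour
        # reported in this thread (encode = true).
        identifier = quote(identifier, safe="")
    return JSTOR_PREFIX + identifier

# With encoding, "/" becomes %2F and the link breaks.
print(jstor_url("10.1086/515973", encode=True))
# Without encoding, the slash survives.
print(jstor_url("10.1086/515973", encode=False))
# An all-digit identifier (hypothetical value) is unchanged either way.
print(jstor_url("2678888", encode=True))
```

Since the only non-alphanumeric characters reported above are ".", "_" and "/", and the first two are left alone by percent-encoding anyway, skipping the encoding changes nothing except the slash.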
Off topic
P.S. I hope the formatting error in Amos, Robert (2012), "Show Reports: Malvern Show", The Alpine Gardener, 80 (1): 80–83, where the article title is in italics not roman+quote marks, will be fixed this weekend. It's been around far too long in my view. Otherwise I'll join other editors in changing |title= to |contribution= when using {{Citation}} for journal articles. Peter coxhead (talk) 22:36, 26 November 2014 (UTC)
Yes. See Help talk:Citation Style 1#Update to the live CS1 module weekend of 29–30 November 2014. In particular note the item: revised |chapter=, |trans-chapter=, and |chapter-url= handling. Those who have lost faith and succumbed to the Siren-songs of any of the |contribution= siblings are going to awaken to find their citations dashed upon the rocks:
Amos, Robert (2012), The Alpine Gardener, 80 (1): 80–83 {{citation}}: |contribution= ignored (help); Missing or empty |title= (help)
But the faithful, who stuffed their ears with wax or bound themselves to the mast, will escape that dread lee shore:
Amos, Robert (2012), "Show Reports: Malvern Show", The Alpine Gardener, 80 (1): 80–83
Trappist the monk (talk) 23:16, 26 November 2014 (UTC)
It's always good to know there's a reward (of sorts) for remaining faithful... Thanks. Peter coxhead (talk) 08:36, 27 November 2014 (UTC)