Module talk:WikidataIB/Archive 3
This is an archive of past discussions about Module:WikidataIB. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
New parameter for getValue sought to avoid attempt to resolve redirects
Lines 371 and 372 perform expensive operations that record transclusions to pages. This occasionally results in the module outputting a plain label when an article exists but is not a redirect (the module assumes it is a DAB page). However, because of artitle.id and artitle.isRedirect, even though no clickable link is produced, a page link is still registered, causing the page to pop up on Disambiguation pages with links. I would like a parameter to pass to getValue that will disable lines 371/372. If no valid sitelink exists, go ahead and link to Wikidata, without checking for a local redirect. Terribly verbose, but something like "noRedirectResolution". Alternatively, disable this section entirely until such time as we can test for page existence without registering a page link. -- ferret (talk) 00:32, 30 May 2018 (UTC)
- The reason for that code is that Wikidata refuses to allow an entry like archaeologist (Q3621491) to have a sitelink to Archeologist, merely because the latter is a redirect (the English Wikipedia discusses 'archaeologist' as part of the 'Archaeology' article). If it were not for that code, biographical infoboxes would not have links to many of the occupations, for example. Who would make use of a parameter that disabled the code? Would any infobox designer use a parameter that disabled linking for an unpredictable number of values or fields? I seriously doubt it. There is no problem with expensive calls in the module – the call mw.title.new(id) is not used. The problem is that mw.title.new(label, namespace) – which is called when there is a label, but no sitelink – incorrectly makes an entry in the table used for "What links here". That affects every Lua module using that call, not just WikidataIB, and that is the problem that needs to be fixed. I'll investigate alternate methods of deciding whether a label represents a linkable article on the local Wikipedia. In the meantime, anybody can bypass the problem by supplying a local value. --RexxS (talk) 15:07, 30 May 2018 (UTC)
- @Ferret: I've been trying to minimise the use of mw.title.new(label, namespace) in the sandbox, but I'm having difficulty testing because I can't find examples of the effect you are referring to. Could you list perhaps half-a-dozen articles where the problem occurs, please? and I'll see if I can improve matters at all. --RexxS (talk) 15:50, 30 May 2018 (UTC)
- @RexxS: mw.title.new itself is not an issue, it is calling .id or .isRedirect on the resulting object. These are the ones that register page links. Even if the resulting output does not contain a link, as soon as .id/.isRedirect are used, a transclusion is registered. I have not been able to find any way to test for page existence that does NOT cause a page link to register. The source of complaint resulting in this request is at Template talk:Infobox video game#The series field and Template talk:Infobox video game#Skate (video game) and Template:Infobox video game (series= field). In both cases, editors who are attempting to resolve DABLINK reports have come across cases where the series parameter is pulling from Wikidata, and the linked item does not have an enwiki article. Lines 371 and 372 take the linked item's label and attempt to discover if it's a redirect (neither case was), and then output a plain label (assuming the article to be a DAB). On the surface, this is exactly the behavior you would want. Unfortunately, calling .id and .isRedirect registers a page link anyway, and the article is pushed into the DABLINK reports. See the back and forth edit history on Skate (video game). As a test case, you can look at Module:Sandbox/Ferret. If you check What links here on User:Ferret/sandbox, you'll see that User:Ferret/sandbox2's use of the module and artitle.isRedirect (well, redirectTarget, but same result) causes it to register as a linked page. I used makeTitle but new works the same way. It's the call to the title object that actually makes the link, not mw.title.new. -- ferret (talk) 16:42, 30 May 2018 (UTC)
- @Ferret: Sure, that's true. But as soon as I create the object, I test for its id and isRedirect properties, otherwise there's no point in creating it, so we end up with the same result. I've re-written the sandbox code so that if there is no sitelink, I check whether the value itself (which is a wikibase-item) has a property instance of (P31) equal to Wikimedia disambiguation page (Q4167410). If so, I give the plain label and skip checking for existence of an article and whether it's a redirect. Now, if a wikibase entity exists which has no sitelink to enwiki, has an English label, is not an instance of a dab page, has an article title on enwiki the same as the label, but is not a redirect, then I don't know what it is, but it will turn up in Disambiguation pages with links. However, I hope there won't be many of those once the sandbox code is implemented.
- As for your sandbox and the rest, I'm aware of the spurious linking, and I understand how it happens, but I'm still waiting for somebody to give me a few examples of actual articles using WikidataIB that provoke the problem. Then I can test whether the sandbox code fixes the problem. I'm not keen to modify the main module until I can show that the sandbox code does what I'm expecting it to do. Can you help with that? --RexxS (talk) 18:26, 30 May 2018 (UTC)
- Skate (video game) does it. It is currently not doing it because the wikidata pull has been suppressed. Same for Eye of the Beholder (video game). These are the two that started the discussions of spurious links. Removing |series from either should do it. -- ferret (talk) 21:03, 30 May 2018 (UTC)
- @Ferret: Thanks. That let me test other possibilities. I agree that the call mw.title.new(label, namespace) alone doesn't cause the spurious link. I thought I could circumvent the issue by replacing the test using the title object's id and isRedirect properties with one that uses the title object's redirectTarget property. Unfortunately, despite the documentation not indicating it, it also creates a spurious link. I'll have another think. --RexxS (talk) 13:32, 31 May 2018 (UTC)
- I found the same issue. Several of the title object properties are not explicitly labelled as creating a site link but do. It seems to be any of the expensive ones that cause a DB retrieval though. -- ferret (talk) 13:36, 31 May 2018 (UTC)
- That's right. It's because the devs used a short-cut to record the change of state in the same table that records links. And I'm now having an argument with Anomie on this very issue at mw:Extension talk:Scribunto/Lua reference manual #Title object and spurious links because I dared to update his documentation. If you have a moment to spare, you might pitch in to emphasise just how important it is to fix this bug, because otherwise the title library is pretty useless in Wikipedia. Cheers --RexxS (talk) 17:51, 31 May 2018 (UTC)
- Anomie appears to be correct in the strictest sense. isRedirect is a link, redirectTarget is a transclusion. Both appear in "What Links Here". I do not know if DABLINK stuff tells the two apart or not. If so, it is worth trying redirectTarget to find if a redirect works without tripping DABLINK. -- ferret (talk) 19:06, 31 May 2018 (UTC)
- Well, Anomie is insistent that redirectTarget doesn't create a link, so I'll happily refer any future complaints to him to sort out. The sandbox code does indeed only produce a transclusion, so I'm happy that it's an improvement, but I'd be willing to bet that the gnomes who look for links to dab pages won't see it that way. I'm waiting for the RfC on using Wikidata in infoboxes to be concluded before rolling out the updates currently in the sandbox, but hopefully those will lay to rest many of the current issues that folks have with how we fetch Wikidata. Cheers --RexxS (talk) 20:25, 31 May 2018 (UTC)
- I've also re-enabled my test code in User:RexxS/sandbox/Wikidata so we can check whether that page shows up as a link to the dab page [[Skate]] in the other reports. --RexxS (talk) 20:29, 31 May 2018 (UTC)
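The sandbox logic RexxS describes in this thread (test for a disambiguation item first, and only fall back to the title object when that test fails) can be sketched roughly as follows. This is a minimal illustration and not the WikidataIB source: the `item` table, the `lookup` callback and the function names are hypothetical stand-ins for the Wikidata entity and for the code that would create a title object and read its id and isRedirect properties. What the module does when no local page exists at all is left out; the sketch simply returns the plain label in that case.

```lua
-- Hypothetical sketch, not the WikidataIB source. `item` stands in for the
-- data read from Wikidata; `lookup` stands in for creating a title object
-- (mw.title.new in Scribunto) and reading its properties.
local DAB_QID = "Q4167410" -- Wikimedia disambiguation page

-- True if the item is an instance of (P31) a disambiguation page.
local function isDab(item)
	for _, qid in ipairs(item.instanceOf or {}) do
		if qid == DAB_QID then
			return true
		end
	end
	return false
end

-- Returns the text to display and whether the title lookup was performed.
local function render(item, lookup)
	if item.sitelink then
		-- A sitelink exists, so no guessing is needed.
		return "[[" .. item.sitelink .. "]]", false
	end
	if isDab(item) then
		-- Plain label; the title object is never consulted, so nothing
		-- is recorded in "What links here".
		return item.label, false
	end
	-- Only now inspect the local page. In Scribunto, reading id or
	-- isRedirect at this point is what records the spurious link.
	local page = lookup(item.label)
	if page.id ~= 0 and page.isRedirect then
		return "[[" .. item.label .. "]]", true
	end
	return item.label, true
end
```

The second return value exists only to make the point of the thread visible: every path that returns `true` is a path on which a page link would be registered, whether or not the output contains a link.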
Major update, June 2018
While the Wikipedia:Wikidata/2018 Infobox RfC has been running, I've not updated the main module, but left developments in the sandbox. Now that the RfC has ended, I've updated the main module from the sandbox. This has brought some improvements both in performance and functionality.
- Performance gains will occur because the module now only loads the part of the Wikidata entry that it needs for the call, rather than the whole Wikidata entry (which was formerly the only way of accessing the data). The calls to other Wikidata entries which are not immediately associated with the page are no longer expensive.
- The getValue call now supports ranks via a parameter, offering more flexibility than using getPreferredValue (which is now just a call to getValue with rank set to "best").
- The getValue call now supports returning qualifiers as values in parentheses after the property value.
- The getValue call now allows a number of extra parameters to provide extra functionality such as limiting the number of values returned, or auto-collapsing the list of returned values if the number of values exceeds a given number. Details are in the documentation at Module:WikidataIB/doc #Parameters to getValue.
- There is a new call, getValueByQual, which works similarly to getValue, but only returns value(s) that have a particular qualifier with a given value. Details are in the documentation at Module:WikidataIB/doc #Function getValueByQual.
- There is a new call, getValueByLang, which works similarly to getValue, but only returns value(s) that have the qualifier language of work or name (P407) with a given language code as its value. Details are in the documentation at Module:WikidataIB/doc #Parameters to getValueByLang.
- The getValue call now displays its results a little differently when the value returned has a sitelink available. In those cases, the link remains to the sitelink, but the displayed text uses the site link (with disambiguation text removed) instead of the label. This is a response to the vulnerability of labels to vandalism on Wikidata.
- The wrapper template {{wdib}} is a convenient shortcut for {{#invoke:WikidataIB |getValue | ...}} using the same parameters. For example, the spouse (P26) of Douglas Adams (Q42): {{wdib |P26 |qid=Q42 |fwd=ALL |qual=DATES}} gives Jane Belson (1991–2001)
There are numerous test cases/examples at Module talk:WikidataIB/testing. Please ping me if problems arise. --RexxS (talk) 18:27, 13 June 2018 (UTC)
- This module is getting ridiculously bloated. {{3x|p}}ery (talk) 21:24, 13 June 2018 (UTC)
- Don't be so rude. Nevertheless, you should feel free to make constructive suggestions about which parts of the functionality you would remove to improve the module. --RexxS (talk) 23:25, 13 June 2018 (UTC)
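For readers unfamiliar with ranks, the "best" setting mentioned in the update above follows the usual Wikidata convention: preferred-rank statements are returned if any exist, otherwise normal-rank statements, and deprecated statements are never returned. The following is a minimal sketch of that selection only. The `statements` list of `{ value, rank }` tables and the function name are hypothetical simplifications, not the structure or code that getValue actually uses.

```lua
-- Hypothetical sketch: `statements` is a plain list of tables with
-- `value` and `rank` fields, not the real Wikidata claim structure.
local function bestStatements(statements)
	local preferred, normal = {}, {}
	for _, s in ipairs(statements) do
		if s.rank == "preferred" then
			preferred[#preferred + 1] = s.value
		elseif s.rank == "normal" then
			normal[#normal + 1] = s.value
		end
		-- Deprecated statements are skipped entirely.
	end
	-- Preferred statements win whenever at least one exists.
	if #preferred > 0 then
		return preferred
	end
	return normal
end
```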
getPreferredValue causing problem on page without a WD item ?
I know, the simple answer is "So make the item."
{{Infobox video game}} in use at The Sinking City is not displaying the locally defined value for image=. Currently the article has no WD item linked. Substituting a random QID for testing, the image will then appear. Looks like something isn't gracefully failing when there is no QID. -- ferret (talk) 02:28, 14 June 2018 (UTC)
- I believe this is now fixed. {{3x|p}}ery (talk) 02:38, 14 June 2018 (UTC)
- Confirmed, is fixed, thanks. -- ferret (talk) 02:40, 14 June 2018 (UTC)
Utility functions
One of the important consequences of developing ways of importing Wikidata into other Wikimedia projects is that work done in one project can be put to use in other projects. This module is used on 40 other projects at present, including Commons. The utility functions such as emptyor, getLang, formatNumber, examine, etc. will find use in debugging, constructing infoboxes, or developing functionality. Several of those could be moved into one or more different modules, but that simply means that any other project that wants to use them will end up importing multiple modules for no gain. Keeping all of the utility functions with the main module also means that if I'm asked to debug code on another project, I can be reasonably sure that I have these tools available, even if I can't read the language there.
I therefore disagree with Pppery removing utility functions with no other justification than Remove unused (outside of doc page examples) function that has nothing to do with Wikidata. I use them quite often simply in preview mode just to check or debug something, so their absence from saved pages does not correlate with their usefulness. Frankly, I also don't accept the premise that only functions that are directly related to Wikidata belong in this module. It was created to implement mechanisms for including Wikidata in infoboxes. Functions like emptyor that are useful in constructing infoboxes belong in the module, even if they don't connect directly to Wikidata. Similarly, the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki, even if it seems superfluous on enwiki. Leaving them in place, like the code that detects which project the module is on, allows one set of code to be used across multiple projects (with local internationalisation, of course).
If there were any evidence that shortening modules by removing functionality improves performance, I'd reconsider the goal of keeping one set of code across projects. But for the present, I think that better reasons need to be given for removing functionality. --RexxS (talk) 01:18, 15 June 2018 (UTC)
- My desire to keep this module from getting overly bloated isn't because I'm trying to make my module more performant, but rather a goal to keep this module from becoming a monolith of unrelated tools. Each module should only have the things that it directly needs in the place it is directly used, not random other miscellany, not things that are only useful in slightly different contexts (like other Wikis), thus things that don't contribute to this purpose are not necessary and should be purged. The English Wikipedia isn't multi-lingual, so "the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki" is irrelevant. As I said before, this is Module:WikidataIB, and should contain only functions that actually involve wikidata and infoboxes, not other random miscellany. {{3x|p}}ery (talk) 01:41, 15 June 2018 (UTC)
- Everyone would consider it ridiculous to have a template that did something like {{#switch:{{{1|}}}|foo = <code foo>|bar = <unrelated code bar> | baz = <unrelated code baz> | quux = <unrelated code quux>}}. How is this module somehow different? {{3x|p}}ery (talk) 01:45, 15 June 2018 (UTC)
- (ec) I'll try to sort out what's going on later. I was thinking of reverting all Pppery's adjustments because they were often unnecessary and abrasive, but there might have been some points that should be retained. In particular the procedure of fiddling in a live module with many thousands of transclusions takes bold in a bad direction. The template editor right should be retained only by editors with the right temperament for collaboration in a technical area. Meaningless optimization of this module for use on enwiki in a manner that complicates its use on other projects is a bad idea. There might be some point if there were test cases that showed the optimization saved significant time or resources, but of course there aren't because there is no observable improvement. Johnuniq (talk) 01:46, 15 June 2018 (UTC)
- @Pppery: I'm not going to play edit-war games with you, but I disagree that the module is becoming "a monolith of unrelated tools". I use all of those tools when working with Wikidata integration into infoboxes, and I'm not impressed by your suggestion that they are "unrelated". They are related for me by the fact that I use them when working with this module. "Each module should only have the things that it directly needs in the place it is directly used". Says who? You? Modules are fundamentally different from templates. Templates have a single point of entry, but modules are designed to have multiple independent functions; they resemble shared libraries more than a stand-alone program. If I can collect together a group of functions that are useful to me (and presumably useful to others), then why should you have a veto because of some philosophical objection, or misunderstanding of what modules are?
- There is no measurable performance hit by having many independent functions in a single module, and there is no problem with maintaining such code in a well documented code source. There is no downside in having these functions in the module, and I'd like to have them available.
- Your other argument, "The English Wikipedia isn't multi-lingual, so 'the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki' is irrelevant", is simply blinkered. Having multilingual capability is vital for multilingual wikis and useful for reducing the amount of internationalisation needed when transferring a module to another Wikipedia. The work we do is not just for English Wikipedia, but for all of the projects who want to use it. There is therefore considerable advantage in maximising the amount of common code in terms of updates and maintenance, so I really don't want to have code forks beyond the internationalisation at the start of the module (or from a local sub-module). I'm not seeing any advantage for anyone in removing those functions and I'm going to ask you to revert yourself now. --RexxS (talk) 19:20, 15 June 2018 (UTC)
- Modules are not fundamentally different from templates in the way you claim. Like I said above, one could begin a template with a #switch statement, and have it do unrelated things depending on the first parameter, but no one would ever code such a template. You are suggesting the exact same thing. You want this module to be equivalent to {{#switch:{{{1|}}}|getValue={{#property:{{{2|}}}|from={{{3|}}}}}|formatNum={{formatnum:{{{2|}}}}}...}}, which is not how one codes templates in MediaWiki. The utility functions are unrelated in the sense that if one were to move them to a separate module, no line of code other than boilerplate like local p = {} ... return p would need to be included in both modules, and in the sense that, unlike the rest of the module, they would work in the world in which mw:Extension:Wikibase Client wasn't installed. I'm not misunderstanding what modules are -- we simply have different ideas about what they are.
- I don't dispute that having common code is a good goal, but that does not mean that every single line of code except for some predefined section needs to be exactly the same. Multi-lingual wikis can fork the code to stick appropriate {{int:lang}} in places, Commons can fork it to use labels instead of sitelinks etc. Trying to have every single line of code centralized is a misguided goal that leads only to monsters like Module:Cycling race littered with code like if wiki == "mk" or wiki == "ja" or wiki == "ru" then ..., and thus I will not be self-reverting. {{3x|p}}ery (talk) 19:51, 15 June 2018 (UTC)
- Modules are fundamentally different from templates in the way I explained. Your example would be analogous to writing a module containing one function that used a parameter to switch between different unrelated things. No one would write a module like that, either. However, the ability to collect together a group of functions into a single module as desired is a fundamental feature of the way modules are implemented on MediaWiki. I have a practical reason to make a particular collection as I explained above. You have offered no reason why your removal of functions improves the module in terms of performance, convenience, portability or any other objective consideration.
- Every single line of code except for some predefined section does need to be the same, as that allows simple updating by a single copy-and-paste. For code that is still growing and developing to meet editors' needs, and hence requires frequent updates, that is a compelling reason. There is no problem whatsoever with a few key sections having code that switches between two options based on the requirements of different wikis. This is far more than just a question of {{int:lang}}, as wikis like Commons have completely different page titles from enwiki, which makes the substitution of sitelinks for labels undesirable there. There is simply no advantage in forking code for different wikis when one piece of code can do the job for all of them. Your fear of "bloat" is completely unfounded. Are you or anyone else having any problems following the program flow in getValue? No? Then explain what problems having wiki-sensitive switches create.
- I think it's time to ask for a third opinion, as I don't believe you are able to grasp the points I'm making, and I don't intend to have you make my work in maintaining this module any harder merely to satisfy your desire to meddle, rather than improve. --RexxS (talk) 20:47, 15 June 2018 (UTC)
- There's definite value in keeping the same version between the different wikis rather than having different versions around that are difficult to sync with each other, and that's worth more than having a bit of redundant code here. It might be worth considering having a separate utility module at some point that this one calls functions from (in the same way that {{convert}} uses a few different sub-modules), but only when there's a clear benefit to doing so. Thanks. Mike Peel (talk) 21:36, 15 June 2018 (UTC)
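The structural point at issue in this thread, that a Scribunto module is a table of independently callable functions rather than a single entry point, can be illustrated with a minimal sketch. The function bodies and the name `isBlank` are hypothetical and are not taken from WikidataIB. The `frame` argument is assumed to be a table carrying an `args` list, as it is when a function is reached through #invoke, and `getValue` is stubbed so that the sketch runs outside MediaWiki.

```lua
-- Hypothetical sketch of the module layout under discussion; the function
-- bodies are illustrative and not taken from WikidataIB.
local p = {}

-- A Wikidata-facing entry point, stubbed so the sketch runs anywhere.
function p.getValue(frame)
	return "value for " .. (frame.args[1] or "")
end

-- An unrelated utility entry point living in the same table: reports
-- whether the first argument is missing or only whitespace.
function p.isBlank(frame)
	local s = frame.args[1] or ""
	return s:match("^%s*$") ~= nil
end

-- Each function is reached independently, e.g. {{#invoke:Name|isBlank|x}}.
-- A real module file would end with: return p
```

Moving `isBlank` to a separate utility module, as suggested above, would change only which module name appears in the #invoke call; neither function depends on the other.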