Module talk:WikidataIB/Archive 4

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Lowercase sigmabot III (talk | contribs) at 05:27, 7 March 2019 (Archiving 1 discussion(s) from Module talk:WikidataIB) (bot). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

getAliases

I've added a new function to the sandbox to get the aliases for an entity. This is an expensive call if arbitrary access is used.

The function getAliases has the qid of a Wikidata entity passed as |qid= (it defaults to the associated qid of the current article if omitted), and a local parameter passed as the first unnamed parameter. It implements blacklisting and whitelisting with a field name of "alias" by default. Any local parameter passed becomes the return value, subject to the blacklisting of "alias".

It returns the aliases for the Wikidata entity with the usual list options, and nothing is returned if the aliases do not exist.

Examples and test cases are at Module talk:WikidataIB/sandbox/testing #getAliases. It may be useful for infoboxes that have an "alternative names" field. --RexxS (talk) 19:51, 14 November 2018 (UTC)
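The decision order described above for getAliases can be sketched in plain Lua. This is only an illustration under assumptions, not the module's code: the function name resolveAliases, the set-like shape of the whitelist and blacklist tables, and the "ALL" wildcard are hypothetical, and the alias list is passed in directly rather than read from Wikidata.

```lua
-- Hypothetical sketch of the getAliases decision order; not the WikidataIB source.
-- opts.fetchwikidata:  whitelist set, e.g. { ALL = true } or { alias = true } (assumed shape)
-- opts.suppressfields: blacklist set, e.g. { alias = true } (assumed shape)
local function resolveAliases(aliases, localValue, opts)
	local field = opts.name or "alias"   -- the field name defaults to "alias"
	local whitelist = opts.fetchwikidata or {}
	local blacklist = opts.suppressfields or {}

	-- A blacklisted field returns nothing, even if a local value was supplied.
	if blacklist[field] then return nil end
	-- Otherwise any local parameter becomes the return value.
	if localValue and localValue ~= "" then return localValue end
	-- Wikidata is only consulted when the field is whitelisted.
	if not (whitelist.ALL or whitelist[field]) then return nil end
	-- Nothing is returned when the entity has no aliases.
	if not aliases or #aliases == 0 then return nil end
	return table.concat(aliases, opts.sep or ", ")
end
```

In this sketch a local value wins over Wikidata, and the blacklist wins over both, which matches the description "subject to the blacklisting of alias".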

@RexxS: It definitely sounds useful! I have two requests: would it be possible to also include results from short name (P1813); and would it be possible to filter out very similar names, such as ones that differ only by an accent (e.g., see Telescopio Carlos Sánchez (Q7696819))? Thanks. Mike Peel (talk) 20:30, 14 November 2018 (UTC)
@Mike Peel: At present, I've only written general calls that would be usable in any circumstance, like get a property value, get a label, or get a description, etc. I think what you're moving toward is more specialised calls that are very specific to particular purposes. I don't have a problem with that, but it will take a little more time to create something usable. What I think we will produce is larger "building blocks" for templates that will do more than just return a simple set of values. Module:WikidataIB has the routines now to create these bigger blocks while retaining the functionality that we're used to in terms of black/white-listing, sourcing, output formatting and internationalisation, so most of it is already there.
I'll have a look at the issue of filtering out "similar" names, because that's new to me, and I'm not sure what is the best algorithm to use. The question it raises is which of two similar values do we filter out? The second one we come across? Or the one with diacritics? Or the one without diacritics? And how should that be handled for other languages? Take an example: if we have Buhlmann, Bühlmann, and Buehlmann, which one(s) do we keep? Would we make the same decision if language is set to German? --RexxS (talk) 20:58, 14 November 2018 (UTC)
Sorry, I was trying to stay general! However, there does seem to be some overlap between the aliases and the short name... With similar names, I'd go by the order that they appear on Wikidata if possible, since they seem to have a special order there (which is a lot easier to change than the property values), and they are language-specific. Thanks. Mike Peel (talk) 21:02, 14 November 2018 (UTC)
@Mike Peel: I've made Module:Diacritics containing some utility functions that can be used via #invoke or imported into other modules. It can strip diacritics from phrases or compare two phrases returning true if they are the same or differ only in diacritics (what we might call "similar"). I can use that in WikidataIB to write a new function that collects aliases and short names, then removes identical or similar names before outputting them. The question remaining is, are there any other "similar" words besides those which differ only in diacritics? Do you know of any examples? --RexxS (talk) 13:54, 17 November 2018 (UTC)
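The "strip, compare, then drop similar names" idea discussed in this thread might look roughly like the following plain-Lua sketch. It is not the code of Module:Diacritics: the folding table is a tiny illustrative sample, the function names are hypothetical, and the source file is assumed to be UTF-8. Following the suggestion above, it keeps the first name in the order given.

```lua
-- Illustrative only: a tiny folding table, not the one used by Module:Diacritics.
local fold = {
	["á"] = "a", ["é"] = "e", ["í"] = "i", ["ó"] = "o", ["ú"] = "u",
	["ä"] = "a", ["ö"] = "o", ["ü"] = "u", ["ñ"] = "n",
}

-- Replace each UTF-8 character that has an entry in `fold`; leave the rest alone.
-- The pattern matches one lead byte followed by any continuation bytes.
local function stripDiacritics(s)
	return (s:gsub("[\1-\127\194-\244][\128-\191]*", fold))
end

-- Two phrases are "similar" if they are equal once diacritics are stripped.
local function similar(a, b)
	return stripDiacritics(a) == stripDiacritics(b)
end

-- Keep the first of any group of similar names, preserving the input order.
local function removeSimilar(names)
	local seen, kept = {}, {}
	for _, name in ipairs(names) do
		local key = stripDiacritics(name)
		if not seen[key] then
			seen[key] = true
			kept[#kept + 1] = name
		end
	end
	return kept
end
```

With this approach "Bühlmann" and "Buhlmann" count as similar, but a "ue" transliteration such as "Buehlmann" does not, so it would be kept as a separate name.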

References

Would it be possible to generate citations (i.e. <ref name="-wd-Q56813023">[transcluded data]</ref>) using this module? I think this would be useful in infoboxes since Wikidata references are usually different to the sources used directly in articles. The French Wikipedia has something similar but it seems to be somewhat less developed and creates duplicates where the same source is used multiple times. {{Cite Q}} exists but it's not entirely Lua-based. Jc86035 (talk) 15:58, 17 November 2018 (UTC)

I've stayed clear of trying to generate references from Wikidata because there is simply no way of knowing what format the references in an article should be. An editor can make up their own style of reference formatting, and then CITEVAR forbids any other style of formatting. I'm simply not prepared to give more ammunition to the Wikidata-haters and infobox-haters to complain about the code that I write. You can get references using {{wd}}, of course, but I have no plans to do that in WikidataIB. --RexxS (talk) 16:55, 17 November 2018 (UTC)

I have also noted that French WP automatically creates archive links. It would help reduce the problem of linkrot here, and simplify the referencing work of en.wp editors. -- Ohc ¡digame! 11:22, 18 November 2018 (UTC)

Localize multipliers like "million"

How can multipliers like "million" be localized? Do I have to translate something for that, or create a file in the directory I18n, if I want to use it in a language other than English?

For example

{{#invoke:WikidataIB |getValue |qid=Q684773 |P1436 |rank=b |fwd=ALL |osd=n |scale=6 |uabbr=y}}

results (independently of the language) in "9.444 million Edit this on Wikidata"; see als:ETH-Bibliothek. --Zuphilip (talk) 12:17, 31 December 2018 (UTC)

@Zuphilip: The multipliers are currently defined (in lines 74–79) like this:
local i18n =
{
...
	["multipliers"] = {
		[0]  = "",
		[3]  = " thousand",
		[6]  = " million",
		[9]  = " billion",
		[12] = " trillion",
	}
}
You can either:
  1. modify those lines directly in the local copy of the module; or
  2. create Module:WikidataIB/i18n locally containing the definitions you require. Those will then overwrite the English definitions in the main module.
Let me know if you need more help or if you want me to make the changes to als:Modul:WikidataIB for you. Cheers --RexxS (talk) 14:23, 31 December 2018 (UTC)
Okay, since I don't want to mess with the template and want to make future updates easier, I chose the second option. After some attempts I now have the correct syntax for the translation and created the file als:Modul:WikidataIB/i18n, which seems to work fine. Thank you RexxS for your help! --Zuphilip (talk) 15:20, 31 December 2018 (UTC)
Nice work, Zuphilip. I'll try to remember als:Modul:WikidataIB/i18n as an example to show to anybody else who wants to do some localisation on other Wikipedias. Cheers --RexxS (talk) 15:34, 31 December 2018 (UTC)
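As a rough illustration of the second option in this thread, local definitions can be laid over the English defaults with a small recursive merge. This is a sketch under assumptions rather than the module's actual loading code: the merge helper, the scaled function, and the German strings are all hypothetical, and the real Module:WikidataIB/i18n file may be structured differently.

```lua
-- English defaults, as quoted from the main module.
local defaults = {
	multipliers = {
		[0]  = "",
		[3]  = " thousand",
		[6]  = " million",
		[9]  = " billion",
		[12] = " trillion",
	},
}

-- Hypothetical local overrides (German strings chosen only as an example).
local overrides = {
	multipliers = {
		[3]  = " Tausend",
		[6]  = " Millionen",
		[9]  = " Milliarden",
		[12] = " Billionen",
	},
}

-- Copy every entry of `extra` into `base`, descending into nested tables
-- so that keys missing from the override keep their default value.
local function merge(base, extra)
	for k, v in pairs(extra) do
		if type(v) == "table" and type(base[k]) == "table" then
			merge(base[k], v)
		else
			base[k] = v
		end
	end
	return base
end

local i18n = merge(defaults, overrides)

-- Format a number at a given power-of-ten scale using the merged strings.
local function scaled(value, scale)
	return string.format("%g", value / 10 ^ scale) .. i18n.multipliers[scale]
end
```

Because the merge only touches keys present in the override table, the local file can stay short and the main module can be updated without losing the translations.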

Lua

@RexxS: In the short-to-medium term, would it be possible/desirable to enable calling this module from other Lua modules? I've tried making a Lua/Wikidata infobox (code, test) which calls data from multiple items, and presumably this would be quite difficult in wikitext. I don't really know how this module works so I haven't tried to touch it. I'm not anticipating actually making use of this just yet, but maybe in a few months to a few years.

For context on why I would want to call multiple items: Other music databases tend to indicate music releases, compositions and tracks as separate entities, but Wikidata has lumped them all together through infobox imports. There are a few thousand items which aren't lumped together like this (so this infobox would only work with those items). There is no real consensus for using either method in Wikidata. (Related: draft RfC (by me); most recent discussion.) Presumably this would also be useful for things like going up an administrative subdivision tree. Jc86035 (talk) 09:39, 30 December 2018 (UTC)

@Jc86035: All of the work in fetching the objects representing statements and translating each data type into wikitext is done inside separate functions that could easily be adapted to be available to another module via the require() function. So it should not be much of a problem to write a bespoke module that picks up several properties and/or qualifiers and assembles them internally into an infobox (or any other wikitext).
However, there is little to gain from trying to fetch multiple statements in one call, because the getEntity function is resource-heavy compared to getBestStatements/getAllStatements. It also tells the system that it is loading every statement, alias, description and sitelink into the page where it's called, which then makes it very difficult to filter most of the irrelevant changes from any en-wp watchlists that enable Wikidata tracking.
The present problem with writing all or most of the code in Lua is that we would restrict maintainers to the relatively small number who are comfortable writing in Lua, compared with the seemingly much larger number of editors who are happy to write template code using the MediaWiki functions. That prompted me to develop WikidataIB as a way that they could adapt existing infobox code on a field-by-field basis to draw data from Wikidata under carefully prescribed circumstances. You can see an example at the unfinished Template:Infobox person/Wikidata. It would be much cleaner to write the whole infobox in Lua, of course, as the French and others do, but then it would need the same one or two editors to do maintenance and implement agreed changes, a situation that I wouldn't be happy with. I much prefer to maximise the number of editors able to service or create the code, even if they don't know Lua. Most of the time, it's easy to create what folks want using wikitext like this:
{{infobox
| above      = {{#invoke:WikidataIB |getLabel |qid={{{qid|}}} }}

| label1     = Language
| data1      = {{#invoke:WikidataIB |getValue |rank=best |P407 |name=language |qid={{{qid|}}} |fetchwikidata={{{fetchwikidata|}}} |onlysourced={{{onlysourced|}}} |{{{lang|}}} }}

| label2     = Artist
| data2      = {{#invoke:WikidataIB |getValue |rank=best |P175 |name=artist |qid={{{qid|}}} |fetchwikidata={{{fetchwikidata|}}} |onlysourced={{{onlysourced|}}} |{{{artist|}}} }}

| label3     = ISWC
| data3      = {{#invoke:WikidataIB |getValue |rank=best |P1827 |name=iswc |qid={{{qid|}}} |fetchwikidata={{{fetchwikidata|}}} |onlysourced={{{onlysourced|}}} |{{{iswc|}}} }}
}}
But you're also looking to fetch a value indirectly by looking for the performer's version of the song as the value of the qualifier statement is subject of (P805), then finding the ISRC (P1243) of that entity. If you use getEntity, that's automatically an expensive function, so again you're much better off using getBestStatements/getAllStatements. If you look in Module:WikidataIB, you'll see there is a function called getPropOfProp, which gets a property of a property. It wouldn't be too difficult to write an analogous call that gets a property of a qualifier.
Nevertheless, you might want to experiment with the code that you're writing, and use some code from WikidataIB. WikidataIB contains all the code needed to create linked items from sitelinks, to resolve datatypes and make a table of multiple values, as well as formatting that table into wikitext in different ways, so feel free to take whatever you need. I'm sorry the code has grown so difficult to follow, but I had to implement whitelists and blacklists in order to give infobox designers the opportunity to make their infoboxes "opt-in" at the article level, and I had to create filters that only return sourced information by default. The parameters |fetchwikidata=, |suppressfields= and |onlysourced= control that behaviour.
Let me know if you want any specific functions added to WikidataIB, and I'll do my best to help out. Cheers --RexxS (talk) 15:49, 30 December 2018 (UTC)
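A "property of a qualifier" lookup of the kind suggested above could be sketched as follows, using mw.wikibase.getBestStatements rather than getEntity. This is not code from WikidataIB: the function name getPropOfQualifier is hypothetical, and so is the choice of P175 as the property carrying the P805 qualifier. A small stub with made-up item IDs and values stands in for mw.wikibase so the sketch also runs outside Scribunto.

```lua
-- Made-up stub data: Q1 has a P175 statement qualified by P805 = Q2,
-- and Q2 has a P1243 value. None of this reflects real Wikidata items.
local stub = {
	Q1 = { P175 = { {
		mainsnak = { snaktype = "value", datavalue = { value = { id = "Q3" } } },
		qualifiers = { P805 = { { snaktype = "value", datavalue = { value = { id = "Q2" } } } } },
	} } },
	Q2 = { P1243 = { {
		mainsnak = { snaktype = "value", datavalue = { value = "ISRC-PLACEHOLDER" } },
	} } },
}

-- Use the real library inside Scribunto, otherwise fall back to the stub.
local wikibase = (mw and mw.wikibase) or {
	getBestStatements = function(qid, pid)
		return (stub[qid] or {})[pid] or {}
	end,
}

-- For each best statement of `pid` on `qid`, follow the item given by the
-- qualifier `qualPid` and collect that item's values for `targetPid`.
local function getPropOfQualifier(qid, pid, qualPid, targetPid)
	local results = {}
	for _, statement in ipairs(wikibase.getBestStatements(qid, pid)) do
		for _, qual in ipairs((statement.qualifiers or {})[qualPid] or {}) do
			local v = qual.snaktype == "value" and qual.datavalue.value
			if type(v) == "table" and v.id then
				for _, target in ipairs(wikibase.getBestStatements(v.id, targetPid)) do
					if target.mainsnak.snaktype == "value" then
						results[#results + 1] = target.mainsnak.datavalue.value
					end
				end
			end
		end
	end
	return results
end
```

Each call fetches only the statements for one property, so the page does not register a dependency on the whole entity in the way a getEntity call would.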
@RexxS: The main issue that I was getting stuck on (which I thought would necessitate Lua) was that for every statement from the single release of a song, the same function has to be done:
Is this something I should worry about if I decide to make {{Infobox song/Wikidata}} at some point in the medium term? I've realized that it would be possible to find the appropriate items once and then use a subtemplate for the actual infobox (with the relevant QIDs passed to the subtemplate), but presumably this would also have some performance drawbacks. Jc86035 (talk) 12:00, 2 January 2019 (UTC)
@Jc86035: Lua is an astonishingly fast scripting language, as well as being lightweight (or perhaps as a result of that). So, generally speaking, you don't have to worry about performance. If you read half-a-dozen property values separately using getBestStatements/getAllStatements, you won't see any difference over using getEntity once and picking out the half-a-dozen values from that. Using getEntity is really to be avoided because it loads the entire Wikidata entity, which will include aliases, descriptions, sitelinks and labels, as well as all of the properties. That's why calling a different entity from the one associated with the current page is marked as an expensive call. As I indicated previously, it also means that the page here then effectively transcludes every single Wikidata value and English watchlists will show any changes to that item on Wikidata (when enabled), even if we're only interested in changes that would affect the values we're actually using.
I suggest that when you make your test infobox, you should go with whatever scheme you find easiest to get working using getBestStatements/getAllStatements (or the functions from WikidataIB), and don't worry about performance. You can always check with the 'parser profiling data' you see every time you preview a page, and then worry about optimising performance afterwards if needed. One complicated template will use more resources and server time than dozens of "lean" Lua calls, so your code is almost certainly not going to be the performance bottleneck on an average page.
When you have some code to test, give me a call and I'll help you with anything you need. Cheers --RexxS (talk) 14:22, 2 January 2019 (UTC)

Lua error when qual-parameter and osd=yes are used for unsourced statements

I integrated the module in Template:Infobox_library and now a Lua error occurs when the qual parameter is used in combination with osd=yes for unsourced statements, e.g.:

{{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |qual=P585 |fwd=ALL |osd=yes |{{{collection_size|}}} }}
← [edit: was giving an error]

Can someone fix this? Ping User:RexxS (I would expect that, when only sourced statements are allowed and the current statement is not sourced, around line 1087 we would "continue" the loop with the next possible statement and not just leave out one assignment, but I haven't looked closer...) --Zuphilip (talk) 07:26, 2 January 2019 (UTC)

Hi Zuphilip, thanks for catching that. You're right: when the requested statement is unsourced but has a qualifier that is requested, the code was attempting to concatenate the nil value for the statement with the qualifier value. I've caught the unsourced condition and skipped the part that gets the qualifier. Here are some tests to check that it's working for National Library of Catalonia (Q1200925) collection or exhibition size (P1436) point in time (P585):
  • {{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |fwd=ALL |osd=no |{{{collection_size|}}} }} → 4,401,625 unit, 72 terabyte, 63,800 linear metre Edit this on Wikidata
  • {{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |qual=P585 |fwd=ALL |osd=no |{{{collection_size|}}} }} → 4,401,625 unit (2020), 72 terabyte (2019), 63,800 linear metre (2019) Edit this on Wikidata
  • {{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |qual=P585 |fwd=ALL |osd=no |qualsonly=yes |{{{collection_size|}}} }} → 2020, 2019, 2019 Edit this on Wikidata
  • {{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |qual=P585 |fwd=ALL |osd=yes |{{{collection_size|}}} }}
  • {{#invoke:WikidataIB |getValue |qid=Q1200925 |P1436 |rank=b |qual=P585 |fwd=ALL |osd=yes |qualsonly=yes |{{{collection_size|}}} }}
Let me know if you spot any further problems. Cheers --RexxS (talk) 10:02, 2 January 2019 (UTC)
Great, thank you very much RexxS for the fast fix! --Zuphilip (talk) 20:43, 2 January 2019 (UTC)
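The failure mode and the fix described in this thread can be reproduced with a simplified model. The sketch below is not the module's code: the render function and the flat statement records (value, qualifier, sourced) are hypothetical stand-ins for the real Wikidata statement structure. The point is only that an unsourced statement is dropped before any qualifier text is concatenated, so a nil value is never joined to a string.

```lua
-- Hypothetical simplified records; the real module works on Wikidata statements.
local function render(statements, onlysourced, wantQualifier)
	local out = {}
	for _, st in ipairs(statements) do
		local value = st.value
		-- With onlysourced set, an unsourced statement yields no value at all.
		if onlysourced and not st.sourced then
			value = nil
		end
		-- Only touch the qualifier when there is a value to attach it to;
		-- concatenating onto nil is what raised the Lua error.
		if value then
			if wantQualifier and st.qualifier then
				value = value .. " (" .. st.qualifier .. ")"
			end
			out[#out + 1] = value
		end
	end
	return table.concat(out, ", ")
end
```

Lua 5.1 has no continue statement, so guarding the rest of the loop body with `if value then` plays the role of moving on to the next statement.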