
Module talk:Tabular data

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Jts1882 (talk | contribs) at 16:14, 30 June 2021 (Search more than 1 column: modify tests). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Great

@Mxn: this is great! Now we just need easier ways to edit tables, like Vera's work. Let me try this with this data. – SJ + 12:41, 10 May 2020 (UTC)

@Sj: I didn't realize it during the hackathon, but phab:T251759 is already well underway. Can't wait to see it live! – Minh Nguyễn 💬 19:38, 10 May 2020 (UTC)
hot dog! thanks for the link :) and the illustrative attempt here too, good practice to parse. – SJ + 00:26, 11 May 2020 (UTC)

Multiple fields

@Mxn: what would be nice (and significantly help performance in some cases) is the ability to get multiple fields. Like how Module:Covid19Data is called on User:EProdromou (WMF)/COVID-19 case data as {{#invoke:Covid19Data|regionTable|CA|QC|<tr><td> %s</td><td> %s</td><td> %s</td><td> %s</td></tr>}}.

For example:

{{#invoke:Tabular data|lookup
|search_column=model
|search_value=XYZ
|output_column=brand,year
|format=<li> The XYZ was made by [[%s]] and released in [[%s]].
|Example.tab}}

Not sure how difficult this would be. - Alexis Jazz 06:11, 19 May 2020 (UTC)

@Alexis Jazz: Thanks for the idea! It would certainly be feasible, but if efficiency or tidiness is the primary consideration, then I think it would be even better to refine {{#invoke:Tabular data|wikitable}} or {{Json2table}} to allow for the desired format on each row or create a separate function that outputs the whole table in list form. Or were you thinking of a use case where each row would come from a different Commons data table? – Minh Nguyễn 💬 10:19, 20 May 2020 (UTC)
@Mxn: I was actually thinking of a case where two (or more) fields from the same row are needed. A template like this one has to do two lookups for the same row, only to return a different field each time. This increases the page preview/rendering time and the load on the Wikimedia servers. - Alexis Jazz 03:11, 21 May 2020 (UTC)
@Alexis Jazz: Done, although I'd expect that large or complex tables or lists would be better served by a custom Lua function that interacts with the tabular data directly, since that also affords more control over formatting and allows lookups to be reused. – Minh Nguyễn 💬 05:14, 21 May 2020 (UTC)
@Mxn: Thanks, I also updated {{Tabular query}}. I've noticed, though, that [1] seemingly took almost a second to preview after I updated the module, where it took about half a second before. That was before I updated the template to use the new functionality. Now that I've updated it, the preview is back to half a second, whereas I was hoping to get the preview/rendering time down to about 0.3 seconds. - Alexis Jazz 06:31, 21 May 2020 (UTC)
I see the performance has improved; it's down to about 0.4 seconds now. - Alexis Jazz 08:47, 1 June 2020 (UTC)
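
The saving discussed in this thread comes from scanning the table once and reading several fields from the matching row, instead of repeating the lookup per field. Below is a minimal Python sketch of that idea, not the module's actual Lua code. The `TABLE` layout, the `lookup` function name, and all row values are made up for illustration; only the column names and the format string come from the example above.

```python
# Hypothetical in-memory stand-in for a Commons .tab page.
# Column names follow the example in this thread; the row values are made up.
TABLE = {
    "fields": ["model", "brand", "year"],
    "rows": [
        ["ABC", "Acme", 1999],
        ["XYZ", "Globex", 2005],
    ],
}


def lookup(table, search_column, search_value, output_columns, fmt):
    """Scan the table once and format several fields from the first matching row."""
    fields = table["fields"]
    search_idx = fields.index(search_column)
    output_idx = [fields.index(name) for name in output_columns]
    for row in table["rows"]:
        # Template parameters arrive as text, so compare on the text form.
        if str(row[search_idx]) == search_value:
            return fmt % tuple(row[i] for i in output_idx)
    return ""  # no matching row


result = lookup(
    TABLE,
    search_column="model",
    search_value="XYZ",
    output_columns=["brand", "year"],
    fmt="The XYZ was made by [[%s]] and released in [[%s]].",
)
print(result)
```

One pass over the rows serves both output fields, which is the difference from calling a single-field lookup twice.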

Performance

Recently @Johnuniq: developed {{NUMBEROF}}, which uses c:Data:Wikipedia statistics/data.tab generated by GreenC bot. One of the issues we ran into was performance: each time the template is invoked, the Commons file is retrieved via mw.ext.data.get(), which is slow. List of Wikipedias had over 4,000 invocations, which exceeded Lua's 10-second time limit and rendered red errors. @Pppery: suggested loading the Commons file once per page; mw.ext.data.get() does not support this, but mw.loadData() does. So mw.ext.data.get() is called in {{NUMBEROF/data}}, which is then loaded by mw.loadData() in {{NUMBEROF}}. This ensures the file from Commons is loaded once, regardless of how often the module is invoked on a page. Is this an issue with this module? Should we recommend that readers use {{NUMBEROF}} instead of this module, since it is being used as an example? -- GreenC 02:26, 24 May 2020 (UTC)

Module:NUMBEROF/data is easily able to provide a cache of the Commons data because the module was written specifically for that data format, and with knowledge of what was wanted by the main module. To do that more generally would be tricky. Using hundreds of calls to Module:Tabular data would consume a lot of resources. If that is ever required, I would think a workable solution would require a custom module like Module:NUMBEROF/data. Re the question: yes, {{NUMBEROF}} should be used although I suppose the example in the docs here was intended to show this module's flexibility. I think the docs should include a "but see {{NUMBEROF}}" note. Johnuniq (talk) 03:43, 24 May 2020 (UTC)
@GreenC: This module is intended to serve a variety of use cases generically, so it's different than {{NUMBEROF}}, but I added a "See also" link to that template, just in case. This module provides {{#invoke:Tabular data|wikitable}} for situations where the entire table is needed on a given page, as opposed to a lookup of a few values. That function could be made more flexible, along the lines of {{Json2table}}, but I think ultimately any use case that requires looking up a lot of values from the same Commons table and including the results on the same page warrants a dedicated Lua module to build that entire portion of the page. Then caching wouldn't be so relevant, because the Commons table would only get loaded once anyways. – Minh Nguyễn 💬 22:44, 25 May 2020 (UTC)
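
The load-once pattern described in this thread can be modelled outside Scribunto. The Python sketch below is only an analogy: `fetch_from_commons` is a hypothetical stand-in for the slow mw.ext.data.get() call, `load_data` plays the role that mw.loadData() plays for Module:NUMBEROF/data, and the table contents are made up.

```python
from functools import lru_cache

FETCH_COUNT = 0  # how many times the slow fetch actually ran


def fetch_from_commons(page_name):
    """Hypothetical stand-in for the slow mw.ext.data.get() call."""
    global FETCH_COUNT
    FETCH_COUNT += 1
    # Made-up contents; the real page is c:Data:Wikipedia statistics/data.tab.
    return {"fields": ["site", "articles"], "rows": [["en", 100], ["de", 50]]}


@lru_cache(maxsize=None)
def load_data(page_name):
    """Build the lookup table on first use and reuse it afterwards,
    in the spirit of wrapping the fetch in a data module."""
    table = fetch_from_commons(page_name)
    return {row[0]: row[1] for row in table["rows"]}


def number_of(site):
    """One 'invocation': a cheap dictionary lookup on the cached table."""
    return load_data("Wikipedia statistics/data.tab")[site]


# 4,000 invocations, as on the List of Wikipedias page, trigger a single fetch.
total = sum(number_of("en") for _ in range(4000))
print(FETCH_COUNT, total)
```

The design point is that the expensive step is keyed on the page name, so the cost no longer scales with the number of invocations.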

Getting row or column data

Just an idea; I'm not sure about the technical feasibility. Similar to getting a cell value, is it possible to get the values of a whole column or row? The output would be CSV (or some other delimited values) instead of a single value. One use I have in mind is supplying a data series to {{Graph:Chart}}. - Timbaaa -> ping me 13:31, 14 July 2020 (UTC)

@Timbaaa: That's definitely feasible, though it might be easier to integrate something with {{Graph:Lines}}, which is already pretty usable with tabular data, as seen in COVID-19 pandemic in the San Francisco Bay Area#Cases by county over time. It would look pretty similar to the existing _wikitable() function, but just the part that collects the titles of the elements in data.schema.fields. If you're planning to use this functionality inside a module instead of directly inside a template or article, I'd suggest working with mw.ext.data.get(…).schema.fields directly so you have maximum control over formatting. – Minh Nguyễn 💬 19:46, 19 September 2020 (UTC)
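
As a rough illustration of the request in this thread, the Python sketch below pulls one column or one row out of a table and joins it into a delimited string. It assumes a table shaped like the schema.fields structure mentioned above, with a parallel list of data rows. The function names `column_values` and `row_values` and all sample values are hypothetical.

```python
# Made-up table shaped like tabular data: a schema plus a list of rows.
TABLE = {
    "schema": {"fields": [{"name": "date"}, {"name": "cases"}]},
    "data": [
        ["2020-03-14", 1],
        ["2020-03-15", 3],
        ["2020-03-16", 7],
    ],
}


def column_values(table, column, sep=","):
    """Return every value of one column as a delimited string."""
    names = [field["name"] for field in table["schema"]["fields"]]
    idx = names.index(column)  # raises ValueError if the column is missing
    return sep.join(str(row[idx]) for row in table["data"])


def row_values(table, row_number, sep=","):
    """Return every value of one row (0-based) as a delimited string."""
    return sep.join(str(value) for value in table["data"][row_number])


print(column_values(TABLE, "cases"))
print(row_values(TABLE, 1))
```

A string like this could then be passed to a chart template as a data series, subject to whatever delimiter that template expects.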

Search as Number

Great job!

For some reason, it doesn't work for me. For example, a request like this:

{{#invoke: Tabular data | lookup | COVID-19 Slovenia cases per capita.tab | search_value = 261 | search_column = cases | output_column = name}}

returns an empty string instead of "Ajdovščina".

Help me please. — Preceding unsigned comment added by Игорь Темиров (talk • contribs) 19:14, 8 November 2020 (UTC)

There's no 261 in the cases column of c:Data:COVID-19 Slovenia cases per capita.tab. The cases value for Ajdovščina is 2204. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚  01:05, 24 June 2021 (UTC)
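
To make the outcome above concrete, here is a small Python sketch of a lookup over a numeric column. It does not reproduce the module's comparison logic; it assumes, for illustration only, that the search value arrives as text and is compared with the text form of each cell. The single row uses the value quoted in the reply above, and the `lookup` function is hypothetical.

```python
# One row, using the value quoted in the reply; the real table has more rows.
ROWS = [{"name": "Ajdovščina", "cases": 2204}]


def lookup(rows, search_column, search_value, output_column):
    """Return the output field of the first row whose search field matches."""
    for row in rows:
        # Assumption: compare text with text, so "2204" matches the number 2204.
        if str(row[search_column]) == search_value:
            return str(row[output_column])
    return ""  # nothing matched, so the result is empty


print(repr(lookup(ROWS, "cases", "261", "name")))
print(repr(lookup(ROWS, "cases", "2204", "name")))

# A direct comparison of text against a number never matches in Python:
print("2204" == 2204)
```

An empty result therefore has two possible causes worth checking: the value is absent from the column, or text is being compared against a number without conversion.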

Search more than 1 column

Would it be possible to make it search two (or more) columns?
I.e.:
{{#invoke:Tabular data|lookup|Page name.tab|search_value=|search_column=|search_value2=|search_column2=|...|...|output_format=}}
E.g.:
{{#invoke:Tabular data|lookup|UN:Total population, both sexes combined.tab|search_value=Afghanistan|search_column=Country|search_value2=1950|search_column2=Year|output_column=Value}}

𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚  01:00, 24 June 2021 (UTC)

I've created Module:Tabular_data/sandbox with a function to try and handle the second search requirement. It doesn't work. However, I can't get the existing module to return data from c:Data:UN:Total population, both sexes combined.tab.
{{#invoke:Tabular data|lookup
|search_column=date
|search_value=2020-03-16
|output_column=totalConfirmedCases
|COVID-19 cases in Santa Clara County, California.tab}}

Lua error in Module:Tabular_data at line 48: Output column “totalConfirmedCases” not found..

{{#invoke:Tabular data|lookup
|UN:Total population, both sexes combined.tab 
|search_value=Afghanistan|search_column=Country 
|output_column=Value}}

40754.388

{{#invoke:Tabular data/sandbox|lookup
|UN:Total population, both sexes combined.tab 
|search_value=Afghanistan|search_column=Country 
|output_column=Value}}

Lua error: bad argument #1 to "get" (not a valid title).

{{#invoke:Tabular data/sandbox|lookup2
|UN:Total population, both sexes combined.tab 
|search_value=Afghanistan|search_column=Country 
|search_value2=1950|search_column2=Year 
|output_column=Value}}

7752.118

What am I missing? Could it be the page name with a colon that is invalid? —  Jts1882 | talk  13:41, 30 June 2021 (UTC)
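
For reference, the two-column search requested at the top of this thread amounts to requiring every (column, value) pair to match on the same row. The Python sketch below shows that logic only; it is not the sandbox's lookup2 code. The `lookup_all` name and the table layout are hypothetical, and the rows are made up except that the Afghanistan/1950 value repeats the figure shown above.

```python
# Illustrative rows; only the Afghanistan/1950 value is taken from this thread.
TABLE = {
    "fields": ["Country", "Year", "Value"],
    "rows": [
        ["Albania", 1950, 1263.0],       # made up
        ["Afghanistan", 1951, 7840.0],   # made up
        ["Afghanistan", 1950, 7752.118],
    ],
}


def lookup_all(table, criteria, output_column):
    """Return the output field of the first row matching every criterion.

    criteria maps column names to search values given as text,
    e.g. {"Country": "Afghanistan", "Year": "1950"}.
    """
    fields = table["fields"]
    out_idx = fields.index(output_column)
    wanted = [(fields.index(col), val) for col, val in criteria.items()]
    for row in table["rows"]:
        if all(str(row[idx]) == val for idx, val in wanted):
            return row[out_idx]
    return None  # no row satisfies all criteria


value = lookup_all(TABLE, {"Country": "Afghanistan", "Year": "1950"}, "Value")
print(value)
```

Taking the criteria as a mapping rather than fixed search_value2/search_column2 parameters lets the same function handle one, two, or more columns.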