
Module talk:Tabular data

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Mxn (talk | contribs) at 19:46, 19 September 2020 (Getting row or column data: Replied). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Great

@Mxn: this is great! Now we just need easier ways to edit tables, like Vera's work. Let me try this with this data. – SJ + 12:41, 10 May 2020 (UTC)

@Sj: I didn't realize it during the hackathon, but phab:T251759 is already well underway. Can't wait to see it live! – Minh Nguyễn 💬 19:38, 10 May 2020 (UTC)
hot dog! thanks for the link :) and the illustrative attempt here too, good practice to parse. – SJ + 00:26, 11 May 2020 (UTC)

Multiple fields

@Mxn: what would be nice (and significantly help performance in some cases) is the ability to get multiple fields. Like how Module:Covid19Data is called on User:EProdromou (WMF)/COVID-19 case data as {{#invoke:Covid19Data|regionTable|CA|QC|<tr><td> %s</td><td> %s</td><td> %s</td><td> %s</td></tr>}}.

For example:

{{#invoke:Tabular data|lookup
|search_column=model
|search_value=XYZ
|output_column=brand,year
|format=<li> The XYZ was made by [[%s]] and released in [[%s]].
|Example.tab}}

Not sure how difficult this would be. - Alexis Jazz 06:11, 19 May 2020 (UTC)

@Alexis Jazz: Thanks for the idea! It would certainly be feasible, but if efficiency or tidiness is the primary consideration, then I think it would be even better to refine {{#invoke:Tabular data|wikitable}} or {{Json2table}} to allow for the desired format on each row or create a separate function that outputs the whole table in list form. Or were you thinking of a use case where each row would come from a different Commons data table? – Minh Nguyễn 💬 10:19, 20 May 2020 (UTC)
@Mxn: I was actually thinking of a case where two (or more) fields from the same row are needed. A template like this one has to do two lookups for the same row, only to return a different field each time. This increases the page preview/rendering time and load on the Wikimedia servers. - Alexis Jazz 03:11, 21 May 2020 (UTC)
@Alexis Jazz: Done, although I'd expect that large or complex tables or lists would be better served by a custom Lua function that interacts with the tabular data directly, since that also affords more control over formatting and allows lookups to be reused. – Minh Nguyễn 💬 05:14, 21 May 2020 (UTC)
@Mxn: Thanks, I also updated {{Tabular query}}. I've noticed though that [1] seemingly took almost a second to preview after I updated the module, where it had taken about half a second before. This was before I updated that template to use the new functionality. Now that I've updated it, it's back to taking about half a second, whereas I was hoping to get the preview/rendering time down to about 0.3 seconds. - Alexis Jazz 06:31, 21 May 2020 (UTC)
I see the performance has improved: it's down to about 0.4 seconds now. - Alexis Jazz 08:47, 1 June 2020 (UTC)
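
For readers following this thread, the single-pass, multi-field lookup discussed above can be illustrated outside of Scribunto. The following is only a minimal Python sketch, not the module's actual Lua code: it assumes the table has the shape of a Commons tabular data page (a `schema.fields` list and a `data` list of rows), and the `lookup` function, the sample rows, and the brand names are hypothetical.

```python
# Minimal sketch (not the module's Lua implementation): find one row and
# format several of its fields in a single pass over the table.

def lookup(table, search_column, search_value, output_columns, fmt):
    """Return fmt filled with the requested fields of the first matching row.

    output_columns is a comma-separated list such as "brand,year".
    Raises ValueError if a column name is unknown; returns None if no row matches.
    """
    names = [field["name"] for field in table["schema"]["fields"]]
    search_idx = names.index(search_column)
    output_idx = [names.index(col.strip()) for col in output_columns.split(",")]
    for row in table["data"]:
        if row[search_idx] == search_value:
            # One scan of the table yields every requested field.
            return fmt % tuple(row[i] for i in output_idx)
    return None

# Invented sample data in the shape of a Commons .tab page.
example_tab = {
    "schema": {"fields": [
        {"name": "model", "type": "string"},
        {"name": "brand", "type": "string"},
        {"name": "year", "type": "number"},
    ]},
    "data": [
        ["ABC", "Acme", 1999],
        ["XYZ", "Globex", 2004],
    ],
}

result = lookup(
    example_tab, "model", "XYZ", "brand,year",
    "The XYZ was made by [[%s]] and released in [[%s]].",
)
print(result)
```

The point of the design is that the row is located once and all requested fields are read from it, instead of repeating the search for each field.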

Performance

Recently @Johnuniq: developed {{NUMBEROF}}, which uses c:Data:Wikipedia statistics/data.tab generated by GreenC bot. One of the issues we ran into was performance: each time the template is invoked, the Commons file is retrieved via mw.ext.data.get(), which is slow. List of Wikipedias had over 4,000 invocations, which exceeded Lua's 10-second time limit and rendered red errors. @Pppery: suggested loading the Commons file once per page. mw.ext.data.get() does not support this, but mw.loadData() does, so mw.ext.data.get() is called in {{NUMBEROF/data}}, which is then loaded by mw.loadData() in {{NUMBEROF}}. This ensures the file from Commons is loaded once regardless of how often the module is invoked on a page. Is this an issue with this module? Should we recommend that readers use {{NUMBEROF}} instead of this template, since it is being used as an example? -- GreenC 02:26, 24 May 2020 (UTC)

Module:NUMBEROF/data is easily able to provide a cache of the Commons data because the module was written specifically for that data format, and with knowledge of what was wanted by the main module. To do that more generally would be tricky. Using hundreds of calls to Module:Tabular data would consume a lot of resources. If that is ever required, I would think a workable solution would require a custom module like Module:NUMBEROF/data. Re the question: yes, {{NUMBEROF}} should be used although I suppose the example in the docs here was intended to show this module's flexibility. I think the docs should include a "but see {{NUMBEROF}}" note. Johnuniq (talk) 03:43, 24 May 2020 (UTC)
@GreenC: This module is intended to serve a variety of use cases generically, so it's different from {{NUMBEROF}}, but I added a "See also" link to that template, just in case. This module provides {{#invoke:Tabular data|wikitable}} for situations where the entire table is needed on a given page, as opposed to a lookup of a few values. That function could be made more flexible, along the lines of {{Json2table}}, but I think ultimately any use case that requires looking up a lot of values from the same Commons table and including the results on the same page warrants a dedicated Lua module to build that entire portion of the page. Then caching wouldn't be so relevant, because the Commons table would only get loaded once anyway. – Minh Nguyễn 💬 22:44, 25 May 2020 (UTC)
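
The "load once per page" pattern described in this thread can be sketched in general terms. What follows is a rough Python analogy, not the Lua code of {{NUMBEROF}}: `fetch_commons_table` is a hypothetical stand-in for the slow fetch, `functools.lru_cache` plays the role that mw.loadData() is described as playing above, and the column names and figures are invented.

```python
from functools import lru_cache

fetch_count = 0  # counts how often the expensive fetch actually runs


def fetch_commons_table(page_name):
    """Hypothetical stand-in for the slow fetch of a Commons data page."""
    global fetch_count
    fetch_count += 1
    # Invented sample rows; the real page's columns and values may differ.
    return {
        "schema": {"fields": [
            {"name": "site", "type": "string"},
            {"name": "articles", "type": "number"},
            {"name": "admins", "type": "number"},
        ]},
        "data": [
            ["en", 6000000, 1100],
            ["de", 2400000, 190],
        ],
    }


@lru_cache(maxsize=None)
def load_table_once(page_name):
    """Fetch the table once and index it by its first column for fast lookups."""
    table = fetch_commons_table(page_name)
    names = [field["name"] for field in table["schema"]["fields"]]
    return {row[0]: dict(zip(names, row)) for row in table["data"]}


def number_of(site, field):
    """Look up one value; repeated calls reuse the cached, indexed table."""
    return load_table_once("Wikipedia statistics/data.tab")[site][field]


# Thousands of lookups on one "page" trigger only a single fetch.
values = [number_of("en", "articles") for _ in range(4000)]
print(fetch_count)
```

As noted above, this works well when the cache is written for one known data format; doing the same generically for arbitrary tables is the harder problem.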

Getting row or column data

Just an idea, not sure about the technical feasibility. Similar to getting a cell value, is it possible to get the values of a whole column or row? The output would be CSV (or some other delimited values) instead of a single value. One use I have in mind is as a data series in {{Graph:Chart}}. - Timbaaa -> ping me 13:31, 14 July 2020 (UTC)

@Timbaaa: That's definitely feasible, though it might be easier to integrate something with {{Graph:Lines}}, which is already pretty usable with tabular data, as seen in COVID-19 pandemic in the San Francisco Bay Area#Cases by county over time. It would look pretty similar to the existing _wikitable() function, but just the part that collects the titles of the elements in data.schema.fields. If you're planning to use this functionality inside a module instead of directly inside a template or article, I'd suggest working with mw.ext.data.get(…).schema.fields directly so you have maximum control over formatting. – Minh Nguyễn 💬 19:46, 19 September 2020 (UTC)
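
To make the row/column request concrete, here is a small Python sketch of extracting one column or one row as a delimited string. It is only an illustration under assumptions, not an existing function of this module: the table is assumed to have the Commons tabular data shape (`schema.fields` plus `data` rows), and `column_values`, `row_values`, and the sample figures are hypothetical.

```python
# Minimal sketch: return a whole column or a whole row as delimited text.

def _field_names(table):
    """List the column names declared in the table's schema."""
    return [field["name"] for field in table["schema"]["fields"]]


def _cell(value):
    """Render a cell as text; missing (null) cells become empty strings."""
    return "" if value is None else str(value)


def column_values(table, column, sep=","):
    """Join every value of one column, in row order."""
    idx = _field_names(table).index(column)
    return sep.join(_cell(row[idx]) for row in table["data"])


def row_values(table, search_column, search_value, sep=","):
    """Join every value of the first row whose search_column matches."""
    idx = _field_names(table).index(search_column)
    for row in table["data"]:
        if row[idx] == search_value:
            return sep.join(_cell(value) for value in row)
    return None


# Invented sample data for illustration only.
sample_tab = {
    "schema": {"fields": [
        {"name": "date", "type": "string"},
        {"name": "cases", "type": "number"},
    ]},
    "data": [
        ["2020-03-01", 1],
        ["2020-03-02", 3],
        ["2020-03-03", None],
        ["2020-03-04", 7],
    ],
}

dates = column_values(sample_tab, "date")
cases = column_values(sample_tab, "cases")
one_row = row_values(sample_tab, "date", "2020-03-02")
print(dates)
print(cases)
print(one_row)
```

A plain join like this does not escape separators that occur inside cell values, so a real implementation feeding a chart template would need to handle that case.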