Jump to content

Module talk:Transcluder

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Krinkle (talk | contribs) at 03:26, 5 February 2022 (bad argument #2 to 'sub' (number expected, got nil): new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Edit request in re nofollow

I added a |nofollow= to disable following redirects. See Module:Transcluder/sandbox. This has two main uses:

  1. Sometimes redirect pages have interesting information on them, that is tedious to copy over whenever it changes. E.g., User:Psihedelisto.
  2. When making many redirects to the same page, that will all have exactly the same wikitext, it's annoying to have to keep typing the {{R shell}}, when you can just, {{subst:#invoke:Transcluder|main|User:Psihedelisto|nofollow=y}}.

Pinging Sophivorus as this is their module. Psiĥedelisto (talkcontribs) please always ping! 02:01, 19 July 2020 (UTC)[reply]

@Psiĥedelisto: Thanks for the contribution! I did a few tweaks and just made it live. Notice that the option is called "noFollow" rather than "nofollow" for coherence with the other options. Let me know of any issues or further requests, cheers! Sophivorus (talk) 13:46, 19 July 2020 (UTC)[reply]

only didn't work on more than 1 element for me

Here, called there. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 22:11, 24 May 2021 (UTC)[reply]

@Guarapiranga Hi yea, I actually removed that feature some time ago but forgot to remove it from the documentation. I just did, thanks. By the way, you can simulate the functionality by doing multiple calls to the module, cheers! Sophivorus (talk) 00:58, 25 May 2021 (UTC) 👍 [reply]

Could this module be lighter and faster with mw.string instead of mw.ustring?

I'm trying my hand at it in the sandbox, but still gotta debug. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 21:16, 30 May 2021 (UTC)[reply]

@Guarapiranga Honestly I used mw.ustring only because it seemed safer, without really understanding the difference with mw.string. If you think you understand the difference and that it won't matter, and all the test cases look good, I'll be happy to incorporate your changes, cheers! Sophivorus (talk) 00:45, 31 May 2021 (UTC)[reply]
My sandbox is nowhere near ready—I was hoping someone more experienced could jump in and find my mistakes. It was from Johnuniq I read that ustring is a lot slower.[1] The reason is that Scribunto implements mw.ustring by using Lua to replace the regex library which is a breathtaking accomplishment but which makes it slower than using plain strings. [2] Mr._Stradivarius also says that using mw.ustring has the drawback of having to cross back and forth between Lua and PHP all the time, which reduces the performance by quite a bit.[3] — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 02:21, 31 May 2021 (UTC)[reply]
I'm not interested in modules that attempt to add another layer of wikitext parsing to the complex PHP of MediaWiki. I can imagine why something like this module would be wanted for special noticeboards but using it in articles is certain to fail periodically and is a maintenance problem. Johnuniq (talk) 02:52, 31 May 2021 (UTC)[reply]
This module is used in Module:Excerpt, which I gather was built for creating topic portals, and I use it at WP:User scripts/List/sandbox, Johnuniq. I don't think it's used in articles. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 03:50, 31 May 2021 (UTC)[reply]
Try "what links here" in the article namespace. Johnuniq (talk) 04:06, 31 May 2021 (UTC)[reply]
@Guarapiranga @Johnuniq This module was primarily designed to be used by Module:Excerpt which implements Template:Excerpt, a template used in 2000+ articles in the English Wikipedia (including COVID-19 pandemic and other high profile articles) and 1500+ more in the Spanish Wikipedia (including heavy use in high profile articles like Science, Philosophy, country articles and many more). Cheers! Sophivorus (talk) 13:05, 31 May 2021 (UTC)[reply]
I stand corrected 😊 — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 13:07, 31 May 2021 (UTC)[reply]

For an example of problems, see Electoral results for the district of Parramatta (which is currently showing a million "time allocated for running scripts has expired" errors). That is solved by removing all the mw.ustring stuff but the bigger question is why articles are using this guaranteed-to-fail method of padding articles. Johnuniq (talk) 05:26, 18 July 2021 (UTC)[reply]

@Johnuniq Hi! Thanks for the info, switching to mw.string has just sharply increased its priority, I'll try to implement it asap. Regarding your bigger question, I'm not sure I understand it or why you say "guaranteed-to-fail", since this issue only seems to affect one or at most a few articles, and seems solvable. Cheers! Sophivorus (talk) 12:32, 18 July 2021 (UTC)[reply]
I did a massive edit to replace many mw.ustring with plain string. Unfortunately, the module is trying to do too much work and Electoral results for the district of Parramatta is still broken. I left 30 mw.ustring and replacing them would fix that article. I left them because some functions may need mw.ustring (depending on how the functions are called), and some I lost interest in because I had already done a lot of changes. I'll explain about "guaranteed-to-fail" some other time because I don't want to distract from seeing if this module can be further edited, or whether the failing article needs a drastic edit (yes it does need a drastic edit because it's just silly to repeat so much information, but that's not a problem for discussion here). Johnuniq (talk) 03:57, 19 July 2021 (UTC)[reply]
There are now 25 articles in Category:Pages with script errors because my editing this module means some pages have been purged. Johnuniq (talk) 05:03, 19 July 2021 (UTC)[reply]
@Johnuniq Hi, thanks for the work done so far! I copied your changes to the central sandbox and some of the test cases seem to be failing now. This may mean broken or buggy excerpts on the English Wikipedia. Do you know what may be the cause? If not, I'll try to figure it out and fix it this week, but I may have to revert to the previous stable version until I do. Many thanks again!! Sophivorus (talk) 12:42, 19 July 2021 (UTC)[reply]

@Sophivorus: I edited mw:Module:Transcluder to fix a couple of problems and copied that module to Module:Transcluder. That means "non" is accepted as "no" which frankly is not a problem but if it were, it can be removed later. Having the two modules the same is simpler now. mw:Module:Transcluder/testcases is showing one problem, namely testReferences:

Expected = acfgk
Actual = acfgk<ref name="l" />

I think the test text for that is the following (from mw:Module:Transcluder/test#References).

a<ref>b</ref>c<ref name="d">e</ref>f<ref name="d" />g<ref name="h" group="i">j</ref>k<ref name="l" />

I cannot see why the code would ever remove <ref name="l" />. It only removes <ref name="d" /> because there is another ref with name="d". Any ideas? Johnuniq (talk) 09:43, 20 July 2021 (UTC)[reply]

After considerable trouble, I have worked out why the current module (with my changes) fails testReferences while the previous mw:Module:Transcluder revision from 12:40, 9 January 2021 passes. It's because this edit removed the following code from getReferences:
if flags and not truthy(flags) then
    text = mw.ustring.gsub(text, '<%s*[Rr][Ee][Ff][^>/]*>.-<%s*/%s*[Rr][Ee][Ff]%s*>', '')
    text = mw.ustring.gsub(text, '<%s*[Rr][Ee][Ff][^>/]*/%s*>', '')
    return references, text
end
Johnuniq (talk) 03:56, 21 July 2021 (UTC)[reply]
@Johnuniq Thanks so much for figuring it out! I just restored the code and now all the test cases pass. I also added a comment explaining why I put that code there in the first place. It's kind of hacky but it's the best I could come up with so far. Same with 'non': it's there to make it work on the French Wikipedia, but it's also kind of hacky so I added a short comment to try to clarify it. Electoral results for the district of Parramatta still displays timeout errors but I think it displays way less than before. The page is also kind of extreme so I think the blame can't be put much on the Transcluder module or the excerpt concept. In fact, this reminds me that some time ago I tried to construct a table of COVID stats with data pulled out entirely from Wikidata (using Module:Wd) but encountered timeout errors that prevented me from accomplishing it. I remember I thought: why should volunteers have to work more in order to save computers from working? Would you agree? Do you think that maybe the max execution time should be upped a bit? Thanks again for the hours put into this! Sophivorus (talk) 13:46, 23 July 2021 (UTC)[reply]
I am a pragmatist and believe we should work with what we've got. While asking for more execution time might be reasonable, the fact is that previewing an edit to any of the many articles which excessively use {{excerpt}} is painfully slow. More execution time would only make those previews/saves take longer. The fundamental problem is that duplicating content is a broken idea. If it really would be desirable to do that, MediaWiki has a built-in mechanism that is extremely fast, namely transclusion. Rather than trying to write a module which deals with all the weird and unpredictable ways wikitext might be written, those wanting to mirror excerpts around the place should use transclusion instead. That would not break and would be very fast. Johnuniq (talk) 01:49, 24 July 2021 (UTC)[reply]
I had another go at making the module work at that failing article. Some debugging showed that Module:Excerpt is also trying to do too much work and it appears to contribute quite a lot to the 10 seconds execution limit. The modules are trying to replace the MediaWiki parser and that is not going to work reliably. Johnuniq (talk) 10:25, 24 July 2021 (UTC)[reply]
@Johnuniq Thanks again for the improvements!! Problem with plain transclusions is that they transclude the entire page, so you have to go use <includeonly> tags which complicate the wikitext and make it more difficult for some users to understand, adopt and use correctly. In the first version of excerpts, I was using the #lsth parser function that at least can transclude just the intro of a page or a specific section. Problem was that it also transcludes undesired elements like infoboxes, notices, navigation templates with undesired automatic categorization, and many others. Name conflicts with references were also rampant (especially because the visual editor names all references the same), so ugly <noinclude> tags and other complications were still necessary. Long story short, modules like Transcluder and Excerpt remove all these difficulties and result in easy, beautiful excerpts at the cost, granted, of a performance loss. But luckily, your efforts are making me see that so many things can be improved on that front that maybe that drawback can be greatly reduced! Thanks again!!! Sophivorus (talk) 12:34, 24 July 2021 (UTC)[reply]

function getNamespaces is misbehaving

Unfortunately, function getNamespaces is not performing as intended the function

local function getNamespaces(name)
	local namespaces = mw.site.namespaces[name].aliases
	table.insert(namespaces, mw.site.namespaces[name].name)
	table.insert(namespaces, mw.site.namespaces[name].canonicalName)
	return namespaces
end

Although 'namespaces' is a local variable, the table.insert is actually adding to the global values

A test

function p.td()
	tab = getNamespaces('Category')
	mw.log(#tab)
	tab = getNamespaces('Category')
	mw.log(#tab)
	tab = getNamespaces('Category')
	mw.log(#tab)
end

returns 2, 4, 6

I think mw.clone() would help eg

local namespaces = mw.clone(mw.site.namespaces[name].aliases)

Desb42 (talk) 10:06, 24 November 2021 (UTC)[reply]

Interesting bug! Your solution is correct, thanks. The original defines namespaces as a pointer to the aliases table in mw.site.namespaces and unexpectedly it allows that table to be modified (presumably it's only a temporary table so the modifications have no effect). The function should be a bit smarter and not add duplicates—in this example both .name and .canonicalName are "Category" and getNamespaces returns a table including both. I'll have a look later to see if a fix would be warranted (that is, how is getNamespaces used?). I checked for changes since I last looked at the module and there is another problem: local min, max = string.match is no longer adequate now that en dash and em dash have been included in the regex. One fix would be to use ustring.match. It might be better (I haven't thought about it yet) to instead use gsub to replace en/em dashes with a hyphen before doing the match. Johnuniq (talk) 02:59, 25 November 2021 (UTC)[reply]
I started looking at function parseFlags but have abandoned it for the moment because other stuff should be cleaned. The regex '^(%d+)%s*[-–—]s*(%d+)$' has a typo that would make it almost never work—the second s should be %s. I can't (quickly) work out what to do about the returned flags table—for a range like 3-5 the table would have numeric keys 3, 4, 5 but for a plain number such as 3 the table would have a string key '3'. Finally, the function should use tonumber first rather than looking for a range then setting min, max to the same value if the text is a number rather than a range. Johnuniq (talk) 04:26, 25 November 2021 (UTC)[reply]
@Sophivorus: You might like to investigate these issues. Johnuniq (talk) 05:52, 27 November 2021 (UTC)[reply]
@Desb42 @Johnuniq Thanks for the report and the fixes! I just fixed the namespace issue and the %s typo and replaced string.match for mw.ustring.match in parseFlags. Johnuniq, I couldn't quite follow your reasoning about the problem with the returned flags table, but feel free to experiment at the central https://www.mediawiki.org/wiki/Module:Transcluder (since the module is used increasingly on several Wikipedias) and testing your changes at https://www.mediawiki.org/wiki/Module:Transcluder/testcases Cheers! Sophivorus (talk) 12:20, 27 November 2021 (UTC)[reply]

Those working on the module need to decide whether input 3-5 should result in keys that are numbers while input 3 gives a key that is a string. It doesn't make sense for them to be different although quirks in the code could possibly make it work. To illustrate the difference:

t = {}
t[1] = 'My key is a number'
t['1'] = 'My key is a string'
for i, v in ipairs(t) do ... end  -- processes t[1] only

Johnuniq (talk) 23:13, 27 November 2021 (UTC)[reply]

@Johnuniq Thanks, I think I understand now and I agree. Do you think this change should suffice? Sophivorus (talk) 21:56, 29 November 2021 (UTC)[reply]
@Sophivorus: Sorry, I was wrong (or, I cannot now work out why I thought the code was mixing numbers and strings). It is true that mw.ustring.match will return strings given something like '3-5', but in for i = min, max do flags[i] = true end, i can only be a number and min, max will be automatically converted to numbers if possible. I think your code would work but I edited mw:Module:Transcluder to show what I recommend. There is very little practical difference but I think checking for input "3" before checking for "3-5" is cleaner. The line
min = tonumber(range:match('^%s*(%d+)%s*$'))
could simply be
min = tonumber(range)
if you don't need to worry about input like 1.2e1 (which tonumber regards as 12) or broken input like 1.2 (which would give a non-integer key). Johnuniq (talk) 06:17, 1 December 2021 (UTC)[reply]

some (hopefully) performance improvements

I have just created Module:Sandbox/Desb42/Transcluder to test out a number of ideas

  1. change getNamespaces to create a list of unique namespace names (as this produces a list of regexes to be processed one by one)
  2. change how template black listing is processed (check the first letter of the template name and only process those blacklist names that start with that letter). See function matchBlacklist and the reorganised xtemplates (currently within this Module)
  3. the function removeString could be called many times in getTemplates (this 'text' is potentially rebuilt many times). Instead of using gmatch to work through 'text', take a note of how far the search has progessed (using the '()' regex instruction), and copy each snippet up to this to build one text string at the end of the loop

Desb42 (talk) 16:27, 27 November 2021 (UTC)[reply]

@Desb42 Awesome!! Every performance improvement is welcome since some pages include many excerpts and we have encountered script timeouts before. Furthermore, I think excerpts will slowly become more and more common so performance improvements will become more and more of a deal. I looked at your code but I need to study it better (I'll do so this week, promise). Feel free to experiment as much as you want and then deploy your changes at https://www.mediawiki.org/wiki/Module:Transcluder so that we can test them against https://www.mediawiki.org/wiki/Module:Transcluder/testcases before deploying on the wikis. Lastly, please continue trying to make your code and comments as readable as possible so that future developers can understand it easier and are motivated to contribute! Kind regards, Sophivorus (talk) 11:47, 30 November 2021 (UTC)[reply]

MediaWiki update

I noticed the MediaWiki version is slightly out of data with the English WP version:[4] fgnievinski (talk) 03:01, 29 January 2022 (UTC)[reply]

bad argument #2 to 'sub' (number expected, got nil)

On Platform Engineering Team/SOPs/Intake Process the last section is failing with this error when attempting to transclude "intake" from another page. I traced this to the second sub call on line 443, which is string.sub(text, suffix). Perhaps someone here will know what causes this and how to repair the module and/or the page using it (If the latter, perhaps we can handle the issue and render a more helpful error). Krinkle (talk) 03:26, 5 February 2022 (UTC)[reply]