Jump to content

Module talk:Excerpt/Archive 3

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Lowercase sigmabot III (talk | contribs) at 06:34, 9 November 2022 (Archiving 1 discussion(s) from Module talk:Excerpt) (bot). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Archive 1Archive 2Archive 3Archive 4

Undefined referenes through excerpt

I'm trying to sort out why 59th Venice Biennale gives errors about the reference "Artnews" being undefined. The text (and the reference definition) are brought in by {{excerpt}} from List of national pavilions at the 59th Venice Biennale. The source artilce renders without error, and I've done some poking at it to try to figure out what's wrong ... but nothing makes a fix.

Anyone able to help me sort it out please? - Mikeblas (talk) 22:38, 25 April 2022 (UTC)

References from quoted articles have the article name prepended, in this case <ref name="List of national pavilions at the 59th Venice Biennale Artnews">. This is to prevent clashes when both articles use the same name (commonly <ref name=":0"> etc.) for different citations. Certes (talk) 23:28, 25 April 2022 (UTC)
Is that relevant here? I don't think it is becuawe 59th Venice Biennale doesn't invoke any references defined in the excerpted text. -- Mikeblas (talk) 01:06, 26 April 2022 (UTC)
Partly. The problem seems to be that some of the citations use just name="Artnews" but others use name="Artnews" group="". The name isn't being changed when it appears with group="". Pinging Sophivorus. Certes (talk) 09:59, 26 April 2022 (UTC)
I must be missing something; all the usage I see is <ref name="Artnews"/>, without a group name attribute (not even a blank one). -- Mikeblas (talk) 13:50, 26 April 2022 (UTC)
No, I was missing something. I put {{excerpt|List of national pavilions at the 59th Venice Biennale|National pavilions}} through Special:ExpandTemplates and noted that some ref tags had group="" and didn't get a prefix. I hadn't spotted that they were expansions of {{r}} in the original article. That no longer happens now that the article contains a workaround. Certes (talk) 14:44, 26 April 2022 (UTC)
Observation: it seems to work fine when it came across more named references, such as <ref name="Artnews"/>, but fails when it is trying to call the reference via {{r|...}} on the remote article. I replaced one instance of r with the declaration and the problem went away. For your immediate need @Mikeblas: you could do the rest of those, as in Special:Diff/1084771632. — xaosflux Talk 14:05, 26 April 2022 (UTC)
That works! Sorry, I thought I had already removed uses of the {{r}} tempplate -- except for the invocations which were commented away. So it looks like the problem is that {{excerpt}} is simply incompatible with {{r}}. -- Mikeblas (talk) 14:10, 26 April 2022 (UTC)

Template-protected edit request on 17 May 2022 - Rare section edge-cases

Note: This is actually a dual edit request, with the pages in question being both Module:Excerpt and Module:Transcluder. Im only going to do 1 edit request to avoid polluting anything.

This is a potential fix for the transcluder and excerpt not correctly recognising a section's name if it contained a #. Note that while the testcases for excerpt have shown this change should be perfectly fine, I would preferably like a comment from anyone who knows more about this module than me to mention if there was a reason for ([^#]*) over (.*) (if any) before a TE or other implements this, as it feels odd enough to potentially be intentional.

The relevant fixes are on the sandbox for each page. The main excerpt testcases can be found here and the examples of it successfully working with the # are visible on my test page. Note that /testcases3 is claiming that there are differences between the 2 versions on some examples. After using safesubst to get the exact content and using a comparison tool, no difference was noted. Im not entirely sure what has happened there.

Thanks. Aidan9382 (talk) 21:17, 17 May 2022 (UTC)

@Aidan9382 Done, thanks for the fix and careful request! BTW, the issue with /testcases3 has been there for a long time and seems caused by Template:Test case table Sophivorus (talk) 21:15, 18 May 2022 (UTC)
Thanks for implementing! That testcases bug feels very odd to be happening, but thanks for letting me know. Might take a look at it, who knows. Aidan9382 (talk) 04:31, 19 May 2022 (UTC)
@Sophivorus: Thanks for bringing that whole /testcases3 issue has been happening for a while thing. Been working on that module for the past hour or so, and it really needs some slight bug fixing. The UNIQ--QINU detection was the issue here (fixed now), but the nowiki and nowiki+ output options are apparently not doing so good and making the diff checker think this isnt right (Maybe its somehow skipping normalisation?), so i guess ive got something to focus on now. Aidan9382 (talk) 05:32, 19 May 2022 (UTC)
@Aidan9382 Awesome, I didn't know you were working on that module! The testcases look green already, so thanks, nice work! Sophivorus (talk) 11:26, 19 May 2022 (UTC)

Template-protected edit request on 29 May 2022 - Blacklist fix

In the /config, protection padlock (pp) templates are considered blacklisted. However, the current regex, [Pp]p%-.+, catches all protection templates except the core {{pp}} one. To fix this, change

'[Pp]p%-.+',  -- {{pp-move-indef}} etc.

to

'[Pp]p', '[Pp]p%-.+',  -- {{pp-move-indef}} etc.

The reason its over 2 different regexs is because simply doing [Pp]p.* might be a bit too aggressive, and its easier to stay on the safe side of things. Thanks. Aidan9382 (talk) 21:18, 29 May 2022 (UTC)

 Done. P.I. Ellsworth , ed. put'r there 00:18, 30 May 2022 (UTC)

Recent change?

Has there been a recent change to the module - I'm noticing pages that previously worked are once again getting the error "The time allocated for running scripts has expired" --Find bruce (talk) 22:52, 28 May 2022 (UTC)

Always link to an example page when you are reporting a possible problem. – Jonesey95 (talk) 23:50, 28 May 2022 (UTC)
Thanks for the reminder. An example is Electoral results for the district of Bathurst, but the problem exists in virtually every article in Category:New South Wales state electoral results by district. You will note that every election prior to 1930 uses transcluded sections rather than the template. This was done in December 2021 [1] to deal with the error. Find bruce (talk) 02:21, 29 May 2022 (UTC)
@Find bruce: This issue has existed for a while - ive noticed quite a few election articles that have had this error for a while. As for recent changes, no. This module and its main associated module (Module:Transcluder) have had no significant changes that would affect performance. Pages can just sometimes take longer or shorter to load each time you save a page with Excerpt. Hope this helps. Aidan9382 (talk) 03:58, 29 May 2022 (UTC)
Thanks @Aidan9382: - the issue never arose when I first started using the module in June 2020 & first appeared around August last year. I edited all 350 articles to implement a work around. I will wait to see if anyone has a solution, otherwise I will simply give up on the module & go back to my previous template approach for all but the most recent elections - it's not as elegant & there are traps for the unwary that the module avoids, but it works with fewer resources. Find bruce (talk) 05:48, 29 May 2022 (UTC)
@Find bruce: There seem to have been quiet a few changes to the Transcluder last year august, but I don't see any differences that could be major. If your having to implement a fix for 350 pages, then I think it might be time this module set had a performance look-over, as that could cause a bit of confusion for passing editors. I'll take a look, see if I can find any performance improvements, but for now, I guess the template approach will have to do. Aidan9382 (talk) 05:59, 29 May 2022 (UTC)

@Find bruce: - Just came across an interesting discovery. It turns out that one of the main expensive calls (as in a third of the entire lua runtime) is to do with it trying its hardest to find files. Of course, in this situation, files are not gonna be relevant (at least I doubt it). Adding |files=0 to all the excerpts on a page should significantly improve the performance, and infact should be good enough to remove lua timeout errors in 99% (or 100% if we are lucky enough) of scenarios. Give it a shot, tell me how it goes. Aidan9382 (talk) 07:45, 29 May 2022 (UTC)

I haven't investigated the current code but it may be worth looking at Transcluder.removeNonFreeFiles(). A previous version of the module (now forked for use in portals) spent a lot of its time opening every file's rationale to check whether it could be included, and there was no obvious way to do that more efficiently. Certes (talk) 18:28, 29 May 2022 (UTC)
@Certes: While I could easily see that becoming a quick problem (just the sound of it makes me concerned), that unfortunately isn't the issue here. Transcluder.removeNonFreeFiles is only ran if any files are found. The issue here is the fact its going to extents much further than whats worth it to attempt to find files unless explicity told not to, and that is leading to another expensive Transcluder.get call. I will note that there may be reason for concern, as I've just noticed the code only attempts to remove Non-free files if found in blacklisted templates (Not a common scenario. I can explain further if required). Do you think this is worth looking into as another potential issue with this module? Thanks! Aidan9382 (talk) 18:36, 29 May 2022 (UTC)
@Aidan9382: I think the "blacklisting" is just another way to specify files, e.g. files=-2 means all files except the second, and behaves just like files=1,3-999. The check looks unchanged from the portal version except that it's also checking Commons, which may be unnecessary as Commons doesn't accept non-free content. It would be nice if we had an inexpensive isFileNonFree method, but sadly we do need to read the text; people got very upset and started deleting transclusions when we didn't. However, as you say, this may be a red herring as that code only runs when a file is found. Certes (talk) 19:38, 29 May 2022 (UTC)
Thanks @Aidan9382: I have added |files=0 to all the excerpts in Category:New South Wales state electoral results by district & it appears to have fixed all of the current timeout issues. Find bruce (talk) 08:56, 30 May 2022 (UTC)

Expensive calls and why excerpt runs so slow

During some recent testing, I was giving a prod at some of the parts of excerpt, and I noticed a section to do with desperately trying to grab files from anything we might have missed if a file cant be found initially (Lines 152-173). After excluding the content during some testing and some show preview usage, I noticed the execution time dropped by about 42% (Average 7.4438s -> Average 4.304s). As expected, the actually troubling part of that is probably the second Transcluder.get() for just templates. However, I am running very short on solutions, so I was wondering: Could someone give their opinions on:

  • The usefulness of having this to run by default,
  • The usefulness of this code section in general (why only try if we couldnt get files normally? Why does this make the infobox invaluable?), and
  • Any solutions that could either prevent calling it too often or improve the calling of it

I have a solution idea in mind, however the elegance of it is very low and is quite a hacky idea that would probably be a bad move, so input from others would be nice. Aidan9382 (talk) 18:26, 29 May 2022 (UTC)

@Aidan9382 Amazing find!! The usefulness of the feature is basically that many articles (perhaps even most) have their main image in the infobox, so this code tries to salvage it. It's inefficient, messy and even unreliable, I agree, but also useful and perhaps even expected at this point. That being said, I think I have an idea on how to fix it! I tried to implement it here but I couldn't quite finish (it's late and I can't find a good way to debug!). Basically the idea is to avoid entirely the second call to get() which essentially fires a second request and is probably the main cause of the performance loss. @Certes The call to removeNonFreeFiles also fires a request and may be causing a performance loss, so I'm thinking that given that all the trouble you had was in portals, we can maybe disable this feature on the mainspace by default, and only enable it on a case-by-case basis when required, thus avoiding the performance loss. Sounds like a plan? Sophivorus (talk) 21:26, 30 May 2022 (UTC)
@Sophivorus: Disabling the check in mainspace is worth a try. We need the current behaviour (in Module:Excerpt/portals, not Module:Excerpt) for portals because many of them display an excerpt from one article picked randomly from a long list, and no one checks them all to see if one has a blurry movie still that will bankrupt Hollywood if displayed. Certes (talk) 21:38, 30 May 2022 (UTC)
@Aidan9382 @Certes I think I nailed it (see Module:Excerpt/sandbox). I still have to fix a few minor issues, but would you mind checking if the effort was worth it, Aidan9382? (I have no idea know how to benchmark Lua modules.) Thanks! Sophivorus (talk) 23:23, 30 May 2022 (UTC)
@Sophivorus: I don't know the current code well enough to review it properly but that change to checking freefiles looks like what you need. (I'm interpreting that flag as "limit to free files only?".) For a quick benchmark, I preview with a relevant page, view the source code (Ctrl-U on Firefox) and search for "NewPP limit report". That shows how close we are to various limits (rarely a problem except in portals) and gives timings for various templates and Lua functions. Certes (talk) 23:36, 30 May 2022 (UTC)
@Certes Awesome! Well, judging by the "NewPP limit report" for Electoral results for the district of Bathurst it seems the sandbox version is far superior (~ 6 secs vs ~10 secs)! I'll wait for Aidan while I fix the remaining issues and deploy in a day or two, cheers!! Sophivorus (talk) 00:12, 31 May 2022 (UTC)
@Sophivorus: I've gone ahead and done some comparisons between Except and Excerpt/sandbox on a test page, and I can confirm the /sandbox version fixes the issue quite well, bringing my test page from an average of 8s to an average of 6s (The results are being really inconsistent so the improve is probably better than 1/4th). My only concern with doing Transcluder.getTemplates(excerpt) is that, if the excerpt is set to only=files (or similar), the excerpt will include no templates, and therefore wont find any infoboxes, but this is more of a rare scenario. While you could technically grab an unfiltered excerpt of the section and filter out what the user wants later, that also kinda nullifies a significant part of Transcluder's get(), so I'm wary of trying to implement that. Other than that though, this looks like its doing a fine job. Nice work! Aidan9382 (talk) 05:06, 31 May 2022 (UTC)
My theory seems to be correct, but it also seems to not be removing non-free files. Just take a look at the differences from Template:Excerpt/testcases. Also check Template:Excerpt/testcases2 for a weird template difference at the bottom. Note: If the content side by side looks the same, but its marked as different (yellow, uncollapsed), then its a formatting difference. Ignore it, consider it the same. It probably is. Aidan9382 (talk) 13:23, 31 May 2022 (UTC)
@Aidan9382 Thanks for the insights! I was able to fix the issue at the bottom of /testcases2, here. As to the non-free files, as I mentioned to Certes above, it's just a change in the defaults. From here on non-free files will have to be filtered explicitly by setting freefiles=yes. This is because 99% of excerpts don't have issues with non-free files but each check for non-free files is expensive. Finally, regarding the issue in /testcases where only=files no longer gets files from the infobox, I agree this is an edge case not worth the trouble. If no more issues or concerns arise, I'll deploy in a day or two, cheers! Sophivorus (talk) 22:38, 31 May 2022 (UTC)
@Aidan9382 @Certes Deployed! I'll stay around for some minutes just in case, thanks for the awesome work and support! @Find bruce This new version should make setting files=0 unnecessary. Thanks for bringing up this issue in the first place! Sophivorus (talk) 22:07, 1 June 2022 (UTC)