Module talk:Convert/Archive 2
This is an archive of past discussions about Module:Convert. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Module:Convert/makeunits
I've done some major refactoring on Module:Convert/makeunits and it now runs on wiki (no stand-alone computer needed, although I have made a system to invoke the new makeunits for easier testing on my local computer). As noted on the doc page, there is some weirdness in the output that can be worked around by using "subst:". Someone may like to experiment and put the newly defined units in Module:Convert/data. BTW I have reported the bug currently visible in the archives box on this page. Johnuniq (talk) 10:48, 7 October 2013 (UTC)
- Great! May I suggest we start with the habit to put any new version into Module:Convert/data/sandbox first, and test the /sandbox? -DePiep (talk) 11:10, 7 October 2013 (UTC)
Status
I've seen this project working in the shadows, and I wonder what the status is, and whether it's going to be ready for prime time any time soon. →AzaToth 19:27, 14 November 2013 (UTC)
- I'm a bit embarrassed about the status as the module has been nearly ready for three months. The module actually is ready (and has been used at the Bengali wiki since June, see here), but I've wanted to do a couple of other things, and I wanted to wait for some personal quiet time before asking for {{convert}} to be changed to use the module. I have no plans to change anything in the module; the "other things" involve setting up a more extensive set of testcases and finishing some documentation. The "quiet time" is because quite a bit of work will be required when the module is live, and there's sure to be a couple of weeks of problem fixing (defining missing units; handling reports of breakage; changing the syntax for some features such as spelling which the module does differently). Unfortunately real-life stuff has got quite hectic, and while I've had time to respond to a couple of things on my watchlist, I've had hardly any thinking time for a couple of weeks. In a week I will have a good idea of how off-wiki problems are going. Come to think of it, I might never be "ready" and the best thing might be to just switch the damned thing on and see what happens... Johnuniq (talk) 08:45, 15 November 2013 (UTC)
- I would suggest releasing it now; if there are minor problems, others can probably fix them if you are not around, and if it crashes totally, then I assume it can be reverted. →AzaToth 21:45, 19 November 2013 (UTC)
- Johnuniq is saying that he/she will need personal quiet time when this one goes live: there could be major problems popping up. I remember when citation /CS1 went live last April, it needed 15 updates in the first month (not that number is important, but the thinking & coding it represents). Leave it up to J I say. -DePiep (talk) 06:11, 20 November 2013 (UTC)
- Thanks DePiep: you have read me accurately. However, I'm glad to say that the last couple of days have gone well here, and my hectic period has passed for the moment.
- Thanks AzaToth: you have pushed at exactly the right moment to get me going again, and I've spent the last couple of days checking some matters that were outstanding. I will recommend the switch at Template talk:Convert#Request to switch to Module:Convert. Johnuniq (talk) 09:28, 20 November 2013 (UTC)
Feature request: configReport feedback
I propose the module gets a "config feedback" public function. Its job is to return what configuration settings the module uses. It would be used in documentation pages & sandboxes (not in mainspace). Demo: Say the module is live, and the calling template has this code:
- {{Convert}} {{#invoke:convert|convert| numdot = , }}
Then on Template:convert/doc we ask for the config overview:
- {{#invoke:convert|configReport| numdot = , }}
The report would return:
- Incomplete, mockup list
Variable | Value | Note |
---|---|---|
Configuration of | #invoke:convert | |
Template invoking | Module:convert/doc | (template calling configReport) |
Module invoked | Module:convert | |
Parameter(s) set in invoking template | numdot=, and is_test_run=false | |
Units data page used | Module:Convert/data | |
Text data page used | Module:Convert/text | |
Value applied for numsep | , | Warning: numdot=numsep |
Value applied for numdot | , | |
Value applied for max_sigfig | 14 | |
Value applied for lang | (none) | implies en for current wiki |
... | ... |
Rationale.
A lot of config variables are set, in multiple locations (template, data, text, various places in the module code, ...). Also, many interact (e.g., sandbox, testing, and language).
Some settings are hardcoded in the main convert function. Also there is the logic to be reproduced in the function (precedence and logic of settings, e.g. overwriting defaults).
These have to be copy-pasted or hardcoded into the configReport function; this requires attention. And this configReport also must be called from a template that has copy-pasted all parameters from the researched one, another manual action = risk.
On the other hand, it gives an editor an overview & a check. It self-documents the configuration.
Notes: This idea is working in Module:RailGauge to check the data file. (coding: User:Mr. Stradivarius). -DePiep (talk) 06:55, 20 November 2013 (UTC)
- Good idea, although I'm not sure how much use it would get. Rather than a new function, it should be pretty easy to have a new option (perhaps report=on inside a normal {{convert}}) which would display the table instead of the normal output. I don't think the first three lines are achievable (configuration of, template invoking, module invoked). Johnuniq (talk) 09:56, 20 November 2013 (UTC)
- Yes, better: calling through function "convert" prevents having to copy-paste the very settings & logic one wants to check.
- Self-identifying the calling template &tc. was a long shot - failed.
- Would not be used much, but invaluable in sandboxing. -DePiep (talk) 10:04, 20 November 2013 (UTC)
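A rough sketch of what such a report could compute, assuming caller arguments simply override built-in defaults. The function names, the row layout and the default for numdot are all hypothetical; only the setting names (numsep, numdot, max_sigfig) and the warning text come from the mockup above.

```lua
-- Hypothetical sketch: none of these names are taken from Module:Convert.
-- Assumed defaults; numsep and max_sigfig follow the mockup, numdot is a guess.
local defaults = { numsep = ',', numdot = '.', max_sigfig = 14 }

-- Merge caller arguments over the defaults and return one row per setting.
local function config_report(args)
    local rows = {}
    for _, key in ipairs({ 'numsep', 'numdot', 'max_sigfig' }) do
        local value = args[key]
        if value == nil then
            value = defaults[key]
        end
        rows[#rows + 1] = { name = key, value = value, note = '' }
    end
    -- Flag the conflict shown in the mockup: both separators are the same.
    if rows[1].value == rows[2].value then
        rows[1].note = 'Warning: numdot=numsep'
    end
    return rows
end

-- Render the rows in the same pipe-separated layout as the mockup.
local function render(rows)
    local lines = {}
    for _, row in ipairs(rows) do
        lines[#lines + 1] = string.format('Value applied for %s | %s | %s |',
            row.name, tostring(row.value), row.note)
    end
    return table.concat(lines, '\n')
end

print(render(config_report({ numdot = ',' })))
```

Because the report is built from the same merged settings a conversion would use, nothing has to be copy-pasted into a separate function, which is the point made in the replies above.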
Use of Lua Convert fork
Currently, the Lua script version of Template:Convert (as Module:Convert) has been treated as a sandbox version, with Template:Convert/sandboxlua, but installation for wider usage is being discussed. However, despite fears of forking the template, the Lua version has diverged as a fork with different features, incompatible with the markup-based Convert. Originally, {convert} was a simple gateway template to hundreds of subtemplates, which could convert not just measurements but also dates and times, which the Lua fork does not handle. Examples of non-numeric conversions:
- {{convert|25 November|date|day}} → 25 November (day 329)
- {{convert|3:45 pm|time|24}} → 3:45 pm (15:45)
The divergence of the Lua version, as a fork, began reasonably as an attempt to "improve" some glitches in the protected Convert subtemplates (now being fixed by the authorized wp:template editors), where the rewrite in Lua would make results consistent. Hence, the forking was a natural follow-on to also add new features, such as 3-part combinations in Lua:
- {{convert/sandboxlua |99|in|ydftin}} → 99 inches (2 yd 2 ft 3 in)
- {{convert |99|in|ydftin}} → 99 inches (2 yd 2 ft 3 in)
Similarly, the Lua version has other new features, but they come at a high price: as a gargantuan Lua module which would trigger the reformatting of all articles (552,000) using Convert (if any feature were altered except new units in Module:Convert/extra), whereas the original markup-based Template:Convert has allowed numerous feature changes to affect only a few dozen or hundred pages at a time. Efforts to keep the 2 versions synchronized have been delayed for weeks or months, and I think we will need to release the Lua version, as a beneficial fork, but under separate template name(s). Also, there have been calls for a "feature freeze" to transition entirely to Lua, but that implies the old features will be dropped (as incompatible), while a better plan would be to accept new features until a 3-month period of no new requests, allowing the efficient markup-based Convert to reformat only the few articles using a new feature each day (not reformat 552,000 pages for each Lua edit). Meanwhile, I have created Template:Convert/q, to run the rapid Lua version, for rare cases beyond the typical 3-7 conversions, in pages which would contain hundreds of {convert/q}. New features could be added, to either version, and the need for identical features could be assessed in each case, as there are no "consistency police" forcing similar templates to function with lock-step operation. Forks are needed at times: some prefer vanilla and some prefer chocolate, but "chocnilla" is unlikely to satisfy everyone. I think we could carefully expand the use of the Lua version, such as inside Template:Height (using ft/m units), but keep both versions and not force a VE-style upset where "edit" suddenly ran a different interface putting "<nowiki>" tags around "[__]". Because the markup-based Convert is relatively fast (over 40 per second), there is no critical need to use Lua for 7-conversions-per-page. 
Perhaps discuss these issues below, for use of Lua in {Height} or other templates, and then announce plans at those other templates. -Wikid77 (talk) 07:48, 25 November 2013 (UTC)
- WP:FORUMSHOPPING. Wikid, why did you not answer your {{convert/q}} contradiction where it was put to you? Why do I not get the impression that you are helping to improve {convert}? Where was your collaborative input for the module setup? For months I have read your objections, thought of afterwards. I add this for me. I find it strange to see this behaviour by Wikid77. I know Wikid as a very smart & good editor; this opposition from behind does not seem to fit character. That is not a nice experience. -DePiep (talk) 08:01, 25 November 2013 (UTC)
The old templates (with more than 3000 subtemplates) have many problems:
- See a recent bug report at Template talk:Convert#Dual convert/spell.
- It's a little unfair, but here is what happens if weird mixtures of options are fed into the old templates—see the "Convert (Live)" column.
- Many more examples of template bugs and inconsistencies are here.
The module is live at bn:Template:Convert and simple:Template:Convert and commons:Template:Convert with no apparent problems. A period of maintenance will be required after switching the templates to use the module, but there is every reason to believe that the module will reduce the number of problems currently visible in articles.
No one has identified any units or options that are used in articles and which do not work in the module.
Discussion of whether to switch the template to use the module should be at Template talk:Convert#Request to switch to Module:Convert. Johnuniq (talk) 09:41, 25 November 2013 (UTC)
Data Organization
Has any consideration been given to breaking up the /data into subpage groups based on base unit (or theme)? The data page is pretty overwhelming which is going to discourage editing, and since it includes all units, it is going to lead to lots of pages being marked for update in the event of any little change. It would seem reasonable to me to break it up into subgroups, so that /data tells the Module which subpage to look at e.g. data['K'] -> 'temperature' and the actual record is at data/temperature['K']. You'd still have to edit /data if adding new units, but you wouldn't have to refresh everything when changing just one unit group (e.g. temperature).
On a similar theme, I think any special cases requiring dedicated functions (e.g. "mach", "hand") should be moved to an auxiliary module and not be part of the core. The core is complicated enough without also being burdened by the special cases. Ideally one might consider a plugin framework such that the /data entry could specify a module and function to use when converting to and from the special cases, which would make it easier to support additional such units in the future, without having to write them into the core. Dragons flight (talk) 20:31, 3 December 2013 (UTC)
- Agreed, and my first rough design was to have a separate table for each unit type, and a meta table to determine what type of unit was used in a convert. If a convert has, say, "m" as the input unit code, the module would look up the fact that "m" has type "length", and would then load the submodule which has details on units of that type. The output unit would be in that same submodule, or would be an error. I certainly think that the built-in units (Mach and hand, with a couple more wanted) should be in a submodule. I started with them in the main module because it was necessary to first get them working in order to see what hooks into a submodule would be needed (I like top-down design, but my experience is that it has to be tempered with bottom-up).
- A contrary view (one that supports the single giant data module) is that people should get the unit definitions right, then we can stop fiddling with them. After that, edits to the unit definitions should be rare, and there could be, say, monthly updates to incorporate tweaks made in the sandbox. Only if that model is found to be a problem would there be a need to introduce the complexity of breaking the giant data module into submodules (one giant module is a problem, but having one hundred submodules would also be a problem). Another factor is that I want to make the modules reasonably simple for use on other wikis. If there are more modules, that's trickier, and it's more difficult to check that the modules at enwiki agree with those on another wiki.
- You are aware of Module:Convert/extra? New units can be added there, and they are instantly usable. The "extra" module is only loaded if a conversion involves unknown units, so changing it involves updating only a handful of pages. Also, editors must not change Module:Convert/data except to replace its entire contents with the output from Module:Convert/makeunits. A single (giant) page defines all units (here), and that is reasonably easy to edit, and the output from makeunits (here) can be copied to the sandbox (here) for testing. Johnuniq (talk) 21:23, 3 December 2013 (UTC)
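The two-level lookup discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not how Module:Convert is organized: the tables unit_types and subpages, the subpage naming and the sample scales are all hypothetical, and plain tables stand in for data pages so the code runs outside a wiki.

```lua
-- Hypothetical layout; the real Module:Convert/data is a single table.
-- Meta table: unit code -> unit type.
local unit_types = { K = 'temperature', m = 'length', ft = 'length' }

-- Stand-ins for per-type subpages (names and contents are assumptions).
local subpages = {
    length = { m = { scale = 1 }, ft = { scale = 0.3048 } },
    temperature = { K = { scale = 1 } },
}

local loaded = {}  -- cache so each subpage is loaded at most once

local function load_units(utype)
    if not loaded[utype] then
        -- On wiki this step might be a mw.loadData call on a subpage.
        loaded[utype] = subpages[utype]
    end
    return loaded[utype]
end

-- Return the input and output unit records, or nil plus an error message.
local function lookup(in_code, out_code)
    local utype = unit_types[in_code]
    if not utype then
        return nil, 'unknown unit: ' .. in_code
    end
    local units = load_units(utype)
    local out_unit = units[out_code]
    if not out_unit then
        return nil, 'unit mismatch: ' .. out_code .. ' is not a ' .. utype .. ' unit'
    end
    return units[in_code], out_unit
end

local function convert(value, in_code, out_code)
    local in_unit, out_unit = lookup(in_code, out_code)
    if not in_unit then
        error(out_unit)  -- second return value holds the message
    end
    return value * (in_unit.scale / out_unit.scale)
end

print(convert(10, 'ft', 'm'))
```

Only the subpage for the input unit's type is touched, so an edit to the temperature data would not affect pages that only convert lengths. The cost is the extra meta table that has to be kept in step with the subpages.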
Useless digits
Under Fahrenheit, you have the scale as "0.555555555555555580227178325003".
Obviously this should be 5/9 = 0.555555555555555555555555...
I would assume the bit at the end of your factor is a bit of garbage caused by converting to/from a digital representation at some point. I would suspect that some of the other very long number strings also have garbage at the end, but I haven't checked them. As a practical issue, an error of 3 parts in 10^16 is completely negligible in any context that Wikipedia editors are likely to use, but it might be nice to avoid putting strange gunk in the files if possible. Dragons flight (talk) 20:57, 3 December 2013 (UTC)
- I noticed the weird numbers in the scales! The scale for Fahrenheit is entered as "5/9" (here), and that wikitext is processed by Module:Convert/makeunits which generates the silly string above; it is the output from tostring(v).
- Interestingly, on my local computer, the same code produces "0.55555555555555556", and I was rather shocked when I saw the output from the servers. I pondered what to do, perhaps use string.format('%.14f', v). But that would turn "5" into "5.00000000000000", so I would need another (easy) step to remove the trailing dot and zeroes. And "%.14f" is totally wrong for small scales; "%.14g" would be needed, and that produces more gunk. Finally, what if the servers can actually get more than "%.14f" precision? What if an upgrade in three years can manage "%.20f"? In the end, I gave up and just used tostring(v) which I have found gives good results.
- Any suggestions are welcome. The actual line of code is, of course, more complex:
-- Replace results like '1e-006' with '1e-6'.
v = string.gsub(tostring(v), '(e[+-])0+([1-9].*)', '%1%2', 1)
- Originally I just used the defined scale in the data module, so if an editor enters "5/9", the data module would say "scale = 5/9". I decided that each such expression (there are lots) should be evaluated by makeunits so the servers do not have to do a thousand evaluations just to load the data module. Johnuniq (talk) 21:51, 3 December 2013 (UTC)
- No, the part that generates the long run of garbage is in Module:Convert/makeunits:
- local result = string.format('%.30g', value)
- Simply replacing that with:
- local result = tostring(value)
- Should be sufficient to get the decimal representations to terminate at a more normal place. Dragons flight (talk) 02:02, 4 December 2013 (UTC)
Yikes, I've totally misread my code, thanks. Actually, at the moment I cannot recall much about that issue except that it caused a fair bit of angst at the time (nearly nine months ago: diff). As I recall, my thought processes were along the following lines. I wanted to remove expressions like "5/9" by evaluating them so the server does not have to do pointless calculations when the data file is loaded. Therefore each scale has to be written like "scale = 1.234". But the resulting string should not lose precision that may be available on the server that ultimately runs the convert module. I found some cases where '%.30g' gave extra digits compared with tostring, and those digits (on my computer) appeared to give more precision without too much junk. The "30" is more than what is available, in an attempt to say "give as much precision as possible".
I just did this experiment (400/121 is the scale for pyeong):
# In ilua (interactive Lua).
> p = function(v) print(string.format('%.30g\n%s', v, tostring(v))) end
> p(5/9)
0.55555555555555558
0.55555555555556
> p(400/121)
3.3057851239669422
3.3057851239669
# In Python (exact integer arithmetic; "L" = "long integer").
>>> 55555555555555558 * 9
500000000000000022L
>>> 55555555555556 * 9
500000000000004L
>>> 33057851239669422 * 121
4000000000000000062L
>>> 33057851239669 * 121
3999999999999949L
The above shows that the extra digits (on my computer) are not entirely junk. By contrast, it seems clear that the server is producing junk.
I think I'll put your proposal into makeunits and make a sandbox comparing the two cases (the scales produced now, versus the scales that "tostring" would produce). I'll have to do that later. Johnuniq (talk) 03:44, 4 December 2013 (UTC)
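One way to check whether the extra digits carry information is a round-trip test: format the value, parse the text back, and compare with the original. The sketch below assumes standard Lua, where tostring typically formats floats with about 14 significant digits; the helper name round_trips is made up for this illustration.

```lua
-- Hypothetical helper: does the text produced by a formatter parse back
-- to exactly the same floating point value?
local function round_trips(text, value)
    return tonumber(text) == value
end

local function show(value)
    local short = tostring(value)               -- typically about 14 digits
    local long = string.format('%.30g', value)  -- asks for more than a double holds
    print(short, round_trips(short, value))
    print(long, round_trips(long, value))
end

show(5 / 9)      -- scale entered for Fahrenheit
show(400 / 121)  -- scale entered for pyeong
```

If the short form fails to round-trip while the long form succeeds, a scale written with tostring is a slightly different number from the one makeunits computed, even though most of the digits beyond the 17th in the long form are noise.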
Experiment
@Dragons flight: Sorry this is a bit lengthy, but I wanted to record my thoughts. I edited Module:Convert/makeunits so an argument to the invoke controls whether it uses "%.30g" or "tostring".
- Module talk:Convert/makeunits uses {{#invoke:convert/makeunits|makeunits}} ("%.30g").
- User:Johnuniq/sandbox3 uses {{#invoke:convert/makeunits|makeunits|tostring=yes}} ("tostring").
To see the difference, I copied the displayed text to local files, and did a local diff. I was going to post the local files as pages so others could compare them, but the data is pretty useless because it just shows the obvious–using "%.30g" gives a lot of extra digits, most of which are junk. What's really needed is to check the outputs from converts that use a lot of significant figures, when using the different scale values. I've done a little of that on a local computer, and have noticed one irritating difference. Lots of other differences may be apparent if a systematic test were conducted, but the big question is what value of "sigfig" or "precision" to use in such a test. In the vast majority of cases, six significant figures is more than would be meaningful, but might there be some cases where 12 figures or more is needed?
The difference I noticed follows:
Convert knots to kilometers per hour using the module.
{{convert|37.5|kn|km/h|1}}
outvalue = invalue * (inscale / outscale)

Using "%.30g":
"kn" scale = 0.514444444444444481945311054005
"km/h" scale = 0.277777777777777790113589162502
outvalue = 37.5 * (0.514444444444444481945311054005 / 0.277777777777777790113589162502)
= 69.45
= 69.5 (rounded to 1 digit)

Using "tostring":
"kn" scale = 0.51444444444444
"km/h" scale = 0.27777777777778
outvalue = 37.5 * (0.51444444444444 / 0.27777777777778)
= 69.449999999999
= 69.4 (rounded to 1 digit)
The above calculations can be confirmed by editing any module. In the "Debug console" at the bottom, paste the first of the following lines, press Enter, and wait a few seconds for the result to appear; repeat for the second line.
= 37.5 * (0.514444444444444481945311054005 / 0.277777777777777790113589162502)
= 37.5 * (0.51444444444444 / 0.27777777777778)
The current template gives 69.5, as does the module using "%.30g":
- {{convert|37.5|kn|km/h|1}} → 37.5 knots (69.5 km/h)
- {{convert/sandboxlua|37.5|kn|km/h|1}} → 37.5 knots (69.5 km/h)
I find it a bit irritating that switching the module from "%.30g" to "tostring" would mean the module gets 69.4. I understand that the result is just a side effect of how floating point arithmetic works, but it's irritating! Using tostring means Module:Convert/data would be "cleaner" (and nearly 2000 bytes shorter), but it also means there would be some tiny loss of precision in some cases. My inclination is to put up with the silliness of the excess digits and not worry about changing the module to use "tostring". Johnuniq (talk) 09:21, 5 December 2013 (UTC)
- If you change "%.30g" to "%.16g" (or maybe it's 17), I think you get all of the precision that actually exists without appending a long string of garbage. A floating point double can only represent 16 decimal digits after all. For whatever reason, tostring seems to give 14 digits.
- Trying with 16 digits:
= 37.5 * (0.5144444444444444 / 0.2777777777777778)
- Seems to give what you expect in the console. Dragons flight (talk) 18:38, 5 December 2013 (UTC)
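The comparison in this section can be reproduced with a short script. It is a sketch that assumes the two scales are defined as 1852/3600 and 1000/3600 (metres per second); it writes each scale out with a given format, reads it back, and then applies outvalue = invalue * (inscale / outscale). The function name convert_with is hypothetical.

```lua
-- Assumed definitions of the scales, in metres per second.
local kn, kmh = 1852 / 3600, 1000 / 3600

-- Simulate makeunits writing the scales as text with the given format,
-- and the convert module reading them back before converting 37.5 knots.
local function convert_with(fmt)
    local inscale = tonumber(string.format(fmt, kn))
    local outscale = tonumber(string.format(fmt, kmh))
    return 37.5 * (inscale / outscale)
end

for _, fmt in ipairs({ '%.14g', '%.17g' }) do
    local out = convert_with(fmt)
    -- Print the raw result and the value rounded to one decimal place.
    print(fmt, string.format('%.15g', out), string.format('%.1f', out))
end
```

With 14 digits the stored scales differ from the computed ones, and the result lands just below 69.45, so it rounds down to 69.4. With 17 digits the text parses back to exactly the original doubles, so nothing is lost by storing the scale as a decimal string.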
Alternative Suggestion
When the scale value in Module:Convert/documentation/conversion_data/doc is made up of one number divided by another, instead of pre-calculating it, why not store both values in two params, scale & scalediv, for example:
["km/h"] = {
    name1 = "kilometre per hour",
    name1_us = "kilometer per hour",
    name2 = "kilometres per hour",
    name2_us = "kilometers per hour",
    symbol = "km/h",
    utype = "speed",
    scale = 10,
    scalediv = 36,
    default = "mph",
    link = "Kilometres per hour",
},
Would solve the rounding issues but would mean changing the main module. -- WOSlinker (talk) 11:31, 5 December 2013 (UTC)
- I restored a bunch of text above that I had inadvertently deleted earlier.
- That's a great idea. Later I'll ponder what would be involved as there are several quite weird expressions currently used in the scale definitions. One concern is that the data module is already abusing the server, and I'm reluctant to add a lot more fields—but there probably aren't all that many. I'll check that later. Johnuniq (talk) 11:55, 5 December 2013 (UTC)
- Commonly called numerator⁄denominator or num⁄den. -DePiep (talk) 13:01, 5 December 2013 (UTC)
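A minimal sketch of how the main module might consume the two fields, assuming scalediv is optional and defaults to 1. The km/h values are from the example above; the kn and m/s entries and the function names are hypothetical.

```lua
-- Unit records with an optional divisor, as proposed above.
local units = {
    ['km/h'] = { symbol = 'km/h', utype = 'speed', scale = 10, scalediv = 36 },
    -- Hypothetical entries for illustration only.
    ['kn'] = { symbol = 'kn', utype = 'speed', scale = 1852, scalediv = 3600 },
    ['m/s'] = { symbol = 'm/s', utype = 'speed', scale = 1 },
}

-- The division happens at run time, so no decimal string is ever stored.
local function effective_scale(unit)
    return unit.scale / (unit.scalediv or 1)
end

local function convert(value, in_code, out_code)
    local inscale = effective_scale(units[in_code])
    local outscale = effective_scale(units[out_code])
    return value * (inscale / outscale)
end

print(convert(37.5, 'kn', 'km/h'))
```

The data module stays exact and readable, at the price of one extra division per unit at conversion time and an extra field in some unit records.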
Conclusion
Thanks all for the suggestions. I have done some experiments to show that using "%.17g" (instead of "%.30g") as recommended above by Dragons flight does the job. Interestingly, using "%.30g" on my computer gives the same results as "%.17g" does when run on the server (but "%.30g" on the server generates excess digits in many scales). Module:Convert/data now has the cleaned scale values, and my tests show they work ok.
I did some work on WOSlinker's idea because that appeals to me. The idea I was following was to have makeunits pass either "scale" as a number, or "_scale" as a string with a simplified expression. Then, I was going to put a simplified evaluation function in Module:Convert to generate the numeric scale from the string _scale which would have text like "5/9". I wrote a very short piece of code to do the evaluation, but then I came to my senses and adopted the clean approach of just using "%.17g". The reason I was thinking of evaluating expressions is that such a system would automatically adapt to any server upgrades that may occur in a decade when floating point produces 50 significant digits, or whatever. However, simplicity is good. Johnuniq (talk) 03:10, 6 December 2013 (UTC)
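For the record, the abandoned "_scale" approach could look something like the sketch below. It is not the code that was written at the time; the function name eval_scale and the decision to support only a plain number or a single "a/b" quotient are assumptions made for illustration.

```lua
-- Hypothetical evaluator for a simplified scale expression.
-- Accepts a plain number ("0.3048") or one quotient ("5/9").
-- Returns nil for anything else, including division by zero.
local function eval_scale(text)
    local num, den = string.match(text,
        '^%s*([%d.eE+-]+)%s*/%s*([%d.eE+-]+)%s*$')
    if num then
        num, den = tonumber(num), tonumber(den)
        if num and den and den ~= 0 then
            return num / den
        end
        return nil
    end
    return tonumber(text)
end

print(eval_scale('5/9'))
print(eval_scale('400/121'))
print(eval_scale('0.3048'))
```

Evaluating at load time would always use the full precision of whatever server runs the module, but every page load pays for the parsing, which is the trade-off that led to storing "%.17g" strings instead.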