Help:Searching/Regex/Sandboxing
An ad hoc sandbox is any edit page you don't save, but do use for previewing. The intent is to employ a search link on itself. Then the search link template acts as a doorway by helping to develop a database query before running it on the wiki, and can also be used to share such discoveries.
Regular expressions are little computer programs, so regex searches must always be tested to achieve their potential precision and thoroughness. But only a few of these intensive searches are technically able to run at a time against the database. A sandbox minimizes your footprint, and guarantees that you will never run an untested regexp on every namespace in the wiki, even if your default search would let you do that. It does so by using filters to limit the search domain. The first domain it targets is its own page in an ad hoc sandbox. Once your regexp pattern is honed, you can safely increase the search domain.
Sandboxing procedure
Regexp searches are restricted on the server, so the sandbox method reduces the regex search footprint by using the prefix:{{FULLPAGENAME}} filter every time. The prefix: filter can also filter a namespace by specifying that only page names that start with given letters are searched.
When using insource:/regex/, always use a filter. Filters include:
- intitle:
- incategory:
- hastemplate:
- prefix:
- linksto:
- and another insource:
Namespace plus pagename equals fullpagename. Knowing this, you can adjust your prefix parameter. Prefix doesn't always need the fullpagename that this sandboxing procedure starts with. It does need a full namespace name and a colon; after that it accepts the beginning letter(s) of the pagename, if you want to narrow the search domain from a full namespace.
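A query of this shape can be assembled mechanically. The following is a minimal sketch, not part of the search link template: the helper names `build_query` and `build_url` are hypothetical, and the base URL and the `search`/`fulltext` parameters are assumptions about a typical MediaWiki search URL. The prefix filter is placed last, as in the examples on this page.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; substitute your own wiki's index.php.
BASE_URL = "https://en.wikipedia.org/w/index.php"


def build_query(regex: str, prefix: str) -> str:
    """Combine an insource regex with a prefix filter (filter goes last)."""
    return f"insource:/{regex}/ prefix:{prefix}"


def build_url(query: str) -> str:
    """Percent-encode the query into a search URL (assumed parameter names)."""
    return BASE_URL + "?" + urlencode({"search": query, "fulltext": "1"})


# Narrowest domain: the sandbox page itself (namespace + pagename).
sandbox = build_query("fmt", "Help:Searching/Regex/Sandboxing")
# Wider domain: every Help page whose name starts with "S".
wider = build_query("fmt", "Help:S")

print(sandbox)  # insource:/fmt/ prefix:Help:Searching/Regex/Sandboxing
print(wider)    # insource:/fmt/ prefix:Help:S
print(build_url(wider))
```

Only the prefix value changes between the two queries, which is the point of starting from the fullpagename.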
The procedure here is an iterative, read-evaluate-modify cycle.
- Find an existing fullpagename with the wikitext instances you are interested in targeting. Or create one yourself, and save it to the database so the query will find it.
- Open the wikitext, and enter a search link with an insource parameter and a prefix parameter with the fullpagename.
- Show Preview, and check the pattern. Activate the search link. Note the bold text in each match.
- Go back in your browser. Modify the regexp. Cycle. (Or don't go back; you may need to make a major reset at the complete query.)
- Expand the search domain, and test the accuracy of those results. You can trim the number of the results, at the complete query, by using only the first letter(s) of the pagenames in a namespace.
Caveat emptor: if you change the target, you'll have to save and purge, but not if you just change the regexp.
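The last step of the cycle, expanding the search domain, can be pictured as a short list of prefix values ordered from narrow to wide. This is a rough sketch only: the function name `widening_prefixes` and its `letters` parameter are hypothetical, and it assumes the fullpagename contains a namespace followed by a colon.

```python
def widening_prefixes(fullpagename: str, letters: int = 1) -> list[str]:
    """Return prefix values from narrowest (one page) to widest (a namespace).

    Assumes fullpagename looks like "Namespace:Pagename".
    """
    namespace, _, pagename = fullpagename.partition(":")
    return [
        fullpagename,                         # the sandbox page itself
        f"{namespace}:{pagename[:letters]}",  # pages sharing the first letter(s)
        f"{namespace}:",                      # the whole namespace
    ]


# The regexp stays the same; only the search domain grows at each stage.
for prefix in widening_prefixes("Help:Searching/Regex/Sandboxing"):
    print(f"insource:/fmt/ prefix:{prefix}")
```

Each stage should be run only after the results of the previous one look accurate.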
Examples
As an ad hoc sandbox, you can show the wikitext of a section like this, already saved in the database, with template calls on it, modify some patterns, do a Show Preview, and see what matches when you click on the newly formed "search the database" link, all quite safely, and without changing a thing in the database.
The template calls that produce "1 ft/s, 2 sq ft, 3 m/s, 4 m*s-2, 5 ft.s-2, 6 °C/J, and 7 J/C" appear in the wikitext of this section like this:
- {{val|1|ul=ft/s|fmt = commas}}
- {{val|2|u=ft2}}
- {{val|3|u=m/s| fmt =commas }}
- {{val|4|u=m*s-2}}
- {{val|5|u=ft.s-2}}
- {{val|6|u=C/J}}
- {{val|7|ul=J/C}} → 7 J/C
Note how the above targets are |numbered|, then click on these links.
Query | Search link | Answer |
---|---|---|
Q1. Does this page actually employ template Val (outside any noinclude tags or comments)? | {{search link|hastemplate: VaL prefix:Help:Searching/Regex/Sandboxing}} | A. Yes, because its title shows in the search results. |
Q2. Does this page use Val's fmt parameter? | {{search link|insource:/=fmt/}} | A. Look for 1 and 3 in the search results in bold text. |
Q3. Who uses u=ft OR ul=ft? (a one-letter difference) | {{search link|insource:/=ul?=ft/}} | A. Look for 1, 2, and 5 in bold text. |
Q4. AND of these, who also uses fmt=commas after that? | {{search link|insource:/=ul?=ft.*commas/}} | A. No context shown, but the article title is shown. Half a bug? |
Who has one space before commas? | {{search link|insource:/=. commas/}} | A. 1 but not 2. |
Q5. Who uses either ul?=ft OR fmt=commas? | {{search link|insource:/=(ul?=ft|=co)/}} | A. 1, 2, 3, and 5. |
Q6. Who uses ft or m, in |u= or |ul= ? | {{search link|insource:/= ul?=(ft|m)/}} | A. 1, 2, 3, 4, and 5. |
Q7. Who uses . or * in the unit code? | {{search link|insource:/=(\.|\*) /}} | A. 4 and 5. |
Who uses a pipe? | {{search link|insource:/=\| /}} | A. All of them. |
Q8. Who uses / or - within the |u= or |ul= parameter? | {{search link|insource:/=ul?=[^|}]+(\/|-)/}} | A. 1, 3, 4, 5, 6, and 7. |
Q9. Where is Val used in the template namespace with u or ul? | {{search link|insource:=ul}} | A. In the 15 or so articles listed. |
Q10. | {{search link|insource:/[^.0-9][0-9]\|-\| prefix::/|Who converts single digits using a dash?}} | A. Around 11. |
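The numbered template calls above can also be checked offline before any search link is clicked. The sketch below is an illustration only. It assumes Python's `re` module as a stand-in for the wiki's regex engine, whose syntax differs in places, so a pattern that passes here still needs the sandbox test. The patterns and the helper name `matching` are hypothetical, written for this sketch rather than copied from the search links.

```python
import re

# The seven numbered targets, as they appear in the wikitext of this section.
calls = {
    1: "{{val|1|ul=ft/s|fmt = commas}}",
    2: "{{val|2|u=ft2}}",
    3: "{{val|3|u=m/s| fmt =commas }}",
    4: "{{val|4|u=m*s-2}}",
    5: "{{val|5|u=ft.s-2}}",
    6: "{{val|6|u=C/J}}",
    7: "{{val|7|ul=J/C}}",
}


def matching(pattern: str) -> list[int]:
    """Return the numbers of the targets in which the pattern is found."""
    rx = re.compile(pattern)
    return [n for n, text in calls.items() if rx.search(text)]


print(matching(r"fmt"))                # [1, 3]
print(matching(r"\|ul?=ft"))           # [1, 2, 5]
print(matching(r"\|ul?=(ft|m)"))       # [1, 2, 3, 4, 5]
print(matching(r"\|ul?=[^|}]+(/|-)"))  # [1, 3, 4, 5, 6, 7]
# Spaces inside a parameter are part of the text, so allow for them explicitly.
print(matching(r"fmt *= *commas"))     # [1, 3]
```

Keeping the targets numbered makes it easy to compare the local result with the bold text in the real search results.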
In Q2, notice how the MediaWiki software ignores the spaces around parameters, but how in Q4 the same MediaWiki software processes the spaces inside parameters. Q2 might have been solved with a plain insource:val fmt search because "fmt" and "val" are whole words, and fmt is rarely seen apart from inside Val. How about hastemplate:val insource:fmt?