
Module:DecodeEncode/doc

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by DePiep (talk | contribs) at 23:56, 25 December 2020 (Decode (© → ©)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Implements the Lua functions mw.text.decode and mw.text.encode in a module.

{{#invoke:decode|s=Source text}} → Source text

See List of XML and HTML character entity references.

Decode (&copy; → ©)

Decodes named entities from the entity name into a regular (Unicode) character:
&copy; --> ©
&gt; --> >

All well-defined named entities are decoded (the HTML named character references; formally, those defined in the PHP table).

A regular, rendered sentence:
"At 100 °F, & with a "burning" sun above, we walked"
In code:
"At 100 °F, &amp; with a &quot;burning&quot; sun above, we walked" -- in code
Processing:
{{#invoke:decodeEncode|decode|s=At 100 °F, &amp; with a &quot;burning&quot; sun above, we walked}}
At 100 °F, & with a "burning" sun above, we walked -- In code: no named entities
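
To see the same transformation outside wikitext, here is a minimal sketch in Python. It is not the module's implementation. It assumes that Python's standard html.unescape is a fair stand-in for decoding all named entities; html.unescape uses the HTML5 entity table, which may differ in detail from the PHP table mentioned above. The function name decode_all is hypothetical.

```python
import html


def decode_all(s: str) -> str:
    """Replace every named or numeric character reference with its character.

    Assumption: html.unescape approximates mw.text.decode with all named
    entities enabled; the two entity tables are not guaranteed identical.
    """
    return html.unescape(s)


source = 'At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we walked'
print(decode_all(source))
# prints: At 100 °F, & with a "burning" sun above, we walked
```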

Decode a reduced set only

By setting |subset_only=true, only these five entity names are decoded: '&lt;', '&gt;', '&amp;', '&quot;', '&nbsp;' (that is, into '<', '>', '&', '"', ' ').

Note: This parameter differs from the corresponding Lua parameter. (This matters only if you also work directly with the Lua function mw.text.decode.) The Lua documentation defines the parameter |decodeNamedEntities= with this effect: when it is omitted or false, only the reduced set of entities is recognized and decoded. The meaning of 'false' is inverted for |subset_only=: |decodeNamedEntities=false is equivalent to |subset_only=true.
Also, this module ignores the "omitted" logic: |subset_only= must be set explicitly to 'true' to take effect.
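
The reduced set and the inverted flag can be modelled in a few lines of Python. This is a sketch only, not the module's code: the names SUBSET, decode_subset and decode_named_entities_flag are hypothetical, and it assumes that only the literal string 'true' switches the reduced set on.

```python
import re
from typing import Optional

# The five entities of the reduced set, as listed in the text above.
SUBSET = {
    "&lt;": "<",
    "&gt;": ">",
    "&amp;": "&",
    "&quot;": '"',
    "&nbsp;": "\u00a0",  # non-breaking space
}

_SUBSET_RE = re.compile("|".join(re.escape(name) for name in SUBSET))


def decode_subset(s: str) -> str:
    """Decode only the five reduced-set entities in a single pass."""
    return _SUBSET_RE.sub(lambda m: SUBSET[m.group(0)], s)


def decode_named_entities_flag(subset_only: Optional[str]) -> bool:
    """Map |subset_only= to the value of Lua's decodeNamedEntities.

    Assumption: only an explicit 'true' selects the reduced set; an omitted
    or any other value means that all named entities are decoded.
    """
    return subset_only != "true"


print(decode_subset("&lt;b&gt; &copy; &amp;"))
# prints: <b> &copy; &
```

Because the substitution runs in a single pass, a string such as &amp;lt; becomes &lt; and is not decoded a second time.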

Encode (© → &copy;)

The function encode encodes some characters that have an entity name into that name (for example: & → &amp;).
A regular sentence:
"At 100 °F, & with a "burning" sun above, we walked"; to process:
{{#invoke:decodeEncode|encode|s=At 100 °F, & with a "burning" sun above, we walked}}
→ "At 100 °F, &amp; with a &quot;burning&quot; sun above, we walked" -- in code
Per the Lua documentation, only a small set of characters is processed. The character set can be changed with |charset=.
Format: the escape character is '\' (backslash, not "%"). Examples: |charset=<>" \'& (the default), |charset=<>\|\°\"\'\&©. Characters not in the default set are replaced by their decimal numeric entity: © → &#169; (a decimal number, not hexadecimal and not the named &copy;).
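
The encoding rule can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the module's code: the names NAMED, DEFAULT_CHARSET, parse_charset and encode are hypothetical, a backslash is assumed simply to make the next character literal, and the blank in the default set is assumed to be the non-breaking space.

```python
# Characters written as named entities; any other character in the charset
# is written as a numeric entity with its decimal code point.
NAMED = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;", "\u00a0": "&nbsp;"}
DEFAULT_CHARSET = '<>"\u00a0\'&'


def parse_charset(spec: str) -> set:
    """Turn a |charset= value into a set of characters.

    Assumption: a backslash makes the following character literal.
    """
    chars, i = set(), 0
    while i < len(spec):
        if spec[i] == "\\" and i + 1 < len(spec):
            i += 1  # skip the backslash, keep the escaped character
        chars.add(spec[i])
        i += 1
    return chars


def encode(s: str, charset: str = DEFAULT_CHARSET) -> str:
    """Encode only the characters in charset; leave everything else as is."""
    wanted = parse_charset(charset)
    return "".join(
        NAMED.get(ch, f"&#{ord(ch)};") if ch in wanted else ch
        for ch in s
    )


print(encode('At 100 °F, & with a "burning" sun above, we walked'))
# prints: At 100 °F, &amp; with a &quot;burning&quot; sun above, we walked
print(encode("© & °", r'<>\|\°\"\'\&©'))
# prints: &#169; &amp; &#176;
```

With the default set the degree sign passes through unchanged; it is encoded only once it is added to the charset.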

Template

As of Dec 2020, there are no templates implementing this module.

See also