User:Gechy/lua-scripting

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Gechy (talk | contribs) at 08:19, 2 February 2021 (Site library). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Overview

Lua is a programming language available on Wikipedia, with some substantial restrictions, via the Scribunto extension. Its purpose is to let you process the data available on Wikipedia content pages and display that information in customized ways. Lua has been supported as a scripting language on all Wikimedia Foundation sites since March 2013. See also [tutorial].

The Scribunto (Latin: "they shall write/let them write (in the future)") extension allows for embedding scripting languages in MediaWiki.

Currently the only supported scripting language is Lua.

Why Lua Scripting

Templates and ParserFunctions were introduced to allow end-users of MediaWiki to replicate content easily and build tools using basic logic, effectively turning wikitext into a limited programming language.

This project aims to make it possible for MediaWiki end-users to use a proper scripting language that will be more powerful and efficient than ad-hoc ParserFunctions-based logic.

Initially, MediaWiki templates were pieces of wikitext that were substituted into pages instead of being copy-pasted. By 2005, any other use was rare and, to some extent, controversial. In 2006, ParserFunctions were enabled, allowing users to use constructions such as {{#if}} and {{#switch}}, essentially turning wikitext into a purely functional programming language (i.e., a language that has no concept of state at any level: one part of the code may not affect any other part, it can only change its own output). This eventually caused several problems, including performance (some pages are overloaded with templates and require 40 seconds or more to parse/render) and readability (just take a look at this).

Complex templates in particular have caused performance issues and bottlenecks.

Installation and configuration

Scribunto comes bundled with Lua binary distributions for Linux (x86 and x86-64), Mac OS X Lion, and Windows (32- and 64-bit). For detailed explanation see mw:Extension:Scribunto.

Check out the Lua.org demo if you don't want to download Scribunto just yet.

Deep dive on Lua (Scribunto) Scripting

Setting up

The software used by Wikipedia, called MediaWiki, has an extension that provides a version of Lua that can be used within Wikipedia pages. The extension is called Scribunto.

On a MediaWiki wiki with Scribunto enabled, create a page with a title starting with "Module:", for example "Module:Bananas". Into this new page, copy the following text:

local p = {} -- p stands for package

function p.hello( frame )
    return "Hello, world!"
end

return p

Save that, then on another (non-module) page, type the following, replacing "Bananas" with the name of your module:

{{#invoke:Bananas|hello}}

The "hello" function is exported from the module, and the result of the function is returned.

It's generally a good idea to invoke Lua code from the context of a template. This means that from the perspective of a calling page, the syntax is independent of whether the template logic is implemented in Lua or in wikitext.

For a practical example of setting up, see Setting up really fast.

Working with modules

The module itself must return a Lua table containing the functions that may be called by {{#invoke:}}. Generally, as shown above, a local variable is declared holding a table, functions are added to this table, and the table is returned at the end of the module code.

Any functions that are not added to this table, whether local or global, will not be accessible by {{#invoke:}}, but globals might be accessible from other modules loaded using require(). It is generally good style for the module to declare all functions and variables local.
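
As a minimal sketch of the distinction described above, the module below keeps one helper private and exports one function. The names shout and greet are hypothetical and chosen only for illustration.

```lua
local p = {}

-- Private helper: declared local and never added to p,
-- so {{#invoke:}} cannot reach it.
local function shout(text)
    return string.upper(text) .. "!"
end

-- Exported function: stored in p, so {{#invoke:Bananas|greet}} could call it.
function p.greet(frame)
    return shout("hello, world")
end

-- On the Module: page itself, the final line would be: return p
```

Here only greet would be visible to {{#invoke:}}; shout stays an implementation detail that can be renamed or removed without affecting calling pages.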

Wrap up this section by creating your own Lua module.

Accessing parameters from wiki

Functions called by {{#invoke:}} will be passed a single parameter, that being a frame object. To access the parameters passed to the {{#invoke:}}, code will typically use the args table of that frame object. It's also possible to access the parameters passed to the template containing the {{#invoke:}} by using frame:getParent() and accessing that frame's args.

local p = {}
function p.hello(frame)
    return 'Hello, my ' .. frame.args[1] .. ' is ' .. frame.args[2]
end
return p

Such a function can access the frame object to get access to the parameters that the template was invoked with.
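
The two sources of parameters can be combined. The sketch below, which assumes a hypothetical helper named getArg, looks first at the arguments given to {{#invoke:}}, then at those of the enclosing template, and finally falls back to a default so that a missing argument does not cause a concatenation error.

```lua
local p = {}

-- Hypothetical helper: returns the first non-empty value of `key`, checking
-- the {{#invoke:}} arguments first and the enclosing template's second.
local function getArg(frame, key, default)
    local value = frame.args[key]
    if value == nil or value == "" then
        local parent = frame:getParent()
        if parent then
            value = parent.args[key]
        end
    end
    if value == nil or value == "" then
        return default
    end
    return value
end

function p.hello(frame)
    local what = getArg(frame, 1, "name")
    local who = getArg(frame, 2, "unknown")
    return "Hello, my " .. what .. " is " .. who
end

-- On the Module: page itself, the final line would be: return p
```

Whether direct or template arguments should take precedence is a design choice; this sketch prefers the direct ones.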

For detailed examples, see passing information to your Lua module.

Loops and tables

Lua libraries

MediaWiki libraries

All Scribunto libraries are located in the table mw.

Base functions

mw.allToString (mw.allToString( ... ))

Calls tostring() on all arguments, then concatenates them with tabs as separators.
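
The behaviour described above can be approximated in plain Lua. The function below is a hypothetical stand-in written for illustration, not Scribunto's own implementation.

```lua
-- Hypothetical stand-in for mw.allToString, for illustration only.
local function allToString(...)
    local count = select("#", ...)  -- counts nil arguments too
    local parts = {}
    for i = 1, count do
        -- The extra parentheses keep only the i-th argument.
        parts[i] = tostring((select(i, ...)))
    end
    return table.concat(parts, "\t")
end
```

Using select("#", ...) rather than the length operator means that nil arguments in the middle of the list are still converted and joined.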

mw.clearLogBuffer (mw.clearLogBuffer())

Removes all data logged with mw.log().

mw.clone (mw.clone( value ))

Creates a deep copy of a value. All tables (and their metatables) are reconstructed from scratch. Functions are still shared, however.
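
To illustrate what a deep copy means here, the sketch below uses a hypothetical deepCopy function that follows the description above: tables and their metatables are rebuilt, while functions are shared. It is an assumption-laden stand-in, not the code behind mw.clone.

```lua
-- Hypothetical stand-in for mw.clone, for illustration only.
local function deepCopy(value, seen)
    if type(value) ~= "table" then
        return value  -- numbers, strings, functions etc. are returned as-is
    end
    seen = seen or {}
    if seen[value] then
        return seen[value]  -- already copied: preserves shared and cyclic references
    end
    local copy = {}
    seen[value] = copy
    for k, v in pairs(value) do
        copy[deepCopy(k, seen)] = deepCopy(v, seen)
    end
    local mt = getmetatable(value)
    if mt then
        setmetatable(copy, deepCopy(mt, seen))
    end
    return copy
end

local original = { list = { 1, 2, 3 }, greet = function() return "hi" end }
local copy = deepCopy(original)
copy.list[1] = 99  -- changes the copy only; original.list[1] is still 1
```

After the last line, original.list and copy.list are different tables, but original.greet and copy.greet are the same function value.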

mw.executeFunction (mw.executeFunction( func ))

This creates a new copy of the frame object, calls the function with that as its parameter, then calls tostring() on all results and concatenates them (no separator) and returns the resulting string.

Note this will not work correctly from the debug console, as there is no frame object to copy.

mw.executeModule (mw.executeModule( func ))

Executes the function in a sandboxed environment; the function cannot affect anything in the current environment, with the exception of side effects of calling any existing closures.

The name "executeModule" is because this is the function used when a module is loaded from the Module: namespace.

Note this will not work correctly from the debug console, as there is no frame object to copy.

mw.getCurrentFrame (mw.getCurrentFrame())

Returns the current frame object.

Note this will not work correctly from the debug console, as there is no frame object available there.

mw.getLogBuffer (mw.getLogBuffer())

Returns the data logged by mw.log(), as a string.

mw.incrementExpensiveFunctionCount (mw.incrementExpensiveFunctionCount())

Adds one to the "expensive parser function" count, and throws an exception if it exceeds the limit (see $wgExpensiveParserFunctionLimit).

mw.loadData (mw.loadData( module ))

Sometimes a module needs large tables of data; for example, a general-purpose module to convert units of measure might need a large table of recognized units and their conversion factors. And sometimes these modules will be used many times on one page. Parsing the large data table for every {{#invoke:}} can use a significant amount of time. To avoid this issue, mw.loadData() is provided.

mw.loadData works like require(), with the following differences:

  • The loaded module is evaluated only once per page, rather than once per {{#invoke:}} call.
  • The loaded module is not recorded in package.loaded.
  • The value returned from the loaded module must be a table. Other data types are not supported.
  • The returned table (and all subtables) may contain only booleans, numbers, strings, and other tables. Other data types, particularly functions, are not allowed.
  • The returned table (and all subtables) may not have a metatable.
  • All table keys must be booleans, numbers, or strings.
  • The table actually returned by mw.loadData() has metamethods that provide read-only access to the table returned by the module. Since it does not contain the data directly, pairs() and ipairs() will work but other methods, including #value, next(), and the functions in the Table library, will not work correctly.

The hypothetical unit-conversion module mentioned above might store its code in "Module:Convert" and its data in "Module:Convert/data", and "Module:Convert" would use local data = mw.loadData( 'Module:Convert/data' ) to efficiently load the data.
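
A rough sketch of how that split might look is given below. The unit table and the convert function are invented for illustration, and the data is defined inline so that the snippet runs on its own; on a wiki it would be loaded with mw.loadData as noted in the comment.

```lua
-- Stand-in for the table that "Module:Convert/data" might return.
-- On a wiki this would instead read:
--   local data = mw.loadData( 'Module:Convert/data' )
-- The table holds only strings, numbers and subtables, as mw.loadData requires.
local data = {
    units = {
        km = { name = "kilometre", toMetres = 1000 },
        mi = { name = "mile", toMetres = 1609.344 },
    },
}

local p = {}

-- Converts `amount` between two unit codes using the lookup table.
-- Returns nil plus a message when either unit is not recognised.
function p.convert(amount, from, to)
    local source, target = data.units[from], data.units[to]
    if not source or not target then
        return nil, "unknown unit"
    end
    return amount * source.toMetres / target.toMetres
end
```

Because the code only reads fields by key, it works equally well with the read-only proxy table that mw.loadData returns.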

Global modules

Modules containing tables can also be retrieved from a global repository (specifically dev.wikia.com). This uses a syntax such as:

local data = mw.loadData( 'Dev:Convert/data' )

The code above will load a table from a module stored in dev.wikia.com/wiki/Module:Convert/data (if it exists). Note: This is case sensitive.

mw.log (mw.log( ... ))

Passes the arguments to mw.allToString(), then appends the resulting string to the log buffer.

In the debug console, the function print() is an alias for this function.


Frame object

The frame object is the interface to the parameters passed to {{#invoke:}}, and to the parser.

frame.args

A table for accessing the arguments passed to the frame. For example, if a module is called from wikitext with

{{#invoke:module|function|arg1|arg2|name=arg3}}

then frame.args[1] will return "arg1", frame.args[2] will return "arg2", and frame.args['name'] (or frame.args.name) will return "arg3". It is also possible to iterate over arguments using pairs( frame.args ) or ipairs( frame.args ).

Note that values in this table are always strings; tonumber() may be used to convert them to numbers, if necessary. Keys, however, are numbers even if explicitly supplied in the invocation: {{#invoke:module|function|1|2=2}} gives string values "1" and "2" indexed by numeric keys 1 and 2.
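
Since the values always arrive as strings, numeric parameters need an explicit conversion. The sketch below assumes a hypothetical helper called numberArg that also supplies a default when the argument is missing or not numeric.

```lua
-- Hypothetical helper: reads argument `key` as a number, falling back to
-- `default` when the argument is missing or is not a valid number.
local function numberArg(frame, key, default)
    return tonumber(frame.args[key]) or default
end

local p = {}

-- {{#invoke:ModuleName|double|21}} would be expected to return "42".
function p.double(frame)
    return tostring(numberArg(frame, 1, 0) * 2)
end

-- On the Module: page itself, the final line would be: return p
```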

As in MediaWiki template invocations, named arguments will have leading and trailing whitespace removed from both the name and the value before they are passed to Lua, whereas unnamed arguments will not have whitespace stripped.

For performance reasons, frame.args uses a metatable rather than directly containing the arguments. Argument values are requested from MediaWiki on demand. This means that most other table methods will not work correctly, including #frame.args, next( frame.args ), and the functions in the Table library.
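
One way around this limitation is to copy the arguments into an ordinary table with pairs(), which does work on frame.args. The sketch below uses a hypothetical helper named collectArgs; it also trims positional values, since those are passed with their whitespace intact.

```lua
-- Hypothetical helper: copies frame.args into an ordinary table, trimming
-- positional values, and returns the copy with the number of entries copied.
local function collectArgs(frame)
    local args, count = {}, 0
    for key, value in pairs(frame.args) do
        if type(key) == "number" then
            value = value:match("^%s*(.-)%s*$")  -- trim surrounding whitespace
        end
        args[key] = value
        count = count + 1
    end
    return args, count
end
```

The returned table is a plain Lua table, so the length operator, next() and the Table library functions behave as usual on it.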

If preprocessor syntax such as template invocations and triple-brace arguments is included within an argument to #invoke, it will be expanded before being passed to Lua. If certain special tags written in XML notation, such as <pre>, <nowiki>, <gallery> and <ref>, are included as arguments to #invoke, then these tags will be converted to "strip markers" (special strings which begin with a delete character, ASCII 127), to be replaced with HTML after they are returned from #invoke.

frame:getParent (frame:getParent())

Called on the frame created by {{#invoke:}}, returns the frame for the page that called {{#invoke:}}. Called on that frame, returns nil. This lets you just put {{#invoke:ModuleName|method}} inside a template and the parameters passed to the template (i.e. {{Hello|we|are|foo=Wikians}}) will be passed straight to the Lua module, without having to include them directly (so, you don't have to do {{#invoke:ModuleName|method|{{{1|}}}|{{{2|}}}|{{{foo|}}}}}).

Example:

  • Module:Hello

local p = {}

function p.hello( frame )
	return "Hello, " .. frame:getParent().args[1] .. "!"
end

return p

  • Template:Hello

{{#invoke:Hello|hello}}

  • Article

{{Hello|Fandom}}

  • This will output "Hello, Fandom!".

frame:expandTemplate (frame:expandTemplate{ title=template, args=table })

Note the use of named args syntactic sugar; see Function calls for details.

This is transclusion. The call frame:expandTemplate{ title = 'template', args = { 'arg1', 'arg2', name = 'arg3' } } does roughly the same thing from Lua that {{template|arg1|arg2|name=arg3}} does in wikitext. As in transclusion, if the passed title does not contain a namespace prefix it will be assumed to be in the Template: namespace.

Note that the title and arguments are not preprocessed before being passed into the template:

-- This is roughly equivalent to wikitext like
-- {{template|{{!}}}}
frame:expandTemplate{ title = 'template', args = { '|' } }

-- This is roughly equivalent to wikitext like
-- {{template|{{((}}!{{))}}}}
frame:expandTemplate{ title = 'template', args = { '{{!}}' } }

frame:preprocess

This can be called as frame:preprocess( string ) or frame:preprocess{ text = string }. It expands wikitext in the context of the frame, i.e. templates, parser functions, and parameters such as {{{1}}} are expanded. Certain special tags written in XML-style notation, such as <pre>, <nowiki>, <gallery> and <ref>, will be replaced with "strip markers" (special strings which begin with a delete character, ASCII 127), to be replaced with HTML after they are returned from {{#invoke:}}.

If you are expanding a single template, use frame:expandTemplate instead of trying to construct a wikitext string to pass to this method. It's faster and less prone to error if the arguments contain pipe characters or other wiki markup.

local p = {} 

function p.hello( frame )
	-- This will preprocess the wikitext and expand the template {{foo}}
	return frame:preprocess( "'''Bold''' and ''italics'' is {{Foo}}" )
end

return p

frame:getArgument

This can be called as frame:getArgument( arg ) or frame:getArgument{ name = arg }.

It gets an object for the specified argument, or nil if the argument is not provided.

The returned object has one method, object:expand(), that returns the expanded wikitext for the argument.

local p = {} 

function p.hello( frame )
	-- {{#invoke:ModuleName|hello|''Foo'' bar|{{Foo}}|foo={{HelloWorld}}}}
	local varOne = frame:getArgument( 1 )
	local varTwo = frame.args[2]
	local varThree = frame:getArgument( 'foo' )
	return varOne:expand() .. varTwo .. varThree:expand()
end

return p

frame:newParserValue

frame:newParserValue( text ), frame:newParserValue{ text = text }.

Returns an object with one method, object:expand(), that returns the result of frame:preprocess( text ).

frame:newTemplateParserValue

frame:newTemplateParserValue{ title = title, args = table }

Returns an object with one method, object:expand(), that returns the result of frame:expandTemplate called with the given arguments.

frame:argumentPairs

frame:argumentPairs()

Same as pairs( frame.args ).

Included for backwards compatibility.

frame:getTitle

Returns the title associated with the frame as a string.


Language library

Many of MediaWiki's language codes are similar to IETF language tags, but not all MediaWiki language codes are valid IETF tags or vice versa.

Functions documented as mw.language.name are available on the global mw.language table; functions documented as mw.language:name are methods of a language object.

mw.language.fetchLanguageName

mw.language.fetchLanguageName( code )

Returns the full name of the language for the given language code, written in that language itself (the language autonym).

mw.language.getContentLanguage

mw.language.getContentLanguage(), mw.getContentLanguage().

Returns a new language object for the wiki's default content language.

mw.language.isValidBuiltInCode

mw.language.isValidBuiltInCode( code )

Returns true if a language code is of a valid form for the purposes of internal customisation of MediaWiki.

The code may not actually correspond to any known language.

mw.language.isValidCode

mw.language.isValidCode( code )

Returns true if a language code string is of a valid form, whether or not it exists. This includes codes which are used solely for customisation via the MediaWiki namespace.

The code may not actually correspond to any known language.

mw.language.new

mw.language.new( code ), mw.getLanguage( code )

Creates a new language object. Language objects do not have any publicly accessible properties, but they do have several methods, which are documented below.

The methods below must all be called on a language object (e.g. lang).

local lang = mw.language.new( "en" )
local ucText = lang:uc( "En Taro Adun executor" )
mw.log( ucText )

mw.language:getCode

lang:getCode()

Returns the language code for this language object.

mw.language:isRTL

lang:isRTL()

Returns true if the language is written right-to-left, false if it is written left-to-right.
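
A typical use is choosing the value of an HTML dir attribute. The helper below is a hypothetical sketch; it expects to be given a language object such as the one returned by mw.language.new.

```lua
-- Hypothetical helper: `lang` is expected to be a language object,
-- e.g. mw.language.new( 'ar' ).
local function textDirection(lang)
    if lang:isRTL() then
        return "rtl"
    end
    return "ltr"
end
```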

mw.language:lc

lang:lc( s )

Converts the string to lowercase, honouring any special rules for the given language.

When the Ustring library is loaded, the mw.ustring.lower() function is implemented as a call to mw.language.getContentLanguage():lc( s ).

mw.language:lcfirst

lang:lcfirst( s )

Converts the first character of the string to lowercase, as with lang:lc().

mw.language:uc

lang:uc( s )

Converts the string to uppercase, honouring any special rules for the given language.

When the Ustring library is loaded, the mw.ustring.upper() function is implemented as a call to mw.language.getContentLanguage():uc( s ).

mw.language:ucfirst

lang:ucfirst( s )

Converts the first character of the string to uppercase, as with lang:uc().

mw.language:caseFold

lang:caseFold( s )

Converts the string to a representation appropriate for case-insensitive comparison. Note that the result may not make any sense when displayed.
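
As a small sketch of the intended use, the hypothetical helper below compares two strings without regard to case. The folded strings are used only for the comparison and are never displayed.

```lua
-- Hypothetical helper: `lang` is expected to be a language object,
-- e.g. mw.language.getContentLanguage().
local function sameIgnoringCase(lang, a, b)
    return lang:caseFold(a) == lang:caseFold(b)
end
```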

mw.language:formatNum

lang:formatNum( n, options )

Formats a number with grouping and decimal separators appropriate for the given language. Given 123456.78, this may produce "123,456.78", "123.456,78", or even something like "١٢٣٬٤٥٦٫٧٨" depending on the language and wiki configuration.

The optional second parameter is a table of options; setting noCommafy to true prevents the output from containing grouping separators:

local result = lang:formatNum( 123123123, { noCommafy = true } )
-- output: 123123123

mw.language:formatDate

lang:formatDate( format, timestamp, local )

Formats a date according to the given format string. If timestamp is omitted, the default is the current time. The value for local must be a boolean or nil; if true, the time is formatted in the server's local time rather than in UTC.

The timestamp is the date to format. It accepts dates written with either slashes or dashes, e.g. "2015/10/20" or "2015-10-20".

The format string and supported values for timestamp are identical to those for the #time parser function from Extension:ParserFunctions, as shown below:

-- This outputs the current date as year, month and day, separated by spaces
lang:formatDate( 'y m d' )

-- This outputs the given date using the specified format
lang:formatDate( 'y m d', "2015-02-01" )

Note that backslashes may need to be doubled in the Lua string where they wouldn't in wikitext:

-- This outputs a newline, where {{#time:\n}} would output a literal "n"
lang:formatDate( '\n' )

-- This outputs a literal "n", where {{#time:\\n}} would output a backslash
-- followed by the month number.
lang:formatDate( '\\n' )

-- This outputs a backslash followed by the month number, where {{#time:\\\\n}}
-- would output two backslashes followed by the month number.
lang:formatDate( '\\\\n' )

mw.language:parseFormattedNumber

lang:parseFormattedNumber( s )

This takes a number as formatted by lang:formatNum() and returns the actual number. In other words, this is basically a language-aware version of tonumber().

This allows arithmetic on numbers written in any supported language (even several languages at once), which makes it more flexible than the {{#expr}} parser function:

local n1 = "١"
local n2 = "٣"
local lang = mw.language.new("ar")
local num1 = lang:parseFormattedNumber(n1) 
local num2 = lang:parseFormattedNumber(n2) 
local tot = lang:formatNum(num1 + num2)
mw.log(tot)

mw.language:convertPlural

lang:convertPlural( n, ... ), lang:convertPlural( n, forms ), lang:plural( n, ... ), lang:plural( n, forms )

This chooses the appropriate grammatical form from forms (which must be a sequence table) or ... based on the number n. For example, in English you might use n .. ' ' .. lang:plural( n, 'sock', 'socks' ) or n .. ' ' .. lang:plural( n, { 'sock', 'socks' } ) to generate grammatically-correct text whether there is only 1 sock or 200 socks.

mw.language:convertGrammar

lang:convertGrammar( word, case ), lang:grammar( case, word )

Note the different parameter order between the two aliases. convertGrammar matches the order of the method of the same name on MediaWiki's Language object, while grammar matches the order of the parser function of the same name, documented at mw:Help:Magic words#Localisation.

This chooses the appropriate inflected form of word for the given inflection code case.

mw.language:gender

lang:gender( what, masculine, feminine, neutral ), lang:gender( what, { masculine, feminine, neutral } )

Chooses the string corresponding to the gender of what, which may be "male", "female", or a registered user name.

Site library

mw.site.currentVersion

A string holding the current version of MediaWiki.

mw.site.scriptPath

The value of $wgScriptPath.

mw.site.server

The value of $wgServer.

mw.site.siteName

The value of $wgSitename.

mw.site.stylePath

The value of $wgStylePath.

mw.site.namespaces

Table holding data for all namespaces, indexed by number.

The data available is:

  • id: Namespace number.
  • name: Local namespace name.
  • canonicalName: Canonical namespace name.
  • displayName: Set on namespace 0, the name to be used for display (since the name is often the empty string).
  • hasSubpages: Whether subpages are enabled for the namespace.
  • hasGenderDistinction: Whether the namespace has different aliases for different genders.
  • isCapitalized: Whether the first letter of pages in the namespace is capitalized.
  • isContent: Whether this is a content namespace.
  • isIncludable: Whether pages in the namespace can be transcluded.
  • isMovable: Whether pages in the namespace can be moved.
  • isSubject: Whether this is a subject namespace.
  • isTalk: Whether this is a talk namespace.
  • aliases: List of aliases for the namespace.
  • subject: Reference to the corresponding subject namespace's data.
  • talk: Reference to the corresponding talk namespace's data.
  • associated: Reference to the associated namespace's data.

A metatable is also set that allows for looking up namespaces by name (localized or canonical). For example, both mw.site.namespaces[4] and mw.site.namespaces.Project will return information about the Project namespace.
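
The fields listed above can be combined to filter namespaces. The sketch below assumes a hypothetical helper, contentNamespaceNames, which is passed a table shaped like mw.site.namespaces and which assumes that table can be walked with pairs().

```lua
-- Hypothetical helper: returns the names of all content namespaces,
-- ordered by namespace number. `namespaces` is expected to be shaped
-- like mw.site.namespaces.
local function contentNamespaceNames(namespaces)
    local ids = {}
    for id, ns in pairs(namespaces) do
        if type(id) == "number" and ns.isContent then
            ids[#ids + 1] = id
        end
    end
    table.sort(ids)

    local names = {}
    for _, id in ipairs(ids) do
        local ns = namespaces[id]
        -- displayName is set on namespace 0, whose name is often empty.
        names[#names + 1] = ns.displayName or ns.name
    end
    return names
end
```

On a wiki, the call would be contentNamespaceNames( mw.site.namespaces ); sorting the numeric keys first gives a stable order, since pairs() does not guarantee one.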

mw.site.contentNamespaces

Table holding just the content namespaces, indexed by number. See mw.site.namespaces for details.


mw.site.subjectNamespaces
mw.site.talkNamespaces
mw.site.sassParams
mw.site.stats
mw.site.stats.pagesInCategory
mw.site.stats.pagesInNamespace
mw.site.stats.usersInGroup

Uri library

Ustring library

HTML library

Text library

Title library

Message library