
Module:Sensitive IP addresses/API

From Simple English Wikipedia, the free encyclopedia
Revision as of 07:44, 31 July 2016 by Johnuniq (talk | changes) (test getRanges to show IP ranges equivalent to collection)

This module provides an API for information about IP addresses that Wikipedia considers sensitive. The intention is that this one API can be used for templates, Lua modules, and software using the MediaWiki Action API such as JavaScript gadgets and bots.

Usage

From templates

Templates wishing to make use of this API need to use an intermediary Lua module to parse the results of API queries. One such module, used to create a wikitable summary of sensitive IPs, exists at Module:Sensitive IP addresses/summary.

From Lua

To load this module from Lua modules, use:

local querySensitiveIPs = require('Module:Sensitive IP addresses/API').query

The query function is called with named parameters. For example:

local result = querySensitiveIPs{
	test = {'1.2.3.4', '5.6.7.8'}
}

Parameters

The following parameters are available to the query function:

  • test - an array of IP addresses and/or IP ranges to test for sensitivity. IP addresses and ranges can be IPv4 or IPv6, and ranges must be in CIDR notation.
  • entities - an array of entity IDs to get information about. An entity is a country or organization which is considered sensitive, and for which blocks should be handled with care. Entity IDs are defined in Module:Sensitive IP addresses/list along with the rest of the sensitive IP data. For example, ushr is the ID for the United States House of Representatives. If the special ID all is contained in the array, information about all entities will be included in the result.
  • format - the format to return results in. Use json to return a JSON-formatted string, and use lua to return a Lua table. If this option is not specified, a Lua table is returned by default.
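The parameter descriptions above can be turned into a small pre-flight check. The following is a minimal sketch in plain Lua; the helper names checkQueryArgs and isStringArray are hypothetical and are not part of this module, and the module's own validation and error messages may differ.

```lua
-- Hypothetical helper (not part of the module): checks a query table against
-- the documented parameter types before it is passed to the query function.
local validFormats = {lua = true, json = true}

-- Return true if t is a table whose array elements are all strings.
local function isStringArray(t)
	if type(t) ~= 'table' then
		return false
	end
	for _, v in ipairs(t) do
		if type(v) ~= 'string' then
			return false
		end
	end
	return true
end

-- Return true if the arguments look usable, or false plus a message.
local function checkQueryArgs(args)
	if args.test ~= nil and not isStringArray(args.test) then
		return false, 'test must be an array of IP or CIDR strings'
	end
	if args.entities ~= nil and not isStringArray(args.entities) then
		return false, 'entities must be an array of entity ID strings'
	end
	if args.format ~= nil and not validFormats[args.format] then
		return false, "format must be 'lua' or 'json'"
	end
	return true
end
```

All three parameters are treated as optional here, which is an assumption based on the examples that pass only test or only entities.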

Results

By default, the query function returns a Lua table, but it can return a JSON object if the format option is set to json. Whether Lua or JSON, the structure of the object returned is similar to the structure of query results from the MediaWiki Action API.

Top-level object

The top-level object contains exactly one child object. If the query executed successfully, this object has a key of sensitiveips and contains the query results.

{
    "sensitiveips": {
        [query results]
    }       
}

If there were any errors when executing the query, the child of the top-level object has a key of error and contains error information. The error object has three keys: code, the error ID; info, the error message; and *, a message about where to find the API documentation. The error IDs all have a prefix of "sipa". For example:

{
    "error": {
        "code": "sipa-invalid-test-string",
        "info": "test string #1 'foo' was not a valid IP address or CIDR string",
        "*": "See https://en.wikipedia.org/wiki/Module:Sensitive_IP_addresses/API for API usage"
    }       
}
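Because the top-level object has either a sensitiveips key or an error key, calling code can branch on which one is present. A minimal sketch, assuming a Lua-table response; unwrapResponse is a hypothetical helper name, and the sample tables are copied from the structure described above.

```lua
-- Hypothetical helper: splits a response into (results, nil) on success
-- or (nil, message) on failure, following the documented top-level keys.
local function unwrapResponse(response)
	if response.error then
		local err = response.error
		return nil, string.format('%s: %s', err.code, err.info)
	end
	return response.sensitiveips, nil
end

-- Sample responses with the shape described in this section.
local failure = {
	error = {
		code = 'sipa-invalid-test-string',
		info = "test string #1 'foo' was not a valid IP address or CIDR string",
		['*'] = 'See https://en.wikipedia.org/wiki/Module:Sensitive_IP_addresses/API for API usage',
	},
}
local success = {sensitiveips = {}}

local results, message = unwrapResponse(failure)
local okResults = unwrapResponse(success)
```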

Sensitive IPs object

If the query was successful, the sensitiveips child object will be present and can contain the following objects and arrays:

  • matches - an array of IP address objects or IP range objects, where the corresponding IP address or IP range string was specified in the test query option, and where Wikipedia regards that IP address or IP range as being sensitive. If no IPs or ranges were tested for, or if no matches were found, this array will not be present in the results.
  • matched-ranges - an object with CIDR IP range strings as keys, and matched-range objects as values, where the IP range matches one of the IP addresses or IP ranges tested for with the test query option. If no IPs or ranges were tested for, or if no matches were found, this object will not be present.
  • entities - an object with entity IDs as keys, and entity objects as values. An entity is a country or organization which has IP addresses that Wikipedia considers sensitive. Entity IDs are defined in Module:Sensitive IP addresses/list; for example, ushr is the ID for the United States House of Representatives. Entities will be included in this object if an IP range belonging to them is matched by one of the IP addresses or IP ranges tested for with the test query option, or if their entity ID is specified in the entities query option.
  • entity-ids - an array of entity ID strings, in the order they are defined in Module:Sensitive IP addresses/list. The entity IDs in this array correspond one-to-one with the entity ID keys of the entities result object. This array can be useful for outputting the IDs in the same order that they were defined in the list.
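As an illustration of how entity-ids and entities fit together, the sketch below walks the ID array and looks each ID up in the entities object, which preserves the order of the list. The helper name orderedEntityNames is hypothetical, and the sample data is a trimmed copy of the usdoj and usdhs entities shown in the examples on this page.

```lua
-- Trimmed sample of a sensitiveips object.
local sensitiveips = {
	entities = {
		usdoj = {id = 'usdoj', name = 'United States Department of Justice'},
		usdhs = {id = 'usdhs', name = 'United States Department of Homeland Security'},
	},
	['entity-ids'] = {'usdoj', 'usdhs'},
}

-- Hypothetical helper: returns entity names in the order given by entity-ids.
-- A missing entity-ids array is treated as empty.
local function orderedEntityNames(results)
	local names = {}
	for _, id in ipairs(results['entity-ids'] or {}) do
		local entity = results.entities[id]
		names[#names + 1] = entity.name
	end
	return names
end

local names = orderedEntityNames(sensitiveips)
```

Iterating the entities object directly with pairs would not guarantee any particular order, which is why the array is used as the index.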

IP address objects

An IP address object represents a single IPv4 or IPv6 address that matches a sensitive IP range. IP address objects contain the following fields:

  • ip - the string representation of the IP address, e.g. "1.2.3.4" or "2001:d8::ffff:ab:cdef".
  • type - the string "ip" (used to differentiate between IP address objects and IP range objects).
  • ip-version - the version of the IP protocol the address uses. This is either "IPv4" or "IPv6".
  • matches-range - the sensitive IP range that the address matches, in CIDR notation.
  • entity-id - the entity ID of the entity that owns the sensitive IP range that the address matches.
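To make concrete what it means for an address to match a range in CIDR notation, here is a minimal IPv4-only sketch in plain Lua. It is not how this module performs the check (the module source relies on Module:IP for that); the function names ipv4ToNumber and ipv4InRange are hypothetical.

```lua
-- Convert a dotted-quad IPv4 string to a number, or return nil if malformed.
local function ipv4ToNumber(ip)
	local a, b, c, d = ip:match('^(%d+)%.(%d+)%.(%d+)%.(%d+)$')
	if not a then
		return nil
	end
	local n = 0
	for _, octet in ipairs({a, b, c, d}) do
		octet = tonumber(octet)
		if octet > 255 then
			return nil
		end
		n = n * 256 + octet
	end
	return n
end

-- Return true if the address lies inside the CIDR range (IPv4 only).
local function ipv4InRange(ip, cidr)
	local base, bits = cidr:match('^(.-)/(%d+)$')
	if not base then
		return false
	end
	local ipNum, baseNum = ipv4ToNumber(ip), ipv4ToNumber(base)
	bits = tonumber(bits)
	if not ipNum or not baseNum or bits > 32 then
		return false
	end
	-- Addresses in the same range share the same leading `bits` bits, so
	-- they agree after dropping the trailing (32 - bits) host bits.
	local blockSize = 2 ^ (32 - bits)
	return math.floor(ipNum / blockSize) == math.floor(baseNum / blockSize)
end
```

The sketch uses division and math.floor rather than bitwise operators so that it also runs on Lua 5.1.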

IP range objects

An IP range object represents an IPv4 or IPv6 range that overlaps with a sensitive IP range. IP range objects contain the following fields:

  • range - the CIDR string representation of the range, e.g. "1.2.3.0/24" or "2001:d8::ffff:ab:0/16".
  • type - the string "range" (used to differentiate between IP range objects and IP address objects).
  • ip-version - the version of the IP protocol the range uses. This is either "IPv4" or "IPv6".
  • matches-range - the sensitive IP range that the tested range overlaps, in CIDR notation.
  • entity-id - the entity ID of the entity that owns the sensitive IP range that the tested range overlaps.
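Overlap between a tested range and a sensitive range can be illustrated by comparing the first and last addresses of each range. The following is a minimal IPv4-only sketch under that assumption; it is not the module's implementation, and the names ipv4RangeBounds and ipv4RangesOverlap are hypothetical.

```lua
-- Return the first and last address of an IPv4 CIDR range as numbers,
-- or nil if the string is malformed.
local function ipv4RangeBounds(cidr)
	local a, b, c, d, bits = cidr:match('^(%d+)%.(%d+)%.(%d+)%.(%d+)/(%d+)$')
	if not a then
		return nil
	end
	a, b, c, d, bits = tonumber(a), tonumber(b), tonumber(c), tonumber(d), tonumber(bits)
	if a > 255 or b > 255 or c > 255 or d > 255 or bits > 32 then
		return nil
	end
	local address = ((a * 256 + b) * 256 + c) * 256 + d
	local blockSize = 2 ^ (32 - bits)
	-- Round the address down to the start of its block.
	local first = math.floor(address / blockSize) * blockSize
	return first, first + blockSize - 1
end

-- Two ranges overlap when each one starts no later than the other ends.
local function ipv4RangesOverlap(cidrA, cidrB)
	local firstA, lastA = ipv4RangeBounds(cidrA)
	local firstB, lastB = ipv4RangeBounds(cidrB)
	if not firstA or not firstB then
		return false
	end
	return firstA <= lastB and firstB <= lastA
end
```

Note that overlap is symmetric: a tested range overlaps a sensitive range both when it sits inside the sensitive range and when it contains it.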

Entity objects

An entity object represents a country or organization that has IP ranges which Wikipedia considers sensitive. Entity objects may contain the following fields:

  • id - the entity ID. This is a unique string used to identify the entity. This field is always present.
  • name - the name of the entity. This is a plain string, containing no wikitext, and is always present.
  • description - a description of the entity. This is a string, and may contain wikitext. This field is optional, and may not be present.
  • reason - the reason that the entity's IP ranges are sensitive. The possible reasons are "political" and "technical".
  • ipv4-ranges - an array of IPv4 CIDR strings that belong to the entity, and are considered as sensitive by Wikipedia. This field is optional, and may not be present.
  • ipv6-ranges - an array of IPv6 CIDR strings that belong to the entity, and are considered as sensitive by Wikipedia. This field is optional, and may not be present.
  • notes - notes about the entity or its ranges. This field is optional, and may not be present.
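Since the range fields are optional, code that reads them should tolerate their absence. The sketch below collects all of an entity's ranges into one list. The field list above names the keys ipv4-ranges and ipv6-ranges, while the example results on this page show ipv4Ranges and ipv6Ranges, so this sketch checks both spellings; the helper name allRanges is hypothetical.

```lua
-- Hypothetical helper: gathers every range string of an entity into one
-- list, treating missing range fields as empty.
local function allRanges(entity)
	local ranges = {}
	for _, key in ipairs({'ipv4-ranges', 'ipv4Ranges', 'ipv6-ranges', 'ipv6Ranges'}) do
		for _, cidr in ipairs(entity[key] or {}) do
			ranges[#ranges + 1] = cidr
		end
	end
	return ranges
end

-- Entity data copied from the "One match" example on this page.
local ussenate = {
	id = 'ussenate',
	name = 'United States Senate',
	reason = 'political',
	ipv4Ranges = {'156.33.0.0/16'},
	ipv6Ranges = {'2620:0:8a0::/48', '2600:803:618::/48'},
}

local ranges = allRanges(ussenate)
```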

Examples

Here are some example queries from Lua and the results they produce.

No matches

Query:

querySensitiveIPs{
    test = {'1.2.3.4'}
}

Result:

{
  ["sensitiveips"] = {
  }
}

One match

Query:

querySensitiveIPs{
    test = {'156.33.5.76'}
}

Result:

{
  ["sensitiveips"] = {
    ["matches"] = {
      {
        ["type"] = "ip",
        ["ip"] = "156.33.5.76",
        ["ip-version"] = "IPv4",
        ["matches-range"] = "156.33.0.0/16",
        ["entity-id"] = "ussenate",
      },
    },
    ["matched-ranges"] = {
      ["156.33.0.0/16"] = {
        ["range"] = "156.33.0.0/16",
        ["ip-version"] = "IPv4",
        ["entity-id"] = "ussenate",
      },
    },
    ["entities"] = {
      ["ussenate"] = {
        ["id"] = "ussenate",
        ["name"] = "United States Senate",
        ["description"] = "the [[United States Senate]]",
        ["reason"] = "political",
        ["ipv4Ranges"] = {
          "156.33.0.0/16",
        },
        ["ipv6Ranges"] = {
          "2620:0:8a0::/48",
          "2600:803:618::/48",
        }
      },
    },
    ["entity-ids"] = {
      "ussenate",
    },
  },
}
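Given a result like the one above, the type field tells a caller whether a match is an IP address object or an IP range object. A minimal sketch of reading the matches array that way; describeMatch is a hypothetical helper name, and the sample match is copied from the result above.

```lua
-- Hypothetical helper: formats one entry of the matches array, using the
-- type field to pick between the ip and range fields.
local function describeMatch(match)
	local subject
	if match.type == 'ip' then
		subject = match.ip
	elseif match.type == 'range' then
		subject = match.range
	else
		return nil -- unknown object type
	end
	return string.format('%s (%s) matches %s, owned by %s',
		subject, match['ip-version'], match['matches-range'], match['entity-id'])
end

local matches = {
	{
		type = 'ip',
		ip = '156.33.5.76',
		['ip-version'] = 'IPv4',
		['matches-range'] = '156.33.0.0/16',
		['entity-id'] = 'ussenate',
	},
}

local lines = {}
for _, match in ipairs(matches) do
	lines[#lines + 1] = describeMatch(match)
end
```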

One match, JSON output

Query:

querySensitiveIPs{
    format = 'json',
    test = {'156.33.5.76'}
}

Result:

{
   "sensitiveips":{
      "matches":[
         {
            "type": "ip",
            "ip": "156.33.5.76",
            "ip-version": "IPv4",
            "matches-range": "156.33.0.0/16",
            "entity-id": "ussenate"
         }
      ],
      "matched-ranges": {
         "156.33.0.0/16": {
            "range": "156.33.0.0/16",
            "ip-version": "IPv4",
            "entity-id": "ussenate"
         }
      },
      "entities": {
         "ussenate": {
            "id": "ussenate",
            "name": "United States Senate",
            "description": "the [[United States Senate]]",
            "reason": "political",
            "ipv6Ranges": [
               "2620:0:8a0::/48",
               "2600:803:618::/48"
            ],
            "ipv4Ranges": [
               "156.33.0.0/16"
            ]
         }
      },
      "entity-ids": [
         "ussenate"
      ]
   }
}

Entity IDs

Query:

querySensitiveIPs{
    format = 'json',
    entities = {'usdhs', 'usdoj'}
}

Result:

{
   "sensitiveips": {
      "entities": {
         "usdoj": {
            "id": "usdoj",
            "name": "United States Department of Justice",
            "description": "the [[United States Department of Justice]]",
            "reason": "political",
            "ipv4Ranges": [
               "149.101.0.0/16"
            ]
         },
         "usdhs": {
            "id": "usdhs",
            "name": "United States Department of Homeland Security",
            "description": "the [[United States Department of Homeland Security]]",
            "reason": "political",
            "ipv4Ranges": [
               "65.165.132.0/24",
               "204.248.24.0/24",
               "216.81.80.0/20"
            ]
         }
      },
      "entity-ids": [
         "usdoj",
         "usdhs"
      ]
   }
}

Invalid IP error

Query:

querySensitiveIPs{
    test = {'foo'}
}

Result:

{
  ["error"] = {
    ["code"] = "sipa-invalid-test-string",
    ["info"] = "test string #1 'foo' was not a valid IP address or CIDR string",
    ["*"] = "See https://en.wikipedia.org/wiki/Module:Sensitive_IP_addresses/API for API usage",
  }
}



-- This module provides functions for handling sensitive IP addresses.

-- Load modules
local libraryUtil = require('libraryUtil')
local checkType = libraryUtil.checkType

local sensitivityReasons = {
	political = true,
	technical = true,
}

-- Plan of attack:
-- * Load the data from Module:Sensitive IP addresses/list via a formatting module
--   at Module:Sensitive IP addresses/data
--   * This module will do preprocessing that must be done for every query
-- * Make an API so that other modules can query this module for data about
--   sensitive IPs and ranges.
--   * Export query results as both a Lua table and as JSON
-- * Use this API to create a table to be used in
--   [[Template:Sensitive IP addresses]].

-------------------------------------------------------------------------------
-- Sensitive IP API
-------------------------------------------------------------------------------

-- This API is used by external tools and gadgets, so it should be kept
-- backwards-compatible. Clients query the API with a query table, and the
-- API returns a response table. The response table is available as a Lua table
-- for other Lua modules, and as JSON for external clients.

-- Example query tables:
--
-- Query IP addresses and ranges:
-- {
-- 	test = {'1.2.3.4', '4.5.6.0/24', '2001:db8::ff00:12:3456', '2001:db8::ff00:12:0/112'},
-- }
--
-- Query specific entities:
-- {
-- 	entities = {'ussenate', 'ushr'}
-- }
--
-- Query all entities:
-- {
-- 	entities = {'all'}
-- }
--
-- Combined query:
-- {
-- 	test = {'1.2.3.4', '4.5.6.0/24', '2001:db8::ff00:12:3456', '2001:db8::ff00:12:0/112'},
-- 	entities = {'ussenate', 'ushr'}
-- }

-- Example response:
--
-- {
--     sensitiveips = {
--         matches = {
--             {
--                 ip = '1.2.3.4',
--                 type = 'ip',
--                 ['ip-version'] = 'IPv4',
--                 ['matches-range'] = '1.2.3.0/24',
--                 ['entity-id'] = 'entityid'
--             },
--             {
--                 range = '4.5.6.0/24',
--                 type = 'range',
--                 ['ip-version'] = 'IPv4',
--                 ['matches-range'] = '4.5.0.0/16',
--                 ['entity-id'] = 'entityid'
--             }
--         },
--         ['matched-ranges'] = {
--             ['1.2.3.0/24'] = {
--                 range = '1.2.3.0/24',
--                 ['ip-version'] = 'IPv4',
--                 ['entity-id'] = 'entityid'
--             },
--             ['4.5.0.0/16'] = {
--                 range = '4.5.0.0/16',
--                 ['ip-version'] = 'IPv4',
--                 ['entity-id'] = 'entityid'
--             }
--         },
--         entities = {
--             ['entityid'] = {
--                 id = 'entityid',
--                 name = 'The entity name',
--                 description = 'A description of the entity',
--                 ['ipv4-ranges'] = {
--                     '1.2.3.0/24',
--                     '4.5.0.0/16',
--                     '6.7.0.0/16'
--                 },
--                 ['ipv6-ranges'] = {
--                     '2001:db8::ff00:12:0/112'
--                 },
--                 notes = 'Notes about the entity or its ranges'
--             }
--         },
--         ['entity-ids'] = {
--             'entityid'
--         }
--     }
-- }
--
-- Response with errors:
--
-- {
--     error = {
--         code = 'example-error',
--         info = 'There was an error',
--         ['*'] = 'See https://en.wikipedia.org/wiki/Module:Sensitive_IP_addresses for API usage'
--     }
-- }

--------------------------------------------------------------------------------
-- Q&D demo of loading data from [[Module:Sensitive IP addresses/list]]
-- into a structure that could be used to determine whether a particular
-- IP or subnet overlaps a sensitive range.
-- If used, this would be greatly refactored and possibly split to
-- [[Module:Sensitive IP addresses/data]].
--
-- Usage in a sandbox:
-- {{#invoke:Sensitive IP addresses|main}}

local function main()
	-- Test Module:IP.
	----------------------------------------------------------------------------
	-- An IP collection in Module:IP should hold both IPv4 and IPv6 lists and
	-- it would use the appropriate list depending on the object queried?
	-- That would make this code more straightforward.
	----------------------------------------------------------------------------
	-- Support stuff
	----------------------------------------------------------------------------
	local modcode = require('Module:IP')
	local IPAddress = modcode.IPAddress
	local Subnet = modcode.Subnet
	local IPv4Collection = modcode.IPv4Collection
	local IPv6Collection = modcode.IPv6Collection
	local Collection = {}
	Collection.__index = Collection
	do
		function Collection:add(item)
			if item ~= nil then
				self.n = self.n + 1
				self[self.n] = item
			end
		end
		function Collection:join(sep)
			return table.concat(self, sep)
		end
		function Collection:sort(comp)
			table.sort(self, comp)
		end
		function Collection.new()
			return setmetatable({n = 0}, Collection)
		end
	end
	local function getObject(ipStr)
		-- Parse a string and return an appropriate object:
		--   IPv4 or IPv6 IP or subnet, or nil.
		-- TODO This should be in Module:IP (see IPCollection:_store).
		local maker
		if ipStr:find('/', 1, true) then
			maker = Subnet.new
		else
			maker = IPAddress.new
		end
		local success, obj = pcall(maker, ipStr)
		if success then
			return obj
		end
		return nil
	end
	local function preBlock(text)
		-- Pre tags returned by a module do not act like wikitext <pre>...</pre>.
		return '<pre>\n' ..
			mw.text.nowiki(text) ..
			(text:sub(-1) == '\n' and '' or '\n') ..
			'</pre>\n'
	end
	----------------------------------------------------------------------------
	-- Load sensitive IP information
	----------------------------------------------------------------------------
	local function loadList(modname)
		-- Return a table to query an IP/subnet wrt sensitive ranges.
		local data = {
			subnetToInfo = {},
			v4Collection = IPv4Collection.new(),
			v6Collection = IPv6Collection.new(),
		}
		local sensitiveList = mw.loadData(modname)
		for i, info in ipairs(sensitiveList) do
			for _, r in ipairs({
				{key = 'ipv4Ranges', list = data.v4Collection},
				{key = 'ipv6Ranges', list = data.v6Collection},
			}) do
				local rangeStrings = info[r.key]
				if rangeStrings then
					for _, str in ipairs(rangeStrings) do
						local subnet = Subnet.new(str)
						r.list:addSubnet(subnet)
						data.subnetToInfo[subnet] = info
					end
				end
			end
		end
		return data
	end
	----------------------------------------------------------------------------
	-- Run test using Module:IP
	----------------------------------------------------------------------------
	local data = loadList('Module:Sensitive IP addresses/list')
	local results = Collection.new()
	results:add('IP ranges equivalent to collection')
	for _, col in ipairs({data.v4Collection, data.v6Collection}) do
		for _, range in ipairs(col:getRanges()) do
			if range[1] == range[2] then
				results:add('  ' .. range[1])
			else
				results:add('  ' .. range[1] .. ' – ' .. range[2])
			end
		end
	end
	for _, ipStr in ipairs({
		-- Each of the following is tested against the sensitive list.
		'143.228.19.123',
		'2620:0:E21:9F2::',
		'131.132.224.0/19',
		'198.35.27.255',
		'2620:0:860::1',
		'1.2.3.4',
		'11.12.13.192/26',
		'2001:db8::abcd',
		'2001:db8::/72',
	}) do
		local obj = getObject(ipStr)
		if obj then
			local isPresent, clashObj
			local col = obj:getVersion() == 'IPv4' and
				data.v4Collection or data.v6Collection
			if obj.getNextIP then  -- dirty trick to check if obj is an IP
				isPresent, clashObj = col:containsIP(obj)
			else
				isPresent, clashObj = col:overlapsSubnet(obj)
			end
			results:add('')
			results:add('IP or range under test: ' .. ipStr)
			if isPresent then
				local info = data.subnetToInfo[clashObj]
				if info then
					results:add('  sensitive: ' .. clashObj)
					results:add('  name: ' .. (info.name or '?'))
					results:add('  id: ' .. (info.id or '?'))
					results:add('  description: ' .. (info.description or '?'))
					results:add('  reason: ' .. (info.reason or '?'))
				else
					-- Should not occur!
					results:add('  info not found!')
				end
			else
				results:add('  not sensitive')
			end
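		else
			-- The string could not be parsed as an IP address or subnet.
			results:add('')
			results:add('IP or range under test: ' .. ipStr)
			results:add('  not a valid IP address or CIDR string')
		end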
	end
	return preBlock(results:join('\n'))
end

--------------------------------------------------------------------------------
-- Exports
--------------------------------------------------------------------------------

local p = {}
p.main = main

function p.isValidSensitivityReason(s)
	checkType('isValidSensitivityReason', 1, s, 'string')
	return sensitivityReasons[s] ~= nil
end

function p.getSensitivityReasons()
	local ret = {}
	for reason in pairs(sensitivityReasons) do
		ret[#ret + 1] = reason
	end
	table.sort(ret)
	return ret
end

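-- Placeholder: the query function described in the documentation is not yet
-- implemented in this revision and returns nothing.
function p.query()
end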

return p