Wikipedia:Authority control integration proposal/RFC
- RFC still under preparation - please do not leave feedback yet!
We propose to add VIAF authority control identifiers to English Wikipedia biography articles in a fully automated way, using the {{Authority control}} template. After an initial period of data-gathering and testing that draws on multiple sources, a bot will add the template and its VIAF parameter, or augment the template where it already exists.
The Wikipedia:Authority control integration proposal was initially discussed on the Village Pump and has since been updated to include the feedback and commentary received during the discussions. While the Village Pump discussion was broadly favourable, it is being formally listed as an RFC in order to ensure clear support from the community before implementation later in 2012.
Summary of the proposal
Authority control is the term used in librarianship, archival practice and related fields for the use of unique identifiers to disambiguate objects (people, places, academic subjects, etc.). On Wikipedia, this is handled with the {{authority control}} template, which places the identifiers at the end of the article and links out to library catalogues and central authority databases.
As well as the links for readers, this also embeds information which can be used to help build tools linking back into Wikipedia, or for maintaining its content.
The template is widely used on the German Wikipedia (220,000 articles) and on Commons, but only lightly used on the English Wikipedia (4,000 articles). We plan to add a large number of identifiers to the English Wikipedia using data drawn from the Virtual International Authority File (VIAF), an international project to merge multiple national authority files, and from the German Wikipedia; depending on the level of overlap, this will probably be between 250,000 and 300,000 records, most of them drawn from VIAF. VIAF identifiers correspond to identifiers in other systems, and can be used to populate other identifiers in the future.
Using data already embedded within VIAF, as well as on the German Wikipedia, we will identify pairs of corresponding VIAF numbers and articles. After data validation, a bot will add the VIAF number to the article using a reworked version of the {{Authority control}} template.
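As a rough illustration of the final step, the following Python sketch shows how a bot might add or augment the template in an article's wikitext. It is not the bot's actual code: the function name add_viaf is hypothetical, the parameter name VIAF is assumed from the wording of this proposal, and appending the template at the very end of the text is a simplification of where it would really be placed.

```python
import re

# Assumption: the template takes a parameter named VIAF. The exact syntax of
# the reworked template is not specified in this proposal.
TEMPLATE_RE = re.compile(r"\{\{\s*[Aa]uthority control\s*(\|[^{}]*)?\}\}")


def add_viaf(wikitext: str, viaf_id: str) -> str:
    """Return wikitext with the VIAF identifier added (hypothetical helper)."""
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        # No template yet: append a new one at the end of the article text.
        return wikitext.rstrip("\n") + "\n{{Authority control|VIAF=" + viaf_id + "}}\n"

    params = match.group(1) or ""
    if re.search(r"\|\s*VIAF\s*=", params):
        # A VIAF value is already present: leave it untouched for human review.
        return wikitext

    # Template exists without a VIAF parameter: augment it in place.
    new_template = "{{Authority control" + params + "|VIAF=" + viaf_id + "}}"
    return wikitext[:match.start()] + new_template + wikitext[match.end():]
```

The sketch never overwrites an existing VIAF value, so a bot built this way would only add information and would leave any disagreement for editors to resolve.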
Frequently asked questions
- How do I add a subject's VIAF to the article about them (or mine to my user page)?
- Use {{Authority control}}.
- Why use VIAF and not another identifier?
- VIAF is a composite of several existing authority control databases, and so includes all the content from many of the other systems. Any entity with, for example, a LCCN should have a corresponding VIAF number as well, but not every entity with a VIAF number will have an LCCN. Adding VIAF does not preclude the inclusion of other identifiers (and may indeed make it easier); this isn't aiming to impose a sole standard.
- Why only people?
- The authority control system does cover other things, but for the moment (written 2013) we are only planning to cover people—this is to simplify the initial program, as well as target the articles where the template is most likely to be useful.
- What about errors in VIAF?
- You can report apparent errors in VIAF (or its constituent catalogues) at Wikipedia:VIAF/errors. These are then available to the relevant managing body, and for linkage repair on-Wiki. For the German equivalent noticeboard, see de:WP:PND/F.
- What about licensing?
- VIAF is licensed as ODC-BY, which is compatible with Wikipedia licensing; the use of a VIAF URI is sufficient attribution for the terms of the license.
- Will this give any control over Wikipedia content to third parties?
- No. While we will be including VIAF identifiers, the content of Wikipedia and VIAF will remain entirely separate. No metadata will be imported automatically from VIAF, nor will Wikipedia need to follow VIAF naming conventions.
- What if editors object to the template or the identifier?
- Editors of specific pages will in all cases be free to remove the metadata where it is inaccurate or felt to be editorially inappropriate. For the purposes of Wikipedia:Sanctions, the first revert of an automated or semi-automated addition of authority control information shall not count as a revert.
- What about pages covering two people?
- There are many cases where a single article deals with two individuals. If two VIAF identifiers refer to the same article, this will be logged but not added to the article; if it currently contains one but not the other, or a mixture of identifiers referring to both, this will also be flagged.
- What about Wikidata?
- Wikidata includes authority identifiers. However, adding the template now allows us to gain the benefit of having this information available before Wikipedia transcludes it from Wikidata; it will also simplify any future work to add these identifiers to Wikidata.
- What about cases where several people have the same name?
- The primary purpose of authority control records is to help distinguish between people with the same (or similar) names. As such, identifiers are usually not matched on the name alone; the software is able to take account of other information such as birth and death dates.
- I wrote a new biographical article; how do I find the VIAF identifier?
- Thank you for contributing to Wikipedia! You can look up a subject's VIAF at http://viaf.org/. Enter their name as the "Search Terms:", and leave the other parameters at their default values. If there are two or more entries with the same name, check the listed works for a match. If you're not sure which to use, you can ask for advice at Wikipedia talk:Authority control.
- I have another question
- Any comments, criticisms, etc. will be gratefully received, again at Wikipedia talk:Authority control.
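Two of the answers above, on people who share a name and on pages covering two people, can be illustrated with a short Python sketch. The matching rule here is an assumption made for illustration only (the proposal does not specify how the software weighs names and dates), and the names Person, same_person and split_candidates are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Person(NamedTuple):
    name: str
    birth_year: Optional[int]
    death_year: Optional[int]


def same_person(a: Person, b: Person) -> bool:
    """Assumed rule: names must agree, and so must the dates that are known."""
    if a.name != b.name:
        return False
    if a.birth_year is None or b.birth_year is None:
        return False  # a name alone is not treated as a match
    if a.birth_year != b.birth_year:
        return False
    if a.death_year is not None and b.death_year is not None:
        return a.death_year == b.death_year
    return True


def split_candidates(
    pairs: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Split (article, VIAF id) pairs into safe additions and flagged conflicts."""
    by_article = defaultdict(set)
    for title, viaf_id in pairs:
        by_article[title].add(viaf_id)

    # Exactly one identifier for an article: safe to add.
    to_add = {t: next(iter(ids)) for t, ids in by_article.items() if len(ids) == 1}
    # Several identifiers for one article: log for review, do not add.
    flagged = {t: sorted(ids) for t, ids in by_article.items() if len(ids) > 1}
    return to_add, flagged
```

For example, split_candidates([("A", "1"), ("B", "2"), ("B", "3")]) would mark article A for addition and log article B for review, since two different identifiers point to it.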
Responses
- RFC still under preparation - please do not leave feedback yet!