Wikipedia:Authority control integration proposal/RFC
This proposal covers a plan to add a large number of VIAF authority control identifiers to English Wikipedia biography articles, using the {{Authority control}} template. After an initial period of data-gathering and testing using multiple sources, the template and VIAF parameter will be added or augmented by bot. This plan is being coordinated by Max Klein, the Wikipedian in Residence at OCLC, and Andrew Gray, the Wikipedian in Residence at the British Library.
Video Summary of the proposal
On youtube.
Summary of the proposal
The proposal was initially discussed on the Village Pump here and has been updated to include the feedback and commentary received during the discussions. While the Village Pump discussion was broadly favourable, it is being formally listed as an RFC in order to ensure clear support from the community before implementation later in 2012.
Authority control is the term used in librarianship, archival practice and related fields for unique identifiers to disambiguate objects (people, places, academic subjects, etc). On Wikipedia, this is handled with the {{authority control}} template, which places the identifiers at the end of the article and links out to library catalogues and central authority databases.
As well as the links for readers, this also embeds information which can be used to help build tools linking back into Wikipedia, or for maintaining its content.
The template is widely used on the German Wikipedia (220,000 articles) and on Commons, but only lightly used on the English Wikipedia (4,000 articles). We plan to add a large number of identifiers to the English Wikipedia; depending on the level of overlap between sources, this will probably be between 250,000 and 300,000 records. These will predominantly be drawn from the Virtual International Authority File (VIAF), an international project to merge multiple national authority files, supplemented by data from the German Wikipedia. VIAF identifiers correspond to identifiers in other systems, and can be used to populate other identifiers in the future.
Using data already embedded within VIAF, as well as on the German Wikipedia, we will identify pairs of corresponding VIAF numbers and articles. After data validation, a bot will add the VIAF number to the article using a reworked version of the {{Authority control}} template.
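As a rough illustration of the pairing and validation step, the sketch below merges candidate (article, VIAF number) pairs from two sources and separates articles with a single agreed identifier from those with conflicting ones. The function name `reconcile`, the tuple layout, the source labels and the identifiers in the usage note are all hypothetical; this is not the bot's actual code, only one way the check could be written.

```python
from collections import defaultdict


def reconcile(candidates):
    """Split candidate pairs into safe additions and flagged conflicts.

    `candidates` is an iterable of (article_title, viaf_id, source) tuples,
    where `source` is a free-form label such as "viaf" or "dewiki"
    (hypothetical labels, used only for illustration).
    """
    ids_by_article = defaultdict(set)
    for title, viaf_id, _source in candidates:
        ids_by_article[title].add(viaf_id)

    to_add = {}    # article -> the single VIAF number all sources agree on
    flagged = {}   # article -> sorted list of conflicting VIAF numbers
    for title, ids in ids_by_article.items():
        if len(ids) == 1:
            to_add[title] = next(iter(ids))
        else:
            # More than one identifier points at this article:
            # log it for human review instead of editing the page.
            flagged[title] = sorted(ids)
    return to_add, flagged
```

For example, if both sources give article "A" the made-up identifier "1", while they give article "B" the identifiers "2" and "3", then "A" ends up in `to_add` and "B" in `flagged`.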
Frequently asked questions
- How do I add a subject's VIAF to the article about them (or mine to my user page)?
- Use {{Authority control}}.
- Why use VIAF and not another identifier?
- VIAF is a composite of several existing authority control databases, and so includes all the content from many of the other systems. Any entity with, for example, a LCCN should have a corresponding VIAF number as well, but not every entity with a VIAF number will have an LCCN. Adding VIAF does not preclude the inclusion of other identifiers (and may indeed make it easier); this isn't aiming to impose a sole standard.
- Why only people?
- The authority control system does cover other things, but for the moment (written 2013) we are only planning to cover people—this is to simplify the initial program, as well as target the articles where the template is most likely to be useful.
- What about errors in VIAF?
- You can report apparent errors in VIAF (or its constituent catalogues) at Wikipedia:VIAF/errors. These are then available to the relevant managing body, and for linkage repair on-Wiki. For the German equivalent noticeboard, see de:WP:PND/F.
- What about licensing?
- VIAF is licensed as ODC-BY, which is compatible with Wikipedia licensing; the use of a VIAF URI is sufficient attribution for the terms of the license.
- Will this give any control over Wikipedia content to third parties?
- No. While we will be including VIAF identifiers, the content of Wikipedia and VIAF will remain entirely separate. No metadata will be imported automatically from VIAF, nor will Wikipedia need to follow VIAF naming conventions.
- What if editors object to the template or the identifier?
- Editors of specific pages will in all cases be free to remove the metadata where it is inaccurate or felt to be editorially inappropriate. For the purposes of Wikipedia:Sanctions, the first revert of an automated or semi-automated addition of authority control information shall not count as a revert.
- What about pages covering two people?
- There are many cases where a single article deals with two individuals. If two VIAF identifiers refer to the same article, this will be logged but not added to the article; if it currently contains one but not the other, or a mixture of identifiers referring to both, this will also be flagged.
- What about Wikidata?
- Wikidata includes authority identifiers. However, adding the template now allows us to gain the benefit of having this information available before Wikipedia transcludes it from Wikidata; it will also simplify any future work to add these identifiers to Wikidata.
- What about cases where several people have the same name?
- The primary purpose of authority control records is to help distinguish between people with the same (or similar) names. As such, identifiers are usually not matched on the name alone; the software is able to take account of other information such as birth and death dates.
- I wrote a new biographical article; how do I find the VIAF identifier?
- Thank you for contributing to Wikipedia! You can look up a subject's VIAF at http://viaf.org/. Enter their name as the "Search Terms:", and leave the other parameters at their default values. If there are two or more entries with the same name, check the listed works for a match. If you're not sure which to use, you can ask for advice at Wikipedia talk:Authority control.
- I have another question
- Any comments, criticisms, etc. will be gratefully received, again at Wikipedia talk:Authority control.
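To make the first FAQ answer and the "added or augmented by bot" step more concrete, here is a minimal sketch of how a script might add a VIAF number to an article's wikitext. It assumes the template takes a parameter named `VIAF` (inferred from the proposal's mention of a "VIAF parameter"), and it simply appends the template at the end of the text; a real bot would need to respect article layout conventions. The function name `add_viaf` and the identifier in the usage note are hypothetical.

```python
import re

# Matches {{Authority control}} with or without existing parameters.
AC_PATTERN = re.compile(r"\{\{\s*[Aa]uthority control\s*(\|[^{}]*)?\}\}")


def add_viaf(wikitext, viaf_id):
    """Return wikitext with a VIAF parameter added (assumed name: VIAF)."""
    match = AC_PATTERN.search(wikitext)
    if match is None:
        # No template yet: append a new one at the end of the page text.
        return wikitext.rstrip("\n") + "\n{{Authority control|VIAF=" + viaf_id + "}}\n"

    params = match.group(1) or ""
    if re.search(r"\|\s*VIAF\s*=", params):
        # A VIAF value is already present: leave the page untouched.
        return wikitext

    # Template exists without VIAF: augment it, keeping other parameters.
    augmented = "{{Authority control" + params + "|VIAF=" + viaf_id + "}}"
    return wikitext[:match.start()] + augmented + wikitext[match.end():]
```

Calling `add_viaf(text, "123")` on a page with no template appends `{{Authority control|VIAF=123}}`; a page that already carries a VIAF value is returned unchanged, so an editor's existing choice is not overwritten.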
Responses
- Please leave feedback or comments below. More general queries can also be left at Wikipedia talk:Authority control integration proposal.
Support
- Tagishsimon (talk) 22:28, 28 June 2012 (UTC)
- DGG ( talk ) 00:45, 29 June 2012 (UTC)
- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:30, 29 June 2012 (UTC)
- Ironholds (talk) 10:46, 29 June 2012 (UTC)
- Nyttend (talk) 13:28, 29 June 2012 (UTC)
- --AndreasPraefcke (talk) 13:42, 29 June 2012 (UTC)
- Wer900 • talk • coordinationconsensus defined 16:41, 29 June 2012 (UTC)
- SarekOfVulcan (talk) 19:44, 29 June 2012 (UTC)
- --j⚛e deckertalk 22:31, 29 June 2012 (UTC)
- Imzadi 1979 → 23:02, 29 June 2012 (UTC)
Oppose
Comments
- I like this idea very much and think it would benefit both readers and researchers using Wikipedia. 64.40.54.97 (talk) 00:19, 29 June 2012 (UTC)
- With regards to FAQ question number three, how receptive have the VIAF people been to corrections submitted by the German community? Lankiveil (speak to me) 10:13, 29 June 2012 (UTC).
- Good question - I don't know, but I'll try to find out. That said, note that the German noticeboard is submitting corrections to PND/GND at the Deutsche Nationalbibliothek, rather than to VIAF, and so they'll be handled by different organisations. Andrew Gray (talk) 10:36, 29 June 2012 (UTC)
- All corrections submitted to VIAF are reviewed by an editor. If they are notified of an error which they agree with (which is mostly an objective process) then that correction will appear in VIAF the next time it is updated. Typically VIAF is updated every 6 months to a year. Maximiliankleinoclc (talk) 19:21, 29 June 2012 (UTC)
- As I noted in earlier discussion, we should look to moving AC links into infoboxes, where articles have them, during a subsequent phase of this initiative. That will allow them to be included in the emitted metadata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:32, 29 June 2012 (UTC)
- Should be pretty easy to move them into infoboxes later, right? Wouldn't it simply mean a bot cutting code from the bottom of the page and pasting it into the infobox? Nyttend (talk) 13:26, 29 June 2012 (UTC)
- Yes, but maybe infoboxes like Template:Infobox writer (and possibly some others) could be adjusted beforehand, and the bot could write the info directly there. (At de.wikipedia, the majority has always disapproved of infoboxes for most kinds of people, so we didn't bother to do that. Possibly, in geographic articles and other fields where we do have infoboxes, the authority control data will one day be shown there, but maybe only after the WikiData revolution.) --AndreasPraefcke (talk) 13:40, 29 June 2012 (UTC)
- It might be good to amend the FAQ with "What about cases where several people have the same name?" IOW, how are we going to be sure we put the right VIAF id on the right pages? ErikHaugen (talk | contribs) 22:15, 29 June 2012 (UTC)