Wikipedia:Identifying and using primary sources

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by WhatamIdoing (talk | contribs) at 23:37, 9 May 2011 (First draft). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A lot of people have trouble figuring out what we mean when we say that we want a "secondary source".

Source classification in the real world

The concept of primary, secondary, and tertiary sources originated with the academic discipline of historiography. The point was to give historians a handy way to indicate how close the source of a piece of information was to the actual events.

Importantly, the concept developed to deal with "events", rather than ideas or abstract concepts. A primary source was a source that was created at about the same time as the event, regardless of the source's contents. So while a dictionary is a classic example of a tertiary source for the meanings of words, an ancient dictionary is actually a primary source—for the meanings of words in the ancient world.

There are no quaternary sources: Either the source is primary, or it describes, comments on, or analyzes primary sources (in which case, it is secondary), or it relies heavily or entirely on secondary sources (in which case, it is tertiary). The first published source for any given fact is always considered a primary source.

The historians' concept has been extended into other fields, with partial success.

Wikipedia is not the real world

Wikipedia does not use these terms exactly as academics do. There are at least two major definitions of secondary source in use on Wikipedia. This page deals primarily with the classification of reliable sources in terms of article content. The classification used specifically for notability is addressed in a separate section at the end.

How to classify a source

Imagine that an army conquered a small country 500 years ago, and you have three sources:

  • a proclamation of victory written at the time of the conquest,
  • a book written 100 years later, based on the proclamation, and
  • an encyclopedia entry written last year, based on the book.

The proclamation is a primary source. This primary source has advantages: it was written at the time of the event, and so is free of the opinions and fictions imposed by later generations. It also has disadvantages: it might contain propaganda designed to pacify the conquered country, or omit politically inconvenient facts, or overstate the importance of other facts, or be designed to stroke the new ruler's ego. Its authors might be unaware of relevant facts.

The book is a secondary source. This secondary source has advantages: The author was not involved in the event, so he has the emotional distance that allows him to analyze the events dispassionately. It also has disadvantages: He is writing about what other people said they did, and cannot use his own experience to correct any errors or omissions. He may be unable to see clearly through his own cultural lens, and the result may be that he unconsciously emphasizes things important to his culture and time, while overlooking things important to the actual actors.

The encyclopedia article is a tertiary source. It has advantages: it summarizes information. It also has disadvantages: in relying on the secondary source, the encyclopedia article will repeat, and may accidentally amplify, any distortions or errors in that source. It may also add its own interpretation.

This sort of simple example is what the source classification system was intended to deal with. It has, however, been stretched to cover much more complicated situations.

Consider the simple example above: the original proclamation is a primary source. Is the book necessarily a secondary source?

The answer is: not always. If the book merely quotes the proclamation (such as re-printing a section in a sidebar or the full text in an appendix) with no analysis or commentary, then the book is just a newly printed copy of the primary source, rather than being a secondary source.

It's not a matter of counting up the number of sources in a chain. The first published source is always a primary source, but it is possible to have dozens of sources, without having any secondary or tertiary sources. If Alice writes down an idea, and Bob simply quotes her work, and Chris refers to Bob's quotation, and Daisy cites Chris, and so forth, you very likely have a string of primary sources, rather than one primary, one secondary, one tertiary, and all subsequent sources with made-up classification names.
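
For readers who find a decision procedure easier to follow than prose, the rule of thumb above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not an official Wikipedia tool or policy: the names `Source`, `Kind`, `analyzes`, and `classify` are hypothetical, each source is assumed to draw on at most one earlier source, and a source that merely quotes or cites is assumed to inherit the classification of whatever it repeats.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    TERTIARY = "tertiary"


@dataclass
class Source:
    name: str
    # The earlier source this one draws on, or None for a first publication.
    based_on: Optional["Source"] = None
    # True if it describes, comments on, analyzes, or summarizes its source;
    # False if it merely quotes, reprints, or cites it.
    analyzes: bool = False


def classify(source: Source) -> Kind:
    # The first published source is always primary.
    if source.based_on is None:
        return Kind.PRIMARY
    # Assumption: a mere quotation or reprint is treated as a copy of
    # whatever it repeats, so it inherits that classification.
    if not source.analyzes:
        return classify(source.based_on)
    # Analysis of primary material is secondary; work built on secondary
    # (or later) material is tertiary. There is no fourth level.
    if classify(source.based_on) is Kind.PRIMARY:
        return Kind.SECONDARY
    return Kind.TERTIARY


# The conquest example: proclamation -> book -> encyclopedia entry.
proclamation = Source("proclamation")
book = Source("book", based_on=proclamation, analyzes=True)
encyclopedia = Source("encyclopedia entry", based_on=book, analyzes=True)

# A book that only reprints the proclamation stays primary.
reprint = Source("reprint", based_on=proclamation, analyzes=False)

# Alice, Bob, Chris, Daisy: a chain of quotations, all primary.
alice = Source("Alice")
bob = Source("Bob", based_on=alice)
chris = Source("Chris", based_on=bob)
daisy = Source("Daisy", based_on=chris)

for s in (proclamation, book, encyclopedia, reprint, daisy):
    print(s.name, "->", classify(s).value)
```

The point of the sketch is that the result depends on what each source does with its material, not on how many links separate it from the original.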

Uses in fields other than history

In science, data is primary, and the first publication of any idea or experimental result is always a primary source. Narrative reviews, systematic reviews and meta-analyses are considered secondary sources, because they are based on and analyze or interpret (rather than merely citing) these original experimental reports.

All sources are primary for something

Every source is the primary source for something, whether it be the name of the author, its title, its date of publication, and so forth. For example, no matter what kind of book it is, the copyright page inside the front of a book is a primary source for the date of the book's publication.

More importantly, many high-quality sources contain both primary and secondary material. A peer-reviewed journal article may begin by summarizing previously published work to place the new work in context (which is secondary material) before proceeding into a description of a novel idea (which is primary material).

"Secondary" is not another way to spell "good"

"Secondary" is not, and should not be, a bit of jargon used by Wikipedians to mean "good" or "reliable" or "useable". Secondary does not mean that the source is independent, authoritative, high-quality, accurate, fact-checked, expert-approved, subject to editorial control, or published by a reputable publisher. Secondary sources can be unreliable, biased, and self-published.

According to our content policies, a reliable source has the following characteristics:

  • It has a reputation for fact-checking and accuracy.
  • It is published by a reputable publishing house, rather than by the author(s).
  • It is "appropriate for the material in question", i.e., the source is directly about the subject, rather than mentioning something unrelated in passing.
  • It is a third-party or independent source.
  • It has a professional structure in place for deciding whether to publish something, such as editorial oversight or peer review processes.

A primary source can have all of these qualities, and a secondary source may have none of them.

You are allowed to use primary sources... carefully

Secondary sources for notability

One rough rule of thumb for identifying primary sources is this: if the source is noticeably closer to the event than you are, then it's a primary source. For example, if an event occurred on January 1, 1800, and a newspaper article appears about it the next day, then Wikipedia (and all historians) considers the newspaper article a primary source.

However, Wikipedia fairly often writes about current events. As a result, an event may happen on Monday afternoon, be written about in Tuesday morning's newspapers, and be added to Wikipedia just minutes later. Many editors, especially those with no training in historiography, call these newspaper articles "secondary sources", by which they mean "please don't delete this article" sources.

Very recent newspaper articles are typically mis-labeled as "secondary sources" during deletion discussions (AFDs), in an attempt to finesse the general notability guideline's requirement that secondary sources exist when no true secondary sources actually do. It is difficult, if not impossible, to find true secondary sources for run-of-the-mill events. Editors are usually willing to overlook this error for recent events. However, once a couple of years have passed, if no true secondary sources can be found, the article is usually deleted.