Wikipedia:Requests for comment/Memento
- The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the debate may be found at the bottom of the discussion.
This is a request for comment (RFC) regarding turning on the Memento extension.
This is a preliminary RFC to assess community interest among English Wikipedia users for this functionality. No significant commitment of Wikimedia Foundation engineering resources has been made yet. An early pilot would likely run on the English Wikipedia; hence the initial poll is taking place here.
What is Memento?
When searching for information on the Web, you cannot navigate in the past. A link typically takes you to the current version of a resource. Memento – a project funded by the Library of Congress and run by Los Alamos National Laboratory in collaboration with Old Dominion University – aims to "make the history of the internet accessible" and bridge the gap between a current resource and its prior versions. At the moment, the closest analogy is the Internet Archive's "Wayback Machine", which allows you to view versions of websites as they were at certain points in time. While it is useful for broad coverage of the history of the web, the Wayback Machine has certain disadvantages:
- it's limited to those sites the Internet Archive is able to access;
- it's limited to as many versions as the IA's servers can cache; specifically for Wikipedia, the Internet Archive will have only very spotty coverage of article versions, compared to the full article revision history accessible in Wikipedia.
Memento solves these problems by developing a standard way for individual websites to expose their own revision histories and for clients to negotiate these histories. MediaWiki, of course, already provides access to old article revisions via page histories and via its API, but it does so without using a standardized protocol that other sites also use. Supporting Memento will allow readers to negotiate Wikipedia's contents (and those of any other Memento-compliant website) to return the revision of a given article matching a specified time or time range (for example: what did the 2011–2012 Egyptian revolution article look like 24 hours after the protest started? Or how did the Michael Jackson article change before and after his death?). It will also allow bots, web services and applications to perform time series analysis by extracting information from the article (text mining, data extraction, etc.) and to retrieve and integrate time-dependent information from different data sources. As such, it will contribute to building a key piece of infrastructure for the W3C linked data initiative.
How does it work?
Memento adds support for datetime negotiation (a variation on content negotiation), and new Relation Types for the HTTP "Link" header aimed at interlinking resources with their archival/version resources. The Memento team have developed a MediaWiki extension that would allow Wikipedia and other Wikimedia projects to support this protocol. A working browser plugin for Firefox is also available.
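As a rough illustration of datetime negotiation, the sketch below builds the `Accept-Datetime` request header that a Memento client would send and picks the relation types out of a `Link` response header. The function names and the example.org URLs are hypothetical and are not taken from the MediaWiki extension; no network request is made.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+=(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)=(?:"([^"]*)"|([^;,\s]+))')


def accept_datetime_header(when):
    """Return the request header asking for a resource as it was at `when`.

    `when` must be a timezone-aware datetime; it is converted to GMT.
    """
    return {"Accept-Datetime": format_datetime(when.astimezone(timezone.utc), usegmt=True)}


def parse_link_header(value):
    """Split a Link header value into (target, {parameter: value}) pairs."""
    links = []
    for target, params in LINK_RE.findall(value):
        attrs = {}
        for match in PARAM_RE.finditer(params):
            quoted, bare = match.group(2), match.group(3)
            attrs[match.group(1).lower()] = quoted if quoted is not None else bare
        links.append((target, attrs))
    return links


if __name__ == "__main__":
    # Hypothetical response header, for illustration only.
    sample = (
        '<http://example.org/wiki/Page>; rel="original", '
        '<http://example.org/timemap/Page>; rel="timemap", '
        '<http://example.org/old/Page>; rel="memento"; '
        'datetime="Thu, 12 Jul 2012 00:00:00 GMT"'
    )
    print(accept_datetime_header(datetime(2012, 7, 12, tzinfo=timezone.utc)))
    for target, attrs in parse_link_header(sample):
        print(attrs.get("rel"), target, attrs.get("datetime", ""))
```

The parser is deliberately small: it handles quoted parameter values that contain commas (as HTTP dates do) but does not aim to cover every form the Link header syntax allows.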
How does it impact the editing community?
Memento makes no visible difference for editors. We're certainly not moving away from page histories or revisions. The difference is going to be that browsers that support Memento will be able to search our content using a format standardised with other parts of the Internet.
Why support it?
Wikipedia and the Wikimedia movement projects are leaders in the field of open data, offering unmatched transparency and open licensed content. By making it easier for Wikipedia to be syndicated by humans and machines alike, we help spearhead the linked data vision of an interoperable ecosystem of open licensed, structured information. Memento also helps tackle the problem of linkrot, where sites archive their content or delete it from public view, creating dead links and obsolete references. A standardized method to pull up what a page used to look like at a given point in time makes referencing easier and more reliable. By supporting this protocol we pave the road for other projects and organizations to do the same. Some major players already have, including:
- the W3C;
- the Internet Archive;
- the Dublin Core Metadata Initiative;
- the UK Web Archiving Consortium;
- the British Government Web archive; and
- the Government of Canada Web archive.
FAQ
- How does Memento work?
- This screencast demonstrates Memento navigation using the MementoFox browser extension.
- The video shows seamless web navigation in the past across multiple systems, each of which uses different methods to archive their past revisions.
- The video starts with LANL's memento experiment page. This page uses a Transactional Archive to store revisions.
- After retrieving the archived copy (a memento) by choosing a particular datetime (07/12/2012) in MementoFox toolbar, we navigate to a Wikipedia article.
- The version of the Wikipedia article retrieved is the one that was live at the chosen datetime. MementoFox currently uses a proxy setup to retrieve revisions from Wikipedia. This proxy uses Wikipedia's XML interface and hence is slow; native support for Memento in Wikipedia would yield much better performance.
- After navigating through a few Wikipedia articles, we arrive at a memento in the Dublin Core wiki. This is a MediaWiki with the Memento extension installed. From here, we navigate to an external link, and the memento is fetched from the UK National Archives.
- We follow a link in this article to navigate back to Wikipedia.
- Note that after MementoFox is turned off, the current version of the page is reloaded.
- Will it make deleted pages accessible again?
- No. It only makes accessible those pages that are already accessible via the history pages and the existing revisions API. It does not (and cannot) make deleted pages accessible. An earlier version of this extension had an option that allowed deleted revisions to be displayed. This feature was added after the feedback we received from the wikitech-l <at> lists.wikimedia.org mailing list. In that earlier version, displaying deleted revisions was subject to the following constraints:
- The feature was turned off by default and, if needed, had to be turned on in the LocalSettings.php file during installation.
- With the feature turned on, only users with appropriate privileges could view these deleted pages.
- The pages were shown only in "Edit" mode.
- This feature has now been removed from the extension, so no deleted revisions can be accessed through it.
- How will it impact performance?
- The performance hit for a TimeGate request ("please give me this article at this time") is significantly less than generating a history page, as it doesn't need to build a list of revisions, just find the single version closest in time. As such this would be an advantage, performance-wise, if people were to use it.
- The performance of a TimeMap request is almost identical to that of generating a page of history: both need to know 500 revision IDs. However, the TimeMap does not include diffs or anything else, just the links. Furthermore, TimeMaps can be cached, which would reduce the load on the database.
- Will transcluded template content also be backdated? (At present, old article revisions always display current template text.)
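The TimeGate lookup described in the performance answer above can be sketched as a single binary search over revision timestamps rather than a walk through the whole history. This is only an illustration: it assumes "closest in time" means the revision that was live at the requested moment (the latest one saved at or before it), uses hypothetical in-memory lists in place of a database index, and is not the extension's actual code.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def select_revision(timestamps, rev_ids, when):
    """Return the id of the revision that was live at `when`.

    `timestamps` must be sorted ascending and aligned with `rev_ids`.
    If `when` predates the first revision, the first revision is returned.
    """
    if not timestamps:
        raise ValueError("page has no revisions")
    # Index of the first revision saved strictly after `when`.
    i = bisect_right(timestamps, when)
    return rev_ids[max(i - 1, 0)]


if __name__ == "__main__":
    # Hypothetical revision history, for illustration only.
    stamps = [
        datetime(2012, 1, 1, tzinfo=timezone.utc),
        datetime(2012, 6, 1, tzinfo=timezone.utc),
        datetime(2012, 8, 1, tzinfo=timezone.utc),
    ]
    ids = [101, 102, 103]
    print(select_revision(stamps, ids, datetime(2012, 7, 12, tzinfo=timezone.utc)))
```

A TimeMap, by contrast, would return the whole list of timestamp and link pairs, which is why it costs about as much as a history page but can be cached.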
- Turn it on. For all the reasons stated above. Cielbie (talk) 21:39, 29 August 2012 (UTC)
- Support for all reasons mentioned above. Wikipedia should continue to be a pioneer for open data. (Although would this not be a discussion better suited to Meta?) --JmaJeremy✆✎ 21:50, 29 August 2012 (UTC)
- Support - Although there are some discussions that need to be had about it, I think it's a good idea and worth supporting. Kumioko (talk) 00:34, 30 August 2012 (UTC)
- I don't see why not, provided deleted revisions or pages are not made visible using this protocol. Kurtis (talk) 00:49, 30 August 2012 (UTC)
- Support provisionally. See my message in the oppose section. Gigs (talk) 15:37, 30 August 2012 (UTC)
- Support. Absolutely this is a good idea. In the first place the information is already available to anyone who wants to look for it. It can't hurt to standardize access through something like this extension. In the second place, and more importantly, Wikipedia is itself an object of intense academic research, including research into the negotiation of revisions and the changes in articles over time. Academic interest in the social mechanics of Wikipedia is growing rapidly as well. Since this would make it easier for researchers to understand the state of Wikipedia at a given time I think it would be quite beneficial to them. The proposal states that this extension will make it easier to data-mine the history of Wikipedia, and that will also enhance the availability of information for research. Yes, we are here to build an encyclopedia, but we are committed to building it through transparent processes as well. Let's not forget that either. -- alf laylah wa laylah (talk) 18:24, 31 August 2012 (UTC)
- Support, duh. Standardising the process of getting old versions of documents seems a good idea. The opposes are unconvincing: this really is just technical plumbing, I don't see any particular reason why the community ought to be opposed. Put it this way: if you've never used the command line tool cURL, this probably doesn't affect you. It simply exposes what is already exposed by 'view history' but in a more computer-friendly manner. As for the argument that people are already breaking the intentions of Wikipedia's share-alike licensing by extracting facts from it and then storing them in proprietary, copyrighted databases... yeah, well, the solution to that isn't to stop technological advancement until we have a way of cracking down on the big bad pirates. That way lurks digital rights management! So, yeah, turn it on. No reason not to. --Tom Morris (talk) 21:34, 31 August 2012 (UTC)
- Support: it's harmless. It duplicates the page history mechanics with the off-wiki standard way. That said, I would be more enthusiastic about a git gateway. -- Dmitrij D. Czarkoff (talk) 11:03, 1 September 2012 (UTC)
- Strong support. Wow. An amazing idea. It's virtually always good to standardise with other major websites, because building the Semantic Web is an important goal. This is only going to enable computers to do something that's already human-doable. Nyttend (talk) 22:52, 1 September 2012 (UTC)
- Support. Facilitates transparency and access to history a lot. -- JakobVoss (talk) 11:00, 2 September 2012 (UTC)
- Support. Please implement this important standard for others to follow. -- Acka47 (talk) 11:47, 2 September 2012 (UTC)
- This is all a little over my head, and I had never heard of the Memento extension until now. However, if everything works as this proposal says it does, then I see no reason not to enable it on the English Wikipedia. I have read the opposition and see no argument that seems compelling to me. Support. AGK [•] 22:41, 2 September 2012 (UTC)
- Support. Some doubters were asking if Memento aids our purpose to build an encyclopedia. From what I understand, it clearly does. • Technical reasons: Ask yourself why Wikipedia is not created on paper, like most encyclopedias before. It's because the Web lends itself well for collaboratively creating and using an encyclopedia. As the Web progressed towards Web 2.0, WP got a little bit left behind in technical terms. As the Web further progresses towards Web 3.0 aka the Semantic Web, it is crucial for WP to support it early on. This is what the Memento extension will do: To provide a well-defined interface between the Semantic Web and Wikipedia's versioning system. In the Web 3.0 world of linked open data, it is state-of-the-art to maintain a revision history and to expose it to other technical systems in a standardized way. • Non-technical reasons: Historians and all other researchers (including citizen researchers and all Wikipedians) when taking a historical view on some subject need to investigate their subject in the light of resources as they were at the time, i.e. not only from hindsight. If I understand Memento correctly, then it aims to automate surfing the Web as it was at a given point in time. People who use this service will be aware that they are looking at historical versions -- this is just what they want from this service. --Thüringer ☼ (talk) 07:50, 3 September 2012 (UTC)
- Note that all of that is possible within the existing Wikipedia system without adding to the concerns noted below. --Nouniquenames (talk) 14:26, 3 September 2012 (UTC)
- Not at all; there's currently no interaction between our versioning system and the wider semantic web. Okeyes (WMF) (talk) 15:36, 3 September 2012 (UTC)
- Indeed. It is not possible from outside of the system. My point was that this (viewing historic versions...) is currently possible from the Wikipedia site. --Nouniquenames (talk) 23:51, 3 September 2012 (UTC)
- Support. Clearly we will have Memento integrated in other web apps. Web of Data needs a fourth dimension. Again Wikipedia would be at the leading edge. PascalC (talk) 11:35, 3 September 2012 (UTC)
- Support. This project is an excellent idea for the web, and hopefully will be widely adopted. Wikipedia has always been very open about making article histories available; this is simply a logical step to make them accessible in a standardised way. the wub "?!" 11:37, 4 September 2012 (UTC)
- Support. It's unlikely to make much of a difference to casual readers, but this is a very useful tool for the small community who use it, and the costs of implementation are low. Andrew Gray (talk) 12:04, 4 September 2012 (UTC)
- Support assuming that the extension won't be a significant detriment to the speed of the MediaWiki software. I don't know if the Memento protocol will catch on, but it certainly sounds interesting, and I believe that, as Wikipedians, we should always strive to be on the cutting-edge. ❤ Yutsi Talk/ Contributions ( 偉特 ) 17:44, 4 September 2012 (UTC)
- Support. No new information will be disclosed, and User:Okeyes (WMF) has confirmed below that there is no objection from the Wikimedia Foundation legal team. Opponents' concerns are met by the fact that readers will be aware that they are not looking at the current version and will therefore appreciate that pages will be more likely to contain inaccuracies which have since been corrected (in some cases, it might even assist with spotting vandalism which has subsequently crept in).
Interested readers will benefit from support for a full point-in-time view of (non-revdeleted) pages, including any transcluded pages (unlike the limited existing functionality of History links, which misleadingly show the current version of any transcluded templates). Readers who don't want the new interface won't have to use it, so there is no detriment. -- Richardguk (talk) 15:10, 5 September 2012 (UTC) Edit: Per new FAQ above, it seems that transcluded text is unlikely to be updated in the current implementation, so the rendered text will be the same as viewing the relevant History link. Still no downside though. -- Richardguk (talk) 21:39, 5 September 2012 (UTC)
- Support. The possibility to see a whole article the way it was (not just the wikitext of that article, without reflecting earlier versions of transcluded elements as well) is a big feature. That alone makes it worthwhile even aside from the broader linked data ecosystem benefits. So it looks like the extension doesn't handle template histories without adding a hack to the parser. Still, maybe in the future (when parser performance isn't so critical an issue) that can be enabled, and in the meantime, there's still not much downside. --Ragesoss (talk) 17:09, 5 September 2012 (UTC)
- Support; leading the way with Web standards like this is exactly where we should be. James F. (talk) 18:12, 5 September 2012 (UTC)
- Support. Promoting historianship and data archaeology is exactly the sort of thing that we should be doing as the world's largest knowledge project. The arguments presented below against this are nothing but handwaving and scaremongering. -- Hex (❝?!❞) 15:12, 9 September 2012 (UTC)
- Support as long as rev-del material isn't available. Hobit (talk) 19:17, 9 September 2012 (UTC)
- Support. I'm persuaded this will be useful and won't make it easier to access deleted articles and revisions. causa sui (talk) 19:29, 14 September 2012 (UTC)
- Support. I see no reason why not; the opposers didn't convince me why not to turn on this great feature! mabdul 17:04, 15 September 2012 (UTC)
- Support. Sounds right up Wikipedia's alley. --Martin Wisse (talk) 16:27, 19 September 2012 (UTC)
- Strong support for every reason mentioned in the RfC itself. St John Chrysostom Δόξατω Θεώ 09:16, 20 September 2012 (UTC)
- Strong support. This will allow history browsing across websites in a consistent way, and we'll be the biggest participant, creating great value for Memento all by ourselves and also encouraging many others to participate. I would also find it personally interesting to browse Wikipedia at a fixed time in the past, to get a "feeling" for how things have changed over time. Very much worthwhile. Dcoetzee 00:51, 24 September 2012 (UTC)
I won't say I oppose it, but is it really such a great idea? Do we really want to make past versions of our pages (with all the vandalism, libel and unwanted personal details they may contain) any easier for the world to access than we do already? Victor Yus (talk) 19:32, 29 August 2012 (UTC)
- Is there a mechanism that would prevent viewing of pages that have been rev-deleted? "....We are all Kosh...." <-Babylon-5-> 19:38, 29 August 2012 (UTC)
- From what I understood, implementing this proposal will only make the already public information available in a format conforming to the Memento specification. Keφr (talk) 19:41, 29 August 2012 (UTC)
- Correct. It's a standards-based way to get to pages in history; it doesn't make anything accessible that would not be otherwise. It is a very different question whether or not past pages should be removed completely. azaroth42 (spec editor)
- If someone decides to make a copy of a page which is later revdeleted, Wikipedia can't do anything about it. This wouldn't change with the new extension; so yes, there is a chance revdeleted content could be accessible if other websites choose to make a local backup of Wikipedia. This would mean, however, that each of those other websites would be responsible for any copyright violations and claims of libel, which are the primary reasons for revdeletion. --JmaJeremy✆✎ 22:29, 29 August 2012 (UTC)
- This extension would definitely have to honor your user privileges on Wikipedia: if you can't access a deleted revision, you won't be able to negotiate it via this extension either. JmaJeremy is totally right about third-party reuse of Wikipedia data; check out this paper if you are interested in the survival of revdeleted content. --DarTar (talk) 22:48, 29 August 2012 (UTC)
- I get the argument that anyone could be mirroring later-deleted content now. The concern that I and others share is whether this would lead to the creation of much easier ways for the public to view deleted content. For example, currently we have an informal gentleman's agreement with Google that they will very promptly remove deleted content from their caches. I'd be willing to support turning it on for now, but if it results in much easier access to deleted content, I think we'd want it off again. Gigs (talk) 15:35, 30 August 2012 (UTC)
- I Strongly Oppose this. Pages are often in a state of flux for a reason. In theory, they are being made progressively better and more reliable. If someone has a real reason for looking at a past revision, it can be done using the current system. I am against making that any easier for the casual reader. --Nouniquenames (talk) 15:44, 30 August 2012 (UTC)
- Oppose We take a very incremental approach to creating content, here. Early versions of articles may be unreliable, biased, wrong, spammy, flawed in any number of ways, which professionally written and edited content sites are not. While I'm sure there are many cases where valid content has been lost, such as in edit wars or plain vandalism, I'd rather editors make a determination and restore it themselves, rather than allowing searches to sift through all the previous versions of articles for anything one might find. Shawn in Montreal (talk) 20:23, 30 August 2012 (UTC)
- (Re Shawn, Nouniquenames) The extension only makes pages that are already available accessible, via a standards-based mechanism as well as the existing history pages mechanism. One of the main strengths of Wikipedia is its openness and transparency about the editing process and history. Being able to see old revisions of a page is an important aspect of the credibility of the site. The Memento protocol and extension do not take any standpoint on what should be accessible, and if a revision should not be accessible, then there are existing mechanisms to deal with that. The extension would simply make it easier for editors to find the older pages, in order to make the determination as to whether to restore previous text or not. Azaroth42 (talk) 22:22, 30 August 2012 (UTC)
- Whoa, wait. So deleted revisions will be shown, it's just the content that will remain hidden? - jc37 00:13, 31 August 2012 (UTC)
- My (potentially flawed) understanding is that visibility will be the same as for any current user without special rights. Any revdel might show up that a revision was deleted, but it would be impossible to see what that revision was or what it contained. If I'm wrong, someone please correct me. --Nouniquenames (talk) 05:41, 31 August 2012 (UTC)
- I'm not against the ability to view past revisions. I'm against making it easier for any random passer-by to grab an old, potentially inaccurate version of a page without doing so via specific, deliberate, intentional, locally controlled steps. (I hope I stressed that enough.) The intelligence requirement for reading content from this site is not particularly high, and that is a good thing. That said, it takes an extra few clicks to see an old version of a page for a reason. If we wanted everyone to see the old version, we wouldn't have changed it. If someone wants to see the old version, it is possible, but the (minimal) extra time and effort help to weed out those who might inadvertently stumble across an old article (possibly vandalized or incomplete) and think it the current. To enable this, vandalism response would almost be required to include a revdel in essentially every case lest a vandalized page be what people see. Further, if we are simply enabling a standardized API (as I understand it), we lose that control over how easily one might accidentally see an old version of a page as the current (causing or reinforcing the revdel requirement). Absent giving everyone the ability to delete pages (at least from Memento), which would likely give quick rise to new, inventive forms of vandalism we generally don't have to deal with now, I cannot see this as a good thing. --Nouniquenames (talk) 05:25, 31 August 2012 (UTC)
- It is exactly because [e]arly versions of articles may be unreliable, biased, wrong, spammy, flawed in any number of ways that the Memento extension is so important. When taking a historical view on any matter, it is crucial to be aware of the fact that Wikipedia at the time may have had an article about it that was very different from what it is at present. See more of my reasons for supporting this request above. There I also explain why no random passer-by will accidentally grab an old, potentially inaccurate version of a page. --Thüringer ☼ (talk) 08:02, 3 September 2012 (UTC)
- Oppose We have enough edit wars over current versions of articles. Last thing we need is more wars about trying to rewrite the past (through revdel campaigns). More deeply, I disagree with the concept that we should "spearhead the linked data vision of an interoperable ecosystem of open licensed, structured information". We are here to write an encyclopedia--a work whose intended mode of use is humans reading articles--not an "ecosystem of open licensed, structured information." There are already companies sucking structured info out of WP content in order to undermine our share-alike licensing policies, recycling Wikipedia's work into proprietary media. We don't have the legal means to stop them from doing that (assuming we wanted to), but it's not something we should be assisting as volunteers. They're getting the big bucks for it, let them do the work themselves. 69.228.170.132 (talk) 09:32, 31 August 2012 (UTC)
- How would this extension aid them at all, unless they really really wanted to publish a book on "every version of the article on physics ever" instead of a book on "the article on physics"? Okeyes (WMF) (talk) 20:37, 31 August 2012 (UTC)
- Strong Oppose: I find persuasive the argument that our purpose is to build an encyclopedia, and that Memento does not aid in doing so. The point of the Wayback Machine is to view webpages or content that have expired or been deleted. We already have the means to do so on Wikipedia, but since our goal is to provide the best version of information in a current format, exactly what good is this supposed to do us? Ravenswing 09:57, 1 September 2012 (UTC)
- So, you oppose it because it doesn't contribute to our goal. What is the cost or undermining of that goal that this extension creates? :). Ironholds (talk) 14:17, 1 September 2012 (UTC)
- It takes time and resources (of servers and developers) which could be put toward other issues, for one. --Nouniquenames (talk) 14:26, 1 September 2012 (UTC)
- No, the extension has already been developed. It's done. We're talking about how to turn it on. Yes, it'll take some server cycles - but this isn't going to be enabled unless Ops confirm that it scales. Ironholds (talk) 14:41, 1 September 2012 (UTC)
- What leads you to believe - no need to answer, because the question is rhetorical - that it wins any hearts and minds for the Support side to rebut every single Oppose voter, any more than it's the case anywhere else on Wikipedia? I stated my position. I am not minded to change it just because you think this extension is Wicked Cool. If you want to debate it, take it down to the section clearly marked "Discussion." Ravenswing 17:47, 1 September 2012 (UTC)
- I'm rebutting one oppose vote :). And I've not explained that I think this extension is Wicked Cool; I've explained that your one reason for opposing it is somewhat weak. Ironholds (talk) 23:57, 1 September 2012 (UTC)
- And I agree that it should totally be discussed. Can I suggest you look at the discussion section, particularly the bit about server resources? Of particular interest is the line "The performance hit for a TimeGate request is significantly less than generating a history page, as it doesn't need to build the list, just find the version closest in time. As such this would be an advantage, performance wise, if people were to use it". Ironholds (talk) 23:58, 1 September 2012 (UTC)
- It will never be done. To say otherwise is to misunderstand software development. By the same logic, we could have stopped at the first functional version of MediaWiki. There will always be bugs, bugfixes, new features, and testing against bloody everything that is added or tweaked later. Further, not only does it not help us, it duplicates an existing functionality. Also, it apparently has not even begun. Please read the intro: This is a preliminary RFC to assess community interest among English Wikipedia users for this functionality. No significant commitment of Wikimedia Foundation engineering resources has been made yet. --Nouniquenames (talk) 05:30, 2 September 2012 (UTC)
- No, it hasn't been evaluated by Ops yet. The extension has been fully developed by the MementoWeb developers, and evaluated to make sure it's compatible. There seems to be a misunderstanding about how MediaWiki extension development works; the WMF doesn't write all of them (or even most of them); our volunteer developer community is responsible for quite a few. Ironholds (talk) 10:49, 2 September 2012 (UTC)
- Oppose I simply don't understand what value this will give us. Page histories are already viewable, except for where we wouldn't want them viewable. How will having another way to access old pages "help referencing"? We can already access old pages and Wikipedia is not a reliable source. I'm open to being persuaded, because I know I'm not a techie and I can't believe I'm understanding this right. --Dweller (talk) 12:37, 4 September 2012 (UTC)
- So, us turning it on does not directly help referencing. What it does do is lend extra support to the protocol - which already has orgs like the W3C behind it - and promote its adoption. If it gets adopted widely, it solves the perennial linkrot problem; sites that revise their pages but have this protocol will have "old" versions still stored that we can link to in a way standardised internet-wide. That's the ideal, anyway. Okeyes (WMF) (talk) 13:05, 4 September 2012 (UTC)
- I still don't quite understand this. We're told that Wikipedia's having this protocol will merely make available, in some more robot-friendly way, the historical revisions that we make available already. But apparently other sites' having this protocol will magically cause them to make available lots of historical revisions that they don't make available already. How's that then? Victor Yus (talk) 14:11, 4 September 2012 (UTC)
- Because they'll have a standardised way of storing them they currently lack? Okeyes (WMF) (talk) 14:33, 4 September 2012 (UTC)
- I see - so the protocol provides a way of storing pages and a way of making them available, and Wikipedia plans only to use the making them available part - would that be right? Victor Yus (talk) 15:16, 4 September 2012 (UTC)
- Hi Victor. The protocol provides a method of making old versions of web pages available in the same way across different sites. It doesn't say anything about how the pages are stored (e.g. Wayback stores in special WARC files, MediaWiki in a database). The extension just turns on a more client-developer-friendly way to say "I want the version of (article) as it was at (date and time)." HTH Azaroth42 (talk) 15:53, 4 September 2012 (UTC)
- Let me get this straight... the proposal doesn't have any benefit whatsoever for us... it's about us setting an example for others to follow and if other sites that we use as RS ever follow suit, we will benefit? --Dweller (talk) 16:02, 4 September 2012 (UTC)
- That depends a lot on which "us" you're talking about :) If you don't ever want to see old versions of articles, then it doesn't benefit or hinder you at all. On the other hand, if you want to browse wikipedia as it was in the past, it's a huge benefit to not have to go back through the history list for every single article. Azaroth42 (talk) 16:12, 4 September 2012 (UTC)[reply]
- So would this extension allow me to do that? I.e. look up the article for X as it was on 1 Jan 2008, and click a link in that article to Y to see the latter article as it was on 1 Jan 2008 and so on? With transcluded templates and so on also as they were on 1 Jan 2008? If so, that would be great and I'd be a strong supporter. But if not, then providing only a partial solution has the potential to mislead people into thinking they're seeing a version that existed then whereas in fact it didn't, which might go against the spirit of the protocol(?) Victor Yus (talk) 16:38, 4 September 2012 (UTC)[reply]
- User:Andrew Gray brought up on my talkpage the idea that it would be consistent; I can't verify directly (I'm not one of the devs!) but I'll ask them to jump in :). Okeyes (WMF) (talk) 16:49, 4 September 2012 (UTC)[reply]
- Looks like the answer is yes! Okeyes (WMF) (talk) 16:50, 4 September 2012 (UTC)[reply]
- Yes it will. We'll make a screencast to demonstrate it. Thanks (and to Andrew) for bringing up this aspect. Azaroth42 (talk) 16:52, 4 September 2012 (UTC)[reply]
- If you use firefox, you can install the mementofox extension and see for yourself how it works. This works with wikipedia now. It makes all the links act as if they were in the chosen time period. I've found this to be useful for rescuing deadlinks; you look at a version of the page when the link was live, click on it, and it takes you to the archived version of the page, which can then be inserted into the current version of the wikipedia article.— alf laylah wa laylah (talk) 18:04, 4 September 2012 (UTC)[reply]
- We have added a screencast in the FAQ section. --Hariharshankar (talk) 20:00, 5 September 2012 (UTC)[reply]
How well would it work for Category:Virginia cities for 22 August 2006? This category contains a template which had been modified significantly (see here what {{cfd}} looked like at the time), and then later moved. עוד מישהו Od Mishehu 20:05, 29 August 2012 (UTC)[reply]
- Addressed here: http://www.mediawiki.org/wiki/Extension:Memento#Templates TL;DR: It can work if a small patch is also included into the core parser. azaroth42 (spec editor)
- With that patch it would be amazing. It's always been frustrating that we can't easily go back and see what heavily templated pages (such as Main Page) looked like in the past. If it's really that simple, and wouldn't cause any problems performance-wise, I'm all for it! the wub "?!" 23:05, 29 August 2012 (UTC)[reply]
- How well would it work for the actual category listing for Category:Virginia cities for 22 August 2006? Would it somehow show the pages that were in the category then, or the pages that are in there now? Anomie⚔ 01:55, 30 August 2012 (UTC)[reply]
- It would retrieve the old version of the page exactly as in the history for the category. The links would not be rewritten to point directly into other history pages, but clients such as the MementoFox browser add-on take care of this for you. Thus, if you set your datetime preference to be August 22, 2006 and clicked on a link in a page, it would take you to the version of the new page closest in time to August 22, 2006. If you install the MementoFox browser add-on, you will see how it works via a (slow and computationally expensive) proxy-based solution. Azaroth42 (talk) 22:34, 30 August 2012 (UTC)[reply]
- If you were replying to my question, I think you misunderstood it. If not, feel free to delete this comment. Anomie⚔ 23:55, 30 August 2012 (UTC)[reply]
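The "closest in time" selection mentioned in this thread can be sketched roughly as follows. This is only an illustration, not the extension's code: the `closest_revision` helper, the revision ids, and the timestamps are all hypothetical, and a real TimeGate might instead prefer the latest revision at or before the requested time.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def closest_revision(revisions, target):
    """Return the (timestamp, rev_id) pair nearest in time to `target`.

    `revisions` must be sorted by timestamp, oldest first.
    """
    if not revisions:
        raise ValueError("page has no revisions")
    timestamps = [ts for ts, _ in revisions]
    # Index of the first revision at or after the target time.
    i = bisect_left(timestamps, target)
    # Only the neighbours on either side of that index can be closest.
    candidates = revisions[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda rev: abs(rev[0] - target))


# Hypothetical history for a page (ids and dates are made up).
history = [
    (datetime(2006, 8, 1, tzinfo=timezone.utc), 101),
    (datetime(2006, 8, 20, tzinfo=timezone.utc), 102),
    (datetime(2006, 9, 15, tzinfo=timezone.utc), 103),
]
wanted = datetime(2006, 8, 22, tzinfo=timezone.utc)
print(closest_revision(history, wanted)[1])
```

Because the lookup is a binary search over sorted timestamps, it does not need to walk the whole history list.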
A couple concerns
I support the idea but I do have some concerns about it in addition to the ones above.
- Does it include all namespaces (especially non-article ones like File, User, Special, MediaWiki, etc.)? If so, how will it reflect images or articles that have been deleted due to copyright?
- It does not include namespaces for which there is no history, such as Special. So it works for User (eg User:Azaroth42) and User_talk but not Special (eg not Special:Preferences).
- If an article or its content is deleted due to Copyright via how is that relayed through Memento?
- As above, it is not retrievable via Memento if it is not retrievable via the History tab.
- Is it for English Wikipedia only? What about the other languages, commons or sister projects like Wiktionary and Wikinews? Each will have its own issues with this.
- The extension is a generic MediaWiki extension. Sister projects would have to enable it themselves, based on their own discussions, one imagines. I defer to Wikipedia folk as to different languages, but assume that it would.
- Will this cause any performance problems with the servers?
- The performance hit for a TimeGate request is significantly less than generating a history page, as it doesn't need to build the list, just find the version closest in time. As such this would be an advantage, performance wise, if people were to use it. The performance of the TimeMap request is almost identical to that of generating a page of History; they both need to know 500 revision ids, however the TimeMap does not include diffs or anything else, just the links. Furthermore, TimeMaps are able to be cached, which would reduce the load on the database.
That's all I can think of at the moment. Kumioko (talk) 00:39, 30 August 2012 (UTC)[reply]
- Thanks for the comments Kumioko :) Azaroth42 (talk) 22:45, 30 August 2012 (UTC)[reply]
Thanks for the quick reply Azaroth. I still support the idea but I have some trouble with the User namespace being a part of it. Sometimes people put personal info on their User page, some with the understanding that it applies primarily to WP because most mirror sites don't pull that data in, so I imagine some users are going to have some heartburn about that. Kumioko (talk) 00:14, 1 September 2012 (UTC)[reply]
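The point above that a TimeMap carries "just the links" for up to 500 revisions can be illustrated with a small sketch. Everything here is an assumption for illustration: the `render_timemap` helper, the `wiki.example` URL, and the revision ids are hypothetical, and the link-format line layout only loosely follows the style used in the Memento specification.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import quote


def render_timemap(script_url, title, revisions, limit=500):
    """Render up to `limit` revisions as link-format entries (links only, no diffs)."""
    entries = []
    for timestamp, rev_id in revisions[:limit]:
        # MediaWiki addresses an old revision by its oldid parameter.
        url = f"{script_url}?title={quote(title)}&oldid={rev_id}"
        stamp = format_datetime(timestamp, usegmt=True)
        entries.append(f'<{url}>; rel="memento"; datetime="{stamp}"')
    return ",\n".join(entries)


# Hypothetical revisions (ids and dates are made up).
sample = [
    (datetime(2006, 8, 20, tzinfo=timezone.utc), 102),
    (datetime(2006, 9, 15, tzinfo=timezone.utc), 103),
]
print(render_timemap("https://wiki.example/w/index.php", "Example", sample))
```

Since the output depends only on revision ids and timestamps, a response like this is straightforward to cache.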
Past revisions = pre-deleted revisions?
So will this include the revisions of a page for the month of its existence prior to speedy deletion as a BLP attack page? Being the encyclopedia that anyone can edit means there are a lot of edits which may be considered problematic, to say the least.
And this doesn't even get into edit warring or patent nonsense or privacy.
And what about robots.txt? Will all those pages' revisions be included? Will all talk pages? - jc37 00:41, 30 August 2012 (UTC)[reply]
- See above.
"It's a standards based way to get to pages in history, it doesn't make anything accessible that would not be otherwise."
Not sure what you mean about robots.txt. the wub "?!" 09:42, 30 August 2012 (UTC)[reply]
- I believe he means that tools such as the Internet Wayback Machine typically respect robots.txt (the English Wikipedia's robots.txt file is here). From our article, "Robots.txt is used as part of the Robots Exclusion Standard, a voluntary protocol the Internet Archive respects that disallows bots from indexing certain pages delineated by the creator as off-limits." --MZMcBride (talk) 20:05, 30 August 2012 (UTC)[reply]
- Thanks MZM. And nod, though also whether such revisions will now suddenly be open to be mirrored through bypassing robots.txt. I don't know enough about Memento to know how this will affect things.
- I've read over the extension several times, and get the idea that deleted revisions, while standardised through Memento, will not be viewable except by those with the ability to view deleted. Same with oversight, etc. (And does that mean we will be even more vulnerable to a compromised admin account/admin tools gained on the sly just to robot-copy everything?)
- I don't understand how all this will work, and maybe it's because I don't quite understand the extension.
- Right now, at the Internet Archive (and other such places) I can look at a previous version of a page which has since been deleted. Will this extension allow that from Wikipedia directly? And further, will this extension make it easier for other sites to save deleted contributions? In other words, even though a page is deleted, through the Memento standardisation, will the bots now be able to copy any edits (deleted and otherwise) and install them at their own wiki, and now they can undelete at their site, etc.? This has privacy ramifications etc.
- (I'm hoping the response is: "Chuckle, and no, you don't understand what this will actually do, let me more clearly explain..." : ) - jc37 23:25, 30 August 2012 (UTC)[reply]
- 1) Access to crawlers is guided by robots.txt file. All the old revisions in wikipedia are in the /w/ path and the robots.txt file for en.wikipedia.org reads:
- User-agent: *
- Disallow: /w/
- Hence, no bots have access to these old revisions and memento does not change anything about this.
- 2) Deleted revisions will not be accessible using this extension. Please refer to FAQ for more information. --Hariharshankar (talk) 14:36, 1 September 2012 (UTC)[reply]
- 1.) Thanks for the clarifications concerning robots. Though I'll note that we've long seen that there are bots which ignore the exclusions in robots.txt.
- 2.) As I have already said, I've read the extension which notes that that is the intention. But a quote about a certain road being paved with good intentions comes to mind. Hence why I am asking these questions : )
- Would it be possible to create a temporary wiki, port a few hundred edits of varying types to it, and show exactly how this would work? This was done prior to the implementation of the filter, and I think it would help deal with concerns about this. - jc37 23:03, 2 September 2012 (UTC)[reply]
- That would be awesome. We'd have to stick it on test.wikimedia or prototype.wikimedia before deployment anyway - one of those, maybe? Okeyes (WMF) (talk) 01:04, 3 September 2012 (UTC)[reply]
- Setting up a Memento-powered MediaWiki instance on Labs sounds like a no-brainer. --DarTar (talk) 17:45, 5 September 2012 (UTC)[reply]
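The effect of the two robots.txt rules quoted in this thread can be checked with Python's standard `urllib.robotparser`. This sketch assumes only those two rules (the real file contains more), and the bot name and example URLs are hypothetical. As noted in the thread, robots.txt is voluntary, so this shows only what a compliant crawler would do.

```python
from urllib.robotparser import RobotFileParser

# Only the two rules quoted in the discussion; the real file is longer.
rules = [
    "User-agent: *",
    "Disallow: /w/",
]

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical URLs: an old revision lives under /w/, the current page under /wiki/.
old_revision = "https://en.wikipedia.org/w/index.php?title=Example&oldid=12345"
current_page = "https://en.wikipedia.org/wiki/Example"

old_allowed = parser.can_fetch("ExampleBot", old_revision)
current_allowed = parser.can_fetch("ExampleBot", current_page)
print(old_allowed, current_allowed)
```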
WMF legal advice
Have you consulted the Wikimedia Foundation for confirmation that their legal staff have no objection to the proposal?
You have made clear that hidden revisions would not be exposed by this new interface, so it does not amount to publishing any more information than is already available via the History tab. Also, users would presumably be well aware that they were not viewing the most recent versions of articles (and transcluded templates), and hence would appreciate that they might be more likely to see inaccurate or potentially libellous information than if they were reading the current website.
On the other hand, the interface is intended to make accessing non-revdeleted revisions easier, and inaccuracies are often simply reverted rather than revdeleted, so remain visible in the history page and under the proposed interface. On balance, I don't think that is objectionable, but it would be good to know whether WMF share this view.
— Richardguk (talk) 10:33, 3 September 2012 (UTC)[reply]
- Well, the RfC was started by a pair of staffers in their professional capacity - but you raise an excellent point. I don't see a problem myself, but I don't work for Legal; I'll check in with them today (if I can find them. It's Labour Day, apparently, which means the dang 'merkins get the day off). Okeyes (WMF) (talk) 10:43, 3 September 2012 (UTC)[reply]
- FWIW, we already note that "This is an old revision of this page ... it may differ significantly from the current revision." when you look at a history version. (I would love it if this box was more emphatic!) Readers using a complex opt-in web history tool are probably more likely than casual browsers to be aware of this, but I think it would be quite reasonable to have a (click-to-dismiss?) banner across the top of all pages reminding them of the Wikipedia-specific risks of older content.
- (That said, I suspect many people will be reading it this way to look for those inaccuracies and oddities - "what was being reported about X on this day, before we knew about Y"?) Andrew Gray (talk) 11:59, 4 September 2012 (UTC)[reply]
- I've spoken to Michelle Paulson over at legal; she has no objection as long as it doesn't make visible anything that isn't already visible (which it shouldn't). Okeyes (WMF) (talk) 12:55, 5 September 2012 (UTC)[reply]
Behaviour of Memento
My vague memory of having Memento described to me in a web-archive context is that it would allow date preferences to carry through to linked pages (where supported) - you'd read the article on United States, as of 1/1/08, which would say that George W. Bush was president, and then click through to that article, where it would retain the date and give you a version as of 1/1/08, etc.
Does the MediaWiki installation work like this? If so, it'd make a more compelling case for its usefulness than the examples above, which are all one-page scenarios. Andrew Gray (talk) 16:09, 4 September 2012 (UTC)[reply]
- This is exactly how it works. If you have your datetime preference set to 1/1/08 and you click from one article to the next, you'll end up at the version of the clicked on article from that same date. Azaroth42 (talk) 16:25, 4 September 2012 (UTC)[reply]
- Thanks. I suspected this was the case, but I was having trouble persuading Firefox to play nicely with a test MediaWiki installation to confirm it! Andrew Gray (talk) 16:29, 4 September 2012 (UTC)[reply]
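The carried-over date preference described in this section can be pictured as a client that sends the same `Accept-Datetime` request header (the header Memento uses for datetime negotiation) with every link it follows. The sketch below is an assumption about how a client might do that; the `MementoSession` class is hypothetical and no network request is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


class MementoSession:
    """Hypothetical holder for a reader's datetime preference."""

    def __init__(self, preferred):
        if preferred.tzinfo is None:
            raise ValueError("preferred datetime must be timezone-aware")
        # HTTP dates are expressed in GMT, so normalise to UTC first.
        self.preferred = preferred.astimezone(timezone.utc)

    def headers(self):
        """Headers to send with every request, whichever link was clicked."""
        return {"Accept-Datetime": format_datetime(self.preferred, usegmt=True)}


session = MementoSession(datetime(2008, 1, 1, tzinfo=timezone.utc))
for clicked in ("United_States", "George_W._Bush"):
    # The same preference accompanies each page the reader visits.
    print(clicked, session.headers())
```

Keeping the preference in one place is what lets a click from one article to the next land on the version from the same date.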
On the fence
I will either strongly oppose or strongly support this, but I haven't decided which yet :-) The implications are complex and I would urge people not to make reflex judgements on something like this.
- I doubt that improved access to prior versions will be a great advantage to conventional content-building. Maybe a little. We improve articles in a series of edits, each better than the last; in that context why would improved access to earlier (less good) versions be so helpful?
- However, I think it could have some advantages in dealing with problematic editing. Not the blatant vandalism, but the stuff we have more trouble dealing with - sneaky vandalism, long-term pov-pushing, subtle copyvio, &c - could perhaps be easier to handle if we had Memento in our toolbox. For instance, when dealing with neutrality problems which span several articles that touch on some controversial issue, I often find myself trawling through the histories of several articles. However, I'm not yet sure how much Memento would help us with this - it might be helpful to explore some more use-cases &c..? The obvious answer to that is that a trial would help, of course.
- I don't care, much, whether Memento makes life easier or harder for third parties (ie. anything outside the sphere of enwiki content, editors, and readers). I don't think the stakes are so high since most third parties are interested in our current content rather than past content. Objecting on the basis that it might become easier for somebody to use content, when our whole mission is to provide content that's open for others to use, seems odd to me.
- It's important to bear in mind that we don't always revdel the bad stuff. Srsly. In principle we're supposed to; but in practice, once an edit has been reverted, the Bad Stuff is not visible to casual readers and the community relaxes a bit. Also, the number of editors with a revdel button is much lower than the number of editors with a revert button, and the latter group includes some highly active bots. Recently I found an editor who had a history of incoherent IP edits disclosing great personal detail - email address, credit card numbers, name, address, DOB &c - and requested oversight; then, trawling through article histories, I found lots more of these edits on other articles, which had simply been reverted rather than revdel'd. A tool like Memento would be a double-edged sword, in that it's easier to find the Bad Stuff that had simply been reverted - making it more visible to the wider internet but also making cleanup easier.
Those are my principles; if you don't like them, I have others. bobrayner (talk) 11:07, 18 September 2012 (UTC)[reply]
- Mind if I join you on the fence? : )
- I agree with the above, and the last point is a big deal breaker for me. But as I yet dunno what's going on with deleted revisions, I'd like to see the test version first. - jc37 01:12, 26 September 2012 (UTC)[reply]
- My understanding is the extension would not make any data public that is not already. However, it may make it easier to stumble across reverted revisions without trawling through long page histories looking for them. Dcoetzee 07:20, 27 September 2012 (UTC)[reply]
- So, in response to your not caring about whether it makes life easier or harder for third parties: the point of Memento is that it makes it easier for readers to access Wikipedia content as it was in the past. As for old edits that should have been revdelled: they are still accessible now. This isn't a good criticism of Memento: it's like opposing a wheelchair ramp being added to the local school because it'll allow paedophiles in wheelchairs to break in. Making access to old edits easier is a good thing: they have substantial educational value for people wanting to understand the history and culture of Wikipedia and the history of the subjects we cover. See the heavy metal umlaut video and the history of the Iraq War through Wikipedia edits. —Tom Morris (talk) 18:19, 2 October 2012 (UTC)[reply]
There is consensus for a test run pilot of Memento on the English Wikipedia, provided that it does not make content not already available through the History tab available to Memento users. MBisanz talk 22:58, 4 October 2012 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. No further edits should be made to this discussion.