Wikipedia:History of Wikipedian processes and people/2006 Citizen Valley interview of Brooke Vibber

The following is a transcript of a 2006 CitizenValley.org interview of User:Brooke Vibber, who was the Wikimedia Foundation's Chief Technical Officer from 2005 to 2010 and its first employee.[1][2] The captions/questions have been copyedited for grammar.

Transcript

MySQL Conference April 26th, 2006: A look inside Wikipedia with Brion Vibber, CTO, Wikipedia

“

My name is Brion Vibber, I'm the Chief Technical Officer of Wikimedia Foundation. So I've been working on the software that we use to run the site and on maintaining the site itself for about the last four years, initially as just a volunteer working on things like the software, all under an open-source development model, and then, last year, I got hired to work full time on that. So I've been doing that ever since.

”

About the Wikimedia Foundation

“

The Wikimedia Foundation is a non-profit corporation. It's based in Florida. There is about 3½ employees now, I think. There's myself and another programmer; we have a guy who comes in half-time to work on the servers, then we have a guy in the office who works on paperwork and handling the phones and taking complaints and things of that nature, and then, another person in the office, part time, for some things. And that's about it as far as employees go. Most of our income comes from donations—we run a fundraiser couple of times a year—and we bring in enough to keep the site running; primarily goes out to adding more servers to handle the increasing load, because our popularity level and web traffic just keep going up. If you go to the comparative website traffic graphs on Alexa Internet (it's an Amazon service; I think it's alexa.com) they have this neat little service where you can compare the relative traffic any site to any other site. So it's kind of frightening to compare us to CNN or something, and, see, CNN is—you know—they have a lot of traffic but it's kind of steady, and our traffic is kind of going up and up and up and up; it's really kind of frightening, but it's very exciting as well. So we're trying to keep up with traffic. It's mostly adding hardware to maintain that, and a couple of people. Right now, we have about a hundred application servers and about a dozen database servers, and we also have several off-site proxy cache servers in Netherlands and Korea, which speed up access to Europe and Asia. And that's Wikipedia and several of its sister sites—We have the Wiktionary dictionary; Wikimedia Commons is our multimedia repository, so it's mostly a lot of photographs under Open Content License, Creative Commons, public domain, that sort of thing. So then, those are used in Wikipedia but also on some other sites. [Speaker B: Wikiquotes [sic]] Exactly.

But we have recently split the back-end database for the English-language Wikipedia, which is our largest site. But that's totally transparent from the outside. That's just—we have—you know—one master database server and then another master database server, and they're just churning around lots of data. Then we also have four Asian wikis, which we actually host separately. The Japanese, Korean, Malay, and Thai Wikipedias are actually hosted on our servers in Korea. And that's partly an experiment in being able to maintain a second data center, for—you know—as we continue to expand, eventually we're gonna have to move more things out, and that sort of keeps things working.

”

About MediaWiki

“

Well our wiki software, which runs Wikipedia, is called MediaWiki, and it's an open-source program under the GPL license, so it's available for free. Anyone can use it. There are thousands of both public and private sites using it. If you go around to the various companies here and ask them about it, a fair amount of companies are using a MediaWiki site internally. A lot of others are also using different wikis. So there's many different programs for this [Speaker B, mostly unintelligible].

Some of them are more commercial; some of them are more non-commercial. Ours is primarily non-commercial, but we do get used in companies. One thing that we're going to be looking at is how we can help to serve them as well. So, we might at some point start offering some consulting and support for that.

”

What is Wikipedia?

“

Wikipedia, as a sort of the idea of it, is to be very open to accepting contributions, to the point that a lot of material can be put together very quickly, and it can be updated immediately when events change. For instance, when something important happens in current events, we have an article on it right away. In a more traditional encyclopedia, it might be—you know—the next year they have a yearly update that has it; it's not really in a standard version for years and years yet. But of course, the flip side of that is: Because it is so very easy to contribute something, it's also very easy to contribute something that's not totally accurate. So, there is a bit of a need to have other people go through and do some fact-checking and some reviewing. [Speaker B: So, how do you make sure of that fact checking?]

”

How do you prevent inaccuracies?

“

Now, at the moment, it's primarily just a matter of community self-policing. Because you have a lot of other people working on those same articles, and as one person adds something, somebody else takes a look at it... if it's maybe not totally accurate, might be not totally sourced, it'll get filtered out, or they'll go back to try and find more information. But that doesn't always happen immediately. Sometimes, it maybe takes a few months, in the worst cases, to identify that—you know—"This problem is an error. This article is kind of problematic". But at the moment, all of our articles are in a work-in-progress state. So, we can't really guarantee anything at a particular time.

If it's a very high-profile article—for instance, the George W. Bush article, obviously, is very closely watched—there's a lot of people who are working on it, and we've even gone so far as to require that new users first sign up and—you know—wait a few days before they can make a change to it. And if they do make a change, then it gets immediately reviewed. Whereas with other more rarer topics, that's not really as regular a process.

So, what we can expect to see over the next several months as we more kind of formalize this process is: We're going to have—on the articles that are being very closely watched and reviewed—there will be a—you know—nice little marker that says: "This page has been reviewed, has been fact-checked. We know that the information in this article is pretty good. But you can also go see what the latest changes are on this page to see the latest event that just happened in Congress or whatever". And then on the other pages that aren't being as closely watched, there will be a nice little marker that says: "This page is probably okay, but—you know—take it with a grain of salt. It hasn't been formally reviewed yet". Or if it's a brand-new article and hardly anyone's edited it, then that tends to be sort of the most problematic type of thing. Or, for instance, somebody adds a fake or not-totally-accurate biography of someone, that maybe is either inaccurate or sometimes just not very flattering. (Well, maybe quite possibly accurate but not well-sourced, not well-researched.) Those tend to be the areas where maybe people get a little upset that this information is there, but maybe it's not accurate or sourced. And those are the kind of places where it's going to be quite explicit, disclaimer-wise I guess, in saying: "This has not been reviewed. If there's a problem with it, you know, click here", making that a little more of an explicit process.

”

Lessons learned from the latest controversies

“

Well, the biggest lesson is that it works a lot better than most people think. But the second lesson is that when it doesn't work, people do get upset. So, sort of, management of that is something that we sort of left to the winds in the early days, that "oh, it'll get fixed eventually". But as we've become more popular, a more often consulted resource, we do maybe have a little bit of responsibility to say, you know: "Yes, there is a different level of quality on some pages than on others". And we want to make sure that we recognize that in saying: "This page is good, this page is a work in progress". And making it very easy for people to get involved—not just in directly adding an article—because not everyone is going to just go in and delete the—you know—"I assassinated Bobby Kennedy". They're going to just say: "Hey, you know, this is wrong. I'm upset about it. Who's got to fix it?" [Speaker B: And who's fixing it?] That's a good question. Because the first part is knowing that there's something wrong and the second part is having someone to fix it. And so there's sort of a back-channel to some degree in that if you're particularly upset, you can call up the office or send an email to the abuse address. But that's... not everyone sort of leaps to that level. So having sort of a middle path is something that's important.

”

Who checks for accuracy of information?

“

We have a number of volunteer contributors, including several hundred volunteer administrators who have shown some commitment to the project and who work on cleaning up problem pages: vandalism, copyright issues, things like that. If there's a huge problem, then somebody calls the central office and they say: "You know, this article is really bad, somebody needs to take care of that". And they'll contact some of those administrators who will go in and—you know—take a look at it and either get rid of it entirely or just fix it up.

”

What are the benefits for volunteers?

“

For a lot of people, it's just a hobby. It's a more productive way to spend your time than playing EverQuest or something, certainly. A lot of people like to do something that is productive. People like to write, they like to share their knowledge, they maybe like to proofread... Some people... that's a major hobby, and they like fixing all the punctuation errors. So it depends. But people who have an interest in some area of knowledge, for instance, if they're a history buff or a film buff or whatever, they'll often work on articles in those topics because they do have a fair degree of domain knowledge. And some exercising that by writing things, by reading other things, by going: "Oh, I'm not sure if that's totally accurate, but I'm going to go research it. And that's got to be fun". And then you have a chance to fix that up and learn a little something yourself.

”

Is Wikipedia at risk?

“

I don't think it's ultimately a huge risk, but I think it does require a little bit of discipline. One of the things that we're trying to lay down as a policy is particularly: Biographies of living persons are the biggest issue, where someone takes exception to the way they're characterized. And when we have such a biography, the policies of how you write and how you source information are going to necessarily be a bit different from someone who—you know—died 200 years ago and has 50 books written about them. So, sort of keeping up with that and having better reporting tools for people to see what's been added and sort of what sorts of pages are going to be problematic to very quickly go in and find those and treat them differently should, I think, keep it pretty well in line.

”

What is a wiki?

“

A wiki is a website that multiple people can change the contents on. The difference between a wiki and a blog is primarily in two things, which are what you can change and when you can change it, as far as how things are organized. In a blog, you basically have a post which is a sort of stream of things one at a time, one after the other, and then you add a comment on top of a post. In a wiki, you have more of a space rather than a time. So you have a page, and then you have another page, and another page, and all of those pages can be updated continuously. So, a page on a certain subject is always current because it can always be changed. And instead of just having a comment on the end, you can actually alter the document itself. So it becomes more of a living document, and that instead of... For instance, if you go to a blog post from three years ago, there might be some comments on it that say: "Oh, this thing has changed since then, that thing has changed since then". But if it's a wiki page, then you can directly change the page itself. So the article is always alive and always current. You sort of have a more live, up-to-date environment.

”

Wikis and other collaboration tools?

“

The biggest difference to something like Lotus Notes is that it is very immediately accessible and it's sort of open by definition. If you want to work with Lotus Notes, first of all, it's a tightly controlled commercial system. You have to have the special Lotus Notes client, or you have to use a sort of hacky web interface onto that. Whereas wikis are very much adapted to the web itself. They're accessible from anywhere. You don't need additional software to access them. You just can consult them like any website, and then click "edit", and you're in. And you can immediately make changes.

”

About supporting other types of media

“

On Wikipedia, we make extensive use of photographs. We can upload image files. We have somewhat less use of different kinds of media—audio and video—partly just because those are harder for users to contribute and they're harder to distribute in a compatible way. Video encoding is sort of a tricky issue to really get right. But certainly, that's a problem space that's being very actively worked on these days with things like Google Video and YouTube and all that. That's something that's really hopping in the modern web world. So that's something that we're going to see more on our wiki sites in the future.

”

What's coming?

“

One of the... a couple of the more immediate things that we're going to be seeing over the next few months is, first of all, we're unifying our own login system so that all Wikipedia sites will use the same logins, because, right now, it's {separate}, it's kind of cruddy. But... [Speaker B: You mean the languages, the different languages ...] Because right now, if you go to English Wikipedia and French Wikipedia, you have to create your account separately on each one. So that's kind of a pain. So we're going to merge that. So, and they have to create an account once.

But once that's done, we can start to do really cool things like using distributed authentication and integration systems like OpenID. So that, for instance, if you are—you have an account on, say, LiveJournal, and then you come to a Wikipedia article and you want to make a comment on it, your comment there can be associated with your LiveJournal identification. So you're not just some random person: You have some sort of consistent ID across sites. And by the same token, if you have a Wikipedia identification, then you go to a LiveJournal or whatever blog that talks about Wikipedia, and you make a post about it, that post can be explicitly identified with your Wikipedia account. So you're not just some random person on LiveJournal. You really are such and such person on Wikipedia. And that sort of goes beyond to all kinds of blogs and wikis and other sorts of community websites. As far as being able to provide a system where you are explicitly a person, as far as being able to sort of link all those communities together in an explicit way.

”

Thank you!

“

Sure!

”

Video

Google Video [dead link]
YouTube

References

[1] User:Kaldari (31 December 2012). "Interview with Brion Vibber, the WMF's first employee". The Signpost. Wikipedia. Retrieved 24 February 2024.
[2] Baker, Nicholson (10 April 2008). "How I fell in love with Wikipedia". The Guardian. Retrieved 24 February 2024.
