Wikipedia:Wikipedia Signpost/2011-08-01/Research interview
The Huggle Experiment: an interview with the research team

As part of the 2011 Summer of Research, the Wikimedia Foundation's Community Department has announced an experiment to investigate potential improvements to first-contact between new editors and patrollers using the Huggle anti-vandalism tool. The experiment aims to test "warning templates that are explicitly more personalized and set out to teach new editors more directly, rather than simply pointing them to policy and asking them not to do something", according to Wikimedia Foundation Fellow Steven Walling. To gain an insight into how such initiatives come about and what goes into planning them, the Signpost interviewed researchers R Stuart Geiger, Aaron Halfaker and Steven Walling.
Can you tell us a little about the Summer of Research project, how it came about, how you got involved and what sorts of questions you hope to investigate this year?
- Steven: The Wikimedia Summer of Research (WSoR) is a three-month intensive to study aspects of participation in Wikipedia that may have a significant effect on editor retention. Per the Board resolution on Openness and the Foundation's Annual Plan for 2011-12, recruiting and retaining editors for Wikipedia is one of our top priorities.
- Zack Exley, our Chief Community Officer, designed the summer to really dig deeper into the exact areas of English Wikipedia and other projects that have the largest effect on new editors and whether they stick around. The Editor Trends Study gave the Foundation a high-level understanding of the problematic trends in participation, but it didn't tell us with certainty what internal community factors have an impact. We need to have data that we are confident in if we're going to make good decisions as a movement.
- I personally got involved because, as a Fellow at the Foundation, research has been one part of my job. I currently share the responsibility for leading the WSoR team with Diederik van Liere and Maryana Pinchuk. (Diederik has experience with the technical side of this project, Maryana is a qualitative researcher with an academic background, and I lend community experience to round out the leadership team.)
- As for the question set, we built an enormous, multi-part list publicly on Meta, but that turned out to be just a starting guide. We have been structuring the summer as a series of weekly sprints; to get a feel for the research topics that have been and are currently being explored, I'd check out the public list on our Meta page. Because we have a team with a wide variety of skills, we've looked at many different aspects of Wikipedia as a community so far.
How did the idea to experiment with Huggle's standardised warning system originate?
- Stuart, Aaron:
- WSoR's goal is to understand the decline in new editors, so one of the areas we focused on is new editors' experience in the community.
- Dr. Melanie Kill suspected that welcome messages might have an effect on how new editors perceive the community.
- Hugglers send out the most messages to new editors.
- We wanted to see if we could improve conversion (from damage) and other retention rates by just changing the wording of the message.
How do lofty strategic goals like "Support the recruitment and acculturation of newer contributors" get translated into practical initiatives such as this?
An experiment of this kind seeks to understand social phenomena using technical methodologies. Does this involve coordination between, for instance, the Community Department and the Huggle developers, or is the experiment conducted by researchers proficient in social statistics or the digital humanities? Can you talk a little about the backgrounds of those involved?
Aaron is a computer science graduate student from the University of Minnesota. He has been an editor since February 2008 (as EpochFail) and has been publishing academic research on Wikipedia since WikiSym 2009. He specializes in statistical data mining and designs user scripts for Wikipedia intended to understand and improve editor interactions.
Stuart (User:Staeiou): Certainly! I've been a Wikipedia editor since late 2004, and have been studying the project as an academic since I started writing my undergraduate senior thesis in 2006. Since that time, I've moved in and out of all kinds of academic fields and disciplines, trying to gather the conceptual, theoretical, and methodological tools necessary to study something as complex as Wikipedia. At present, I'm a doctoral student at the School of Information at the University of California, Berkeley, and I have a keen interest in both the digital humanities and social statistics movements. In particular, I am an adherent of the sociotechnical systems approach, which demands that we think about how social and technical phenomena are inherently intertwined, especially when we study processes in communities as technologically mediated as Wikipedia. Our motto, "the free encyclopedia that anyone can edit," speaks to this principle: what Wikipedia is as a community cannot be fully understood without taking into account the code upon which it runs, and vice versa. Huggle is a great example of this: scripts, tools, and bots like Huggle, Twinkle, and User:ClueBot have become the predominant way in which new users are introduced to Wikipedia. In fact, here's a statistic that is hot off the research press: almost 75% of newbies have their first talk page message sent to them by one of those semi- or fully automated software systems. Stu (aeiou) I'm Researching Wikipedia 00:12, 29 July 2011 (UTC)
How were the parameters of the experiment – number of warnings delivered, proportion of changed warnings – decided upon?
We settled on three variables for testing in our experiment: personalized, teaching-oriented and image. Dr. Kill, a professor of rhetoric, produced personalized and teaching-oriented versions of the default warning template for Huggle; Stuart and Aaron then expanded these templates with image/no-image versions and prepared a random template generator. Our requirement for the number of experimental welcome/warnings is based on a little bit of statistical algebra that lets us predict how many observations we'll need to find statistically significant differences between the variables.
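Editor's illustration: the interview does not describe the team's generator or their calculation, so the following is only a minimal sketch of what the two steps could look like. It assumes a fully crossed design over the three variables (eight template variants), which the interview does not confirm, and it uses the standard normal-approximation sample-size formula for comparing two proportions. The function names, the 5% and 8% rates, and the default significance and power levels are all hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist

# The three variables named in the interview. Treating each as on/off and
# crossing them fully (2 x 2 x 2 = 8 variants) is an assumption of this sketch.
FACTORS = ("personalized", "teaching", "image")
CONDITIONS = list(itertools.product((False, True), repeat=len(FACTORS)))


def pick_template(rng):
    """Choose one template variant uniformly at random (hypothetical helper)."""
    return dict(zip(FACTORS, rng.choice(CONDITIONS)))


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Observations needed per group for a two-sided two-proportion z-test.

    p1 and p2 are the outcome rates expected under the two message versions;
    alpha is the significance level and power the chance of detecting the gap.
    """
    if p1 == p2:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so the demo run is repeatable
    print(pick_template(rng))
    # Made-up rates: 5% of warned newcomers return under the default message
    # versus 8% under a reworded one.
    print(sample_size_per_group(0.05, 0.08))
```

The smaller the expected difference between two message versions, the more warnings have to be delivered before the difference can be told apart from noise, which is why the required count is worked out before the experiment starts.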
The Huggle experiment is not the first attempt to investigate the interactions of patrollers and new page creators. A notable community-led effort was the Newbie treatment at Criteria for speedy deletion experiment in 2009, where experienced editors (this interviewer included) posed as inexperienced article creators in order to gain an insight into how new contributors were treated in the patrolling process. The experiment attracted significant controversy, due to ethical concerns surrounding informed consent of the subjects. To what extent did the research team consider or engage with the relevant subjects from the editing community (i.e. Huggle patrollers, new contributors) prior to this experiment?
- A public notice was posted at Wikipedia:Village_pump_(technical)#Huggle_experiment
What do the researchers hope to learn from the experiment and what are the preliminary expectations or hypotheses to be tested?
Huggle users come across hundreds of potential editors every day. Quite a lot of these editors are testing whether they can, in fact, edit Wikipedia by damaging an article. We suspect that the reaction these potential editors receive affects their decision to register an account and try contributing productively. We hypothesized that the tone of the welcome/warning message could be an important factor in this decision. We have Hugglers testing a few variations of the first-level warning message to find out if we are right.
What sorts of approaches might we expect from the Foundation in testing and improving usability, reader engagement and editor retention in the months to come?
Discuss this story
"Fewer than half of the newbies investigated received a response from a real person during their first 30 days". I think we really dropped the ball here. Interaction is a major way to recruit newbies and hopefully turn them into "regulars". OhanaUnited Talk page 05:18, 2 August 2011 (UTC)
Perhaps two critical concerns will govern the efficiency with which the problem can be addressed: (i) how long into a newbie's edit-history the patterns become clear, and (ii) the extent to which they can be identified by a bot (including whether a bot could do the initial "easy" filtering and pass a minority on to human eyes for higher-level sorting to identify the promising newbie-pluses for human interaction – a three-tiered filtering, as it were). Of particular interest might be the grey area of newbies – not those who will clearly stay and those who clearly won't (or who we clearly do or don't want to stay), but those where final stage, human interaction, has a reasonable likelihood of making the difference, of bringing them over the line. Finding the best bot/human mechanism for rationing the supply of "newbie mentors" to this prioritised editorial demographic, IMO, is the challenge. After that, a future project could work on developing guidelines for the best ways in which to interact with newbie-pluses. Tony (talk) 02:41, 3 August 2011 (UTC)