Wikipedia:Articles for deletion/MurmurHash (2nd nomination)
- The following discussion is an archived debate of the proposed deletion of the article below. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the article's talk page or in a deletion review). No further edits should be made to this page.
The result was no consensus. No evident agreement as to whether or not the topic is sufficiently notable for inclusion. Currently the nominator is the only editor arguing for deletion, but they seem to provide a strong case which is backed up by relevant standards and policies. There are a number of keep "votes", but they are weak in nature and often don't address the main issue. Overall, I can't see any consensus either way. –Juliancolton | Talk 00:33, 29 November 2009 (UTC)[reply]
- MurmurHash
The article was previously deleted because, despite having plenty of references, it had no wp:reliable source references at all. The article's subject is thus not currently wp:notable. The article was rewritten and recreated without solving this problem in any way, and hence we're back at AFD again, not so very long after the previous one. Basically, in the absence of a reliable source (with proper editorial control and peer review of the information), the problem is that hash functions are two a penny, and there's no reason at all under Wikipedia's policy that this should be kept. This is particularly so after we had a previous delete for the same reason. This article seems to me to be intended as a form of advertising right now; Wikipedia is not supposed to be here to create notability. - Wolfkeeper 21:15, 14 November 2009 (UTC)[reply]
- Well, I anticipated this reaction, so I checked with both of the admins who had deleted the article previously and got their input and permission before restoring it. Both agreed that the Hammer academic article and the NIST listing are sufficient WP:RS for notability. For some reason, you don't.
- Keep in mind that the bar for notability for non-cryptographic hashes on Wikipedia is not especially high. I was unable to get Fowler Noll Vo hash deleted even though it only references Noll's own site. The other comparable hash, Jenkins hash function, likewise references only Jenkins, although one of these references is in a published (industry but non-academic magazine) article by him. This leaves me to wonder why you hold this particular algorithm to a higher standard than the rest. Did MurmurHash do something bad to you once?
- Also, please WP:AGF and avoid impugning the motivations of editors. There is absolutely nothing in the article that constitutes advertising, and it's just an algorithm that was placed in the public domain for all to use freely, so there is no "product" to advertise. Instead, I would appreciate it if you could address the conflict between your claim that there are no reliable sources and the presence of these demonstrably reliable sources. Phil Spectre (talk) 00:31, 15 November 2009 (UTC)[reply]
- Both of them are one word mentions. They don't in my opinion confer any significant notability. The Hammer mentions it once, and then never mentions it again in the Conclusions. The NIST mentions it once at the bottom of the page. These are not substantive references.- Wolfkeeper 01:03, 15 November 2009 (UTC)[reply]
- See below for refutation of the "one word mentions" notion. Phil Spectre (talk) 01:40, 15 November 2009 (UTC)[reply]
- I don't really think you're comparing like with like, if you look at the AFD discussion for Fowler Noll Vo Hash, you'll see that they found over 105 hits in google scholar alone. Last time we couldn't find any hits in google scholar on murmurHash, and you haven't quoted any. Indeed, it's the fact that you've recreated this article without fixing the problems with it that is so frustrating, and prompted the AFD.- Wolfkeeper 01:03, 15 November 2009 (UTC)[reply]
- If hits were all that mattered, then the 584 site:.edu hits that User:Gruntler found for MurmurHash are on par with the 1580 for FNV. There were apparently 105 Scholar "hits" for FNV, but not a single one made its way into Fowler Noll Vo hash, so why should we care? For that matter, you don't need 105 hits to be notable or even 2; a single WP:RS of WP:N suffices. Hammer, NIST and Chouza are all good enough for that. There are also good industry and open-source citations, which you've arbitrarily chosen to ignore. Please don't poison the well or stack the deck. Phil Spectre (talk) 01:40, 15 November 2009 (UTC)[reply]
- I've never seen an article with this little in the way of reliable sources, and I've seen articles with much more than this fail multiple AFDs.- Wolfkeeper 01:03, 15 November 2009 (UTC)[reply]
- So you're saying you never saw Fowler Noll Vo hash, which has no references except to Noll's site? I'm sorry, but what you're saying just doesn't fit the evidence. Phil Spectre (talk) 01:40, 15 November 2009 (UTC)[reply]
- I know that after MurmurHash got AFDd you AFDd at least two other hash function articles, but both were convincingly kept. Whether an article is kept or not is often more about whether AFD participants can establish that the encyclopedic topic is viable, and Fowler Noll Vo Hash convinced in AFD based on google scholar and other criteria. Last time with MurmurHash the answer was no. I'm still not convinced on this; it still seems very, very thin indeed. There are only a tiny number of reliable sources, and this is not terribly substantive.- Wolfkeeper 02:40, 15 November 2009 (UTC)[reply]
- I'm sorry, but I'm getting the impression that you're speaking hastily without checking your sources. In specific, what you're claiming is demonstrably false.
- I did launch an AfD on the FNV article, and I still believe that, as it was then, it did not deserve to be on Wikipedia. The solution was to trim the article down so that it wasn't too big for its sources. I did the same for Jenkins, removing references to a long-dead website, but I never initiated an AfD against it. In fact, the one on FNV was my first and only, and I now regret it because I was clearly too eager to delete instead of repair. I made the mistake of basing my argument for deletion on yours, and learned that it was flawed. That's why, after some reluctance, I worked to restore MurmurHash.
- As for your conclusion about the sources, it is just as hasty. I suggest that you read below about academic publications that provide clearly reliable sources for notability before you jump to repeating your conclusion. I would then encourage you to be specific about what you find lacking, instead of simply stating that you're unimpressed. Such statements speak of your own unspecified personal criteria, not anything relevant to Wikipedia, so they are not particularly persuasive in the context of a !VOTE. 15:20, 15 November 2009 (UTC)
Delete per the lack of reliable sources. This topic fails WP:N. Searches in Google News Archive and Google Books come up with nothing, while a Google Scholar search returns unrelated results. (This rationale is the same as last time because no new sources have surfaced since September 2009.)Cunard (talk) 01:14, 15 November 2009 (UTC)[reply]
- I'm sorry, but this turns out not to be the case. As before, Google Scholar search returned academic articles by Hammer and by Chouza et al., which discuss their reasons for using MurmurHash in the course of achieving their primary goal and reference it properly in their end notes. Contrary to what Wolfkeeper has claimed, these were not all one-word mentions. For example, http://laboratorios.fi.uba.ar/lsi/chouza-tesisingenieriainformatica.pdf compares MurmurHash to FNV-1a and SuperfastHash a half dozen times, evaluating them on the basis of key size, speed and collisions. This is precisely the sort of analysis in an academic setting that establishes notability. Interestingly, the article goes on to explain that they chose to use MurmurHash because it was a better hash than the other two, which is itself notable. I do realize that you might not speak Spanish, but anyone can get a decent translation through Google Language Tools, so I can only conclude that you just didn't read this article, which means that your !VOTE here should be understood in that context. The goal here is not merely to give your opinion, but to give an informed opinion. Phil Spectre (talk) 01:27, 15 November 2009 (UTC)[reply]
- Very true; there are more sources. I don't see http://laboratorios.fi.uba.ar/lsi/chouza-tesisingenieriainformatica.pdf in the article, which is why I did not evaluate it. This source appears to indicate notability; however, the depth of coverage is not enough for it to establish notability by itself — more sources are needed.
Many of the sources in the article are unreliable: this is a blog and this is a passing mention on a code website. Because my old rationale no longer applies and because I am undecided about the notability of MurmurHash, I will abstain for now. Cunard (talk) 01:48, 15 November 2009 (UTC)[reply]
- Thank you for evaluating this issue more carefully, whatever you conclude. To be clear, the links you mentioned were not provided primarily to establish notability, but to add value to the article. It is typical for articles on algorithms to link to available sample implementations, and unlike the now-deleted version, this one does not contain source code or even pseudocode (whereas some articles do one or both).
- Having said this, the fact that the algorithm is included in a number of significant open-source projects is an indirect indication of its notability, regardless of how little fanfare there is about it. If we put aside WP:RS for a moment and just Google for insight, it's easy to find discussions about MurmurHash within these projects' forums, showing that it's genuinely being used, evaluated and discussed. For example, this blog entry evaluates MurmurHash for use in Bloom filters under Hadoop and directly endorses it. And, yes, I know it's a blog and not WP:RS; I'm just offering insight. Phil Spectre (talk) 02:00, 15 November 2009 (UTC)[reply]
- I realized that you were spot on about the Chouza reference not being in the article, so I've changed that. Rather than taking this AfD as something personal, I'm using it as an opportunity to get input to improve the article. As a result, I've also added an academic reference that confirms how well the hash works in a Bloom filter, which is another indication of notability. Phil Spectre (talk) 02:21, 15 November 2009 (UTC)[reply]
- Articles should mainly contain reliable sources. Blogs, which can be published by anyone (no editorial oversight), do not belong in the article unless they meet the third paragraph of Wikipedia:Reliable sources#Statements of opinion.
I searched the academic reference, http://www.inesc-id.pt/ficheiros/publicacoes/5453.pdf, but could not find which page(s) MurmurHash is on. Please provide the page number(s). Cunard (talk) 02:30, 15 November 2009 (UTC)[reply]
- Unfortunately, this PDF is all bitmaps, not text. I don't have the tools handy to OCR it, but using the end notes as an index made it easy enough to find that page 14 contains the mention. The exact text is: "The choice of the hashing algorithm to be employed within D2STM has been based on experimental comparison of a spectrum of different hash functions trading off complexity, speed and collision resistance. The one that exhibited the best performance while matching the analytically forecast false positive probability turned out to be MurmurHash2 [5], a simple, multiplicative hash function whose excellent performances have been also confirmed by some recent benchmarking results [24]." This is simultaneously a "one word mention" and a WP:RS of WP:N. Phil Spectre (talk) 03:55, 15 November 2009 (UTC)[reply]
- I've been reading up on WP:RS. What I've learned is that whether something is a reliable source depends on what we're using it as a reliable source for. For example, we link to Appleby's self-published official MurmurHash site, which includes the source code but also offers benchmark results that suggest the algorithm is substantially faster than alternatives. It would be entirely inappropriate for us to reference these self-published claims in the context of asserting them as true, and in fact, we do not. Rather, just as it's ok to quote someone's blog purely as a source for what that person has said, it's fine to point to Appleby's site for the exact code that he released into the public domain. In a similar way, we use Landman's blog to point to the C# port that he published, not to repeat his opinion about MurmurHash. After all, who is Davy Landman anyhow and why should we care? (Sorry, Davy, nothing personal.)
- So, in this case, even though blogs are not generally WP:RS, we still reference two of them, one of which is a primary source that's got possible WP:COI. However, because of how we use them, these two are not only valuable additions to the article, but are WP:RS. In contrast, when we do speak of possibly controversial claims, such as the speed of the hash, we carefully rely on a combination of published academic papers and respected open-source projects. I suppose that, if we really wanted to, we could quote Appleby's speed claims as "according to the author", but we don't need to so we don't. Phil Spectre (talk) 14:41, 15 November 2009 (UTC)[reply]
- I've also been reading WP:N, particularly WP:GNG, and what I found supports the notability of this article. I'm not going to copy and paste the whole thing here, but I will quote parts and refer to the rest in a way that assumes you have it up in another window. The basic requirement is that it must receive "significant coverage in reliable sources that are independent of the subject". I think the issue here is that these terms are being misunderstood in a way that inflates the difficulty of qualifying. Fortunately, the page does go into more detail, which I believe settles the issue.
- Significant coverage: The source has to cover the topic "directly in detail", adding that this means "more than a trivial mention but it need not be the main topic of the source material". I would suggest that Chouza is a particularly fine example of this, as is Couceiro.
- Reliable: Clearly includes published academic papers.
- Sources: Requires that they be secondary. All of the sources used in MurmurHash to establish notability are secondary.
- Independent of subject: Requires that secondary sources not be mouthpieces for the primary source. With papers from multiple countries, I don't believe this is an issue.
- Presumed: States that this is sufficient to support a presumption of notability, but it's still possible that we may decide for other reasons that it's not notable. Again, this doesn't seem to be an issue here. Rather, we seem to agree that, so long as WP:GNG was satisfied, this topic would be notable.
- Having broken down the requirements and analyzed them piecemeal, I'm convinced that they are being met. Phil Spectre (talk) 15:05, 15 November 2009 (UTC)[reply]
- Note: This debate has been included in the list of Computing-related deletion discussions. -- • Gene93k (talk) 01:55, 15 November 2009 (UTC)[reply]
- Thank you. Phil Spectre (talk) 02:00, 15 November 2009 (UTC)[reply]
- Comment: I have notified the participants of the previous AfD about this discussion. Cunard (talk) 02:30, 15 November 2009 (UTC)[reply]
- Weak keep From both Phil Spectre's and Wolfkeeper's comments it does seem that the bar for notability for hashes is too low. Also, the argument from failure (or success) of Afd for other articles is an inherently weak one. For those who propose deletion, what would a reasonable notability guideline be for hash functions? --Bejnar (talk) 14:33, 15 November 2009 (UTC)[reply]
- Thank you for participating. If you don't mind, I'd like to ask you a related question: What would it take to convince you to change from Weak Keep to Strong Keep? What's missing? Phil Spectre (talk) 15:23, 15 November 2009 (UTC)[reply]
- It has very poor notability: just 3 hits in Google Scholar, and none in Google Books. And two of those are one-word or one-sentence mentions. The other one liked it; it benchmarked well against just two other contenders in his application. I think it takes more than that; I don't think it's in the top 3.5 million topics (which is what Wikipedia looks like it will reach eventually), not by a long way.- Wolfkeeper 15:46, 15 November 2009 (UTC)[reply]
- Perhaps I'm mistaken, but it appears that you may well lack the ability to correctly use Google Scholar, as your numbers are simply wrong. If you look at MurmurHash, you'll see it lists 4 academic publications, all found through Google Scholar. That's not all of them, either, but I made an effort to avoid duplicates from the same sources, based on advice from User:Jclemens. Likewise, I've already corrected your claim about "one sentence mentions", showing it to be both false and irrelevant, and addressed WP:GNG on a point-by-point basis. I don't think any of your objections thus far have survived inquiry. Phil Spectre (talk) 18:38, 15 November 2009 (UTC)[reply]
- Keep. Unless there's some sort of conflict of interest, I don't see why we need to delete. WP:Wikipedia is not paper. It would be nice if the article could be expanded a bit. — sligocki (talk) 15:42, 15 November 2009 (UTC)[reply]
- To be honest, we don't know who Phil Spectre is; the account was created shortly after MurmurHash was deleted. He seems uncommonly keen on this particular hash function, and has, since he created the account, mostly tried to get other competing hash functions' articles deleted.- Wolfkeeper 15:51, 15 November 2009 (UTC)[reply]
- As a matter of fact, we don't know who you are, but we do know who I am. While I put my name and personal credibility on the line with each edit I make, I would very much appreciate it if you kept the focus of this discussion where it belongs, on WP:N not User:Phil Spectre. The only one I know of with any WP:COI is User:Aappleby, who has been entirely up front regarding his identity and role. I'm sorry I have to ask this, but is there something that you'd like to share with us at this time about your own identity? Phil Spectre (talk) 18:38, 15 November 2009 (UTC)[reply]
- I'm on 25,000+ edits; I don't think I have anything to prove. You suddenly appeared two days after the previous article was deleted, and it turns out that that article was created by a known sock puppet of a banned user, and we had that sock puppet commenting in the deletion discussion as well, and arguing against it.- Wolfkeeper 18:44, 15 November 2009 (UTC)[reply]
- I don't know or care who else supported or opposed this article. What I do know is that you are attacking me instead of the article. I will let that speak for itself. Phil Spectre (talk) 19:29, 15 November 2009 (UTC)[reply]
- Comment: A COI is not a valid reason for deletion. --ThaddeusB (talk) 17:42, 15 November 2009 (UTC)[reply]
- But what's sufficient notability? We have 4 hits in Google Scholar, 3 of which are basically one-word mentions. Even the most significant Google Scholar hit isn't exactly the New York Times; does anything reference that? I'm still not getting warm fuzzies about its notability here.- Wolfkeeper 04:26, 17 November 2009 (UTC)[reply]
- Wolfkeeper, these numbers are still wrong. Whatever your opinion might be, it would be more relevant to us if you got the facts straight. Phil Spectre (talk) 07:11, 23 November 2009 (UTC)[reply]
- 4 hits in Google Scholar. The count of 5 at the top appears to be incorrect. I actually downloaded and read all 4 of these, and only one actually did more than a one-word mention.- Wolfkeeper 02:55, 26 November 2009 (UTC)[reply]
- I have purposely not provided an opinion on notability, because I don't have one yet. --ThaddeusB (talk) 01:54, 21 November 2009 (UTC)[reply]
- Relisted to generate a more thorough discussion so consensus may be reached.
Please add new comments below this notice. Thanks, Tim Song (talk) 00:37, 21 November 2009 (UTC)[reply]
- Strong keep - I still haven't !VOTED yet, so I might as well get that formality out of the way. For all the reasons I've explained above, I believe that the article in its current state should be left in place. As it does not appear as though Wolfkeeper has been able to garner any support here, I don't think there's a question about what the consensus is. Phil Spectre (talk) 07:11, 23 November 2009 (UTC)[reply]
- As required by the AfD process, I would like to reveal any and all potential conflicts of interest. In case it is not readily apparent, I should explicitly disclose that I worked with the two administrators who have deleted previous versions of this article to recreate it in a form that I believe merits inclusion in the project. As such, I cannot claim to be completely objective. Having said that, I gain no benefit, monetary or otherwise, from any increased exposure that MurmurHash may receive as a result of this article. I have not written about MurmurHash in any other forum, nor have I used it professionally, although I did spend an hour confirming the performance claims to my satisfaction. I should also remind you that I am not the original author of the article or of the algorithm, and I have been fully cleared of the sock puppetry charges that alleged the contrary. Phil Spectre (talk) 01:23, 26 November 2009 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the article's talk page or in a deletion review). No further edits should be made to this page.