Wikipedia:Articles for deletion/C10k problem

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by ErikHaugen (talk | contribs) at 18:32, 30 December 2009 (C10k problem). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
C10k problem

In my attempts to find information on this topic, every page I found that mentioned "C10K problem" either used the term as a given without justifying it or referred to the Kegel page referenced in the article. The Kegel page implies that such a limit exists without substantiating the implication, and then deals entirely with ways to increase the amount of traffic a web server can handle, none of which relies on a 10K limit in particular. I don't see that this is a notable topic, because it seems to be one person's name for an unsubstantiated phenomenon, and I don't find any evidence that a 10K limit exists. So, possible WP:N and possible WP:V. —Largo Plazo (talk) 17:47, 13 December 2009 (UTC)


Relisted to generate a more thorough discussion so consensus may be reached.
Please add new comments below this notice. Thanks, Tim Song (talk) 02:12, 20 December 2009 (UTC)
  • Weak Keep If there are published academic papers referring to a variant of it (the reverse C10K), then it seems reasonable that the problem behind that name must be very well known in its field. I think the evidence shows some degree of notability. Question, though: is Kegel notable enough for an article? If so, there's a possible merge.
    • Hoaxes, legends, and misconceptions can be very well known. I think the issue here is whether it's real. If a topic in information technology were notable, it's very hard to imagine that it would be so difficult to find anything substantiating it on the Web; and if no one can provide any resource (unlike the ones given above) that substantiates it rather than presupposing it, it's tough to see what an article on it could be about. —Largo Plazo (talk) 03:22, 21 December 2009 (UTC)

Relisted to generate a more thorough discussion so consensus may be reached.
Please add new comments below this notice. Thanks, Arbitrarily0 (talk) 16:29, 27 December 2009 (UTC)
  • Strong Keep To clarify, the "limit" is certainly not any kind of hard limit; as noted above there won't be any citations found to support this limit. It is simply a recognition that certain kinds of very popular server architectures can't get much beyond a few thousand simultaneous connections. See yaws (web server): different threading systems are one way to get beyond this limit; lighttpd uses another, select/poll/epoll. I'm confused by this AfD proposal. Can someone enlighten me about the WP:V complaint? Is there some question that web servers like Apache run into these limits? WP:N is just crazy talk. See the papers linked above, and see the lighttpd article: it was created in response to c10k! ErikHaugen (talk) 21:00, 28 December 2009 (UTC)
    • "...there won't be any citations found to support this limit". You're saying strongly that we should have an unsourced article about something that doesn't exist because something similar to it does exist. I'm quite certain that somebody is capable of writing an article about the phenomenon that does exist, under a title that does not imply the existence of the phenomenon that doesn't exist, with reliable sources that document the real phenomenon, rather than a link to a document that doesn't in any way support the fake phenomenon other than to claim (falsely) that it exists. —Largo Plazo (talk) 23:22, 28 December 2009 (UTC)
      • It isn't a fake phenomenon. http://www.sics.se/~joe/apachevsyaws.html See the little blue and green squiggles on the left side of this graph? This is real. C10k is a name used to describe it. It's overly precise and won't always be accurate, but it is nonetheless a name used for it. ErikHaugen (talk) 00:40, 30 December 2009 (UTC)
        • What is it with people providing references that don't even mention, let alone support, the thing being claimed? There is no mention on this page of a C10k phenomenon. There is no mention of the number 10,000. All this page shows is that Apache seems to be limited to this number of simultaneous connections, while Yaws seems to be able to achieve that number of simultaneous connections. In fact, the number achieved by Yaws is 80,000, which, to the best of my knowledge, is much more than 10,000, so while you somehow have the impression that this page in some way supports the existence of a phenomenon known as the C10k limit, it actually seems to be a demonstration I might present to show that there is no such limit. —Largo Plazo (talk) 02:56, 30 December 2009 (UTC)
          • The link shows the difference between Apache, which like "most web servers" mentioned in the WP article uses a separate OS thread or process for each connection, and exotic web servers like yaws or lighttpd, which don't: they have "solved" the c10k problem. The graph shows the enormous limitation that web servers constrained by the c10k problem suffer; they are only able to use a fraction of the available hardware. It's quite a dramatic experiment, and clearly demonstrates the failure of the conventional model to handle lots of connections. ErikHaugen (talk) 18:32, 30 December 2009 (UTC)
    • Regarding the last part of your question, you answered WP:V yourself: you said "the 'limit' is not any kind of hard limit ... there won't be any citations to support such limit". This "limit" isn't verifiable because it doesn't exist. End of story: the article doesn't satisfy Wikipedia's criterion that articles must be about verifiable topics; if it's false then it certainly isn't verifiable. That there is some limit isn't remarkable: at every single point in computing history, there has been a limit on processor speed, a limit on storage speed and capacity, a limit on network bandwidth and number of connections. The whole point of the alleged C10K problem is that some special, perplexing limit of around 10,000 connections (near enough to warrant the name) has been reached and has been a special source of frustration. And yet, here you are, telling me that it's a fiction, and in response to my concern that there doesn't seem to be any corroboration supporting it, you confirm that there isn't any such corroboration, and yet you feel strongly that this article on this unverifiable, nay, false topic should be kept. I don't get it. And let's not even get into notability, unless someone wants to rewrite the entire article to be about a notable gross misconception prevailing in some quarters that some limit called "C10k" exists. And even then, to avoid being an original research piece, that article would have to provide citations to resources that address this misconception (widely accepted hoax?), rather than making the first-person observation that there are a lot of references to C10k out there. —Largo Plazo (talk) 04:34, 29 December 2009 (UTC)
      • Thread/process-based servers that many were/are using, such as Apache, hit a limit of a few thousand simultaneous connections, thus underutilizing the hardware for services that are not CPU-bound (or similarly resource-bound). What is hard to understand about this? This isn't controversial, is it? In a few years, better hardware might mean the limit is typically 20-30k instead of a few thousand, sure, but the problem is still referred to as c10k, at least for now. The shortcomings of this architecture are surely interesting enough for an article. What else would you call it? Should we rename the article? Is there another we could merge it with? ErikHaugen (talk) 00:40, 30 December 2009 (UTC)
        • Nothing is hard to understand about your first sentence. Your first sentence, however, is not what this article says. Your first sentence has nothing to do with whether there's a specific phenomenon, as claimed in the article, called the C10k phenomenon, that supposedly has something to do, specifically, with the number 10,000. If you'd like to write an article about connection limits on various platforms and their progression over the years, and can provide sources that support what you write about them, please feel free. —Largo Plazo (talk) 03:01, 30 December 2009 (UTC)
          • I've tried to say this elsewhere: I think it might be helpful to stop getting hung up on the number 10,000. The problem is called the c10k problem, and perhaps that is a lousy name since it will probably be 50k in a few years, but that is a name for the problem, and the name is quite popular. Not much we can do about that here. ErikHaugen (talk) 18:32, 30 December 2009 (UTC)
        • Let's go back and look at the very first sentence in the article: "The C10k problem [1] is the name given to a limitation that most web servers currently have which limits the web servers capabilities to only handle about ten thousand simultaneous connections." I didn't make up that this problem is supposed to have something to do with ten thousand connections: it's the salient point of the lead sentence of the article.
          1. Are any references given in the article that support this? No.
          2. Do any of the references provided by anybody in this discussion corroborate this? No.
          • I provided experimental results demonstrating that conventional server architectures can't get past a few thousand connections. See Simoncpu's links above: other academics cite Kegel's article in discussion about how thread/process-based architectures don't scale as well as event-driven architectures. Is the fact that this phenomenon is discussed in academic papers good enough? ErikHaugen (talk) 18:32, 30 December 2009 (UTC)
          3. Do any of the references provided in this discussion state in any kind of authoritative way what the C10k problem is, as opposed to taking its existence for granted? No.
          • Irrelevant. Maybe it would help to think of c10k as a cultural thing. Clearly it exists, because people are talking about it; it is referenced from papers, etc. Do a Google search. The phrase "C10K" is not obscure, and needs an article on its own even if the concept is bogus. It isn't bogus, of course, but that's just gravy as far as whether or not this article should be deleted. In any case, to answer your question, the Kegel article explains what the problem is. Nobody else needs to, because it's explained there. Look at this page: http://www.javafaq.nu/java-article572.html - this is a description of a library that uses c10k in the description and defines it. See how this is being used? Clearly we need this WP article! ErikHaugen (talk) 18:32, 30 December 2009 (UTC)
        • We don't carry articles in Wikipedia about things of whose existence some people are convinced. We carry articles about things whose existence is supported by reliable documentation. Was there a Year 2000 problem about which Wikipedia should have an article? Yes, there are thousands of sources that explain in detail what the Year 2000 problem is, that indicate what it actually has to do with the year 2000, that indicate in detail what the specific problems were that needed to be addressed, that indicate what kinds of measures were taken to address it, that indicate what kind of failures resulted from inadequate measures taken to address it. In response to the same kind of question about the C10k problem, we are shown sources that imply that it exists or in which the writer is under the impression that it exists, but nothing that shows that it exists. We are shown sources that show that different platforms are limited to different numbers of connections: the general issue of connection limits, as opposed to the very specific phenomenon that this article purports to be about. —Largo Plazo (talk) 03:15, 30 December 2009 (UTC)
    • To build consensus: in my opinion, Largoplazo is right that the article does _not_ have reliable sources. We only have the lighttpd and nginx paragraphs, and that is not enough for any encyclopedia, so we must search for sources on this problem. On the other hand, the 10K limit is trivial. It suggests that web servers do not benefit from all the hardware power of the machines they run on. It seems that unless programmers design the web server for it, the server serves far fewer pages than the hardware would allow. So 10K is not a hard limit. It's a soft limit. A limit for remind this situation. I propose that we create another article, "Software limitations of web servers" or something similar, redirect C10k to that page, and note there that C10k is one term used to refer to this situation but that, as a limit, it is soft. What do you think?--Xan2 (talk) 16:56, 29 December 2009 (UTC)
      • "A limit for remind this situation." - well put. As for your suggestion, sounds good, go for it! Please leave c10k intact until you create the other article, though, so people can find it when they want to know what c10k means. ErikHaugen (talk) 00:40, 30 December 2009 (UTC)
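
For readers unfamiliar with the distinction drawn in the discussion above, here is a minimal sketch of the event-driven style (select/poll/epoll), as opposed to one OS thread or process per connection. It uses Python's standard `selectors` module and local socket pairs in place of real network clients. The function name `echo_many` and the connection count are illustrative assumptions, not anything taken from Kegel's page or the linked benchmark, and the sketch shows the mechanism only; it says nothing about any particular connection limit.

```python
import selectors
import socket


def echo_many(num_connections=50):
    """Serve many connections from ONE thread; return each client's reply.

    Hypothetical helper for illustration: socket pairs stand in for
    real client connections to a server.
    """
    sel = selectors.DefaultSelector()  # epoll/kqueue/select, whichever the OS offers
    clients = []
    for i in range(num_connections):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        # Register the server end; no thread is created per connection.
        sel.register(server_side, selectors.EVENT_READ)
        client.settimeout(1.0)  # avoid hanging if a reply never arrives
        client.sendall(f"hello {i}".encode())
        clients.append(client)

    pending = num_connections
    while pending:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable; give up rather than spin
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())  # trivial "request handling"
            sel.unregister(conn)
            conn.close()
            pending -= 1
    sel.close()

    replies = []
    for client in clients:
        replies.append(client.recv(1024))
        client.close()
    return replies
```

Calling `echo_many(50)` handles all fifty connections in a single loop on a single thread. A thread-per-connection design would instead start fifty threads or processes, which is the per-connection cost the comments above attribute to conventional servers such as Apache.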