Talk:Decision tree learning

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Javalangstring (talk | contribs) at 22:57, 22 September 2011. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
WikiProject Statistics (Start-class, Mid-importance)
This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as Start-class on Wikipedia's content assessment scale and as Mid-importance on the importance scale.

WikiProject Robotics (Start-class, Mid-importance)
This article is within the scope of WikiProject Robotics, a collaborative effort to improve the coverage of Robotics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as Start-class on Wikipedia's content assessment scale and as Mid-importance on the project's importance scale.

I'm very grateful for the work that's gone into this article so far, but I think that it's missing a couple of key points that would be helpful to readers. First, there is no discussion of pruning - what necessitates it, and what algorithms are used to guide it? Second, although Gini impurity and Information gain are discussed in their own section, there is no discussion explaining their application in the construction of decision trees (i.e. as criteria for selecting values on which to split a node).

Here's hoping that these observations are not out of place; this is my first-ever contribution to Wikipedia and I suppose I ought to go play in the sandbox and then add to the article myself, eh?

Yoy riblet 04:41, 18 July 2007 (UTC)[reply]
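To illustrate the second point above (how Gini impurity and information gain serve as split-selection criteria), here is a minimal sketch. It is not taken from the article; the function names and the toy labels are my own, and real learners add many refinements (handling of continuous attributes, missing values, pruning):

```python
# Sketch: Gini impurity and information gain as criteria for choosing
# which split to apply at a node. All names here are illustrative.
from collections import Counter
import math

def gini(labels):
    """Gini impurity: 1 - sum of squared class frequencies."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_score(parent, left, right, criterion=gini):
    """Impurity decrease achieved by splitting `parent` into `left`/`right`.
    With criterion=entropy this is exactly the information gain."""
    n = len(parent)
    weighted = (len(left) / n) * criterion(left) + \
               (len(right) / n) * criterion(right)
    return criterion(parent) - weighted

# The learner evaluates every candidate split of the node's subset and
# keeps the one with the largest impurity decrease, e.g.:
parent = ["play"] * 5 + ["don't"] * 5
score = split_score(parent, ["play"] * 5, ["don't"] * 5)  # perfect split
```

A perfect split drives both children to zero impurity, so the score equals the parent's impurity; a useless split scores zero.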

There's also a section missing on the _disadvantages_ of decision trees. For instance, ISTM that a decision tree can only divide the input region with axis-parallel lines (neural networks do not have this restriction). -Thenickdude (talk) 02:20, 3 November 2008 (UTC)[reply]
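A small sketch of the limitation described above, under the usual assumption that each internal node tests a single feature against a threshold (the tuple encoding below is my own, purely for illustration). Because every test involves only one coordinate, the induced regions are axis-aligned rectangles; a diagonal boundary such as x0 > x1 can only be approximated by a staircase of such cuts:

```python
# Sketch: a standard decision tree splits on one feature at a time,
# so decision boundaries are axis-parallel. Encoding is illustrative:
# internal node = (feature_index, threshold, left_child, right_child),
# leaf = class label.
def classify(point, node):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if point[feature] <= threshold else right
    return node

# Two axis-parallel cuts carving the unit square into three rectangles:
tree = (0, 0.5,              # test: is x0 <= 0.5?
        "A",                 #   yes -> whole left half-plane
        (1, 0.5, "B", "C"))  #   no  -> split the right half on x1
```

For example, classify((0.8, 0.2), tree) lands in region "B", and no arrangement of such nodes yields an oblique boundary.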

I'm concerned that the example in this page is needlessly copied without attribution from Quinlan's paper "Induction of Decision Trees" - in particular the table's content and organization is directly copied from Table 1. This is unlikely to be considered fair use since the specific example is not needed to comment on the method, much less decision trees in general. I suggest creating a novel example for this article (I don't believe Wikipedia:No original research is an issue here, since we're describing decision trees in general and not any particular algorithm). Dcoetzee 00:10, 1 March 2009 (UTC)[reply]
I agree - the example seems to have been lifted from Quinlan, with tweaks. More importantly from a pedagogical standpoint, the example is not about decision trees in machine learning, but about decision trees in decision analysis, so it's in the wrong page. I am removing that section, as I don't see any benefit to having it in this article. --mcld (talk) 11:49, 3 November 2009 (UTC)[reply]
On a related note: the example is weird. The target variable is sort of a pretend categorical variable with values "Play" or "Don't Play", when what is really being measured is the number of people who show up on a given day. The example talks about the decisions as if they are describing groups of people, which is a stretch from the data. If there are no objections, I'm going to make up a completely different example, using Wikipedia data. riedl 12 June 2009

References

With all due respect to the work of Drs. Tan and Dowe -- I doubt that multiple references to their publications are of utmost importance to an overview article on "Decision tree learning". This might be reviewed. BM 128.31.35.198 (talk) 13:37, 24 April 2009 (UTC)[reply]

Agree - the pile of superfluous citations is entirely unnecessary. Removed. --mcld (talk) 12:03, 3 November 2009 (UTC)[reply]

Poor explanation

I read the general description several times and I'm confused. First of all, what is the target variable in the graph? There are two numbers under each leaf (I'll assume it's the probability of survival). What is the source set? The set of passengers? How do you assign probability of survival to a given passenger? What is the meaning of "... when the subset at a node all has the same value of the target variable"? Passengers with the same probability of survival? The graph will only make sense as an illustration if the mapping between it and the general description is made explicit at every step. Bartosz (talk) 19:46, 10 June 2011 (UTC)[reply]
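On the question of the two numbers under each leaf: a common convention (I am assuming this is what the figure intends; the article should state it explicitly) is that a leaf reports the empirical fraction of training cases with the target value, plus the share of the data reaching that leaf. A minimal sketch, with invented numbers:

```python
# Sketch of one plausible reading of the figure's leaf annotations
# (assumed convention, not confirmed by the article): each leaf shows
# (a) the fraction of its passengers who survived and (b) the fraction
# of all passengers who reach that leaf.
def leaf_summary(subset, total_size):
    """subset: target values for passengers reaching this leaf
    (1 = survived, 0 = died); total_size: size of the whole data set."""
    p_survived = sum(subset) / len(subset)   # first number under the leaf
    pct_of_data = len(subset) / total_size   # second number under the leaf
    return p_survived, pct_of_data

# Invented example: 38 of 100 passengers reach this leaf, 5 survived.
p, pct = leaf_summary([1] * 5 + [0] * 33, 100)
```

Under this reading, a new passenger routed to that leaf is assigned the leaf's survival probability p, and "all has the same value of the target variable" means the subset is pure (p is 0 or 1), at which point splitting stops.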

Inclusion of "QuickDT" in list of Implementations

Two days ago a new decision tree learning implementation called QuickDT was added to the list of implementations. This is after QuickDT was posted to reddit/r/MachineLearning, to a favorable response, and also to mloss.org. It is already in commercial usage, and its author is well known.

Shortly afterwards this was removed by User:X7q with the explanation "wikipedia is not a place to promote your pet projects".

My opinion is that there is as much justification to include QuickDT in this list as any of the implementations already listed, and so I have restored the link.

If User:X7q would like to elaborate on their reason for thinking that this decision tree learning implementation should not be included in this list of decision tree learning implementations, I am happy to discuss it.

--Javalangstring (talk) 15:59, 22 September 2011 (UTC)[reply]

First of all, you should not add links to your own site (WP:EL#ADV, WP:COI). In a comment you wrote "This is a list of decision tree learning implementations (of which there aren't very many).". Both your points are wrong. Wikipedia is not a collection of links. Decision trees are a highly popular data mining tool, and as a result there are a lot of decision tree algorithms and their implementations, both open and closed source. And they aren't that hard to implement, so I imagine many students learning this topic (including myself a long time ago) have implemented them - I don't see how this is different from what you've done. -- X7q (talk) 21:35, 22 September 2011 (UTC)[reply]
If you can't find a decent decision tree implementation on Google, then you're probably googling for the wrong keywords. Don't google for "ID3", "CART", "C4.5", etc. - no one uses them anymore. Ensembles are the state of the art today. Google for AdaBoost, random forest, or, to give a more recent example, additive groves - all these algorithms use decision trees as building blocks. -- X7q (talk) 21:47, 22 September 2011 (UTC)[reply]
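For readers following this exchange: the sense in which ensemble methods "use decision trees as building blocks" is simply that many trees are trained and their predictions aggregated, e.g. by majority vote in a random forest. A bare illustration (tree encoding and vote rule are my own simplification; real random forests also randomize training):

```python
# Sketch: an ensemble aggregates many decision trees by majority vote.
# Tree encoding is illustrative: internal node = (feature, threshold,
# left, right); leaf = class label.
from collections import Counter

def classify(point, node):
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if point[feature] <= threshold else right
    return node

def ensemble_predict(point, trees):
    """Majority vote over a list of already-trained trees."""
    votes = Counter(classify(point, t) for t in trees)
    return votes.most_common(1)[0][0]

# Three tiny (hand-built, illustrative) stumps:
trees = [(0, 0.5, "A", "B"), (0, 0.6, "A", "B"), (1, 0.5, "A", "B")]
# point (0.55, 0.9): the stumps vote B, A, B, so the ensemble says "B"
```

Boosting methods such as AdaBoost weight the votes instead of counting them equally, but the building block is the same.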
I take your point that adding a link about a project I'm affiliated with may have been a violation of the letter of the guidelines. However, I don't believe it is a "conflict" of interest, as I have no commercial interest in this project. My only motivation is to fill a need for an easy-to-use Java decision tree library. In this regard, my interests are aligned with, not in conflict with, the goals of Wikipedia.
While we are being sticklers about Wikipedia guidelines, I believe they also state that you should assume good faith (WP:AGF), which you failed to do when you questioned my motives in your original revision: "wikipedia is not a place to promote your pet projects". If my goal were self-promotion, I could think of better ways to do it than a link from an obscure Wikipedia article to a GitHub page that barely mentions my name.
I did look very hard for a Java decision tree learning library prior to implementing QuickDT, and I can assure you that I'm a proficient user of Google. I found two: jaDTi, which has a terrible API and has been dormant since 2004, and Weka, which also has a horrible API, is not at all idiomatic Java, and is a resource hog. I wasted several weeks trying to make each of these fit my needs, and they failed to do so. Decision trees may not be hard to implement, but there is more to writing a good library than simply implementing the algorithm; you also need a well-designed API.
In short, there are no good Java options.
I'm familiar with ensembles; they don't suit everyone. In my particular situation I required 100% recall accuracy on the training set, and ensembles degenerate to simple decision tree learners in this case. Ensembles are also less well suited if your goal is to gain some insight into the data, rather than simply good predictive behavior.
That is why I originally came to this page about decision tree learning, and not a page about ensembles. Even if you consider decision tree learning algorithms to be obsolete, they are the subject matter of this article, and you shouldn't presume that visitors to this article are really looking for something else when they come here.
Because of WP:COI I will not make any further edits on this matter; however, I request that you reconsider your removal of the link in light of my argument above. If you decide against it, I may request third-party arbitration. Javalangstring (talk) 22:32, 22 September 2011 (UTC)[reply]