Notes in preparation for the SYDE 372 Final Exam, Winter 2004
Hi classmates,
I am preparing my study notes for pattern rec, and thought that it might be useful to create a collaborative document. That way, we can benefit from each other's knowledge and time - there's not much time left before the exam, and since each of us may have a somewhat different focus while studying, we can take advantage of that. So feel free to edit, to add information to something that seems incomplete, to correct mistakes, etc. I think a good format would be, for each technique/algorithm, to list:
- its name
- what it does
- advantages
- disadvantages
- special cases
- formula (mathematical expressions are supported by the wiki, but if you don't want to take the time to enter formulas in, please don't let it stop you from contributing to the other fields - the formulas are relatively easy to look up anyway.)
-Ellen
For some help on editing, see Wikipedia's how to edit page. (Alternatively, if you have suggestions on how to make the formatting of this page better, let me know.)
For mathematical formulas, some help can be found at character formatting, WikiProject Mathematics, and Wikipedia's TeX markup page.
Linear Discriminants
What they are: classifiers whose decision boundary is a straight line (a hyperplane, in more than two dimensions)
Advantages
- fastest, simplest possible classifier
Disadvantages
- simplistic, with a high probability of error; however, this can be overcome/mitigated by
1) transforming the data with some high-dimensional non-linear functions, and finding the straight-line boundary in the transformed space (Support Vector Machines)
OR
2) combining several discriminants into one classifier, by i) voting ii) aggregation iii) neural networks (which combine many linear units through non-linear activation functions, rather than by a simple vote)
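Formula
A quick sketch of the basic form (my notation - double-check against the course notes): a linear discriminant in <math>d</math> dimensions is
<math>g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0</math>
Decide class 1 if <math>g(\mathbf{x}) > 0</math> and class 2 if <math>g(\mathbf{x}) < 0</math>; the decision boundary is the hyperplane <math>g(\mathbf{x}) = 0</math>, and <math>\mathbf{w}</math> is normal to it.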
Perceptron algorithm
What it does: finds a linear discriminant between two clusters
Advantages
- will always find a separating discriminant, if one exists (i.e. if the clusters are linearly separable)
Disadvantages
- the discriminant found is usually not optimal, or even close to optimal
- stopping the algorithm early (before convergence) gives a nonsense boundary
- may require an arbitrarily large number of iterations - the number of updates grows roughly as the inverse square of the gap (margin) between the clusters
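Formula
A sketch of the fixed-increment update rule (my notation, assuming augmented vectors with the class-2 samples negated - check the sign convention against the lecture notes): cycle through the samples, and whenever a sample <math>\mathbf{y}_i</math> is misclassified, i.e. <math>\mathbf{w}^T \mathbf{y}_i \le 0</math>, update
<math>\mathbf{w} \leftarrow \mathbf{w} + \eta \mathbf{y}_i</math>
where <math>\eta > 0</math> is the step size; repeat until no sample is misclassified.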
Minimum Square Error (MSE)
What it does: finds a linear discriminant between clusters as the least-squares solution - it uses all the training points (not just the closest ones), minimizing the summed squared error between the discriminant output for each point and its target value
Advantages
- will always give a 'reasonable' result, even if the clusters are not separable
Disadvantages
- the result is not guaranteed to separate the clusters, even when they are linearly separable
- sensitive to outliers, since points far from the boundary dominate the squared error
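Formula
A sketch of the standard least-squares setup (my notation, again with augmented vectors and the class-2 samples negated): stack the <math>n</math> samples as the rows of a matrix <math>Y</math>, pick a target vector <math>\mathbf{b}</math> (e.g. all ones), and choose the weight vector <math>\mathbf{a}</math> to minimize <math>\|Y\mathbf{a} - \mathbf{b}\|^2</math>. The solution is
<math>\mathbf{a} = (Y^T Y)^{-1} Y^T \mathbf{b} = Y^\dagger \mathbf{b}</math>
(when <math>Y^T Y</math> is nonsingular), where <math>Y^\dagger</math> is the pseudoinverse of <math>Y</math>.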
Template Title
What it does:
Advantages
Disadvantages
Some examples of mathematical expressions:
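(Just a couple of sample expressions to show the math markup - replace or extend these with ones that are actually useful for the exam. These are the standard Bayes' rule and the 1-D Gaussian density.)
<math>P(\omega_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_i) P(\omega_i)}{p(\mathbf{x})}</math>
<math>p(x) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)</math>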