Notes in preparation for the SYDE 372 Final Exam, Winter 2004
Hi classmates,
I am preparing my study notes for pattern rec, and thought that it might be useful to create a collaborative document. That way, we can benefit from each other's knowledge and time - there's not much time left before the exam, and since each of us may have a somewhat different focus while studying, we can take advantage of that. So feel free to edit, to add information to something that seems incomplete, to correct mistakes, etc. I think a good format would be, for each technique/algorithm, to list:
- its name
- what it does
- advantages
- disadvantages
- special cases
- formula (mathematical expressions are supported by the wiki, but if you don't want to take the time to enter formulas in, please don't let it stop you from contributing to the other fields - the formulas are relatively easy to look up anyway.)
-Ellen
For some help on editing, see Wikipedia's how to edit page. (Alternatively, if you have suggestions on how to make the formatting of this page better, let me know.)
For mathematical formulas, some help can be found at character formatting, WikiProject Mathematics, and Wikipedia's TeX markup page.
Linear Discriminants
What they are: classifiers whose decision boundary is a straight line (a hyperplane, in more than two dimensions)
Advantages
- fastest, simplest possible classifier
Disadvantages
- simplistic, with a high probability of error; however, this can be overcome/mitigated by
1) transforming the data with some high-dimensional non-linear functions, and finding the straight-line boundary in the transformed space (Support Vector Machines)
OR
2) combining several discriminants into one classifier, by i) voting ii) aggregation iii) neural networks (which combine many linear units through non-linear activation functions, rather than by a simple vote)
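Formula
A quick sketch of the basic form (my notation - double-check against the course notes): a linear discriminant in <math>d</math> dimensions is
<math>g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0</math>
Decide class 1 if <math>g(\mathbf{x}) > 0</math> and class 2 if <math>g(\mathbf{x}) < 0</math>; the decision boundary is the hyperplane <math>g(\mathbf{x}) = 0</math>, and <math>\mathbf{w}</math> is normal to it.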
Perceptron algorithm
What it does: finds a linear discriminant between two clusters
Advantages
- will always find a separating discriminant, if one exists (i.e. if the clusters are linearly separable)
Disadvantages
- the discriminant found is usually not optimal, or even close to optimal
- stopping the algorithm early (before convergence) gives a nonsense boundary
- may require an arbitrarily large number of iterations - the number of updates grows roughly as the inverse square of the gap (margin) between the clusters
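Formula
A sketch of the fixed-increment update rule (my notation, assuming augmented vectors with the class-2 samples negated - check the sign convention against the lecture notes): cycle through the samples, and whenever a sample <math>\mathbf{y}_i</math> is misclassified, i.e. <math>\mathbf{w}^T \mathbf{y}_i \le 0</math>, update
<math>\mathbf{w} \leftarrow \mathbf{w} + \eta \mathbf{y}_i</math>
where <math>\eta > 0</math> is the step size; repeat until no sample is misclassified.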
Minimum Square Error (MSE)
What it does: finds a linear discriminant between clusters as the least-squares solution - it uses all the training points (not just the closest ones), minimizing the summed squared error between the discriminant output for each point and its target value
Advantages
- will always give a 'reasonable' result, even if the clusters are not separable
Disadvantages
- the result is not guaranteed to separate the clusters, even when they are linearly separable
- sensitive to outliers, since points far from the boundary dominate the squared error
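Formula
A sketch of the standard least-squares setup (my notation, again with augmented vectors and the class-2 samples negated): stack the <math>n</math> samples as the rows of a matrix <math>Y</math>, pick a target vector <math>\mathbf{b}</math> (e.g. all ones), and choose the weight vector <math>\mathbf{a}</math> to minimize <math>\|Y\mathbf{a} - \mathbf{b}\|^2</math>. The solution is
<math>\mathbf{a} = (Y^T Y)^{-1} Y^T \mathbf{b} = Y^\dagger \mathbf{b}</math>
(when <math>Y^T Y</math> is nonsingular), where <math>Y^\dagger</math> is the pseudoinverse of <math>Y</math>.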
Template Title
What it does:
Advantages
Disadvantages
Some examples of mathematical expressions:
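(Just a couple of sample expressions to show the math markup - replace or extend these with ones that are actually useful for the exam. These are the standard Bayes' rule and the 1-D Gaussian density.)
<math>P(\omega_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_i) P(\omega_i)}{p(\mathbf{x})}</math>
<math>p(x) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)</math>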