User:Jimmy Novik/Nonlinear Mapping Function
To make nonlinear relationships in the input x accessible to linear models, we first apply a nonlinear mapping φ to the inputs.
Options
1. Choose a very generic φ, such as the infinite-dimensional φ implicitly used by kernel machines based on the RBF kernel. (Often generalizes poorly on advanced problems.)
2. Manually engineer φ.
3. Use deep learning to learn φ:

   y = f(x; θ, w) = φ(x; θ)ᵀw
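As a minimal sketch of option 2 (manually engineering φ), the hand-designed mapping x ↦ (x, x²) below makes a quadratic target exactly linear in the new features; the specific mapping and data are illustrative assumptions, not from the source.

```python
import numpy as np

# Hand-engineered feature map: phi(x) = (x, x^2).
def phi(x):
    return np.stack([x, x**2], axis=1)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x**2                      # a target that is nonlinear in x

# Least-squares fit of w in y ≈ phi(x) @ w; since y = x^2 is linear in the
# features (x, x^2), the fit recovers w ≈ (0, 1) exactly.
w, *_ = np.linalg.lstsq(phi(x), y, rcond=None)
print(np.round(w, 6))  # -> [0. 1.]
```

A linear model in the engineered features thus fits a target no linear model in x alone could.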
We now have parameters θ that we use to learn φ from a broad class of functions, and parameters w that map from φ(x) to the desired output. This is an example of a deep feedforward network, with φ defining a hidden layer.
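The model y = φ(x; θ)ᵀw can be sketched as a one-hidden-layer network. The shapes, the ReLU nonlinearity, and the fixed parameter values below are illustrative assumptions; in the deep learning approach, θ = (W, b) and w would be learned jointly by gradient-based optimization rather than fixed by hand.

```python
import numpy as np

# Hypothetical fixed parameters, for illustration only.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # part of theta
b = np.array([0.0, 0.0])     # part of theta
w = np.array([1.0, 1.0])     # linear weights mapping phi(x) to the output

def phi(x):
    """Hidden-layer representation phi(x; theta), here with a ReLU."""
    return np.maximum(0.0, x @ W + b)

def f(x):
    """Model output y = f(x; theta, w) = phi(x; theta)^T w."""
    return phi(x) @ w

# The ReLU zeroes the negative coordinate, so only the first one contributes.
print(f(np.array([1.0, -2.0])))  # -> 1.0
```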
This approach is the only one of the three that gives up on the convexity of the training problem, but the benefits outweigh the harms. In this approach, we parametrize the representation as φ(x; θ) and use the optimization algorithm to find the θ that corresponds to a good representation.
If we wish, this approach can capture the benefit of the first approach by being highly generic—we do so by using a very broad family φ(x; θ). Deep learning can also capture the benefit of the second approach. Human practitioners can encode their knowledge to help generalization by designing families φ(x; θ) that they expect will perform well. The advantage is that the human designer only needs to find the right general function family rather than finding precisely the right function.