User:Jimmy Novik/Nonlinear Mapping Function
To make nonlinear relationships in the input x accessible to linear models, we first apply a nonlinear mapping φ to the inputs.
Options
1. Choose a very generic φ, such as the infinite-dimensional φ implicitly used by kernel machines based on the RBF kernel. (Often generalizes poorly on advanced problems.)
2. Manually engineer φ.
3. Use deep learning to learn φ:

   y = f(x; θ, w) = φ(x; θ)ᵀw
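As a minimal sketch of option 2 (manually engineering φ), the hand-designed mapping x ↦ (x, x²) below makes a quadratic target exactly linear in the new features; the specific mapping and data are illustrative assumptions, not from the source.

```python
import numpy as np

# Hand-engineered feature map: phi(x) = (x, x^2).
def phi(x):
    return np.stack([x, x**2], axis=1)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x**2                      # a target that is nonlinear in x

# Least-squares fit of w in y ≈ phi(x) @ w; since y = x^2 is linear in the
# features (x, x^2), the fit recovers w ≈ (0, 1) exactly.
w, *_ = np.linalg.lstsq(phi(x), y, rcond=None)
print(np.round(w, 6))  # -> [0. 1.]
```

A linear model in the engineered features thus fits a target no linear model in x alone could.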
We now have parameters θ that we use to learn φ from a broad class of functions, and parameters w that map from φ(x) to the desired output. This is an example of a deep feedforward network, with φ defining a hidden layer.
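The model y = φ(x; θ)ᵀw can be sketched as a one-hidden-layer network. The shapes, the ReLU nonlinearity, and the fixed parameter values below are illustrative assumptions; in the deep learning approach, θ = (W, b) and w would be learned jointly by gradient-based optimization rather than fixed by hand.

```python
import numpy as np

# Hypothetical fixed parameters, for illustration only.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # part of theta
b = np.array([0.0, 0.0])     # part of theta
w = np.array([1.0, 1.0])     # linear weights mapping phi(x) to the output

def phi(x):
    """Hidden-layer representation phi(x; theta), here with a ReLU."""
    return np.maximum(0.0, x @ W + b)

def f(x):
    """Model output y = f(x; theta, w) = phi(x; theta)^T w."""
    return phi(x) @ w

# The ReLU zeroes the negative coordinate, so only the first one contributes.
print(f(np.array([1.0, -2.0])))  # -> 1.0
```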
This approach is the only one of the three that gives up on the convexity of the training problem, but the benefits outweigh the harms. In this approach, we parametrize the representation as φ(x; θ) and use the optimization algorithm to find the θ that corresponds to a good representation.
If we wish, this approach can capture the benefit of the first approach by being highly generic—we do so by using a very broad family φ(x; θ). Deep learning can also capture the benefit of the second approach. Human practitioners can encode their knowledge to help generalization by designing families φ(x; θ) that they expect will perform well. The advantage is that the human designer only needs to find the right general function family rather than finding precisely the right function.