
Talk:Newton's method in optimization

This article is within the scope of WikiProject Physics, a collaborative effort to improve the coverage of Physics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has been rated as Low-importance on the project's importance scale.

Should be merged with Newton's method

There is another Wikipedia page called Newton's method (http://en.wikipedia.org/wiki/Newton%27s_method#Nonlinear_systems_of_equations). I suggest the two be merged; as it stands, the content is just too confusing and even seemingly contradictory. Alpheratzstar (talk) 12:25, 9 November 2011 (UTC)alpheratzstar[reply]

I agree, this page is very confusing. It could be made better by stating right after <math>x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}</math> that this is equivalent to Newton's method applied to f'(x). 173.237.117.134 (talk) 21:11, 9 October 2013 (UTC)[reply]

Needs more generalization

Could someone write a piece on Halley's method in the multi-dimensional case? I could only find this on the German page, and can't really follow it very well: http://de.wikipedia.org/wiki/Halley-Verfahren I also realize that this is not really Newton's method, but there are many similarities, and more people are likely to visit this page. — Preceding unsigned comment added by 90.215.230.136 (talk) 15:34, 22 July 2011 (UTC)[reply]


Yes! This would be really nice to have! Also, most papers don't discuss this method (obviously), but what effect would it have on convergence properties? For example, could the addition of the third-derivative tensor actually break iterative schemes that are proven to work, when their proofs require the Taylor truncation to be taken at the second derivative? Also, in the case of a positive-definite Hessian, can the addition of the third derivative cause the Hessian (in the next iteration) to be non-positive-definite? These are good questions indeed. — Preceding unsigned comment added by 144.32.126.15 (talk) 12:44, 18 October 2011 (UTC)[reply]

I think you are mixing up quasi-Newton methods with the real thing. 018 (talk) 18:55, 18 October 2011 (UTC)[reply]
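For reference, a sketch of the one-dimensional case, assuming the standard Halley root-finding formula <math>x_{n+1} = x_n - 2gg'/(2g'^2 - gg'')</math> applied to <math>g = f'</math> (the multi-dimensional tensor version asked about above is a separate, harder matter):

<math>x_{n+1} = x_n - \frac{2 f'(x_n)\, f''(x_n)}{2 f''(x_n)^2 - f'(x_n)\, f'''(x_n)}</math>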

Call for review/correction

The first equation appears to contain a significant typo. The derivation from Taylor's expansion gives f/f' rather than f'/f''. I therefore propose a change from f'(x)/f''(x) to f(x)/f'(x) within a reasonable time unless there is objection/discussion or the equation's author agrees and makes the correction. Merlin Pendragon 23:27, 17 August 2006 (UTC)[reply]

Everything is correct as far as I can tell. Newton's method is applied to f', as we want to solve f'(x) = 0, and that's why you eventually get f'/f''. Oleg Alexandrov (talk) 23:50, 17 August 2006 (UTC)[reply]
Yes, I see now. My error in haste. Thank you. My proposed correction is withdrawn. I had been looking for a cite for Newton_method for finding roots and fell victim to seeing what I expected to see instead of what you actually wrote. :) In review, perhaps a little tweak on your intro two sentences would wake up the lazy or hasty reader, perhaps along the lines that this is an *application* of *Newton's method* which will find a local maximum or minimum of a function, a task useful in *optimization*... or some such. The current version has multiple (indistinguishable, distracting?) blue links in the opening sentence; the first sentence refers to the root-finding method rather than explaining what this article is about; and the second sentence refers to what "can also be" done, suggesting the sentence is precatory rather than making clear that this is the point of the article. Careful reading and reasoning sorts it out, so there is no error, but you may wish to consider a tweak? Thanks for the article, and the quick response. -Merlin
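To make the f'/f'' point concrete, here is a minimal illustrative sketch in Python (the helper name, objective, and starting point are arbitrary choices, not anything from the article):

<source lang="python">
# Newton's method in optimization: find a stationary point of f by
# applying Newton's root-finding iteration to its derivative f'.
def newton_optimize(f1, f2, x, tol=1e-10, max_iter=100):
    """f1, f2: first and second derivatives of the objective f."""
    for _ in range(max_iter):
        step = f1(x) / f2(x)   # note f'(x)/f''(x), not f(x)/f'(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = x^4 - 3x^2 has a local minimum at x = sqrt(3/2).
x_min = newton_optimize(lambda x: 4*x**3 - 6*x,   # f'(x)
                        lambda x: 12*x**2 - 6,    # f''(x)
                        2.0)
</source>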

Call for clarification

A beginner will have a hard time understanding how the Taylor expansion is minimized by solving the algebraic equation presented here. I'd bet you can do better at making this important first step clearer. -Feraudyh

I support this call for clarification. Why does minimizing the Taylor expansion give us the $\delta x$ step? And isn't $\delta x = 0$ a trivial solution for the minimization? -Emre
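A sketch of the reasoning, for anyone landing here: the step comes from minimizing the second-order Taylor model as a function of the step <math>\delta x</math>, not from setting the expansion to zero, so <math>\delta x = 0</math> is not a solution. The model is

<math>f(x + \delta x) \approx f(x) + f'(x)\,\delta x + \tfrac{1}{2} f''(x)\,\delta x^2,</math>

and setting its derivative with respect to <math>\delta x</math> to zero gives

<math>f'(x) + f''(x)\,\delta x = 0 \quad\Longrightarrow\quad \delta x = -\frac{f'(x)}{f''(x)},</math>

which is exactly the Newton step. Taking <math>\delta x = 0</math> just reproduces <math>f(x)</math>; it does not minimize the model unless <math>x</math> is already a stationary point.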

Root or minimum??

Is the formula presented the method applied to find a root of f(x), or to find a minimum by applying the method to f'(x)? -- NIC1138 17:27, 21 June 2007 (UTC)[reply]

The second is right: applying the method to f'(x). I think that's mentioned in the intro. Oleg Alexandrov (talk) 01:59, 22 June 2007 (UTC)[reply]

Error in the formula?

I think there is an error in the formula:

I think to make it a linear equation it should be

or perhaps

with

Any objections? —Preceding unsigned comment added by 91.19.151.5 (talk) 11:32, 5 October 2007 (UTC)[reply]

Request

Someone should include the formula for Newton's method when the Hessian is not invertible. I would, but I don't know how to add mathematical formulas. Some proofs would be nice too. -anon — Preceding unsigned comment added by 128.208.36.170 (talkcontribs) 20:18, 13 March 2006

I don't know if Newton's method even works if the Hessian is not invertible. Either way, in that case convergence will be slower, so gradient descent may be better. But I don't know much about what happens for non-invertible Hessians. Oleg Alexandrov (talk) 01:22, 14 March 2006 (UTC)[reply]
I don't understand what this business is with inverting the Hessian. Why would you ever invert the Hessian when you can just rearrange the problem to be solved as a traditional linear system? E.g., <math>H(f)\,(x_{n+1} - x_n) = -\nabla f</math>. Then you can just solve for <math>x_{n+1} - x_n</math> using, say, Gaussian elimination, and add that to the previous <math>x_n</math>. Yahastu (talk) 20:44, 15 February 2009 (UTC)[reply]
Gaussian elimination is equivalent to matrix inversion. —Preceding unsigned comment added by 87.103.91.29 (talk) 02:38, 24 November 2009 (UTC)[reply]
Not computationally! ;) 131.215.143.10 (talk) 19:40, 16 March 2010 (UTC)[reply]
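A minimal NumPy sketch of the point being made (the Hessian and gradient here are placeholder values, not from the article):

<source lang="python">
import numpy as np

# Newton step: solve H * step = -grad instead of forming inv(H).
# np.linalg.solve factorizes H (Gaussian elimination / LU), which is
# cheaper and numerically more stable than explicit inversion.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])      # placeholder Hessian
grad = np.array([1.0, 2.0])     # placeholder gradient

step = np.linalg.solve(H, -grad)       # preferred
step_inv = -np.linalg.inv(H) @ grad    # same answer, more work
assert np.allclose(step, step_inv)
</source>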
The Hessian also needs to be Lipschitz continuous near the minimum for quadratic convergence to hold. See http://www.cs.ucf.edu/courses/cap6411/cot6505/spring03/Lecture-20.pdf Mattrbunique (talk) 21:53, 23 November 2010 (UTC)[reply]
Isn't this covered in the last section: "If the Hessian is close to a non-invertible matrix..."? It goes on to mention the Levenberg–Marquardt algorithm, which I was going to bring up. That stabilizes things by essentially removing zeros from the spectrum of the Hessian, as I understand it.
As for adding equations, look at the source. It's pretty easy. Just add the <math></math> tags and type LaTeX math in between. —Ben FrantzDale (talk) 14:08, 24 November 2010 (UTC)[reply]
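A sketch of the damping idea just mentioned (the function name and the regularization value lam are arbitrary illustrations of the Levenberg–Marquardt-style modification, not code from any particular library):

<source lang="python">
import numpy as np

# Levenberg-Marquardt-style damping: solve with H + lam*I instead of H.
# Adding lam to every eigenvalue of H keeps the system safely solvable
# even when H is singular or nearly so.
def damped_newton_step(H, grad, lam=1e-3):
    n = H.shape[0]
    return np.linalg.solve(H + lam * np.eye(n), -grad)
</source>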