Talk:Conjugate gradient method

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Nwerneck (talk | contribs) at 05:04, 11 April 2010 (Major conceptual flaw: new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

First few lines are messy

I think that it is more correct to say: Conjugate gradient is a line search method for optimizing a function of several variables. It is often applied to solve linear systems of equations. (save the details for later) —Preceding unsigned comment added by 203.200.55.101 (talkcontribs) 05:55, 2 September 2006 (UTC)[reply]

This article is too condensed. Could you please give some more explanation? Where do the alphas come from? What is meant by: "This suggests taking the first basis vector p1 to be the gradient of f at x = x0, which equals -b"? Sorry, I don't understand what you are talking about.

Ulrich —Preceding unsigned comment added by 85.90.4.11 (talkcontribs) 07:46, 6 November 2006 (UTC)[reply]
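To sketch where the alphas and the "-b" come from, here is a standard derivation (a hedged reconstruction assuming the article's quadratic f(x) = ½xᵀAx − bᵀx, with r_k denoting the residual b − Ax_k and x_0 = 0; this is not quoted from the article itself):

```latex
% Minimizing f(x) = \tfrac{1}{2} x^T A x - b^T x along a search direction p_k:
% set g(\alpha) = f(x_k + \alpha p_k) and solve g'(\alpha) = 0.
g'(\alpha) = p_k^T (A x_k - b) + \alpha \, p_k^T A p_k = 0
\quad\Longrightarrow\quad
\alpha_k = \frac{p_k^T (b - A x_k)}{p_k^T A p_k} = \frac{p_k^T r_k}{p_k^T A p_k}.
% The gradient of f is \nabla f(x) = A x - b; at x_0 = 0 it equals -b,
% so the steepest-descent direction at the start is b, motivating p_1 = b.
```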

Comment

In your algorithm, the formula to calculate pk differs from Jonathan Richard Shewchuk's paper. The index of r should be k instead of k-1. Mmmh, sorry, it seems to be correct! ;-) —The preceding unsigned comment was added by 171.66.40.105 (talkcontribs) 01:45, 21 February 2007 (UTC)

Notation and clarification.

Could someone explain the beta? The article explains that \alpha_k is the kth component of the solution in the p basis (i.e., it is how far to go at the kth step). But what about beta? We have

\beta_k = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}

and it gets used to find the next p according to

p_{k+1} = r_{k+1} + \beta_k p_k.

The preceding section says that we'll update p by

p_{k+1} = r_{k+1} - \sum_{i \le k} \frac{p_i^T A r_{k+1}}{p_i^T A p_i} p_i

and

r_{k+1} = r_k - \alpha_k A p_k,

so

p_{k+1} = r_{k+1} - \frac{p_k^T A r_{k+1}}{p_k^T A p_k} p_k,

so

\beta_k = -\frac{p_k^T A r_{k+1}}{p_k^T A p_k}.

How does this algebra work out and what does the beta mean? Maybe it's the end of the day and my algebra is just failing me. Also, the Octave code should have similar variable names to the mathematics. —Ben FrantzDale (talk) 22:56, 15 May 2008 (UTC)[reply]
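On the point about variable names: here is a hedged sketch of the algorithm in Python/NumPy (not the article's Octave code) with names chosen to mirror the mathematics — `r` for the residual, `p` for the search direction, `alpha` and `beta` for the two coefficients:

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """CG for symmetric positive definite A, with variable names
    mirroring the math: r = residual, p = search direction,
    alpha = step length, beta = direction-update coefficient."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    r = b - A @ x               # r_0 = b - A x_0
    p = r.copy()                # p_0 = r_0
    rs_old = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)   # alpha_k = r_k^T r_k / (p_k^T A p_k)
        x = x + alpha * p
        r = r - alpha * Ap          # r_{k+1} = r_k - alpha_k A p_k
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        beta = rs_new / rs_old      # beta_k = r_{k+1}^T r_{k+1} / (r_k^T r_k)
        p = r + beta * p            # p_{k+1} = r_{k+1} + beta_k p_k
        rs_old = rs_new
    return x
```

For example, `conjugate_gradient(np.array([[4.0, 1.0], [1.0, 3.0]]), np.array([1.0, 2.0]))` solves that 2×2 system in at most two iterations.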

The explanation is presented in the context. Please derive it once again yourself using the hint provided. —Preceding unsigned comment added by 173.26.252.42 (talk) 21:46, 16 November 2009 (UTC)[reply]
While this is an old thread, it shows that this article has been unclear for quite some time about how the formulae for \alpha_k and \beta_k as presented in the "The resulting algorithm" section are derived from those in the previous sections. The root cause of the unclarity is that the article fails to point out the orthogonality of the residuals, i.e., r_i^T r_j = 0 for any i \neq j, which is an important property of the conjugate gradient method and crucial to the simplification of the formulae for \alpha_k and \beta_k.
With the orthogonality of the residuals, one can take advantage of the fact that p_k is the sum of r_k and some linear combination of r_0, \ldots, r_{k-1} to show that

p_k^T r_k = r_k^T r_k,

and thus

\alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}.

To simplify

\beta_k = -\frac{r_{k+1}^T A p_k}{p_k^T A p_k},

one can take advantage of the fact that A p_k = (r_k - r_{k+1}) / \alpha_k and that, for i \le k+1, r_{k+1}^T r_i \neq 0 only if i = k+1. Hence,

r_{k+1}^T A p_k = -\frac{r_{k+1}^T r_{k+1}}{\alpha_k},

which, together with \alpha_k \, p_k^T A p_k = r_k^T r_k, gives

\beta_k = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}.

All these derivations, as well as the orthogonality property, should really be readily available in the text to help people understand the subject more easily, rather than the overly brief and context-incoherent opening paragraph of the "The resulting algorithm" section. Not every reader can be assumed to have sufficient capability to derive these by themselves. Kxx (talk) 04:17, 5 February 2010 (UTC)[reply]
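The orthogonality of the residuals can also be checked numerically. Below is a hedged sketch (the random symmetric positive definite system is an illustrative assumption, not taken from the article) that runs CG while recording all residuals, checks at each step that the "simplified" beta agrees with the "unsimplified" one, and verifies r_i^T r_j ≈ 0 for i ≠ j at the end:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)       # random symmetric positive definite matrix
b = rng.standard_normal(6)

x = np.zeros(6)
r = b - A @ x
p = r.copy()
residuals = [r.copy()]            # kept only to verify orthogonality afterwards
for _ in range(6):
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x = x + alpha * p
    r_new = r - alpha * Ap
    residuals.append(r_new.copy())
    if np.linalg.norm(r_new) < 1e-10:
        break
    # The two beta formulas from the derivation above agree:
    beta_simplified = (r_new @ r_new) / (r @ r)
    beta_unsimplified = -(r_new @ Ap) / (p @ Ap)
    assert abs(beta_simplified - beta_unsimplified) < 1e-8
    p = r_new + beta_simplified * p
    r = r_new

# Residuals are mutually orthogonal: the Gram matrix is (numerically) diagonal.
R = np.array(residuals)
G = R @ R.T
assert np.all(np.abs(G - np.diag(np.diag(G))) < 1e-6)
```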

Minor point about quadratic forms

Is x'A'x + b'x really a quadratic form? It's not homogeneous, and my understanding is that quadratic forms must be homogeneous. Note that this is just a usage quibble; I have no complaint about the math. Birge (talk) 03:26, 16 May 2009 (UTC)[reply]

It is not, according to "Quadratic form". I have made the modifications. Kxx (talk) 05:48, 16 May 2009 (UTC)[reply]
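For readers unfamiliar with the distinction, a one-line illustration of why the linear term disqualifies the objective as a quadratic form (standard terminology, not quoted from the article):

```latex
% A quadratic form is homogeneous of degree 2: q(x) = x^T A x, so q(tx) = t^2 q(x).
% The CG objective f(x) = \tfrac{1}{2} x^T A x - b^T x is a quadratic *function*,
% not a quadratic form, because the linear term breaks homogeneity:
f(tx) = \frac{t^2}{2} x^T A x - t \, b^T x \neq t^2 f(x) \quad \text{unless } b^T x = 0.
```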

Small error in code

"end if" causes a syntax error in Octave. It should be "endif" or just "end". Italo Tasso (talk) 20:05, 20 July 2009 (UTC)[reply]

Major conceptual flaw

The article gives the impression that you need to keep in memory all previously calculated vectors in order to find a new one that is conjugate to all of them, but this is not true. The vectors are calculated one per iteration, based only on the previous one, and this is precisely one of the most important aspects of the whole theory.

Yes, it's that awesome: it's not a simplification, it's not a crazy new direction conjugate just to the last one, it really is conjugate to all previous directions in spite of our throwing them away! And it must be made clear that the algorithm is direct _and_ iterative. Sometimes you get a good approximation before the last step and can stop early, but this approximate algorithm is otherwise identical to the direct version. -- NIC1138 (talk) 05:04, 11 April 2010 (UTC)[reply]
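This point can be demonstrated numerically. Here is a hedged sketch (the random symmetric positive definite system is an assumption for illustration): every direction is stored only so we can verify afterwards that p_i^T A p_j ≈ 0 for all i ≠ j, even though each new direction was built from the previous one alone:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
A = M @ M.T + 8 * np.eye(8)   # symmetric positive definite
b = rng.standard_normal(8)

x = np.zeros(8)
r = b - A @ x
p = r.copy()
directions = []               # kept only to check conjugacy afterwards
for _ in range(8):
    directions.append(p.copy())
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x = x + alpha * p
    r_new = r - alpha * Ap
    if np.linalg.norm(r_new) < 1e-10:
        break
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p      # built from the PREVIOUS direction only
    r = r_new

# Yet each direction is A-conjugate to *all* earlier ones, not just the last:
P = np.array(directions)
C = P @ A @ P.T
off_diagonal = C - np.diag(np.diag(C))
assert np.all(np.abs(off_diagonal) < 1e-6 * np.max(np.abs(C)))
```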