
Talk:Conjugate gradient method

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-class on Wikipedia's content assessment scale.
This article has been rated as Low-priority on the project's priority scale.

Discussion of the correctness of the algorithm

The explanation skips a large and important part of the correctness proof, which shows that the mentioned "Gram–Schmidt orthonormalization" only needs to consider the latest direction p_k and need not be performed against all previous directions p_j with j < k. This can be deduced from the fact that the r_k and the p_k span the same Krylov subspaces, but it should be highlighted how this implies that p_j^T r_k = 0 for j < k and p_j^T A r_k = 0 for j < k - 1.
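A small numerical check of the two relations above (a sketch in Octave using a random SPD test matrix, not taken from the article): run plain CG, store every direction p_j and residual r_k, and look at the largest of the supposedly vanishing inner products.

% Sketch: verify p_j'*r_k = 0 for j < k and p_j'*A*r_k = 0 for j < k-1.
n = 8;
B = randn(n);
A = B'*B + n*eye(n);          % symmetric positive definite test matrix
b = randn(n, 1);

x = zeros(n, 1);
r = b - A*x;
p = r;
P = p;  R = r;                % column j+1 holds p_j and r_j
m = n;
for k = 1:n
  Ap = A*p;
  rr = r'*r;
  alpha = rr / (p'*Ap);
  x = x + alpha*p;
  r = r - alpha*Ap;
  p = r + ((r'*r)/rr)*p;
  P = [P, p];  R = [R, r];
  if norm(r) < 1e-12*norm(b)
    m = k;
    break;
  end
end

% Largest magnitude of the inner products that should vanish exactly
% in exact arithmetic (they come out at rounding-error level).
max_pr = 0;  max_pAr = 0;
for k = 1:m
  for j = 0:k-1
    max_pr = max(max_pr, abs(P(:, j+1)' * R(:, k+1)));
  end
  for j = 0:k-2
    max_pAr = max(max_pAr, abs(P(:, j+1)' * A * R(:, k+1)));
  end
end
printf("max |p_j' r_k|   (j < k):   %g\n", max_pr);
printf("max |p_j' A r_k| (j < k-1): %g\n", max_pAr);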

Comment

In your algorithm, the formula to calculate p_k differs from Jonathan Richard Shewchuk's paper: the index of r should be k instead of k-1. Mmmh, sorry, it seems to be correct! ;-) —The preceding unsigned comment was added by 171.66.40.105 (talk • contribs) 01:45, 21 February 2007 (UTC)

additional comment:

basis vector set $P = \{ p_0, \dots, p_{n-1} \}$.
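For reference, the updates in question, in the standard Hestenes–Stiefel form (with $p_0 = r_0 = b - A x_0$); the new direction uses the newest residual $r_{k+1}$, while the denominator of $\beta_k$ uses the previous one:

$\alpha_k = \dfrac{r_k^T r_k}{p_k^T A p_k}, \qquad x_{k+1} = x_k + \alpha_k p_k, \qquad r_{k+1} = r_k - \alpha_k A p_k,$

$\beta_k = \dfrac{r_{k+1}^T r_{k+1}}{r_k^T r_k}, \qquad p_{k+1} = r_{k+1} + \beta_k p_k.$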

Monotonic improvements?

"Fortunately, the conjugate gradient method can be used as an iterative method as it provides monotonically improving approximations x_k to the exact solution." Is this really true? I find, in practice, that the residual norm(A*x_k - b) is definitely not monotonic w.r.t. k. As an example, I invite you to try the following program with the given octave code. You will see that the residuals, when x_0=[0;0;0], are [1.3776, 1.3986, 1.6422, 1.4e-14].

A = [0.100397715109637, 0.333310651201826, 0.197818584877521; 0.333310651201826, 1.443856474586787, 0.826552007973684; 0.197818584877521, 0.826552007973684, 1.143082020356276];

b = [0.964888535199277, 0.157613081677548, 0.970592781760616]';

Rocketman768 (talk) 19:32, 16 May 2012 (UTC)[reply]

A note has been added that the monotonicity is in the energy norm, so the issue is resolved as of now. — Preceding unsigned comment added by 24.61.185.230 (talk) 16:43, 15 March 2021 (UTC)[reply]
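A sketch in Octave that reproduces the observation and the resolution above with the A and b given in this thread, printing both the Euclidean residual norm and the energy norm of the error at each step (the latter is the quantity CG decreases monotonically):

% Standard CG on the 3x3 system above, tracking ||b - A*x_k|| and ||x* - x_k||_A.
A = [0.100397715109637, 0.333310651201826, 0.197818584877521;
     0.333310651201826, 1.443856474586787, 0.826552007973684;
     0.197818584877521, 0.826552007973684, 1.143082020356276];
b = [0.964888535199277, 0.157613081677548, 0.970592781760616]';
xstar = A \ b;                 % exact solution, used only to measure the error

x = zeros(3, 1);               % x_0 = [0; 0; 0], as in the comment above
r = b - A*x;
p = r;
e = xstar - x;
printf("k=0  ||b-Ax|| = %.4e   ||x*-x||_A = %.4e\n", norm(r), sqrt(e'*A*e));
for k = 1:3
  Ap = A*p;
  rr = r'*r;
  alpha = rr / (p'*Ap);
  x = x + alpha*p;
  r = r - alpha*Ap;
  p = r + ((r'*r)/rr)*p;
  e = xstar - x;
  printf("k=%d  ||b-Ax|| = %.4e   ||x*-x||_A = %.4e\n", k, norm(r), sqrt(e'*A*e));
end

The residual norms should match the non-monotone sequence quoted above, while the energy-norm column decreases at every step.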

Complex (preconditioned) CG

The CG algorithm extends to complex matrices and vectors in both the original and the preconditioned case. The derivation can be found in Iterative Methods for Solving Linear Systems by A. Greenbaum. The changes to the algorithms are trivial: the transposes are replaced by the Hermitian (conjugate) transpose (which I always denote as superscript H). In the preconditioned case it is necessary to take care over whether it is the update for z or for r that is conjugated (in the real case it doesn't matter which is transposed), but the algorithm shown is correct. I suggest a comment should be made somewhere that the complex case is also covered. Finding references for this is _not_ trivial (almost universally the derivations say "For real..."). [edit: claiming ownership of last comment] Hgomersall (talk) 19:19, 4 March 2014 (UTC)[reply]

Done by adding a new section. — Preceding unsigned comment added by 24.61.185.230 (talk) 16:45, 15 March 2021 (UTC)[reply]
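For illustration, a sketch of the complex preconditioned iteration in Octave (not taken from the article or from Greenbaum); the random Hermitian test matrix, the Jacobi preconditioner, and the tolerance below are made up. In Octave the operator ' is already the conjugate (Hermitian) transpose, so the only real change from the real-valued algorithm is being consistent about which factor in each inner product carries the conjugation.

% Preconditioned CG for a Hermitian positive definite A with HPD preconditioner M.
n = 50;
B = randn(n) + 1i*randn(n);
A = B'*B + n*eye(n);          % Hermitian positive definite test matrix
M = diag(real(diag(A)));      % simple Jacobi preconditioner, also HPD
b = randn(n, 1) + 1i*randn(n, 1);

x = zeros(n, 1);
r = b - A*x;
z = M \ r;
p = z;
rz = r'*z;                    % r^H z: the conjugation sits on r throughout
for k = 1:200
  Ap = A*p;
  alpha = rz / (p'*Ap);       % p^H A p is real and positive for HPD A
  x = x + alpha*p;
  r = r - alpha*Ap;
  if norm(r) < 1e-10*norm(b), break; end
  z = M \ r;
  rz_new = r'*z;
  beta = rz_new / rz;
  p = z + beta*p;
  rz = rz_new;
end
printf("iterations: %d, relative residual: %g\n", k, norm(b - A*x)/norm(b));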

Conflicting information under "The resulting algorithm" subsection

In the subsection I can see the (quoted) claim that $r_i^T r_j = 0$ for $i \neq j$. However, in the algorithm, in the calculation of $\beta$, the term $r_k^T r_k$ comes up in the denominator, which does not look right to me. In my opinion, the denominator should be $r_k^T r_{k+1}$. However, I could be incorrect, and hence I would like to bring this to the notice of the experts. — Preceding unsigned comment added by Zenineasa (talk • contribs) 13:25, 26 November 2021 (UTC)[reply]
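One way to see why $r_k^T r_{k+1}$ cannot be the denominator: it is zero by the very orthogonality quoted above. A sketch of where the actual denominator comes from, starting from the Gram–Schmidt definition of $\beta_k$ (not verbatim from the article): since $r_{k+1} = r_k - \alpha_k A p_k$ and $r_k^T r_{k+1} = 0$,

$r_{k+1}^T r_{k+1} = r_{k+1}^T r_k - \alpha_k\, r_{k+1}^T A p_k = -\alpha_k\, r_{k+1}^T A p_k, \qquad \alpha_k = \dfrac{r_k^T r_k}{p_k^T A p_k},$

so

$\beta_k = -\dfrac{r_{k+1}^T A p_k}{p_k^T A p_k} = \dfrac{r_{k+1}^T r_{k+1}}{r_k^T r_k}.$

The denominator $r_k^T r_k$ is nonzero until convergence, whereas $r_k^T r_{k+1}$ is identically zero.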

Missing steps thinking about it as a linear solver

I can't wrap my head around the fact that it's optimizing

$f(x) = \tfrac{1}{2} x^T A x - b^T x$.

Abstractly, I think (?) the idea is that we choose $x$ such that

$\nabla f(x) = A x - b = 0$,

and so the residual vector is (up to sign) the gradient of the surface. But coming at it from the perspective of "let's minimize the residual", my line of thinking is: let

$g(x) = \|A x - b\|^2 = x^T A^T A x - 2\, b^T A x + b^T b$,

which we could drop the constant from. That's similar to, but decidedly different from, $f(x)$. What gives? Is minimizing $f$ also minimizing the residual? Can the norm of the residual be massaged to look like $f$? —Ben FrantzDale (talk) 21:30, 18 January 2023 (UTC)[reply]
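One standard identity that may be the "massaging" being asked for (a sketch, assuming $A$ is symmetric positive definite and writing $x^* = A^{-1} b$):

$(b - Ax)^T A^{-1} (b - Ax) = x^T A x - 2\, b^T x + b^T A^{-1} b = 2 f(x) + b^T A^{-1} b,$

so minimizing $f$ is the same as minimizing the residual measured in the $A^{-1}$-norm (equivalently, the error in the $A$-norm, since $\|b - Ax\|_{A^{-1}} = \|x^* - x\|_A$), rather than the Euclidean norm $\|Ax - b\|_2$ that the least-squares functional $g$ minimizes.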