Numerical integration

In scientific computing and numerical analysis, the term "numerical integration" is used to describe a broad family of algorithms for calculating the numerical value of a definite integral, and by extension, the term is also sometimes used to describe numerical algorithms for solving differential equations.

The most primitive numerical integration method is suggested by the Riemann integral:

\int _{0}^{1}f(x)dx\approx n^{-1}\sum _{k=1}^{n}f(k/n)

(*)

The sum on the right is termed a Riemann sum, since it looks like the integral of a step function (see the section on the Riemann integral.) Of course, it may not be possible in general to construe the sum on the right as being either a lower, or upper sum, however, if one can build actual lower and upper sums, then this algorithm can be made to yield an interval of certainty.

The disadvantage of this algorithm is that it has extremely poor convergence properties. In order to get any precision at all, one needs n to be extremely large, even for extremely well behaved functions. For instance, with

We note at this point that the step function in (*) was built by approximating f(x) over each interval [k/n,(k+1)/n] by the value of f at x=(k+1)/n, that is, the right-hand side of the interval. Had we chosen the middle of the interval instead, for a large class of functions (at least all twice differentiable functions) the error term of the algorithm would have been at least one order of magnitude better.

f(x)=x

the integral on the left of (*) is 1/2 while the sum on the right of (*) is 1/2 + 1/(2n). Hence, if we use a sum with n terms in it, the error we obtain in our numerical approximation is 1/(2n). To obtain, say, 9 digits of precision, we must use 10⁹ terms in the sum, which has two problems. The first problem is that roundoff (see numerical analysis) will begin to dominate our calculations. The second is that the running time for 10⁹ iterations of any algorithm is usually too expensive; it takes too long to run.

Quadrature

A very simple choice to approximate the integral on the left hand side of (*) would be to evaluate f(0) and f(1), then to approximate f with a linear polynomial between 0 and 1, and then calculate the exact area under this linear polynomial. This yields

\int _{0}^{1}f(x)dx\approx (f(0)+f(1))/2

This is called the trapeze rule of numerical integration. Instead of using a single trapeze, we can use multiple trapezes placed at regular intervals, in which case we obtain

\int _{0}^{1}f(x)dx\approx n^{-1}(f(0)/2+f(1)/2+\sum _{k=1}^{n-1}f(k/n))

The preceding formula is called the iterated trapeze rule.

Instead of using linear polynomials, one can use polynomials of any degree. This leads to more general quadrature algorithms, the best of which are known as Gaussian quadrature. The error for such quadrature can be shown to be very small, if the function f is very smooth (if it has many derivatives.)

Adaptive algorithms

If f does not have many derivatives at all points, or if the derivatives become large, then Gaussian quadrature is often insufficient. In this case, an algorithm similar to the following will perform better:

   // This algorithm calculates the definite integral of a function
   // from 0 to 1, adaptively, by choosing smaller steps near
   // problematic points.
   // Set initial_h to the initial step size.

   x:=0
   h:=initial_h
   accumulator:=0
   WHILE x<1.0 DO
       IF x+h>1.0 THEN
           h=1.0-x
       END IF
       IF error from quadrature over [x,x+h] for f is too large THEN
           Make h smaller
       ELSE
           accumulator:=accumulator + quadrature of f over [x,x+h]
           x:=x+h
           IF error from quadrature over [x,x+h] is very small THEN
               Make h larger
           END IF
       END IF
   END WHILE
   RETURN accumulator

Some details of the algorithm require careful thought. For many cases, estimating the error from quadrature over an interval for a function f isn't obvious. One popular solution is to use two different rules of quadrature, and use their difference as an estimate of the error from quadrature. The other problem is deciding what "too large" or "very small" signify. A possible criterion for "too large" is that the quadrature error should not be larger than th where t, a real number, is the tolerance we wish to set for global error. Then again, if h is already tiny, it may not be worthwhile to make it even smaller even if the quadrature error is apparently large. This type of error analysis is usually called "a posteriori" since we compute the error after having computed the approximation.

Of course, more sophisticated heuristics are possible.

Conservative (a priori) error estimation

Let f have a continuous first derivative over [a,b]. The mean value theorem for f, where x<b, gives

(x-a)f'(y_{x})=f(x)-f(a)

for some y_x in [a,x] depending on x. If we integrate in x from a to b on both sides and the take absolute values, we obtain

|\int _{a}^{b}f(x)dx-(b-a)f(a)|=|\int (x-a)f'(y_{x})dx|

We can further approximate the integral on the right-hand side by bringing the absolute value into the integrand, and replacing the term in f' by an upper bound:

|\int _{a}^{b}f(x)dx-(b-a)f(a)|\leq (b-a)^{2}\max _{a\leq x\leq b}|f'(x)|/2

(**)

Hence, if we approximate the integral ∫_a^bf(x)dx by the quadrature rule (b-a)f(a) our error is no greater than the right hand side of (**). We can convert this into an error analysis for the Riemann sum (*), giving an upper bound of

n^{-1}\max _{0\leq x\leq 1}|f'(x)|

for the error term of that particular approximation. (Note that this is very close to the error we calculated for the example f(x)=x.) Using more derivatives, and by tweaking the quadrature, we can do a similar error analysis using a Taylor series (using a partial sum with remainder term) for f. This error analysis gives a strict upper bound on the error, if the derivatives of f are available.

This integration method can be combined with interval arithmetic to produce computer proofs and verified calculations.