Bisection method

In mathematics, the bisection method is a root-finding method that applies to any continuous function for which one knows two values with opposite signs. The method consists of repeatedly bisecting the interval defined by these values, then selecting the subinterval in which the function changes sign, which therefore must contain a root. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods.[1] The method is also called the interval halving method,[2] the binary search method,[3] or the dichotomy method.[4]
For polynomials, more elaborate methods exist for testing the existence of a root in an interval (Descartes' rule of signs, Sturm's theorem, Budan's theorem). They allow extending the bisection method into efficient algorithms for finding all real roots of a polynomial; see Real-root isolation.
The method
The method is applicable for numerically solving the equation $f(x) = 0$ for the real variable $x$, where $f$ is a continuous function defined on an interval $[a, b]$ and where $f(a)$ and $f(b)$ have opposite signs. In this case $a$ and $b$ are said to bracket a root since, by the intermediate value theorem, the continuous function $f$ must have at least one root in the interval $(a, b)$.
At each step the method divides the interval in two halves by computing the midpoint $c = (a + b)/2$ of the interval and the value of the function $f(c)$ at that point. If $c$ itself is a root then the process has succeeded and stops. Otherwise, there are now only two possibilities: either $f(a)$ and $f(c)$ have opposite signs and bracket a root, or $f(c)$ and $f(b)$ have opposite signs and bracket a root.[5] The method selects the subinterval that is guaranteed to be a bracket as the new interval to be used in the next step. In this way an interval that contains a zero of $f$ is reduced in width by 50% at each step. The process is continued until the interval is sufficiently small.
Explicitly, if $f(c) = 0$ then $c$ may be taken as the solution and the process stops.
Otherwise, if $f(a)$ and $f(c)$ have the same signs,
- then the method sets $c$ as the new value for $a$,
- else the method sets $c$ as the new value for $b$.
In both cases, the new $f(a)$ and $f(b)$ have opposite signs, so the method may be applied to this smaller interval.[6]
Once the process starts, the signs at the left and right ends of the interval remain the same for all iterations.
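For example (an illustrative sketch, not part of the cited sources), applying this update rule to $f(x) = x^2 - 2$ on $[1, 2]$ closes in on the root $\sqrt{2} \approx 1.41421$:

f = lambda x: x * x - 2   # illustrative example; root at sqrt(2) = 1.41421356...
a, b = 1.0, 2.0           # f(1) = -1 and f(2) = 2 have opposite signs
for _ in range(4):
    c = (a + b) / 2
    if f(a) * f(c) > 0:   # f(a) and f(c) have the same sign: set c as the new a
        a = c
    else:                 # otherwise set c as the new b
        b = c
    print(a, b)
# [1.0, 1.5], [1.25, 1.5], [1.375, 1.5], [1.375, 1.4375]: closing in on sqrt(2)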
Stopping conditions
In order to determine when the iteration should stop, it is necessary to consider various possible stopping conditions with respect to a tolerance ($\varepsilon$). Burden & Faires state:[7]
we can select a tolerance $\varepsilon > 0$ and generate $p_1, \dots, p_N$ until one of the following conditions is met:
$$|p_N - p_{N-1}| < \varepsilon, \quad (2.1)$$
$$\frac{|p_N - p_{N-1}|}{|p_N|} < \varepsilon, \quad p_N \neq 0, \quad (2.2)$$
$$|f(p_N)| < \varepsilon. \quad (2.3)$$
Unfortunately, difficulties can arise using any of these stopping criteria ... Without additional knowledge about $f$ or $p$, Inequality (2.2) is the best stopping criterion to apply because it comes closest to testing relative error.
The objective is to find an approximation, within the tolerance, to the root.
The following shows that (2.3) does not give such an approximation unless $|f'| \ge 1$ near the root.
If $r$ is a root of $f$, then $f(r) = 0$ and, for $c$ near $r$, $f(c) = f(c) - f(r) \approx f'(r)(c - r)$.
This means that (2.3) can be written as
$$|f'(r)(c - r)| < \varepsilon, \qquad \text{i.e.} \qquad |c - r| < \frac{\varepsilon}{|f'(r)|}.$$
Suppose, for the purpose of illustration, the tolerance is $\varepsilon = 10^{-6}$. Now take the case in which the root is $r = 0$, so that the approximate value $c$ is in the interval $\left(-\varepsilon/|f'(0)|,\ \varepsilon/|f'(0)|\right)$.
The problem is that, for small $|f'(r)|$ (i.e. when the slope near the root is small), the interval can be quite large - for example, if $f'(0) = 10^{-6}$, the approximation to 0 could be anywhere in $(-1, 1)$.
Hence, unless the magnitude of the slope near the root is $\ge 1$, this condition will not give a useful result - and the slope is unknown until the root is known.
In fact, using (2.3) involves a logical fallacy. It is obvious that if $c$ is sufficiently close to the root then $|f(c)|$ will become less than $\varepsilon$. That does not mean that, if $|f(c)| < \varepsilon$, then $c$ is close to the root!
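A small numerical sketch of this fallacy (the function here is a hypothetical illustration): for a function with a very shallow slope, $|f(c)| < \varepsilon$ holds even far from the root.

eps = 1e-6
f = lambda x: 1e-6 * x    # hypothetical function: root at 0, slope 1e-6
c = 0.9                   # far from the root...
print(abs(f(c)) < eps)    # ...yet True, so criterion (2.3) would accept c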
The other two possibilities represent different concepts - for example, with a tolerance $\varepsilon = 5 \times 10^{-d}$, the absolute difference (2.1) says that $c$ and $a$ are the same to $d$ decimal places, while the relative difference (2.2) says that $c$ and $a$ are the same to $d$ significant digits[8].
Note the use of $5 \times 10^{-d}$ for the tolerance. If 2 numbers differ by this amount then, depending on whether absolute differences or relative differences are being used, the numbers have either $d$ decimal places or $d$ significant digits in common. The use of a tolerance such as $10^{-d}$ provides no better information.
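As a hypothetical illustration of the distinction, using the $5 \times 10^{-d}$ convention above:

a, c = 12345.678, 12345.999
print(abs(c - a))           # 0.321 <= 5e-1: the same to 1 decimal place
print(abs(c - a) / abs(c))  # 2.6e-05 <= 5e-5: the same to 5 significant digits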
Iteration process
The input for the method is a continuous function $f$ and an interval $[a, b]$, such that the function values $f(a)$ and $f(b)$ are of opposite sign (there is at least one zero crossing within the interval). Each iteration performs these steps:
- Calculate $c$, the midpoint of the interval, $c = \frac{a + b}{2}$;
- Calculate the function value at the midpoint, $f(c)$;
- If $f(c) = 0$, return $c$;
- If convergence is satisfactory (that is, $c - a$ is sufficiently small, or $|f(c)|$ is sufficiently small), return $c$;
- Examine the sign of $f(c)$ and replace either $(a, f(a))$ or $(b, f(b))$ with $(c, f(c))$, so that there is a zero crossing within the new interval.
Algorithm
import numpy as np

def bisect(f, a, b, tol):
    if a > b:  # Check that the user has the correct order for endpoints.
        a, b = b, a
    fa = f(a)
    fb = f(b)
    i = 0
    if np.sign(fa) == np.sign(fb):  # Check for opposite signs.
        return float("NAN"), i, b - a, "Same sign"
    while True:
        i += 1
        c = (b + a) / 2
        fc = f(c)
        if b - a <= 5e-324:  # Check for min. interval width.
            return 5e-324, i, b - a, "Min sub"
        if fc == 0:  # Return if f(c) evaluates to 0.
            return c, i, b - a, "f(c)=0"
        if b - a <= abs(c) * tol:  # Test for convergence.
            return c, i, b - a, "Tol"
        if np.sign(fa) == np.sign(fc):
            a = c
        else:
            b = c
        if i > 2100:  # See example 3.
            return c, i, b - a, "Max iter"
Note that there is only one function evaluation per iteration.
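One way to verify this is to wrap the function in a call counter (an illustrative sketch; counting is a hypothetical helper used with the bisect function above):

def counting(f):
    def wrapped(x):
        wrapped.calls += 1
        return f(x)
    wrapped.calls = 0
    return wrapped

g = counting(lambda x: x * x - 2)
root, iters, width, cause = bisect(g, 1, 2, tol=5e-15)
print(g.calls == iters + 2)  # True: one call per iteration plus the two endpoints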
If the algorithm returns before tolerance is reached, it is possible to determine the number of significant figures that are returned.
If, after $i$ iterations, $(b - a) \le 5 \times 10^{-d}\,|c|$ is satisfied, then $d$ digits have been returned.
So $5 \times 10^{-d} \ge (b - a)/|c|$ can be solved - from which $d = \left\lfloor \log_{10}\!\left(\frac{5\,|c|}{b - a}\right) \right\rfloor$ is the number of digits returned.
import math

def compute_sig_digits(width, c):
    if width == 0 or c == 0:
        return 0
    ratio = (5 * abs(c)) / width
    return max(0, int(math.floor(math.log10(ratio))))
Testing
1) Test to see if f(a) and f(b) have the same sign:
bisect(lambda x: (x - 1.234567890), 0, 1, tol=5e-15)
Root = nan | Iters = 0
b - a = 1.0000000000000000e+00 Cause = Same sign
compute_sig_digits(b-a, c) = 0
2) Test to see what happens if the user has entered the endpoints in the wrong order:
bisect(lambda x: (x - 1.2345678901234567890), 2, 1, tol=5e-15)
Root = 1.2345678901234560243e+00 | Iters = 49
b - a = 3.5527136788005009e-15 Cause = Tol
compute_sig_digits(b-a, c) = 15
3) Determine the maximum number of iterations likely to be needed:
bisect(lambda x: (x - 1.2345678901234567890e-308), 1e-308, 1.1e308, tol=5e-15)
Root = 1.2345678901234546944e-308 | Iters = 2095
b - a = 4.9406564584124654e-323 Cause = Tol
compute_sig_digits(b-a, c) = 15
4) Test behavior for a very large root:
bisect(lambda x: (x - 1.2345678901234567890e307), 0, 2e307, tol=5e-15)
Root = 1.2345678901234559937e+307 | Iters = 50
b - a = 3.4927205416857597e+292 Cause = Tol
compute_sig_digits(b-a, c) = 15
5) Test behavior for a large root:
bisect(lambda x: (x - 1.2345678901234567890e50), 0, 2e50, tol=5e-15)
Root = 1.2345678901234562507e+50 | Iters = 50
b - a = 3.5307618638036828e+35 Cause = Tol
compute_sig_digits(b-a, c) = 15
6) Test behavior for a polynomial:
bisect(lambda x: (x**3 - x - 2), 0, 10, tol=5e-15)
Root = 1.5213797068045686878e+00 | Iters = 52
b - a = 4.4408920985006262e-15 Cause = Tol
compute_sig_digits(b-a, c) = 15
7) Test behavior for multiple roots I:
bisect(lambda x: (math.sin(x)), 0.01, 10, tol=5e-15)
Root = 3.1415926535897922278e+00 | Iters = 51
b - a = 8.8817841970012523e-15 Cause = Tol
compute_sig_digits(b-a, c) = 15
8) Test behavior for multiple roots II:
bisect(lambda x: (math.sin(x)), 7, 10, tol=5e-15)
Root = 9.4247779607693829007e+00 | Iters = 47
b - a = 4.2632564145606011e-14 Cause = Tol
compute_sig_digits(b-a, c) = 15
The root found depends on the initial interval.
9) Test behavior for a higher-order root:
bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-15)
Root = 1.2345678980942787685e-100 | Iters = 357
b - a = 6.8127357440130412e-108 Cause = f(c)=0
compute_sig_digits(b-a, c) = 7
This test shows, for the first time, a situation in which the cause of termination was f(c)=0 even though $c$ is not the exact root.
This occurs because, using computer arithmetic, a function can evaluate to zero even if it is not identically zero - here $c$ differs from the root by roughly $10^{-108}$, so $f(c) = (c - 1.23456789012345 \times 10^{-100})^3 \approx 10^{-324}$.
This number is less than the minimum subnormal and hence equates to 0.
The iteration stopped with $b - a \approx 6.8 \times 10^{-108}$, well before the tolerance was reached, but the algorithm still returned a result with 7 significant digits.
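This underflow is easy to reproduce directly (an illustrative sketch of IEEE double behavior):

x = 1e-108                 # roughly the distance between c and the root here
print(x ** 3)              # 0.0: the true value, 1e-324, underflows below
                           # 5e-324, the smallest positive subnormal double
import sys
print(sys.float_info.min)  # 2.2250738585072014e-308, the smallest normal double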
10) Test for a small root:
bisect(lambda x: (x - 1.2345678901234567890e-10), 0, 1, tol=5e-15)
Root = 1.2345678901234560005e-10 | Iters = 82
b - a = 4.1359030627651384e-25 Cause = Tol
Significant digits = 15
11) Test for a really small root:
bisect(lambda x: (x - 1.2345678901234567890e-308), 0, 1, tol=5e-15)
Root = 1.2345678901234551885e-308 | Iters = 1072
b - a = 3.9525251667299724e-323 Cause = Tol
Significant digits = 15
This is close to the minimum normal, and the final interval width is close to the minimum subnormal.
12) Test for a root that is less than the minimum normal:
bisect(lambda x: (x - 1.2345678901234567890e-315), 0, 1, tol=5e-15)
Root = 1.2345678910036845193e-315 | Iters = 1074
b - a = 9.8813129168249309e-324 Cause = f(c)=0
Significant digits = 8
Notice that the cause of the return was f(c)=0.
Even though the root is less than the minimum normal, 8 significant figures are obtained.
It is often stated that the bisection method is 'slow but sure'. The slowness refers to the fact that its convergence is linear, and its sureness comes from the fact that the root is always contained within an interval of decreasing width. As has been seen, this sureness depends on the stopping condition, with the relative error performing best.
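Linear convergence here means that each iteration gains roughly one binary digit of the root; a sketch (illustrative, with bisection_steps a hypothetical helper) of the resulting iteration count for an absolute target width:

import math

def bisection_steps(a, b, eps):
    # Steps needed for the bracketing interval to shrink to absolute width eps;
    # each step halves the interval, i.e. gains one binary digit of the root.
    return math.ceil(math.log2((b - a) / eps))

print(bisection_steps(1, 2, 5e-15))  # 48: about one bit per step; compare the
                                     # 49 iterations observed in test 2 above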
Popular alternatives to the bisection method are the Newton–Raphson method, the secant method, Ridders' method, and Brent's method. These typically exhibit a better order of convergence to the root, but they may require more function evaluations per step and/or may fail to converge.
The relatively new ITP method improves on the bisection method, achieving a higher order of convergence without trading off worst-case performance.[9]
Beware that most of the implementations of these algorithms use the absolute error as a stopping condition. Before using any algorithm, test to ensure that it produces correct results for small roots.
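A sketch of the danger (illustrative; bisect_abs is a hypothetical variant using an absolute stopping condition): for a root of magnitude $10^{-10}$ it returns a result with no correct significant digits.

def bisect_abs(f, a, b, eps=1e-6):
    # Bisection with an *absolute* stopping condition - for illustration only.
    while b - a > eps:
        c = (a + b) / 2
        if f(a) * f(c) > 0:
            a = c
        else:
            b = c
    return (a + b) / 2

root = 1.23456789e-10
print(bisect_abs(lambda x: x - root, 0.0, 1.0))  # ~4.8e-07: not even one
                                                 # significant digit is correct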
Generalization to higher dimensions
The bisection method has been generalized to multi-dimensional functions. Such methods are called generalized bisection methods.[10][11]
Methods based on degree computation
Some of these methods are based on computing the topological degree.[12]
Characteristic bisection method
The characteristic bisection method uses only the signs of a function at different points. Let $f$ be a function from $\mathbb{R}^d$ to $\mathbb{R}^d$, for some integer $d \ge 2$. A characteristic polyhedron[13] (also called an admissible polygon)[14] of $f$ is a polyhedron in $\mathbb{R}^d$, having $2^d$ vertices, such that at each vertex $v$, the combination of signs of $f(v)$ is unique. For example, for $d = 2$, a characteristic polyhedron of $f$ is a quadrilateral with vertices (say) $A, B, C, D$, such that:
- $\operatorname{sign} f(A) = (-,-)$, that is, $f_1(A) < 0,\ f_2(A) < 0$.
- $\operatorname{sign} f(B) = (-,+)$, that is, $f_1(B) < 0,\ f_2(B) > 0$.
- $\operatorname{sign} f(C) = (+,-)$, that is, $f_1(C) > 0,\ f_2(C) < 0$.
- $\operatorname{sign} f(D) = (+,+)$, that is, $f_1(D) > 0,\ f_2(D) > 0$.
A proper edge of a characteristic polygon is an edge between a pair of vertices whose sign vectors differ in only a single sign. In the above example, the proper edges of the characteristic quadrilateral are $AB$, $AC$, $BD$ and $CD$. A diagonal is a pair of vertices whose sign vectors differ in all $d$ signs. In the above example, the diagonals are $AD$ and $BC$.
At each iteration, the algorithm picks a proper edge of the polyhedron (say, $AB$), and computes the signs of $f$ at its midpoint (say, $M$). Then it proceeds as follows:
- If $\operatorname{sign} f(M) = \operatorname{sign} f(A)$, then $A$ is replaced by $M$, and we get a smaller characteristic polyhedron.
- If $\operatorname{sign} f(M) = \operatorname{sign} f(B)$, then $B$ is replaced by $M$, and we get a smaller characteristic polyhedron.
- Else, we pick a new proper edge and try again.
Suppose the diameter (= length of the longest proper edge) of the original characteristic polyhedron is $D$. Then, at least $\log_2(D/\varepsilon)$ bisections of edges are required so that the diameter of the remaining polyhedron will be at most $\varepsilon$.[14]: 11, Lemma 4.7
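The following minimal Python sketch shows one possible implementation of this iteration; the data layout, helper logic, and the diameter-based stopping rule are illustrative assumptions, not taken from the cited papers.

import numpy as np

def characteristic_bisection(f, vertices, tol=1e-8, max_iter=100_000):
    # f maps R^d to R^d; vertices are the 2^d vertices of a characteristic
    # polyhedron of f, i.e. all of their sign vectors are distinct.
    verts = [np.asarray(v, dtype=float) for v in vertices]
    sgn = [tuple(np.sign(f(v))) for v in verts]
    assert len(set(sgn)) == len(verts), "not a characteristic polyhedron"

    for _ in range(max_iter):
        # Proper edges: vertex pairs whose sign vectors differ in exactly one sign.
        edges = [(i, j)
                 for i in range(len(verts)) for j in range(i + 1, len(verts))
                 if sum(s != t for s, t in zip(sgn[i], sgn[j])) == 1]
        if not edges:
            return None
        edges.sort(key=lambda e: -np.linalg.norm(verts[e[0]] - verts[e[1]]))
        if np.linalg.norm(verts[edges[0][0]] - verts[edges[0][1]]) <= tol:
            return verts[edges[0][0]]      # All proper edges are now short.
        for i, j in edges:                 # Try the longest proper edge first.
            m = (verts[i] + verts[j]) / 2
            sm = tuple(np.sign(f(m)))
            if sm == sgn[i]:               # Replace the vertex whose sign
                verts[i] = m               # vector matches the midpoint's,
                break                      # shrinking the polyhedron.
            if sm == sgn[j]:
                verts[j] = m
                break
        else:
            return None                    # No proper edge could be bisected.
    return None

# Hypothetical usage for d = 2: f(x, y) = (x - 0.3, y - 0.7), root at (0.3, 0.7).
f = lambda p: np.array([p[0] - 0.3, p[1] - 0.7])
print(characteristic_bisection(f, [(0, 0), (1, 0), (0, 1), (1, 1)]))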
See also
- Binary search algorithm
- Lehmer–Schur algorithm, generalization of the bisection method in the complex plane
- Nested intervals
References
- ^ Burden & Faires 2016, p. 51
- ^ "Interval Halving (Bisection)". Archived from the original on 2013-05-19. Retrieved 2013-11-07.
- ^ Burden & Faires 2016, p. 48
- ^ "Dichotomy method - Encyclopedia of Mathematics". www.encyclopediaofmath.org. Retrieved 2015-12-21.
- ^ If the function has the same sign at the endpoints of an interval, the endpoints may or may not bracket roots of the function.
- ^ Burden & Faires 2016, p. 48
- ^ Burden & Faires 2016, p. 50
- ^ Burden & Faires 2016, p. 18
- ^ Ivo, Oliveira (2020-12-14). "An Improved Bisection Method".
- ^ Mourrain, B.; Vrahatis, M. N.; Yakoubsohn, J. C. (2002-06-01). "On the Complexity of Isolating Real Roots and Computing with Certainty the Topological Degree". Journal of Complexity. 18 (2): 612–640. doi:10.1006/jcom.2001.0636. ISSN 0885-064X.
- ^ Vrahatis, Michael N. (2020). Sergeyev, Yaroslav D.; Kvasov, Dmitri E. (eds.). "Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions". Numerical Computations: Theory and Algorithms. Cham: Springer International Publishing: 223–238. doi:10.1007/978-3-030-40616-5_17. ISBN 978-3-030-40616-5.
- ^ Kearfott, Baker (1979-06-01). "An efficient degree-computation method for a generalized method of bisection". Numerische Mathematik. 32 (2): 109–127. doi:10.1007/BF01404868. ISSN 0945-3245.
- ^ Vrahatis, Michael N. (1995-06-01). "An Efficient Method for Locating and Computing Periodic Orbits of Nonlinear Mappings". Journal of Computational Physics. 119 (1): 105–119. doi:10.1006/jcph.1995.1119. ISSN 0021-9991.
- ^ a b Vrahatis, M. N.; Iordanidis, K. I. (1986-03-01). "A rapid Generalized Method of Bisection for solving Systems of Non-linear Equations". Numerische Mathematik. 49 (2): 123–138. doi:10.1007/BF01389620. ISSN 0945-3245.
- Burden, Richard L.; Faires, J. Douglas (2016), "2.1 The Bisection Algorithm", Numerical Analysis (10th ed.), Cengage Learning, ISBN 978-1-305-25366-7
Further reading
- Corliss, George (1977), "Which root does the bisection algorithm find?", SIAM Review, 19 (2): 325–327, doi:10.1137/1019044, ISSN 1095-7200
- Kaw, Autar; Kalu, Egwu (2008), Numerical Methods with Applications (1st ed.), archived from the original on 2009-04-13
External links
- Bisection Method Notes, PPT, Mathcad, Maple, Matlab, Mathematica from Holistic Numerical Methods Institute