Bisection method

In mathematics, the bisection method is a root-finding method that applies to any continuous function for which one knows two values with opposite signs. The method consists of repeatedly bisecting the interval defined by these values and then selecting the subinterval in which the function changes sign, and therefore must contain a root. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods.^[1] The method is also called the interval halving method,^[2] the binary search method,^[3] or the dichotomy method.^[4]

For polynomials, more elaborate methods exist for testing the existence of a root in an interval (Descartes' rule of signs, Sturm's theorem, Budan's theorem). They allow extending the bisection method into efficient algorithms for finding all real roots of a polynomial; see Real-root isolation.

The method

The method is applicable for numerically solving the equation $f(x)=0$ for the real variable $x$ , where $f$ is a continuous function defined on an interval $[a,b]$ and where $f(a)$ and $f(b)$ have opposite signs. In this case $a$ and $b$ are said to bracket a root since, by the intermediate value theorem, the continuous function $f$ must have at least one root in the interval $(a,b)$ .

At each step the method divides the interval into two halves by computing the midpoint $c=(a+b)/2$ of the interval and the value of the function $f(c)$ at that point. If $c$ itself is a root then the process has succeeded and stops. Otherwise, there are now only two possibilities: either $f(a)$ and $f(c)$ have opposite signs and bracket a root, or $f(c)$ and $f(b)$ have opposite signs and bracket a root.^[5] The method selects the subinterval that is guaranteed to be a bracket as the new interval to be used in the next step. In this way an interval that contains a zero of $f$ is reduced in width by 50% at each step. The process is continued until the interval is sufficiently small.
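Because the width is halved at every step, the number of steps needed to reach a given width can be estimated in advance. The following is a minimal sketch; the helper name `steps_needed` and the target width `eps` are assumptions introduced here for illustration and do not appear in the text.

```python
import math


def steps_needed(a, b, eps):
    """Smallest n with (b - a) / 2**n <= eps (hypothetical helper)."""
    if eps <= 0 or b <= a:
        raise ValueError("need a < b and eps > 0")
    # Each step halves the width, so n >= log2((b - a) / eps).
    return max(0, math.ceil(math.log2((b - a) / eps)))


print(steps_needed(0.0, 1.0, 1e-6))  # 20
```

This counts only the halvings required for an absolute width; it says nothing about the relative tolerance used later in the article.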

Explicitly, if $f(c)=0$ then $c$ may be taken as the solution and the process stops. Otherwise, if $f(a)$ and $f(c)$ have opposite signs, then the method sets $c$ as the new value for $b$ , and if $f(b)$ and $f(c)$ have opposite signs then the method sets $c$ as the new $a$ . In both cases, the new $f(a)$ and $f(b)$ have opposite signs, so the method is applicable to this smaller interval.^[6]
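The update rule can be traced on a small example. The sketch below is illustrative only: the helper `bracket_history` and the test function $x^{3}-x-2$ on $[1,2]$ are assumptions chosen for the demonstration, not part of the text. It records the bracket after each step.

```python
def bracket_history(f, a, b, steps):
    """Return the list of brackets (a, b) produced by `steps` bisections.

    Assumes f(a) and f(b) are nonzero and have opposite signs.
    """
    fa = f(a)
    history = [(a, b)]
    for _ in range(steps):
        c = a + (b - a) / 2
        fc = f(c)
        if fc == 0:
            history.append((c, c))  # c is a root; stop
            break
        if (fa < 0) == (fc < 0):
            a, fa = c, fc           # f(a), f(c) share a sign: root is in [c, b]
        else:
            b = c                   # f(a), f(c) differ in sign: root is in [a, c]
        history.append((a, b))
    return history


print(bracket_history(lambda x: x**3 - x - 2, 1.0, 2.0, 3))
# [(1.0, 2.0), (1.5, 2.0), (1.5, 1.75), (1.5, 1.625)]
```

The widths 1, 0.5, 0.25, 0.125 show the halving at each step.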

Algorithm

import numpy as np


def bisect(f, a, b, tol, bound=9.8813129168249309e-324):
    #############################################################################
    # input: function f,
    #        endpoint values a, b,
    #        tolerance tol (if tol = 5e-t and bound = 9.8813129168249309e-324,
    #                       the function returns t significant digits for a root
    #                       between the minimum normal and the maximum normal),
    #        bound (if bound = 9.8813129168249309e-324, the algorithm continues
    #               until the interval cannot be further divided; a larger value
    #               may result in termination before t digits are found).
    # conditions: f is a continuous function on the interval [a, b],
    #             a < b,
    #             and f(a)*f(b) < 0.
    # output: [root, iterations, convergence, termination condition]
    #############################################################################
    if b <= a:
        return [float("NAN"), 0, "No convergence", "b < a"]
    fa = f(a)
    fb = f(b)
    if np.sign(fa) == np.sign(fb):
        return [float("NAN"), 0, "No convergence", "f(a)*f(b) > 0"]
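    en = 0  # iteration counter
    while en < 2200:  # safety cap on the number of iterations
        en += 1
        if np.sign(a) == np.sign(b):  # same sign: a + b could overflow
            c = a + (b - a)/2
        else:  # opposite signs: b - a could overflow
            c = (a + b)/2
        fc = f(c)
        if b - a <= bound:  # interval is no wider than the bound
            return [bound, en, "No convergence", "Bound reached"]
        if fc == 0:  # exact zero in floating-point arithmetic
            return [c, en, "Converged", "f(c) = 0"]
        if b - a <= abs(c) * tol:  # relative width test
            return [c, en, "Converged", "Tolerance"]
        if np.sign(fa) == np.sign(fc):  # sign change lies in [c, b]
            a = c
            fa = fc
        else:  # sign change lies in [a, c]
            b = c

    return [float("NAN"), en, "No convergence", "Bad function"]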

The first two examples test for incorrect input values:

 1 bisect(lambda x: x -                  1, 5, 1, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root =                nan 
No convergence after 0 iterations with termination b < a
Final interval [               nan,                nan]

 2 bisect(lambda x: x -                  1, 5, 7, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root =                nan 
No convergence after 0 iterations with termination f(a)*f(b) > 0
Final interval [               nan,                nan]

Large roots:

 3 bisect(lambda x: x -  12345678901.23456, 0, 1.23457e+14, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root =  12345678901.23454 
Converged after 62 iterations with termination Tolerance
Final interval [1.2345678901234526e+10, 1.2345678901234552e+10]

 4 bisect(lambda x: x - 1.23456789012456e+100, 0, 2e+100, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567890124561e+100 
Converged after 50 iterations with termination Tolerance
Final interval [1.2345678901245599e+100, 1.2345678901245619e+100]

The final interval is computed as [c - w/2, c + w/2], where $w={\frac {b-a}{2^{n}}}$ and $n$ is the number of iterations. This gives a good measure of the accuracy of the approximation.
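The final-interval formula can be sketched as a small helper. The name `final_interval` is hypothetical; `a` and `b` are taken to be the initial endpoints and `n` the reported iteration count. `math.ldexp` is used for the division by $2^{n}$ so that large `n` does not overflow an intermediate power of two.

```python
import math


def final_interval(c, a, b, n):
    """Return (c - w/2, c + w/2) with w = (b - a) / 2**n (hypothetical helper)."""
    w = math.ldexp(b - a, -n)  # (b - a) * 2**(-n) without forming 2**n
    return (c - w / 2, c + w / 2)


print(final_interval(0.5, 0.0, 1.0, 1))  # (0.25, 0.75)
```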

Root near maximum:

 5 bisect(lambda x: x - 1.234567890123456e+307, 0, 1e+308, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567890123454e+307 
Converged after 52 iterations with termination Tolerance
Final interval [1.2345678901234535e+307, 1.2345678901234555e+307]

Small roots:

 6 bisect(lambda x: x - 1.234567890123456e-05, 0, 1, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567890123455e-05 
Converged after 65 iterations with termination Tolerance
Final interval [1.2345678901234537e-05, 1.2345678901234564e-05]

 7 bisect(lambda x: x - 1.234567890123456e-100, 0, 1, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567890123454e-100 
Converged after 381 iterations with termination Tolerance
Final interval [1.2345678901234532e-100, 1.2345678901234552e-100]

The root in Ex. 8 is below the minimum normal, in the subnormal range, but the method still gives a fairly good result because the approximation has a small interval. Calculations for values in the subnormal range can produce unexpected results.

 8 bisect(lambda x: x - 1.234567890123457e-310, 0, 1, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567890123457e-310 
Converged after 1071 iterations with termination f(c) = 0
Final interval [1.2345678901232595e-310, 1.2345678901236548e-310]

If the return state is ' $f(c)=0$ ', then the desired tolerance may not have been achieved. This can be checked by relaxing the tolerance (increasing tol) until a return state of 'Tolerance' is achieved.

8a bisect(lambda x: x - 1.234567890123457e-310, 0, 1, 5.000000e-13)
         Approx. root = 1.234567890123457e-310 
Converged after 1071 iterations with termination f(c) = 0
Final interval [1.2345678901232595e-310, 1.2345678901236548e-310]

8b bisect(lambda x: x - 1.234567890123457e-310, 0, 1, 5.000000e-12)
         Approx. root = 1.234567890124643e-310 
Converged after 1069 iterations with termination Tolerance
Final interval [1.2345678901238524e-310, 1.2345678901254334e-310]

8b shows that the result has 12 digits.
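The check in examples 8a and 8b can be automated by relaxing the tolerance one digit at a time. The sketch below is an assumption-laden illustration: `digits_achieved` is a hypothetical helper, and `fake_solve` is a stand-in for a call such as `bisect(f, a, b, tol)` that simply replays the return values of examples 8a and 8b, so that the block runs on its own.

```python
def digits_achieved(solve, start_t=15):
    """Try tol = 5e-t for t = start_t, start_t - 1, ... until the solver
    reports termination 'Tolerance'. Returns (t, result) or (None, None)."""
    for t in range(start_t, 0, -1):
        result = solve(5 * 10.0 ** -t)
        if result[3] == "Tolerance":
            return t, result
    return None, None


def fake_solve(tol):
    # Stand-in for bisect(f, a, b, tol); values copied from examples 8a and 8b.
    if tol < 4e-12:
        return [1.234567890123457e-310, 1071, "Converged", "f(c) = 0"]
    return [1.234567890124643e-310, 1069, "Converged", "Tolerance"]


t, result = digits_achieved(fake_solve)
print(t, result[3])  # 12 Tolerance
```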

Even though the root is outside the 'normal' range, it may still be possible to achieve results with good tolerance.

 9 bisect(lambda x: x - 1.234567891003685e-315, 0, 1, 5.000000e-03, 9.8813129168249309e-324)
         Approx. root = 1.23558592808891e-315 
Converged after 1055 iterations with termination Tolerance
Final interval [1.2342907646422757e-315, 1.2368810915355439e-315]

Ex. 10 shows the maximum number of iterations that should be expected:

10 bisect(lambda x: x - 1.234567891003685e-315, -1e+307, 1e+307, 5.000000e-15, 9.8813129168249309e-324)
         Approx. root = 1.234567891003685e-315 
Converged after 2093 iterations with termination f(c) = 0
Final interval [1.2345678910036845e-315, 1.2345678910036845e-315]

There may be situations in which a 'good' approximation is not required. This can be achieved by changing the 'Bound':

11 bisect(lambda x: x - 1.234567890123457e-100, 0, 1, 5.000000e-15, 4.9999999999999997e-12)
         Approx. root =              5e-12 
No convergence after 39 iterations with termination Bound reached
Final interval [4.0905052982270715e-12, 5.9094947017729279e-12]

Evaluation of the final interval may assist in determining accuracy.

The following shows the behavior of subnormal numbers and how significant digits are lost:

print(1.234567890123456e-310)
1.23456789012346e-310
print(1.234567890123456e-312)
1.234567890124e-312
print(1.234567890123456e-315)
1.23456789e-315
print(1.234567890123456e-317)
1.234568e-317
print(1.234567890123456e-319)
1.23457e-319
print(1.234567890123456e-321)
1.235e-321
print(1.234567890123456e-323)
1e-323
print(1.234567890123456e-324)
0.0
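The loss of digits can be estimated from the spacing between adjacent doubles. The sketch below assumes Python 3.9 or later (for `math.ulp`); the helper `approx_digits` is a hypothetical name and gives only a rough count of the decimal digits carried by a value.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # 2.2250738585072014e-308
MIN_SUBNORMAL = math.ulp(0.0)    # 5e-324, the smallest positive double


def approx_digits(x):
    """Rough number of decimal digits carried by the double x (x != 0)."""
    return -math.log10(math.ulp(x) / abs(x))


for x in (1.0, 1.234567890123456e-310, 1.234567890123456e-315,
          1.234567890123456e-321):
    print(x, round(approx_digits(x), 1))
```

Below `MIN_NORMAL` the spacing stays fixed at `MIN_SUBNORMAL`, so the estimate falls steadily as the value shrinks.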

These examples show that this method gives 15 digit accuracy for functions of the form $f(x)=(x-r)g(x)$ for all $r$ in the range of normal numbers.

Higher order roots

Further problems can arise from the use of computer arithmetic for higher order roots.

To see how inaccurate results can be detected and corrected, consider the following:

bisect(lambda x: (x - 1.23456789012345e-100), 0, 1, 5e-15)
Approx. root = 1.23456789012345e-100 Converged after 381 iterations with termination f(c) = 0
Final interval [1.2345678901234491e-100, 1.2345678901234511e-100]

The final interval [1.2345678901234491e-100, 1.2345678901234511e-100] indicates fairly good accuracy. The bisection method has a distinct advantage over other root finding techniques in that the final interval can be used to determine the accuracy of the final solution. This information will be useful in assessing the accuracy of some following examples.

Next consider what happens for a root of order 3:

bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-15)
Approx. root = 1.234567898094279e-100 Converged after 357 iterations with termination f(c) = 0
Final interval [1.2345678810624394e-100, 1.2345679151261181e-100]

The final interval [1.2345678810624394e-100, 1.2345679151261181e-100] indicates that 15 digits have not been returned.

The relative error

(1.234567898094279e-100 - 1.23456789012345e-100)/1.23456789012345e-100 
= 6.456371473106003e-09

shows that only 8 digits are correct and again $f(c)=0$ . This occurs because

${\begin{aligned}f({\text{approx. root}})&=f(1.234567898094279\times 10^{-100})\\&=(1.234567898094279\times 10^{-100}-1.23456789012345\times 10^{-100})^{3}\\&=(7.970828885817127\times 10^{-109})^{3}\\&=5.064195\times 10^{2}\times 10^{-327}\\&=5.064195\times 10^{-325}\end{aligned}}$

Because this is less than the minimum subnormal, it returns a value of 0.
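The underflow can be reproduced directly. This short check uses the two values from the calculation above; the variable names are chosen here for illustration.

```python
r = 1.23456789012345e-100    # exact root used in the example
c = 1.234567898094279e-100   # approximate root returned by the method
d = c - r                    # about 7.97e-109, clearly nonzero

print(d != 0.0)     # True: the approximation differs from the root
print(d**3 == 0.0)  # True: the cube underflows to zero
```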

This can occur in any root-finding technique, not just the bisection method. The problem can be diagnosed only because the returned values include the stopping criterion that was met.

The use of the relative error as a stopping condition allows us to determine how accurate a solution can be obtained.

Consider what happens on trying to achieve 8 significant figures:

bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-8)
[1.2345678980942788e-100, 357, 'Converged', 'f(c) = 0']

$f(c)=0$ indicates that eight digits of accuracy have not been achieved, so try

bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-4)
[1.2347947281308757e-100, 344, 'Converged', 'Tolerance']

At least four digits have been achieved and

bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-6)
[1.2345658202098768e-100, 351, 'Converged', 'Tolerance']

6 digit convergence

bisect(lambda x: (x - 1.23456789012345e-100)**3, 0, 1, 5e-7)
[1.2345677277758852e-100, 354, 'Converged', 'Tolerance']

7 digit convergence

A similar problem can arise if there are two small roots close together:

bisect(lambda x: (x - 1.23456789012345e-23)*x, 1e-300, 1, 5e-15)
[1.2345678901234481e-23, 125, 'Converged', 'Tolerance']

15 digit convergence

bisect(lambda x: (x - 1.23456789012345e-24)*x, 1e-300, 1e-20, 5e-1)
[1.5509016039626554e-300, 931, 'Converged', 'f(c) = 0']

Final interval [1.2754508019813276e-300, 1.8263524059439830e-300]
relative error = 3.5521376891678086e-1 -- 1 digit convergence

bisect(lambda x: (x - 1.23456789012345e-23)*x, 1e-300, 1, 5e-1)
[1.1580528575742387e-23, 79, 'Converged', 'Tolerance']

Final interval [1.0753347963189360e-23, 1.2407709188295415e-23]
relative error = 1.4285714285714285e-1 -- 1 digit convergence


Generalization to higher dimensions

The bisection method has been generalized to multi-dimensional functions. Such methods are called generalized bisection methods.^[7]^[8]

Methods based on degree computation

Some of these methods are based on computing the topological degree, which for a bounded region $\Omega \subseteq \mathbb {R} ^{n}$ and a differentiable function $f:\mathbb {R} ^{n}\rightarrow \mathbb {R} ^{n}$ is defined as a sum over its roots:

$\deg(f,\Omega ):=\sum _{y\in f^{-1}(\mathbf {0} )}\operatorname {sgn} \det(Df(y))$ ,

where $Df(y)$ is the Jacobian matrix, $\mathbf {0} =(0,0,...,0)^{T}$ , and

$\operatorname {sgn}(x)={\begin{cases}1,&x>0\\0,&x=0\\-1,&x<0\\\end{cases}}$

is the sign function.^[9] In order for a root to exist, it is sufficient that $\deg(f,\Omega )\neq 0$ , and this can be verified using a surface integral over the boundary of $\Omega$ .^[10]
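As a small worked instance of the definition, consider the illustrative map $f(x,y)=(x^{2}-1,\,y)$, which is an assumption chosen here and not taken from the text. Its roots are $(\pm 1,0)$ and its Jacobian determinant is $2x$. The sketch sums the signs over the roots lying in a region, which are supplied by hand.

```python
def sgn(x):
    return (x > 0) - (x < 0)


def jacobian_det(p):
    # For f(x, y) = (x**2 - 1, y) the Jacobian is [[2x, 0], [0, 1]].
    x, _ = p
    return 2 * x * 1 - 0 * 0


def degree(roots_in_region):
    """Sum of sgn det Df(y) over the given roots of f inside the region."""
    return sum(sgn(jacobian_det(p)) for p in roots_in_region)


print(degree([(1.0, 0.0)]))               # 1: region containing only (1, 0)
print(degree([(-1.0, 0.0), (1.0, 0.0)]))  # 0: region containing both roots
```

The second case shows that the condition is sufficient but not necessary: the degree is zero even though the region contains roots.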

Characteristic bisection method

The characteristic bisection method uses only the signs of a function at different points. Let f be a function from R^d to R^d, for some integer d ≥ 2. A characteristic polyhedron^[11] (also called an admissible polygon)^[12] of f is a polytope in R^d, having 2^d vertices, such that in each vertex v, the combination of signs of f(v) is unique and the topological degree of f on its interior is not zero (a sufficient condition for the existence of a root).^[13] For example, for d=2, a characteristic polyhedron of f is a quadrilateral with vertices (say) A,B,C,D, such that:

$\operatorname {sgn} f(A)=(-,-)$ , that is, f₁(A)<0, f₂(A)<0.
$\operatorname {sgn} f(B)=(-,+)$ , that is, f₁(B)<0, f₂(B)>0.
$\operatorname {sgn} f(C)=(+,-)$ , that is, f₁(C)>0, f₂(C)<0.
$\operatorname {sgn} f(D)=(+,+)$ , that is, f₁(D)>0, f₂(D)>0.

A proper edge of a characteristic polygon is an edge between a pair of vertices, such that the sign vector differs by only a single sign. In the above example, the proper edges of the characteristic quadrilateral are AB, AC, BD and CD. A diagonal is a pair of vertices, such that the sign vector differs by all d signs. In the above example, the diagonals are AD and BC.

At each iteration, the algorithm picks a proper edge of the polyhedron (say, AB), and computes the signs of f at its midpoint (say, M). Then it proceeds as follows:

If $\operatorname {sgn} f(M)=\operatorname {sgn} f(A)$ , then A is replaced by M, and we get a smaller characteristic polyhedron.
If $\operatorname {sgn} f(M)=\operatorname {sgn} f(B)$ , then B is replaced by M, and we get a smaller characteristic polyhedron.
Else, we pick a new proper edge and try again.

Suppose the diameter (= length of longest proper edge) of the original characteristic polyhedron is $D$ . Then, at least $\log _{2}(D/\varepsilon )$ bisections of edges are required so that the diameter of the remaining polygon will be at most $\varepsilon$ .^[12]^{: 11, Lemma.4.7} If the topological degree of the initial polyhedron is not zero, then there is a procedure that can choose an edge such that the next polyhedron also has nonzero degree.^[13]^[14]
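One edge-bisection step of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `sign_vector` and `bisect_edge` and the test function with root at (0.3, 0.2) are hypothetical, and the sketch does not include the edge-selection procedure that preserves nonzero degree.

```python
def sign_vector(f, p):
    """Tuple of signs (-1, 0 or 1) of the components of f(p)."""
    return tuple((v > 0) - (v < 0) for v in f(p))


def bisect_edge(f, A, B):
    """Bisect the proper edge AB. Return the updated pair (A, B),
    or None if the midpoint's sign vector matches neither endpoint."""
    M = tuple((a + b) / 2 for a, b in zip(A, B))
    sM = sign_vector(f, M)
    if sM == sign_vector(f, A):
        return M, B  # A is replaced by M
    if sM == sign_vector(f, B):
        return A, M  # B is replaced by M
    return None      # caller should pick another proper edge


def f(p):
    # Hypothetical test function with a root at (0.3, 0.2).
    return (p[0] - 0.3, p[1] - 0.2)


A, B = (-1.0, -1.0), (-1.0, 1.0)  # sgn f(A) = (-,-), sgn f(B) = (-,+)
print(bisect_edge(f, A, B))       # ((-1.0, 0.0), (-1.0, 1.0))
```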

References

[1] Burden & Faires 2014, p. 51
[2] "Interval Halving (Bisection)". Archived from the original on 2013-05-19. Retrieved 2013-11-07.
[3] Burden & Faires 2014, p. 28
[4] "Dichotomy method - Encyclopedia of Mathematics". www.encyclopediaofmath.org. Retrieved 2015-12-21.
[5] If the function has the same sign at the endpoints of an interval, the endpoints may or may not bracket roots of the function.
[6] Burden & Faires 2014, p. 28 for section
[7] Mourrain, B.; Vrahatis, M. N.; Yakoubsohn, J. C. (2002-06-01). "On the Complexity of Isolating Real Roots and Computing with Certainty the Topological Degree". Journal of Complexity. 18 (2): 612–640. doi:10.1006/jcom.2001.0636. ISSN 0885-064X.
[8] Vrahatis, Michael N. (2020). "Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions". In Sergeyev, Yaroslav D.; Kvasov, Dmitri E. (eds.). Numerical Computations: Theory and Algorithms. Lecture Notes in Computer Science. Vol. 11974. Cham: Springer International Publishing. pp. 223–238. doi:10.1007/978-3-030-40616-5_17. ISBN 978-3-030-40616-5. S2CID 211160947.
[9] Polymilis, C.; Servizi, G.; Turchetti, G.; Skokos, Ch.; Vrahatis, M. N. (May 2003). "Locating Periodic Orbits by Topological Degree Theory". Libration Point Orbits and Applications: 665–676. arXiv:nlin/0211044. doi:10.1142/9789812704849_0031. ISBN 978-981-238-363-1.
[10] Kearfott, Baker (1979-06-01). "An efficient degree-computation method for a generalized method of bisection". Numerische Mathematik. 32 (2): 109–127. doi:10.1007/BF01404868. ISSN 0945-3245. S2CID 122058552.
[11] Vrahatis, Michael N. (1995-06-01). "An Efficient Method for Locating and Computing Periodic Orbits of Nonlinear Mappings". Journal of Computational Physics. 119 (1): 105–119. Bibcode:1995JCoPh.119..105V. doi:10.1006/jcph.1995.1119. ISSN 0021-9991.
[12] Vrahatis, M. N.; Iordanidis, K. I. (1986-03-01). "A rapid Generalized Method of Bisection for solving Systems of Non-linear Equations". Numerische Mathematik. 49 (2): 123–138. doi:10.1007/BF01389620. ISSN 0945-3245. S2CID 121771945.
[13] Vrahatis, M.N.; Perdiou, A.E.; Kalantonis, V.S.; Perdios, E.A.; Papadakis, K.; Prosmiti, R.; Farantos, S.C. (July 2001). "Application of the Characteristic Bisection Method for locating and computing periodic orbits in molecular systems". Computer Physics Communications. 138 (1): 53–68. Bibcode:2001CoPhC.138...53V. doi:10.1016/S0010-4655(01)00190-4.
[14] Vrahatis, Michael N. (December 1988). "Solving systems of nonlinear equations using the nonzero value of the topological degree". ACM Transactions on Mathematical Software. 14 (4): 312–329. doi:10.1145/50063.214384.

Burden, Richard L.; Faires, J. Douglas (2014). "2.1 The Bisection Algorithm". Numerical Analysis (10th ed.). Cengage Learning. ISBN 978-0-87150-857-7.

External links

Weisstein, Eric W. "Bisection". MathWorld.
Bisection Method Notes, PPT, Mathcad, Maple, Matlab, Mathematica from Holistic Numerical Methods Institute

