y = x³ − 6x² + 11x − 6.06 is as shown in fig. 1, without any
reduction of scale for the ordinate.
It is clear that, in general, y is a continuous one-valued function of x, finite for every finite value of x, but becoming infinite when x is infinite; i.e., assuming throughout that the coefficient of xⁿ is +1, then when x = ∞, y = +∞; but when x = −∞, then y = +∞ or −∞, according as n is even or odd; the curve cuts any line whatever, and in particular it cuts the axis (of x) in at most n points; and the value of x, at any point of intersection with the axis, is a root of the equation ƒ(x) = 0.
If β, α are any two values of x (α > β, that is, α nearer +∞), then if ƒ(β), ƒ(α) have opposite signs, the curve cuts the axis an odd number of times, and therefore at least once, between the points x = β, x = α; but if ƒ(β), ƒ(α) have the same sign, then between these points the curve cuts the axis an even number of times, or it may be not at all. That is, ƒ(β), ƒ(α) having opposite signs, there are between the limits β, α an odd number of real roots, and therefore at least one real root; but ƒ(β), ƒ(α) having the same sign, there are between these limits an even number of real roots, or it may be there is no real root. In particular, by giving to β, α the values −∞, +∞ (or, what is the same thing, any two values sufficiently near to these values respectively) it appears that an equation of an odd order has always an odd number of real roots, and therefore at least one real root; but that an equation of an even order has an even number of real roots, or it may be no real root.
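The sign test just described may be illustrated on the cubic of fig. 1. The following is only a minimal sketch; the function names `f` and `opposite_signs` are chosen here for illustration and are not part of the original text.

```python
def f(x):
    # The cubic drawn in fig. 1.
    return x**3 - 6 * x**2 + 11 * x - 6.06


def opposite_signs(beta, alpha):
    """True when f(beta) and f(alpha) have opposite signs."""
    return f(beta) * f(alpha) < 0


print(opposite_signs(1.0, 1.5))  # True: an odd number of roots between
print(opposite_signs(1.0, 2.0))  # False: an even number (possibly none)
print(opposite_signs(0.0, 4.0))  # True: an odd number of roots between
```

For this cubic the real roots are near 1.03, 1.94 and 3.03, so the limits 1, 1.5 include one root, the limits 1, 2 include two, and the limits 0, 4 include all three, in agreement with the signs found.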
If α be such that for x = or > α (that is, x nearer to +∞) ƒ(x) is always +, and β be such that for x = or < β (that is, x nearer to −∞) ƒ(x) is always −, then the real roots (if any) lie between these limits x = β, x = α; and it is easy to find by trial such two limits including between them all the real roots (if any).
3. Suppose that the positive value δ is an inferior limit to the difference between two real roots of the equation; or rather (since the foregoing expression would imply the existence of real roots) suppose that there are not two real roots such that their difference taken positively is = or < δ; then, γ being any value whatever, there is clearly at most one real root between the limits γ and γ + δ; and by what precedes there is such real root or there is not such real root, according as ƒ(γ), ƒ(γ + δ) have opposite signs or have the same sign. And by dividing in this manner the interval β to α into intervals each of which is = or < δ, we should not only ascertain the number of the real roots (if any), but we should also separate the real roots, that is, find for each of them limits γ, γ + δ between which there lies this one, and only this one, real root.
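The subdivision of the interval β to α into steps of length δ may be sketched as follows. The values β = 0, α = 4 and δ = 0.5 are assumptions made for the cubic of fig. 1 (its real roots differ by more than 0.5), and the name `separate_roots` is hypothetical; the sketch further assumes that no point of division falls exactly on a root.

```python
import math


def f(x):
    # The cubic drawn in fig. 1.
    return x**3 - 6 * x**2 + 11 * x - 6.06


def separate_roots(func, beta, alpha, delta):
    """Return the sub-intervals (gamma, gamma + delta) on which func changes sign."""
    intervals = []
    steps = math.ceil((alpha - beta) / delta)
    for k in range(steps):
        lo = beta + k * delta
        hi = min(beta + (k + 1) * delta, alpha)
        # Opposite signs at the two ends: exactly one root, since delta is
        # assumed smaller than the least difference between two real roots.
        if func(lo) * func(hi) < 0:
            intervals.append((lo, hi))
    return intervals


print(separate_roots(f, 0.0, 4.0, 0.5))
# [(1.0, 1.5), (1.5, 2.0), (3.0, 3.5)]
```

The number of intervals returned is the number of real roots between the limits, and each interval contains one root only.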
In particular cases it is frequently possible to ascertain the number of the real roots, and to effect their separation by trial or otherwise, without much difficulty; but the foregoing was the general process as employed by Joseph Louis Lagrange even in the second edition (1808) of the Traité de la résolution des équations numériques;[1] the determination of the limit δ had to be effected by means of the “equation of differences” or equation of the order ½ n(n − 1), the roots of which are the squares of the differences of the roots of the given equation, and the process is a cumbrous and unsatisfactory one.
4. The great step was effected by the theorem of J. C. F. Sturm (1835), viz. here starting from the function ƒ(x), and its first derived function ƒ′(x), we have (by a process which is a slight modification of that for obtaining the greatest common measure of these two functions) to form a series of functions
ƒ(x), ƒ′(x), ƒ₂(x), ... ƒₙ(x)
of the degrees n, n − 1, n − 2 ... 0 respectively, the last term ƒₙ(x) being thus an absolute constant. These lead to the immediate determination of the number of real roots (if any) between any two given limits β, α; viz. supposing α > β (that is, α nearer to +∞), then substituting successively these two values in the series of functions, and attending only to the signs of the resulting values, the number of the changes of sign lost in passing from β to α is the required number of real roots between the two limits. In particular, taking β, α = −∞, +∞ respectively, the signs of the several functions depend merely on the signs of the terms which contain the highest powers of x, and are seen by inspection, and the theorem thus gives at once the whole number of real roots.
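The formation of Sturm's functions and the counting of lost changes of sign may be sketched as below. This is an illustration under stated assumptions, not a transcription of Sturm's own arrangement: the "slight modification" of the greatest-common-measure process is taken to be the usual one (each remainder is used with its sign changed), polynomials are lists of coefficients with the highest power first, exact fractions are used, the equation is assumed to have no equal roots, and neither limit is assumed to be a root. All function names are hypothetical.

```python
from fractions import Fraction


def poly_eval(p, x):
    # Evaluate the polynomial p (highest power first) at x.
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def poly_deriv(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def poly_rem(a, b):
    # Remainder of a divided by b, by ordinary long division.
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading term is now zero
    while a and a[0] == 0:
        a.pop(0)
    return a


def sturm_chain(p):
    p = [Fraction(c) for c in p]
    chain = [p, poly_deriv(p)]
    while len(chain[-1]) > 1:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break  # equal roots: outside the assumptions of this sketch
        chain.append([-c for c in r])  # remainder with its sign changed
    return chain


def sign_changes(chain, x):
    values = [v for v in (poly_eval(q, x) for q in chain) if v != 0]
    return sum(1 for s, t in zip(values, values[1:]) if (s > 0) != (t > 0))


def count_real_roots(p, beta, alpha):
    chain = sturm_chain(p)
    return sign_changes(chain, beta) - sign_changes(chain, alpha)


cubic = [1, -6, 11, Fraction(-303, 50)]  # x^3 - 6x^2 + 11x - 6.06
print(count_real_roots(cubic, 0, 4))  # 3
print(count_real_roots(cubic, 1, 2))  # 2
```

For the cubic of fig. 1 the series has three changes of sign at x = 0 and none at x = 4, so that three changes are lost and there are three real roots between these limits.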
And although theoretically, in order to complete by a finite number of operations the separation of the real roots, we still need to know the value of the before-mentioned limit δ; yet in any given case the separation may be effected by a limited number of repetitions of the process. The practical difficulty is when two or more roots are very near to each other. Suppose, for instance, that the theorem shows that there are two roots between 0 and 10; by giving to x the values 1, 2, 3, ... successively, it might appear that the two roots were between 5 and 6; then again that they were between 5.3 and 5.4, then between 5.34 and 5.35, and so on until we arrive at a separation; say it appears that between 5.346 and 5.347 there is one root, and between 5.348 and 5.349 the other root. But in the case in question δ would have a very small value, such as .002, and even supposing this value known, the direct application of the first-mentioned process would be still more laborious.
5. Supposing the separation once effected, the determination of the single real root which lies between the two given limits may be effected to any required degree of approximation either by the processes of W. G. Horner and Lagrange (which are in principle a carrying out of the method of Sturm’s theorem), or by the process of Sir Isaac Newton, as perfected by Joseph Fourier (which requires to be separately considered).
First as to Horner and Lagrange. We know that between the limits β, α there lies one, and only one, real root of the equation; ƒ(β) and ƒ(α) have therefore opposite signs. Suppose any intermediate value is θ; in order to determine by Sturm’s theorem whether the root lies between β, θ, or between θ, α, it would be quite unnecessary to calculate the signs of ƒ(θ), ƒ′(θ), ƒ₂(θ) ...; only the sign of ƒ(θ) is required; for, if this has the same sign as ƒ(β), then the root is between θ, α; if the same sign as ƒ(α), then the root is between β, θ. We want to make θ increase from the inferior limit β, at which ƒ(θ) has the sign of ƒ(β), so long as ƒ(θ) retains this sign, and then to a value for which it assumes the opposite sign; we have thus two nearer limits of the required root, and the process may be repeated indefinitely.
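The narrowing of the limits by the sign of ƒ(θ) alone may be sketched thus. The text leaves the intermediate value θ arbitrary; taking it midway between the limits is an assumption made here for simplicity, and the name `narrow` and the tolerance are likewise hypothetical.

```python
def f(x):
    # The cubic drawn in fig. 1.
    return x**3 - 6 * x**2 + 11 * x - 6.06


def narrow(func, beta, alpha, tol=1e-10):
    """Shrink limits (beta, alpha) known to include exactly one real root."""
    sign_beta = func(beta) > 0
    while alpha - beta > tol:
        theta = (beta + alpha) / 2  # assumed choice of the intermediate value
        if (func(theta) > 0) == sign_beta:
            beta = theta   # same sign as f(beta): the root is between theta, alpha
        else:
            alpha = theta  # opposite sign: the root is between beta, theta
    return beta, alpha


lo, hi = narrow(f, 1.0, 1.5)
print(lo, hi)
```

Starting from the limits 1 and 1.5, which include one root of the cubic of fig. 1, the two limits returned differ by no more than the tolerance and still include the root.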
Horner’s method (1819) gives the root as a decimal, figure by figure; thus if the equation be known to have one real root between 0 and 10, it is in effect shown say that 5 is too small (that is, the root is between 5 and 6); next that 5.4 is too small (that is, the root is between 5.4 and 5.5); and so on to any number of decimals. Each figure is obtained, not by the successive trial of all the figures which precede it, but (as in the ordinary process of the extraction of a square root, which is in fact Horner’s process applied to this particular case) it is given presumptively as the first figure of a quotient; such value may be too large, and then the next inferior integer must be tried instead of it, or it may require to be further diminished. And it is to be remarked that the process not only gives the approximate value α of the root, but (as in the extraction of a square root) it includes the calculation of the function ƒ(α), which should be, and approximately is, = 0. The arrangement of the calculations is very elegant, and forms an integral part of the actual method. It is to be observed that after a certain number of decimal places have been obtained, a good many more can be found by a mere division. It is in the process tacitly assumed that the roots have been first separated.
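The figure-by-figure character of the method may be sketched on the extraction of a square root, here x² − 2 = 0. The sketch is a simplification and not Horner's own arrangement: each figure is found by trying 1, 2, 3, ... in turn rather than as the first figure of a quotient; the equation is assumed to have integer coefficients and exactly one root between 0 and 10; and the function names are hypothetical. After each figure the equation is transformed by diminishing its roots by that figure and multiplying them by 10.

```python
def poly_eval(p, x):
    # Evaluate the polynomial p (highest power first) at x.
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


def shift(p, d):
    # Coefficients of p(x + d), by repeated synthetic division.
    p = list(p)
    n = len(p) - 1
    for i in range(n):
        for j in range(1, n + 1 - i):
            p[j] += d * p[j - 1]
    return p


def horner_digits(p, count):
    """Successive decimal figures of the single root of p between 0 and 10."""
    q = list(p)
    digits = []
    for _ in range(count):
        d = 0
        if q[-1] != 0:  # q[-1] is the value of the function at the figures found
            while d < 9 and poly_eval(q, d + 1) * q[-1] >= 0:
                d += 1  # d + 1 is still not too large
        digits.append(d)
        q = shift(q, d)                               # diminish the roots by d
        q = [c * 10**k for k, c in enumerate(q)]      # multiply the roots by 10
    return digits


print(horner_digits([1, 0, -2], 6))  # [1, 4, 1, 4, 2, 1], i.e. 1.41421...
```

The last coefficient of each transformed equation is, up to a power of 10, the value of the function for the figures so far obtained, so that the calculation of ƒ(α) accompanies that of α.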
Lagrange’s method (1767) gives the root as a continued fraction a + 1/(b + 1/(c + ...)), where a is a positive or negative integer (which may be = 0), but b, c, ... are positive integers. Suppose the roots have been separated; then (by trial if need be of consecutive integer values) the limits may be made to be consecutive integer numbers: say they are a, a + 1; the value of x is therefore = a + 1/y, where y is positive and greater than 1; from the given equation for x, writing therein x = a + 1/y, we form an equation of the same order for y, and this equation will have one, and only one, positive root greater than 1; hence finding for it the limits b, b + 1 (where b is = or > 1), we have y = b + 1/z, where z is positive and greater than 1; and so on; that is, we thus obtain the successive denominators b, c, d ... of the continued fraction. The method is theoretically very elegant, but the disadvantage is that it gives the result in the form of a continued fraction, which for the most part must ultimately be converted into a decimal. There is one advantage in the method, that a commensurable root (that is, a root equal to a rational fraction) is found accurately, since, when such root exists, the continued fraction terminates.
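The successive substitutions x = a + 1/y, y = b + 1/z, ... may be sketched as follows. The sketch assumes integer coefficients and a single positive root reached by trial of the integers 0, 1, 2, ... (a negative a is not provided for); the function names are hypothetical. Writing x = a + 1/y and clearing fractions amounts to diminishing the roots by a and then reversing the order of the coefficients.

```python
def poly_eval(p, x):
    # Evaluate the polynomial p (highest power first) at x.
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


def shift(p, d):
    # Coefficients of p(x + d), by repeated synthetic division.
    p = list(p)
    n = len(p) - 1
    for i in range(n):
        for j in range(1, n + 1 - i):
            p[j] += d * p[j - 1]
    return p


def continued_fraction(p, terms):
    """Integers a, b, c, ... of the continued fraction for the root of p."""
    q = list(p)
    result = []
    k = 0  # trial for a begins at 0 (assumes a positive root)
    for _ in range(terms):
        if poly_eval(q, k) == 0:
            result.append(k)  # commensurable root: the fraction terminates
            break
        # Advance until the function changes sign between k and k + 1.
        while poly_eval(q, k) * poly_eval(q, k + 1) > 0:
            k += 1
        if poly_eval(q, k + 1) == 0:
            result.append(k + 1)  # commensurable root: the fraction terminates
            break
        result.append(k)
        q = shift(q, k)[::-1]  # substitute a + 1/y and clear fractions
        k = 1                  # every later denominator is = or > 1
    return result


print(continued_fraction([1, 0, -2], 5))  # [1, 2, 2, 2, 2]
print(continued_fraction([2, -3], 5))     # [1, 2]
```

For x² − 2 = 0 the denominators recur, giving 1 + 1/(2 + 1/(2 + ...)); for 2x − 3 = 0 the root 3/2 is commensurable and the fraction terminates as 1 + 1/2.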
6. Newton’s method (1711), as perfected by Fourier (1831), may be
[1] The third edition (1826) is a reproduction of that of 1808; the first edition has the date 1798, but a large part of the contents is taken from memoirs of 1767–1768 and 1770–1771.