roughly stated as follows. If x = γ be an approximate value of any root, and γ + h the correct value, then ƒ(γ + h) = 0, that is,
ƒ(γ) + (h/1)ƒ′(γ) + {h²/(1·2)}ƒ″(γ) + ... = 0;
and then, if h be so small that the terms after the second may be neglected, ƒ(γ) + hƒ′(γ) = 0, that is, h = {−ƒ(γ)/ƒ′(γ) }, or the new approximate value is x = γ − {ƒ(γ)/ƒ′(γ) }; and so on, as often as we please. It will be observed that so far nothing has been assumed as to the separation of the roots, or even as to the existence of a real root; γ has been taken as the approximate value of a root, but no precise meaning has been attached to this expression. The question arises, What are the conditions to be satisfied by γ in order that the process may by successive repetitions actually lead to a certain real root of the equation; or that, γ being an approximate value of a certain real root, the new value γ − {ƒ(γ)/ƒ′(γ) } may be a more approximate value.
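The correction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cubic x³ − 2x − 5, the starting value 2, the number of repetitions, and the names `newton_step` and `newton` are all chosen here for the example and do not come from the text.

```python
def newton_step(f, df, gamma):
    """Return gamma - f(gamma)/f'(gamma), the new approximate value."""
    return gamma - f(gamma) / df(gamma)


def newton(f, df, gamma, steps=6):
    """Repeat the correction a fixed number of times, keeping every value."""
    values = [gamma]
    for _ in range(steps):
        gamma = newton_step(f, df, gamma)
        values.append(gamma)
    return values


# Hypothetical example: f(x) = x^3 - 2x - 5, with assumed first value gamma = 2.
def f(x):
    return x**3 - 2 * x - 5


def df(x):
    return 3 * x**2 - 2


approximations = newton(f, df, 2.0)
for value in approximations:
    print(value, f(value))
```

Nothing in this sketch checks whether the starting value is a suitable one; it simply repeats the step, which is exactly the situation the question that follows is concerned with.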
Fig. 1.
Referring to fig. 1, it is easy to see that if OC represent the assumed value γ, then, drawing the ordinate CP to meet the curve in P, and the tangent PC′ to meet the axis in C′, we shall have OC′ as the new approximate value of the root. But observe that there is here a real root OX, and that the curve beyond X is convex to the axis; under these conditions the point C′ is nearer to X than was C; and, starting with C′ instead of C, and proceeding in like manner to draw a new ordinate and tangent, and so on as often as we please, we approximate continually, and that with great rapidity, to the true value OX. But if C had been taken on the other side of X, where the curve is concave to the axis, the new point C′ might or might not be nearer to X than was the point C; and in this case the method, if it succeeds at all, does so by accident only, i.e. it may happen that C′ or some subsequent point comes to be a point C, such that CO is a proper approximate value of the root, and then the subsequent approximations proceed in the same manner as if this value had been assumed in the first instance, all the preceding work being wasted. It thus appears that for the proper application of the method we require more than the mere separation of the roots. In order to be able to approximate to a certain root α, = OX, we require to know that, between OX and some value ON, the curve is always convex to the axis (analytically, between the two values, ƒ(x) and ƒ″(x) must have always the same sign). When this is so, the point C may be taken anywhere on the proper side of X, and within the portion XN of the axis; and the process is then the one already explained. The approximation is in general a very rapid one. 
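The sign condition stated above (ƒ(x) and ƒ″(x) of the same sign between the root and the starting value) may be illustrated as follows. The function x² − 2, the starting values 3 and 0.1, and the helper names are assumptions made for this sketch only.

```python
import math


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


def d2f(x):
    return 2.0


def on_convex_side(x):
    """True where f(x) and f''(x) have the same sign."""
    return f(x) * d2f(x) > 0


def newton_path(x, steps):
    """Successive values obtained by drawing ordinate and tangent repeatedly."""
    path = [x]
    for _ in range(steps):
        x = x - f(x) / df(x)
        path.append(x)
    return path


root = math.sqrt(2.0)

# Start on the proper side of the root: each value is nearer than the last.
good = newton_path(3.0, 5)

# Start on the other side: the first new value is farther off than the start.
bad = newton_path(0.1, 1)

print(on_convex_side(3.0), good)
print(on_convex_side(0.1), bad)
```

From the starting value 3 the successive values decrease steadily towards the root. From 0.1 the first step overshoots to the far side, and any later success comes only because the process happens to land on the proper side.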
If we know for the required root OX the two limits OM, ON such that from M to X the curve is always concave to the axis, while from X to N it is always convex to the axis, then, taking D anywhere in the portion MX and (as before) C in the portion XN, drawing the ordinates DQ, CP, and joining the points P, Q by a line which meets the axis in D′, also constructing the point C′ by means of the tangent at P as before, we have for the required root the new limits OD′, OC′; and proceeding in like manner with the points D′, C′, and so on as often as we please, we obtain at each step two limits approximating more and more nearly to the required root OX. The process as to the point D′, translated into analysis, is the ordinary process of interpolation. Suppose OD = β, OC = α, we have approximately ƒ(β + h) = ƒ(β) + h{ƒ(α) − ƒ(β) } / (α − β), whence if the root is β + h then h = − (α − β)ƒ(β) / {ƒ(α) − ƒ(β) }.
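The construction of the two limits OD′, OC′ can be sketched as below. The function x² − 2, the initial limits 1 and 2, the four repetitions, and the name `narrow` are assumptions of this example; the two formulas are the chord correction h given above and the tangent correction −ƒ(α)/ƒ′(α).

```python
import math


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


def narrow(beta, alpha):
    """One step: the chord through Q, P gives D'; the tangent at P gives C'."""
    h = -(alpha - beta) * f(beta) / (f(alpha) - f(beta))
    d_new = beta + h                       # new lower limit OD'
    c_new = alpha - f(alpha) / df(alpha)   # new upper limit OC'
    return d_new, c_new


# Assumed limits: beta = 1 on the concave side, alpha = 2 on the convex side.
limits = [(1.0, 2.0)]
for _ in range(4):
    limits.append(narrow(*limits[-1]))

for lower, upper in limits:
    print(lower, upper, upper - lower)
```

The number of repetitions is kept small on purpose: once the two limits agree to machine precision the denominator ƒ(α) − ƒ(β) vanishes, and a practical routine would have to stop before that point.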
Returning for a moment to Horner’s method, it may be remarked that the correction h, to an approximate value α, is therein found as a quotient the same or such as the quotient ƒ(α) ÷ ƒ′(α) which presents itself in Newton’s method. The difference is that with Horner the integer part of this quotient is taken as the presumptive value of h, and the figure is verified at each step. With Newton the quotient itself, developed to the proper number of decimal places, is taken as the value of h; if too many decimals are taken, there would be a waste of work; but the error would correct itself at the next step. Of course the calculation should be conducted without any such waste of work.
Imaginary Theory.
7. It will be recollected that the expression number and the correlative epithet numerical were at the outset used in a wide sense, as extending to imaginaries. This extension arises out of the theory of equations by a process analogous to that by which number, in its original most restricted sense of positive integer number, was extended to have the meaning of a real positive or negative magnitude susceptible of continuous variation.
If for a moment number is understood in its most restricted sense as meaning positive integer number, the solution of a simple equation leads to an extension; ax − b = 0 gives x = b/a, a positive fraction, and we can in this manner represent, not accurately, but as nearly as we please, any positive magnitude whatever; so an equation ax + b = 0 gives x = −b/a, which (approximately as before) represents any negative magnitude. We thus arrive at the extended signification of number as a continuously varying positive or negative magnitude. Such numbers may be added or subtracted, multiplied or divided one by another, and the result is always a number. Now from a quadric equation we derive, in like manner, the notion of a complex or imaginary number such as is spoken of above. The equation x² + 1 = 0 is not (in the foregoing sense, number = real number) satisfied by any numerical value whatever of x; but we assume that there is a number which we call i, satisfying the equation i² + 1 = 0, and then taking a and b any real numbers, we form an expression such as a + bi, and use the expression number in this extended sense: any two such numbers may be added or subtracted, multiplied or divided one by the other, and the result is always a number. And if we consider first a quadric equation x² + px + q = 0 where p and q are real numbers, and next the like equation, where p and q are any numbers whatever, it can be shown that there exists for x a numerical value which satisfies the equation; or, in other words, it can be shown that the equation has a numerical root.
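A small numerical check of the last statement can be made with Python's built-in complex numbers. The sketch assumes the usual formula for the roots of a quadric equation; the function name `quadric_root` and the particular coefficients 1 + 2i and 3 − i are chosen only for illustration.

```python
import cmath


def quadric_root(p, q):
    """One root of x^2 + p*x + q = 0, with p and q numbers of the form a + bi."""
    p, q = complex(p), complex(q)
    return (-p + cmath.sqrt(p * p - 4 * q)) / 2


# The equation x^2 + 1 = 0: the root found plays the part of i.
i = quadric_root(0, 1)

# A quadric equation whose coefficients are themselves complex (assumed values).
p, q = 1 + 2j, 3 - 1j
x = quadric_root(p, q)
residual = abs(x * x + p * x + q)

print(i, i * i + 1)
print(x, residual)
```

The root returned is again a number of the form a + bi, so that the quadric equation with complex coefficients does not lead outside the numbers already admitted.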
The like theorem, in fact, holds good for an equation of any order whatever; but suppose for a moment that this was not the case; say that there was a cubic equation x³ + px² + qx + r = 0, with numerical coefficients, not satisfied by any numerical value of x, we should have to establish a new imaginary j satisfying some such equation, and should then have to consider numbers of the form a + bj, or perhaps a + bj + cj² (a, b, c numbers α + βi of the kind heretofore considered); first we should be thrown back on the quadric equation x² + px + q = 0, p and q being now numbers of the last-mentioned extended form (non constat that every such equation has a numerical root), and if not, we might be led to other imaginaries k, l, &c., and so on ad infinitum in inextricable confusion.
But in fact a numerical equation of any order whatever has always a numerical root, and thus numbers (in the foregoing sense, number = quantity of the form α + βi) form (what real numbers do not) a universe complete in itself, such that starting in it we are never led out of it. There may very well be, and perhaps are, numbers in a more general sense of the term (quaternions are not a case in point, as the ordinary laws of combination are not adhered to), but in order to have to do with such numbers (if any) we must start with them.
8. The capital theorem as regards numerical equations thus is, every numerical equation has a numerical root; or for shortness (the meaning being as before), every equation has a root. Of course the theorem is the reverse of self-evident, and it requires proof; but provisionally assuming it as true, we derive from it the general theory of numerical equations. As the term root was introduced in the course of an explanation, it will be convenient to give here the formal definition.
A number a such that substituted for x it makes the function xⁿ − p₁xⁿ⁻¹ ... ± pₙ to be = 0, or say such that it satisfies the equation ƒ(x) = 0, is said to be a root of the equation; that is, a being a root, we have
aⁿ − p₁aⁿ⁻¹ ... ± pₙ = 0, or say ƒ(a) = 0;
and it is then easily shown that x − a is a factor of the function ƒ(x), viz. that we have ƒ(x) = (x − a)ƒ₁(x), where ƒ₁(x) is a function xⁿ⁻¹ − q₁xⁿ⁻² ... ± qₙ₋₁ of the order n − 1, with numerical coefficients q₁, q₂ ... qₙ₋₁.
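The division of ƒ(x) by x − a may be carried out by the familiar scheme of synthetic division, sketched below. The coefficient list (highest power first, signs written out explicitly rather than in the ± convention of the text), the example cubic x³ − 6x² + 11x − 6, and the name `divide_by_linear` are assumptions of this illustration.

```python
def divide_by_linear(coeffs, a):
    """Divide a polynomial by (x - a).

    coeffs lists the coefficients from the highest power downwards.
    Returns (quotient coefficients, remainder); the remainder equals f(a).
    """
    carried = []
    acc = 0
    for c in coeffs:
        acc = acc * a + c
        carried.append(acc)
    return carried[:-1], carried[-1]


# Assumed example: f(x) = x^3 - 6x^2 + 11x - 6, which has the root a = 1.
f_coeffs = [1, -6, 11, -6]
quotient, remainder = divide_by_linear(f_coeffs, 1)
print(quotient, remainder)

# A value that is not a root leaves a remainder equal to f(a).
_, remainder_at_4 = divide_by_linear(f_coeffs, 4)
print(remainder_at_4)
```

When a is a root the remainder vanishes, and the quotient gives the coefficients of the function of order n − 1 that remains after the factor x − a is removed.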
In general a is not a root of the equation ƒ₁(x) = 0, but it may be so, i.e. ƒ₁(x) may contain the factor x − a; when this is so, ƒ(x) will contain the factor (x − a)²; writing then ƒ(x) = (x − a)²ƒ₂(x), and assuming that a is not a root of the equation ƒ₂(x) = 0, x = a is then said to