differential to zero (the second differential being always negative).
The resulting equation is of the form
$$ y - rx - T - \alpha x^2 - 2\beta xy - \gamma y^2 = 0, $$
where T, α, β, γ are all small, linear functions of the k’s. As y is nearly equal to rx, it is legitimate to substitute rx for y when y is multiplied by a small coefficient. The curve of regression thus reduces to a parabola with equation of the form
$$ y - T = rx - qx^2; $$
where q is a linear function of the third mean powers and moments of the given group.
163. Dissection of certain Heterogeneous Groups.—Under the head of law of error may be placed the case in which statistics relating to two (or more) different types, each separately conforming to the normal law, are mixed together; for instance, the measurements of human heights in a country comprising two distinct races.
In this case the quaesita are the constants in a curve of the form:
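$$ y = \frac{\alpha}{\sqrt{\pi}\,c_1}\, e^{-(x-a)^2/c_1^2} + \frac{\beta}{\sqrt{\pi}\,c_2}\, e^{-(x-b)^2/c_2^2}, $$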
where α and β are the proportionate sizes of the two groups (α + β = 1); a and b are the respective centres of gravity; and $c_1$, $c_2$ the respective moduli. The data are measurements each of which relates to one or other of these component curves. A splendid solution of this difficult problem has been given by Professor Pearson. The five unknown quantities are connected by him with the centre of gravity of the given observations, and the mean second, third, fourth and fifth powers of their deviations from that centre of gravity, by certain rational algebraic equations, which reduce to an equation in one variable of the ninth dimension. In an example worked by Professor Pearson this fundamental equation had three possible roots, two of which gave very fair solutions of the problem, while the third suggested that there might be a negative solution, importing that the given system would be obtained by subtracting one of the normal groups from the other; but the coefficients for the negative solution proved to be imaginary. “In the case of crabs' foreheads, therefore, we cannot represent the frequency curve for their forehead length as the difference of two normal curves.” In another case, which primâ facie seemed normal, Professor Pearson found that “all nine roots of the fundamental nonic lead to imaginary solutions of the problem. The best and most accurate representation is the normal curve.”
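The data that enter Professor Pearson's equations, namely the centre of gravity and the mean second to fifth powers of the deviations from it, may be computed as in the following minimal sketch. The function name and the sample figures are hypothetical, chosen only for illustration; the reduction to the equation of the ninth dimension is not attempted here.

```python
def pearson_input_moments(observations):
    """Return the centre of gravity and the mean 2nd..5th powers of deviation.

    Hypothetical helper: it prepares only the five data quantities named in
    the text, not the solution of the nonic equation itself.
    """
    n = len(observations)
    centre = sum(observations) / n
    deviations = [x - centre for x in observations]
    # Mean p-th power of the deviations from the centre of gravity.
    powers = {p: sum(d ** p for d in deviations) / n for p in (2, 3, 4, 5)}
    return centre, powers


# Illustrative figures only (not taken from any real measurements).
centre, powers = pearson_input_moments([1, 2, 3, 4, 10])
```

Five quantities are needed because the mixture has five unknown constants: one proportion, two centres and two moduli.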
164. This laborious method of separation seems best suited to cases in which it is known beforehand that the statistics are a mixture of two normal groups, or at least this is strongly suggested by the two-headed character of the given group. Otherwise the less troublesome generalized law of error may be preferable, as it is appropriate both to the mixture of two (not very widely different) normal groups and to the other cases of composition. Even when a group of statistics can be broken up into two or three frequency curves of the normal (or not very abnormal) type, the group may yet be adequately represented by a single curve of the “generalized” type, provided that the heterogeneity is not very great, that is, not great enough to prevent the constants $k_1$, $k_2$, $k_3$, &c., from being small. Thus, suppose the given group to consist of two normal curves each having the same modulus c, and that the distance between the centres is considerable, so considerable as just to cause the central portion of the total group to become saddle-backed. This phenomenon sets in when the distance between the centre of gravity of the system and the centre of either component = $\sqrt{\tfrac{1}{2}}\,c$.[1] Even in this case $k_2$ is only −0.125; $k_4$ is 0.25 (the odd k’s are zero).
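The figures just quoted can be checked numerically. The sketch below assumes that $k_2$ and $k_4$ are the fourth and sixth cumulants of the whole group divided by the fourth and sixth powers of its modulus C, where C² is twice the mean square of deviation; this normalization is an assumption, adopted here because it reproduces the values stated in the text. The function names are hypothetical.

```python
from math import comb, exp, pi, sqrt


def double_factorial(n):
    """n!! for odd n >= -1, with (-1)!! taken as 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def normal_raw_moment(order, centre, sd):
    """Mean of x**order for a normal curve with the given centre and s.d."""
    return sum(
        comb(order, j) * centre ** (order - j) * sd ** j * double_factorial(j - 1)
        for j in range(0, order + 1, 2)
    )


def edgeworth_k2_k4(d, c):
    """k2 and k4 for two equal normal groups at -d and +d, each of modulus c.

    Assumed normalization: k2 = kappa4 / C**4 and k4 = kappa6 / C**6,
    with C**2 = 2 * (mean square of deviation of the whole group).
    """
    sd = c / sqrt(2.0)  # modulus = sqrt(2) * standard deviation
    m2, m4, m6 = (
        0.5 * (normal_raw_moment(n, -d, sd) + normal_raw_moment(n, d, sd))
        for n in (2, 4, 6)
    )
    kappa4 = m4 - 3.0 * m2 ** 2
    kappa6 = m6 - 15.0 * m4 * m2 + 30.0 * m2 ** 3  # valid when odd moments vanish
    big_c_squared = 2.0 * m2
    return kappa4 / big_c_squared ** 2, kappa6 / big_c_squared ** 3


def curvature_at_centre(d, c):
    """Second derivative of the compound curve at the centre of gravity."""
    sd = c / sqrt(2.0)
    height = exp(-d * d / (2.0 * sd * sd)) / (sd * sqrt(2.0 * pi))
    return height * (d * d / (sd * sd) - 1.0) / (sd * sd)


k2, k4 = edgeworth_k2_k4(sqrt(0.5), 1.0)
```

The curvature at the centre changes sign exactly where d equals the standard deviation of either component, that is, at $\sqrt{1/2}\,c$; beyond that distance the summit becomes saddle-backed.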
Section II.—Laws of Frequency.
165. The “Generalized Probability Curve.” A formula much more comprehensive than the corrected normal law is proposed by Professor Pearson under the designation of the “generalized probability-curve.” The ground and scope of the new law cannot be better stated than in the words of the author: “The slope of the normal curve is given by a relation of the form
$$ \frac{1}{y}\frac{dy}{dx} = -\frac{x}{c_1}. $$
The slope of the curve correlated to the skew binomial, as the normal curve to the symmetrical binomial, is given by a relation of the form
$$ \frac{1}{y}\frac{dy}{dx} = -\frac{x}{c_1 + c_2 x}. $$
Finally, the slope of the curve correlated to the hypergeometrical series (which expresses a probability distribution in which the contributory causes are not independent, and not equally likely to give equal deviations in excess and defect), as the above curves to their respective binomials, is given by a relation of the form
$$ \frac{1}{y}\frac{dy}{dx} = -\frac{x}{c_1 + c_2 x + c_3 x^2}. $$
This latter curve comprises the two others as special cases, and, so far as my investigations have yet gone, practically covers all homogeneous statistics that I have had to deal with. Something still more general may be conceivable, but I have found no necessity for it.”[2] The “hypergeometrical series,” it should be explained, had appeared as representative of the distribution of black balls,[3] in the following case. “Take n balls in a bag, of which pn are black and qn are white, and let r balls be drawn and the number of black be recorded. If r > pn, the range of black balls will lie between 0 and pn; the resulting frequency-polygon is given by a hypergeometrical series.”
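The bag of balls described above may be illustrated by a short sketch that counts the equally likely drawings outright. The function name and the numbers (3 black, 7 white, 5 drawn) are hypothetical, chosen so that r > pn as in the quotation.

```python
from fractions import Fraction
from math import comb


def black_ball_frequencies(black, white, drawn):
    """Exact chance of each possible number of black balls among those drawn.

    Balls are drawn without replacement, so the successive draws are not
    independent; the terms are those of a hypergeometrical series.
    """
    total = comb(black + white, drawn)
    lowest = max(0, drawn - white)   # cannot draw more white than exist
    highest = min(drawn, black)      # cannot draw more black than exist
    return {
        s: Fraction(comb(black, s) * comb(white, drawn - s), total)
        for s in range(lowest, highest + 1)
    }


# Illustrative case: n = 10, pn = 3 black, qn = 7 white, r = 5 drawn.
frequencies = black_ball_frequencies(3, 7, 5)
```

With these numbers the count of black balls ranges from 0 to pn = 3, and the frequencies sum to unity.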
Further reasons in favour of his construction are given by Professor Pearson in a later paper.[4] “The immense majority, if not the totality, of frequency distributions in homogeneous material show, when the frequency is indefinitely increased, a tendency to give a smooth curve characterized by the following properties. (i.) The frequency starts from zero, increases slowly or rapidly to a maximum and then falls again to zero (probably at a quite different rate) as the character for which the frequency is measured is steadily increased. This is the almost universal unimodal distribution of the frequency of homogeneous series . . . (ii.) In the next place there is generally contact of the frequency-curve at the extremities of the range. These characteristics at once suggest the following form of frequency curve, if yδx measure the frequency falling between x and x + δx:
$$ \frac{dy}{dx} = \frac{y\,(x + a)}{F(x)}. $$
Now let us assume that F(x) can be expanded by Maclaurin’s theorem. Then our differential equation to the frequency will be
$$ \frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2 + \cdots}. $$
Experience shows that the form (x) [“keeping $b_0$, $b_1$, $b_2$ only”] suffices for certainly the great bulk of frequency distributions.”[5]
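How the differential equation determines the curve may be seen by integrating it numerically. The sketch below keeps $b_0$, $b_1$, $b_2$ only and applies Simpson's rule to the logarithmic slope; the function names, the step count and the choice of Simpson's rule are assumptions made for illustration, and the denominator is assumed not to vanish within the range of integration.

```python
from math import exp, log


def log_frequency_ratio(x, a, b0, b1=0.0, b2=0.0, steps=2000):
    """log(y(x) / y(0)) for (1/y) dy/dx = (x + a) / (b0 + b1*x + b2*x**2).

    Simpson's rule with an even number of steps; hypothetical helper.
    """
    def slope(t):
        return (t + a) / (b0 + b1 * t + b2 * t * t)

    h = x / steps
    total = slope(0.0) + slope(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * slope(i * h)
    return total * h / 3.0


def relative_frequency(x, a, b0, b1=0.0, b2=0.0):
    """Ordinate at x relative to the ordinate at the origin."""
    return exp(log_frequency_ratio(x, a, b0, b1, b2))


def skew_closed_form(x, c1, c2):
    """Closed form of log(y(x)/y(0)) when (1/y) dy/dx = -x / (c1 + c2*x)."""
    return -x / c2 + (c1 / c2 ** 2) * log(1.0 + c2 * x / c1)


# Normal case: a = 0, b0 = -c1, b1 = b2 = 0 gives y = y0 * exp(-x**2 / (2*c1)).
normal_value = log_frequency_ratio(1.5, 0.0, -1.0)
# Skew case: a = 0, b0 = -c1, b1 = -c2, compared with the closed form.
skew_value = log_frequency_ratio(1.0, 0.0, -1.0, -0.5)
```

Setting a = 0, $b_0 = -c_1$, $b_1 = -c_2$, $b_2 = -c_3$ recovers the three slope relations of the earlier quotation as special cases.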
166. The “generalized probability-curve” presents two main forms[6]—
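$$ y = y_0 \left(1 + \frac{x}{a_1}\right)^{\nu a_1} \left(1 - \frac{x}{a_2}\right)^{\nu a_2}, $$
and
$$ y = y_0\, \frac{1}{\left(1 + x^2/a^2\right)^{m}}\, e^{-\nu \tan^{-1}(x/a)}. $$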
When $a_1$, $a_2$, ν are all finite and positive, the first form represents, in general, a skew curve, with limited range in both directions; in the particular case, when $a_1 = a_2$, a symmetrical curve, with range limited in both directions. If $a_2 = \infty$, the curve reduces to
$$ y = y_0 \left(1 + \frac{x}{a_1}\right)^{\nu a_1} e^{-\nu x}; $$
representing an asymmetrical binomial with $\nu = 2\mu_2/\mu_3$ and $a_1 = 2\mu_2^2/\mu_3 - \tfrac{1}{2}\mu_3/\mu_2$, $\mu_2$ and $\mu_3$ being respectively the mean second and mean third power of deviation measured from the centre of gravity. In the particular case when $\mu_3$ is small, this form reduces to what is above called the “quasi-normal” curve; and when $\mu_3$ is zero, $a_1$ becoming infinite, to the simple normal curve. The pregnant general form yields two less familiar shapes apt to represent curves of the character shown in figs. 14 and 15: the one occurring in a good number of instances, such as infant deaths, the values of houses, the number of petals in certain flowers; the other less familiarly illustrated by Consumptivity and Cloudiness.[7] The second solution represents a skew curve with unlimited range in both directions.[8] Professor Pearson has successfully applied these formulae to a number of beautiful specimens culled in the most diverse fields of statistics. The flexibility with which the generalized probability-curve adapts itself to every variety of existing groups no doubt gives it a great advantage over the normal curve, even in its extended form. It is only in respect of a priori evidence that the latter can claim precedence.[9]
[Fig. 14.] [Fig. 15.]
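The constants of the curve obtained when $a_2 = \infty$ may be found from the mean second and third powers of deviation. The sketch below takes the relations as $\nu = 2\mu_2/\mu_3$ and $a_1 = 2\mu_2^2/\mu_3 - \mu_3/(2\mu_2)$; the coefficient ½ in the last term is stated here as an assumption, being the value that the moments of a curve of the form $(1 + x/a_1)^{\nu a_1} e^{-\nu x}$ require. The function names and figures are hypothetical, and $\mu_3$ is assumed positive.

```python
def type_three_constants(mu2, mu3):
    """Return (nu, a1) from the mean second and third powers of deviation."""
    nu = 2.0 * mu2 / mu3
    a1 = 2.0 * mu2 ** 2 / mu3 - 0.5 * mu3 / mu2
    return nu, a1


def moments_from_constants(nu, a1):
    """Recover (mu2, mu3) from (nu, a1) as a consistency check.

    Measured from the start of the range, the curve is proportional to
    t**(nu*a1) * exp(-nu*t), whose mean second and third powers of
    deviation are (nu*a1 + 1) / nu**2 and 2 * (nu*a1 + 1) / nu**3.
    """
    index = nu * a1 + 1.0
    return index / nu ** 2, 2.0 * index / nu ** 3


# Illustrative figures only.
nu, a1 = type_three_constants(2.0, 1.0)
mu2, mu3 = moments_from_constants(nu, a1)
```

As $\mu_3$ diminishes both ν and $a_1$ increase without limit, which agrees with the statement that the curve then passes into the normal form.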
167. Skew Correlation.—Professor Pearson has extended his
- [1] Cf. Journ. Stat. Soc. (1899), lxii. 131. A similar substitution of the generalized law of error may be recommended in preference to the method of translating a normal law of error (putting x = f(ξ), where ξ obeys the normal law of error) suggested by the present writer (Journ. Stat. Soc., 1898), and independently by Professor J. C. Kapteyn (Skew Frequency Curves, 1903).
- [2] Trans. Roy. Soc. (1895), A, p. 381.
- [3] Ibid. p. 360.
- [4] “Mathematical Contributions to the Theory of Evolution” (Drapers' Company Research Memoirs, Biometric Series II.), xiv. 4.
- [5] p. 7, loc. cit.
- [6] Ibid. p. 367.
- [7] Pearson, loc. cit., p. 364, and Proc. Roy. Soc.
- [8] A lucid exposition of Professor Pearson’s various methods is given by W. Palin Elderton in Frequency-curves and Correlation (1906).
- [9] Journ. Stat. Soc. (1895), p. 506.