Page:Popular Science Monthly Volume 65.djvu/378

THE POPULAR SCIENCE MONTHLY.

when he is so very, very far from ever having made one. But serious he seems to have been and perhaps never more so than when he declares that the 'average word length alone . . . would, in general, be indicative of the nature of the curve.' This is equivalent to saying that the form of a curve is known when its mean ordinate is known, and is a statement which, to those who are accustomed to the graphic representation of variables, will betray an almost immeasurable unfamiliarity with problems of this kind. Among other evidences of this state of mind which might be cited, the construction of a 'typical word-curve of extreme light dialogue', from a count of 5,000 words from Swift's 'Polite Conversation', is not the least convincing. To produce this, Swift's curve is 'corrected' by the suppression of certain words of seven or eight letters, for no assigned or imaginable reason, except that perhaps Dr. Moritz thinks that Swift ought to have known better than to have used them. The curve of this expurgated edition of 5,000 words from Swift is interesting in form, but if it be the 'typical word-curve of extreme light dialogue' in the English language, as declared by Dr. Moritz, those who have dabbled, even a very little, in word-counting of modern comedy and humorous story writers will be saddened by the thought that the art of composing 'extreme light dialogue' must have long ago become extinct.
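
The objection that average word length cannot fix the form of the curve may be checked with a small numerical sketch. The following is an illustration only, not the procedure used in the papers under discussion: it assumes that a word is any unbroken run of letters, that the curve is expressed as words of each length per thousand words, and it uses two invented six-word samples (`sample_a`, `sample_b`). The function names are likewise hypothetical.

```python
import re
from collections import Counter


def word_length_curve(text, per=1000):
    """Return {word length: number of words of that length per `per` words}.

    Assumption: a word is a maximal run of ASCII letters, so "don't"
    counts as two words ("don" and "t"). A real count would need a rule
    for apostrophes, hyphens and numerals.
    """
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(len(w) for w in words)
    total = len(words)
    return {n: per * c / total for n, c in sorted(counts.items())}


def mean_word_length(curve):
    """Average word length recovered from a word-length curve."""
    total = sum(curve.values())
    return sum(n * f for n, f in curve.items()) / total


# Invented samples: every word of sample_a has three letters, while
# sample_b alternates words of one and five letters.
sample_a = "the cat and the dog ran"
sample_b = "a quiet I heard a storm"

curve_a = word_length_curve(sample_a)
curve_b = word_length_curve(sample_b)

print(curve_a, mean_word_length(curve_a))
print(curve_b, mean_word_length(curve_b))
```

Both samples have an average word length of exactly three letters, yet one curve has a single peak at three letters and the other has two peaks, at one and at five. The average is thus one number derived from the curve, and many different curves yield the same average.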

It seems impossible to avoid the conclusion that Dr. Moritz, perhaps as a result of a somewhat hasty examination of the subject, has failed to grasp in its entirety the fundamental principle on which the whole doctrine (if so dignified a term may be used) of 'characteristic curves of composition' is based, and a brief exposition of its most important propositions may not be out of place.

The notion that every author, however voluminous, must necessarily be restricted in his use of words to a vocabulary which would remain sensibly constant after his productive period had been reached, and which, in its character and extent, would be one of the personal 'qualities' of that author and thus offer a means of identification, is due, as is stated in the paper of twenty years ago, to Augustus De Morgan, who suggested that vocabularies might differ so much among different authors as to make it possible to differentiate them by means of the simple average number of letters in a word. In making some tests of this proposition it immediately became evident, as might have been anticipated, that vocabularies might differ indefinitely and enormously and at the same time agree in average word-length. The scheme for the graphic display of variations in the average frequency of occurrence of words of different lengths, as explained in the papers under discussion, was then devised and proved to be a vastly more powerful means of revealing peculiarities in composition. As to the value of this method of treatment, which is the one original feature of the whole, there seems to be no question, as even my critic has paid me the highest compliment in his power in making continued and apparently confiding use of it. The point at issue is, rather: Was De Morgan right in assuming that the personal element enters into the vocabulary of any author to such an extent as to furnish a means of identifying his writing?
He evidently believes that it played so large a part as to determine the average length of words used; the theory of 'characteristic curves' implies that personality may determine the way in which words are used rather than their average length, and it furnishes a method for revealing peculiarities, such as persist in the long run (this is the kernel of the thing) in the relative frequencies of words of different numbers of letters, syllables, etc., of sentences of different lengths or of any 'qualities' that may be treated numerically. Because of simplicity, ease