This is a modern-English version of The Logic of Chance, 3rd edition: An Essay on the Foundations and Province of the Theory of Probability, With Especial Reference to Its Logical Bearings and Its Application to Moral and Social Science and to Statistics, originally written by John Venn.
It has been thoroughly updated, with changes to sentence structure, wording, spelling,
and grammar, to ensure clarity for contemporary readers while preserving the original spirit and nuance. If
you click on a paragraph, you will see the original text that we modified, and you can toggle between the two versions.
Scroll to the bottom of this page and you will find a free ePUB download link for this book.
THE
LOGIC OF CHANCE
AN ESSAY
ON THE FOUNDATIONS AND PROVINCE OF
THE THEORY OF PROBABILITY,
WITH ESPECIAL REFERENCE TO ITS LOGICAL BEARINGS
AND ITS APPLICATION TO
MORAL AND SOCIAL SCIENCE AND TO STATISTICS,
AN ESSAY
ON THE FOUNDATIONS AND SCOPE OF
THE THEORY OF PROBABILITY,
WITH SPECIAL REFERENCE TO ITS LOGICAL IMPLICATIONS
AND ITS APPLICATION TO
MORAL AND SOCIAL SCIENCE AND TO STATISTICS,
BY
JOHN VENN, Sc.D., F.R.S.,
BY
JOHN VENN, Sc.D., F.R.S.,
FELLOW AND LECTURER IN THE MORAL SCIENCES, GONVILLE AND CAIUS COLLEGE,
CAMBRIDGE.
LATE EXAMINER IN LOGIC AND MORAL PHILOSOPHY IN THE
UNIVERSITY OF LONDON.
FELLOW AND LECTURER IN THE MORAL SCIENCES, GONVILLE AND CAIUS COLLEGE,
CAMBRIDGE.
FORMER EXAMINER IN LOGIC AND MORAL PHILOSOPHY AT THE
UNIVERSITY OF LONDON.
“So careful of the type she seems
So careless of the single life.”
“She seems so careful of the type,
So careless of the single life.”
THIRD EDITION, RE-WRITTEN AND ENLARGED.
Third Edition, Revised and Expanded.
London:
MACMILLAN AND CO.
AND NEW YORK
1888
[All Rights reserved.]
All rights reserved.
First Edition printed 1866.
Second Edition 1876.
Third Edition 1888.
PREFACE TO FIRST EDITION.
Any work on Probability by a Cambridge man will be so likely to have its scope and its general treatment of the subject prejudged, that it may be well to state at the outset that the following Essay is in no sense mathematical. Not only, to quote a common but often delusive assurance, will ‘no knowledge of mathematics beyond the simple rules of Arithmetic’ be required to understand these pages, but it is not intended that any such knowledge should be acquired by the process of reading them. Of the two or three occasions on which algebraical formulæ occur they will not be found to form any essential part of the text.
Any work on Probability by a Cambridge person is likely to have its focus and overall approach to the topic predetermined, so it's important to clarify from the start that the following Essay is not mathematical in nature. Not only will ‘no knowledge of mathematics beyond basic Arithmetic’ be needed to understand these pages, but there’s no intention for readers to gain any such knowledge by reading them. On the few occasions when algebraic formulas appear, they won't be considered essential to the text.
The science of Probability occupies at present a somewhat anomalous position. It is impossible, I think, not to observe in it some of the marks and consequent disadvantages of a sectional study. By a small body of ardent students it has been cultivated with great assiduity, and the results they have obtained will always be reckoned among the most extraordinary products of mathematical genius. But by the general body of thinking men its principles seem to be regarded with indifference or suspicion. Such persons may admire the ingenuity displayed, and be struck with the profundity of many of the calculations, but there seems to them, if I may so express it, an unreality about the whole treatment of the subject. To many persons the mention of Probability suggests little else than the notion of a set of rules, very ingenious and profound rules no doubt, with which mathematicians amuse themselves by setting and solving puzzles.
The science of Probability currently holds a somewhat unusual position. It’s hard, I think, not to notice some of the characteristics and resulting drawbacks of a sectional study. A small group of dedicated students has worked on it with great diligence, and the results they have achieved will always be considered among the most remarkable outcomes of mathematical genius. However, the broader community of thinkers seems to view its principles with indifference or skepticism. These individuals might appreciate the cleverness involved and be impressed by the depth of many calculations, but there seems to be, as I might put it, an unreality surrounding the entire treatment of the subject. For many, the idea of Probability brings to mind little more than a set of rules—very clever and profound rules, no doubt—that mathematicians use to entertain themselves by creating and solving puzzles.
It must be admitted that some ground has been given for such an opinion. The examples commonly selected by writers on the subject, though very well adapted to illustrate its rules, are for the most part of a special and peculiar character, such as those relating to dice and cards. When they have searched for illustrations drawn from the practical business of life, they have very generally, but unfortunately, hit upon just the sort of instances which, as I shall endeavour to show hereafter, are among the very worst that could be chosen for the purpose. It is scarcely possible for any unprejudiced person to read what has been written about the credibility of witnesses by eminent writers, without his experiencing an invincible distrust of the principles which they adopt. To say that the rules of evidence sometimes given by such writers are broken in practice, would scarcely be correct; for the rules are of such a kind as generally to defy any attempt to appeal to them in practice.
It must be acknowledged that some ground has been given for this opinion. The examples often chosen by writers on the topic, while well-suited to illustrate its rules, are mostly quite specific and unique, like those related to dice and cards. When they look for examples from real life, they typically, but unfortunately, select instances that, as I will show later, are actually among the worst possible choices for that purpose. It’s almost impossible for any open-minded person to read what prominent authors have said about the credibility of witnesses without feeling a strong distrust of the principles they use. To say that the rules of evidence provided by these writers are sometimes disregarded in practice would hardly be accurate; the rules are generally of a nature that makes it nearly impossible to apply them in real life.
This supposed want of harmony between Probability and other branches of Philosophy is perfectly erroneous. It arises from the belief that Probability is a branch of mathematics trying to intrude itself on to ground which does not altogether belong to it. I shall endeavour to show that this belief is unfounded. To answer correctly the sort of questions to which the science introduces us does generally demand some knowledge of mathematics, often a great knowledge, but the discussion of the fundamental principles on which the rules are based does not necessarily require any such qualification. Questions might arise in other sciences, in Geology, for example, which could only be answered by the aid of arithmetical calculations. In such a case any one would admit that the arithmetic was extraneous and accidental. However many questions of this kind there might be here, those persons who do not care to work out special results for themselves might still have an accurate knowledge of the principles of the science, and even considerable acquaintance with the details of it. The same holds true in Probability; its connection with mathematics, though certainly far closer than that of most other sciences, is still of much the same kind. It is principally when we wish to work out results for ourselves that mathematical knowledge is required; without such knowledge the student may still have a firm grasp of the principles and even see his way to many of the derivative results.
The supposed lack of harmony between Probability and other branches of Philosophy is completely wrong. It comes from the belief that Probability is a part of mathematics trying to intrude into an area that doesn't fully belong to it. I will try to show that this belief is not true. Answering the types of questions that this science brings up usually requires some knowledge of mathematics, often a lot, but discussing the fundamental principles that the rules are based on doesn’t necessarily need any such qualification. Questions could come up in other sciences, like Geology, that could only be answered with arithmetic calculations. In that case, everyone would agree that the arithmetic was external and incidental. No matter how many such questions there might be, those who don’t want to work out specific results for themselves could still have a solid understanding of the principles of the science and even a good grasp of its details. The same is true for Probability; its link to mathematics, although definitely closer than that of most other sciences, is still quite similar. It’s mainly when we want to calculate results ourselves that we need mathematical knowledge; without that knowledge, the student can still have a strong understanding of the principles and even see the path to many of the derived results.
The opinion that Probability, instead of being a branch of the general science of evidence which happens to make much use of mathematics, is a portion of mathematics, erroneous as it is, has yet been very disadvantageous to the science in several ways. Students of Philosophy in general have thence conceived a prejudice against Probability, which has for the most part deterred them from examining it. As soon as a subject comes to be considered ‘mathematical’ its claims seem generally, by the mass of readers, to be either on the one hand scouted or at least courteously rejected, or on the other to be blindly accepted with all their assumed consequences. Of impartial and liberal criticism it obtains little or nothing.
The belief that Probability, rather than being a part of the general science of evidence that heavily relies on mathematics, is merely a section of mathematics—although incorrect—has been quite harmful to the field in several ways. Philosophy students, in particular, have developed a bias against Probability, which has largely kept them from exploring it. Once a topic is labeled ‘mathematical,’ it tends to be either dismissed or politely disregarded by most readers, or, conversely, it is accepted without question along with all its presumed implications. It receives very little objective and open-minded criticism.
The consequences of this state of things have been, I think, disastrous to the students themselves of Probability. No science can safely be abandoned entirely to its own devotees. Its details of course can only be studied by those who make it their special occupation, but its general principles are sure to be cramped if it is not exposed occasionally to the free criticism of those whose main culture has been of a more general character. Probability has been very much abandoned to mathematicians, who as mathematicians have generally been unwilling to treat it thoroughly. They have worked out its results, it is true, with wonderful acuteness, and the greatest ingenuity has been shown in solving various problems that arose, and deducing subordinate rules. And this was all that they could in fairness be expected to do. Any subject which has been discussed by such men as Laplace and Poisson, and on which they have exhausted all their powers of analysis, could not fail to be profoundly treated, so far as it fell within their province. But from this province the real principles of the science have generally been excluded, or so meagrely discussed that they had better have been omitted altogether. Treating the subject as mathematicians such writers have naturally taken it up at the point where their mathematics would best come into play, and that of course has not been at the foundations. In the works of most writers upon the subject we should search in vain for anything like a critical discussion of the fundamental principles upon which its rules rest, the class of enquiries to which it is most properly applicable, or the relation it bears to Logic and the general rules of inductive evidence.
The consequences of this situation have, I believe, been disastrous for the students of Probability. No science should be completely left to its dedicated specialists. Of course, its details can only be learned by those who focus on it as their main occupation, but its core principles will be limited if they aren’t occasionally exposed to the open critique of those whose education has been broader. Probability has largely been left to mathematicians, who, as mathematicians, have generally been reluctant to explore it fully. They have indeed produced its results with remarkable insight, and great creativity has been shown in solving various problems that came up and developing related rules. This is all that could reasonably be expected of them. Any subject discussed by great thinkers like Laplace and Poisson, where they have exerted all their analytical skills, is bound to be treated profoundly, as far as it falls within their expertise. However, the fundamental principles of the science have generally been left out of this, or discussed so briefly that it would have been better to leave them out entirely. Treating the subject as mathematicians, these writers naturally pick it up at the points where their mathematics can be most effectively applied, and that has not been at the foundational level. In the works of most authors on this topic, we would search in vain for anything resembling a critical discussion of the fundamental principles that underlie its rules, the types of inquiries to which it is most appropriately relevant, or its relationship to Logic and the general principles of inductive evidence.
This want of precision as to ultimate principles is perfectly compatible here, as it is in the departments of Morals and Politics, with a general agreement on processes and results. But it is, to say the least, unphilosophical, and denotes a state of things in which positive error is always liable to arise whenever the process of controversy forces us to appeal to the foundations of the science.
This lack of clarity regarding core principles fits well here, just as it does in Morals and Politics, alongside a general consensus on methods and outcomes. However, it is, at the very least, unphilosophical and indicates a situation where actual mistakes can easily occur whenever the debate pushes us to look at the foundations of the science.
With regard to the remarks in the last few paragraphs, prominent exceptions must be made in the case of two recent works at least.[1] The first of these is Professor de Morgan's Formal Logic. He has there given an investigation into the foundations of Probability as conceived by him, and nothing can be more complete and precise than his statement of principles, and his deductions from them. If I could at all agree with these principles there would have been no necessity for the following essay, as I could not hope to add anything to their foundation, and should be far indeed from rivalling his lucid statement of them. But in his scheme Probability is regarded very much from the Conceptualist point of view; as stated in the preface, he considers that Probability is concerned with formal inferences in which the premises are entertained with a conviction short of absolute certainty. With this view I cannot agree. As I have entered into criticism of some points of his scheme in one of the following chapters, and shall have occasion frequently to refer to his work, I need say no more about it here. The other work to which I refer is the profound Laws of Thought of the late Professor Boole, to which somewhat similar remarks may in part be applied. Owing however to his peculiar treatment of the subject, I have scarcely anywhere come into contact with any of his expressed opinions.
With regard to the comments in the last few paragraphs, there are notable exceptions to consider regarding at least two recent works.[1] The first is Professor de Morgan's Formal Logic. In this book, he provides an in-depth examination of the foundations of Probability as he sees it, and his explanation of the principles and their implications is incredibly thorough and clear. If I could agree with these principles, there would be no need for this essay, as I wouldn't be able to contribute anything further to their foundation and would certainly fall short of matching his clear explanation of them. However, in his approach, Probability is viewed largely from a Conceptualist perspective; as he states in the preface, he believes that Probability deals with formal inferences in which the premises are accepted with a degree of conviction that falls short of absolute certainty. I do not share this viewpoint. Since I have critiqued some aspects of his framework in one of the following chapters and will frequently reference his work, I won't elaborate further here. The other work I'm referring to is the insightful Laws of Thought by the late Professor Boole, to which similar comments can partly apply. However, because of his unique approach to the subject, I have rarely encountered any of his specific opinions.
The view of the province of Probability adopted in this Essay differs so radically from that of most other writers on the subject, and especially from that of those just referred to, that I have thought it better, as regards details, to avoid all criticism of the opinions of others, except where conflict was unavoidable. With regard to that radical difference itself Bacon's remark applies, behind which I must shelter myself from any charge of presumption.—“Quod ad universalem istam reprehensionem attinet, certissimum vere est rem reputanti, eam et magis probabilem esse et magis modestam, quam si facta fuisset ex parte.”
The perspective on the province of Probability presented in this Essay is so different from that of most other writers on the topic, especially those mentioned earlier, that I decided it’s best to avoid critiquing others' views in detail, unless it's absolutely necessary. Regarding that significant difference itself, I find Bacon's remark fitting, which I will use to shield myself from any accusations of overstepping.—“As for this general criticism, it is truly certain for anyone who reflects on it that it is both more probable and more modest than if it had been made in part.”
Almost the only writer who seems to me to have expressed a just view of the nature and foundation of the rules of Probability is Mr Mill, in his System of Logic.[2] His treatment of the subject is however very brief, and a considerable portion of the space which he has devoted to it is occupied by the discussion of one or two special examples. There are moreover some errors, as it seems to me, in what he has written, which will be referred to in some of the following chapters.
Almost the only writer who seems to have accurately captured the nature and basis of the rules of Probability is Mr. Mill, in his System of Logic.[2] His treatment of the topic is quite brief, and a significant part of what he covers focuses on one or two specific examples. Additionally, I believe there are some errors in his writing that will be addressed in the upcoming chapters.
The reference to the work just mentioned will serve to convey a general idea of the view of Probability adopted in this Essay. With what may be called the Material view of Logic as opposed to the Formal or Conceptualist,—with that which regards it as taking cognisance of laws of things and not of the laws of our own minds in thinking about things,—I am in entire accordance. Of the province of Logic, regarded from this point of view, and under its widest aspect, Probability may, in my opinion, be considered to be a portion. The principal objects of this Essay are to ascertain how great a portion it comprises, where we are to draw the boundary between it and the contiguous branches of the general science of evidence, what are the ultimate foundations upon which its rules rest, what the nature of the evidence they are capable of affording, and to what class of subjects they may most fitly be applied. That the science of Probability, on this view of it, contains something more important than the results of a system of mathematical assumptions, is obvious. I am convinced moreover that it can and ought to be rendered both interesting and intelligible to ordinary readers who have any taste for philosophy. In other words, if the large and growing body of readers who can find pleasure in the study of books like Mill's Logic and Whewell's Inductive Sciences, turn with aversion from a work on Probability, the cause in the latter case must lie either in the view of the subject or in the manner and style of the book.
The reference to the earlier work will help to convey a general idea of the perspective on Probability taken in this Essay. I completely agree with what could be called the Material view of Logic, which contrasts with the Formal or Conceptualist approach. This view sees Logic as concerned with the laws of things rather than the laws of our minds when thinking about those things. From this perspective, I believe Probability can be seen as a part of the broader scope of Logic. The main goals of this Essay are to determine how significant a part it plays, where to set the boundary between it and related areas of the general science of evidence, what the fundamental principles of its rules are, what kind of evidence they can provide, and which subjects they are best suited for. It is clear that the science of Probability, viewed this way, encompasses something more significant than merely the results stemming from a system of mathematical assumptions. I am also convinced that it can and should be made both engaging and understandable to regular readers who have an interest in philosophy. In other words, if the substantial and growing audience that enjoys reading works like Mill's Logic and Whewell's Inductive Sciences turns away from a book on Probability, the issue must lie in either the perspective on the subject or in the way the book is written.
I take this opportunity of thanking several friends, amongst whom I must especially mention Mr Todhunter, of St John's College, and Mr H. Sidgwick, of Trinity College, for the trouble they have kindly taken in looking over the proof-sheets, whilst this work was passing through the Press. To the former in particular my thanks are due for thus adding to the obligations which I, as an old pupil, already owed him, by taking an amount of trouble, in making suggestions and corrections for the benefit of another, which few would care to take for anything but a work of their own. His extensive knowledge of the subject, and his extremely accurate judgment, render the service he has thus afforded me of the greatest possible value.
I want to take this opportunity to thank several friends, especially Mr. Todhunter from St John's College and Mr. H. Sidgwick from Trinity College, for their efforts in reviewing the proof sheets while this work was going through the press. I am particularly grateful to Mr. Todhunter for adding to the debt of gratitude I already owe him as a former student, by putting in significant effort to provide suggestions and corrections for the benefit of someone else, which few would be willing to do for anything other than their own work. His extensive knowledge of the subject and his keen judgment make the help he’s given me incredibly valuable.
Gonville and Caius College,
September, 1866.
1 I am here speaking, of course, of those only who have expressly treated of the foundations of the science. Mr Todhunter's admirable work on the History of the Theory of Probability being, as the name denotes, mainly historical, such enquiries have not directly fallen within his province.
1 I’m talking here, of course, about those who have specifically discussed the foundations of the science. Mr. Todhunter’s excellent book on the History of the Theory of Probability is, as the title suggests, primarily historical, so these inquiries haven’t directly been his focus.
2 This remark, and that at the commencement of the last paragraph, having been misunderstood, I ought to say that the only sense in which originality is claimed for this Essay is in the thorough working out of the Material view of Logic as applied to Probability. I have given a pretty full discussion of the general principles of this view in the tenth chapter, and have there pointed out some of the peculiarities to which it leads.
2 This comment, along with the one at the beginning of the last paragraph, has been misunderstood, so I should clarify that the only way in which I claim originality for this Essay is in the detailed exploration of the Material view of Logic as it relates to Probability. I've provided a comprehensive discussion of the general principles of this view in the tenth chapter, where I've highlighted some of the unique aspects it brings to light.
PREFACE TO SECOND EDITION.
The principal reason for designating this volume a second edition consists in the fact that the greater portion of what may be termed the first edition is incorporated into it. Besides various omissions (principally where the former treatment has since seemed to me needlessly prolix), I have added new matter, not much inferior in amount to the whole of the original work. In addition, moreover, to these alterations in the matter, the general arrangement of the subject as regards the successive chapters has been completely changed; the former arrangement having been (as it now seems to me) justly objected to as deficient and awkward in method.
The main reason for calling this volume a second edition is that most of what could be considered the first edition is included in it. In addition to several omissions (mainly where I found the previous approach overly detailed), I have added new content that is nearly equivalent in size to the entire original work. Furthermore, not only have I made these changes to the content, but I have also completely overhauled the overall organization of the subject in the successive chapters; the previous structure now seems to me to have been rightly criticized as inadequate and clumsy in its method.
After saying this, it ought to be explained whether any change of general view or results will be found in the present treatment.
After saying this, it should be clarified if any changes in general perspective or outcomes will be noticeable in the current approach.
The general view of Probability adopted is quite unchanged, further reading and reflection having only confirmed me in the conviction that this is the soundest and most fruitful way of regarding the subject. It is the more necessary to say this, as to a cursory reader it might seem otherwise; owing to my having endeavoured to avoid the needlessly polemical tone which, as is often the case with those who are making their first essay in writing upon any subject, was doubtless too prominent in the former edition. I have not thought it necessary, of course, except in one or two cases, to indicate points of detail which it has seemed necessary to correct.
The general view of Probability I’ve taken remains largely the same; further reading and reflection have only strengthened my belief that this is the best and most productive way to approach the topic. It’s important to mention this because a casual reader might think otherwise, since I’ve tried to steer clear of the overly argumentative tone that often comes from those writing about a subject for the first time, which was probably too noticeable in the earlier edition. I haven’t felt the need to point out specific details that needed correction, except in a couple of cases.
A number of new discussions have been introduced upon topics which were but little or not at all treated before. The principal of these refer to the nature and physical origin of Laws of Error (Ch. II.); the general view of Logic, and consequently of Probability, termed the Material view, adopted here (Ch. X.); a brief history and criticism of the various opinions held on the subject of Modality (Ch. XII.); the logical principles underlying the method of Least Squares (Ch. XIII.); and the practices of Insurance and Gambling, so far as the principles involved in them are concerned (Ch. XV.). The Chapter on the Credibility of Extraordinary Stories is also mainly new; this was the portion of the former work which has since seemed to me the least satisfactory, but owing to the extreme intricacy of the subject I am far from feeling thoroughly satisfied with it even now.
A number of new discussions have been introduced on topics that were barely touched upon before. The main ones refer to the nature and physical origin of Laws of Error (Ch. II.); the overall perspective of Logic, and consequently Probability, called the Material view, adopted here (Ch. X.); a brief history and critique of the various opinions about Modality (Ch. XII.); the logical principles behind the method of Least Squares (Ch. XIII.); and the practices of Insurance and Gambling, regarding the principles involved in them (Ch. XV.). The chapter on the Credibility of Extraordinary Stories is also mostly new; this was the part of the previous work that I found least satisfactory, but due to the complexity of the subject, I still don’t feel completely satisfied with it now.
I have again to thank several friends for the assistance they have so kindly afforded. Amongst these I must prominently mention Mr C. J. Monro, late fellow of Trinity. It is only the truth to say that I have derived more assistance from his suggestions and criticisms than has been consciously obtained from all other external sources together. Much of this criticism has been given privately in letters, and notes on the proof-sheets; but one of the most elaborate of his discussions of the subject was communicated to the Cambridge Philosophical Society some years ago; as it was not published, however, I am unfortunately unable to refer the reader to it. I ought to add that he is not in any way committed to any of my opinions upon the subject, from some of which in fact he more or less dissents. I am also much indebted to Mr J. W. L. Glaisher, also of Trinity College, for many hints and references to various publications upon the subject of Least Squares, and for careful criticism (given in the midst of much other labour) of the chapter in which that subject is treated.
I want to thank several friends for the help they’ve generously provided. Among them, I have to especially mention Mr. C. J. Monro, former fellow of Trinity. It’s only fair to say that I’ve gotten more support from his suggestions and critiques than from all other sources combined. Much of this feedback has come in private letters and notes on the proof-sheets; however, one of his most detailed discussions on the topic was presented to the Cambridge Philosophical Society a few years ago. Unfortunately, since it wasn’t published, I can’t direct the reader to it. I should also note that he doesn’t necessarily agree with all my views on the subject, as he dissents from some of them to varying degrees. I’m also very grateful to Mr. J. W. L. Glaisher, also from Trinity College, for his many hints and references to different publications about Least Squares, and for his careful critique (despite being busy with other work) of the chapter covering that topic.
I need not add that, like every one else who has had to discuss the subject of Probability during the last ten years, I have made constant use of Mr Todhunter's History.
I don't need to mention that, like everyone else who has had to discuss Probability over the past decade, I've frequently relied on Mr. Todhunter's History.
I may take this opportunity of adding that a considerable portion of the tenth chapter has recently appeared in the January number of Mind, and that the substance of several chapters, especially in the more logical parts, has formed part of my ordinary lectures in Cambridge; the foundation and logical treatment of Probability being now expressly included in the Schedule of Subjects for the Moral Sciences Tripos.
I want to take this chance to point out that a significant part of the tenth chapter has recently been published in the January issue of Mind, and that the main ideas from several chapters, especially in the more logical sections, have been included in my standard lectures at Cambridge; the basics and logical analysis of Probability are now specifically listed in the Schedule of Subjects for the Moral Sciences Tripos.
March, 1876.
March 1876.
PREFACE TO THIRD EDITION.
The present edition has been revised throughout, and in fact rewritten. Three chapters are new, viz. the fifth (On the conception of Randomness) and the eighteenth and nineteenth (On the nature, and on the employment, of Averages). The eighth, tenth, eleventh, and fifteenth chapters have been recast, and much new matter added, and numerous alterations made in the remaining portions.[1] On the other hand three chapters of the last edition have been nearly or entirely omitted.
The current edition has been completely updated and essentially rewritten. Three chapters are new: the fifth (On the Conception of Randomness) and the eighteenth and nineteenth (On the Nature and On the Use of Averages). The eighth, tenth, eleventh, and fifteenth chapters have been restructured, with a lot of new content added, and numerous changes made in the remaining sections.[1] Conversely, three chapters from the previous edition have been mostly or entirely removed.
These alterations do not imply any appreciable change of view on my part as to the foundations and province of Probability. Some of them are of course due to the necessary changes involved in the attempt to write up to date upon a subject which has not been stationary during the last eleven years. For instance the greatly increased interest now taken in what may be called the Theory of Statistics has rendered it desirable to go much more fully into the Nature and treatment of Laws of Error. The omissions are mainly due to a wish to avoid increasing the bulk of this volume more than is actually necessary, and to a feeling that the portions treating specially of Inductive Logic (which occupied some space in the last edition) would be more suitable to a regular work on that subject. I am at present engaged on such a work.
These changes don't indicate significant shifts in my views about the foundations and scope of Probability. Some of them are naturally due to the necessary updates when discussing a topic that hasn't remained static over the past eleven years. For example, the growing interest in what we might call the Theory of Statistics has made it necessary to delve deeper into the Nature and handling of Laws of Error. The omissions mainly stem from a desire to avoid making this volume larger than necessary and from the belief that the sections specifically discussing Inductive Logic (which took up considerable space in the last edition) would fit better in a dedicated work on that subject. I'm currently working on such a project.
The publications which I have had occasion to notice have mostly appeared in various scientific journals. The principal authors of these have been Mr F. Galton and Mr F. Y. Edgeworth: to the latter of whom I am also personally much obliged for many discussions, oral and written, and for his kindness in looking through the proof-sheets. His published articles are too numerous for separate mention here, but I may say generally, in addition to the obligations specially noticed, that I have been considerably indebted to them in writing the last two chapters. Two authors of works of a somewhat more substantial character, viz. Prof. Lexis and Von Kries, only came under my notice unfortunately after this work was already in the printer's hands. With the latter of these authors I find myself in closer agreement than with most others, in respect of his general conception and treatment of Probability.
The publications I've noticed mostly came out in various scientific journals. The main authors have been Mr. F. Galton and Mr. F. Y. Edgeworth; I'm also personally grateful to Edgeworth for many discussions, both verbal and written, and for his kindness in reviewing the proof sheets. His published articles are too many to list individually here, but I want to acknowledge that I've greatly relied on them while writing the last two chapters. I only became aware of two authors—Prof. Lexis and Von Kries—after this work was already being printed. I find myself more in agreement with Von Kries than with most others regarding his overall understanding and approach to Probability.
December, 1887.
December 1887.
TABLE OF CONTENTS.[*]
* Chapters and sections which are nearly or entirely new are printed in italics.
* Chapters and sections that are almost or completely new are printed in italics.
PART I.
PHYSICAL FOUNDATIONS OF THE SCIENCE OF PROBABILITY. Chh. I–V.
CHAPTER I.
THE SERIES OF PROBABILITY.
§§ 1, 2. Distinction between the proportional propositions of Probability, and the propositions of Logic.
§§ 1, 2. Difference between the proportional statements of Probability and the statements of Logic.
3, 4. The former are best regarded as presenting a series of individuals,
3, 4. The former are best regarded as presenting a series of individuals,
5. Which may occur in any order of time,
5. Which may occur in any order of time,
6, 7. And which present themselves in groups.
6, 7. And which appear in groups.
8. Comparison of the above with the ordinary phraseology.
8. Comparison of the above with everyday language.
9, 10. These series ultimately fluctuate,
9, 10. These series ultimately fluctuate,
11. Especially in the case of moral and social phenomena,
11. Especially in the case of moral and social phenomena,
12. Though in the case of games of chance the fluctuation is practically inappreciable.
12. However, in games of chance, the variation is almost unnoticeable.
13, 14. In this latter case only can rigorous inferences be drawn.
13, 14. Strict conclusions can only be drawn in this latter case.
15, 16. The Petersburg Problem.
15, 16. The Petersburg Problem.
CHAPTER II.
ARRANGEMENT AND FORMATION OF THE SERIES. LAWS OF ERROR.
§§ 1, 2. Indication of the nature of a Law of Error or Divergence.
§§ 1, 2. Description of a Law of Error or Divergence.
3. Is there necessarily but one such law,
3. Is there necessarily only one such law,
4. Applicable to widely distinct classes of things?
4. Applicable to widely different classes of things?
5, 6. This cannot be proved directly by statistics;
5, 6. This can't be directly proven by statistics;
7, 8. Which in certain cases show actual asymmetry.
7, 8. Which in some cases show actual asymmetry.
9, 10. Nor deductively;
9, 10. Nor deductively;
11. Nor by the Method of Least Squares.
11. Nor by the Method of Least Squares.
12. Distinction between Laws of Error and the Method of Least Squares.
12. Distinction between Laws of Error and the Method of Least Squares.
13. Supposed existence of types.
13. Alleged existence of types.
14–16. Homogeneous and heterogeneous classes.
14–16. Homogeneous and heterogeneous classes.
17, 18. The type in the case of human stature, &c.
17, 18. The type in the case of human height, etc.
19, 20. The type in mental characteristics.
19, 20. The type in mental traits.
21, 22. Applications of the foregoing principles and results.
21, 22. Applications of the preceding principles and results.
CHAPTER III.
ORIGIN OR PROCESS OF CAUSATION OF THE SERIES.
§ 1. The causes consist of (1) ‘objects,’
§ 1. The causes consist of (1) ‘objects,’
2, 3. Which may or may not be distinguishable into natural kinds,
2, 3. Which may or may not be distinguishable into natural kinds,
4–6. And (2) ‘agencies.’
4–6. And (2) ‘agencies.’
7. Requisites demanded in the above:
7. Requirements mentioned above:
8, 9. Consequences of their absence.
8, 9. Consequences of their absence.
10. Where are the required causes found?
10. Where can the required causes be found?
11, 12. Not in the direct results of human will.
11, 12. Not in the direct results of human will.
13–15. Examination of apparent exceptions.
13–15. Examination of apparent exceptions.
16–18. Further analysis of some natural causes.
16–18. Further analysis of specific natural causes.
CHAPTER IV.
HOW TO DISCOVER AND PROVE THE SERIES.
§ 1. The data of Probability are established by experience;
§ 1. Probability data is based on experience;
2. Though in practice most problems are solved deductively.
2. Though in practice most problems are solved deductively.
3–7. Mechanical instance to show the inadequacy of any à priori proof.
3–7. A mechanical example to show the inadequacy of any a priori proof.
8. The Principle of Sufficient Reason inapplicable.
8. The Principle of Sufficient Reason does not apply.
9. Evidence of actual experience.
9. Evidence of actual experience.
10, 11. Further examination of the causes.
10, 11. Further investigation into the causes.
12, 13. Distinction between the succession of physical events and the Doctrine of Combinations.
12, 13. Difference between the sequence of physical events and the Theory of Combinations.
14, 15. Remarks of Laplace on this subject.
14, 15. Laplace's remarks on this subject.
16. Bernoulli's Theorem;
16. Bernoulli's Theorem;
17, 18. Its inapplicability to social phenomena.
17, 18. Its inapplicability to social phenomena.
19. Summation of preceding results.
19. Summary of previous results.
CHAPTER V.
THE CONCEPTION OF RANDOMNESS.
§ 1. General Indication.
§ 1. Overview.
2–5. The postulate of ultimate uniform distribution at one stage or another.
2–5. The postulate of ultimate uniform distribution at one stage or another.
6. This area of distribution must be finite:
6. This area of distribution must be finite:
7, 8. Geometrical illustrations in support:
7, 8. Geometric illustrations in support:
9. Can we conceive any exception here?
9. Can we think of any exceptions to this?
10, 11. Experimental determination of the random character when the events are many:
10, 11. How to experimentally figure out randomness when there are many events:
12. Corresponding determination when they are few.
12. The corresponding determination when the events are few.
13, 14. Illustration from the constant π.
13, 14. Illustration from the constant π.
15, 16. Conception of a line drawn at random.
15, 16. Understanding a randomly drawn line.
17. Graphical illustration.
17. Visual representation.
PART II.
LOGICAL SUPERSTRUCTURE ON THE ABOVE PHYSICAL FOUNDATIONS. Chh. VI–XIV.
CHAPTER VI.
MEASUREMENT OF BELIEF.
§§ 1, 2. Preliminary remarks.
§§ 1, 2. Preliminary remarks.
3, 4. Are we accurately conscious of gradations of belief?
3, 4. Are we accurately aware of gradations of belief?
5. Probability only concerned with part of this enquiry.
5. Probability only pertains to part of this investigation.
6. Difficulty of measuring our belief;
6. Challenges in measuring our belief;
7. Owing to intrusion of emotions,
7. Due to emotional interference,
8. And complexity of the evidence.
8. And complexity of the evidence.
9. And when measured, is it always correct?
9. Is it always accurate when measured?
10, 11. Distinction between logical and psychological views.
10, 11. Difference between logical and psychological viewpoints.
12–16. Analogy of Formal Logic fails to show that we can thus detach and measure our belief.
12–16. The analogy of formal logic doesn't show that we can separate and measure our beliefs.
17. Apparent evidence of popular language to the contrary.
17. Apparent evidence from everyday language to the contrary.
18. How is full belief justified in inductive enquiry?
18. How is full belief justified in inductive inquiry?
19–23. Attempt to show how partial belief may be similarly justified.
19–23. Trying to show how partial belief can be justified in a similar way.
24–28. Extension of this explanation to cases which cannot be repeated in experience.
24–28. Applying this explanation to situations that can't be experienced again.
29. Can other emotions besides belief be thus measured?
29. Can other emotions, apart from belief, be measured this way?
30. Errors thus arising in connection with the Petersburg Problem.
30. Errors related to the Petersburg Problem.
31, 32. The emotion of surprise is a partial exception.
31, 32. Surprise is somewhat of an exception.
33, 34. Objective and subjective phraseology.
33, 34. Objective and subjective language.
35. The definition of probability,
35. The definition of probability,
36. Introduces the notion of a ‘limit’,
36. Introduces the concept of a 'limit',
37. And implies, vaguely, some degree of belief.
37. And implies, somewhat vaguely, a certain degree of belief.
CHAPTER VII.
THE RULES OF INFERENCE IN PROBABILITY.
§ 1. Nature of these inferences.
§ 1. Nature of these inferences.
2. Inferences by addition and subtraction.
2. Inferences by addition and subtraction.
3. Inferences by multiplication and division.
3. Inferences by multiplication and division.
4–6. Rule for independent events.
4–6. Rule for independent events.
7. Other rules sometimes introduced.
7. Other rules occasionally added.
8. All the above rules may be interpreted subjectively, i.e. in terms of belief.
8. All of the rules mentioned above can be interpreted subjectively, which means based on personal belief.
9–11. Rules of so-called Inverse Probability.
9–11. Rules of so-called Inverse Probability.
12, 13. Nature of the assumption involved in them:
12, 13. Nature of the assumption involved in them:
14–16. Arbitrary character of this assumption.
14–16. Arbitrary nature of this assumption.
17, 18. Physical illustrations.
17, 18. Physical illustrations.
CHAPTER VIII.
THE RULE OF SUCCESSION.
§ 1. Reasons for desiring some such rule:
§ 1. Reasons for wanting some sort of rule:
2. Though it could scarcely belong to Probability.
2. Though it could hardly belong to Probability.
3. Distinction between Probability and Induction.
3. Distinction between Probability and Induction.
4, 5. Impossibility of reducing the various rules of the latter under one head.
4, 5. It's not possible to categorize the various rules of the latter into one group.
6. Statement of the Rule of Succession;
6. Statement of the Rule of Succession;
7. Proof offered for it.
7. Proof offered for it.
8. Is it a strict rule of inference?
8. Is it a strict rule of inference?
9. Or is it a psychological principle?
9. Or is it a psychological principle?
CHAPTER IX.
INDUCTION.
§§ 1–5. Statement of the Inductive problem, and origin of the Inductive inference.
§§ 1–5. Overview of the Inductive problem and the origin of Inductive reasoning.
6. Relation of Probability to Induction.
6. Relation of Probability to Induction.
7–9. The two are sometimes merged into one.
7–9. The two are occasionally merged into one.
10. Extent to which causation is needed in Probability.
10. The extent to which causation is needed in Probability.
11–13. Difficulty of referring an individual to a class:
11–13. The difficulty of referring an individual to a class:
14. This difficulty but slight in Logic,
14. This challenge is small in Logic,
15, 16. But leads to perplexity in Probability:
15, 16. However, it leads to confusion in probability:
17–21. Mild form of this perplexity;
17–21. Mild version of this confusion;
22, 23. Serious form.
22, 23. Serious form.
24–27. Illustration from Life Insurance.
24–27. Illustration from Life Insurance.
28, 29. Meaning of ‘the value of a life’.
28, 29. Meaning of 'the value of a life.'
30, 31. Successive specialization of the classes to which objects are referred.
30, 31. Gradual specialization of the categories that objects are assigned to.
32. Summary of results.
32. Summary of results.
CHAPTER X.
CHANCE, CAUSATION AND DESIGN.
§ 1. Old Theological objection to Chance.
§ 1. Traditional Theological Objection to Randomness.
2–4. Scientific version of the same.
2–4. Scientific version of the same.
5. Statistics in reference to Free-will.
5. Statistics on Free Will.
6–8. Inconclusiveness of the common arguments here.
6–8. The usual arguments here are not convincing.
9, 10. Chance as opposed to Physical Causation.
9, 10. Chance vs. Physical Causation.
11. Chance as opposed to Design in the case of numerical constants.
11. Chance vs. Design in the context of numerical constants.
12–14. Theoretic solution between Chance and Design.
12–14. Theoretical solution between Chance and Design.
15. Illustration from the dimensions of the Pyramid.
15. Illustration from the dimensions of the Pyramid.
16, 17. Discussion of certain difficulties here.
16, 17. Discussion of certain difficulties here.
18, 19. Illustration from Psychical Phenomena.
18, 19. Illustration from Psychic Phenomena.
20. Arbuthnott's Problem of the proportion of the sexes.
20. Arbuthnott's Gender Ratio Problem.
21–23. Random or designed distribution of the stars.
21–23. Random or organized distribution of the stars.
(Note on the proportion of the sexes.)
(Note on the proportion of the sexes.)
CHAPTER XI.
MATERIAL AND FORMAL LOGIC.
§§ 1, 2. Broad distinction between these views;
§§ 1, 2. Key differences between these views;
2, 3. Difficulty of adhering consistently to the objective view;
2, 3. Challenges of consistently keeping an objective viewpoint;
4. Especially in the case of Hypotheses.
4. Especially in the case of Hypotheses.
5. The doubtful stage of our facts is only occasional in Inductive Logic.
5. The uncertain phase of our facts only happens occasionally in Inductive Logic.
6–9. But normal and permanent in Probability.
6–9. But normal and permanent in Probability.
10, 11. Consequent difficulty of avoiding Conceptualist phraseology.
10, 11. The ongoing challenge of avoiding Conceptualist language.
CHAPTER XII.
CONSEQUENCES OF THE DISTINCTIONS OF THE PREVIOUS CHAPTER.
§§ 1, 2. Probability has no relation to time.
§§ 1, 2. Probability doesn't relate to time.
3, 4. Butler and Mill on Probability before and after the event.
3, 4. Butler and Mill on Probability Before and After the Event.
5. Other attempts at explaining the difficulty.
5. Other efforts to clarify the issue.
6–8. What is really meant by the distinction.
6–8. What the distinction really means.
9. Origin of the common mistake.
9. Source of the common mistake.
10–12. Examples in illustration of this view,
10–12. Examples showing this viewpoint,
13. Is Probability relative?
13. Is Probability relative?
14. What is really meant by this expression.
14. What this expression really means.
15. Objections to terming Probability relative.
15. Objections to calling Probability relative.
16, 17. In suitable examples the difficulty scarcely presents itself.
16, 17. In appropriate cases, the difficulty barely shows.
CHAPTER XIII.
ON MODALITY.
§ 1. Various senses of Modality;
§ 1. Various senses of Modality;
2. Having mostly some relation to Probability.
2. Most of which bear some relation to Probability.
3. Modality must be recognized.
3. Modality needs to be recognized.
4. Sometimes relegated to the predicate,
4. Sometimes pushed to the predicate,
5, 6. Sometimes incorrectly rejected altogether.
5, 6. Sometimes incorrectly rejected altogether.
7, 8. Common practical recognition of it.
7, 8. Common practical recognition of it.
9–11. Modal propositions in Logic and in Probability.
9–11. Modal propositions in Logic and in Probability.
12. Aristotelian view of the Modals;
12. Aristotle's perspective on Modals;
13, 14. Founded on extinct philosophical views;
13, 14. Based on old philosophical ideas;
15. But long and widely maintained.
15. But long and widely accepted.
16. Kant's general view.
16. Kant's overall perspective.
17–19. The number of modal divisions admitted by various logicians.
17–19. The number of modal categories acknowledged by various logicians.
20. Influence of the theory of Probability.
20. Impact of Probability Theory.
21, 22. Modal syllogisms.
21, 22. Modal syllogisms.
23. Popular modal phraseology.
23. Common modal phrases.
24–26. Probable and Dialectic syllogisms.
24–26. Probable and Dialectic syllogisms.
27, 28. Modal difficulties occur in Jurisprudence.
27, 28. Challenges with modality occur in law.
29, 30. Proposed standards of legal certainty.
29, 30. Proposed standards of legal certainty.
31. Rejected formally in English Law, but possibly recognized practically.
31. Officially rejected in English law, but possibly recognized in practice.
32. How, if so, it might be determined.
32. How it could possibly be figured out.
CHAPTER XIV.
FALLACIES.
§§ 1–3. (I.) Errors in judging of events after they have happened.
§§ 1–3. (I.) Errors in assessing events after they have happened.
4–7. Very various judgments may be thus involved.
4–7. There can be a variety of judgments involved.
8, 9. (II.) Confusion between random and picked selections.
8, 9. (II.) Confusion between random and picked selections.
10, 11. (III.) Undue limitation of the notion of Probability.
10, 11. (III.) Undue limitation of the notion of Probability.
12–16. (IV.) Double or Quits: the Martingale.
12–16. (IV.) Double or Nothing: the Martingale.
17, 18. Physical illustration.
17, 18. Physical representation.
19, 20. (V.) Inadequate realization of large numbers.
19, 20. (V.) Lack of understanding of large numbers.
21–24. Production of works of art by chance.
21–24. The production of works of art by chance.
25. Illustration from doctrine of heredity.
25. Illustration from the heredity doctrine.
26–30. (VI.) Confusion between Probability and Induction.
26–30. (VI.) Confusion between Probability and Induction.
31–33. (VII.) Undue neglect of small chances.
31–33. (VII.) Undue neglect of small chances.
34, 35. (VIII.) Judging by the event in Probability and in Induction.
34, 35. (VIII.) Evaluating based on the event in Probability and Induction.
PART III.
VARIOUS APPLICATIONS OF THE THEORY OF PROBABILITY. Chh. XV–XIX.
CHAPTER XV.
INSURANCE AND GAMBLING.
§§ 1, 2. The certainties and uncertainties of life.
§§ 1, 2. The certainties and uncertainties of life.
3–5. Insurance a means of diminishing the uncertainties.
3–5. Insurance as a way to minimize uncertainties.
6, 7. Gambling a means of increasing them.
6, 7. Gambling as a means of increasing them.
8, 9. Various forms of gambling.
8, 9. Different types of gambling.
10, 11. Comparison between these practices.
10, 11. Comparing these practices.
12–14. Proofs of the disadvantage of gambling:—
12–14. Proofs of the negative effects of gambling:—
(1) on arithmetical grounds:
(1) on arithmetic grounds:
15, 16. Illustration from family names.
15, 16. Illustration from last names.
17. (2) from the ‘moral expectation’.
17. (2) from the 'moral expectation'.
18, 19. Inconclusiveness of these proofs.
18, 19. Inconclusiveness of these proofs.
20–22. Broader questions raised by these attempts.
20–22. Wider issues brought up by these efforts.
CHAPTER XVI.
APPLICATION OF PROBABILITY TO TESTIMONY.
§§ 1, 2. Doubtful applicability of Probability to testimony.
§§ 1, 2. Doubtful applicability of Probability to testimony.
3. Conditions of such applicability.
3. Conditions of such applicability.
4. Reasons for the above conditions.
4. Reasons for the above conditions.
5, 6. Are these conditions fulfilled in the case of testimony?
5, 6. Are these conditions satisfied in the case of testimony?
7. The appeal here is not directly to statistics.
7. The appeal here doesn't focus directly on statistics.
8, 9. Illustrations of the above.
8, 9. Examples of the above.
10, 11. Is any application of Probability to testimony valid?
10, 11. Is any use of probability in testimony legitimate?
CHAPTER XVII.
CREDIBILITY OF EXTRAORDINARY STORIES.
§ 1. Improbability before and after the event.
§ 1. Improbability before and after the event.
2, 3. Does the rejection of this lead to the conclusion that the credibility of a story is independent of its nature?
2, 3. Does rejecting this mean that the credibility of a story isn't influenced by its nature?
4. General and special credibility of a witness.
4. General and special credibility of a witness.
5–8. Distinction between alternative and open questions, and the usual rules for application of testimony to each of these.
5–8. The difference between alternative and open questions, along with the usual guidelines for using testimony with each type.
9. Discussion of an objection.
9. Addressing an objection.
10, 11. Testimony of worthless witnesses.
10, 11. Testimony of unreliable witnesses.
12–14. Common practical ways of regarding such problems.
12–14. Common practical methods for addressing these issues.
15. Extraordinary stories not necessarily less probable.
15. Extraordinary stories are not necessarily less probable.
16–18. Meaning of the term extraordinary, and its distinction from miraculous.
16–18. Meaning of the term extraordinary and how it differs from miraculous.
19, 20. Combination of testimony.
19, 20. Combining testimony.
21, 22. Scientific meaning of a miracle.
21, 22. The scientific meaning of a miracle.
23, 24. Two distinct prepossessions in regard to miracles, and the logical consequences of these.
23, 24. Two different beliefs about miracles and what they logically imply.
25. Difficulty of discussing by our rules cases in which arbitrary interference can be postulated.
25. The difficulty of discussing, under our rules, cases in which arbitrary interference can be postulated.
26, 27. Consequent inappropriateness of many arguments.
26, 27. As a result, many arguments are inappropriate.
CHAPTER XVIII.
ON THE NATURE AND USE OF AN AVERAGE, AND ON THE DIFFERENT KINDS OF AVERAGE.
§ 1. Preliminary rude notion of an average,
§ 1. A rough preliminary notion of an average,
2. More precise quantitative notion, yielding
2. More accurate quantitative idea, producing
(1) the Arithmetical Average,
(1) the Arithmetic Average,
3. (2) the Geometrical.
3. (2) the Geometric.
4. In asymmetrical curves of error the arithmetic average must be distinguished from,
4. In asymmetrical error curves, the arithmetic average must be distinguished from,
5. (3) the Maximum Ordinate average,
5. (3) the Maximum Ordinate average,
6. (4) and the Median.
6. (4) and the Median.
7. Diagram in illustration.
7. Illustrative diagram.
8–10. Average departure from the average, considered under the above heads, and under that of
8–10. The average deviation from the mean, viewed through the perspectives mentioned above, and in terms of
11. (5) The (average of) Mean Square of Error,
11. (5) The Mean Squared Error (average),
12–14. The objects of taking averages.
12–14. The purpose of averaging.
15. Mr Galton's practical method of determining the average.
15. Mr. Galton's practical approach to finding the average.
16, 17. No distinction between the average and the mean.
16, 17. No difference between the average and the mean.
18–20. Distinction between what is necessary and what is experimental here.
18–20. Difference between what's essential and what's experimental here.
21, 22. Theoretical defects in the determination of the ‘errors’.
21, 22. Theoretical issues in identifying the 'errors'.
23. Practical escape from these.
23. A practical way around these.
(Note about the units in the exponential equation and integral.)
(Note about the units in the exponential equation and integral.)
CHAPTER XIX.
THE THEORY OF THE AVERAGE AS A MEANS OF APPROXIMATION TO THE TRUTH.
§§ 1–4. General indication of the problem: i.e. an inverse one requiring the previous consideration of a direct one.
§§ 1–4. General outline of the problem: an inverse problem, which requires a direct one to be considered first.
[I. The direct problem:—given the central value and law of dispersion of the single errors, to determine those of the averages. §§ 6–20.]
[I. The direct problem: given the central value and law of dispersion of the individual errors, to determine those of the averages. §§ 6–20.]
6. (i) The law of dispersion may be determinable à priori,
6. (i) The law of dispersion may be determinable a priori,
7. (ii) or experimentally, by statistics.
7. (ii) or experimentally, from statistics.
8, 9. Thence to determine the modulus of the error curve.
8, 9. From this, determine the modulus of the error curve.
10–14. Numerical example to illustrate the nature and amount of the contraction of the modulus of the average-error curve.
10–14. Numerical example to demonstrate the nature and extent of the contraction of the average-error curve's modulus.
15. This curve is of the same general kind as that of the single errors;
15. This curve is similar to that of the individual errors;
16. Equally symmetrical,
16. Equally symmetrical,
17, 18. And more heaped up towards the centre.
17, 18. And more accumulated towards the center.
19, 20. Algebraic generalization of the foregoing results.
19, 20. Algebraic generalization of the earlier results.
[II. The inverse problem:—given but a few of the errors to determine their centre and law, and thence to draw the above deductions. §§ 21–25.]
[II. The inverse problem:—given only a few of the errors to identify their center and pattern, and then to draw the conclusions mentioned above. §§ 21–25.]
22, 23. The actual calculations are the same as before,
22, 23. The calculations are the same as before,
24. With the extra demand that we must determine how probable are the results.
24. With the added requirement that we determine how probable the results are.
25. Summary.
25. Overview.
[III. Consideration of the same questions as applied to certain peculiar laws of error. §§ 26–37.]
[III. Examining the same issues as they pertain to specific unique error laws. §§ 26–37.]
26. (i) All errors equally probable.
26. (i) All errors are equally probable.
27, 28. (ii) Certain peculiar laws of error.
27, 28. (ii) Some unique error laws.
29, 30. Further analysis of the reasons for taking averages.
29, 30. Additional examination of the reasons for calculating averages.
31–35. Illustrative examples.
31–35. Examples.
36, 37. Curves with double centre and absence of symmetry.
36, 37. Curves with a double center and no symmetry.
38, 39. Conclusion.
38, 39. Conclusion.
THE LOGIC OF CHANCE.
CHAPTER I.
ON CERTAIN KINDS OF GROUPS OR SERIES AS THE FOUNDATION OF PROBABILITY.
§ 1. It is sometimes not easy to give a clear definition of a science at the outset, so as to set its scope and province before the reader in a few words. In the case of those sciences which are more immediately and directly concerned with what are termed objects, rather than with what are termed processes, this difficulty is not indeed so serious. If the reader is already familiar with the objects, a simple reference to them will give him a tolerably accurate idea of the direction and nature of his studies. Even if he be not familiar with them, they will still be often to some extent connected and associated in his mind by a name, and the mere utterance of the name may thus convey a fair amount of preliminary information. This is more or less the case with many of the natural sciences; we can often tell the reader beforehand exactly what he is going to study. But when a science is concerned, not so much with objects directly, as with processes and laws, or when it takes for the subject of its enquiry some comparatively obscure feature drawn from phenomena which have little or nothing else in common, the difficulty of giving preliminary information becomes greater. Recognized classes of objects have then 2 to be disregarded and even broken up, and an entirely novel arrangement of the objects to be made. In such cases it is the study of the science that first gives the science its unity, for till it is studied the objects with which it is concerned were probably never thought of together. Here a definition cannot be given at the outset, and the process of obtaining it may become by comparison somewhat laborious.
§ 1. Sometimes it's not easy to clearly define a science right from the start, making it challenging to establish its scope for the reader in just a few words. For those sciences that deal more directly with what we call objects, rather than processes, this challenge isn't as significant. If the reader is already familiar with the objects, a simple reference to them will provide a fairly accurate idea of the focus and nature of the studies. Even if they aren’t familiar, the objects may still be somewhat connected and associated in their mind by name, and just mentioning the name can convey a reasonable amount of preliminary information. This applies to many natural sciences; we can often tell the reader exactly what they will be studying. However, when a science focuses not so much on objects directly but on processes and laws, or when it examines a relatively obscure aspect drawn from phenomena that have little in common, providing an initial definition becomes more difficult. Recognized classes of objects have to be set aside and even deconstructed, requiring a completely new arrangement of the objects. In such situations, the study of the science itself provides the unity, because until it's studied, the objects it focuses on likely weren't considered together. Here, a definition can't be given upfront, and the process of arriving at one might be comparatively challenging.
The science of Probability, at least on the view taken of it in the following pages, is of this latter description. The reader who is at present unacquainted with the science cannot be at once informed of its scope by a reference to objects with which he is already familiar. He will have to be taken in hand, as it were, and some little time and trouble will have to be expended in directing his attention to our subject-matter before he can be expected to know it. To do this will be our first task.
The science of Probability, at least as it's presented in the following pages, falls into this latter category. A reader who is not currently familiar with the science can't be immediately informed about it by referencing things they're already familiar with. They will need some guidance, and it will take some time and effort to focus their attention on our topic before they can be expected to understand it. This will be our first task.
§ 2. In studying Nature, in any form, we are continually coming into possession of information which we sum up in general propositions. Now in very many cases these general propositions are neither more nor less certain and accurate than the details which they embrace and of which they are composed. We are assuming at present that the truth of these generalizations is not disputed; as a matter of fact they may rest on weak evidence, or they may be uncertain from their being widely extended by induction; what is meant is, that when we resolve them into their component parts we have precisely the same assurance of the truth of the details as we have of that of the whole. When I know, for instance, that all cows ruminate, I feel just as certain that any particular cow or cows ruminate as that the whole class does. I may be right or wrong in my original statement, and I may have obtained it by any conceivable mode in which truths can be obtained; but whatever the value of 3 the general proposition may be, that of the particulars is neither greater nor less. The process of inferring the particular from the general is not accompanied by the slightest diminution of certainty. If one of these ‘immediate inferences’ is justified at all, it will be equally right in every case.
§ 2. When we study Nature in any form, we constantly gather information that we summarize in general statements. Often, these general statements are just as reliable and accurate as the specific details they cover. Right now, we're assuming these generalizations are accepted as true; however, they can actually be based on weak evidence or be uncertain due to broad inductive reasoning. What I mean is that when we break them down into their individual parts, we have just as much confidence in the truth of those details as we do in the overall statement. For example, when I know that all cows ruminate, I feel just as certain that any specific cow or cows ruminate as I do about the entire group. I could be right or wrong about my initial statement, and I might have derived it using any number of methods to obtain truths; but regardless of the validity of the general statement, the validity of the specifics is no greater or lesser. The process of drawing conclusions about specifics from a general statement doesn’t reduce our certainty at all. If one of these 'immediate inferences' is valid, it will be equally valid in every case.
But it is by no means necessary that this characteristic should exist in all cases. There is a class of immediate inferences, almost unrecognized indeed in logic, but constantly drawn in practice, of which the characteristic is, that as they increase in particularity they diminish in certainty. Let me assume that I am told that some cows ruminate; I cannot infer logically from this that any particular cow does so, though I should feel some way removed from absolute disbelief, or even indifference to assent, upon the subject; but if I saw a herd of cows I should feel more sure that some of them were ruminant than I did of the single cow, and my assurance would increase with the numbers of the herd about which I had to form an opinion. Here then we have a class of things as to the individuals of which we feel quite in uncertainty, whilst as we embrace larger numbers in our assertions we attach greater weight to our inferences. It is with such classes of things and such inferences that the science of Probability is concerned.
But it's not necessary for this characteristic to exist in every case. There's a kind of immediate inferences, often overlooked in logic but frequently made in practice, where as they become more specific, they actually become less certain. For example, if I hear that some cows ruminate, I can't logically conclude that any specific cow does this, although I would feel somewhat convinced—maybe not fully convinced, but not indifferent either—about the matter. However, if I see a herd of cows, I would feel more confident that some of them are ruminating than I would about just one cow, and my confidence would grow with the number of cows in front of me that I had to make a judgment about. So here we have a situation where we feel uncertain about individuals, but as we make broader claims, we give more weight to our conclusions. It's these types of situations and inferences that the science of Probability deals with.
§ 3. In the foregoing remarks, which are intended to be purely preliminary, we have not been able altogether to avoid some reference to a subjective element, viz. the degree of our certainty or belief about the things which we are supposed to contemplate. The reader may be aware that by some writers this element is regarded as the subject-matter of the science. Hence it will have to be discussed in a future chapter. As however I do not agree with the opinion of the writers just mentioned, at least as regards 4 treating this element as one of primary importance, no further allusion will be made to it here, but we will pass on at once to a more minute investigation of that distinctive characteristic of certain classes of things which was introduced to notice in the last section.
§ 3. In the earlier comments, which are meant to be purely preliminary, we haven’t been able to completely avoid mentioning a subjective element, namely, the level of our certainty or belief about the things we are supposed to consider. The reader might know that some writers see this element as the main focus of the science. Therefore, it will need to be discussed in a future chapter. However, since I don’t agree with the view of those writers, at least regarding treating this element as a top priority, I won’t bring it up here again, but will move on to a more detailed examination of the distinct characteristic of certain categories of things that was highlighted in the last section.
In these classes of things, which are those with which Probability is concerned, the fundamental conception which the reader has to fix in his mind as clearly as possible, is, I take it, that of a series. But it is a series of a peculiar kind, one of which no better compendious description can be given than that which is contained in the statement that it combines individual irregularity with aggregate regularity. This is a statement which will probably need some explanation. Let us recur to an example of the kind already alluded to, selecting one which shall be in accordance with experience. Some children will not live to thirty. Now if this proposition is to be regarded as a purely indefinite or, as it would be termed in logic, ‘particular’ proposition, no doubt the notion of a series does not obviously present itself in connection with it. It contains a statement about a certain unknown proportion of the whole, and that is all. But it is not with these purely indefinite propositions that we shall be concerned. Let us suppose the statement, on the contrary, to be of a numerical character, and to refer to a given proportion of the whole, and we shall then find it difficult to exclude the notion of a series. We shall find it, I think, impossible to do so as soon as we set before us the aim of obtaining accurate, or even moderately correct inferences. What, for instance, is the meaning of the statement that two new-born children in three fail to attain the age of sixty-three? It certainly does not declare that in any given batch of, say, thirty, we shall find just twenty that fail: whatever might be the strict meaning of the words, this 5 is not the import of the statement. It rather contemplates our examination of a large number, of a long succession of instances, and states that in such a succession we shall find a numerical proportion, not indeed fixed and accurate at first, but which tends in the long run to become so. In every kind of example with which we shall be concerned we shall find this reference to a large number or succession of objects, or, as we shall term it, series of them.
In these categories, which relate to Probability, the key idea that the reader should clearly understand is that of a series. However, it’s a specific type of series, best described as one that combines individual irregularity with aggregate regularity. This might need some clarification. Let’s refer back to an example we've mentioned before, choosing one that aligns with real-life experience. Some children won't live to see thirty. If we view this statement as a vague or, in logical terms, a 'particular' proposition, it's clear that the idea of a series doesn’t immediately come to mind. It only gives us information about an unknown portion of the total, and that’s it. But we won’t be focusing on these vague propositions. Instead, let’s assume that the statement has a numerical aspect and relates to a specific portion of the whole. In that case, we’ll find it hard to ignore the concept of a series. I believe it becomes impossible to do so once we aim to make accurate or even somewhat correct inferences. For example, what does it mean to say that two out of three newborns don’t reach the age of sixty-three? This doesn’t imply that in any specific group of, say, thirty, we’ll find exactly twenty that don’t make it; no matter how strict the wording is, that’s not what the statement means. It suggests that if we look at a large number of instances over time, we’ll observe a numerical proportion, which might not be exact at first but tends to stabilize in the long run. In every example we examine, we’ll see this reference to a large number or succession of objects, or, as we’ll call it, a series of them.
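To make the long-run reading of such a statement concrete, here is a small simulation sketch added by way of illustration (it is not part of the original text, and the fixed, independent two-in-three chance it assigns to each life is purely an assumption). It shows that separate batches of thirty vary from batch to batch, while the pooled proportion settles toward two in three as the series lengthens.

import random

random.seed(1)                      # fix the seed so the run is repeatable
p_fail = 2 / 3                      # assumed long-run proportion failing to reach sixty-three

# A few separate batches of thirty "lives": the count of failures varies from batch to batch.
for batch in range(1, 6):
    failures = sum(random.random() < p_fail for _ in range(30))
    print(f"batch {batch}: {failures} of 30 fail (exactly 20 would match two in three)")

# A long succession of lives: the observed proportion steadies as the series lengthens.
for n in (30, 300, 3000, 300000):
    failures = sum(random.random() < p_fail for _ in range(n))
    print(f"n = {n:6d}: observed proportion failing = {failures / n:.4f}")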
A few additional examples may serve to make this plain.
A few more examples might help clarify this.
Let us suppose that we toss up a penny a great many times; the results of the successive throws may be conceived to form a series. The separate throws of this series seem to occur in utter disorder; it is this disorder which causes our uncertainty about them. Sometimes head comes, sometimes tail comes; sometimes there is a repetition of the same face, sometimes not. So long as we confine our observation to a few throws at a time, the series seems to be simply chaotic. But when we consider the result of a long succession we find a marked distinction; a kind of order begins gradually to emerge, and at last assumes a distinct and striking aspect. We find in this case that the heads and tails occur in about equal numbers, that similar repetitions of different faces do so also, and so on. In a word, notwithstanding the individual disorder, an aggregate order begins to prevail. So again if we are examining the length of human life, the different lives which fall under our notice compose a series presenting the same features. The length of a single life is familiarly uncertain, but the average duration of a batch of lives is becoming in an almost equal degree familiarly certain. The larger the number we take out of any mixed crowd, the clearer become the symptoms of order, the more nearly will the average length of each selected class be the same. These few cases will serve as simple examples of a property 6 of things which can be traced almost everywhere, to a greater or less extent, throughout the whole field of our experience. Fires, shipwrecks, yields of harvest, births, marriages, suicides; it scarcely seems to matter what feature we single out for observation.[1] The irregularity of the single instances diminishes when we take a large number, and at last seems for all practical purposes to disappear.
Let's say we flip a penny many times; the results of each flip can be thought of as forming a series. The individual flips appear random, and it’s this randomness that creates our uncertainty about them. Sometimes we get heads, sometimes tails; occasionally, the same side comes up again, sometimes not. As long as we only look at a few flips at a time, the series seems completely chaotic. But when we look at a long sequence of results, a clear pattern starts to emerge, and eventually, it becomes distinct and noticeable. We find that heads and tails appear in roughly equal numbers, and similar repetitions of different sides happen too, and so forth. In short, despite the individual randomness, a general order begins to take shape. Similarly, if we examine human lifespan, the different lives we observe create a series with the same characteristics. The length of a single life is often uncertain, but the average length of a group of lives becomes fairly predictable. The larger the number we analyze from any mixed group, the clearer the signs of order become, and the more similar the average length of each selected category will be. These examples illustrate a property that can be found almost everywhere throughout our experiences. Fires, shipwrecks, harvest yields, births, marriages, suicides; it hardly seems to matter what aspect we focus on. The irregularity of individual cases decreases when we look at a large number, and eventually, it seems to disappear for all practical purposes.
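The emergence of aggregate order out of individual disorder can be imitated with a short simulation, offered here only as an editorial sketch (the series length and the checkpoints are arbitrary). It tallies, over progressively longer stretches of one simulated series of tosses, how often heads appears and how often a toss repeats the face of the toss before it; both proportions settle near one half.

import random

random.seed(2)
tosses = [random.choice("HT") for _ in range(100000)]   # one long series of penny tosses

for n in (10, 100, 1000, 10000, 100000):
    sample = tosses[:n]
    heads = sample.count("H") / n
    repeats = sum(a == b for a, b in zip(sample, sample[1:])) / (n - 1)
    print(f"{n:6d} tosses: proportion of heads {heads:.3f}, "
          f"proportion of repeated faces {repeats:.3f}")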
In speaking of the effect of the average in thus diminishing the irregularities which present themselves in the details, the attention of the student must be prominently directed to the point, that it is not the absolute but the relative irregularities which thus tend to diminish without limit. This idea will be familiar enough to the mathematician, but to others it may require some reflection in order to grasp it clearly. The absolute divergences and irregularities, so far from diminishing, show a disposition to increase, and this (it may be) without limit, though their relative importance shows a corresponding disposition to diminish without limit. Thus in the case of tossing a penny, if we take a few throws, say ten, it is decidedly unlikely that there should be a difference of six between the numbers of heads and tails; that is, that 7 there should be as many as eight heads and therefore as few as two tails, or vice versâ. But take a thousand throws, and it becomes in turn exceedingly likely that there should be as much as, or more than, a difference of six between the respective numbers. On the other hand the proportion of heads to tails in the case of the thousand throws will be very much nearer to unity, in most cases, than when we only took ten. In other words, the longer a game of chance continues the larger are the spells and runs of luck in themselves, but the less their relative proportions to the whole amounts involved.
When discussing how the average reduces the irregularities found in the details, students must pay close attention to the fact that it is not the absolute but the relative irregularities that tend to decrease without limit. This concept might be familiar to mathematicians, but others may need some time to fully understand it. The absolute differences and irregularities, rather than diminishing, actually seem to increase, potentially without limit, although their relative significance seems to decrease indefinitely. For example, when tossing a penny, if we make a few throws, say ten, it's quite unlikely that there will be a difference of six between the number of heads and tails; in other words, it's improbable to get as many as eight heads and only two tails, or vice versa. However, with a thousand throws, it's much more likely to see a difference of six or more between the respective counts. On the other hand, the proportion of heads to tails in the case of the thousand throws will typically be much closer to equal than when we only did ten. In other words, the longer a game of chance goes on, the larger the streaks of luck may be, but their relative proportions compared to the overall totals will be smaller.
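The contrast between absolute and relative irregularity can also be checked numerically. The sketch below (illustrative only; the number of simulated games is arbitrary) estimates, for games of 10 and of 1000 tosses, the chance of a head-tail difference of six or more, the average size of that difference, and the average departure of the proportion of heads from one half.

import random

random.seed(3)
trials = 5000                                   # arbitrary number of simulated games

for n in (10, 1000):
    big_gap = 0                                 # games with |heads - tails| >= 6
    gap_total = 0                               # accumulates |heads - tails|
    for _ in range(trials):
        heads = sum(random.randint(0, 1) for _ in range(n))
        gap = abs(2 * heads - n)                # |heads - tails|
        gap_total += gap
        if gap >= 6:
            big_gap += 1
    mean_gap = gap_total / trials
    print(f"n = {n:4d}: P(|heads - tails| >= 6) ~ {big_gap / trials:.3f}, "
          f"mean |heads - tails| ~ {mean_gap:.1f}, "
          f"mean |heads/n - 1/2| ~ {mean_gap / (2 * n):.4f}")

On this model the absolute gap grows roughly as the square root of the number of tosses, while its proportion to the whole number of tosses shrinks, which is exactly the distinction drawn above.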
§ 4. In speaking as above of events or things as to the details of which we know little or nothing, it is not of course implied that our ignorance about them is complete and universal, or, what comes to the same thing, that irregularity may be observed in all their qualities. All that is meant is that there are some qualities or marks in them, the existence of which we are not able to predicate with certainty in the individuals. With regard to all their other qualities there may be the utmost uniformity, and consequently the most complete certainty. The irregularity in the length of human life is notorious, but no one doubts the existence of such organs as a heart and brains in any person whom he happens to meet. And even in the qualities in which the irregularity is observed, there are often, indeed generally, positive limits within which it will be found to be confined. No person, for instance, can calculate what may be the length of any particular life, but we feel perfectly certain that it will not stretch out to 150 years. The irregularity of the individual instances is only shown in certain respects, as e.g. the length of the life, and even in these respects it has its limits. The same remark will apply to most of the other examples with which we shall be concerned. The disorder in fact is not 8 universal and unlimited, it only prevails in certain directions and up to certain points.
§ 4. When talking about events or things we know little or nothing about, it doesn’t mean that we are completely ignorant of them, or that there’s irregularity in all their qualities. What we mean is that there are some qualities or characteristics we can’t confidently attribute to individuals. For all their other qualities, there could be complete uniformity and certainty. The variability in human lifespan is well-known, but no one doubts that every person has organs like a heart and brain. Even in the qualities where we see irregularity, there are often clear limits within which this variability is restricted. For example, while we can’t predict the length of any specific life, we are sure it won’t last for 150 years. The irregularity in individual cases only appears in certain ways, like lifespan, and even there, it has its limits. This observation holds true for most of the other examples we will discuss. The disorder is not universal and infinite; it only occurs in specific ways and to certain extents.
§ 5. In speaking as above of a series, it will hardly be necessary to point out that we do not imply that the objects themselves which compose the series must occur successively in time; the series may be formed simply by their coming in succession under our notice, which as a matter of fact they may do in any order whatever. A register of mortality, for instance, may be made up of deaths which took place simultaneously or successively; or, we might if we pleased arrange the deaths in an order quite distinct from either of these. This is entirely a matter of indifference; in all these cases the series, for any purposes which we need take into account, may be regarded as being of precisely the same description. The objects, be it remembered, are given to us in nature; the order under which we view them is our own private arrangement. This is mentioned here simply by way of caution, the meaning of this assertion will become more plain in the sequel.
§ 5. When discussing a series as mentioned above, it’s important to highlight that we’re not suggesting the items that make up the series need to happen one after the other in time; the series can be formed just by them appearing in succession to us, which they can do in any order. For example, a record of deaths might include incidents that occurred at the same time or at different times; or, if we wanted, we could arrange the deaths in a completely different order. This isn’t significant; in all these cases, the series, for any purposes we consider, can be seen as exactly the same type. It’s worth noting that the objects are given to us in nature; the order in which we observe them is our own personal arrangement. This is mentioned here simply as a caution, and the meaning of this statement will become clearer later on.
I am aware that the word ‘series’ in the application with which it is used here is liable to some misconstruction, but I cannot find any better word, or indeed any as suitable in all respects. As remarked above, the events need not necessarily have occurred in a regular sequence of time, though they often will have done so. In many cases (for instance, the throws of a penny or a die) they really do occur in succession; in other cases (for instance, the heights of men, or the duration of their lives), whatever may have been the order of their actual occurrence, they are commonly brought under our notice in succession by being arranged in statistical tables. In all cases alike our processes of inference involve the necessity of examining one after another of the members which compose the group, or at least of being prepared to do 9 this, if we are to be in a position to justify our inferences. The force of these considerations will come out in the course of the investigation in Chapter VI.
I know that the word ‘series’ in this context can be misunderstood, but I can't find a better term that fits as well. As mentioned earlier, the events don’t have to occur in a specific order over time, although they often do. In some cases (like flipping a coin or rolling a die), they really do happen one after another; in other cases (like the heights of people or their lifespans), regardless of the actual order in which they occurred, we typically look at them in succession when we arrange the data in tables. In every situation, our reasoning processes require us to examine each member of the group one by one, or at least be ready to do so, if we want to support our conclusions. The importance of these points will be highlighted in Chapter VI.
The late Leslie Ellis[2] has expressed what seems to me a substantially similar view in terms of genus and species, instead of speaking of a series. He says, “When individual cases are considered, we have no conviction that the ratios of frequency of occurrence depend on the circumstances common to all the trials. On the contrary, we recognize in the determining circumstances of their occurrence an extraneous element, an element, that is, extraneous to the idea of the genus and species. Contingency and limitation come in (so to speak) together; and both alike disappear when we consider the genus in its entirety, or (which is the same thing) in what may be called an ideal and practically impossible realization of all which it potentially contains. If this be granted, it seems to follow that the fundamental principle of the Theory of Probabilities may be regarded as included in the following statement,—The conception of a genus implies that of numerical relations among the species subordinated to it.” As remarked above, this appears a substantially similar doctrine to that explained in this chapter, but I do not think that the terms genus and species are by any means so well fitted to bring out the conception of a tendency or limit as when we speak of a series, and I therefore much prefer the latter expression.
The late Leslie Ellis[2] has shared what seems to be a very similar perspective about genus and species instead of referring to a series. He states, “When we look at individual cases, we can't be sure that the frequency ratios depend on the common circumstances of all the trials. On the contrary, we see that there’s an external factor at play in the circumstances of their occurrence, which is unrelated to the idea of genus and species. Contingency and limitation come into play together, and both disappear when we view the genus as a whole, or in what could be called an ideal and practically impossible realization of everything it potentially includes. If we accept this, it follows that the fundamental principle of the Theory of Probabilities can be expressed as: the concept of a genus includes the idea of numerical relationships among the subordinate species.” As noted earlier, this seems to reflect a similar doctrine to what’s explained in this chapter, but I don’t believe that the terms genus and species convey the idea of a tendency or limit as effectively as when we discuss a series, so I much prefer the latter term.
§ 6. The reader will now have in his mind the conception of a series or group of things or events, about the individuals of which we know but little, at least in certain respects, whilst we find a continually increasing uniformity as we take larger numbers under our notice. This is definite 10 enough to point out tolerably clearly the kind of things with which we have to deal, but it is not sufficiently definite for purposes of accurate thought. We must therefore attempt a somewhat closer analysis.
§ 6. The reader should now have a concept in mind of a series or group of things or events, about which we know very little, at least in some ways, while we observe a steadily increasing uniformity as we consider larger numbers. This is clear enough to indicate the types of things we are dealing with, but it isn’t detailed enough for precise thinking. Therefore, we need to try a more thorough analysis.
There are certain phrases so commonly adopted as to have become part of the technical vocabulary of the subject, such as an ‘event’ and the ‘way in which it can happen.’ Thus the act of throwing a penny would be called an event, and the fact of its giving head or tail would be called the way in which the event happened. If we were discussing tables of mortality, the former term would denote the mere fact of death, the latter the age at which it occurred, or the way in which it was brought about, or whatever else in it might be the particular circumstance under discussion. This phraseology is very convenient, and will often be made use of in this work, but without explanation it may lead to confusion. For in many cases the way in which the event happens is of such great relative importance, that according as it happens in one way or another the event would have a different name; in other words, it would not in the two cases be nominally the same event. The phrase therefore will have to be considerably stretched before it will conveniently cover all the cases to which we may have to apply it. If for instance we were contemplating a series of human beings, male and female, it would sound odd to call their humanity an event, and their sex the way in which the event happened.
There are certain phrases so commonly used that they've become part of the technical vocabulary of the subject, like ‘event’ and ‘the way in which it can happen.’ For example, throwing a penny would be called an event, and whether it lands on heads or tails would be referred to as the way the event happened. If we were discussing tables of mortality, the first term would refer to the simple fact of death, while the second would indicate the age at which it occurred, the manner in which it happened, or any other specific detail relevant to the discussion. This language is very useful and will often be used in this work, but without clarification, it might cause confusion. In many cases, the way an event happens is so significantly different that depending on how it occurs, it might deserve a different name; in other words, it wouldn't be considered the same event in both instances. Therefore, the phrase will need to be stretched quite a bit before it can adequately cover all the situations we may need to apply it to. For instance, if we were looking at a group of human beings, both male and female, it would seem strange to refer to their humanity as an event and their sex as the way the event happened.
If we recur however to any of the classes of objects already referred to, we may see our path towards obtaining a more accurate conception of what we want. It will easily be seen that in every one of them there is a mixture of similarity and dissimilarity; there is a series of events which have a certain number of features or attributes in 11 common,—without this they would not be classed together. But there is also a distinction existing amongst them; a certain number of other attributes are to be found in some and are not to be found in others. In other words, the individuals which form the series are compound, each being made up of a collection of things or attributes; some of these things exist in all the members of the series, others are found in some only. So far there is nothing peculiar to the science of Probability; that in which the distinctive characteristic consists is this;—that the occasional attributes, as distinguished from the permanent, are found on an extended examination to tend to exist in a certain definite proportion of the whole number of cases. We cannot tell in any given instance whether they will be found or not, but as we go on examining more cases we find a growing uniformity. We find that the proportion of instances in which they are found to instances in which they are wanting, is gradually subject to less and less comparative variation, and approaches continually towards some apparently fixed value.
If we go back to any of the types of objects we mentioned earlier, we can see a clearer way to understand what we really want. It's easy to notice that each of these has a mix of similarities and differences; there’s a range of events that share certain features or characteristics in common—without that, they wouldn’t be grouped together. However, there are also distinctions among them; some attributes appear in some and not in others. In other words, the individuals in the group are complex, each made up of a collection of things or attributes; some of these are present in all members of the group, while others are only found in some. Up to this point, there’s nothing unique to the science of Probability; what makes it distinctive is that the occasional attributes, as opposed to the permanent ones, tend to occur in a specific proportion of all the cases when examined more closely. We can’t predict in any one instance whether they will appear, but as we examine more cases, we notice a growing consistency. We see that the ratio of instances where they appear to those where they don’t gradually shows less and less variation, and it continuously approaches a seemingly fixed value.
The above is the most comprehensive form of description; as a matter of fact the groups will in many cases take a far simpler form; they may appear, e.g. simply as a succession of things of the same kind, say human beings, with or without an occasional attribute, say that of being left-handed. We are using the word attribute, of course, in its widest sense, intending it to include every distinctive feature that can be observed in a thing, from essential qualities down to the merest accidents of time and place.
The above is the most complete way to describe things; often, groups will take on a much simpler form. They might show up, for instance, just as a series of similar items, like human beings, with or without an occasional attribute, such as being left-handed. We're using the word "attribute" here in its broadest sense, meaning it includes every distinctive feature that can be observed in something, from essential qualities down to the smallest accidents of time and place.
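The picture sketched in these sections, a series whose members are bundles of permanent and occasional attributes, can be modelled directly, as in the following illustrative code (added by the editor; the 10 per cent figure for left-handedness is simply an assumed value). The permanent attribute is shared by every member, while the proportion bearing the occasional attribute steadies as more members are examined.

import random

random.seed(6)

def person():
    """One member of the series: a small bundle of attributes."""
    return {
        "human": True,                          # permanent attribute, common to every member
        "left_handed": random.random() < 0.10,  # occasional attribute (assumed 10 per cent)
    }

series = [person() for _ in range(100000)]

for n in (100, 1000, 10000, 100000):
    proportion = sum(p["left_handed"] for p in series[:n]) / n
    print(f"first {n:6d} members: proportion left-handed = {proportion:.3f}")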
§ 7. On examining our series, therefore, we shall find that it may best be conceived, not necessarily as a succession of events happening in different ways, but as a succession of groups of things. These groups, on being analysed, are found in every case to be resolvable into collections of substances 12 and attributes. That which gives its unity to the succession of groups is the fact of some of these substances or attributes being common to the whole succession; that which gives their distinction to the groups in the succession is the fact of some of them containing only a portion of these substances and attributes, the other portion or portions being occasionally absent. So understood, our phraseology may be made to embrace every class of things of which Probability can take account.
§ 7. When we examine our series, we can think of it, not just as a sequence of events occurring in different ways, but as a sequence of groups of things. When we break these groups down, we find that they can always be divided into collections of substances and attributes. The unity of the sequence of groups comes from the fact that some of these substances or attributes are common across the entire sequence; what distinguishes the groups is that some of them include only a portion of these substances and attributes, with other parts being occasionally missing. Understood this way, our terminology can cover every class of things that Probability can account for.
§ 8. It will be easily seen that the ordinary expression (viz. the ‘event,’ and the ‘way in which it happens’) may be included in the above. When the occasional attributes are unimportant the permanent ones are sufficient to fix and appropriate the name, the presence or absence of the others being simply denoted by some modification of the name or the addition of some predicate. We may therefore in all such cases speak of the collection of attributes as ‘the event,’—the same event essentially, that is—only saying that it (so as to preserve its nominal identity) happens in different ways in the different cases. When the occasional attributes however are important, or compose the majority, this way of speaking becomes less appropriate; language is somewhat strained by our implying that two extremely different assemblages are in reality the same event, with a difference only in its mode of happening. The phrase is however a very convenient one, and with this caution against its being misunderstood, it will frequently be made use of here.
§ 8. It's easy to see that the common phrase (specifically, the 'event' and 'how it occurs') can fit into the above discussion. When the occasional attributes aren't important, the permanent ones are enough to define and assign the name, with the presence or absence of the others simply indicated by a modification of the name or the addition of a description. So, in all such instances, we can refer to the collection of attributes as 'the event'—the same event essentially—just noting that it (to maintain its named identity) occurs in different ways across different cases. However, when the occasional attributes are significant or make up the majority, this way of speaking becomes less fitting; language feels a bit strained when we suggest that two vastly different collections are actually the same event, differing only in how they happen. Nevertheless, this phrasing is quite useful, and with this caution against misunderstanding, it will often be used here.
§ 9. A series of the above-mentioned kind is, I apprehend, the ultimate basis upon which all the rules of Probability must be based. It is essential to a clear comprehension of the subject to have carried our analysis up to this point, but any attempt at further analysis into the intimate nature of the events composing the series, is not 13 required. It is altogether unnecessary, for instance, to form any opinion upon the questions discussed in metaphysics as to the independent existence of substances. We have discovered, on examination, a series composed of groups of substances and attributes, or of attributes alone. At such a series we stop, and thence investigate our rules of inference; into what these substances or attributes would themselves be ultimately analysed, if taken in hand by the psychologist or metaphysician, it is no business of ours to enquire here.
§ 9. I believe that a series like the one mentioned above is the fundamental foundation for all the rules of Probability. To fully understand the topic, it's important to have completed our analysis up to this point, but any further analysis of the underlying nature of the events in the series isn't necessary. For example, there's no need to weigh in on metaphysical debates about the independent existence of substances. We have found, through examination, a series made up of groups of substances and attributes, or just attributes alone. We focus on that series and then explore our rules of inference; what those substances or attributes could ultimately be broken down into, should a psychologist or metaphysician take it up, is not something we need to investigate here.
§ 10. The stage then which we have now reached is that of having discovered a quantity of things (they prove on analysis to be groups of things) which are capable of being classified together, and are best regarded as constituting a series. The distinctive peculiarity of this series is our finding in it an order, gradually emerging out of disorder, and showing in time a marked and unmistakeable uniformity.
§ 10. The stage we’ve reached now is that we’ve found a number of things (which, upon closer inspection, turn out to be collections of things) that can be grouped together and are best seen as forming a series. What makes this series unique is that we notice an order gradually appearing out of chaos, exhibiting a clear and undeniable consistency over time.
The impression which may possibly be derived from the description of such a series, and which the reader will probably already entertain if he have studied Probability before, is that the gradual evolution of this order is indefinite, and its approach therefore to perfection unlimited. And many of the examples commonly selected certainly tend to confirm such an impression. But in reference to the theory of the subject it is, I am convinced, an error, and one liable to lead to much confusion.
The impression that might come from the description of this series, and which the reader likely already has if they've studied Probability before, is that the gradual evolution of this order is indefinite, making its progress toward perfection unlimited. Many of the commonly chosen examples do seem to support this idea. However, regarding the theory of the subject, I believe it's a mistake and one that can cause a lot of confusion.
The lines which have been prefixed as a motto to this work, “So careful of the type she seems, so careless of the single life,” are soon after corrected by the assertion that the type itself, if we regard it for a long time, changes, and then vanishes and is succeeded by others. So in Probability; that uniformity which is found in the long run, and which presents so great a contrast to the individual 14 disorder, though durable is not everlasting. Keep on watching it long enough, and it will be found almost invariably to fluctuate, and in time may prove as utterly irreducible to rule, and therefore as incapable of prediction, as the individual cases themselves. The full bearing of this fact upon the theory of the subject, and upon certain common modes of calculation connected with it, will appear more fully in some of the following chapters; at present we will confine ourselves to very briefly establishing and illustrating it.
The lines that serve as a motto for this work, “So careful of the type she seems, so careless of the single life,” are soon followed by the claim that the type itself, if observed for a long time, changes, then fades away and is replaced by others. This is similar in Probability; the consistency that appears over time, which starkly contrasts with individual disorder, though long-lasting, isn’t permanent. If you keep watching it long enough, you’ll find that it almost always fluctuates and eventually might be just as unpredictable as the individual cases themselves. The full implications of this fact on the theory of the subject, and on certain common calculation methods related to it, will be more fully explored in some upcoming chapters; for now, we will briefly establish and illustrate it.
Let us take, for example, the average duration of life. This, provided our data are sufficiently extensive, is known to be tolerably regular and uniform. This fact has been already indicated in the preceding sections, and is a truth indeed of which the popular mind has a tolerably clear grasp at the present day. But a very little consideration will show that there may be a superior as well as an inferior limit to the extent within which this uniformity can be observed; in other words whilst we may fall into error by taking too few instances we may also fail in our aim, though in a very different way and from quite different reasons, by taking too many. At the present time the average duration of life in England may be, say, forty years; but a century ago it was decidedly less; several centuries ago it was presumably very much less; whilst if we possessed statistics referring to a still earlier population of the country we should probably find that there has been since that time a still more marked improvement. What may be the future tendency no man can say for certain. It may be, and we hope that it will be the case, that owing to sanitary and other improvements, the duration of life will go on increasing steadily; it is at least conceivable, though doubtless incredible, that it should do so without limit. On the other hand, and with much more likelihood, this duration might gradually tend towards some fixed 15 length. Or, again, it is perfectly possible that future generations might prefer a short and a merry life, and therefore reduce their average longevity. The duration of life cannot but depend to some extent upon the general tastes, habits and employments of the people, that is upon the ideal which they consciously or unconsciously set before them, and he would be a rash man who should undertake to predict what this ideal will be some centuries hence. All that it is here necessary however to indicate is, that this particular uniformity (as we have hitherto called it, in order to mark its relative character) has varied, and, under the influence of future eddies in opinion and practice, may vary still; and this to any extent, and with any degree of irregularity. To borrow a term from Astronomy, we find our uniformity subject to what might be called an irregular secular variation.
Let’s consider the average lifespan, for instance. If our data is broad enough, it’s known to be fairly regular and consistent. This point has already been made in the earlier sections, and it’s something that most people today understand quite well. However, a little thought will reveal that there could be both an upper and a lower limit to the range within which this consistency can be observed; in other words, while we might make mistakes by analyzing too few examples, we could also miss the mark, albeit in a different way and for totally different reasons, by analyzing too many. Right now, the average lifespan in England might be around forty years; but a century ago, it was definitely lower; several centuries ago, it was likely much lower; and if we had statistics from an even earlier population, we would probably find that significant improvements have happened since then. No one can say for sure what the future will hold. It’s possible, and we hope it will be, that due to health and other advancements, life expectancy will continue to rise steadily; it’s even conceivable—though probably unbelievable—that it could do so indefinitely. On the other hand, and much more likely, this lifespan could gradually settle into some fixed average. Or, it’s entirely possible that future generations might choose a short and fun life, resulting in a decrease in their average lifespan. Life expectancy can’t help but depend to some degree on the general preferences, habits, and jobs of the people, essentially on the ideals they consciously or unconsciously aspire to, and anyone would be foolish to predict what that ideal will look like centuries from now. What we need to highlight here is that this particular consistency (as we’ve referred to it, to indicate its relative nature) has changed, and, influenced by future shifts in opinion and practice, may continue to change; this could occur to any extent and with any degree of irregularity. To borrow a term from astronomy, we find that our consistency is subject to what could be called an irregular secular variation.
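The 'irregular secular variation' described here can be mimicked by letting the quantity behind the statistics drift. In the toy model below (all numbers are hypothetical and chosen only for illustration), the expected length of life wanders from one period to the next, so even averages taken over fifty thousand simulated lives differ noticeably between periods.

import random

random.seed(4)

expected_life = 30.0                                    # hypothetical starting expectation, in years
for period in range(1, 6):
    expected_life += random.uniform(-2.0, 8.0)          # slow, irregular drift in the underlying type
    lives = [max(0.0, random.gauss(expected_life, 12.0)) for _ in range(50000)]
    average = sum(lives) / len(lives)
    print(f"period {period}: average of 50,000 simulated lives = {average:.1f} years")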
§ 11. The above is a fair typical instance. If we had taken a less simple feature than the length of life, or one less closely connected with what may be called by comparison the great permanent uniformities of nature, we should have found the peculiarity under notice exhibited in a far more striking degree. The deaths from small-pox, for example, or the instances of duelling or accusations of witchcraft, if examined during a few successive decades, might have shown a very tolerable degree of uniformity. But these uniformities have risen possibly from zero; after various and very great fluctuations seem tending towards zero again, at least in this century; and may, for anything we know, undergo still more rapid fluctuations in future. Now these examples must be regarded as being only extreme ones, and not such very extreme ones, of what is the almost universal rule in nature. I shall endeavour to show that even the few apparent exceptions, such as the proportions between male and female births, &c., may not be, and probably in reality 16 are not, strictly speaking, exceptions. A type, that is, which shall be in the fullest sense of the words, persistent and invariable is scarcely to be found in nature. The full import of this conclusion will be seen in future chapters. Attention is only directed here to the important inference that, although statistics are notoriously of no value unless they are in sufficient numbers, yet it does not follow but that in certain cases we may have too many of them. If they are made too extensive, they may again fall short, at least for any particular time or place, of their greatest attainable accuracy.
§ 11. The example above is a typical case. If we had chosen a more complex factor than life expectancy, or one more linked to what can be considered the significant, consistent patterns of nature, we would have seen the peculiarity in question displayed even more dramatically. For instance, the deaths from smallpox, instances of dueling, or accusations of witchcraft, if analyzed over a few decades, might show a fairly good level of consistency. However, these consistencies may have started from zero and, after various large fluctuations, seem to be returning towards zero again, at least in this century; and for all we know, they might undergo even more rapid changes in the future. These examples should be seen as just extreme cases, albeit not excessively extreme ones, of what is almost a universal principle in nature. I will attempt to demonstrate that even the few apparent exceptions, such as the ratios of male to female births, etc., may not actually be exceptions at all, and probably are not, strictly speaking. A type that is completely persistent and unchanging is hard to find in nature. The full significance of this conclusion will be clarified in later chapters. Here, I just want to emphasize the important takeaway that while statistics are famously unreliable without sufficient numbers, it doesn’t mean that in some cases we can’t have too many. If they become overly extensive, they might fall short of achieving their maximum accuracy for any specific time or place.
§ 12. These natural uniformities then are found at length to be subject to fluctuation. Now contrast with them any of the uniformities afforded by games of chance; these latter seem to show no trace of secular fluctuation, however long we may continue our examination of them. Criticisms will be offered, in the course of the following chapters, upon some of the common attempts to prove à priori that there must be this fixity in the uniformity in question, but of its existence there can scarcely be much doubt. Pence give heads and tails about equally often now, as they did when they were first tossed, and as we believe they will continue to do, so long as the present order of things continues. The fixity of these uniformities may not be as absolute as is commonly supposed, but no amount of experience which we need take into account is likely in any appreciable degree to interfere with them. Hence the obvious contrast, that, whereas natural uniformities at length fluctuate, those afforded by games of chance seem fixed for ever.
§ 12. These natural uniformities are eventually subject to change. Now, if we compare them to the uniformities found in games of chance, the latter appear to have no signs of long-term fluctuation, no matter how long we study them. In the upcoming chapters, we will critique some common attempts to prove à priori that there must be this stability in the uniformity we’re discussing, but there's hardly any doubt about its existence. Coins land on heads and tails about equally often now, just as they did when they were first tossed, and we believe they will continue to do so, as long as everything remains the same. The stability of these uniformities may not be as absolute as commonly thought, but no amount of experience we consider is likely to significantly affect them. Thus, we see a clear contrast: while natural uniformities eventually fluctuate, those seen in games of chance seem to be fixed forever.
§ 13. Here then are series apparently of two different kinds. They are alike in their initial irregularity, alike in their subsequent regularity; it is in what we may term their ultimate form that they begin to diverge from each other. The one tends without any irregular variation towards a fixed numerical proportion in its uniformity; in the other the uniformity is found at last to fluctuate, and to fluctuate, it may be, in a manner utterly irreducible to rule.
§ 13. Here we have what appear to be two different types of series. They both start off irregularly and then become regular, but they begin to differ in their final forms. One series moves steadily, without any irregular variation, towards a stable numerical ratio in its uniformity; in the other, the uniformity ultimately fluctuates, possibly in a way that can't be reduced to any rule.
As this chapter is intended to be little more than explanatory and illustrative of the foundations of the science, the remark may be made here (for which subsequent justification will be offered) that it is in the case of series of the former kind only that we are able to make anything which can be interpreted into strict scientific inferences. We shall be able however in a general way to see the kind and extent of error that would be committed if, in any example, we were to substitute an imaginary series of the former kind for any actual series of the latter kind which experience may present to us. The two series are of course to be as alike as possible in all respects, except that the variable uniformity has been replaced by a fixed one. The difference then between them would not appear in the initial stage, for in that stage the distinctive characteristics of the series of Probability are not apparent; all is there irregularity, and it would be as impossible to show that they were alike as that they were different; we can only say generally that each shows the same kind of irregularity. Nor would it appear in the next subsequent stage, for the real variability of the uniformity has not for some time scope to make itself perceived. It would only be in what we have called the ultimate stage, when we suppose the series to extend for a very long time, that the difference would begin to make itself felt.[3] The proportion of persons, for example, who die each year at the age of six months is, when the numbers examined are on a small scale, utterly irregular; it becomes however regular when the numbers examined are on a larger scale; but if we continued our observation for a very great length of time, or over a very great extent of country, we should find this regularity itself changing in an irregular way. The substitution just mentioned is really equivalent to saying, Let us assume that the regularity is fixed and permanent. It is making a hypothesis which may not be altogether consistent with fact, but which is forced upon us for the purpose of securing precision of statement and definition.
As this chapter aims to be mostly explanatory and illustrative of the foundations of the science, it's important to mention (with future justification to follow) that we can only draw strict scientific conclusions from series of the first kind. However, we will generally be able to see the kind and extent of error that would occur if, in any example, we were to substitute an imaginary series of the first kind for any actual series of the second kind that experience might present to us. The two series should be as similar as possible in every way, except that the variable uniformity has been replaced with a fixed one. The differences between them won't be apparent at first since the distinctive characteristics of probability series aren't clear at that stage; everything appears irregular, and it would be impossible to demonstrate that they are alike or different; we can only say that they exhibit the same type of irregularity. The distinction also wouldn't show in the next stage, since the real variability of the uniformity takes time to make itself known. It is only in what we call the ultimate stage, when we assume that the series extends over a long time, that the difference begins to emerge.[3] For instance, the proportion of people who die each year at six months old is utterly irregular when examined on a small scale; however, it becomes regular on a larger scale. Yet, if we continued our observation for a very long time or across a vast area, we would find that even this regularity starts to change in an irregular manner. The substitution mentioned really means we're assuming that the regularity is fixed and permanent. This is a hypothesis that may not completely align with reality, but it's necessary for achieving precision in statement and definition.
§ 14. The full meaning and bearing of such a substitution will only become apparent in some of the subsequent chapters, but it may be pointed out at once that it is in this way only that we can with perfect strictness introduce the notion of a ‘limit’ into our account of the matter, at any rate in reference to many of the applications of the subject to purely statistical enquiries. We say that a certain proportion begins to prevail among the events in the long run; but then on looking closer at the facts we find that we have to express ourselves hypothetically, and to say that if present circumstances remain as they are, the long run will show its characteristics without disturbance. When, as is often the case, we know nothing accurately of the circumstances by which the succession of events is brought about, but have strong reasons to suspect that these circumstances are likely to undergo some change, there is really nothing else to be done. We can only introduce the conception of a limit, towards which the numbers are tending, by assuming that these circumstances do not change; in other words, by substituting a series with a fixed uniformity for the actual one with the varying uniformity.[4]
§ 14. The full meaning and significance of such a substitution will only become clear in some of the following chapters, but it's worth noting right away that this is the only way we can strictly introduce the idea of a ‘limit’ into our discussion, especially concerning many applications of the subject to purely statistical inquiries. We say that a certain proportion starts to dominate among events over time; however, when we examine the facts more closely, we realize we have to speak hypothetically and say that if the current circumstances stay the same, the long run will reveal its characteristics without interruption. Often, we don't clearly understand the conditions that affect how events unfold, but we have strong reasons to think these conditions are likely to change; in such cases, there’s really nothing else we can do. We can only introduce the idea of a limit that the numbers are approaching by assuming these circumstances remain unchanged; in other words, by substituting a series with a fixed uniformity for the actual one with a varying uniformity.[4]
§ 15. If the reader will study the following example, one well known to mathematicians under the name of the Petersburg[5] problem, he will find that it serves to illustrate several of the considerations mentioned in this chapter. It serves especially to bring out the facts that the series with which we are concerned must be regarded as indefinitely extensive in point of number or duration; and that when so regarded certain series, but certain series only (the one in question being a case in point), take advantage of the indefinite range to keep on producing individuals in it whose deviation from the previous average has no finite limit whatever. When rightly viewed it is a very simple problem, but it has given rise, at one time or another, to a good deal of confusion and perplexity.
§ 15. If the reader studies the following example, well known among mathematicians as the Petersburg problem, they will see that it illustrates several points mentioned in this chapter. It specifically highlights that the series we are discussing should be considered as indefinitely large in terms of either numbers or duration; and that when viewed this way, certain series—only certain series (the one in question being a prime example)—exploit the limitless range to keep producing outcomes that have no finite limit in deviation from the previous average. When understood correctly, it’s a very straightforward problem, but it has caused quite a bit of confusion and uncertainty at various times.
The problem may be stated thus:—a penny is tossed up; if it gives head I receive one pound; if heads twice running two pounds; if heads three times running four pounds, and so on; the amount to be received doubling every time that a fresh head succeeds. That is, I am to go on as long as it continues to give a succession of heads, to regard this succession as a ‘turn’ or set, and then take another turn, and so on; and for each such turn I am to receive a payment; the occurrence of tail being understood to yield nothing, in fact being omitted from our consideration. However many times head may be given in succession, the number of pounds I may claim is found by raising two to a power one less than that number of times. Here then is a series formed by a succession of throws. We will assume,—what many persons will consider to admit of demonstration, and what certainly experience confirms within considerable limits,—that the rarity of these ‘runs’ of the same face is in direct proportion to the amount I receive for them when they do occur. In other words, if we regard only the occasions on which I receive payments, we shall find that every other time I get one pound, once in four times I get two pounds, once in eight times four pounds, and so on without any end. The question is then asked, what ought I to pay for this privilege? At the risk of a slight anticipation of the results of a subsequent chapter, we may assume that this is equivalent to asking, what amount paid each time would on the average leave me neither winner nor loser? In other words, what is the average amount I should receive on the above terms? Theory pronounces that I ought to give an infinite sum: that is, no finite sum, however great, would be an adequate equivalent. And this is really quite intelligible. There is a series of indefinite length before me, and the longer I continue to work it the richer are my returns, and this without any limit whatever. It is true that the very rich hauls are extremely rare, but still they do come, and when they come they make it up by their greater richness. On every occasion on which people have devoted themselves to the pursuit in question, they made acquaintance, of course, with but a limited portion of this series; but the series on which we base our calculation is unlimited; and the inferences usually drawn as to the sum which ought in the long run to be paid for the privilege in question are in perfect accordance with this supposition.
The problem can be stated like this: a penny is tossed up; if it lands on heads, I receive one pound; if it lands on heads twice in a row, I get two pounds; if it lands on heads three times in a row, I receive four pounds, and so on—the amount I receive doubles every time there’s a new head. In other words, I keep going as long as heads keep coming up, treat this streak as a ‘set,’ then take another set, and so forth. For each set, I get a payout; a tail is understood to mean I get nothing, and we simply ignore it. However many times heads appear in a row, the number of pounds I can claim is calculated by raising two to a power one less than that number of times. Here we have a series created by a sequence of tosses. We will assume—what many might consider provable, and what certainly experience supports up to a certain point—that the infrequency of these 'streaks' of the same side is directly related to the amount I receive when they do happen. In other words, if we focus only on the times I receive payouts, we will find that every other time I get one pound, once every four times I get two pounds, once every eight times I get four pounds, and this pattern continues indefinitely. The question is then asked, how much should I pay for this privilege? Although this might anticipate the outcomes of a later chapter slightly, we can assume this is equivalent to asking, what amount paid each time would, on average, leave me neither winning nor losing? In other words, what is the average amount I should receive based on the terms above? Theory suggests that I should pay an infinite sum: no finite sum, no matter how large, would be an adequate equivalent. And this makes perfect sense. There is a series of unlimited length ahead of me, and the longer I work it, the richer my returns, without any limit. It’s true that the really big payouts are extremely rare, but they do happen, and when they do, they compensate for their rarity with greater value. Whenever people have pursued this, they encountered only a limited part of this series; however, the series we base our calculations on is boundless, and the conclusions typically drawn about the amount that ought to be paid for this privilege align perfectly with this assumption.
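For readers who like to experiment, here is a minimal sketch (not part of Venn's text) of the game just described, written in Python. It assumes a fair coin and, exactly as the statement above does, simply ignores turns that open with a tail; the names petersburg_turn and average_receipt are our own. Running it illustrates the point being made: the sample average keeps drifting upward as the rare long runs of heads appear, so no finite number of turns ever settles on a stable 'fair price'.

```python
import random

def petersburg_turn(rng):
    """Play one 'turn': count heads until the first tail appears.

    A turn that opens with a tail pays nothing and, following the text,
    is left out of account; a run of k heads pays 2**(k - 1) pounds.
    """
    heads = 0
    while rng.random() < 0.5:   # heads with probability 1/2
        heads += 1
    return 0 if heads == 0 else 2 ** (heads - 1)

def average_receipt(paying_turns, rng):
    """Mean payout over a given number of paying turns."""
    total, seen = 0, 0
    while seen < paying_turns:
        payout = petersburg_turn(rng)
        if payout:              # skip turns that opened with a tail
            total += payout
            seen += 1
    return total / paying_turns

rng = random.Random(1888)
for n in (100, 10_000, 1_000_000):
    print(n, average_receipt(n, rng))
# The sample average keeps creeping upward as rare long runs of heads
# turn up; no finite sample fixes a stable 'fair price'.
```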
The common form of objection is given in the reply, that so far from paying an infinite sum, no sensible man would give anything approaching to £50 for such a chance. Probably not, because no man would see enough of the series to make it worth his while. What most persons form their practical opinion upon, is such small portions of the series as they have actually seen or can reasonably expect. Now in any such portion, say one which embraces 100 turns, the longest succession of heads would not amount on the average to more than seven or eight. This is observed, but it is forgotten that the formula which produced these, would, if it had greater scope, keep on producing better and better ones without any limit. Hence it arises that some persons are perplexed, because the conduct they would adopt, in reference to the curtailed portion of the series which they are practically likely to meet with, does not find its justification in inferences which are necessarily based upon the series in the completeness of its infinitude.
The common objection is that instead of paying an infinite amount, no reasonable person would spend even close to £50 for such a chance. Probably not, because no one would observe enough of the series to make it worthwhile. Most people base their practical opinion on the small parts of the series they have actually seen or can realistically expect. Now, in any such part, say one that includes 100 turns, the longest run of heads would typically only be about seven or eight. This is noted, but it’s often forgotten that the formula producing these outcomes would, if given a larger scope, continue to generate longer runs without any limit. This leads to confusion for some individuals, as the approach they would take regarding the limited portion of the series they are likely to encounter doesn’t hold up when compared to conclusions that must be based on the series in its complete infinity.
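The claim that a hundred turns rarely show a run of more than seven or eight heads can also be checked by simulation. The sketch below (ours again, with an arbitrary seed and hypothetical helper name) plays many sessions of 100 paying turns and records the longest run of heads seen in each; the typical figure comes out close to the seven or eight quoted above.

```python
import random

def longest_run_in_session(turns, rng):
    """Longest run of heads observed across a session of paying turns."""
    best = 0
    seen = 0
    while seen < turns:
        heads = 0
        while rng.random() < 0.5:   # heads with probability 1/2
            heads += 1
        if heads:                   # a paying turn
            seen += 1
            best = max(best, heads)
    return best

rng = random.Random(3)
sessions = [longest_run_in_session(100, rng) for _ in range(1000)]
print(sum(sessions) / len(sessions))   # typically about seven, sometimes eight
```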
§ 16. This will be more clearly seen by considering the various possibilities, and the scope required in order to exhaust them, when we confine ourselves to a limited number of throws. Begin with three. This yields eight equally likely possibilities. In four of these cases the thrower starts with tail and therefore loses: in two he gains a single point (i.e. £1); in one he gains two points, and in one he gains four points. Hence his total gain being eight pounds achieved in four different contingencies, his average gain would be two pounds.
§ 16. This will be clearer if we look at the different possibilities and the variety needed to cover them when we limit ourselves to a few throws. Let’s start with three. This results in eight equally likely outcomes. In four of these cases, the thrower starts with tails and therefore loses; in two, he gains a single point (i.e. £1); in one, he gains two points, and in one, he gains four points. Therefore, his total gain is eight pounds achieved in four different scenarios, so his average gain would be two pounds.
Now suppose he be allowed to go as far as n throws, so that we have to contemplate 2^n possibilities. All of these have to be taken into account if we wish to consider what happens on the average. It will readily be seen that, when all the possible cases have been reckoned once, his total gain will be (reckoned in pounds) 2^(n−2) × (n + 1).
Now let’s say he can toss the coin up to n times, which means we need to think about 2^n possibilities. We have to consider all of these if we want to understand what happens on average. It will be clear that once we’ve counted all the possible outcomes, his total gain will be (measured in pounds) 2^(n−2) × (n + 1).
This being spread over 2^(n−1) different occasions of gain his average gain will be 1/2(n + 1).
This being spread over 2^(n−1) different occasions of gain, his average gain will be 1/2(n + 1).
Now when we are referring to averages it must be remembered that the minimum number of different occurrences necessary in order to justify the average is that which enables each of them to present itself once. A man proposes to stop short at a succession of ten heads. Well and good. We tell him that his average gain will be £5. 10s. 0d.: but we also impress upon him that in order to justify this statement he must commence to toss at least 1024 times, for in no less number can all the contingencies of gain and loss be exhibited and balanced. If he proposes to reach an average gain of £20, he will require to be prepared to go up to 39 throws. To justify this payment he must commence to throw 2^39 times, i.e. about a million million times. Not before he has accomplished this will he be in a position to prove to any sceptic that this is the true average value of a ‘turn’ extending to 39 successive tosses.
Now when we talk about averages, it's important to remember that the minimum number of different outcomes needed to support the average is the amount that allows each of them to show up at least once. Suppose someone decides to stop after a run of ten heads. That's fine. We tell him that his average gain will be £5 10s. (£5.50). But we also emphasize that to support this claim, he needs to start tossing at least 1024 times, since that’s the minimum number needed to cover all the possible wins and losses. If he aims for an average gain of £20, he should be prepared to go up to 39 throws. To justify this payout, he must be ready to toss 2^39 times, which is about a million million times. Only after he does this will he be able to convince any skeptics that this is the true average value of a ‘turn’ extending to 39 consecutive tosses.
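The arithmetic of § 16 can be verified by brute force for small n: enumerate all 2^n equally likely sequences of tosses, score each by its opening run of heads, and compare with the formulas above. The sketch below is ours (the helper gain is a hypothetical name, not Venn's notation); it confirms the total gain 2^(n−2) × (n + 1), the 2^(n−1) paying occasions, and hence the average 1/2(n + 1), and prints the n = 10 and n = 39 figures quoted in the text.

```python
from itertools import product

def gain(seq):
    """Payout for one sequence of tosses (True = head).

    The payout depends only on the opening run of heads: k leading
    heads pay 2**(k - 1) pounds; an opening tail pays nothing.
    """
    k = 0
    for toss in seq:
        if not toss:
            break
        k += 1
    return 0 if k == 0 else 2 ** (k - 1)

for n in (3, 4, 5, 10):
    seqs = list(product([True, False], repeat=n))   # the 2**n possibilities
    total = sum(gain(s) for s in seqs)
    paying = sum(1 for s in seqs if gain(s) > 0)
    assert total == 2 ** (n - 2) * (n + 1)          # total gain in pounds
    assert paying == 2 ** (n - 1)                   # occasions of gain
    print(n, total / paying)                        # 1/2(n + 1): 2.0, 2.5, 3.0, 5.5

# The figures quoted in the text:
print(0.5 * (10 + 1), 2 ** 10)   # £5.50 average, needing 1024 tosses of scope
print(0.5 * (39 + 1), 2 ** 39)   # £20 average, needing 2**39 tosses of scope
```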
Of course if he elects to toss to all eternity we must adopt the line of explanation which alone is possible where questions of infinity in respect of number and magnitude are involved. We cannot tell him to pay down ‘an infinite sum,’ for this has no strict meaning. But we tell him that, however much he may consent to pay each time runs of heads occur, he will attain at last a stage in which he will have won back his total payments by his total receipts. However large n may be, if he perseveres in trying 2^n times he may have a true average receipt of 1/2 (n + 1) pounds, and if he continues long enough onwards he will have it.
Of course, if he chooses to gamble endlessly, we need to stick to the only explanation that makes sense when dealing with infinity in terms of numbers and amounts. We can’t tell him to pay an 'infinite sum,' since that doesn't have a clear meaning. Instead, we explain that no matter how much he agrees to pay each time a run of heads occurs, he will eventually reach a point where his total winnings equal his total payments. No matter how large n gets, if he keeps trying 2^n times, he may achieve a true average receipt of 1/2 (n + 1) pounds, and if he continues long enough, he will get there.
The problem will recur for consideration in a future chapter.
The issue will come up again in a later chapter.
1 The following statistics will give a fair idea of the wide range of experience over which such regularity is found to exist: “As illustrations of equal amounts of fluctuation from totally dissimilar causes, take the deaths in the West district of London in seven years (fluctuation 13.66), and offences against the person (fluctuation 13.61); or deaths from apoplexy (fluctuation 5.54), and offences against property, without violence (fluctuation 5.48); or students registered at the College of Surgeons (fluctuation 1.85), and the number of pounds of manufactured tobacco taken for home consumption (fluctuation 1.89); or out-door paupers (fluctuation 3.45) and tonnage of British vessels entered in ballast (fluctuation 3.43), &c.” [Extracted from a paper in the Journal of the Statistical Society, by Mr Guy, March, 1858; the ‘fluctuation’ here given is a measure of the amount of irregularity, that is of departure from the average, estimated in a way which will be described hereafter.]
1 The following statistics will provide a clear idea of the broad range of experience where such regularity is observed: “For examples of equal levels of fluctuation from completely different causes, consider the deaths in the West district of London over seven years (fluctuation 13.66), and offenses against the person (fluctuation 13.61); or deaths from apoplexy (fluctuation 5.54), and offenses against property without violence (fluctuation 5.48); or students registered at the College of Surgeons (fluctuation 1.85), and the amount of manufactured tobacco consumed at home (fluctuation 1.89); or outdoor paupers (fluctuation 3.45) and the tonnage of British vessels entering in ballast (fluctuation 3.43), etc.” [Extracted from a paper in the Journal of the Statistical Society, by Mr Guy, March, 1858; the ‘fluctuation’ provided here is a measure of the amount of irregularity, meaning the deviation from the average, estimated in a way that will be explained later.]
2 Transactions of the Cambridge Philosophical Society, Vol. IX. p. 605. Reprinted in the collected edition of his writings, p. 50.
2 Transactions of the Cambridge Philosophical Society, Vol. IX. p. 605. Reprinted in the collected edition of his writings, p. 50.
3 We might express it thus:—a few instances are not sufficient to display a law at all; a considerable number will suffice to display it; but it takes a very great number to establish that a change is taking place in the law.
3 We could say this: a few examples don't really show a law; a good number can illustrate it; but it takes a really large number to prove that a change is happening in the law.
4 The mathematician may illustrate the nature of this substitution by the analogies of the ‘circle of curvature’ in geometry, and the ‘instantaneous ellipse’ in astronomy. In the cases in which these conceptions are made use of we have a phenomenon which is continuously varying and also changing its rate of variation. We take it at some given moment, suppose its rate at that moment to be fixed, and then complete its career on that supposition.
4 The mathematician can explain this substitution using the concepts of the ‘circle of curvature’ in geometry and the ‘instantaneous ellipse’ in astronomy. In these cases, we deal with a phenomenon that is constantly changing and also altering its rate of change. We observe it at a specific moment, assume its rate at that moment is constant, and then continue its progression based on that assumption.
5 So called from its first mathematical treatment appearing in the Commentarii of the Petersburg Academy; a variety of notices upon it will be found in Mr Todhunter's History of the Theory of Probability.
5 It's named after its first mathematical discussion that appeared in the Commentarii of the Petersburg Academy; various mentions of it can be found in Mr. Todhunter's History of the Theory of Probability.
CHAPTER 2.
FURTHER DISCUSSION UPON THE NATURE OF THE SERIES MENTIONED IN THE LAST CHAPTER.
§ 1. In the course of the last chapter the nature of a particular kind of series, that namely, which must be considered to constitute the basis of the science of Probability, has received a sufficiently general explanation for the preliminary purpose of introduction. One might indeed say more than this; for the characteristics which were there pointed out are really sufficient in themselves to give a fair general idea of the nature of Probability, and of the sort of problems with which it deals. But in the concluding paragraphs an indication was given that the series of this kind, as they actually occur in nature or as the results of more or less artificial production, are seldom or never found to occur in such a simple form as might possibly be expected from what had previously been said; but that they are almost always seen to be associated together in groups after a somewhat complicated fashion. A fuller discussion of this topic must now be undertaken.
§ 1. In the last chapter, we explored the nature of a specific type of series that forms the foundation of Probability theory. The explanation provided was broad enough for an introductory purpose. One might even go further; the characteristics discussed are adequate to give a clear overall understanding of Probability and the kinds of problems it addresses. However, it was noted in the concluding paragraphs that these series, as they appear in nature or result from more or less artificial processes, are rarely found in the straightforward form one might expect based on previous discussions. Instead, they are almost always seen grouped together in a more complex manner. A more detailed examination of this topic needs to be addressed now.
We will take for examination an instance of a kind with which the investigations of Quetelet will have served to familiarize some readers. Suppose that we measure the heights of a great many adult men in any town or country. These heights will of course lie between certain extremes in each direction, and if we continue to accumulate our measures it will be found that they tend to lie continuously between these extremes; that is to say, that under those circumstances no intermediate height will be found to be permanently unrepresented in such a collection of measurements. Now suppose these heights to be marshalled in the order of their magnitude. What we always find is something of the following kind;—about the middle point between the extremes, a large number of the results will be found crowded together: a little on each side of this point there will still be an excess, but not to so great an extent; and so on, in some diminishing scale of proportion, until as we get towards the extreme results the numbers thin off and become relatively exceedingly small.
We will examine an example that the research of Quetelet will have helped some readers become familiar with. Let’s say we measure the heights of a large number of adult men in any town or country. These heights will naturally fall between certain extremes on either end, and as we keep collecting our measurements, we’ll find that they tend to fall continuously between these extremes. This means that, in this case, no height in between will be permanently absent from our collection of measurements. Now, let’s arrange these heights in order from shortest to tallest. What we usually find is something like the following: around the midpoint between the extremes, there will be a significant number of results clustered together; slightly on each side of this midpoint, there will still be more heights, but not as many; and this pattern continues, tapering off until we reach the extremes, where the numbers become relatively very small.
The point to which attention is here directed is not the mere fact that the numbers thus tend to diminish from the middle in each direction, but, as will be more fully explained directly, the law according to which this progressive diminution takes place. The word ‘law’ is here used in its mathematical sense, to express the formula connecting together the two elements in question, namely, the height itself, and the relative number that are found of that height. We shall have to enquire whether one of these elements is a function of the other, and, if so, what function.
The focus here isn’t just on the fact that the numbers tend to decrease from the middle in both directions, but, as will be explained shortly, on the law that describes how this gradual decrease happens. The term ‘law’ is used in its mathematical context to refer to the formula linking the two elements involved: the height itself and the number of instances at that height. We will need to investigate whether one of these elements is a function of the other, and if it is, what kind of function it is.
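The question left open here is answered in later chapters by the ‘binomial or exponential’ law referred to in § 7 below. As a purely illustrative sketch (the mean of 170 cm and the spread of 6.5 cm are invented figures, not Quetelet's data), the short Python fragment below shows the kind of formula that is meant: the relative number found at each height falls off exponentially with the square of its distance from the mean.

```python
import math

def relative_frequency(height_cm, mean_cm=170.0, spread_cm=6.5):
    """Relative number of men near a given height under the exponential
    law of error (the 'binomial or exponential' law mentioned later).

    mean_cm and spread_cm are illustrative figures only.
    """
    z = (height_cm - mean_cm) / spread_cm
    return math.exp(-0.5 * z * z)

for h in range(150, 191, 5):
    bar = "#" * round(40 * relative_frequency(h))
    print(f"{h} cm  {bar}")
# The crowding about the middle and the thinning off towards the
# extremes follow directly from the assumed formula.
```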
§ 2. After what was said in the last chapter, it need hardly be insisted upon that the interest and significance of such investigations as these are almost entirely dependent upon the statistics being very extensive. In one or other of Quetelet's works on Social Physics[1] will be found a selection of measurements of almost every element which the physical frame of man can furnish:—his height, his weight, the muscular power of various limbs, the dimensions of almost every part and organ, and so on. Some of the most extensive of these express the heights of 25,000 Federal soldiers from the Army of the Potomac, and the circumferences of the chests of 5738 Scotch militia men taken many years ago. Those who wish to consult a large repertory of such statistics cannot be referred to any better sources than to these and other works by the same author.[2]
§ 2. Building on what was discussed in the last chapter, it’s clear that the importance and relevance of these investigations heavily rely on the data being extensive. In one of Quetelet's works on Social Physics[1], you can find a collection of measurements for almost every aspect of the human body: his height, weight, the strength of different limbs, the sizes of nearly every part and organ, and more. Some of the most comprehensive data includes the heights of 25,000 Federal soldiers from the Army of the Potomac and the chest measurements of 5,738 Scottish militia men taken many years ago. Those who want to explore a broad range of such statistics can't find better sources than these and other works by the same author.[2]
Interesting and valuable, however, as are Quetelet's statistical investigations (and much of the importance now deservedly attached to such enquiries is, perhaps, owing more to his efforts than to those of any other person), I cannot but feel convinced that there is much in what he has written upon the subject which is erroneous and confusing as regards the foundations of the science of Probability, and the philosophical questions which it involves. These errors are not by any means confined to him, but for various reasons they will be better discussed in the form of a criticism of his explicit or implicit expression of them, than in any more independent way.
Interesting and valuable as Quetelet's statistical studies are (and much of the significance currently attached to such inquiries is likely due more to his efforts than to anyone else's), I can’t help but feel that there is a lot in what he has written on the subject that is wrong and confusing regarding the foundations of Probability and the philosophical questions it raises. These mistakes aren’t limited to him, but for various reasons, it’s better to discuss them by critiquing his explicit or implicit expressions rather than in a more independent manner.
§ 3. In the first place then, he always, or almost always, assumes that there can be but one and the same law of arrangement for the results of our observations, measurements, and so on, in these statistical enquiries. That is, he assumes that whenever we get a group of such magnitudes clustering about a mean, and growing less frequent as they depart from that mean, we shall find that this diminution of frequency takes place according to one invariable law, whatever may be the nature of these magnitudes, and whatever the process by which they may have been obtained.
§ 3. First of all, he always, or almost always, assumes that there can only be one consistent law governing the arrangement of the results from our observations, measurements, and so on in these statistical studies. This means he believes that whenever we have a group of values clustering around a mean and becoming less frequent as they deviate from that mean, we will find that this decrease in frequency follows one unchanging law, no matter what the nature of these values is or how they were obtained.
That such a uniformity as this should prevail amongst many and various classes of phenomena would probably seem surprising in any case. But the full significance of such a fact as this (if indeed it were a fact) only becomes apparent when attention is directed to the profound distinctions in the nature and origin of the phenomena which are thus supposed to be harmonized by being brought under one comprehensive principle. This will be better appreciated if we take a brief glance at some of the principal classes into which the things with which Probability is chiefly concerned may be divided. These are of a three-fold kind.
That such uniformity should exist among many different types of phenomena would likely seem surprising in any situation. However, the true importance of this fact (if it is indeed a fact) only becomes clear when we focus on the deep differences in the nature and origin of the phenomena that are thought to be unified under one broad principle. This will be better understood if we take a quick look at some of the main categories of things that Probability mainly deals with. These can be classified into three types.
§ 4. In the first place there are the various combinations, and runs of luck, afforded by games of chance. Suppose a handful, consisting of ten coins, were tossed up a great many times in succession, and the results were tabulated. What we should obtain would be something of the following kind. In a certain proportion of cases, and these the most numerous of all, we should find that we got five heads and five tails; in a somewhat less proportion of cases we should have, as equally frequent results, four heads six tails, and four tails six heads; and so on in a continually diminishing proportion until at length we came down, in a very small relative number of cases, to nine heads one tail, and nine tails one head; whilst the least frequent results possible would be those which gave all heads or all tails.[3] Here the statistical elements under consideration are, as regards their origin at any rate, optional or brought about by human choice. They would, therefore, be commonly described as being mainly artificial, but their results ultimately altogether a matter of chance.
§ 4. First, there are the different combinations and outcomes that come from games of chance. Imagine tossing a handful of ten coins many times in a row and recording the results. What we would get might look something like this: In a certain number of cases, which would be the most common, we'd find five heads and five tails; in a somewhat smaller number of cases, we would see equally frequent outcomes of four heads and six tails, and four tails and six heads; and so on, with the proportions continually decreasing until we reach, in a very small number of cases, nine heads and one tail, and nine tails and one head; while the least frequent outcomes would be getting all heads or all tails.[3] Here, the statistical elements we’re looking at, at least in terms of their origin, are optional or influenced by human choice. Therefore, they are usually described as mainly artificial, but their outcomes are ultimately a matter of chance.
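The proportions described here can be written down exactly rather than estimated by repeated tossing. The short Python sketch below (ours, not part of the text) counts, for each possible number of heads, how many of the 2^10 equally likely arrangements of ten coins produce it.

```python
from math import comb

# Exact chances for the number of heads when ten coins are tossed:
# comb(10, k) of the 2**10 equally likely outcomes show exactly k heads.
total = 2 ** 10
for k in range(11):
    ways = comb(10, k)
    print(f"{k:2d} heads: {ways:4d} / {total}  {'#' * round(60 * ways / total)}")
# Five heads and five tails is the commonest result (252 cases in 1024);
# all heads or all tails each occur in only 1 case out of 1024.
```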
Again, in the second place, we might take the accurate measurements—i.e. the actual magnitudes themselves,—of a great many natural objects, belonging to the same genus or class; such as the cases, already referred to, of the heights, or other characteristics of the inhabitants of any district. Here human volition or intervention of any kind seem to have little or nothing to do with the matter. It is optional with us to collect the measures, but the things measured are quite outside our control. They would therefore be commonly described as being altogether the production of nature, and it would not be supposed that in strictness chance had anything whatever to do with the matter.
Once again, in the second point, we could take precise measurements—i.e. the actual sizes themselves—of many natural objects that belong to the same genus or class; like the examples already mentioned regarding the heights or other characteristics of the inhabitants in any area. Here, human choice or intervention seems to have very little, if anything, to do with it. It's up to us to gather the measurements, but the items being measured are completely beyond our control. They would typically be described as entirely the result of nature, and it wouldn’t be assumed that pure chance had any role in the outcome.
In the third place, the result at which we are aiming may be some fixed magnitude, one and the same in each of our successive attempts, so that if our measurements were rigidly accurate we should merely obtain the same result repeated over and over again. But since all our methods of attaining our aims are practically subject to innumerable imperfections, the results actually obtained will depart more or less, in almost every case, from the real and fixed value which we are trying to secure. They will be sometimes more wide of the mark, sometimes less so, the worse attempts being of course the less frequent. If a man aims at a target he will seldom or never hit it precisely in the centre, but his good shots will be more[4] numerous than his bad ones. Here again, then, we have a series of magnitudes (i.e. the deflections of the shots from the point aimed at) clustering about a mean, but produced in a very different way from those of the last two cases. In this instance the elements would be commonly regarded as only partially the results of human volition, and chance therefore as being only a co-agent in the effects produced. With these must be classed what may be called estimates, as distinguished from measurements. By the latter are generally understood the results of a certain amount of mechanism or manipulation; by the former we may understand those cases in which the magnitude in question is determined by direct observation or introspection. The interest and importance of this class, so far as scientific principles are concerned, dates mainly from the investigations of Fechner. Its chief field is naturally to be found amongst psychological data.
In the third place, the outcome we're aiming for could be some fixed value, the same one in each of our successive attempts. So, if our measurements were perfectly accurate, we would just get the same result repeatedly. But since all our methods for reaching our goals are practically subject to countless imperfections, the results we actually get will differ from the real and fixed value we're trying to achieve. Sometimes they'll miss the mark by a lot and other times by less, with the worst attempts being, of course, the least frequent. If a person aims at a target, they will rarely hit it precisely in the center, but their good shots will be more numerous[4] than their bad ones. Again, we see a series of values (i.e. the deviations of the shots from the target) clustering around an average, but these are produced very differently from those in the last two cases. In this instance, the elements might be seen as partially the results of human intention, with chance being just a co-agent in the outcomes. Also, we can classify what might be called estimates, as distinct from measurements. Measurements generally refer to the results from a certain amount of mechanical process or manipulation, while estimates relate to cases where the value in question is determined through direct observation or introspection. The significance of this class, particularly in terms of scientific principles, primarily stems from the research of Fechner. Its main arena is naturally centered around psychological data.
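The picture of each shot's error as the joint effect of many small independent causes can be imitated in a few lines. In the sketch below (ours; the fifty causes and their size are arbitrary choices, not a model of real marksmanship), each deflection is the sum of fifty small random influences; tallying the results shows the shots crowding about the mark, with wide misses becoming rapidly rarer.

```python
import random

def deflection(rng, causes=50, size=0.2):
    """One shot's deflection from the point aimed at, modelled (purely
    for illustration) as the sum of many small independent causes."""
    return sum(rng.uniform(-size, size) for _ in range(causes))

rng = random.Random(7)
shots = [deflection(rng) for _ in range(10_000)]

# Tally the shots into bands about the mark: small errors far outnumber
# large ones, and very wide misses are vanishingly rare.
bands = {}
for d in shots:
    bands[round(d)] = bands.get(round(d), 0) + 1
for band in sorted(bands):
    print(f"{band:+d}: {bands[band]}")
```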
Other classes of things, besides those alluded to above, might readily be given. These however are the classes about which the most extensive statistics are obtainable, or to which the most practical importance and interest are attached. The profound distinctions which separate their origin and character are obvious. If they all really did display precisely the same law of variation it would be a most remarkable fact, pointing doubtless to some deep-seated identity underlying the various ways, apparently so widely distinct, in which they had been brought about. The questions now to be discussed are: Is it the case, with any considerable degree of rigour, that only one law of distribution does really prevail? and, in so far as this is so, how does it come to pass?
Other categories of things, apart from those mentioned earlier, could easily be provided. However, these are the categories for which the most extensive statistics are available, or that have the most practical significance and interest. The clear differences that separate their origins and characteristics are apparent. If they really did all follow exactly the same pattern of variation, it would be an astonishing fact, likely indicating some deep-rooted similarity underlying the various methods, which seem so distinct, in which they were created. The questions to be explored now are: Is it true, with any significant level of certainty, that only one distribution law actually exists? And, to the extent that this is the case, how does that happen?
§ 5. In support of an affirmative answer to the former of these two questions, several different kinds of proof are, or might be, offered.
§ 5. To support a yes answer to the first of these two questions, various types of evidence are, or could be, provided.
(I.) For one plan we may make a direct appeal to experience, by collecting sets of statistics and observing what is their law of distribution. As remarked above, this has been done in a great variety of cases, and in some instances to a very considerable extent, by Quetelet and others. His researches have made it abundantly convincing that many classes of things and processes, differing widely in their nature and origin, do nevertheless appear to conform with a considerable degree of accuracy to one and the same[5] law. At least this is made plain for the more central values, for those that is which are situated most nearly about the mean. With regard to the extreme values there is, on the other hand, some difficulty. For instance in the arrangements of the heights of a number of men, these extremes are rather a stumbling-block; indeed it has been proposed to reject them from both ends of the scale on the plea that they are monstrosities, the fact being that their relative numbers do not seem to be by any means those which theory would assign.[6] Such a plan of rejection is however quite unauthorized, for these dwarfs and giants are born into the world like their more normally sized brethren, and have precisely as much right as any others to be included in the formulæ we draw up.
(I.) One way we can directly appeal to experience is by collecting sets of statistics and examining how they are distributed. As mentioned earlier, this has been done in a variety of cases, and to a significant extent, by Quetelet and others. His research has clearly shown that many classes of things and processes, which vary widely in nature and origin, still tend to adhere to one common law with a notable degree of accuracy. This is especially evident for the more central values, those that are closest to the mean. However, there is some difficulty regarding the extreme values. For example, when looking at the heights of a group of men, these extremes can be problematic; in fact, it has been suggested to disregard them from both ends of the scale on the grounds that they are outliers, since their relative numbers don’t align with what theory would predict. Such a plan of exclusion is, however, completely unjustified, as these short and tall individuals enter the world just like their more average-sized counterparts and have just as much right to be included in the formulas we create.
Besides the instance of the heights of men, other classes of observations of a somewhat similar character have been already referred to as collected and arranged by Quetelet. From the nature of the case, however, there are not many appropriate ones at hand; for when our object is, not to illustrate a law which can be otherwise proved, but to obtain actual direct proof of it, the collection of observations and measurements ought to be made upon such a large scale as to deter any but the most persevering computers from undergoing the requisite labour. Some of the remarks made in the course of the note on the opposite page will serve to illustrate the difficulties which would lie in the way of such a mode of proof.
Besides the example of men's heights, other types of observations with a similar nature have already been mentioned as collected and organized by Quetelet. However, there aren't many suitable ones available; because when our goal is not to illustrate a law that can be proven in other ways, but to obtain direct proof of it, the collection of observations and measurements should be done on such a large scale that only the most dedicated researchers would be willing to undertake the necessary effort. Some of the comments made in the note on the opposite page will help illustrate the challenges that would arise with this method of proof.
We are speaking here, it must be understood, only of symmetrical curves: if there is asymmetry, i.e. if the Law of Error is different on different sides of the mean,—a comparatively very small number of observations would suffice to detect the fact. But, granted symmetry and rapid decrease of frequency on each side of the mean, we could generally select some one species of the exponential curve which should pretty closely represent our statistics in the neighbourhood of the mean. That is, where the statistics are numerous we could secure agreement; and where we could not secure agreement the statistics would be comparatively so scarce that we should have to continue the observations for a very long time in order to prove the disagreement.
We are only discussing symmetrical curves here: if there's any asymmetry, meaning if the Law of Error varies on different sides of the mean, then a relatively small number of observations would be enough to notice that. However, assuming symmetry and a quick drop in frequency on both sides of the mean, we could usually choose one specific type of exponential curve that would closely match our statistics around the mean. In other words, where the statistics are plentiful, we could ensure consistency; and where we couldn't ensure consistency, the statistics would be so limited that we'd need to keep observing for a long time to demonstrate the inconsistency.
§ 6. Allowing the various statistics such credit as they deserve, for their extent, appropriateness, accuracy and so on, the general conclusion which will on the whole be drawn by almost every one who takes the trouble to consult them, is that they do, in large part, conform approximately to one type or law, at any rate for all except the extreme values. So much as this must be fully admitted. But that they do not, indeed we may say that they cannot, always do so in the case of the extreme values, will become obvious on a little consideration. In some of the classes of things to which the law is supposed to apply, for example, the successions of heads and tails in the throws of a penny, there is no limit to the magnitude of the fluctuations which may and will occur. Postulate as long a succession of heads or of tails as we please, and if we could only live and toss long enough for it we should succeed in getting it at length. In other cases, including many of the applications of Probability to natural phenomena, there can hardly fail to be such limits. Deviations exceeding a certain range may not be merely improbable, that is of very rare occurrence, but they may often from the nature of the case be actually impossible. And even when they are not actually impossible it may frequently appear on examination that they are only rendered possible by the occasional introduction of agencies which are not supposed to be available in the production of the more ordinary or intermediate values. When, for instance, we are making observations with any kind of instrument, the nature of its construction may put an absolute limit upon the possible amount of error. And even if there be not an absolute limit under all kinds of usage it may nevertheless be the case that there is one under fair and proper usage; it being the case that only when the instrument is designedly or carelessly tampered with will any new causes of divergence be introduced which were not confined within the old limits.
§ 6. Giving credit to the various statistics for their scope, relevance, accuracy, and so on, the overall conclusion that most people will reach after reviewing them is that they largely conform to a common type or law, at least for all values except the extreme ones. This much must be fully accepted. However, it should be noted that they do not, and we might say they cannot, always do so in the case of extreme values, which will become clear upon a bit of reflection. In some cases, such as the sequences of heads and tails from flipping a coin, there are no limits to the size of the fluctuations that can and will happen. Postulate as long a run of heads or of tails as we please: if we could only live and keep flipping long enough, we would eventually get it. In other cases, particularly many applications of Probability to natural phenomena, there are likely to be such limits. Deviations beyond a certain range may not just be unlikely, meaning that they occur very rarely, but they can often be practically impossible due to the nature of the situation. Moreover, even when they are not outright impossible, examination may reveal that they are only made possible by the occasional introduction of factors that are not typically considered in producing the more usual or intermediate values. For example, when we take measurements with any kind of instrument, its design can impose a strict limit on the potential amount of error. And even if there isn't an absolute limit in all usage scenarios, there may still be one under proper and fair usage; typically, only when the instrument is intentionally or carelessly manipulated will any new causes of error be introduced that were not part of the previous limits.
Suppose, for instance, that a man is firing at a mark. His worst shots must be supposed to be brought about by a combination of such causes as were acting, or prepared to act, in every other case; the extreme instance of what we may thus term ‘fair usage’ being when a number of distinct causes have happened to conspire together so as to tend in the same direction, instead of, as in the other cases, more or less neutralizing one another's work. But the aggregate effect of such causes may well be supposed to be limited. The man will not discharge his shot nearly at right angles to the true line of fire unless some entirely new cause comes in, as by some unusual circumstance having distracted his attention, or by his having had some spasmodic seizure. But influences of this kind were not supposed to have been available before; and even if they were we are taking a bold step in assuming that these occasional great disturbances are subject to the same kind of laws as are the aggregates of innumerable little ones.
Imagine, for example, a man aiming at a target. His worst shots can be thought of as being caused by a mix of factors that are at play, or ready to act, in every other situation; the extreme case of what we can call ‘fair usage’ occurs when several distinct factors happen to come together to work in the same direction, rather than, as in other situations, canceling each other out. However, the overall impact of such factors is likely to be limited. The man is unlikely to fire his shot at a nearly right angle to the true line of fire unless some completely new factor intervenes, like an unusual distraction or a sudden spasm. But these kinds of influences weren’t thought to be present before; and even if they were, we’re making a bold assumption by suggesting that these occasional major disturbances follow the same kind of rules as the combination of countless smaller ones.
We cannot indeed lay much stress upon an example of this last kind, as compared with those in which we can see for certain that there is a fixed limit to the range of error. It is therefore offered rather for illustration than for proof. The enormous, in fact inconceivable magnitude of the numbers expressive of the chance of very rare combinations, such as those in question, has such a bewildering effect upon the mind that one may be sometimes apt to confound the impossible with the higher degrees of the merely mathematically improbable.
We cannot place much weight on an example of this last kind, compared with those where we can see for certain that there is a fixed limit to the range of error. It is therefore offered for illustration rather than for proof. The enormous, in fact inconceivable, size of the numbers that express the chance of very rare combinations like these has such a bewildering effect on the mind that it is easy to confuse the impossible with the higher degrees of the merely mathematically improbable.
§ 7. At the time the first edition of this essay was composed writers on Statistics were, I think, still for the most part under the influence of Quetelet, and inclined to overvalue his authority on this particular subject: of late however attention has been repeatedly drawn to the necessity of taking account of other laws of arrangement than the binomial or exponential.
§ 7. When the first edition of this essay was written, I believe most writers on Statistics were still largely influenced by Quetelet and tended to place too much importance on his authority regarding this topic. Recently, however, there has been a growing emphasis on the need to take account of other laws of arrangement besides the binomial or exponential.
Mr Galton, for instance,—to whom every branch of the theory of statistics owes so much,—has insisted[7] that the “assumption which lies at the basis of the well-known law of ‘Frequency of Error’… is incorrect in many groups of vital and social phenomena…. For example, suppose we endeavour to match a tint; Fechner's law, in its approximative and simplest form of sensation = log stimulus, tells us that a series of tints, in which the quantities of white scattered on a black ground are as 1, 2, 4, 8, 16, 32, &c., will appear to the eye to be separated by equal intervals of tint. Therefore, in matching a grey that contains 8 portions of white, we are just as likely to err by selecting one that has 16 portions as one that has 4 portions. In the first case there would be an error in excess, of 8; in the second there would be an error, in deficiency, of 4. Therefore, an error of the same magnitude in excess or in deficiency is not equally probable.” The consequences of this assumption are worked out in a remarkable paper by Dr D. McAlister, to which allusion will have to be made again hereafter. All that concerns us here to point out is that when the results of statistics of this character are arranged graphically we do not get a curve which is symmetrical on both sides of a central axis.
Mr. Galton, for example—who has contributed so much to the theory of statistics—has emphasized that the “assumption underlying the well-known law of ‘Frequency of Error’ is incorrect in many groups of vital and social phenomena. For instance, if we try to match a color, Fechner's law, in its simplest and most approximate form of sensation = log stimulus, tells us that a series of colors where the amounts of white mixed into a black background are 1, 2, 4, 8, 16, 32, etc., will appear to the eye as if they are evenly spaced in color. Therefore, when matching a gray that has 8 portions of white, we are just as likely to make a mistake by choosing one that has 16 portions as we are by selecting one that has 4 portions. In the first scenario, the error would be an excess of 8; in the second, there would be a deficiency error of 4. Consequently, an error of the same size, whether excess or deficiency, is not equally probable.” The implications of this assumption are explored in a remarkable paper by Dr. D. McAlister, which will be referenced again later. What we need to point out here is that when the results of statistics like this are arranged graphically, we do not get a curve that is symmetrical on both sides of a central axis.
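To make Galton's tint example concrete for a modern reader, here is a small illustrative sketch in Python (not part of the original text; the number of error-steps and the number of attempts are arbitrary choices of ours). It treats each small matching error as an equal chance of one step up or down on the logarithmic Fechner scale, so a guess is equally likely to be doubled or halved at each step.

```python
import random
from collections import Counter

# Illustrative sketch of the Galton/McAlister situation: each small matching
# error is an equal chance of one step up or down on a logarithmic scale,
# i.e. the current value is equally likely to be doubled or halved.
random.seed(1)
true_value = 8          # portions of white in the grey being matched
steps = 4               # number of small logarithmic error-steps per attempt
trials = 20_000

results = Counter()
for _ in range(trials):
    value = true_value
    for _ in range(steps):
        value = value * 2 if random.random() < 0.5 else value / 2
    results[value] += 1

for value in sorted(results):
    print(f"{value:6g} portions: {results[value]} attempts")
```

The counts come out symmetrical on the logarithmic scale but markedly lop-sided in absolute terms, which is just the kind of skewed arrangement that McAlister's paper works out.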
§ 8. More recently, Mr F. Y. Edgeworth (in a report of a Committee of the British Association appointed to enquire into the variation of the monetary standard) has urged the same considerations in respect of prices of commodities. He gives a number of statistics “drawn from the prices of twelve commodities during the two periods 1782–1820, 1820–1865. The maximum and minimum entry for each series having been noted, it is found that the number of entries above the ‘middle point,’ half-way between the maximum and minimum,[8] is in every instance less than half the total number of entries in the series. In the twenty-four trials there is not a single exception to the rule, and in very few cases even an approach 35 to an exception. We may presume then that the curves are of the lop-sided character indicated by the accompanying diagram.” The same facts are also ascertained in respect to place variations as distinguished from time variations. To these may be added some statistics of my own, referring to the heights of the barometer taken at the same hour on more than 4000 successive days (v. Nature, Sept. 2, 1887). So far as these go they show a marked asymmetry of arrangement.
§ 8. Recently, Mr. F. Y. Edgeworth (in a report from a Committee of the British Association assigned to investigate changes in the monetary standard) has emphasized similar points regarding commodity prices. He presents various statistics based on the prices of twelve commodities during two periods: 1782–1820 and 1820–1865. After noting the highest and lowest entries for each series, it turns out that the number of entries above the 'middle point,' which is halfway between the maximum and minimum,[8] is consistently less than half of the total number of entries in the series. In the twenty-four trials, there hasn't been a single exception to this rule, and in very few cases is there even a near exception. We can therefore assume that the curves have the lopsided shape shown in the accompanying diagram. The same findings apply to place variations as distinct from time variations. Additionally, I can include some of my own statistics regarding the barometric pressure recorded at the same hour over more than 4000 consecutive days (see Nature, Sept. 2, 1887). As far as these results go, they indicate a significant asymmetry in the arrangement.
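Edgeworth's test is easy to state in code. The sketch below (with made-up, skewed price-like figures rather than his actual data) counts what fraction of a series lies above the point half-way between its maximum and minimum; for a lop-sided series the fraction falls well below one half.

```python
def fraction_above_middle(series):
    """Fraction of entries above the point half-way between max and min."""
    middle = (max(series) + min(series)) / 2
    return sum(x > middle for x in series) / len(series)

# Hypothetical skewed, price-like figures (not Edgeworth's actual data).
prices = [50, 52, 51, 55, 49, 53, 60, 72, 95, 54, 51, 50]
print(fraction_above_middle(prices))   # well below one half for a lop-sided series
```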
In fact it appears to me that this want of symmetry ought to be looked for in all cases in which the phenomena under measurement are of a ‘one-sided’ character; in the sense that they are measured on one side only of a certain fixed point from which their possibility is supposed to start. For not only is it impossible for them to fall below this point: long before they reach it the influence of its proximity is felt in enhancing the difficulty and importance of the same amount of absolute difference.
In fact, it seems to me that this lack of symmetry should be considered in all cases where the phenomena being measured are 'one-sided'; in the sense that they are measured only on one side of a certain fixed point from which their possibility is assumed to begin. Not only is it impossible for them to fall below this point, but long before they reach it, the effect of being close to this point is felt in increasing the difficulty and significance of the same absolute difference.
Look at a table of statures, for instance, with a mean value of 69 inches. A diminution of three feet (were this possible) is much more influential,—counts for much more, in every sense of the term,—than an addition of the same amount; for the former does not double the mean, while the latter more than halves it. Revert to an illustration. If a vast number of petty influencing circumstances of the kind already described were to act upon a swinging pendulum we should expect the deflections in each direction to display symmetry; but if they were to act upon a spring we should not expect such a result. Any phenomena of which the latter is the more appropriate illustration can hardly be expected to range themselves with symmetry about a mean.[9]
Look at a table of heights, for example, with an average value of 69 inches. A decrease of three feet (if that were possible) has a much bigger impact, and counts for much more in every sense of the term, than an increase of the same amount; for adding three feet does not even double the average, while subtracting three feet more than halves it. Let's return to an earlier illustration. If a lot of small influencing factors, like the ones we've already talked about, were to act on a swinging pendulum, we would expect the swings in both directions to be symmetrical; but if they were acting on a spring, we wouldn't expect that outcome. Any situations that are better illustrated by the latter can hardly be expected to arrange themselves symmetrically around an average.[9]
§ 9. (II.) The last remarks will suggest another kind of proof which might be offered to establish the invariable nature of the law of error. It is of a direct deductive kind, not appealing immediately to statistics, but involving an enquiry into the actual or assumed nature of the causes by which the events are brought about. Imagine that the event under consideration is brought to pass, in the first place, by some fixed cause, or group of fixed causes. If this comprised all the influencing circumstances the event would invariably happen in precisely the same way: there would be no errors or deflections whatever to be taken account of. But now suppose that there were also an enormous number of very small causes which tended to produce deflections; that these causes acted in entire independence of one another; and that each of the lot told as often, in the long run, in one direction as in the opposite. It is easy[10] to see, in a general way, what would follow from these assumptions. In a very few cases nearly all the causes would tell in the same direction; in other words, in a very few cases the deflection would be extreme. In a greater number of cases, however, it would only be the most part of them that would tell in one direction, whilst a few did what they could to counteract the rest; the result being a comparatively larger number of somewhat smaller deflections. So on, in increasing numbers, till we approach the middle point. Here we shall have a very large number of very small deflections: the cases in which the opposed influences just succeed in balancing one another, so that no error whatever is produced, being, though actually infrequent, relatively the most frequent of all.
§ 9. (II.) The last remarks suggest another type of proof that could be offered to establish the invariable nature of the law of error. This proof is direct and deductive, not immediately relying on statistics, but instead examining the actual or assumed nature of the causes that lead to the events. Imagine that the event we're discussing is caused, first and foremost, by some fixed cause or a group of fixed causes. If these were the only influencing factors, the event would always occur in exactly the same way: there would be no errors or variations to consider. Now, let's suppose there are also a vast number of very small causes that tend to create deviations; these causes operate completely independently of each other, and each one tells, in the long run, as often in one direction as in the opposite. It's easy to understand, in a general sense, what would result from these assumptions. In a very limited number of cases, nearly all the causes would tell in the same direction; in other words, there would be a few cases of extreme deviation. However, in a larger number of cases, most of the causes would act in one direction while a few would try to offset the others, resulting in a comparatively larger number of somewhat smaller deviations. This continues, increasing in numbers, until we reach the midpoint. Here, we'll find a very large number of very small deviations: the cases in which the opposing influences just balance one another, producing no error at all, being, though individually infrequent, relatively the most frequent of all.
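The argument of this section can be imitated directly. The following sketch (an illustration of ours, with an arbitrary number of causes and trials) adds up many tiny independent causes, each as likely to push the result one unit up as one unit down, and tallies the totals; the tallies cluster about zero and thin out rapidly towards the extremes.

```python
import random
from collections import Counter

# Many tiny independent causes, each equally likely to deflect the result
# one unit up or one unit down; we tally the total deflection.
random.seed(2)
n_causes = 20
trials = 50_000

totals = Counter()
for _ in range(trials):
    deflection = sum(random.choice((-1, 1)) for _ in range(n_causes))
    totals[deflection] += 1

for d in sorted(totals):
    print(f"deflection {d:+3d}: {'*' * (totals[d] // 200)}")
```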
Now if all deflections from a mean were brought about in the way just indicated (an indication which must suffice for the present) we should always have one and the same law of arrangement of frequency for these deflections or errors, viz. the exponential[11] law mentioned in § 5.
Now, if all deviations from an average happened in the way just described (which will have to do for now), we would always have the same pattern of frequency for these deviations or errors, namely the exponential[11] law mentioned in § 5.
§ 10. It may be readily admitted from what we know about the production of events that something resembling 38 these assumptions, and therefore something resembling the consequences which follow from them, is really secured in a very great number of cases. But although this may prevail approximately, it is in the highest degree improbable that it could ever be secured, even artificially, with anything approaching to rigid accuracy. For one thing, the causes of deflection will seldom or never be really independent of one another. Some of them will generally be of a kind such that the supposition that several are swaying in one direction, may affect the capacity of each to produce that full effect which it would have been capable of if it had been left to do its work alone. In the common example, for instance, of firing at a mark, so long as we consider the case of the tolerably good shots the effect of the wind (one of the causes of error) will be approximately the same whatever may be the precise direction of the bullet. But when a shot is considerably wide of the mark the wind can no longer be regarded as acting at right angles to the line of flight, and its effect in consequence will not be precisely the same as before. In other words, the causes here are not strictly independent, as they were assumed to be; and consequently the results to be attributed to each are not absolutely uninfluenced by those of the others. Doubtless the effect is trifling here, but I apprehend that if we were carefully to scrutinize the modes in which the several elements of the total cause conspire together, we should find that the assumption of absolute independence was hazardous, not to say unwarrantable, in a very great number of cases. These brief remarks upon the process by which the deflections are brought about must suffice for the present purpose, as the subject will receive a fuller investigation in the course of the next chapter.
§ 10. It can be easily accepted that, based on what we know about how events happen, something like these assumptions—and therefore something like the consequences that come from them—does actually hold true in a lot of cases. However, while this might be generally accurate, it's extremely unlikely that it could ever be achieved, even artificially, with anything close to exact precision. One reason is that the causes of variation are rarely or never truly independent of each other. Many of them will generally be of such a kind that, when several happen to be pulling in one direction, each may lose part of the full effect it would have produced had it been left to act on its own. For instance, in the common example of aiming at a target, as long as we consider reasonably good shots, the effect of the wind (one of the causes of error) will be roughly the same regardless of the bullet's exact direction. But when a shot is significantly off-target, the wind can no longer be seen as acting at a right angle to the path of the bullet, and its impact will not be exactly the same as before. In other words, the causes here are not strictly independent, as we had assumed; therefore, the results attributed to each aren't completely unaffected by the others. Certainly, the effect is minor in this case, but I suspect that if we were to closely examine how the different elements of the overall cause interact, we'd find that the assumption of complete independence is risky, if not unjustified, in many cases. These brief comments on how the variations occur will have to be enough for now, as the topic will be more thoroughly explored in the next chapter.
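The caveat about independence can also be illustrated numerically. In the sketch below (our own construction, with arbitrary numbers), each small cause sometimes simply copies a shared pull instead of acting on its own; the spread of the total error then comes out noticeably different from what the fully independent model of § 9 would give.

```python
import random
import statistics

random.seed(3)
n_causes = 20
trials = 20_000

def total_error(shared_weight):
    """Sum of n_causes unit causes; each copies a shared pull with the given probability."""
    shared_pull = random.choice((-1, 1))
    total = 0
    for _ in range(n_causes):
        if random.random() < shared_weight:
            total += shared_pull                 # cause follows the shared pull
        else:
            total += random.choice((-1, 1))      # cause acts on its own
    return total

independent = [total_error(0.0) for _ in range(trials)]
partly_shared = [total_error(0.5) for _ in range(trials)]
print("spread with independent causes:  ", round(statistics.pstdev(independent), 2))
print("spread with partly shared causes:", round(statistics.pstdev(partly_shared), 2))
```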
According, therefore, to the best consideration which can at the present stage be afforded to this subject, we may 39 draw a similar conclusion from this deductive line of argument as from the direct appeal to statistics. The same general result seems to be established; namely, that approximately, with sufficient accuracy for all practical purposes, we may say that an examination of the causes by which the deflections are generally brought about shows that they are mostly of such a character as would result in giving us the commonly accepted ‘Law of Error,’ as it is termed.[12] The two lines of enquiry, therefore, within the limits assigned, afford each other a decided mutual confirmation.
Based on the best analysis we can provide at this stage, we can reach a similar conclusion from this deductive reasoning as we can from the straightforward use of statistics. The same general outcome appears to be confirmed; namely, that approximately, with enough accuracy for practical purposes, we can state that an investigation into the causes of the deflections typically reveals that they mostly align with what is commonly known as the 'Law of Error.' The two lines of inquiry, therefore, within the defined limits, support each other significantly.
§ 11. (III.) There still remains a third, indirect and mathematical line of proof, which might be offered to establish the conclusion that the Law of Error is always one and the same. It may be maintained that the recognized and universal employment of one and the same method, that known to mathematicians and astronomers as the Method of Least Squares, in all manner of different cases with very satisfactory results, is compatible only with the supposition that the errors to which that method is applied must be grouped according to one invariable law. If all ‘laws of error’ were not of one and the same type, that is, if the relative frequency of large and small divergences (such as we have been speaking of) were not arranged according to one pattern, how could one method or rule equally suit them all?
§ 11. (III.) There’s also a third, indirect, and mathematical way to prove that the Law of Error is always the same. It can be argued that the widespread and consistent use of the same method, known to mathematicians and astronomers as the Method of Least Squares, in various cases with very good results, only makes sense if the errors this method is applied to follow a single, unchanging law. If all “laws of error” were not of the same kind, meaning if the relative frequency of large and small divergences (like we’ve been discussing) were not organized according to one pattern, how could one method or rule work for all of them?
In order to preserve a continuity of treatment, some notice must be taken of this enquiry here, though, as in the case of the last argument, any thorough discussion of the 40 subject is impossible at the present stage. For one thing, it would involve too much employment of mathematics, or at any rate of mathematical conceptions, to be suitable for the general plan of this treatise: I have accordingly devoted a special chapter to the consideration of it.
To maintain a consistent approach, we need to address this inquiry here, although, similar to the last point, a detailed discussion of the subject isn't feasible at this stage. For one thing, it would require too much mathematical work, or at least mathematical ideas, to fit the overall plan of this book: I've therefore dedicated a separate chapter to explore it.
The main reason, however, against discussing this argument here, is, that to do so would involve the anticipation of a totally different side of the science of Probability from that hitherto treated of. This must be especially insisted upon, as the neglect of it involves much confusion and some error. During these earlier chapters we have been entirely occupied with laying what may be called the physical foundations of Probability. We have done nothing else than establish, in one way or another, the existence of certain groups or arrangements of things which are found to present themselves in nature; we have endeavoured to explain how they come to pass, and we have illustrated their principal characteristics. But these are merely the foundations of Inference, we have not yet said a word upon the logical processes which are to be erected upon these foundations. We have not therefore entered yet upon the logic of chance.
The main reason for not discussing this argument here is that doing so would involve anticipating a completely different aspect of the science of Probability than what we've talked about so far. It’s important to emphasize this, as ignoring it can lead to a lot of confusion and some mistakes. In these earlier chapters, we have focused solely on building what might be called the physical foundations of Probability. We have done nothing but establish, in one way or another, the existence of certain groups or arrangements of things that appear in nature; we have tried to explain how they arise, and we have illustrated their main characteristics. But these are just the foundations of Inference; we haven’t yet discussed the logical processes that will be built on these foundations. Therefore, we have not yet addressed the logic of chance.
§ 12. Now the way in which the Method of Least Squares is sometimes spoken of tends to conceal the magnitude of this distinction. Writers have regarded it as synonymous with the Law of Error, whereas the fact is that the two are not only totally distinct things but that they have scarcely even any necessary connection with each other. The Law of Error is the statement of a physical fact; it simply assigns, with more or less of accuracy, the relative frequency with which errors or deviations of any kind are found in practice to present themselves. It belongs therefore to what may be termed the physical foundations of the science. The Method of Least Squares, on the other hand, is not a law at all in the 41 scientific sense of the term. It is simply a rule or direction informing us how we may best proceed to treat any group of these errors which may be set before us, so as to extract the true result at which they have been aiming. Clearly therefore it belongs to the inferential or logical part of the subject.
§ 12. The way the Method of Least Squares is sometimes discussed can downplay the importance of this distinction. Some writers have treated it as the same as the Law of Error, but in reality, the two are completely different and barely related. The Law of Error describes a physical fact; it indicates, with varying degrees of accuracy, how often errors or deviations occur in practice. This belongs to what can be called the physical foundations of the science. The Method of Least Squares, however, isn't a law in the scientific sense. It’s simply a guideline that tells us how to best handle a group of these errors to get to the true result we’re aiming for. Therefore, it clearly falls under the inferential or logical aspects of the subject.
It cannot indeed be denied that the methods we employ must have some connection with the arrangement of the facts to which they are applied; but the two things are none the less distinct in their nature, and in this case the connection does not seem at all a necessary one, but at most one of propriety and convenience. The Method of Least Squares is usually applied, no doubt, to the most familiar and common form of the Law of Error, namely the exponential form with which we have been recently occupied. But other forms of laws of error may exist, and, if they did, the method in question might equally well be applied to them. I am not asserting that it would necessarily be the best method in every case, but it would be a possible one; indeed we may go further and say, as will be shown in a future chapter, that it would be a good method in almost every case. But its particular merits or demerits do not interfere with its possible employment in every case in which we may choose to resort to it. It will be seen therefore, even from the few remarks that can be made upon the subject here, that the fact that one and the same method is very commonly employed with satisfactory results affords little or no proof that the errors to which it is applied must be arranged according to one fixed law.
It can't be denied that the methods we use must have some connection with the way the facts are organized; however, these two aspects are still distinct in nature, and in this case, the connection doesn’t seem necessary—it's more one of appropriateness and practicality. The Method of Least Squares is often applied to the most familiar and common form of the Law of Error, specifically the exponential form we've been discussing recently. However, there could be other forms of error laws, and if they exist, this method could be applied to them just as well. I'm not claiming it would necessarily be the best method in every situation, but it would be a valid option; in fact, we can say, as will be demonstrated in a future chapter, that it would likely be a good method in almost all cases. But whether it's particularly advantageous or disadvantageous doesn't affect its potential use in any situation we choose to apply it to. Therefore, even from the limited comments I can provide on this topic here, the fact that a single method is commonly used with positive results offers little to no evidence that the errors it's applied to must follow one specific law.
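A minimal sketch may help fix the distinction. Applied to repeated measurements of a single quantity, the Method of Least Squares selects the value that minimises the sum of squared residuals, which is simply the arithmetic mean; in the example below (our own, with uniform rather than 'exponential' errors) the rule can still be applied, which is exactly the point made above.

```python
import random

# Repeated measurements of one quantity, with errors that are uniform,
# not normal; the least-squares rule can be applied all the same.
random.seed(4)
true_value = 10.0
measurements = [true_value + random.uniform(-1.0, 1.0) for _ in range(1_000)]

def sum_of_squares(candidate):
    return sum((m - candidate) ** 2 for m in measurements)

least_squares_estimate = sum(measurements) / len(measurements)   # the minimiser

for candidate in (least_squares_estimate - 0.1,
                  least_squares_estimate,
                  least_squares_estimate + 0.1):
    print(round(candidate, 3), round(sum_of_squares(candidate), 3))
```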
§ 13. So much then for the attempt to prove the prevalence, in all cases, of this particular law of divergence. The next point in Quetelet's treatment of the subject which deserves attention as erroneous or confusing, is the doctrine maintained by him and others as to the existence of what he terms a type 42 in the groups of things in question. This is a not unnatural consequence from some of the data and conclusions of the last few paragraphs. Refer back to two of the three classes of things already mentioned in § 4. If it really were the case that in arranging in order a series of incorrect observations or attempts of our own, and a collection of natural objects belonging to some one and the same species or class, we found that the law of their divergence was in each case identical in the long run, we should be naturally disposed to apply the same expression ‘Law of Error’ to both instances alike, though in strictness it could only be appropriate to the former. When we perform an operation ourselves with a clear consciousness of what we are aiming at, we may quite correctly speak of every deviation from this as being an error; but when Nature presents us with a group of objects of any kind, it is using a rather bold metaphor to speak in this case also of a law of error, as if she had been aiming at something all the time, and had like the rest of us missed her mark more or less in almost every instance.[13]
§ 13. So much, then, for the effort to prove the dominance, in every situation, of this specific law of divergence. The next aspect of Quetelet's discussion of the subject that warrants attention as problematic or misleading is the belief he and others hold regarding the existence of what he calls a type in the groups of things in question. This is a somewhat expected result from some of the data and conclusions of the last few paragraphs. Refer back to two of the three categories of things already mentioned in § 4. If it were indeed true that when we arranged in order a series of inaccurate observations or attempts of our own, and a collection of natural objects belonging to the same species or class, we found that the law of their divergence was consistently the same over time, we would naturally be inclined to apply the same term 'Law of Error' to both situations, even though strictly it could only apply to the former. When we conduct an operation ourselves with a clear understanding of our goal, we can correctly refer to any deviation from this as an error; however, when Nature presents us with a collection of objects of any kind, it's a rather bold metaphor to speak in this case also of a law of error, as if Nature had been aiming for something all along and, like us, had missed the mark in almost every case.[13]
Suppose we make a long succession of attempts to measure accurately the precise height of a man, we should from one cause or another seldom or never succeed in doing so with absolute accuracy. But we have no right to assume that these imperfect measurements of ours would be found so to deviate according to one particular law of error as to present the precise counterpart of a series of actual heights of different men, supposing that these latter were assigned with absolute precision. What might be the actual law of error in a series of direct measurements of any given magnitude could hardly be asserted beforehand, and probably the attempt to determine 43 it by experience has not been made sufficiently often to enable us to ascertain it; but upon general grounds it seems by no means certain that it would follow the so-called exponential law. Be this however as it may, it is rather a licence of language to talk as if nature had been at work in the same way as one of us; aiming (ineffectually for the most part) at a given result, that is at producing a man endowed with a certain stature, proportions, and so on, who might therefore be regarded as the typical man.
If we repeatedly try to measure the exact height of a person, we would rarely or never achieve complete accuracy for one reason or another. However, we can’t assume that these imperfect measurements would deviate in a specific way that matches up perfectly with a set of actual heights of differently sized individuals, assuming those heights were determined with absolute precision. We couldn't predict what the actual pattern of error would be in a set of direct measurements of any specific quantity, and likely, the effort to figure it out hasn't been done frequently enough for us to know it; but in general, it doesn’t seem likely that it would follow the so-called exponential law. Regardless, it's somewhat incorrect to suggest that nature operates the same way we do, striving (mostly unsuccessfully) for a specific outcome, like creating an individual with a certain height, body proportions, and so on, who could therefore be seen as the typical person.
§ 14. Stated as above, namely, that there is a fixed invariable human type to which all individual specimens of humanity may be regarded as having been meant to attain, but from which they have deviated in one direction or another; according to a law of deviation capable of à priori determination, the doctrine is little else than absurd. But if we look somewhat closer at the facts of the case, and the probable explanation of these facts, we may see our way to an important truth. The facts, on the authority of Quetelet's statistics (the great interest and value of which must be frankly admitted), are very briefly as follows: if we take any element of our physical frame which admits of accurate measurement, say the height, and determine this measure in a great number of different individuals belonging to any tolerably homogeneous class of people, we shall find that these heights do admit of an orderly arrangement about a mean, after the fashion which has been already repeatedly mentioned. What is meant by a homogeneous class? is a pertinent and significant enquiry, but applying this condition to any simple cases its meaning is readily stated. It implies that the mean in question will be different according to the nationality of the persons under measurement. According to Quetelet,[14] in the case of Englishmen the mean is about 44 5 ft. 9 in.; for Belgians about 5 ft. 7 in.; for the French about 5 ft. 4 in. It need hardly be added that these measures are those of adult males.
§ 14. Stated in the form given above, namely that there is a fixed, invariable human type which every individual specimen of humanity was meant to attain, but from which each has deviated in one direction or another according to a law of deviation determinable a priori, the doctrine is little better than absurd. However, if we examine the facts more closely and consider their likely explanation, we may see our way to an important truth. According to Quetelet's statistics (which are undeniably intriguing and valuable), the facts are briefly these: if we take a measurable feature of our physical frame, like height, and determine it in a large number of individuals belonging to any reasonably homogeneous class of people, we will find that these heights arrange themselves in an orderly way about a mean, in the manner that has already been repeatedly described. What exactly constitutes a homogeneous class? This is a relevant and significant question, but when applied to simple cases its meaning is easy to state. It implies that the mean in question will differ according to the nationality of the people being measured. According to Quetelet,[14] for Englishmen the mean is about 5 ft. 9 in.; for Belgians, about 5 ft. 7 in.; and for the French, about 5 ft. 4 in. It hardly needs adding that these measurements are those of adult males.
§ 15. It may fairly be asked here what would have been the consequence, had we, instead of keeping the English and the French apart, mixed the results of our measurements of them all together? The question is an important one, as it will oblige us to understand more clearly what we mean by homogeneous classes. The answer that would usually be given to it, though substantially correct, is somewhat too decisive and summary. It would be said that we are here mixing distinctly heterogeneous elements, and that in consequence the resultant law of error will be by no means of the simple character previously exhibited. So far as such an answer is to be admitted its grounds are easy to appreciate. In accordance with the usual law of error the divergences from the mean grow continuously less numerous as they increase in amount. Now, if we mix up the French and English heights, what will follow? Beginning from the English mean of 5 feet 9 inches, the heights will at first follow almost entirely the law determined by these English conditions, for at this point the English data are very numerous, and the French by comparison very few. But, as we begin to approach the French mean, the numbers will cease to show that continual diminution which they should show, according to the English scale of arrangement, for here the French data are in turn very numerous, and the English by comparison few. The result of such a combination of heterogeneous 45 elements is illustrated by the figure annexed, of course in a very exaggerated form.
§ 15. It's reasonable to ask what would have happened if, instead of keeping the English and French data separate, we had combined all our measurements. This is an important question because it pushes us to clarify what we mean by homogeneous categories. The typical response, while mostly correct, tends to be a bit too definitive and brief. It would be said that we are mixing distinctly different elements, and as a result, the overall error law won't be as straightforward as before. If we accept this response, its reasoning is easy to understand. According to the usual law of error, the deviations from the average become progressively less frequent as they increase in size. So, if we mix the heights of the French and the English, what happens? Starting from the English average of 5 feet 9 inches, the heights will initially follow the pattern determined by the English data because at this point there are a lot of English measurements and relatively few French ones. However, as we get closer to the French average, the frequency of heights won't show the consistent decrease expected based on the English arrangement because now the French data is more abundant and the English data comparatively scarce. The outcome of such a mix of different elements is shown in the accompanying figure, albeit in a very exaggerated manner.
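The thought-experiment is easy to reproduce. In the sketch below, only the two means (69 inches and 64 inches) are taken from the text; the spreads and group sizes are illustrative choices of ours. Mixing the two groups gives a combined arrangement that is plainly not the single symmetrical curve either group would show by itself.

```python
import random
from collections import Counter

# Mix heights gathered about two different means and tally the result.
random.seed(5)
english = [random.gauss(69, 2) for _ in range(8_000)]   # mean 69 in., spread assumed
french = [random.gauss(64, 2) for _ in range(8_000)]    # mean 64 in., spread assumed
mixed = english + french

counts = Counter(round(h) for h in mixed)
for height in range(min(counts), max(counts) + 1):
    print(f"{height:3d} in. | {'*' * (counts.get(height, 0) // 100)}")
```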

§ 16. In the above case the nature of the heterogeneity, and the reasons why the statistics should be so collected and arranged as to avoid it, seemed tolerably obvious. It will be seen still more plainly if we take a parallel case drawn from artificial proceedings. Suppose that after a man had fired a few thousand shots at a certain spot, say a wafer fixed somewhere on a wall, the position of the spot at which he aims were shifted, and he fired a few thousand more shots at the wafer in its new position. Now let us collect and arrange all the shots of both series in the order of their departure from either of the centres, say the new one. Here we should really be mingling together two discordant sets of elements, either of which, if kept apart from the other, would have been of a simple and homogeneous character. We should find, in consequence, that the resultant law of error betrayed its composite or heterogeneous origin by a glaring departure from the customary form, somewhat after the fashion indicated in the above diagram.
§ 16. In this case, the nature of the differences, along with the reasons for organizing the statistics to avoid them, is pretty clear. It will become even clearer if we look at a similar example from controlled experiments. Imagine a man who has fired several thousand shots at a specific target, like a sticker placed on a wall. If the target is then moved and he fires several thousand more shots at the sticker in its new location, and we collect and arrange all the shots from both rounds based on their distance from the new center, we would be mixing two incompatible sets of data. Each set, if analyzed separately, would have been simple and uniform. As a result, the overall pattern of errors would show a noticeable deviation from the usual form, similar to what the previous diagram illustrates.
The instance of the English and French heights resembles the one just given, but falls far short of it in the stringency with which the requisite conditions are secured. The fact is we have not here got the most suitable requirements, viz. a group consisting of a few fixed causes supplemented by innumerable little disturbing influences. What we call a nation is really a highly artificial body, the members of 46 which are subject to a considerable number of local or occasional disturbing causes. Amongst Frenchmen were included, presumably, Bretons, Provençals, Alsatians, and so on, thus commingling distinctions which, though less than those between French and English, regarded as wholes, are very far from being insignificant. And to these differences of race must be added other disturbances, also highly important, dependent upon varying climate, food and occupation. It is plain, therefore, that whatever objections exist against confusing together French and English statistics, exist also, though of course in a less degree, against confusing together those of the various provincial and other components which make up the French people.
The example of the English and French heights resembles the one just given, but doesn't match it in how strictly the necessary conditions are met. The truth is that we don't have the most suitable requirements here, namely a group made up of a few fixed factors along with countless small influencing factors. What we refer to as a nation is actually a very artificial entity, with members that are affected by a significant number of local or occasional disruptive causes. Among French people, there are likely Bretons, Provençals, Alsatians, and so on, mixing distinctions that, while smaller than those between the French and English as whole groups, are still far from insignificant. Additionally, these differences of race must be considered along with other important influences related to varying climate, food, and occupations. Therefore, it's clear that any objections to merging French and English statistics also apply, though to a lesser extent, when it comes to combining statistics from the various provincial and other groups that make up the French population.
§ 17. Out of the great variety of important causes which influence the height of men, it is probable that those which most nearly fulfil the main conditions required by the ‘Law of Error’ are those about which we know the least. Upon the effects of food and employment, observation has something to say, but upon the purely physiological causes by which the height of the parents influences the height of the offspring, we have probably nothing which deserves to be called knowledge. Perhaps the best supposition we can make is one which, in accordance with the saying that ‘like breeds like’, would assume that the purely physiological causes represent the constant element; that is, given a homogeneous race of people to begin with, who freely intermarry, and are subject to like circumstances of climate, food, and occupation, the standard would remain on the whole constant.[15] In such a case the man who possessed the mean height, mean weight, mean strength, and so on, might then be 47 called, in a sort of way, a ‘type’. The deviations from this type would then be produced by innumerable small influences, partly physiological, partly physical and social, acting for the most part independently of one another, and resulting in a Law of Error of the usual description. Under such restrictions and explanations as these, there seems to be no reasonable objection to speaking of a French or English type or mean. But it must always be remembered that under the present circumstances of every political nation, these somewhat heterogeneous bodies might be subdivided into various smaller groups, each of which would frequently exhibit the characteristics of such a type in an even more marked degree.
§ 17. Among the many significant factors that affect height, it's likely that those which best align with the main conditions of the ‘Law of Error’ are the ones we understand the least. We have some observations on the impacts of diet and jobs, but when it comes to the pure physiological reasons that a parent's height influences their children's height, we probably have little that could be called true knowledge. The best assumption we can make, in line with the saying that ‘like breeds like’, is that these physiological factors are the constant elements. That is, if we start with a homogeneous group of people who intermarry freely and share similar conditions of climate, diet, and occupation, the average would likely remain fairly stable. In this scenario, a person with average height, weight, strength, and so on could be referred to, in a sense, as a ‘type’. Variations from this type would then result from countless small influences, some physiological and some physical or social, mostly acting independently and leading to a typical Law of Error. Given these conditions and explanations, there's no reasonable objection to discussing a French or English type or average. However, it’s important to remember that within each political nation today, these somewhat mixed groups can be divided into various smaller segments, each of which might often display the traits of such a type even more prominently.
§ 18. On this point the reports of the Anthropometrical Committee, already referred to, are most instructive. They illustrate the extent to which this subdivision could be carried out, and prove,—if any proof were necessary,—that the discovery of Quetelet's homme moyen would lead us a long chase. So far as their results go the mean ‘English’ stature (in inches) is 67.66. But this is composed of Scotch, Irish, English and Welsh constituents, the separate means of these being, respectively; 68.71, 67.90, 67.36, and 66.66. But these again may be subdivided; for careful observation shows that the mean English stature is distinctly greater in certain districts (e.g. the North-Eastern counties) than in others. Then again the mean of the professional classes is considerably greater than that of the labourers; and that of the honest and intelligent is very much greater than that of the criminal and lunatic constituents of the population. And, so far as the observations are extensive enough for the purpose, it appears that every characteristic in respect of the grouping about a mean which can be detected in the more extensive of these classes can be detected also in the narrower. 48 Nor is there any reason to suppose that the same process of subdivision could not be carried out as much farther as we chose to prolong it.
§ 18. In this regard, the reports from the Anthropometrical Committee mentioned earlier are very informative. They demonstrate how far this subdivision can go and prove—if any proof is needed—that the discovery of Quetelet's homme moyen would lead us on a long pursuit. So far as the data show, the average 'English' height (in inches) is 67.66. However, this average includes Scottish, Irish, English, and Welsh figures, with the averages for these being 68.71, 67.90, 67.36, and 66.66, respectively. Moreover, these can be broken down further; careful observation indicates that the average English height is noticeably greater in certain areas (for example, the North-Eastern counties) than in others. Additionally, the average height of the professional classes is considerably greater than that of laborers, and the average height of the honest and intelligent is much greater than that of criminals and the mentally ill within the population. Furthermore, as long as the observations are detailed enough, it seems that every characteristic regarding the distribution around an average that can be found in the broader classes can also be seen in the narrower ones. There is also no reason to believe that the same process of subdivision couldn't continue further if we wanted to take it that far.
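The arithmetic behind such a subdivision is simply that of a weighted average. The committee's relative numbers of Scottish, Irish, English, and Welsh subjects are not quoted above, so the weights in the sketch below are hypothetical; it is worth noticing, though, that the four component means happen to average, with equal weights, to very nearly the reported 67.66 inches.

```python
# Component means quoted in § 18 (inches).
component_means = {"Scotch": 68.71, "Irish": 67.90, "English": 67.36, "Welsh": 66.66}

# Equal weights: the simple average of the four component means.
equal_weight_mean = sum(component_means.values()) / len(component_means)
print(round(equal_weight_mean, 2))   # 67.66

# Hypothetical weights (the committee's actual proportions are not given here).
hypothetical_weights = {"Scotch": 0.15, "Irish": 0.15, "English": 0.60, "Welsh": 0.10}
weighted_mean = sum(component_means[k] * hypothetical_weights[k] for k in component_means)
print(round(weighted_mean, 2))
```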
§ 19. It need hardly be added to the above remarks that no one who gives the slightest adhesion to the Doctrine of Evolution could regard the type, in the above qualified sense of the term, as possessing any real permanence and fixity. If the constant causes, whatever they may be, remain unchanged, and if the variable ones continue in the long run to balance one another, the results will continue to cluster about the same mean. But if the constant ones undergo a gradual change, or if the variable ones, instead of balancing each other suffer one or more of their number to begin to acquire a preponderating influence, so as to put a sort of bias upon their aggregate effect, the mean will at once begin, so to say, to shift its ground. And having once begun to shift, it may continue to do so, to whatever extent we recognize that Species are variable and Development is a fact. It is as if the point on the target at which we aim, instead of being fixed, were slowly changing its position as we continue to fire at it; changing almost certainly to some extent and temporarily, and not improbably to a considerable extent and permanently.
§ 19. It's worth noting that anyone who supports the Theory of Evolution wouldn't see the type, in the way defined above, as having any real permanence or stability. If the constant causes, whatever they are, stay the same, and if the variable ones continue to balance each other out over time, the results will cluster around the same average. However, if the constant ones gradually change, or if the variable ones start to influence the outcome more heavily instead of balancing each other, the average will begin to shift. Once it starts shifting, it may continue to do so, given that we acknowledge that species are variable and development is a fact. It's like aiming at a target that is slowly moving while we try to hit it; it changes, possibly a little and temporarily, or quite a lot and permanently.
§ 20. Our examples throughout this chapter have been almost exclusively drawn from physical characteristics, whether of man or of inanimate things; but it need not be supposed that we are necessarily confined to such instances. Mr Galton, for instance, has proposed to extend the same principles of calculation to mental phenomena, with a view to their more accurate determination. The objects to be gained by so doing belong rather to the inferential part of our subject, and will be better indicated further on; but they do not involve any distinct principle. Like other attempts 49 to apply the methods of science in the region of the mind, this proposal has met with some opposition; with very slight reason, as it seems to me. That our mental qualities, if they could be submitted to accurate measurement, would be found to follow the usual Law of Error, may be assumed without much hesitation. The known extent of the correlation of mental and bodily characteristics gives high probability to the supposition that what is proved to prevail, at any rate approximately, amongst most bodily elements which have been submitted to measurement, will prevail also amongst the mental elements.
§ 20. The examples we've used in this chapter have mostly focused on physical traits, whether of people or objects; however, it shouldn’t be assumed that we're limited to these cases. Mr. Galton, for example, has suggested applying the same calculation principles to mental phenomena for more accurate assessment. The outcomes of this approach relate more to the inferential aspect of our subject and will be explained further; yet, they don't hinge on any unique principle. Like other attempts to use scientific methods in understanding the mind, this idea has faced some criticism; arguably with little justification, in my opinion. It's reasonable to assume that if our mental traits could be precisely measured, they would align with the usual Law of Error. The known correlation between mental and physical characteristics strongly supports the idea that principles proven to generally apply to most measurable physical traits will also apply to mental traits.
To what extent such measurements could be carried out practically, is another matter. It does not seem to me that it could be done with much success; partly because our mental qualities are so closely connected with, indeed so run into one another, that it is impossible to isolate them for purposes of comparison.[16] This is to some extent indeed a difficulty in bodily measurements, but it is far more so in those of the mind, where we can hardly get beyond what can be called a good guess. The doctrine, therefore, that mental qualities follow the now familiar law of arrangement can scarcely be grounded upon anything more than a strong analogy. Still this analogy is quite strong enough to justify us in accepting the doctrine and all the conclusions which follow from it, in so far as our estimates and measurements can be regarded as trustworthy. There seems therefore nothing unreasonable in the attempt to establish a system of natural classification of mankind by arranging them into a certain number of groups above and below the average, each group being intended to correspond 50 to certain limits of excellency or deficiency.[17] All that is necessary for such a purpose is that the rate of departure from the mean should be tolerably constant under widely different circumstances: in this case throughout all the races of man. Of course if the law of divergence is the same as that which prevails in inanimate nature we have a still wider and more natural system of classification at hand, and one which ought to be familiar, more or less, to every one who has thus to estimate qualities.
To what extent such measurements could actually be done is another question. It doesn’t seem to me that it could be very successful; partly because our mental traits are so interconnected, that it’s impossible to isolate them for comparison. This is somewhat of a challenge in physical measurements, but it’s even more of an issue with mental qualities, where we can hardly get beyond what might be called a good guess. Therefore, the idea that mental qualities follow the now-familiar law of arrangement can hardly be founded on anything more than a strong analogy. Still, this analogy is strong enough to justify accepting the idea and all the conclusions that come from it, as long as our assessments and measurements can be considered reliable. There seems to be nothing unreasonable in trying to establish a system of natural classification of humanity by organizing people into a certain number of groups above and below average, with each group intended to correspond to specific limits of excellence or deficiency. All that’s needed for this purpose is that the rate of deviation from the mean should be fairly consistent under a variety of circumstances: in this case, across all human races. Of course, if the law of divergence is the same as that which occurs in the inanimate world, we have an even wider and more natural classification system available, one that should be somewhat familiar to anyone who needs to assess qualities.
§ 21. Perhaps one of the best illustrations of the legitimate application of such principles is to be found in Mr Galton's work on Hereditary Genius. Indeed the full force and purport of some of his reasonings there can hardly be appreciated except by those who are familiar with the conceptions which we have been discussing in this chapter. We can only afford space to notice one or two points, but the student will find in the perusal, of at any rate the more argumentive parts, of that volume[18] an interesting illustration of the doctrines now under discussion. For one thing it may be safely asserted, that no one unfamiliar with the Law of Error would ever in the least appreciate the excessive 51 rapidity with which the superior degrees of excellence tend to become scarce. Every one, of course, can see at once, in a numerical way at least, what is involved in being ‘one of a million’; but they would not at all understand, how very little extra superiority is to be looked for in the man who is ‘one of two million’. They would confound the mere numerical distinction, which seems in some way to imply double excellence, with the intrinsic superiority, which would mostly be represented by a very small fractional advantage. To be ‘one of ten million’ sounds very grand, but if the qualities under consideration could be estimated in themselves without the knowledge of the vastly wider area from which the selection had been made, and in freedom therefore from any consequent numerical bias, people would be surprised to find what a very slight comparative superiority was, as a rule, thus obtained.
§ 21. One of the best examples of the proper use of these principles can be found in Mr. Galton's work on Hereditary Genius. In fact, the full impact and meaning of some of his arguments can only be truly understood by those who are familiar with the ideas we've been discussing in this chapter. We can only touch on one or two points, but readers will find that the more analytical sections of that book[18] offer an interesting example of the theories we're currently examining. For instance, it's safe to say that anyone unfamiliar with the Law of Error would not grasp how quickly higher levels of excellence tend to become rare. Everyone can see, at least in numerical terms, what it means to be 'one of a million'; however, they wouldn't understand how little additional superiority there is to find in someone who is 'one of two million'. They would mix up the simple numerical difference, which seems to indicate double excellence, with the actual superiority, which usually represents a very small fractional advantage. Being 'one of ten million' sounds impressive, but if the qualities we're considering could be evaluated on their own, without knowing about the much larger pool from which the selection was made, and thus free from any resulting numerical bias, people would be surprised to see how slight the comparative superiority generally is.
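Galton's point about how little extra superiority 'one of two million' implies can be put into figures, on the assumption (ours, for illustration) that the quality in question follows the usual law of error with mean zero and a unit measure of spread. Python's standard library supplies the inverse of the normal curve directly:

```python
from statistics import NormalDist

# Threshold, in units of spread above the mean, for being "one of n".
standard = NormalDist(mu=0.0, sigma=1.0)
for n in (1_000_000, 2_000_000, 10_000_000):
    threshold = standard.inv_cdf(1 - 1 / n)
    print(f"1 in {n:>10,}: about {threshold:.2f} units above the mean")
```

The thresholds hardly move as the rarity grows, which is exactly the small fractional advantage spoken of above.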
§ 22. The point just mentioned is an important one in arguments from statistics. If, for instance, we find a small group of persons, connected together by blood-relationship, and all possessing some mental characteristic in marked superiority, much depends upon the comparative rarity of such excellence when we are endeavouring to decide whether or not the common possession of these qualities was accidental. Such a decision can never be more than a rough one, but if it is to be made at all this consideration must enter as a factor. Again, when we are comparing one nation with another,[19] say the Athenian with any modern European people, does the popular mind at all appreciate what sort of evidence of general superiority is implied by the production, out of one nation, of such a group as can be composed of Socrates, Plato, and a few of their contemporaries? In this 52 latter case we are also, it should be remarked, employing the ‘Law of Error’ in a second way; for we are assuming that where the extremes are great so will also the means be, in other words we are assuming that every amount of departure from the mean occurs with a (roughly) calculable degree of relative frequency. However generally this truth may be accepted in a vague way, its evidence can only be appreciated by those who know the reasons which can be given in its favour.
§ 22. The point just mentioned is really important in statistical arguments. For example, if we find a small group of people who are related by blood and all share a clearly superior mental trait, the rarity of such excellence plays a huge role in determining whether their shared qualities are just a coincidence. This decision will always be somewhat rough, but if we’re going to make it at all, we need to factor this in. Again, when comparing one nation to another, say the Athenians to any modern European society, does the public really grasp what kind of evidence of overall superiority is shown by the existence of a group like Socrates, Plato, and a few of their contemporaries? In this context, we’re also using the ‘Law of Error’ in a second way; we’re assuming that if the extremes are significant, so will the averages be. In other words, we assume that any degree of deviation from the average happens with a (roughly) measurable frequency. While this truth may be generally accepted in a vague way, its significance can only be truly understood by those who know the reasoning behind it.
But the same principles will also supply a caution in the case of the last example. They remind us that, for the mere purpose of comparison, the average man of any group or class is a much better object for selection than the eminent one. There may be greater difficulties in the way of detecting him, but when we have done so we have got possession of a securer and more stable basis of comparison. He is selected, by the nature of the case, from the most numerous stratum of his society; the eminent man from a thinly occupied stratum. In accordance therefore with the now familiar laws of averages and of large numbers the fluctuations amongst the former will generally be very few and small in comparison with those amongst the latter.
But the same principles also serve as a warning in the last example. They remind us that, just for the sake of comparison, the average individual in any group or class is a much better choice for selection than the exceptional one. There might be greater challenges in identifying him, but once we do, we have a more secure and stable basis for comparison. He is chosen, by nature, from the largest segment of his society; the exceptional person comes from a much smaller segment. Therefore, in line with the now well-known principles of averages and large numbers, the fluctuations among the former will usually be very few and minor compared to those among the latter.
1 Essai de Physique Sociale, 1869. Anthropométrie, 1870.
1 Essai de Physique Sociale, 1869. Anthropométrie, 1870.
2 As regards later statistics on the same subject the reader can refer to the Reports of the Anthropometrical Committee of the British Association (1879, 1880, 1881, 1883;—especially this last). These reports seem to me to represent a great advance on the results obtained by Quetelet, and fully to justify the claim of the Secretary (Mr C. Roberts) that their statistics are “unique in range and numbers”. They embrace not merely military recruits—like most of the previous tables—but almost every class and age, and both sexes. Moreover they refer not only to stature but to a number of other physical characteristics.
2 For more recent statistics on the same topic, readers can check out the Reports of the Anthropometrical Committee of the British Association (1879, 1880, 1881, 1883;—especially the last one). These reports seem to show significant progress over the results achieved by Quetelet, and they strongly support the Secretary's (Mr. C. Roberts) claim that their statistics are “unique in range and numbers.” They cover not just military recruits—like most earlier tables—but almost every class and age group, as well as both genders. Additionally, they include not only height but also several other physical characteristics.
3 As every mathematician knows, the relative numbers of each of these possible throws are given by the successive terms of the expansion of (1 + 1)^10, viz. 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1.
3 As every mathematician knows, the relative frequency of each of these possible outcomes is represented by the successive terms of the expansion of (1 + 1)^10, namely: 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1.
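As a quick check of the figures just quoted, the following short Python sketch computes the eleven binomial coefficients directly; nothing here goes beyond the arithmetic stated in the note.

```python
from math import comb

# A minimal check of the figures quoted in this note: the relative numbers of
# 0, 1, 2, ... 10 heads in ten throws are the binomial coefficients of (1 + 1)^10.
coefficients = [comb(10, k) for k in range(11)]
print(coefficients)       # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
print(sum(coefficients))  # 1024 = 2^10 equally likely arrangements of ten throws
```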
4 That is they will be more densely aggregated. If a space the size of the bull's-eye be examined in each successive circle, the number of shot marks which it contains will be successively less. The actual number of shots which strike the bull's-eye will not be the greatest, since it covers so much less surface than any of the other circles.
4 This means they will be more tightly grouped. If you look at a space the size of the bull's-eye within each successive circle, the number of shot marks in that space will decrease each time. The actual number of shots hitting the bull's-eye won't be the highest, because it has much less surface area than any of the other circles.
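The two statements in this note, that the density of shot marks falls off from the centre while the bull's-eye itself does not receive the greatest absolute number of hits, can be illustrated with a small Python simulation. The sketch below is not from the text; it assumes, purely for illustration, that the errors in the two directions are independent and normally distributed, and the spread, ring width, and number of shots are invented.

```python
import math
import random

# A rough sketch, not from the text: shots scatter about the bull's-eye with
# independent normal errors in each direction (an assumption made only for
# illustration); the spread, ring width, and number of shots are invented.
random.seed(2)

SIGMA = 3.0        # spread of the shooting error (hypothetical units)
RING_WIDTH = 1.0   # radius of the bull's-eye and width of each outer ring
N_SHOTS = 100_000

counts = [0] * 8   # hits in the bull's-eye and the next seven rings
for _ in range(N_SHOTS):
    x, y = random.gauss(0, SIGMA), random.gauss(0, SIGMA)
    ring = int(math.hypot(x, y) // RING_WIDTH)
    if ring < len(counts):
        counts[ring] += 1

for i, hits in enumerate(counts):
    inner, outer = i * RING_WIDTH, (i + 1) * RING_WIDTH
    area = math.pi * (outer ** 2 - inner ** 2)
    print(f"ring {i}: hits = {hits:6d}, hits per unit area = {hits / area:8.1f}")
# Hits per unit area fall steadily from the centre outwards, yet the
# bull's-eye itself does not collect the greatest absolute number of hits,
# because its area is so much smaller than that of the outer rings.
```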
5 Commonly called the exponential law; its equation being of the form y = Ae^(−hx^2). The curve corresponding to it cuts the axis of y at right angles (expressing the fact that near the mean there are a large number of values approximately equal); after a time it begins to slope away rapidly towards the axis of x (expressing the fact that the results soon begin to grow less common as we recede from the mean); and the axis of x is an asymptote in both directions (expressing the fact that no magnitude, however remote from the mean, is strictly impossible; that is, every deviation, however excessive, will have to be encountered at length within the range of a sufficiently long experience). The curve is obviously symmetrical, expressing the fact that equal deviations from the mean, in excess and in defect, tend to occur equally often in the long run.
5 Commonly known as the exponential law, its equation takes the form y = Ae^(−hx^2). The curve related to it intersects the y axis at a right angle (indicating that close to the mean, there are many values that are roughly the same); after some time, it begins to slope down quickly towards the x axis (indicating that the results quickly become less frequent as we move away from the mean); and the x axis acts as an asymptote in both directions (indicating that no value, no matter how far from the mean, is strictly impossible; that is, any deviation, no matter how extreme, will eventually be observed given a sufficiently long experience). The curve is clearly symmetrical, indicating that deviations from the mean, whether positive or negative, tend to occur with equal frequency over time.
[Figure: a rough graph of the exponential error curve, with dotted variant forms and the point R marked on the axis of x.]
A rough graphic representation of the curve is given above. For the benefit of those unfamiliar with mathematics one or two brief remarks may be here appended concerning some of its properties. (1) It must not be supposed that all specimens of the curve are similar to one another. The dotted lines are equally specimens of it. In fact, by varying the essentially arbitrary units in which x and y are respectively estimated, we may make the portion towards the vertex of the curve as obtuse or as acute as we please. This consideration is of importance; for it reminds us that, by varying one of these arbitrary units, we could get an ‘exponential curve’ which should tolerably closely resemble any symmetrical curve of error, provided that this latter recognized and was founded upon the assumption that extreme divergences were excessively rare. Hence it would be difficult, by mere observation, to prove that the law of error in any given case was not exponential; unless the statistics were very extensive, or the actual results departed considerably from the exponential form. (2) It is quite impossible by any graphic representation to give an adequate idea of the excessive rapidity with which the curve after a time approaches the axis of x. At the point R, on our scale, the curve would approach within the fifteen-thousandth part of an inch from the axis of x, a distance which only a very good microscope could detect. Whereas in the hyperbola, e.g. the rate of approach of the curve to its asymptote is continually decreasing, it is here just the reverse; this rate is continually increasing. Hence the two, viz. the curve and the axis of x, appear to the eye, after a very short time, to merge into one another.
A rough graphic representation of the curve is shown above. For those who aren’t familiar with mathematics, a few brief notes on some of its properties may be helpful. (1) It's important to understand that not all examples of the curve are alike. The dotted lines are also examples of it. In fact, by changing the essentially arbitrary units used to measure x and y, we can make the part of the curve near the vertex as blunt or as sharp as we want. This point is significant because it reminds us that by varying one of these arbitrary units, we can create an ‘exponential curve’ that closely resembles any symmetrical error curve, as long as this latter acknowledges that extreme deviations are extremely rare. Therefore, it would be challenging to prove, through mere observation, that the error law in any given case isn’t exponential unless the statistics are very extensive or the actual results significantly deviate from the exponential shape. (2) It's completely impossible to adequately convey through any graphic representation how incredibly fast the curve approaches the axis of x after a certain point. At point R on our scale, the curve would get within one fifteen-thousandth of an inch of the axis of x, a distance that only a very good microscope could detect. While in the hyperbola, for instance, the rate of the curve approaching its asymptote continually decreases, here it’s the opposite; this rate continually increases. As a result, the two—specifically, the curve and the axis of x—appear to merge into each other after a very short time.
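A brief numerical sketch in Python may help with both remarks: how the ordinate of y = Ae^(−hx^2) collapses as we move away from the mean, and how changing the arbitrary unit of x (that is, the constant h) makes the vertex look blunt or sharp. The constants below are arbitrary and chosen only for illustration.

```python
import math

# A small numerical sketch of the two remarks above; A and h are arbitrary
# constants chosen only for illustration.
def error_curve(x, A=1.0, h=1.0):
    return A * math.exp(-h * x * x)

# (1) How quickly the ordinate collapses as we recede from the mean.
for x in range(6):
    print(f"x = {x}:  y = {error_curve(x):.3e}")

# (2) The same law in different arbitrary units of x looks blunt or sharp
# at the vertex, depending on the constant h.
print([round(error_curve(x, h=0.1), 3) for x in range(5)])  # blunt-looking
print([round(error_curve(x, h=5.0), 3) for x in range(5)])  # sharp-looking
```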
6 As by Quetelet: noted, amongst others, by Herschel, Essays, page 409.
6 As done by Quetelet: noted, among others, by Herschel, Essays, page 409.
7 Proc. R. Soc. Oct. 21, 1879.
7 Proc. R. Soc. Oct. 21, 1879.
8 We are here considering, remember, the case of a finite amount of statistics; so that there are actual limits at each end.
8 Remember that we are considering here a finite amount of statistics, which means there are actual limits at each end.
9 It must be admitted that experience has not yet (I believe) shown this asymmetry in respect of heights.
9 It has to be acknowledged that experience has not yet (I think) demonstrated this imbalance regarding heights.
10 The above reasoning will probably be accepted as valid at this stage of enquiry. But in strictness, assumptions are made here, which however justifiable they may be in themselves, involve somewhat of an anticipation. They demand, and in a future chapter will receive, closer scrutiny and criticism.
10 The reasoning above will likely be accepted as valid at this point in our investigation. However, strictly speaking, there are assumptions made here that, while they may be justifiable on their own, involve a bit of anticipation. They require, and will receive, more detailed examination and critique in a future chapter.
11 A definite numerical example of this kind of concentration of frequency about the mean was given in the note to § 4. It was of a binomial form, consisting of the successive terms of the expansion of (1 + 1)^m. Now it may be shown (Quetelet, Letters, p. 263; Liagre, Calcul des Probabilités, § 34) that the expansion of such a binomial, as m becomes indefinitely great, approaches as its limit the exponential form; that is, if we take a number of equidistant ordinates proportional respectively to 1, m, m(m − 1)/1·2 &c., and connect their vertices, the figure we obtain approximately represents some form of the curve y = Ae^(−hx^2), and tends to become identical with it, as m is increased without limit. In other words, if we suppose the errors to be produced by a limited number of finite, equal and independent causes, we have an approximation to the exponential Law of Error, which merges into identity as the causes are increased in number and diminished in magnitude without limit. Jevons has given (Principles of Science, p. 381) a diagram drawn to scale, to show how rapid this approximation is. One point must be carefully remembered here, as it is frequently overlooked (by Quetelet, for instance). The coefficients of a binomial of two equal terms—as (1 + 1)^m, in the preceding paragraph—are symmetrical in their arrangement from the first, and very speedily become indistinguishable in (graphical) outline from the final exponential form. But if, on the other hand, we were to consider the successive terms of such a binomial as (1 + 4)^m (which are proportional to the relative chances of 0, 1, 2, 3, … failures in m ventures, of an event which has one chance in its favour to four against it) we should have an unsymmetrical succession. If however we suppose m to increase without limit, as in the former supposition, the unsymmetry gradually disappears and we tend towards precisely the same exponential form as if we had begun with two equal terms. The only difference is that the position of the vertex of the curve is no longer in the centre: in other words, the likeliest term or event is not an equal number of successes and failures but successes and failures in the ratio of 1 to 4.
11 A clear numerical example of this kind of concentration of frequency around the mean was presented in the note to § 4. It was a binomial form, consisting of the successive terms from the expansion of (1 + 1)^m. It can now be shown (Quetelet, Letters, p. 263; Liagre, Calcul des Probabilités, § 34) that as m becomes indefinitely large, the expansion of such a binomial approaches an exponential form as its limit. Specifically, if we take a number of evenly spaced ordinates proportional to 1, m, m(m − 1)/1·2 &c., and connect their vertices, the resulting figure approximately represents some form of the curve y = Ae^(−hx^2), which tends to become identical with it as m increases without limit. In simpler terms, if we assume the errors are caused by a limited number of equal, independent sources, we have an approximation to the exponential Law of Error, which merges into identity as the number of causes increases and their impact decreases without limit. Jevons has provided a diagram drawn to scale in his work (Principles of Science, p. 381) to illustrate how quickly this approximation happens. It’s important to remember one point that is often overlooked (for example, by Quetelet). The coefficients of a binomial with two equal terms—such as (1 + 1)^m—are symmetrically arranged from the start and quickly become indistinguishable in their graphical outline from the final exponential form. However, if we consider the successive terms of a binomial like (1 + 4)^m (which are proportional to the relative chances of 0, 1, 2, 3, … failures in m attempts, where the event has one chance in favor to four against it), we get an asymmetrical succession. Nevertheless, if we let m increase indefinitely, as in the previous case, this asymmetry gradually fades away and we move towards the same exponential form as if we started with two equal terms. The only difference is that the vertex of the curve is no longer at the center: in other words, the most likely outcome is not an equal number of successes and failures, but rather a ratio of successes to failures of 1 to 4.
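The limiting process described in this note can be checked numerically. The Python sketch below is not part of the original; it takes a moderate value of m (100, an arbitrary choice) and compares the exact binomial terms with the exponential ordinate of the same mean and spread, for both the symmetric case of (1 + 1)^m and the skewed case of (1 + 4)^m.

```python
from math import comb, exp, pi, sqrt

# A sketch, not part of the original note: compare the exact binomial terms
# with the exponential (normal) ordinate of the same mean and spread, for the
# symmetric case p = 1/2 (the (1 + 1)^m binomial) and the skewed case p = 1/5
# (the (1 + 4)^m binomial).  The choice m = 100 is arbitrary.

def binomial_terms(m, p):
    """Exact chances of 0, 1, ..., m successes in m ventures."""
    return [comb(m, k) * p ** k * (1 - p) ** (m - k) for k in range(m + 1)]

def exponential_ordinate(m, p, k):
    """The corresponding ordinate of the limiting exponential form."""
    mean, var = m * p, m * p * (1 - p)
    return exp(-((k - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)

m = 100
for p in (1 / 2, 1 / 5):
    exact = binomial_terms(m, p)
    centre = round(m * p)   # the likeliest term
    for k in (centre - 5, centre, centre + 5):
        print(f"p = {p}:  k = {k}:  exact = {exact[k]:.5f},"
              f"  exponential = {exponential_ordinate(m, p, k):.5f}")
```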
12 ‘Law of Error’ is the usual technical term for what has been elsewhere spoken of above as a Law of Divergence from a mean. It is in strictness only appropriate in the case of one, namely the third, of the three classes of phenomena mentioned in § 4, but by a convenient generalization it is equally applied to the other two; so that we term the amount of the divergence from the mean an ‘error’ in every case, however it may have been brought about.
12 ‘Law of Error’ is the standard technical term for what was previously described as a Law of Divergence from a mean. Strictly speaking, it's only relevant to the third of the three types of phenomena mentioned in § 4, but for convenience, it's also used for the other two. Therefore, we refer to the amount of divergence from the mean as an ‘error’ in every instance, regardless of how it occurred.
13 This however seems to be the purport, either by direct assertion or by implication, of two elaborate works by Quetelet, viz. his Physique Sociale and his Anthropométrie.
13 This, however, appears to be the main point, either through direct statement or by suggestion, of two detailed works by Quetelet, namely his Physique Sociale and his Anthropométrie.
14 He scarcely, however, professes to give these as an accurate measure of the mean height, nor does he always give precisely the same measure. Practically, none but soldiers being measured in any great numbers, the English stature did not afford accurate data on any large scale. The statistics given a few pages further on are probably far more trustworthy.
14 He hardly claims to provide these as a precise measure of the average height, nor does he consistently provide exactly the same measurement. Since only soldiers were measured in significant quantities, average English height didn't provide reliable data on a large scale. The statistics mentioned a few pages later are likely much more reliable.
15 This statement will receive some explanation and correction in the next chapter.
15 This statement will get some clarification and correction in the next chapter.
16 I am not speaking here of the now familiar results of Psychophysics, which are mainly occupied with the measurement of perceptions and other simple states of consciousness.
16 I’m not talking about the now well-known findings of Psychophysics, which primarily focus on measuring perceptions and other basic states of consciousness.
17 Perhaps the best brief account of Mr Galton's method is to be found in a paper in Mind (July, 1880) on the statistics of Mental Imagery. The subject under comparison here—viz. the relative power, possessed by different persons, of raising clear visual images of objects no longer present to us—is one which it seems impossible to ‘measure’, in the ordinary sense of the term. But by arranging all the answers in the order in which the faculty in question seems to be possessed we can, with some approach to accuracy, select the middlemost person in the row and use him as a basis of comparison with the corresponding person in any other batch. And similarly with those who occupy other relative positions than that of the middlemost.
17 Maybe the best short explanation of Mr. Galton's method can be found in a paper in Mind (July 1880) about the statistics of Mental Imagery. The subject we're comparing here—the ability of different people to create clear visual images of objects that are no longer present—is one that seems impossible to "measure" in the usual sense. However, by arranging all the responses based on how this ability seems to be possessed, we can, with some level of accuracy, identify the person in the middle of the list and use them as a basis for comparison with the corresponding person in any other group. The same applies to those who hold other relative positions besides being in the middle.
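Here is a toy Python illustration of this ranking device, with invented gradings rather than Galton's data: each batch of answers is placed in rank order, and the batches are then compared by the persons who occupy corresponding relative positions, not by any average of an unmeasurable quantity.

```python
# A toy illustration with invented gradings, not Galton's data: each batch of
# answers is placed in rank order, and batches are compared by the persons at
# corresponding relative positions (0.5 being the middlemost).

def person_at_fraction(ranked, fraction):
    """Return the grading of the member at a given relative position."""
    index = int(fraction * (len(ranked) - 1))
    return ranked[index]

batch_a = sorted([3, 7, 2, 9, 5, 6, 4, 8, 5, 6])   # hypothetical gradings
batch_b = sorted([2, 4, 3, 6, 5, 4, 3, 5, 4, 7])

for fraction in (0.25, 0.5, 0.75):
    print(f"relative position {fraction}: "
          f"batch A = {person_at_fraction(batch_a, fraction)}, "
          f"batch B = {person_at_fraction(batch_b, fraction)}")
```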
18 I refer to the introductory and concluding chapters: the bulk of the book is, from the nature of the case, mainly occupied with statistical and biographical details.
18 I'm talking about the introductory and concluding chapters: most of the book is primarily focused on statistical and biographical details.
CHAPTER 3.
ON THE CAUSAL PROCESS BY WHICH THE GROUPS OR SERIES OF PROBABILITY ARE BROUGHT ABOUT.
§ 1. In discussing the question whether all the various groups and series with which Probability is concerned are of precisely one and the same type, we made some examination of the process by which they are naturally produced, but we must now enter a little more into the details of this process. All events are the results of numerous and complicated antecedents, far too numerous and complicated in fact for it to be possible for us to determine or take them all into account. Now, though it is strictly true that we can never determine them all, there is a broad distinction between the case of Induction, in which we can make out enough of them, and with sufficient accuracy, to satisfy a reasonable certainty, and Probability, in which we cannot do so. To Induction we shall return in a future chapter, and therefore no more need be said about it here.
§ 1. In discussing whether all the different groups and series related to Probability are exactly the same type, we looked into how they are naturally generated, but now we need to delve a bit deeper into this process. All events are the results of numerous and complex factors, far too many and intricate for us to identify or consider them all. While it's true that we can never fully determine all of them, there's a significant difference between Induction, where we can identify enough factors accurately to achieve reasonable certainty, and Probability, where that's not possible. We will revisit Induction in a future chapter, so there's no need to elaborate on it here.
We shall find it convenient to begin with a division which, though not pretending to any philosophical accuracy, will serve as a preliminary guide. It is the simple division into objects, and the agencies which affect them. All the phenomena with which Probability is concerned (as indeed most of those with which science of any kind is concerned) are the product of certain objects natural and artificial, acting under the influence of certain agencies natural and artificial. In the tossing of a penny, for instance, the objects would be the penny or pence which were successively thrown; the agencies would be the act of throwing, and everything which combined directly or indirectly with this to make any particular face come uppermost. This is a simple and intelligible division, and can easily be so extended in meaning as to embrace every class of objects with which we are concerned.
We’ll find it helpful to start with a division that, while not claiming any philosophical precision, will act as a useful guide. It’s a straightforward split between objects and the forces that impact them. All the phenomena that Probability deals with (and indeed most of what science looks at) arise from certain natural and artificial objects interacting with specific natural and artificial forces. For example, in the act of tossing a coin, the objects are the coin or coins that are thrown; the forces include the act of throwing and everything else that directly or indirectly influences which side lands face up. This is a clear and understandable division, and it can easily be broadened to cover every category of objects we are interested in.
Now if, in any two or more cases, we had the same object, or objects indistinguishably alike, and if they were exposed to the influence of agencies in all respects precisely alike, we should expect the results to be precisely similar. By one of the applications of the familiar principle of the uniformity of nature we should be confident that exact likeness in the antecedents would be followed by exact likeness in the consequents. If the same penny, or similar pence, were thrown in exactly the same way, we should invariably find that the same face falls uppermost.
Now, if in any two or more cases we had the same object, or objects that were indistinguishably alike, and if they were subjected to the influence of factors that were exactly the same, we would expect the results to be exactly similar. According to the well-known principle of the uniformity of nature, we would be sure that an exact similarity in the causes would lead to an exact similarity in the outcomes. If the same penny, or similar pennies, were tossed in exactly the same way, we would consistently find that the same side lands face up.
§ 2. What we actually find is, of course, very far removed from this. In the case of the objects, when they are artificial constructions, e.g. dice, pence, cards, it is true that they are purposely made as nearly as possible indistinguishably alike. We either use the same thing over and over again or different ones made according to precisely the same model. But in natural objects nothing of the sort prevails. In fact when we come to examine them, we find reproduced in them precisely the same characteristics as those which present themselves in the final result which we were asked to explain, so that unless we examine them a stage further back, as we shall have to do to some extent at any rate, we seem to be merely postulating again the very peculiarity of the phenomena which we were undertaking to explain. They will be found, for instance, to consist of large classes of objects, throughout all the individual members of which a general resemblance extends. Suppose that we were considering the length of life. The objects here are the human beings, or that selected class of them, whose lives we are considering. The resemblance existing among them is to be found in the strength and soundness of their principal vital organs, together with all the circumstances which collectively make up what we call the goodness of their constitutions. It is true that most of these circumstances do not admit of any approach to actual measurement; but, as was pointed out in the last chapter, very many of the circumstances which do admit of such measurement have been measured, and found to display the characteristics in question. Hence, from the known analogy and correlation between our various organs, there can be no reasonable doubt that if we could arrange human constitutions in general, or the various elements which compose them in particular, in the order of their strength, we should find just such an aggregate regularity and just such groupings about the mean, as the final result (viz. in this case the length of their lives) presents to our notice.
§ 2. What we actually discover is, of course, quite different from this. In the case of objects, when they are man-made items, like dice, coins, or cards, it's true that they are designed to be very similar to one another. We use the same item repeatedly or different ones created from the same design. But with natural objects, that's not the case. When we examine them, we find that they share the same traits as the final outcomes we were trying to explain. Unless we look at them from an earlier stage, as we will need to do to some extent, it seems like we’re just repeating the very uniqueness of the phenomena we aimed to explain. For instance, they will be found to consist of large groups of objects that share a general resemblance throughout all their individual members. Suppose we're looking at lifespan. The objects in this case are the human beings, or that specific group of them, whose lives we are examining. The similarities among them can be seen in the strength and health of their main vital organs, along with all the factors that together constitute what we call the quality of their health. While most of these factors can’t be accurately measured, as noted in the last chapter, many of the factors that can be measured have been, and they show the characteristics in question. Therefore, based on the known similarities and relationships between our various organs, it's reasonable to conclude that if we could rank human bodies in general, or the different elements that make them up in particular, by their strength, we would find a similar overall pattern and groupings around the average, just like the outcome (in this case, their lifespan) presents to us.
§ 3. It will be observed therefore that for this purpose the existence of natural kinds or groups is necessary. In our games of chance of course the same die may be thrown, or a card be drawn from the same pack, as often as we please; but many of the events which occur to human beings either cannot be repeated at all, or not often enough to secure in the case of the single individual any sufficient statistical uniformity. Such regularity as we trace in nature is owing, much more than is often suspected, to the arrangement of things in natural kinds, each of them containing a large number of individuals. Were each kind of animals or vegetables limited to a single pair, or even to but a few pairs, there would not be much scope left for the collection of statistical tables amongst them. Or to take a less violent supposition, if the numbers in each natural class of objects were much smaller than they are at present, or the differences between their varieties and sub-species much more marked, the consequent difficulty of extracting from them any sufficient length of statistical tables, though not fatal, might be very serious. A large number of objects in the class, together with that general similarity which entitles the objects to be fairly comprised in one class, seem to be important conditions for the applicability of the theory of Probability to any phenomenon. Something analogous to this excessive paucity of objects in a class would be found in the attempt to apply special Insurance offices to the case of those trades where the numbers are very limited, and the employment so dangerous as to put them in a class by themselves. If an insurance society were started for the workmen in gunpowder mills alone, a premium would have to be charged to avoid possible ruin, so high as to illustrate the extreme paucity of appropriate statistics.
§ 3. It's important to note that for this purpose, the existence of natural groups or categories is essential. In our games of chance, we can roll the same die or draw a card from the same deck as often as we want; however, many events that happen to people either can't be repeated at all or aren't repeated often enough to provide enough statistical consistency for individual cases. The regularity we observe in nature is, more than is usually realized, due to the organization of things into natural categories, each containing a large number of individuals. If each species of animals or plants was limited to just one pair, or even a few pairs, there wouldn't be much opportunity to gather statistical data from them. Alternatively, if the numbers within each natural category were much smaller than they are now, or if the differences among their varieties and subspecies were more pronounced, the resulting difficulty in collecting adequate statistical data, while not necessarily fatal, could be quite serious. Having a large number of items in a category, along with a general similarity that allows them to be grouped together, appears to be crucial for applying Probability theory to any phenomenon. A situation similar to this extreme scarcity of items in a category would be found when trying to apply specific insurance companies to trades with very few participants, where the risks are so high that they form a class by themselves. If an insurance company were established solely for workers in gunpowder mills, the premiums would need to be set so high to prevent potential financial disaster that they would highlight the severe lack of relevant statistics.
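The closing example can be given a rough numerical form. The Python sketch below is not from the text; the death rate, pool sizes, and number of simulated years are invented, and it simply contrasts the year-to-year fluctuation of the observed loss rate in a pool of thirty lives with that in a pool of thirty thousand.

```python
import random
import statistics

# A rough sketch, not from the text: the death rate (2%), the pool sizes, and
# the number of simulated years are all invented for illustration.
random.seed(3)

def yearly_loss_rates(pool_size, death_rate, years=200):
    """Observed death rate in the insured pool for each simulated year."""
    rates = []
    for _ in range(years):
        deaths = sum(1 for _ in range(pool_size) if random.random() < death_rate)
        rates.append(deaths / pool_size)
    return rates

small_pool = yearly_loss_rates(pool_size=30, death_rate=0.02)
large_pool = yearly_loss_rates(pool_size=30_000, death_rate=0.02)

print("thirty lives:          mean rate %.4f, year-to-year spread %.4f"
      % (statistics.mean(small_pool), statistics.stdev(small_pool)))
print("thirty thousand lives: mean rate %.4f, year-to-year spread %.4f"
      % (statistics.mean(large_pool), statistics.stdev(large_pool)))
# With thirty lives the observed rate swings violently, so a premium safe
# against ruin must sit far above the long-run average; with thirty thousand
# lives it scarcely moves.
```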
§ 4. So much (at present) for the objects. If we turn to what we have termed the agencies, we find much the same thing again here. By the adjustment of their relative intensity, and the respective frequency of their occurrence, the total effects which they produce are found to be also tolerably uniform. It is of course conceivable that this should have been otherwise. It might have been found that the second group of conditions so exactly corrected the former as to convert the merely general uniformity into an absolute one; or it might have been found, on the other hand, that the second group should aggravate or disturb the influence of the former to such an extent as to destroy all the uniformity of its effects. Practically neither is the case. The second condition simply varies the details, leaving the uniformity on the whole of precisely the same general description as it was before. Or if the objects were supposed to be absolutely alike, as in the case of successive throws of a penny, it may serve to bring about a uniformity. Analysis will show these agencies to be thus made up of an almost infinite number of different components, but it will detect the same peculiarity that we have so often had occasion to refer to, pervading almost all these components. The proportions in which they are combined will be found to be nearly, though not quite, the same; the intensity with which they act will be nearly though not quite equal. And they will all unite and blend into a more and more perfect regularity as we proceed to take the average of a larger number of instances.
§ 4. That covers the objects for now. When we look at what we've called the agencies, we find a similar situation. By adjusting their relative strengths and how often they occur, the total effects they produce tend to be quite consistent. It’s possible that the results could have been different. The second group of conditions might have perfectly corrected the first, turning what was a general consistency into an absolute one; or, on the flip side, the second group might have made the influence of the first so much worse or disordered that it eliminated any consistency in its effects. In practice, neither of these scenarios occurs. The second condition merely alters the details while keeping the overall consistency essentially the same as it was before. If the objects were assumed to be exactly alike, like in consecutive penny flips, a uniformity could be achieved. Analysis will reveal that these agencies consist of an almost infinite number of different components, but it will also uncover the same characteristic we've often noted, present in almost all these components. The ratios in which they combine will be found to be almost, but not entirely, identical; the strength with which they operate will be nearly, though not completely, equal. And as we average more instances, they will all come together and merge into an increasingly perfect regularity.
Take, for instance, the length of life. As we have seen, the constitutions of a very large number of persons selected at random will be found to present much the same feature; general uniformity accompanied by individual irregularity. Now when these persons go out into the world, they are exposed to a variety of agencies, the collective influence of which will assign to each the length of life allotted to him. These agencies are of course innumerable, and their mutual interaction complicated beyond all power of analysis to extricate. Each effect becomes in its turn a cause, is interwoven inextricably with an indefinite number of other causes, and reacts upon the final result. Climate, food, clothing, are some of these agencies, or rather comprise aggregate groups of them. The nature of a man's work is also important. One man overworks himself, another follows an unhealthy trade, a third exposes himself to infection, and so on.
Take, for example, the length of life. As we've seen, the health conditions of a large number of randomly selected people tend to show a similar pattern: overall uniformity with individual differences. When these people enter the world, they face various influences, and together, these will determine the length of life each person has. These influences are countless, and their interactions are so complex it's impossible to fully analyze them. Each effect becomes a cause in itself, intricately linked to an endless number of other causes, and influences the final outcome. Climate, diet, and clothing are some of these influences, or rather, they represent broader categories of them. The type of work a person does is also significant. One person might work too hard, another might be in an unhealthy occupation, a third might be exposed to illness, and so on.
The result of all this interaction between what we have thus called objects and agencies is that the final outcome presents the same general characteristics of uniformity as may be detected separately in the two constituent elements. Or rather, as we shall proceed presently to show, it does so in the great majority of cases.
The result of all this interaction between what we've referred to as objects and agencies is that the final outcome shows the same general characteristics of uniformity that can be observed separately in the two components. Or rather, as we will soon demonstrate, this is true in the vast majority of cases.
§ 5. It may be objected that such an explanation as the above does not really amount to anything deserving of the name, for that instead of explaining how a particular state of things is caused it merely points out that the same state exists elsewhere. There is a uniformity discovered in the objects at the stage when they are commonly submitted to calculation; we then grope about amongst the causes of them, and after all only discover a precisely similar uniformity existing amongst these causes. This is to some extent true, for though part of the objection can be removed, it must always remain the case that the foundations of an objective science will rest in the last resort upon the mere fact that things are found to be of such and such a character.
§ 5. One might argue that the explanation given above doesn’t really hold any significance because, instead of explaining how a specific situation arises, it simply points out that a similar situation exists elsewhere. There is a consistency observed in the objects when they are typically measured; we then search for their causes, only to find a similar consistency among those causes. This is partly true, as some of the objections can be addressed, but it will always be the case that the foundations of an objective science ultimately rely on the simple fact that things are found to have a certain nature.
§ 6. This division, into objects and the agencies which affect them, is merely intended for a rough practical arrangement, sufficient to point out to the reader the immediate nature of the causes which bring about our familiar uniformities. If we go back a step further, it might fairly be maintained that they may be reduced to one, namely, to the agencies. The objects, as we have termed them, are not an original creation in the state in which we now find them. No one supposes that whole groups or classes were brought into existence simultaneously, with all their general resemblances and particular differences fully developed. Even if it were the case that the first parents of each natural kind had been specially created, instead of being developed out of pre-existing forms, it would still be true that amongst the numbers of each that now present themselves the characteristic differences and resemblances are the result of what we have termed agencies. Take, for instance, a single characteristic only, say the height; what determines this as we find it in any given group of men? Partly, no doubt, the nature of their own food, clothing, employment, and so on, especially in the earliest years of their life; partly also, very likely, similar conditions and circumstances on the part of their parents at one time or another. No one, I presume, in the present state of knowledge, would attempt to enumerate the remaining causes, or even to give any indication of their exact nature; but at the same time few would entertain any doubt that agencies of this general description have been the determining causes at work.
§ 6. This division into objects and the forces that affect them is just a basic way to organize things, enough to show the reader the immediate causes of our familiar patterns. If we look deeper, we could argue that they can all be simplified into one category: the forces. The objects we’ve referred to aren’t original creations in the current state we observe them. No one thinks that entire groups or classes appeared at the same time, fully formed with all their general similarities and specific differences. Even if the first parents of each natural kind were specially created instead of evolving from earlier forms, it would still be true that the various differences and similarities we see among them result from these so-called forces. Take, for example, just one feature, like height; what influences this in any given group of people? Partly, it’s likely due to their food, clothing, jobs, and so on, especially during their early years; and also, probably, similar conditions and circumstances from their parents at some point. No one, I assume, with our current knowledge would try to list all the other causes or even specify their exact nature, but at the same time, few would doubt that forces of this general kind have been the key factors at play.
If it be asked again, Into what may these agencies themselves be ultimately analysed? the answer to this question, in so far as it involves any detailed examination of them, would be foreign to the plan of this essay. In so far as any general remarks, applicable to nearly all classes alike of such agencies, are called for, we are led back to the point from which we started in the previous chapter, when we were discussing whether there is necessarily one fixed law according to which all our series are formed. We there saw that every event might be regarded as being brought about by a comparatively few important causes, of the kind which comprises all of which ordinary observation takes any notice, and an indefinitely numerous group of small causes, too numerous, minute, and uncertain in their action for us to be able to estimate them or indeed to take them individually into account at all. The important ones, it is true, may also in turn be themselves conceived to be made up of aggregates of small components, but they are still best regarded as being by comparison simple and distinct, for their component parts act mostly in groups collectively, appearing and disappearing together, so that they possess the essential characteristics of unity.
If we ask again, what can these agencies ultimately be broken down into? The answer to that question, as far as it requires a detailed examination, would go beyond the scope of this essay. However, when it comes to general remarks relevant to most types of these agencies, we're led back to the point we started from in the previous chapter, where we discussed whether there is a single, fixed law that governs how all our series are formed. We saw that every event can be viewed as the result of a limited number of significant causes, which includes everything that ordinary observation focuses on, and an indefinitely large number of small causes, too numerous, tiny, and unpredictable in their effect for us to actually measure them or consider them individually. True, the significant causes can also be thought of as made up of collections of smaller components, but they are still best understood as relatively simple and distinct. This is because their components mostly work together in groups, appearing and disappearing as a unit, giving them the essential traits of unity.
§ 7. Now, broadly speaking, it appears to me that the most suitable conditions for Probability are these: that the important causes should be by comparison fixed and permanent, and that the remaining ones should on the average continue to act as often in one direction as in the other. This they may do in two ways. In the first place we may be able to predicate nothing more of them than the mere fact that they act[1] as often in one direction as the other; what we should then obtain would be merely the simple statistical uniformity that is described in the first chapter. But it may be the case, and in practice generally is so more or less approximately, that these minor causes act also in independence of one another. What we then get is a group of uniformities such as was explained and illustrated in the second chapter. Every possible combination of these causes then occurring with a regular degree of frequency, we find one peculiar kind of uniformity exhibited, not merely in the mere fact of excess and defect (of whatever may be the variable quality in question), but also in every particular amount of excess and defect. Hence, in this case, we get what some writers term a ‘mean’ or ‘type,’ instead of a simple average. For instance, suppose a man throwing a quoit at a mark. Here our fixed causes are his strength, the weight of the quoit, and the intention of aiming at a given point. These we must of course suppose to remain unchanged, if we are to obtain any such uniformity as we are seeking. The minor and variable causes are all those innumerable little disturbing influences referred to in the last chapter. It might conceivably be the case that we were only able to ascertain that these acted as often in one direction as in the other; what we should then find was that the quoit tended to fall short of the mark as often as beyond it. But owing to these little causes being mostly independent of one another, and more or less equal in their influence, we find also that every amount of excess and defect presents the same general characteristics, and that in a large number of throws the quantity of divergences from the mark, of any given amount, is a tolerably determinate function, according to a regular law, of that amount of divergence.[2]
§ 7. Generally speaking, it seems to me that the best conditions for Probability are that the key causes should remain relatively fixed and constant, while the other causes should, on average, influence outcomes equally in both directions. This can happen in two ways. First, we might only be able to observe that these causes act as often in one direction as in the other; in this case, we'd only see the simple statistical uniformity mentioned in the first chapter. However, it's often the case that these smaller causes operate independently from each other, even if it's only approximately. This leads to a collection of uniformities, as explained and illustrated in the second chapter. When all possible combinations of these causes occur with a regular frequency, we notice a specific type of uniformity not just in the mere fact of surplus and shortage (regardless of what variable is being measured), but also in every specific level of surplus and shortage. Thus, we end up with what some authors call a 'mean' or 'type,' rather than just a simple average. For example, imagine a person throwing a quoit at a target. In this scenario, our fixed causes are their strength, the weight of the quoit, and the intent to hit a specific point. We must assume these remain unchanged in order to achieve the uniformity we’re looking for. The minor and variable causes consist of countless little disturbances mentioned in the last chapter. It might happen that we can only determine that these influences act as often in one direction as the other; in that case, the quoit would tend to land short of the target just as often as it would go beyond it. However, because these minor causes mostly operate independently and have somewhat equal effects, we also find that every amount of surplus and shortage exhibits the same general traits, and with a lot of throws, the frequency of divergences from the target, of any specific amount, follows a fairly definite function based on that level of divergence. [2]
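The quoit example lends itself to a small simulation. In the Python sketch below, which is not from the text, the fixed causes are represented by a fixed aim at zero and the minor causes by forty small, equal, independent pushes, each as likely to fall short as to carry beyond; the numbers forty and twenty thousand are arbitrary. Tabulating the throws shows the regular grouping of divergences described above.

```python
import random
from collections import Counter

# A rough sketch, not from the text: a fixed aim plus forty small, equal,
# independent disturbances, each as likely to push the throw short of the
# mark as beyond it.  The numbers forty and twenty thousand are arbitrary.
random.seed(4)

N_DISTURBANCES = 40
N_THROWS = 20_000

divergences = Counter()
for _ in range(N_THROWS):
    net = sum(random.choice((-1, 1)) for _ in range(N_DISTURBANCES))
    divergences[net] += 1   # net divergence from the mark, in arbitrary units

for amount in range(-10, 12, 2):   # sums of forty steps of +1 or -1 are always even
    print(f"divergence {amount:+3d}: {divergences[amount]:5d} throws")
# Small divergences are much the commonest, excess and defect occur about
# equally often, and the frequency of each particular amount of divergence
# is roughly a determinate function of that amount.
```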
§ 8. The necessity of the conditions just hinted at will best be seen by a reference to cases in which any of them happen to be missing. Thus we know that the length of life is on the whole tolerably regular, and so are the numbers of those who die in successive years or centuries of most of the commoner diseases. But it does not seem to be the case with all diseases. What, for instance, of the Sweating Sickness, the Black Death, the Asiatic Cholera? The two former either do not recur, or, if they do, recur in such a mild form as not to deserve the same name. What in fact of any of the diseases which are epidemic rather than endemic? All these have their causes doubtless, and would be produced again by the recurrence of the conditions which caused them before. But some of them apparently do not recur at all. They seem to have depended upon such rare conditions that their occurrence was almost unique. And of those which do recur the course is frequently so eccentric and irregular, often so much dependent upon human will or want of will, as to entirely deprive their results (that is, the annual number of deaths which they cause) of the statistical uniformity of which we are speaking.
§ 8. The need for the conditions mentioned earlier becomes clear when we look at cases where any of them are absent. We can see that the average lifespan is fairly consistent, as are the numbers of people who die from most common diseases in consecutive years or centuries. However, this isn’t true for all diseases. Take the Sweating Sickness, the Black Death, or Asiatic Cholera, for example. The first two either don’t return, or if they do, come back in such a mild form that they don't really qualify for the same name. What about diseases that are epidemic rather than endemic? They certainly have their causes, and could occur again if the conditions that triggered them before were present again. Yet some of them seemingly never return at all. Their occurrences appear to have depended on such rare conditions that they were almost one-time events. For those that do recur, the pattern is often so erratic and inconsistent, frequently influenced by human actions or inactions, that the outcomes (specifically, the annual death toll they cause) lose the statistical regularity we are discussing.
The explanation probably is that one of the principal causes in such cases is what we commonly call contagion. If so, we have at once a cause which so far from being fixed is subject to the utmost variability. Stringent caution may destroy it, carelessness may aggravate it to any extent. The will of man, as finding its expression either on the part of government, of doctors, or of the public, may make of it pretty nearly what is wished, though against the possibility of its entrance into any community no precautions can absolutely insure us.
The explanation is likely that one of the main reasons in these situations is what we usually refer to as contagion. If that's the case, we have a cause that is anything but constant and can change drastically. Strict caution can eliminate it, while carelessness can make it worse to any degree. The will of individuals, whether expressed by the government, doctors, or the public, can almost shape it as desired, although no precautions can completely guarantee against its entry into any community.
§ 9. If it be replied that this want of statistical regularity only arises from the fact of our having confined ourselves to too limited a time, and that we should find irregularity disappear here, as elsewhere, if we kept our tables open long enough, we shall find that the answer will suggest another case in which the requisite conditions for Probability are wanting. Such a reply would only be conclusive upon the supposition that the ways and thoughts of men are in the long run invariable, or if variable, subject to periodic changes only. On the assumption of a steady progress in society, either for the better or the worse, the argument falls to the ground at once. From what we know of the course of the world, these fearful pests of the past may be considered as solitary events in our history, or at least events which will not be repeated. No continued uniformity would therefore be found in the deaths which they occasion, though the registrar's books were kept open for a thousand years. The reason here is probably to be sought in the gradual alteration of those indefinitely numerous conditions which we term collectively progress or civilization. Every little circumstance of this kind has some bearing upon the liability of any one to catch a disease. But when a kind of slow and steady tide sets in, in consequence of which these influences no longer remain at about the same average strength, warring on about equal terms with hostile influences, but on the contrary show a steady tendency to increase their power, the statistics will, with consequent steadiness and permanence, take the impress of such a change.
§ 9. If someone argues that this lack of statistical consistency is just because we've limited ourselves to too short a timeframe, and that we'd see irregularity fade away if we kept our records open long enough, it's important to recognize that this response points to another situation where the necessary conditions for Probability are missing. Such a reply would only hold true if we assume that people's behaviors and thoughts are ultimately unchanging, or if they only change periodically. If we consider that society progresses steadily, whether for the better or worse, the argument falls apart immediately. Based on what we know about history, these terrible plagues from the past can be seen as isolated incidents, or at least events that are unlikely to happen again. Therefore, no consistent pattern would be apparent in the deaths they caused, even if the registrar's books were kept open for a thousand years. The explanation likely lies in the slow, gradual changes in the countless factors that we collectively refer to as progress or civilization. Every little detail can influence an individual's susceptibility to disease. However, when a persistent and steady shift occurs that causes these influences to no longer maintain a roughly average strength—competing evenly with opposing factors—but instead shows a consistent trend of increasing strength, the statistics will reflect such a change with reliability and permanence.
§ 10. Briefly then, if we were asked where the distinctive characteristics of Probability are most prominently to be found, and where they are most prominently absent, we might say that (1) they prevail principally in the properties of natural kinds, both in the ultimate and in the derivative or accidental properties. In all the characteristics of natural species, in all they do and in all which happens to them, so far as it depends upon their properties, we seldom fail to detect this regularity. Thus in men; their height, strength, weight, the age to which they live, the diseases of which they die; all present a well-known uniformity. Life insurance tables offer the most familiar instance of the importance of these applications of Probability.
§ 10. To sum up, if we were to identify where the unique traits of Probability are most evident and where they are least evident, we could say that (1) they primarily show up in the properties of natural categories, both in their fundamental and in their accidental or derived properties. In all the characteristics of natural species, in everything they do and everything that happens to them, as far as it relates to their properties, we rarely fail to notice this regularity. For example, in humans; their height, strength, weight, lifespan, and the diseases they die from all exhibit a well-known consistency. Life insurance tables provide the most familiar example of the significance of these applications of Probability.
(2) The same peculiarity prevails again in the force and frequency of most natural agencies. Wind and weather are seen to lose their proverbial irregularity when examined on a large scale. Man's work therefore, when operated on by such agencies as these, even though it had been made in different cases absolutely alike to begin with, afterwards shows only a general regularity. I may sow exactly the same amount of seed in my field every year. The yield may one year be moderate, the next year be abundant through favourable weather, and then again in turn be destroyed by hail. But in the long run these irregularities will be equalized in the result of my crops, because they are equalized in the power and frequency of the productive agencies. The business of underwriters, and offices which insure the crops against hail, would fall under this class; though, as already remarked, there is no very profound distinction between them and the former class.
(2) The same characteristic appears again in the strength and frequency of most natural forces. Wind and weather seem to lose their well-known unpredictability when looked at from a broader perspective. Therefore, human work, even if it starts out exactly alike in different cases, displays only a general consistency once such forces have acted on it. I can plant the exact same amount of seeds in my field every year. One year, the yield might be average, the next might be plentiful due to good weather, and then it could be wiped out by hail. However, over time, these inconsistencies will balance out in the outcome of my crops because they balance out in the strength and frequency of the productive forces. The work of underwriters and companies that insure crops against hail falls into this category; although, as mentioned earlier, there's not a significant difference between them and the previous category.
The reader must be reminded again that this fixity is only temporary, that is, that even here the series belong to the class of those which possess a fluctuating type. Those indeed who believe in the fixity of natural species will have the best chance of finding a series of the really permanent type amongst them, though even they will admit that some change in the characteristic is attainable in length of time. In the case of the principal natural agencies, it is of course incontestable that the present average is referable to the present geological period only. Our average temperature and average rainfall have in former times been widely different from what they now are, and doubtless will be so again.
The reader needs to be reminded again that this stability is only temporary, meaning that even here the series belong to the category of those that have a fluctuating nature. Those who believe in the permanence of natural species will have the best chance of identifying a series that is genuinely lasting among them, although even they will concede that some changes in characteristics can occur over time. Regarding the main natural processes, it is undeniably true that the current averages are only applicable to the present geological period. Our average temperature and average rainfall have been significantly different in the past compared to what they are now, and they will likely change again in the future.
Any fuller investigation of the process by which, on the Theory of Evolution, out of a primeval simplicity and uniformity the present variety was educed, hardly belongs to the scope of the present work: at most, a few hints must suffice.
Any deeper exploration of how, according to the Theory of Evolution, the current diversity emerged from a basic simplicity and sameness, really goes beyond the focus of this work: at most, just a few suggestions will have to do.
§ 11. The above, then, are instances of natural objects and natural agencies. There seems reason to believe that it is in such things only, as distinguished from things artificial, that the property in question is to be found. This is an assertion that will need some discussion and explanation. Two instances, in apparent opposition, will at once occur to the mind of some readers; one of which, from its great intrinsic importance, and the other, from the frequency of the problems which it furnishes, will demand a few minutes' separate examination.
§ 11. The examples above are of natural objects and natural processes. There’s a good reason to think that the property we’re discussing exists only in these natural items, not in artificial ones. This claim requires some discussion and clarification. Two examples that seem contradictory might come to mind for some readers; one is very significant, and the other frequently raises issues worth a brief separate look.
(1) The first of these is the already mentioned case of instrumental observations. In the use of astronomical and other instruments the utmost possible degree of accuracy is often desired, a degree which cannot be reasonably hoped for in any one single observation. What we do therefore in these cases is to make a large number of successive observations which are naturally found to differ somewhat from each other in their results; by means of these the true value (as explained in a future chapter, on the Method of Least Squares) is to be determined as accurately as possible. The subjects then of calculation here are a certain number of elements, slightly incorrect elements, given by successive observations. Are not these observations artificial, or the direct product of voluntary agency? Certainly not: or rather, the answer depends on what we understand by voluntary. What is really intended and aimed at by the observer, is of 66 course, perfect accuracy, that is, the true observation, or the voluntary steps and preliminaries on which this observation depends. Whether voluntary or not, this result only can be called intentional. But this result is not obtained. What we actually get in its place is a series of deviations from it, containing results more or less wide of the truth. Now by what are these deviations caused? By just such agencies as we have been considering in some of the earlier sections in this chapter. Heat and its irregular warping influence, draughts of air producing their corresponding effects, dust and consequent friction in one part or another, the slight distortion of the instrument by strains or the slow uneven contraction which continues long after the metal was cast; these and such as these are some of the causes which divert us from the truth. Besides this group, there are others which certainly do depend upon human agency, but which are not, strictly speaking, voluntary. They are such as the irregular action of the muscles, inability to make our various organs and members execute precisely the purposes we have in mind, perhaps different rates in the rapidity of the nervous currents, or in the response to stimuli, in the same or different observers. The effect produced by some of these, and the allowance that has in consequence to be made, are becoming familiar even to the outside world under the name of the ‘personal equation’ in astronomical, psychophysical, and other observations.
(1) The first example is the previously mentioned case of instrumental observations. When using astronomical and other instruments, the highest level of accuracy is often desired, a level that can't reasonably be expected from any single observation. In these cases, we make a large number of successive observations, which naturally tend to vary from each other in their results; through these, we aim to determine the true value (as explained in a future chapter on the Method of Least Squares) as accurately as possible. The subjects of calculation here are a certain number of elements, slightly incorrect elements, derived from successive observations. Are these observations artificial, or the direct result of human intention? Certainly not; or rather, the answer depends on our definition of voluntary. What the observer really intends and aims for is perfect accuracy, that is, the true observation, or the voluntary steps and preliminaries on which this observation relies. Whether voluntary or not, only this result can be called intentional. However, this result is not achieved. Instead, we get a series of deviations from it, containing results that are more or less far from the truth. Now, what causes these deviations? They arise from just the kinds of influences we discussed in earlier sections of this chapter. Heat and its unpredictable warping effects, drafts of air producing their corresponding impacts, dust and the resulting friction in one part or another, slight distortion of the instrument from strain, or the slow, uneven contraction that continues long after the metal has been cast; these and similar factors divert us from the truth. In addition to this group, there are others that certainly depend on human action, but which are not strictly voluntary. These include the irregular actions of our muscles, the inability to make our various organs and limbs perform exactly what we intend, and perhaps different rates in the speed of nervous impulses or responses to stimuli, whether in the same observer or different ones. The effects produced by some of these, and the adjustments that must therefore be made, are becoming familiar even to the general public under the name of the 'personal equation' in astronomical, psychophysical, and other observations.
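The way a multitude of small, independent disturbances scatters individual readings around the true value, while leaving their average close to it, can be made concrete with a short simulation. The sketch below is purely illustrative and not part of the original argument; the number of error sources, their sizes, and the number of observations are arbitrary choices.

```python
# Illustrative sketch: each observation is the true value plus many small,
# independent disturbances (heat, draughts, friction, the "personal equation").
# All magnitudes and counts here are invented for the example.
import random
import statistics

TRUE_VALUE = 100.0          # the quantity the observer is aiming at
N_DISTURBANCES = 50         # number of minor independent error sources
N_OBSERVATIONS = 1000       # number of repeated observations

def one_observation():
    # Each disturbance pushes the reading a little up or down, independently.
    error = sum(random.uniform(-0.02, 0.02) for _ in range(N_DISTURBANCES))
    return TRUE_VALUE + error

readings = [one_observation() for _ in range(N_OBSERVATIONS)]

print("scatter of single readings (st. dev.):", round(statistics.stdev(readings), 4))
print("mean of all readings:                 ", round(statistics.mean(readings), 4))
```

Each single reading wanders from the true value, yet the mean of many readings settles very near it; it is this situation that the later chapter on the Method of Least Squares takes up.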
§ 12. (2) The other example, alluded to above, is the stock one of cards and dice. Here, as in the last case, the result is remotely voluntary, in the sense that deliberate volition presents itself at one stage. But subsequently to this stage, the result is produced or affected by so many involuntary agencies that it owes its characteristic properties to these. The turning up, for example, of a particular face of a die is 67 the result of voluntary agency, but it is not an immediate result. That particular face was not chosen, though the fact of its being chosen was the remote consequence of an act of choice. There has been an intermediate chaos of conflicting agencies, which no one can calculate before or distinguish afterwards. These agencies seem to show a uniformity in the long run, and thence to produce a similar uniformity in the result. The drawing of a card from a pack is indeed more directly volitional, as in cutting for partners in a game of whist. But no one continues to do this long without having the pack well shuffled in the interval, whereby a host of involuntary influences are let in.
§ 12. (2) The other example mentioned earlier is the classic one of cards and dice. Here, as in the previous case, the outcome is only remotely voluntary, in the sense that a deliberate choice comes into play at one stage. However, after this stage, the outcome is influenced by so many involuntary factors that its defining characteristics come from these influences. For instance, the appearance of a certain side of a die is a result of voluntary action, but it is not an immediate result. That specific side was not chosen directly, even though its selection was a distant consequence of an act of choice. In between there has been a chaos of conflicting factors that no one can calculate beforehand or distinguish afterward. These factors tend to show a uniformity in the long run, and so produce a similar uniformity in the outcome. Drawing a card from a deck is indeed more directly an act of will, as seen in cutting for partners in a game of whist. But no one keeps doing this for long without thoroughly shuffling the deck in the meantime, which lets in a host of involuntary influences.
§ 13. The once startling but now familiar uniformities exhibited in the cases of suicides and misdirected letters, do not belong to the same class. The final resolution, or want of it, which leads to these results, is in each case indeed an important ingredient in the individual's action or omission; but, in so far as volition has anything to do with the results as a whole, it instantly disturbs them. If the voice of the Legislature speaks out, or any great preacher or moralist succeeds in deterring, or any impressive example in influencing, our moral statistics are instantly tampered with. Some further discussion will be devoted to this subject in a future chapter; it need only be remarked here that (always excluding such common or general influence as those just mentioned) the average volition, potent as it is in each separate case, is on the whole swayed by non-voluntary conditions, such as those of health, the casualties of employment, &c., in fact the various circumstances which influence the length of a man's life.
§ 13. The once shocking but now familiar patterns seen in cases of suicides and misdirected letters don’t belong to the same category. The ultimate decision, or lack of one, that leads to these outcomes is indeed a significant factor in the individual's actions or inactions; however, whenever personal choice plays a role in the overall results, it disrupts them immediately. If the Legislature speaks out, or if a prominent preacher or moralist successfully discourages, or if any compelling example influences behavior, our moral statistics are quickly interfered with. Some additional discussion will be dedicated to this topic in a later chapter; for now, it’s worth noting that (always excluding the common or general influences just mentioned) the average decision, while powerful in each individual case, is generally influenced by non-voluntary factors, such as health conditions, job-related incidents, etc., in fact, the various circumstances that affect the span of a person's life.
§ 14. Such distinctions as those just insisted on may seem to some persons to be needless, but serious errors have occasionally arisen from the neglect of them. The immediate 68 products of man's mind, so far indeed as we can make an attempt to obtain them, do not seem to possess this essential characteristic of Probability. Their characteristic seems rather to be, either perfect mathematical accuracy or utter want of it, either law unfailing or mere caprice. If, e.g., we find the trees in a forest growing in straight lines, we unhesitatingly conclude that they were planted by man as they stand. It is true on the other hand, that if we find them not regularly planted, we cannot conclude that they were not planted by man; partly because the planter may have worked without a plan, partly because the subsequent irregularities brought on by nature may have obscured the plan. Practically the mind has to work by the aid of imperfect instruments, and is subjected to many hindrances through various and conflicting agencies, and by these means the work loses its original properties. Suppose, for instance, that a man, instead of producing numerical results by imperfect observations or by the cast of dice, were to select them at first hand for himself by simply thinking of them at once; what sort of series would he obtain? It would be about as difficult to obtain in this way any such series as those appropriate to Probability as it would be to keep his heart or pulse working regularly by direct acts of volition, supposing that he had the requisite control over these organs. But the mere suggestion is absurd. A man must have an object in thinking, he must think according to a rule or formula; but unless he takes some natural series as a copy, he will never be able to construct one mentally which shall permanently imitate the originals. Or take another product of human efforts, in which the intention can be executed with tolerable success. When any one builds a house, there are many slight disturbing influences at work, such as shrinking of bricks and mortar, settling of foundations, &c. But the 69 effect which these disturbances are able to produce is so inappreciably small, that we may fairly consider that the result obtained is the direct product of the mind, the accurate realization of its intention. What is the consequence? Every house in the row, if designed by one man and at one time, is of exactly the same height, width, &c. as its neighbours; or if there are variations they are few, definite, and regular. The result offers no resemblance whatever to the heights, weights, &c. of a number of men selected at random. The builder probably had some regular design in contemplation, and he has succeeded in executing it.
§ 14. The distinctions mentioned earlier may seem unnecessary to some, but ignoring them can lead to serious mistakes. The immediate outcomes of human thought, as much as we can try to understand them, don’t seem to have the essential characteristic of Probability. Instead, they tend to show either perfect mathematical precision or complete randomness—either consistent laws or mere chance. For example, when we see trees in a forest arranged in straight lines, we readily conclude that they were deliberately planted that way by a person. However, if they appear to be growing irregularly, we can't assume they weren’t planted by someone; the planter could have had no specific plan, or nature’s later disruptions might have obscured the original layout. In practice, our thinking relies on imperfect tools and faces various obstacles from conflicting factors, causing the outcome to lose its original characteristics. Imagine if someone were to generate numerical results not through flawed observations or dice rolls, but by purely envisioning them directly—what kind of series would result? It would be as challenging to generate a series suitable for Probability this way as it would be to control one’s heartbeat or pulse merely by willpower, even if one could actually manage those functions. The notion itself is ridiculous. A person needs a purpose in their thinking, needing to follow some rule or guideline; without using a natural series as a model, they won’t manage to create a mental series that consistently resembles the originals. Or consider another human endeavor, where the intentions can be realized with reasonable success. When someone builds a house, various slight disturbances come into play, like the shrinking of bricks and mortar, or the settling of foundations, etc. Yet, the impact of these disturbances is so minuscule that we can justifiably view the resulting structure as a direct product of the builder's mind, accurately reflecting their intentions. What’s the result? Every house in a row, if designed by one person at the same time, will be exactly the same height, width, etc., as its neighbors; and if there are any differences, they are minimal, specific, and regular. The outcome bears no resemblance to the heights, weights, etc., of a randomly selected group of men. The builder likely had a consistent plan in mind and successfully executed it.
§ 15. It may be replied that if we extend our observations, say to the houses of a large city, we shall then detect the property under discussion. The different heights of a great number, when grouped together, might be found to resemble those of a great number of human beings under similar treatment. Something of this kind might not improbably be found to be the case, though the resemblance would be far from being a close one. But to raise this question is to get on to different ground, for we were speaking (as remarked above) not of the work of different minds with their different aims, but of that of one mind. In a multiplicity of designs, there may be that variable uniformity, for which we may look in vain in a single design. The heights which the different builders contemplated might be found to group themselves into something of the same kind of uniformity as that which prevails in most other things which they should undertake to do independently. We might then trace the action of the same two conditions,—a uniformity in the multitude of their different designs, a uniformity also in the infinite variety of the influences which have modified those designs. But this is a very different thing from saying that the work of one man 70 will show such a result as this. The difference is much like that between the tread of a thousand men who are stepping without thinking of each other, and their tread when they are drilled into a regiment. In the former case there is the working, in one way or another, of a thousand minds; in the latter, of one only.
§ 15. One could argue that if we expand our observations to include the buildings in a large city, we would identify the property in question. The varying heights of many structures, when viewed together, might resemble those of a large group of individuals experiencing similar conditions. While this might be somewhat true, the resemblance would not be very close. However, bringing up this point leads us to a different discussion, as we were focused (as noted earlier) not on the work of various minds with different goals, but rather on that of a single mind. In a variety of designs, we may find a kind of inconsistent uniformity that we cannot expect from just one design. The heights that different builders envisioned might cluster into a form of uniformity similar to what we see in most other projects they undertake independently. We could then observe the influence of the same two factors—uniformity among their diverse designs and uniformity within the endless variety of influences that have shaped those designs. But this is quite distinct from suggesting that the work of one person will yield such a result. The difference is comparable to the footsteps of a thousand people walking without any awareness of each other, versus those marching in formation. In the former scenario, the actions reflect the thoughts of a thousand minds; in the latter, just one.
The investigations of this and the former chapter constitute a sufficiently close examination into the detailed causes by which the peculiar form of statistical results with which we are concerned is actually produced, to serve the purpose of a work which is occupied mainly with the methods of the Science of Probability. The great importance, however, of certain statistical or sociological enquiries will demand a recurrence in a future chapter to one particular application of these statistics, viz. to those concerned with some classes of human actions.
The investigations in this chapter and the previous one provide a thorough look into the specific causes that create the unique statistical results we're discussing, which is essential for a work focused mainly on the methods of Probability Science. However, the significant importance of certain statistical or sociological studies will require us to revisit one specific application of these statistics in a future chapter, namely, those related to certain types of human actions.
§ 16. The only important addition to, or modification of, the foregoing remarks which I have found occasion to make is due to Mr Galton. He has recently pointed out,—and was I believe the first to do so,—that in certain cases some analysis of the causal processes can be effected, and is in fact absolutely necessary in order to account for the facts observed. Take, for instance, the heights of the population of any country. If the distribution or dispersion of these about their mean value were left to the unimpeded action of those myriad productive agencies alluded to above, we should certainly obtain such an arrangement in the posterity of any one generation as had already been exhibited in the parents. That is, we should find repeated in the previous stage the same kind of order as we were trying to account for in the following stage.
§ 16. The only significant addition to or change in the previous comments that I've felt the need to make comes from Mr. Galton. He recently pointed out—and I believe he was the first to do so—that in certain situations, some analysis of the causal processes is possible and is actually essential to explain the observed facts. For example, consider the heights of the population in any country. If the distribution or spread of these heights around their average value were solely determined by the countless productive factors mentioned earlier, we would certainly see a pattern in the offspring of any generation that reflected what was already present in the parents. In other words, we would observe the same type of order in the earlier generation that we are trying to explain in the later generation.
But then, as Mr Galton insists, if such agencies acted freely and independently, though we should get the same 71 kind of arrangement or distribution, we should not get the same degree of it: there would, on the contrary, be a tendency towards further dispersion. The ‘curve of facility’ (v. the diagram on p. 29) would belong to the same class, but would have a different modulus. We shall see this at once if we take for comparison a case in which similar agencies work their way without any counteraction whatever. Suppose, for instance, that a large number of persons, whose fortunes were equal to begin with, were to commence gambling or betting continually for some small sum. If we examine their circumstances after successive intervals of time, we should expect to find their fortunes distributed according to the same general law,—i.e. the now familiar law in question,—but we should also expect to find that the poorest ones were slightly poorer, and the richest ones slightly richer, on each successive occasion. We shall see more about this in a future chapter (on Gambling), but it may be taken for granted here that there is nothing in the laws of chance to resist this tendency towards intensifying the extremes.
But then, as Mr. Galton points out, if such agencies acted freely and independently, while we would still get the same kind of arrangement or distribution, we would not get the same degree of it: instead, there would be a tendency toward further dispersion. The 'curve of facility' (see the diagram on p. 29) would belong to the same class but would have a different modulus. We can see this at once if we compare it with a case where similar agencies operate without any counteraction whatever. Suppose, for example, that a large group of people, all starting with equal fortunes, began gambling or betting small amounts continuously. If we examined their situations after successive intervals of time, we would expect to find their fortunes distributed according to the same general law, that is, the now-familiar law in question, but we would also expect to find the poorest individuals slightly poorer, and the richest slightly richer, on each successive occasion. We will explore this more in a future chapter (on Gambling), but it can be taken for granted here that there is nothing in the laws of chance to resist this tendency toward intensifying the extremes.
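A short simulation makes this tendency visible. It is only an illustrative sketch: the number of players, the stake, and the checkpoints at which the fortunes are inspected are arbitrary; nothing is taken from the text beyond the idea of equal starting fortunes and repeated fair bets.

```python
# Illustrative sketch: players start with equal fortunes and repeatedly stake
# a small fixed sum on a fair bet; the spread of their fortunes keeps growing.
import random
import statistics

N_PLAYERS = 500
START = 100                              # arbitrary starting fortune
STAKE = 1                                # arbitrary small stake
CHECKPOINTS = [100, 400, 1600, 6400]     # numbers of bets at which to look

fortunes = [START] * N_PLAYERS
played = 0
for point in CHECKPOINTS:
    while played < point:
        fortunes = [f + (STAKE if random.random() < 0.5 else -STAKE)
                    for f in fortunes]
        played += 1
    print(f"after {point:5d} bets: spread of fortunes (st. dev.) = "
          f"{statistics.stdev(fortunes):7.2f}")
```

The spread roughly doubles each time the number of bets is quadrupled, and nothing in the fair bet itself checks that growth.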
Now it is found, on the contrary, in the case of vital phenomena,—for instance in that of height, and presumably of most of the other qualities which are in any way characteristic of natural kinds,—that there is, through a number of successive generations, a remarkable degree of fixity. The tall men are not taller, and the short men are not shorter, per cent. of the population in successive generations: always supposing of course that some general change of circumstances, such as climate, diet, &c. has not set in. There must therefore here be some cause at work which tends, so to say, to draw in the extremes and thus to check the otherwise continually increasing dispersion.
Now, on the contrary, in the case of vital phenomena, for instance height, and presumably most of the other qualities that in any way characterize natural kinds, it turns out that there is, over several successive generations, a remarkable degree of fixity. Tall men are not getting taller, and short men are not getting shorter, as a percentage of the population in successive generations, assuming of course that no general change of circumstances, such as climate or diet, has set in. There must therefore be some cause at work that tends, so to speak, to draw in the extremes and thus to check the otherwise continually increasing dispersion.
§ 17. The facts were first tested by careful experiment. 72 At the date of Mr Galton's original paper on the subject,[3] there were no available statistics of heights of human beings; so a physical element admitting of careful experiment (viz. the size or weight of certain seeds) was accurately estimated. From these data the actual amount of reversion from the extremes, that is, of the slight pressure continually put upon the extreme members with the result of crowding them back towards the mean, was determined, and this was compared with what theory would require in order to keep the characteristics of the species permanently fixed. Since then, statistics have been obtained to a large extent which deal directly with the heights of human beings.
§ 17. The facts were first tested through careful experiments. At the time of Mr. Galton's original paper on the subject, there were no available statistics on human heights; so a physical element that could be carefully experimented with (specifically, the size or weight of certain seeds) was accurately assessed. From this data, the actual amount of reversion from the extremes (meaning the slight pressure constantly applied to the extreme members that results in them being pushed back towards the average) was determined, and this was compared with what theory would suggest is needed to keep the species' characteristics permanently fixed. Since then, statistics related to human heights have been gathered extensively.
The general conclusion at which we arrive is that there are several causes at work which are neither slight nor independent. There is, for instance, the observed fact that the extremes are as a rule not equally fertile with the means, nor equally capable of resisting death and disease. Hence as regards their mere numbers, there is a tendency for them somewhat to thin out. Then again there is a distinct positive cause in respect of ‘reversion.’ Not only are the offspring of the extremes less numerous, but these offspring also tend to cluster about a mean which is, so to say, shifted a little towards the true centre of the whole group; i.e. towards the mean offspring of the mean parents.
The overall conclusion we reach is that there are several causes at play that are neither minor nor independent. For example, it's been observed that the extremes are usually not as fertile as the averages, nor are they as capable of resisting death and disease. Therefore, in terms of their sheer numbers, there's a tendency for them to somewhat dwindle. Additionally, there's a clear positive cause related to 'reversion.' Not only are the offspring of the extremes fewer in number, but they also tend to group around an average that shifts a bit closer to the true center of the entire group; that is, towards the average offspring of the average parents.
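Galton's point about reversion can be sketched in the same illustrative spirit. In the toy model below every parameter (the mean height, the amount of reversion, the size of the fresh variation in each generation) is invented for the example; it is meant only to show that with reversion the spread of the characteristic stays roughly fixed from generation to generation, while without it the spread keeps growing, as in the gambling case above.

```python
# Illustrative sketch: each offspring's height reverts part of the way from the
# parent's height toward the population mean, plus a fresh random component.
# With reversion (factor < 1) the spread stays roughly constant; with factor 1
# (no reversion) the spread keeps increasing.  All values are invented.
import random
import statistics

MEAN = 68.0          # population mean height, in inches (illustrative)
GENERATIONS = 10
POPULATION = 2000

def spread_after(reversion):
    heights = [random.gauss(MEAN, 2.5) for _ in range(POPULATION)]
    for _ in range(GENERATIONS):
        heights = [MEAN + reversion * (h - MEAN) + random.gauss(0, 1.5)
                   for h in heights]
    return statistics.stdev(heights)

print("spread after 10 generations, with reversion (0.7):   ", round(spread_after(0.7), 2))
print("spread after 10 generations, without reversion (1.0):", round(spread_after(1.0), 2))
```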
§ 18. For a full discussion of these characteristics, and for a variety of most ingenious illustrations of their mode of agency and of their comparative efficacy, the reader may be referred to Mr Galton's original articles. For our present purpose it will suffice to say that these characteristics tend towards maintaining the fixity of species; and that though they do not affect what may be called the general nature of 73 the ‘probability curve’ or ‘law of facility’, they do determine its precise value in the cases in question. If, indeed, it be asked why there is no need for any such corrective influence in the case of, say, firing at a mark: the answer is that there is no opening for it except where a cumulative influence is introduced. The reason why the fortunes of our betting party showed an ever increasing divergency, and why some special correction was needed in order to avert such a tendency in the case of vital phenomena, was that the new starting-point at every step was slightly determined by the results of the previous step. The man who has lost a shilling one time starts, next time, worse off by just a shilling; and, but for the corrections we have been indicating, the man who was born tall would, so to say, throw off his descendants from a vantage ground of superior height. The true parallel in the case of the marksmen would be to suppose that their new points of aim were always shifted a little in the direction of the last divergence. The spreading out of the shot-marks would then continue without limit, just as would the divergence of fortunes of the supposed gamblers.
§ 18. For a complete discussion of these characteristics, along with a variety of clever illustrations of how they work and their relative effectiveness, the reader can refer to Mr. Galton's original articles. For our current purposes, it is enough to say that these characteristics help keep species stable; and though they do not change the overall nature of the 'probability curve' or 'law of facility', they do determine its precise value in the cases in question. If someone asks why there is no need for any such corrective influence when, for example, firing at a mark, the answer is that there is no opening for it except where a cumulative influence is introduced. The reason the fortunes of our betting group showed an ever-increasing divergence, and why some special correction was needed to avert such a tendency in the case of vital phenomena, was that the new starting point at each step was slightly determined by the results of the previous step. The person who lost a shilling one time starts off, next time, worse off by just that shilling; and, but for the corrections we have been describing, a person born tall would, so to speak, launch their descendants from a vantage ground of superior height. The true parallel in the case of the marksmen would be to suppose that their new points of aim were always shifted a little in the direction of the last divergence. The spread of the shot marks would then continue without limit, just as would the divergence of fortunes of the hypothetical gamblers.
1 As stated above, this is really little more than a re-statement, a stage further back, of the existence of the same kind of uniformity as that which we are called upon to explain in the concrete details presented to us in experience.
1 As mentioned before, this is basically just a restatement, one stage further back, of the existence of the same kind of uniformity that we are asked to explain in the concrete details presented to us in experience.
2 “It would seem in fact that in coarse and rude observations the errors proceed from a very few principal causes, and in consequence our hypothesis [as to the Exponential Law of Error] will probably represent the facts only imperfectly, and the frequency of the errors will only approximate roughly and vaguely to the law which follows from it. But when astronomers, not content with the degree of accuracy they had reached, prosecuted their researches into the remaining sources of error, they found that not three or four, but a great number of minor sources of error of nearly co-ordinate importance began to reveal themselves, having been till then masked and overshadowed by the graver errors which had been now approximately removed…. There were errors of graduation, and many others in the contraction of instruments; other errors of their adjustments; errors (technically so called) of observation; errors from the changes of temperature, of weather, from slight irregular motions and vibrations; in short, the thousand minute disturbing influences with which modern astronomers are familiar.” (Extracted from a paper by Mr Crofton in the Vol. of the Philosophical Transactions for 1870, p. 177.)
2 “It seems that when looking at rough observations, most errors come from a very few main causes. As a result, our hypothesis [regarding the Exponential Law of Error] will probably only represent the facts imperfectly, and the frequency of these errors will only be a rough and vague approximation of the law it suggests. However, when astronomers, not satisfied with their level of accuracy, continued to investigate the remaining sources of error, they discovered that there were not just three or four, but a great number of minor sources of error that were almost equally significant, which had been hidden until then, overshadowed by the larger errors that had now been largely addressed…. There were errors in graduation, among many others related to instrument contraction; errors in adjustments; errors (technically termed) of observation; errors due to temperature changes, weather variations, slight irregular movements, and vibrations; in short, the countless small disturbances that modern astronomers are well-acquainted with.” (Extracted from a paper by Mr. Crofton in the Vol. of the Philosophical Transactions for 1870, p. 177.)
CHAPTER 4.
ON THE MODES OF ESTABLISHING AND DETERMINING THE EXISTENCE AND NUMERICAL PROPORTIONS OF THE CHARACTERISTIC PROPERTIES OF OUR SERIES OR GROUPS.
§ 1. At the point which we have now reached, we are supposed to be in possession of series or groups of a certain kind, lying at the bottom, as one may say, and forming the foundation on which the Science of Probability is to be erected. We have described with sufficient particularity the characteristics of such a series, and have indicated the process by which it is, as a rule, actually brought about in nature. The next enquiries which have to be successively made are, how in any particular case we are to establish their existence and determine their special character and properties? and secondly,[1] when we have obtained them, in what mode are they to be employed for logical purposes?
§ 1. At the point we’ve reached, we assume we have certain series or groups that are foundational, essentially forming the base for the Science of Probability. We’ve detailed the features of these series and pointed out how they typically occur in nature. The next questions we need to address are: how do we establish their existence in a specific case and identify their unique characteristics and properties? And secondly, when we have them, how do we use them for logical purposes?
The answer to the former enquiry does not seem difficult. Experience is our sole guide. If we want to discover what is in reality a series of things, not a series of our own conceptions, we must appeal to the things themselves to obtain it, for we cannot find much help elsewhere. We cannot tell how many persons will be born or die in a year, or how many houses will be burnt or ships wrecked, without actually counting them. When we thus speak of ‘experience’ we 75 mean to employ the term in its widest signification; we mean experience supplemented by all the aids which inductive or deductive logic can afford. When, for instance, we have found the series which comprises the numbers of persons of any assigned class who die in successive years, we have no hesitation in extending it some way into the future as well as into the past. The justification of such a procedure must be sought in the ordinary canons of Induction. As a special discussion will be given upon the connection between Probability and Induction, no more need be said upon this subject here; but nothing will be found there at variance with the assertion just made, that the series we employ are ultimately obtained by experience only.
The answer to the earlier question doesn’t seem too hard. Experience is our only guide. If we want to find out what actually exists as a series of things, not just our own ideas about them, we need to look at the things themselves to find out, since we can't rely much on anything else. We can’t predict how many people will be born or die in a year, or how many houses will burn or ships will sink, without actually counting them. When we refer to ‘experience,’ we mean it in the broadest sense; we’re talking about experience combined with all the insights that inductive or deductive logic can provide. For example, when we've identified the series that includes the number of people in any specific category who die in consecutive years, we have no hesitation in projecting it a bit into the future as well as the past. The reasoning behind this approach can be found in the standard principles of Induction. Since there will be a specific discussion about the link between Probability and Induction, there’s no need to go further on this topic here; but what you’ll find there will not contradict the statement made earlier—that the series we use ultimately come from experience alone.
§ 2. In many cases it is undoubtedly true that we do not resort to direct experience at all. If I want to know what is my chance of holding ten trumps in a game of whist, I do not enquire how often such a thing has occurred before. If all the inhabitants of the globe were to divide themselves up into whist parties they would have to keep on at it for a great many years, if they wanted to settle the question satisfactorily in that way. What we do of course is to calculate algebraically the proportion of possible combinations in which ten trumps can occur, and take this as the answer to our problem. So again, if I wanted to know the chance of throwing six with a die whose faces were unequal, it would be a question if my best way would not be to calculate geometrically the solid angle subtended at the centre of gravity by the opposite face, and the ratio of this to the whole surface of a sphere would represent sufficiently closely the chance required.
§ 2. In many cases, it is definitely true that we do not rely on direct experience at all. If I want to know my chance of holding ten trumps in a game of whist, I do not ask how often this has happened before. If everyone in the world were to split into whist parties, they would have to play for a great many years to settle that question satisfactorily in that way. What we actually do is calculate algebraically the proportion of possible combinations in which ten trumps can occur, and take that as the answer to our question. Similarly, if I wanted to know the chance of throwing a six with a die whose faces were unequal, my best approach might well be to calculate geometrically the solid angle subtended at the center of gravity by the opposite face; the ratio of this angle to the whole surface of a sphere would represent the required chance closely enough.
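The kind of à priori calculation described here can be written out directly. The sketch below counts combinations for the whist example, reading 'ten trumps' as exactly ten; whether exactly ten or at least ten is meant is not specified in the text, and the latter would simply add the corresponding terms for eleven, twelve, and thirteen.

```python
# Illustrative sketch: the chance of being dealt exactly ten trumps in a
# thirteen-card whist hand, found by counting combinations rather than by
# collecting statistics.  Assumes a well-shuffled 52-card pack with a fixed
# trump suit of 13 cards.
from math import comb

hands_total = comb(52, 13)                            # all possible 13-card hands
hands_with_ten_trumps = comb(13, 10) * comb(39, 3)    # 10 trumps plus 3 other cards

p = hands_with_ten_trumps / hands_total
print(f"chance of exactly ten trumps: {p:.3e}  (about 1 in {round(1 / p):,})")
```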
It is quite true that in such examples as the above, especially the former one, nobody would ever think of appealing to statistics. This would be a tedious process to adopt when, 76 as here, the mechanical and other conditions upon which the production of the events depend are comparatively few, determinate, and admit of isolated consideration, whilst the enormous number of combinations which can be constructed out of them causes an enormous consequent multiplicity of ways in which the events can possibly happen. Hence, in practice, à priori determination is often easy, whilst à posteriori appeal to experience would be not merely tedious but utterly impracticable. This, combined with the frequent simplicity and attractiveness of such examples when deductively treated, has made them very popular, and produced the impression in many quarters that they are the proper typical instances to illustrate the theory of chance. Whereas, had the science been concerned with those kinds of events only which in practice are commonly made subjects of insurance, probably no other view would ever have been taken than that it was based upon direct appeal to experience.
It's true that in cases like the ones mentioned, especially the first one, no one would think of relying on statistics. This would be a tedious approach when, as in this case, the mechanical and other conditions that affect the events are relatively few, defined, and can be considered in isolation, while the vast number of combinations that can be formed from them leads to an immense variety of ways the events could occur. Therefore, in practice, à priori determination is often straightforward, whereas à posteriori reliance on experience would not only be tedious but completely impractical. This, along with the often simple and appealing nature of such examples when analyzed deductively, has made them very popular and created the impression in many circles that they are the right typical cases to illustrate the theory of chance. However, if the science had focused only on those types of events that are typically insured, it's likely that no other perspective would have emerged other than that it was grounded in direct reliance on experience.
§ 3. When, however, we look a little closer, we find that there is no occasion for such a sharp distinction as that apparently implied between the two classes of examples just indicated. In such cases as those of dice and cards, even, in which we appear to reason directly from the determining conditions, or possible variety of the events, rather than from actual observation of their occurrence, we shall find that this procedure is only valid by the help of a tacit assumption which can never be determined otherwise than by direct experience. It is, no doubt, an exceedingly natural and obvious assumption, and one which is continually deriving fresh weight from every-day observation, but it is one which ought not to be admitted without consideration. As this is a very important matter, not so much in itself as in connection with the light which it throws upon the theory of the subject, we will enter into a somewhat detailed examination of it.
§ 3. However, when we take a closer look, we see that there’s no need for such a sharp distinction between the two types of examples just mentioned. In cases like dice and cards, where it seems we reason directly from the determining conditions or possible outcomes instead of from actual observations of what happens, we find that this approach only works due to an unspoken assumption that can only be validated through direct experience. This assumption is clearly natural and obvious, and it gains further support from everyday experiences, but it shouldn’t be accepted without careful thought. Since this is a crucial matter, not just on its own but also in relation to the insights it provides about the theory of the subject, we will take a detailed look at it.
Let us take a very simple example, that of tossing up a penny. Suppose that I am contemplating a succession of two throws; I can see that the only possible events are[2] HH, HT, TH, TT. So much is certain. We are moreover tolerably well convinced from experience that these events occur, in the long run, about equally often. This is of course admitted on all hands. But on the view commonly maintained, it is contended that we might have known the fact beforehand on grounds which are applicable to an indefinite number of other and more complex cases. The form in which this view would generally be advanced is, that we are enabled to state beforehand that the four throws above mentioned are equally likely. If in return we ask what is meant by the expression ‘equally likely’, it appears that there are two and only two possible forms of reply. One of these seeks the explanation in the state of mind of the observer, the other seeks it in some characteristic of the things observed.
Let’s consider a simple example, like flipping a penny. Imagine I’m thinking about two flips in a row; I can see that the only possible outcomes are [2] HH, HT, TH, TT. That much is clear. We also have good reason to believe from experience that these outcomes happen, over time, about equally often. This is generally accepted by everyone. However, according to the commonly held view, it’s claimed that we could have known this fact in advance based on principles that apply to countless other, more complex situations. The way this idea is usually presented is that we can say in advance that the four outcomes mentioned above are equally likely. If we then ask what ‘equally likely’ means, it turns out there are only two possible ways to respond. One explanation looks at the observer's mindset, while the other looks at some characteristic of the things being observed.
(1) It might, for instance, be said on the one hand, that what is meant is that the four events contemplated are equally easy to imagine, or, more accurately, that our expectation or belief in their occurrence is equal. We could hardly be content with this reply, for the further enquiry would immediately be urged, On what ground is this to be believed? What are the characteristics of events of which our expectation is equal? If we consented to give an answer to this further enquiry, we should be led to the second form of reply, to be noticed directly; if we did not consent we should, it seems, be admitting that Probability was only a 78 portion of Psychology, confined therefore to considering states of mind in themselves, rather than in their reference to facts, viz. as being true or false. We should, that is, be ceasing to make it a science of inference about things. This point will have to be gone into more thoroughly in another chapter; but it is impossible to direct attention too prominently to the fact that Logic (and therefore Probability as a branch of Logic) is not concerned with what men do believe, but with what they ought to believe, if they are to believe correctly.
(1) It might, for instance, be said on the one hand that the four events being considered are equally easy to imagine, or, more accurately, that our expectation or belief in their occurrence is equal. We could hardly be satisfied with this answer, because the further question would immediately be pressed: on what grounds is this to be believed? What are the characteristics of events for which our expectation is equal? If we agreed to answer this further question, we would be led to the second form of reply, which will be considered in a moment; if we did not, we would, it seems, be admitting that Probability is only a portion of Psychology, confined to considering states of mind in themselves, rather than in their reference to facts, that is, as being true or false. We would, in other words, stop treating it as a science of inference about things. This point will have to be explored more thoroughly in another chapter; but it is impossible to emphasize too strongly that Logic (and therefore Probability as a branch of Logic) is not concerned with what people do believe, but with what they ought to believe, if they are to believe correctly.
(2) In the other form of reply the explanation of the phrase in question would be sought, not in a state of mind, but in a quality of the things contemplated. We might assign the following as the meaning, viz. that the events really would occur with equal frequency in the long run. The ground of this assertion would probably be found in past experience, and it would doubtless be impossible so to frame the answer as to exclude the notion of our belief altogether. But still there is a broad distinction between seeking an equality in the amount of our belief, as before, and in the frequency of occurrence of the events themselves, as here.
(2) In the other type of response, the explanation of the phrase in question would be sought not in a state of mind, but in a quality of the things being considered. We might define the meaning as the idea that the events would actually happen with the same frequency over the long term. The basis for this claim would probably be found in past experience, and it would likely be impossible to frame the answer in a way that completely excludes the concept of our belief. However, there is a clear distinction between looking for an equality in the amount of our belief, as mentioned earlier, and in the frequency of the actual events occurring, as described here.
§ 4. When we have got as far as this it can readily be shown that an appeal to experience cannot be long evaded. For can the assertion in question (viz. that the throws of the penny will occur equally often) be safely made à priori? Those who consider that it can seem hardly to have fully faced the difficulties which meet them. For when we begin to enquire seriously whether the penny will really do what is expected of it, we find that restrictions have to be introduced. In the first place it must be an ideal coin, with its sides equal and fair. This restriction is perfectly intelligible; the study of solid geometry enables us to idealize a penny into a circular or cylindrical lamina. But this condition 79 by itself is not sufficient, others are wanted as well. The penny was supposed to be tossed up, as we say ‘at random.’ What is meant by this, and how is this process to be idealized? To ask this is to introduce no idle subtlety; for it would scarcely be maintained that the heads and tails would get their fair chances if, immediately before the throwing, we were so to place the coin in our hands as to start it always with the same side upwards. The difference that would result in consequence, slight as its cause is, would tend in time to show itself in the results. Or, if we persisted in starting with each of the two sides alternately upwards, would the longer repetitions of the same side get their fair chance?
§ 4. At this point, it can readily be shown that an appeal to experience cannot be evaded for long. Can the assertion in question (namely, that the throws of the penny will occur equally often) really be made safely à priori? Those who think it can seem hardly to have faced the difficulties fully. When we take a serious look at whether the penny will behave as expected, we find that we need to impose some conditions. First, it has to be an ideal coin, with equal and fair sides. This requirement makes sense; the study of solid geometry allows us to idealize a penny into a thin circular or cylindrical disc. But this condition alone is not enough; we need additional ones. The penny is meant to be tossed 'at random.' What does that mean, and how do we idealize this process? This is no idle subtlety; it would scarcely be maintained that heads and tails would get their fair chances if, immediately before each toss, we always placed the coin in our hands with the same side upwards. The resulting difference, slight as its cause is, would in time show itself in the results. And if we persisted in starting alternately with each of the two sides upwards, would the longer runs of the same side get their fair chance?
Perhaps it will be replied that if we think nothing whatever about these matters all will come right of its own accord. It may, and doubtless will be so, but this is falling back upon experience. It is here, then, that we find ourselves resting on the experimental assumption above mentioned, and which indeed cannot be avoided. For suppose, lastly, that the circumstances of nature, or my bodily or mental constitution, were such that the same side always is started upwards, or indeed that they are started in any arbitrary order of our own? Well, it will be replied, it would not then be a fair trial. If we press in this way for an answer to such enquiries, we shall find that these tacit restrictions are really nothing else than a mode of securing an experimental result. They are only another way of saying, Let a series of actions be performed in such a way as to secure a sequence of a particular kind, viz., of the kind described in the previous chapters.
Maybe someone will respond that if we do not think about these things at all, everything will come right of its own accord. It may, and doubtless will, but that is falling back on experience. Here, then, we find ourselves resting on the experimental assumption mentioned above, which really cannot be avoided. Suppose, lastly, that the circumstances of nature, or my physical or mental makeup, were such that the same side is always started upwards, or indeed that the coins are started in any arbitrary order of our own choosing? Well, the reply will be, that would not then be a fair trial. If we press in this way for an answer to such questions, we find that these tacit restrictions are really nothing other than a way of securing an experimental result. They are only another way of saying: let a series of actions be performed in such a way as to secure a sequence of a particular kind, namely, the kind described in the previous chapters.
§ 5. An intermediate way of evading the direct appeal to experience is sometimes found by defining the probability of an event as being measured by the ratio which the 80 number of cases favourable to the event bears to the total number of cases which are possible. This seems a somewhat loose and ambiguous way of speaking. It is clearly not enough to count the number of cases merely, they must also be valued, since it is not certain that each is equally potent in producing the effect. This, of course, would never be denied, but sufficient importance does not seem to be attached to the fact that we have really no other way of valuing them except by estimating the effects which they actually do, or would produce. Instead of thus appealing to the proportion of cases favourable to the event, it is far better (at least as regards the foundation of the science, for we are not at this moment discussing the practical method of facilitating our calculations) to appeal at once to the proportion of cases in which the event actually occurs.
§ 5. A middle-ground approach to avoiding a direct appeal to experience is sometimes found by defining the probability of an event as measured by the ratio of the number of favorable cases to the total number of possible cases. This seems like a somewhat vague and unclear way of expressing it. It's clearly not enough to just count the number of cases; they also need to be valued because it's uncertain that each case has the same potential to produce the effect. While no one would dispute this, it seems that not enough importance is placed on the fact that we really have no other way to value them except by estimating the effects they actually do or would produce. Instead of appealing to the proportion of favorable cases, it’s much better (at least concerning the foundation of the science, as we're not currently discussing practical methods to ease our calculations) to directly refer to the proportion of cases in which the event actually occurs.
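The point about valuing cases rather than merely counting them can be illustrated with a deliberately lopsided example. The weights below are invented; the sketch only shows that the six 'possible cases' of a die, counted as cases, say nothing by themselves about how often each face will actually turn up.

```python
# Illustrative sketch: for a lopsided die, counting the six "possible cases"
# suggests 1/6 for each face, but the observed frequencies depend on how
# potent each case is.  The weights are purely illustrative.
import random

faces = [1, 2, 3, 4, 5, 6]
weights = [0.25, 0.20, 0.18, 0.15, 0.12, 0.10]   # unequal, invented "potencies"

ROLLS = 100_000
counts = {f: 0 for f in faces}
for f in random.choices(faces, weights=weights, k=ROLLS):
    counts[f] += 1

print("counting the six cases alone would suggest", round(1 / 6, 3), "for each face")
for f in faces:
    print(f"face {f}: observed frequency = {counts[f] / ROLLS:.3f}")
```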
§ 6. The remarks above made will apply, of course, to most of the other common examples of chance; the throwing of dice, drawing of cards, of balls from bags, &c. In the last case, for instance, one would naturally be inclined to suppose that a ball which had just been put back would thereby have a better chance of coming out again next time, since it will be more in the way for that purpose. How is this to be prevented? If we designedly thrust it to the middle or bottom of the others, we may overdo the precaution; and are in any case introducing human design, that element so essentially hostile to all that we understand by chance. If we were to trust to a good shake setting matters right, we may easily be deceived; for shaking the bag can hardly do more than diminish the disposition of those balls which were already in each other's neighbourhood, to remain so. In the consequent interaction of each upon all, the arrangement in which they start cannot but leave its impress to some extent upon their final positions. In all such cases, 81 therefore, if we scrutinize our language, we shall find that any supposed à priori mode of stating a problem is little else than a compendious way of saying, Let means be taken for obtaining a given result. Since it is upon this result that our inferences ultimately rest, it seems simpler and more philosophical to appeal to it at once as the groundwork of our science.
§ 6. The comments mentioned earlier also apply to most common examples of randomness, like rolling dice, drawing cards, or pulling balls from bags, etc. In the last example, for instance, one might think that a ball just placed back would have a better chance of being drawn again since it's more available for that purpose. How can this be avoided? If we intentionally push it to the middle or the bottom among the others, we might go too far; and we're also introducing human intent, which contradicts everything we consider as randomness. If we rely on a good shake to mix things up properly, we could easily be misled; shaking the bag can barely do more than reduce the likelihood of the balls that were already near each other staying that way. In the resulting interactions among them, the initial arrangement will undoubtedly influence their final positions to some degree. In all such cases, if we examine our words closely, we'll see that any supposed à priori way of stating a problem is mostly just a concise way of saying, Let's find ways to achieve a specific result. Since our conclusions ultimately depend on this result, it seems simpler and more philosophical to refer to it directly as the foundation of our science.
§ 7. Let us again take the instance of the tossing of a penny, and examine it somewhat more minutely, to see what can be actually proved about the results we shall obtain. We are willing to give the pence fair treatment by assuming that they are perfect, that is, that in the long run they show no preference for either head or tail; the question then remains, Will the repetitions of the same face obtain the proportional shares to which they are entitled by the usual interpretations of the theory? Putting then, as before, for the sake of brevity, H for head, and HH for heads twice running, we are brought to this issue;—Given that the chance of H is 1/2, does it follow necessarily that the chance of HH (with two pence) is 1/4? To say nothing of ‘H ten times’ occurring once in 1024 times (with ten pence), need it occur at all? The mathematicians, for the most part, seem to think that this conclusion follows necessarily from first principles; to me it seems to rest upon no more certain evidence than a reasonable extension by Induction.
§ 7. Let's take another look at tossing a penny and examine it a bit more closely to see what we can actually prove about the results we get. We're willing to treat the pennies fairly by assuming they're perfect, meaning that over time they don't show a bias towards either heads or tails. The question remains: Will repetitions of the same side occur in the proportional shares assigned to them by the usual interpretations of the theory? So, for the sake of brevity, let's call heads "H" and two heads in a row "HH." This leads us to the following issue: If the chance of H is 1/2, does it necessarily follow that the chance of HH (with two pennies) is 1/4? Leaving aside whether 'H ten times' really occurs once in 1024 times (with ten pennies), need it occur at all? Most mathematicians seem to believe this conclusion follows necessarily from first principles; to me it seems to rest on no stronger evidence than a reasonable extension through induction.
Taking then the possible results which can be obtained from a pair of pence, what do we find? Four different results may follow, namely, (1) HT, (2) HH, (3) TH, (4) TT. If it can be proved that these four are equally probable, that is, occur equally often, the commonly accepted conclusions will follow, for a precisely similar argument would apply to all the larger numbers.
Taking into account the possible outcomes from a pair of coins, what do we discover? Four different results can emerge, namely, (1) HT, (2) HH, (3) TH, (4) TT. If it can be shown that these four outcomes are equally likely, meaning they occur with the same frequency, the commonly accepted conclusions will hold true, since precisely similar reasoning would apply to all the larger numbers.
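As a purely modern illustration of the question at issue (nothing of the kind appears in the original essay), the enumeration and the empirical check it invites can be sketched in a few lines of Python; the random seed and the number of trials are arbitrary choices.

```python
import random
from itertools import product

# Enumerate the four possible results of a pair of tosses.
outcomes = [''.join(p) for p in product('HT', repeat=2)]
print(outcomes)  # ['HH', 'HT', 'TH', 'TT']

# The usual theory gives each of the four an equal share, 1/4, and by the
# same multiplication 'H ten times' is given one chance in 2**10 = 1024.
print(2 ** 10)

# Whether simulated pairs of throws actually honour those shares is the
# empirical question raised in the text; a crude check:
random.seed(1)
trials = 100_000
counts = {o: 0 for o in outcomes}
for _ in range(trials):
    counts[random.choice('HT') + random.choice('HT')] += 1
for o in outcomes:
    print(o, counts[o] / trials)  # each hovers near 0.25
```

The enumeration is only the formal half of the matter; whether actual pennies honour those proportions is exactly the question the text goes on to raise.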
§ 8. The proof usually advanced makes use of what is 82 called the Principle of Sufficient Reason. It takes this form;—Here are four kinds of throws which may happen; once admit that the separate elements of them, namely, H and T, happen equally often, and it will follow that the above combinations will also happen equally often, for no reason can be given in favour of one of them that would not equally hold in favour of the others.
§ 8. The proof typically presented relies on what is known as the Principle of Sufficient Reason. It states the following: there are four kinds of throw that may happen; if we accept that their individual elements, H and T, occur with equal frequency, then it follows that the combinations above will also occur with equal frequency, since there is no justification for favoring one of them that wouldn't equally apply to the others.
To a certain extent we must admit the validity of the principle for the purpose. In the case of the throws given above, it would be valid to prove the equal frequency of (1) and (3) and also of (2) and (4); for there is no difference existing between these pairs except what is introduced by our own notation.[3] TH is the same as HT, except in the order of the occurrence of the symbols H and T, which we do not take into account. But either of the pair (1) and (3) is different from either of the pair (2) and (4). Transpose the notation, and there would still remain here a distinction which the mind can recognize. A succession of the same thing twice running is distinguished from the conjunction of two different things, by a distinction which does not depend upon our arbitrary notation only, and would remain entirely unaltered by a change in this notation. The principle therefore of Sufficient Reason, if admitted, would only prove that doublets of the two kinds, for example (2) and (4), occur equally often, but it would not prove that they must each 83 occur once in four times. It cannot be proved indeed in this way that they need ever occur at all.
To some extent, we have to acknowledge that the principle is valid for the purpose. In the examples of the throws provided above, it is valid to show that (1) and (3) occur with equal frequency, as well as (2) and (4), because the only difference between these pairs comes from our own labeling.[3] TH is the same as HT, except for the order in which the symbols H and T appear, which we don't consider. However, either member of the pair (1) and (3) is different from either member of the pair (2) and (4). If we change the notation, there would still remain a distinction that the mind can recognize. A sequence of the same event happening twice is different from two dissimilar events occurring together, by a distinction that doesn't rely solely on our arbitrary notation and would remain unchanged even if the notation were altered. Therefore, if we accept the principle of Sufficient Reason, it would only demonstrate that doublets of the two kinds, such as (2) and (4), happen with equal frequency, but it wouldn't prove that they must each occur once in four tries. In fact, it can't be shown in this way that they ever need to occur at all.
§ 9. The formula, then, not being demonstrable à priori, (as might have been concluded,) can it be obtained by experience? To a certain extent it can; the present experience of mankind in pence and dice seems to show that the smaller successions of throws do really occur in about the proportions assigned by the theory. But how nearly they do so no one can say, for the amount of time and trouble to be expended before we could feel that we have verified the fact, even for small numbers, is very great, whilst for large numbers it would be simply intolerable. The experiment of throwing often enough to obtain ‘heads ten times’ has been actually performed by two or three persons, and the results are given by De Morgan, and Jevons.[4] This, however, being only sufficient on the average to give ‘heads ten times’ a single chance, the evidence is very slight; it would take a considerable number of such experiments to set the matter nearly at rest.
§ 9. So, since the formula can't be proven à priori, can we get it through experience? To some degree, yes; mankind's experience so far with coins and dice seems to indicate that the smaller sequences of throws do occur in roughly the proportions suggested by the theory. But no one can say exactly how closely, because the time and effort needed to verify this, even for small numbers, is significant, and for larger numbers it would be simply intolerable. The experiment of tossing often enough to get 'heads ten times' has actually been conducted by a few individuals, and the results are reported by De Morgan and Jevons.[4] However, this is only enough on average to give 'heads ten times' a single chance, so the evidence is quite weak; it would require a considerable number of such experiments to really settle the matter.
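A rough modern stand-in for the hand experiments mentioned above, assuming a simulated fair coin, can show why a single run of about 1024 sets of ten throws gives such slight evidence; the figures below are illustrative only.

```python
import random

# One 'experiment' of 1024 blocks of ten tosses gives the run 'heads ten
# times' only about a single chance of appearing, which is why the evidence
# from a handful of such experiments is so slight.
random.seed(42)

def runs_of_ten_heads(blocks):
    """Count how many blocks of ten simulated tosses come up all heads."""
    return sum(
        all(random.random() < 0.5 for _ in range(10))
        for _ in range(blocks)
    )

print(runs_of_ten_heads(1024))        # typically 0, 1, or 2
print(runs_of_ten_heads(1024 * 100))  # roughly 100, i.e. about 1 per 1024
```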
Any such rule, then, as that which we have just been discussing, which professes to describe what will take place in a long succession of throws, is only conclusively proved by experience within very narrow limits, that is, for small repetitions of the same face; within limits less narrow, indeed, we feel assured that the rule cannot be flagrantly in error, otherwise the variation would be almost sure to be detected. From this we feel strongly inclined to infer that the same law will hold throughout. In other words, we are inclined to extend the rule by Induction and Analogy. Still there are so many instances in nature of proposed laws which hold within narrow limits but get egregiously astray when we 84 attempt to push them to great lengths, that we must give at best but a qualified assent to the truth of the formula.
Any rule, like the one we’ve just discussed, that claims to predict what will happen over many throws is only really proven by experience within very limited circumstances, meaning with just a few repetitions of the same outcome. Beyond those narrow limits, we feel confident that the rule can’t be wildly wrong, or else the differences would likely stand out. Because of this, we are inclined to believe that the same principle will apply more broadly. In other words, we tend to extend the rule through Induction and Analogy. However, there are many examples in nature of proposed laws that work within tight limits but become completely inaccurate when we try to apply them on a larger scale, so we can only cautiously agree to the validity of the formula.
§ 10. The object of the above reasoning is simply to show that we cannot be certain that the rule is true. Let us now turn for a minute to consider the causes by which the succession of heads and tails is produced, and we may perhaps see reasons to make us still more doubtful.
§ 10. The point of the reasoning above is just to show that we can't be sure the rule is correct. Now, let's take a moment to think about the factors that create the sequence of heads and tails, and we might find even more reasons to feel uncertain.
It has been already pointed out that in calculating probabilities à priori, as it is called, we are only able to do so by introducing restrictions and suppositions which are in reality equivalent to assuming the expected results. We use words which in strictness mean, Let a given process be performed; but an analysis of our language, and an examination of various tacit suppositions which make themselves felt the moment they are not complied with, soon show that our real meaning is, Let a series of a given kind be obtained; it is to this series only, and not to the conditions of its production, that all our subsequent calculations properly apply. The physical process being performed, we want to know whether anything resembling the contemplated series really will be obtained.
It has already been pointed out that when calculating probabilities à priori, we can only do so by introducing restrictions and assumptions that essentially equate to assuming the expected results. We use terms that strictly mean, "Let a specific process occur"; however, an analysis of our language and an examination of various implicit assumptions that become apparent when they aren’t followed soon reveal that our true meaning is, "Let a series of a certain kind be produced"; it is to this series only, and not to the conditions of its production, that all our subsequent calculations properly apply. Once the physical process is underway, we want to know whether anything resembling the intended series will actually be obtained.
Now if the penny were invariably set the same side uppermost, and thrown with the same velocity of rotation and to the same height, &c.—in a word, subjected to the same conditions,—it would always come down with the same side uppermost. Practically, we know that nothing of this kind occurs, for the individual variations in the results of the throws are endless. Still there will be an average of these conditions, about which the throws will be found, as it were, to cluster much more thickly than elsewhere. We should be inclined therefore to infer that if the same side were always set uppermost there would really be a departure from the sort of series which we ordinarily expect. In a very large 85 number of throws we should probably begin to find, under such circumstances, that either head or tail was having a preference shown to it. If so, would not similar effects be found to be connected with the way in which we started each successive pair of throws? According as we chose to make a practice of putting HH or TT uppermost, might there not be a disturbance in the proportion of successions of two heads or two tails? Following out this train of reasoning, it would seem to point with some likelihood to the conclusion that in order to obtain a series of the kind we expect, we should have to dispose the antecedents in a similar series at the start. The changes and chances produced by the act of throwing might introduce infinite individual variations, and yet there might be found, in the very long run, to be a close similarity between these two series.
Now, if the penny were always set with the same side facing up, thrown with the same spin and to the same height, and so on (in other words, under the same conditions), it would always land with the same side up. In reality, we know that this doesn't happen, because the variations in the results of the throws are countless. Still, there will be an average of these conditions, around which the outcomes will tend to cluster more closely than elsewhere. Therefore, we might think that if the same side were always set facing up, it would actually lead to a deviation from the kind of series we usually expect. In a very large number of throws under such circumstances, we would probably begin to notice that either heads or tails was being favored. If that's the case, wouldn't similar effects be associated with how we started each successive pair of throws? Depending on whether we consistently put HH or TT facing up, could there be a change in the frequency of runs of two heads or two tails? Following this line of thought, it seems likely that to obtain the series we expect, we would need to arrange the initial conditions in a similar series from the start. The variations created by the act of throwing might result in countless individual fluctuations, yet over a long enough run there could be a close resemblance between these two series.
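The suggestion is speculative, and the following sketch is correspondingly only a toy model: it assumes, purely for illustration, that the side set uppermost enjoys a small extra chance of coming up, and then compares how often doublets appear under two different ways of starting each pair of throws. The bias value is invented and deliberately exaggerated so that the effect is visible.

```python
import random

# A deliberately exaggerated toy model of the suggestion above: suppose the
# side set uppermost before a throw has some extra chance 'bias' of coming
# up (bias = 0.6 is invented purely so the effect is visible).
def toss(uppermost, bias=0.6):
    other = 'T' if uppermost == 'H' else 'H'
    return uppermost if random.random() < bias else other

def doublet_rate(start_policy, pairs=200_000):
    """Fraction of pairs that come up HH or TT when each pair of throws is
    started according to start_policy ('same': both throws started from H;
    'alternate': first started from H, second from T)."""
    doublets = 0
    for _ in range(pairs):
        first = toss('H')
        second = toss('H' if start_policy == 'same' else 'T')
        doublets += (first == second)
    return doublets / pairs

random.seed(7)
print(doublet_rate('same'))       # about 0.52: doublets favoured
print(doublet_rate('alternate'))  # about 0.48: doublets disfavoured
```

On this invented model, the proportion of doublets really does depend on how the antecedents are arranged, which is all the paragraph above asks us to entertain as a possibility.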
§ 11. This is, to a certain extent, only shifting the difficulty, I admit; for the claim formerly advanced about the possibility of proving the proportions of the throws in the former series, will probably now be repeated in favour of those in the latter. Still the question is very much narrowed, for we have reduced it to a series of voluntary acts. A man may put whatever side he pleases uppermost. He may act consciously, as I have said, or he may think nothing whatever about the matter, that is, throw at random; if so, it will probably be asserted by many that he will involuntarily produce a series of the kind in question. It may be so, or it may not; it does not seem that there are any easily accessible data by which to decide. All that I am concerned with here is to show the likelihood that the commonly received result does in reality depend upon the fulfilment of a certain condition at the outset, a condition which it is certainly optional with any one to fulfil or not as he pleases. The short successions doubtless will take care of 86 themselves, owing to the infinite complications produced by the casual variations in throwing; but the long ones may suffer, unless their interest be consciously or unconsciously regarded at the outset.
§ 11. This is, to some extent, just shifting the challenge, I admit; because the argument previously made about proving the proportions of the throws in the earlier series will likely now be argued for those in the later series. Still, the question is much more focused, as we have narrowed it down to a series of voluntary actions. A person can choose which side they want facing up. They can act intentionally, as I mentioned, or they can not think about it at all, which means throwing randomly; if that’s the case, many may claim that they will involuntarily create a series like the one in question. It could be true, or it might not be; there don’t seem to be any easily accessible facts to determine that. What I aim to show here is the likelihood that the commonly accepted outcome actually relies on fulfilling a specific condition at the start, a condition that anyone can choose to meet or ignore. The short sequences will certainly manage themselves due to the infinite complications caused by random variations in throwing; however, the longer ones may struggle unless their significance is considered, either consciously or unconsciously, from the beginning.
§ 12. The advice, ‘Only try long enough, and you will sooner or later get any result that is possible,’ is plausible, but it rests only on Induction and Analogy; mathematics do not prove it. As has been repeatedly stated, there are two distinct views of the subject. Either we may, on the one hand, take a series of symbols, call them heads and tails; H, T, &c.; and make the assumption that each of these, and each pair of them, and so on, will occur in the long run with a regulated degree of frequency. We may then calculate their various combinations, and the consequences that may be drawn from the data assumed. This is a purely algebraical process; it is infallible; and there is no limit whatever to the extent to which it may be carried. This way of looking at the matter may be, and undoubtedly should be, nothing more than the counterpart of what I have called the substituted or idealized series which generally has to be introduced as the basis of our calculation. The danger to be guarded against is that of regarding it too purely as an algebraical conception, and thence of sinking into the very natural errors both of too readily evolving it out of our own consciousness, and too freely pushing it to unwarranted lengths.
§ 12. The advice, “Just keep trying, and eventually you’ll achieve any possible outcome,” sounds reasonable, but it only relies on Induction and Analogy; mathematics doesn’t back it up. As has been said many times, there are two different perspectives on the topic. On one hand, we can take a series of symbols, call them heads and tails; H, T, etc.; and assume that each of these, along with each possible combination, will occur at a consistent frequency over time. We can then calculate their various combinations and the implications that come from the assumptions made. This is purely an algebraic process; it’s foolproof, and there’s no limit to how far it can go. This perspective should be seen as just another version of what I’ve referred to as the substituted or idealized series that usually needs to be introduced as the basis for our calculations. The risk to avoid is treating it solely as an algebraic concept and, as a result, falling into the common mistakes of too easily deriving it from our own thoughts and extending it too far beyond reason.
Or on the other hand, we may consider that we are treating of the behaviour of things;—balls, dice, births, deaths, &c.; and drawing inferences about them. But, then, what were in the former instance allowable assumptions, become here propositions to be tested by experience. Now the whole theory of Probability as a practical science, in fact as anything more than an algebraical truth, depends of course upon there being a close correspondence between these two views 87 of the subject, in other words, upon our substituted series being kept in accordance with the actual series. Experience abundantly proves that, between considerable limits, in the example in question, there does exist such a correspondence. But let no one attempt to enforce our assent to every remote deduction that mathematicians can draw from their formulæ. When this is attempted the distinction just traced becomes prominent and important, and we have to choose our side. Either we go over to the mathematics, and so lose all right of discussion about the things; or else we take part with the things, and so defy the mathematics. We do not question the formal accuracy of the latter within their own province, but either we dismiss them as somewhat irrelevant, as applying to data of whose correctness we cannot be certain, or we take the liberty of remodelling them so as to bring them into accordance with facts.
Or, on the other hand, we might think that we're dealing with the behavior of things;—balls, dice, births, deaths, etc.; and drawing conclusions about them. However, what were previously acceptable assumptions now become claims that need to be tested by experience. The entire theory of Probability as a practical science, truly as anything more than a mathematical truth, relies on a close connection between these two perspectives on the subject. In other words, it depends on keeping our substituted series aligned with the actual series. Experience clearly shows that, within significant limits, in the example we’re discussing, such a correspondence exists. But let no one try to force us to agree with every distant conclusion that mathematicians derive from their formulas. When this happens, the distinction just mentioned becomes clear and important, and we must choose a side. We can either side with mathematics and lose our right to discuss the actual things, or we can side with the things and challenge the mathematics. We don't doubt the formal accuracy of the latter within their own domain, but we either dismiss them as somewhat irrelevant, applying to data whose correctness we can’t be sure of, or we feel free to reshape them to align them with the facts.
§ 13. A critic of any doctrine can hardly be considered to have done much more than half his duty when he has explained and justified his grounds for objecting to it. It still remains for him to indicate, if only in a few words, what he considers its legitimate functions and position to be, for it can seldom happen that he regards it as absolutely worthless or unmeaning. I should say, then, that when Probability is thus divorced from direct reference to objects, as it substantially is by not being founded upon experience, it simply resolves itself into the common algebraical or arithmetical doctrine of Permutations and Combinations.[5] The considerations upon which these depend are purely formal and necessary, and can be fully reasoned out without any appeal to experience. We there start from pure considerations of number or magnitude, and we terminate with them, 88 having only arithmetical calculations to connect them together. I wish, for instance, to find the chance of throwing heads three times running with a penny. All I have to do is first to ascertain the possible number of throws. Permutations tell me that with two things thus in question (viz. head and tail) and three times to perform the process, there are eight possible forms of the result. Of these eight one only being favourable, the chance in question is pronounced to be one-eighth.
§ 13. A critic of any idea can hardly be seen as having done much more than half his job when he has only explained and justified his reasons for opposing it. He also needs to suggest, even briefly, what he thinks its valid functions and role are, because it's rare for him to see it as completely worthless or meaningless. So, when Probability is separated from a direct connection to objects, as it essentially is when it's not based on experience, it simply turns into the ordinary algebraic or arithmetical doctrine of Permutations and Combinations.[5] The considerations these rest on are purely formal and necessary, and you can fully reason them out without referring to experience. We start with pure considerations of number or magnitude and finish with them, having only arithmetic calculations linking them. If, for example, I want to find the probability of tossing heads three times in a row with a coin, I simply need to determine the number of possible outcomes. Permutations tell me that with two outcomes to consider (heads and tails) and three tosses, there are eight possible forms of the result. Out of these eight, only one is favorable, so the probability in question is one in eight.
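The formal reckoning just described can be written out mechanically, for instance in Python; this reproduces only the arithmetic of the text and says nothing about whether actual coins conform to it.

```python
from fractions import Fraction
from itertools import product

# The purely formal reckoning described above: list every possible form of
# three tosses and count the favourable one.
forms = list(product('HT', repeat=3))
print(len(forms))  # 8 possible forms

favourable = sum(1 for f in forms if f == ('H', 'H', 'H'))
print(Fraction(favourable, len(forms)))  # 1/8
```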
Now though it is quite true that the actual calculation of every chance problem must be of the above character, viz. an algebraical or arithmetical process, yet there is, it seems to me, a broad and important distinction between a material science which employs mathematics, and a formal one which consists of nothing but mathematics. When we cut ourselves off from the necessity of any appeal to experience, we are retaining only the intermediate or calculating part of the investigation; we may talk of dice, or pence, or cards, but these are really only names we choose to give to our symbols. The H's and T's with which we deal have no bearing on objective occurrences, but are just like the x's and y's with which the rest of algebra deals. Probability in fact, when so treated, seems to be absolutely nothing else than a system of applied Permutations and Combinations.
Now, while it’s true that calculating every chance problem involves an algebraic or arithmetic process, there’s a significant distinction between a practical science that uses mathematics and a theoretical one that’s purely mathematical. When we separate ourselves from the need to refer to real-world experiences, we only keep the intermediate or calculating part of the investigation; we might refer to dice, coins, or cards, but these are just labels we assign to our symbols. The H's and T's we work with have no connection to actual events; they’re just like the x's and y's found in other areas of algebra. In fact, when treated this way, probability seems to be nothing more than a system of applied permutations and combinations.
It will now readily be seen how narrow is the range of cases to which any purely deductive method of treatment can apply. It is almost entirely confined to such employments as games of chance, and, as already pointed out, can only be regarded as really trustworthy even there, by the help of various tacit restrictions. This alone would be conclusive against the theory of the subject being rested upon such a basis. The experimental method, on the other hand, is, in the same theoretical sense, of universal application. It would 89 include the ordinary problems furnished by games of chance, as well as those where the dice are loaded and the pence are not perfect, and also the indefinitely numerous applications of statistics to the various kinds of social phenomena.
It will now be clear how limited the range of cases is to which any purely deductive method of treatment can apply. It is mostly restricted to activities like games of chance and, as already noted, can only be considered truly reliable there with the aid of various unspoken limitations. This alone would be enough to argue against the theory being based on such a foundation. The experimental method, on the other hand, is, in the same theoretical sense, universally applicable. It would encompass the typical problems presented by games of chance, as well as those situations where the dice are loaded and the coins are not perfect, in addition to the countless applications of statistics to various social phenomena.
§ 14. The particular view of the deductive character of Probability above discussed, could scarcely have intruded itself into any other examples than those of the nature of games of chance, in which the conditions of occurrence are by comparison few and simple, and are amenable to accurate numerical determination. But a doctrine, which is in reality little else than the same theory in a slightly disguised form, is very prevalent, and has been applied to truths of the most purely empirical character. This doctrine will be best introduced by a quotation from Laplace. After speaking of the irregularity and uncertainty of nature as it appears at first sight, he goes on to remark that when we look closer we begin to detect “a striking regularity which seems to suggest a design, and which some have considered a proof of Providence. But, on reflection, it is soon perceived that this regularity is nothing but the development of the respective probabilities of the simple events, which ought to occur more frequently according as they are more probable.”[6]
§ 14. The specific idea about the deductive nature of Probability we've discussed could hardly apply to anything other than games of chance, where the conditions are relatively few and straightforward, and can be measured accurately. However, a theory that is essentially just this idea in a slightly different form is quite common and has been applied to purely empirical truths. This theory is best explained through a quote from Laplace. After discussing the irregularity and uncertainty of nature as it initially seems, he points out that when we examine things more closely, we start to notice “a striking regularity that appears to imply a design, which some have taken as evidence of Providence. But upon further thought, it becomes clear that this regularity is simply the outcome of the probabilities of the simple events, which should occur more often as they become more likely.”[6]
If this remark had been made about the succession of heads and tails in the throwing up of a penny, it would have been intelligible. It would simply mean this: that the constitution of the body was such that we could anticipate with some confidence what the result would be when it was treated in a certain way, and that experience would justify our anticipation in the long run. But applied as it is in a more general form to the facts of nature, it seems really to have but little meaning in it. Let us test it by an instance. Amidst the irregularity of individual births, we find that the 90 male children are to the female, in the long run, in about the proportion of 106 to 100. Now if we were told that there is nothing in this but “the development of their respective probabilities,” would there be anything in such a statement but a somewhat pretentious re-statement of the fact already asserted? The probability is nothing but that proportion, and is unquestionably in this case derived from no other source but the statistics themselves; in the above remark the attempt seems to be made to invert this process, and to derive the sequence of events from the mere numerical statement of the proportions in which they occur.
If this comment had been about the sequence of heads and tails when flipping a coin, it would have made sense. It would mean that the structure of the system allows us to predict with some certainty what the outcome will be when treated in a certain way, and that over time, experience would support our prediction. But when applied in a broader sense to the facts of nature, it seems to lack real significance. Let's test it with an example. Despite the randomness of individual births, we find that male children typically outnumber female children at a ratio of about 106 to 100 over time. Now, if we were told that this is simply due to “the development of their respective probabilities,” would that be more than just a somewhat pretentious restatement of the previously mentioned fact? The probability is just that ratio, and it clearly comes from the statistics themselves; in the previous remark, it seems like an attempt is made to reverse this process and derive the sequence of events from the mere numerical statement of the proportions in which they occur.
§ 15. It will very likely be replied that by the probability above mentioned is meant, not the mere numerical proportion between, the births, but some fact in our constitution upon which this proportion depends; that just as there was a relation of equality between the two sides of the penny, which produced the ultimate equality in the number of heads and tails, so there may be something in our constitution or circumstances in the proportion of 106 to 100, which produces the observed statistical result. When this something, whatever it might be, was discovered, the observed numbers might be supposed capable of being determined beforehand. Even if this were the case, however, it must not be forgotten that there could hardly fail to be, in combination with such causes, other concurrent conditions in order to produce the ultimate result; just as besides the shape of the penny, we had also to take into account the nature of the ‘randomness’ with which it was tossed. What these may be, no one at present can undertake to say, for the best physiologists seem indisposed to hazard even a guess upon the subject.[7] But without going into particulars, one may 91 assert with some confidence that these conditions cannot well be altogether independent of the health, circumstances, manners and customs, &c. (to express oneself in the vaguest way) of the parents; and if once these influencing elements are introduced, even as very minute factors, the results cease to be dependent only on fixed and permanent conditions. We are at once letting in other conditions, which, if they also possess the characteristics that distinguish Probability (an exceedingly questionable assumption), must have that fact specially proved about them. That this should be the case indeed seems not merely questionable, but almost certainly impossible; for these conditions partaking of the nature of what we term generally, Progress and Civilization, cannot be expected to show any permanent disposition to hover about an average.
§ 15. It's likely that the response will be that the probability mentioned refers not just to the numbers of births, but to some aspect of our makeup that affects this ratio; just like there's a relationship of equality between the two sides of a coin, which leads to the final equality of heads and tails, there might be something in our nature or circumstances related to the 106 to 100 ratio that produces the observed statistical outcome. Once this factor, whatever it is, is uncovered, it might be possible to predict the observed numbers in advance. However, even if this is true, we must remember that there are probably other contributing factors that must be considered to produce the final outcome; similar to how we need to account for the nature of the 'randomness' of the coin toss in addition to the shape of the coin. What these factors may be is something no one can currently determine, as the best physiologists seem reluctant to even speculate on the subject.[7] But without getting into specifics, one can reasonably assert that these factors can't be completely independent of the health, circumstances, habits, and customs, etc. (to put it in very vague terms) of the parents; and once these influential factors are introduced, even as minor aspects, the results are no longer solely reliant on fixed and permanent conditions. We are immediately introducing additional factors, which, if they also exhibit characteristics that define Probability (a highly debatable assumption), must have that fact specially proved about them. That this should ever be the case seems not just dubious, but almost certainly impossible; for these factors, which are part of what we generally refer to as Progress and Civilization, cannot be expected to show any permanent tendency to hover around an average.
§ 16. The reader who is familiar with Probability is of course acquainted with the celebrated theorem of James Bernoulli. This theorem, of which the examples just adduced are merely particular cases, is generally expressed somewhat as follows:—in the long run all events will tend to occur with a relative frequency proportional to their objective probabilities. With the mathematical proof of this theorem we need not trouble ourselves, as it lies outside the province of this work; but indeed if there is any value in the foregoing criticism, the basis on which the mathematics rest is faulty, owing to there being really nothing which we can with propriety call an objective probability.
§ 16. A reader familiar with probability is surely aware of the well-known theorem of James Bernoulli. This theorem, of which the examples just given are merely particular cases, is generally stated like this: in the long run, all events will tend to occur with a relative frequency proportional to their objective probabilities. We don't need to dive into the mathematical proof of this theorem, as it's outside the scope of this work; but if there is any force in the foregoing critique, the foundation on which the mathematics is built is faulty, since there is really nothing we can properly call an objective probability.
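For readers who want the theorem's content in concrete form, here is a minimal sketch over an idealized series, with an arbitrary fixed chance p assigned to the event; whether anything 'objective' answers to that fixed p is precisely what the text disputes.

```python
import random

# Bernoulli's theorem read over an idealized series: if each event is
# assigned a fixed chance p, its relative frequency in a long run of trials
# tends toward p.  The value p = 0.3 and the run lengths are arbitrary.
random.seed(0)
p = 0.3
for n in (100, 10_000, 1_000_000):
    hits = sum(random.random() < p for _ in range(n))
    print(n, hits / n)  # drifts toward 0.3 as n grows
```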
If one might judge by the interpretation and uses to 92 which this theorem is sometimes exposed, we should regard it as one of the last remaining relics of Realism, which after being banished elsewhere still manages to linger in the remote province of Probability. It would be an illustration of the inveterate tendency to objectify our conceptions, even in cases where the conceptions had no right to exist at all. A uniformity is observed; sometimes, as in games of chance, it is found to be so connected with the physical constitution of the bodies employed as to be capable of being inferred beforehand; though even here the connection is by no means so necessary as is commonly supposed, owing to the fact that in addition to these bodies themselves we have also to take into account their relation to the agencies which influence them. This constitution is then converted into an ‘objective probability’, supposed to develop into the sequence which exhibits the uniformity. Finally, this very questionable objective probability is assumed to exist, with the same faculty of development, in all the cases in which uniformity is observed, however little resemblance there may be between these and games of chance.
If we judge by the way this theorem is sometimes interpreted and used, we should see it as one of the last remnants of Realism, which, after being pushed out elsewhere, still manages to hang on in the obscure area of Probability. It illustrates our strong tendency to make our ideas concrete, even when those ideas shouldn't exist at all. A pattern can be seen; sometimes, like in games of chance, it seems to be linked to the physical traits of the objects involved in a way that we can predict ahead of time; however, even here, that connection is not as necessary as people often think, because, besides these objects themselves, we also have to consider their relationship to the factors that affect them. This setup is then turned into an 'objective probability,' which is thought to evolve into the pattern that shows the uniformity. In the end, this very dubious objective probability is believed to exist, with the same ability to develop, in all situations where uniformity is observed, no matter how little these cases resemble games of chance.
§ 17. How utterly inappropriate any such conception is in most of the cases in which we find statistical uniformity, will be obvious on a moment's consideration. The observed phenomena are generally the product, in these cases, of very numerous and complicated antecedents. The number of crimes, for instance, annually committed in any society, is a function amongst other things, of the strictness of the law, the morality of the people, their social condition, and the vigilance of the police, each of these elements being in itself almost infinitely complex. Now, as a result of all these agencies, there is some degree of uniformity; but what has been called above the change of type, which it sooner or later tends to display, is unmistakeable. The average annual 93 numbers do not show a steady gradual approach towards what might be considered in some sense a limiting value, but, on the contrary, fluctuate in a way which, however it may depend upon causes, shows none of the permanent uniformity which is characteristic of games of chance. This fact, combined with the obvious arbitrariness of singling out, from amongst the many and various antecedents which produced the observed regularity, a few only, which should constitute the objective probability (if we took all, the events being absolutely determined, there would be no occasion for an appeal to probability in the case), would have been sufficient to prevent any one from assuming the existence of any such thing, unless the mistaken analogy of other cases had predisposed him to seek for it.
§ 17. It's clear, on a moment's consideration, how utterly inappropriate any such idea is in most cases where we observe statistical uniformity. The phenomena we notice usually result from a large number of complex factors. For example, the number of crimes committed each year in a society depends on various elements including the strictness of the law, the morality of the people, their social conditions, and the alertness of the police, each of which is incredibly intricate on its own. As a result of these influences, we see some level of uniformity; however, what was earlier called the change of type, which such figures sooner or later tend to display, is unmistakable. The average annual figures do not indicate a steady approach toward any potential limiting value; instead, they fluctuate in a way that, however it may depend on causes, shows none of the permanent uniformity typical of games of chance. This fact, along with the obvious arbitrariness of singling out, from the many factors that produced the observed regularity, only a few to constitute the objective probability (if we took all of them, the events would be entirely determined and there would be no need to invoke probability), should have been enough to stop anyone from assuming such a thing exists, unless a mistaken analogy with other cases had predisposed them to look for it.
There is a familiar practical form of the same error, the tendency to which may not improbably be derived from a similar theoretical source. It is that of continuing to accumulate our statistical data to an excessive extent. If the type were absolutely fixed we could not possibly have too many statistics; the longer we chose to take the trouble of collecting them the more accurate our results would be. But if the type is changing, in other words, if some of the principal causes which aid in their production have, in regard to their present degree of intensity, strict limits of time or space, we shall do harm rather than good if we overstep these limits. The danger of stopping too soon is easily seen, but in avoiding it we must not fall into the opposite error of going on too long, and so getting either gradually or suddenly under the influence of a changed set of circumstances.
There’s a common practical version of the same mistake, which likely comes from a similar theoretical source. It involves continually gathering our statistical data to an excessive degree. If the type were completely fixed, we couldn’t possibly have too many statistics; the longer we took the time to collect them, the more accurate our results would be. But if the type is changing—in other words, if some of the main causes that contribute to their production have strict time or space limits regarding their current intensity—we could end up causing more harm than good if we exceed these limits. It’s easy to see the risk of stopping too soon, but in trying to avoid that, we must not make the opposite mistake of going on too long, which could lead us to be influenced, either gradually or suddenly, by a changed set of circumstances.
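A toy numerical version of this caution may help, with the drift in the underlying rate and the sample sizes invented purely for illustration: when the type is changing, the average of the whole accumulated record can be a worse guide to the present state of things than an average over recent observations only.

```python
import random

# A toy version of the caution above.  Suppose the underlying rate of some
# recorded event drifts slowly over time (the drift and sample sizes below
# are invented purely for illustration).  The average over the whole
# accumulated record is then a worse guide to the present rate than an
# average over recent years only.
random.seed(3)
yearly_rates = []
for year in range(100):
    true_rate = 0.10 + 0.002 * year  # the 'type' is slowly changing
    observed = sum(random.random() < true_rate for _ in range(1_000)) / 1_000
    yearly_rates.append(observed)

print(round(0.10 + 0.002 * 99, 3))                      # present true rate, about 0.298
print(round(sum(yearly_rates) / len(yearly_rates), 3))  # all-time average, about 0.20
print(round(sum(yearly_rates[-10:]) / 10, 3))           # last ten years, about 0.29
```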
§ 18. This chapter was intended to be devoted to a consideration, not of the processes by which nature produces the series with which we are concerned, but of the theoretic basis of the methods by which we can determine the existence 94 of such series. But it is not possible to keep the two enquiries apart, for here, at any rate, the old maxim prevails that to know a thing we must know its causes. Recur for a minute to the considerations of the last chapter. We there saw that there was a large class of events, the conditions of production of which could be said to consist of (1) a comparatively few nearly unchangeable elements, and (2) a vast number of independent and very changeable elements. At least if there were any other elements besides these, we are assumed either to make special allowance for them, or to omit them from our enquiry. Now in certain cases, such as games of chance, the unchangeable elements may without practical error be regarded as really unchangeable throughout any range of time and space. Hence, as a result, the deductive method of treatment becomes in their case at once the most simple, natural, and conclusive; but, as a further consequence, the statistics of the events, if we choose to appeal to them, may be collected ad libitum with better and better approximation to truth. On the other hand, in all social applications of Probability, the unchangeable causes can only be regarded as really unchangeable under many qualifications. We know little or nothing of them directly; they are often in reality numerous, indeterminate, and fluctuating; and it is only under the guarantee of stringent restrictions of time and place, that we can with any safety attribute to them sufficient fixity to justify our theory. Hence, as a result, the deductive method, under whatever name it may go, becomes totally inapplicable both in theory and practice; and, as a further consequence, the appeal to statistics has to be made with the caution in mind that we shall do mischief rather than good if we go on collecting too many of them.
§ 18. This chapter was meant to focus on the theoretical foundations of the methods we use to determine the existence of the series we're discussing, rather than the processes through which nature produces them. However, we can't completely separate these two inquiries, as the old saying goes: to understand something, we need to know its causes. Let's take a moment to revisit the points from the last chapter. We noted that there's a significant group of events whose production can be said to involve (1) a relatively small number of stable elements, and (2) a huge number of independent and highly variable elements. If there are any other elements beyond these, we either have to account for them specifically or leave them out of our investigation. In certain situations, like games of chance, the stable elements can be reasonably considered truly unchangeable across any range of time and space. As a result, the deductive method becomes the simplest, most natural, and most conclusive approach in these cases. Furthermore, the statistics of the events, if we choose to appeal to them, can be gathered ad libitum, with better and better approximation to the truth. On the flip side, in all social applications of Probability, the stable causes can only be seen as truly stable with several qualifications. Our direct knowledge of them is limited; they're often numerous, uncertain, and fluctuating; and we can only safely attribute enough consistency to them to support our theory when we apply strict restrictions of time and place. Consequently, the deductive method, by whatever name it goes, becomes completely inapplicable both theoretically and practically; and as a further implication, we need to approach statistical data carefully, as gathering too many of them can do more harm than good.
§ 19. The results of the last two chapters may be summed up as follows:—We have extended the conception 95 of a series obtained in the first chapter; for we have found that these series are mostly presented to us in groups. These groups are found upon examination to be formed upon approximately the same type throughout a very wide and varied range of experience; the causes of this agreement we discussed and explained in some detail. When, however, we extend our examination by supposing the series to run to a very great length, we find that they may be divided into two classes separated by important distinctions. In one of these classes (that containing the results of games of chance) the conditions of production, and consequently the laws of statistical occurrence, may be practically regarded as absolutely fixed; and the extent of the divergences from the mean seem to know no finite limit. In the other class, on the contrary (containing the bulk of ordinary statistical enquiries), the conditions of production vary with more or less rapidity, and so in consequence do the results. Moreover it is often impossible that variations from the mean should exceed a certain amount. The former we may term ideal series. It is they alone which show the requisite characteristics with any close approach to accuracy, and to make the theory of the subject tenable, we have really to substitute one of this kind for one of the less perfect ones of the other class, when these latter are under treatment. The former class have, however, been too exclusively considered by writers on the subject; and conceptions appropriate only to them, and not always even to them, have been imported into the other class. It is in this way that a general tendency to an excessive deductive or à priori treatment of the science has been encouraged.
§ 19. The results from the last two chapters can be summed up like this: We've expanded our understanding of the series introduced in the first chapter; we discovered that these series are mostly presented to us in groups. Analyzing these groups reveals that they are mostly formed around a similar type across a wide and diverse range of experiences; we discussed and explained the reasons for this consistency in detail. However, when we broaden our examination by supposing the series to run to a very great length, we find they can be divided into two categories separated by significant differences. In one of these categories (the one that includes the results of games of chance), the production conditions, and consequently the laws of statistical occurrence, can be viewed as essentially fixed; and the divergences from the mean seem to know no finite limit. In the other category, however (which encompasses most ordinary statistical investigations), the production conditions vary more or less quickly, and so in consequence do the results. Furthermore, it is often impossible for variations from the mean to exceed a certain amount. We can refer to the first category as ideal series. They alone display the necessary characteristics with any close approach to accuracy, and to make the theory of the subject tenable, we really have to substitute a series of this ideal kind for one of the less perfect ones of the other category when the latter are being treated. The first category has, however, been too exclusively considered by writers on the subject, and ideas suited only to them, and not always even to them, have been imported into the other category. This has encouraged a general tendency towards an excessive deductive or à priori treatment of the science.
1 This latter enquiry belongs to what may be termed the more purely logical part of this volume, and is entered on in the course of Chapter VI.
1 This latter inquiry falls under the more purely logical section of this book and is discussed in Chapter VI.
2 For the use of those not acquainted with the common notation employed in this subject, it may be remarked that HH is simply an abbreviated way of saying that the two successive throws of the penny give head; HT that the first of them gives head, and the second tail; and so on with the remaining symbols.
2 For those who aren't familiar with the usual notation used in this topic, it’s worth noting that HH is just a shorthand way of stating that the two consecutive tosses of the penny result in heads; HT indicates that the first toss results in heads, and the second in tails; and this pattern continues with the other symbols.
3 I am endeavouring to treat this rule of Sufficient Reason in a way that shall be legitimate in the opinion of those who accept it, but there seem very great doubts whether a contradiction is not involved when we attempt to extract results from it. If the sides are absolutely alike, how can there he any difference between the terms of the series? The succession seems then reduced to a dull uniformity, a mere iteration of the same thing many times; the series we contemplated has disappeared. If the sides are not absolutely alike, what becomes of the applicability of the rule?
3 I'm trying to approach the rule of Sufficient Reason in a way that will be considered valid by those who believe in it, but there are significant doubts about whether there's a contradiction when we try to draw conclusions from it. If the sides are completely the same, how can there be any difference between the terms of the series? It then seems like the progression is turned into a dull sameness, just repeating the same thing over and over; the series we were considering has vanished. If the sides aren't completely the same, what happens to the applicability of the rule?
4 Formal Logic, p. 185. Principles of Science, p. 208.
4 Formal Logic, p. 185. Principles of Science, p. 208.
5 The close connection between these subjects is well indicated in the title of Mr Whitworth's treatise, Choice and Chance.
5 The strong link between these topics is clearly shown in the title of Mr. Whitworth's work, Choice and Chance.
7 An opinion prevailed rather at one time (quoted and supported by Quetelet amongst others) that the relative ages of the parents had something to do with the sex of the offspring. If this were so, it would quite bear out the above remarks. As a matter of fact, it should be observed, that the proportion of 106 to 100 does not seem by any means universal in all countries or at all times. For various statistical tables on the subject see Quetelet, Physique Sociale, Vol. I. 166, 173, 238.
7 An opinion once held (supported by Quetelet among others) was that the ages of the parents influenced the sex of their children. If this were true, it would confirm the earlier points made. In reality, it's important to note that the ratio of 106 to 100 doesn't appear to be universal across all countries or at all times. For various statistical tables on the topic, see Quetelet, Physique Sociale, Vol. I. 166, 173, 238.
CHAPTER 5.
THE CONCEPTION RANDOMNESS AND ITS SCIENTIFIC TREATMENT.
§ 1. There is a term of frequent occurrence in treatises on Probability, and which we have already had repeated occasion to employ, viz. the designation random applied to an event, as in the expression ‘a random distribution’. The scientific conception involved in the correct use of this term is, I apprehend, nothing more than that of aggregate order and individual irregularity (or apparent irregularity), which has been already described in the preceding chapters. A brief discussion of the requisites in this scientific conception, and in particular of the nature and some of the reasons for the departure from the popular conception, may serve to clear up some of the principal remaining difficulties which attend this part of our subject.
§ 1. There is a term commonly found in texts on Probability that we have already used multiple times, namely the term random, as in ‘a random distribution’. The scientific idea behind the correct use of this term is, I believe, simply that of overall order combined with individual irregularity (or seeming irregularity), which has already been explained in the previous chapters. A brief discussion of the elements in this scientific idea, especially regarding its nature and some reasons for the differences from the common understanding, may help clarify some of the main remaining challenges we face in this area of our topic.
The original,[1] and still popular, signification of the term is of course widely different from the scientific. What it looks to is the origin, not the results, of the random performance, and it has reference rather to the single action than to a group or series of actions. Thus, when a man 97 draws a bow ‘at a venture’, or ‘at random’, we mean only to point out the aimless character of the performance; we are contrasting it with the definite intention to hit a certain mark. But it is none the less true, as already pointed out, that we can only apply processes of inference to such performances as these when we regard them as being capable of frequent, or rather of indefinitely extended repetition.
The original,[1] and still popular, meaning of the term is obviously very different from the scientific one. It focuses on the origin, not the results, of the random act, and it refers more to the single action than to a group or series of actions. So, when a person draws a bow 'on a whim' or 'at random', we're simply pointing out the aimless nature of the act; we're contrasting it with the clear intention to hit a specific target. However, it remains true, as mentioned earlier, that we can only apply inference processes to such acts when we see them as being capable of frequent, or rather indefinitely extended, repetition.
Begin with an illustration. Perhaps the best typical example that we can give of the scientific meaning of random distribution is afforded by the arrangement of the drops of rain in a shower. No one can give a guess whereabouts at any instant a drop will fall, but we know that if we put out a sheet of paper it will gradually become uniformly spotted over; and that if we were to mark out any two equal areas on the paper these would gradually tend to be struck equally often.
Begin with an illustration. Perhaps the best typical example we can provide of the scientific meaning of random distribution is the way raindrops fall during a shower. No one can predict exactly where a drop will land at any moment, but we know that if we lay out a sheet of paper, it will gradually become evenly spotted. Additionally, if we marked out two equal areas on the paper, those areas would eventually be hit equally often.
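That picture is easy to imitate numerically; the following sketch (a modern illustration only, with arbitrary seed and trial count) scatters points uniformly over a unit sheet and counts how often each of two equal areas is struck.

```python
import random

# The rain-drop picture: scatter points uniformly over a unit sheet and
# count how often each of two equal areas (left half, right half) is struck.
random.seed(11)
left = right = 0
for _ in range(100_000):
    x = random.random()
    if x < 0.5:
        left += 1
    else:
        right += 1
print(left, right)  # the two counts come out very nearly equal
```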
§ 2. I. Any attempt to draw inferences from the assumption of random arrangement must postulate the occurrence of this particular state of things at some stage or other. But there is often considerable difficulty, leading occasionally to some arbitrariness, in deciding the particular stage at which it ought to be introduced.
§ 2. I. Any attempt to draw conclusions based on the idea of random arrangement must assume that this specific situation occurs at some point. However, it can be quite challenging, sometimes leading to arbitrary decisions, to determine the exact point at which it should be introduced.
(1) Thus, in many of the problems discussed by mathematicians, we look as entirely to the results obtained, and think as little of the actual process by which they are obtained, as when we are regarding the arrangement of the drops of rain. A simple example of this kind would be the following. A pawn, diameter of base one inch, is placed at random on a chess-board, the diameter of the squares of which is one inch and a quarter: find the chance that its base shall lie across one of the intersecting lines. Here we may imagine the pawns to be so to say rained down vertically upon the board, 98 and the question is to find the ultimate proportion of those which meet a boundary line to the total of those which fall. The problem therefore becomes a merely geometrical one, viz. to determine the ratio of a certain area on the board to the whole area. The determination of this ratio is all that the mathematician ever takes into account.
(1) So, in many of the problems discussed by mathematicians, we focus entirely on the results obtained and pay little attention to the actual process through which they are achieved, just like when we look at the arrangement of raindrops. A simple example of this would be the following. A pawn, with a base diameter of one inch, is placed randomly on a chessboard, where each square has a diameter of one and a quarter inches: find the chance that its base will lie across one of the intersecting lines. Here, we can imagine the pawns being, so to speak, dropped straight down onto the board, and the question is to find the ultimate proportion of those that touch a boundary line compared to the total that fall. The problem, therefore, becomes a purely geometrical one, namely, to determine the ratio of a certain area on the board to the overall area. Determining this ratio is all that the mathematician considers.
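The reduction to a ratio of areas can be checked with a minimal modern sketch in Python (an illustration added here, not part of Venn's text). It takes the squares' 'diameter' to mean their side, assumes the pawn's centre falls uniformly over an unbounded grid of such squares (so edge effects of the board are ignored), and compares a simulated frequency with the geometrical ratio.

import random

SQUARE = 1.25   # side of a chess-board square, in inches (the 'diameter' of the text)
RADIUS = 0.5    # radius of the pawn's base, whose diameter is one inch

def crossing_chance(trials=1_000_000):
    """Estimate the chance that the pawn's base lies across a boundary line."""
    hits = 0
    for _ in range(trials):
        # The centre of the pawn falls uniformly within one square of the grid.
        x = random.uniform(0, SQUARE)
        y = random.uniform(0, SQUARE)
        # The base clears every line only if the centre keeps half an inch
        # away from all four edges of the square it fell in.
        if not (RADIUS <= x <= SQUARE - RADIUS and RADIUS <= y <= SQUARE - RADIUS):
            hits += 1
    return hits / trials

geometric = 1 - ((SQUARE - 2 * RADIUS) / SQUARE) ** 2   # ratio of areas: 24/25
print(crossing_chance(), geometric)                     # both come out near 0.96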
Now take the following. A straight brittle rod is broken at random in two places: find the chance that the pieces can make a triangle.[2] Since the only condition for making a triangle with three straight lines is that each two shall be greater than the third, the problem seems to involve the same general conception as in the former case. We must conceive such rods breaking at one pair of spots after another,—no one can tell precisely where,—but showing the same ultimate tendency to distribute these spots throughout the whole length uniformly. As in the last case, the mathematician thinks of nothing but this final result, and pays no heed to the process by which it may be brought about. Accordingly the problem is again reduced to one of mensuration, though of a somewhat more complicated character.
Now consider this. A straight, brittle rod is randomly broken in two places: find the probability that the pieces can form a triangle.[2] Since the only requirement for forming a triangle with three straight lines is that any two of them together must be longer than the third, this problem seems to involve the same general idea as before. We need to imagine the rods breaking at various pairs of spots—no one knows exactly where—while maintaining the same overall tendency to distribute these spots evenly along the entire length. Like in the previous case, the mathematician focuses solely on the final outcome and ignores the process by which it occurs. Therefore, the problem is again reduced to one of measurement, although it is a bit more complex.
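Here too only the final uniform scattering of the break-points matters, so the same kind of sketch applies (a modern illustration, assuming the two breaks fall independently and uniformly along the rod; under that assumption the measured chance settles near 1/4).

import random

def triangle_chance(trials=1_000_000):
    """Break a unit rod at two uniform points; can the three pieces form a triangle?"""
    successes = 0
    for _ in range(trials):
        a, b = sorted((random.random(), random.random()))
        pieces = (a, b - a, 1 - b)
        # A triangle is possible only if every piece is shorter than the other
        # two together, i.e. only if no piece reaches half the rod's length.
        if max(pieces) < 0.5:
            successes += 1
    return successes / trials

print(triangle_chance())   # tends towards 0.25 as the trials accumulate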
§ 3. (2) In another class of cases we have to contemplate an intermediate process rather than a final result; but the same conception has to be introduced here, though it is now applied to the former stage, and in consequence will not in general apply to the latter.
§ 3. (2) In another category of cases, we need to consider an ongoing process instead of just a final outcome; however, the same idea needs to be applied here, though it's now relevant to the earlier stage and typically won't apply to the final stage.
For instance: a shot is fired at random from a gun whose maximum range (i.e. at 45° elevation) is 3000 yards: what is the chance that the actual range shall exceed 2000 yards? The ultimately uniform (or random) distribution here is commonly assumed to apply to the various directions in which the gun can be pointed; all possible directions above 99 the horizontal being equally represented in the long run. We have therefore to contemplate a surface of uniform distribution, but it will be the surface, not of the ground, but of a hemisphere whose centre is occupied by the man who fires. The ultimate distribution of the bullets on the spots where they strike the ground will not be uniform. The problem is in fact to discover the law of variation of the density of distribution.
For example, if a shot is fired randomly from a gun whose maximum range (the range attained at 45° elevation) is 3000 yards, what are the chances that the actual range will exceed 2000 yards? It's generally assumed that the ultimately uniform (or random) distribution applies to the various directions in which the gun can be pointed; all possible directions above the horizontal are equally represented over time. So, we need to consider a surface of uniform distribution, but it will be the surface, not of the ground, but of a hemisphere centered on the person firing the gun. The distribution of where the bullets land on the ground won't be uniform. The real challenge is to determine how the density of distribution varies.
The above is, I presume, the treatment generally adopted in solving such a problem. But there seems no absolute necessity for any such particular choice. It is surely open to any one to maintain[3] that his conception of the randomness of the firing is assigned by saying that it is likely that a man should begin by facing towards any point of the compass indifferently, and then proceed to raise his gun to any angle indifferently. The stage of ultimately uniform distribution here has receded a step further back. It is not assigned directly to the surface of an imaginary hemisphere, but to the lines of altitude and azimuth drawn on that surface. Accordingly, the distribution over the hemisphere itself will not now be uniform,—there will be a comparative crowding up towards the pole,—and the ultimate distribution over the ground will not be the same as before.
The above is, I assume, the approach usually taken in addressing such a problem. However, there doesn’t seem to be any strict requirement for choosing a specific method. It's certainly open for anyone to argue that their view of randomness in firing is based on the idea that a person is likely to start facing any direction on the compass without preference and then aim their gun at any angle without preference. The stage of a completely uniform distribution has now moved a step further back. It is no longer directly linked to the surface of an imaginary hemisphere, but to the altitude and azimuth lines drawn on that surface. As a result, the distribution across the hemisphere itself won’t be uniform anymore—there will be a relative concentration near the pole—and the final distribution on the ground will differ from what it was before.
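The dependence of the answer on where the uniformity is postulated can be made concrete with a short sketch (a modern illustration; it assumes the idealized projectile rule, range = maximum range × sin 2θ for elevation θ, which Venn does not state, and it ignores air resistance). The two assumptions described above give noticeably different chances that the range exceeds 2000 out of 3000 yards.

import math
import random

def range_exceeds(threshold=2000 / 3000, trials=1_000_000):
    # Assumption (a): directions uniform over the hemisphere's surface,
    # which makes sin(elevation) uniform on (0, 1).
    hemisphere = sum(math.sin(2 * math.asin(random.random())) > threshold
                     for _ in range(trials)) / trials
    # Assumption (b): the elevation angle itself uniform between 0 and 90 degrees.
    elevation = sum(math.sin(2 * random.uniform(0, math.pi / 2)) > threshold
                    for _ in range(trials)) / trials
    return hemisphere, elevation

print(range_exceeds())   # roughly 0.58 under (a) and 0.54 under (b)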
§ 4. Difficulties of this kind, arising out of the uncertainty as to what stage should be selected for that of uniform distribution, will occasionally present themselves. For instance: let a book be taken at random out of a bookcase; what is the chance of hitting upon some assigned volume? I hardly know how this question would commonly be treated. If we were to set our man opposite the middle of the shelf 100 and inquire what would generally happen in practice, supposing him blindfolded, there cannot be much doubt that the volumes would not be selected equally often. On the contrary, it is likely that there would be a tendency to increased frequency about a centre indicated by the height of his shoulder, and (unless he be left-handed) a trifle to the right of the point exactly opposite his starting point.
§ 4. Such difficulties, stemming from the uncertainty about which stage to choose for uniform distribution, can occasionally arise. For example, if someone randomly picks a book from a bookshelf, what are the chances they'll select a specific title? I'm not sure how this question would typically be addressed. If we position our subject in front of the middle of the shelf and ask what usually happens in practice, assuming they're blindfolded, it seems unlikely that the books would be selected with equal frequency. More likely, there would be a tendency to select more often from a center point around the height of their shoulder, and (unless they are left-handed) a bit to the right of the point directly opposite where they started.
If the question were one which it were really worth while to work out on these lines we should be led a long way back. Just as we imagined our rifleman's position (on the second supposition) to be determined by two independent coordinates of assumed continuous and equal facility, so we might conceive our making the attempt to analyse the man's movements into a certain number of independent constituents. We might suppose all the various directions from his starting point, along the ground, to be equally likely; and that when he reaches the shelves the random motion of his hand is to be regulated after the fashion of a shot discharged at random.
If the question were one that was truly worth exploring along these lines, we would be taken back quite a bit. Just as we imagined our rifleman's position (based on the second assumption) to be determined by two independent coordinates, each assumed to vary continuously and with equal facility, we might think about breaking down the man's movements into a certain number of independent components. We could assume that all the various directions from his starting point on the ground are equally probable, and that when he gets to the shelves, the random motion of his hand is governed in the same fashion as a shot fired at random.
The above would be one way of setting about the statement of the problem. But the reader will understand that all which I am here proposing to maintain is that in these, as in every similar case, we always encounter, under this conception of ‘randomness’, at some stage or other, this postulate of ultimate uniformity of distribution over some assigned magnitude: either time; or space, linear, superficial, or solid. But the selection of the stage at which this is to be applied may give rise to considerable difficulty, and even arbitrariness of choice.
The above is one way to approach the statement of the problem. However, the reader should understand that all I am proposing to maintain is that in these cases, just like in any similar situation, we always come across, under this concept of 'randomness', at some stage or other, the postulate of ultimate uniformity of distribution over some assigned magnitude: whether it's time or space, whether linear, surface, or three-dimensional. However, choosing the stage at which this applies can lead to considerable difficulty, and even arbitrariness in the choice.
§ 5. Some years ago there was a very interesting discussion upon this subject carried on in the mathematical part of the Educational Times (see, especially, Vol. VII.). As not unfrequently happens in mathematics there was an almost 101 entire accord amongst the various writers as to the assumptions practically to be made in any particular case, and therefore as to the conclusion to be drawn, combined with a very considerable amount of difference as to the axioms and definitions to be employed. Thus Mr M. W. Crofton, with the substantial agreement of Mr Woolhouse, laid it down unhesitatingly that “at random” has “a very clear and definite meaning; one which cannot be better conveyed than by Mr Wilson's definition, ‘according to no law’; and in this sense alone I mean to use it.” According to any scientific interpretation of ‘law’ I should have said that where there was no law there could be no inference. But ultimate tendency towards equality of distribution is as much taken for granted by Mr Crofton as by any one else: in fact he makes this a deduction from his definition:—“As this infinite system of parallels are drawn according to no law, they are as thickly disposed along any part of the [common] perpendicular as along any other” (VII. p. 85). Mr Crofton holds that any kind of unequal distribution would imply law,—“If the points [on a plane] tended to become denser in any part of the plane than in another, there must be some law attracting them there” (ib. p. 84). The same view is enforced in his paper on Local Probability (in the Phil. Trans., Vol. 158). Surely if they tend to become equally dense this is just as much a case of regularity or law.
§ 5. A few years ago, there was a really interesting discussion on this topic published in the mathematical section of the Educational Times (see, especially, Vol. VII.). As often happens in mathematics, there was almost total agreement among various writers about the assumptions that should be made in each specific case, and therefore about the conclusions drawn, combined with quite a bit of disagreement over the axioms and definitions to be used. For instance, Mr. M. W. Crofton, with the strong support of Mr. Woolhouse, clearly stated that “at random” has “a very clear and definite meaning; one that can be best expressed by Mr. Wilson's definition, ‘according to no law’; and I intend to use it only in this sense.” According to any scientific interpretation of ‘law’, I would argue that where there is no law, there can be no inference. However, the ultimate tendency toward equal distribution is assumed by Mr. Crofton as much as by anyone else: in fact, he makes this a deduction from his definition:—“As this infinite system of parallels is drawn according to no law, they are distributed as densely along any part of the [common] perpendicular as along any other” (VII. p. 85). Mr. Crofton believes that any kind of unequal distribution would imply a law,—“If the points [on a plane] tended to become denser in one area of the plane than in another, there must be some law attracting them there” (ib. p. 84). The same perspective is emphasized in his paper on Local Probability (in the Phil. Trans., Vol. 158). Surely, if the points tend to become equally dense, that is just as much a case of regularity or law.
It may be remarked that wherever any serious practical consequences turn upon duly securing the desired randomness, it is always so contrived that no design or awkwardness or unconscious one-sidedness shall disturb the result. The principal case in point here is of course afforded by games of chance. What we want, when we toss a die, is to secure that all numbers from 1 to 6 shall be equally often represented in the long run, but that no person shall be able to predict the 102 individual occurrence. We might, in our statement of a problem, as easily postulate ‘a number thought of at random’ as ‘a shot fired at random’, but no one would risk his chances of gain and loss on the supposition that this would be done with continued fairness. Accordingly, we construct a die whose sides are accurately alike, and it is found that we may do almost what we like with this, at any previous stage to that of its issue from the dice box on to the table, without interfering with the random nature of the result.
It can be noted that whenever serious practical consequences depend on achieving the desired randomness, it's always arranged so that no design, awkwardness, or unintentional bias disrupts the outcome. A key example of this is games of chance. What we want when we roll a die is to ensure that all numbers from 1 to 6 are represented equally over time, while making sure no one can predict the outcome of a specific roll. We could easily say a ‘number thought of at random’ or a ‘shot fired at random’ in our problem statement, but nobody would gamble on the assumption that this would happen fairly every time. Therefore, we create a die where the sides are perfectly uniform, and we find that we can do almost anything with it, at any stage before it comes out of the dice box and lands on the table, without affecting the randomness of the result.
§ 6. II. Another characteristic in which the scientific conception seems to me to depart from the popular or original signification is the following. The area of distribution which we take into account must be a finite or limited one. The necessity for this restriction may not be obvious at first sight, but the consideration of one or two examples will serve to indicate the point at which it makes itself felt. Suppose that one were asked to choose a number at random, not from a finite range, but from the inexhaustible possibilities of enumeration. In the popular sense of the term,—i.e. of uttering a number without pausing to choose,—there is no difficulty. But a moment's consideration will show that no arrangement even tending towards ultimately uniform distribution can be secured in this way. No average could be struck with ever increasing steadiness. So with spatial infinity. We can rationally speak of choosing a point at random in a given straight line, area, or volume. But if we suppose the line to have no end, or the selection to be made in infinite space, the basis of ultimate tendency towards what may be called the equally thick deposit of our random points fails us utterly.
§ 6. II. Another way the scientific understanding differs from the common or original meaning is as follows. The area we consider must be finite or limited. At first, this limitation may not seem necessary, but looking at a couple of examples will highlight its importance. Imagine being asked to pick a number at random, not from a limited set, but from endless possibilities. In the common usage of the term—meaning to say a number without thinking—there’s no problem. However, a moment’s thought reveals that you can't achieve any kind of uniform distribution this way. There wouldn't be a reliable average that becomes consistently steady. The same goes for spatial infinity. We can logically talk about choosing a point at random along a specific line, area, or volume. But if we suppose the line goes on forever, or if we're making our choice in infinite space, we lose the foundation for a tendency toward what we might call an equally dense spread of our random points.
Similarly in any other example in which one of the magnitudes is unlimited. Suppose I fling a stick at random in a horizontal plane against a row of iron railings and 103 inquire for the chance of its passing through without touching them. The problem bears some analogy to that of the chessmen, and so far as the motion of translation of the stick is concerned (if we begin with this) it presents no difficulty. But as regards the rotation it is otherwise. For any assigned linear velocity there is a certain angular velocity below which the stick may pass through without contact, but above which it cannot. And inasmuch as the former range is limited and the latter is unlimited, we encounter the same impossibility as before in endeavouring to conceive a uniform distribution. Of course we might evade this particular difficulty by beginning with an estimate of the angular velocity, when we should have to repeat what has just been said, mutatis mutandis, in reference to the linear velocity.
Likewise, in any other scenario where one of the factors is unlimited. Imagine I throw a stick randomly in a horizontal plane at a row of iron railings and ask about the chance of it passing through without hitting them. This situation is somewhat similar to that of the chess pieces, and as far as the stick's linear motion is concerned (if we start with that), it poses no challenge. However, when it comes to its rotation, things change. For any given linear speed, there's a specific angular speed below which the stick can pass through without making contact, but above which it cannot. Since the first range is limited and the second is not, we face the same impossibility as before when trying to envision a uniform distribution. Of course, we could avoid this specific problem by starting with an estimate of the angular speed, in which case we would have to go over what’s just been mentioned, mutatis mutandis, regarding the linear speed.
§ 7. I am of course aware that there are a variety of problems current which seem to conflict with what has just been said, but they will all submit to explanation. For instance; What is the chance that three straight lines, taken or drawn at random, shall be of such lengths as will admit of their forming a triangle? There are two ways in which we may regard the problem. We may, for one thing, start with the assumption of three lines not greater than a certain length n, and then determine towards what limit the chance tends as n increases unceasingly. Or, we may maintain that the question is merely one of relative proportion of the three lines. We may then start with any magnitude we please to represent one of the lines (for simplicity, say, the longest of them), and consider that all possible shapes of a triangle will be represented by varying the lengths of the other two. In either case we get a definite result without need to make an attempt to conceive any random selection from the infinity of possible length.
§ 7. I am of course aware that a variety of problems in circulation seem to conflict with what I've just said, but they can all be explained. For example, what is the chance that three straight lines, taken or drawn at random, will have lengths that allow them to form a triangle? There are two ways we can look at this problem. First, we can assume that the three lines are no longer than a specific length n, and then see what limit the probability approaches as n keeps increasing. Alternatively, we can argue that the question is simply about the relative proportion of the three lines. In this case, we can choose any length we want to represent one of the lines (let's say the longest one for simplicity) and consider how varying the lengths of the other two will represent all possible triangle shapes. In both scenarios, we arrive at a definite result without needing to try to visualize any random selection from the infinite possibilities of length.
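A sketch of the first reading (a modern illustration, assuming the three lengths are drawn independently and uniformly up to the limit n): the estimated chance comes out near 1/2 whatever n is chosen, which is why the limit as n increases without bound is well defined.

import random

def triangle_from_lengths(n, trials=1_000_000):
    """Three lengths uniform on (0, n): how often can they form a triangle?"""
    ok = 0
    for _ in range(trials):
        a, b, c = (random.uniform(0, n) for _ in range(3))
        if a + b > c and b + c > a and c + a > b:
            ok += 1
    return ok / trials

for n in (1, 10, 1000):
    print(n, triangle_from_lengths(n))   # about 0.5 in every case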
So in what is called the “three-point problem”:—Three points in space are selected at random; find the chance of their forming an acute-angled triangle. What is done is to start with a closed volume,—say a sphere, from its superior simplicity,—find the chance (on the assumption of uniform distribution within this volume); and then conceive the continual enlargement without limit of this sphere. So regarded the problem is perfectly consistent and intelligible, though I fail to see why it should be termed a random selection in space rather than in a sphere. Of course if we started with a different volume, say a cube, we should get a different result; and it is therefore contended (e.g. by Mr Crofton in the Educational Times, as already referred to) that infinite space is more naturally and appropriately regarded as tended towards by the enlargement of a sphere than by that of a cube or any other figure.
So in what’s known as the “three-point problem”: three points in space are chosen at random; determine the likelihood of them forming an acute-angled triangle. What we do is start with a closed volume—like a sphere, due to its straightforward nature—calculate the probability (assuming uniform distribution within this volume); and then imagine continuously expanding this sphere without limit. Viewed this way, the problem is completely logical and understandable, although I don’t see why it’s referred to as a random selection in space rather than in a sphere. Of course, if we started with a different volume, like a cube, we would get a different outcome; and it’s argued (e.g. by Mr. Crofton in the Educational Times, as mentioned earlier) that infinite space is more appropriately considered as approached by the expansion of a sphere rather than by that of a cube or any other shape.
Again: A group of integers is taken at random; show that the number thus taken is more likely to be odd than even. What we do in answering this is to start with any finite number n, and show that of all the possible combinations which can be made within this range there are more odd than even. Since this is true irrespective of the magnitude of n, we are apt to speak as if we could conceive the selection being made at random from the true infinity contemplated in numeration.
Again: A group of integers is taken at random; show that the number of integers thus taken is more likely to be odd than even. To answer this, we begin with any finite number n and show that, among all the possible selections that can be made from within this range, more contain an odd number of members than an even number. Since this holds true regardless of the size of n, we tend to speak as if the selection could be made randomly from the true infinity considered in counting.
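The combinatorial fact appealed to can be verified directly (a modern sketch, taking a 'group' to be a non-empty selection, without regard to order, from the first n integers): for every finite n the selections of odd size outnumber those of even size, in fact by exactly one.

from math import comb

def odd_vs_even(n):
    """Count the non-empty subsets of an n-element set by the parity of their size."""
    odd = sum(comb(n, k) for k in range(1, n + 1, 2))
    even = sum(comb(n, k) for k in range(2, n + 1, 2))
    return odd, even

for n in (3, 10, 50):
    print(n, odd_vs_even(n))   # the odd-sized selections always exceed the even-sized by 1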
§ 8. Where these conditions cannot be secured then it seems to me that the attempt to assign any finite value to the probability fails. For instance, in the following problem, proposed by Mr J. M. Wilson, “Three straight lines are drawn at random on an infinite plane, and a fourth line is drawn at random to intersect them: find the probability of its passing through the triangle formed by the other three” (Ed. Times, Reprint, Vol. V. p. 82), he offers the following 105 solution: “Of the four lines, two must and two must not pass within the triangle formed by the remaining three. Since all are drawn at random, the chance that the last drawn should pass through the triangle formed by the other three is consequently 1/2.”
§ 8. When these conditions can't be guaranteed, it seems to me that trying to assign any finite value to the probability fails. For instance, in the following problem posed by Mr. J. M. Wilson, “Three straight lines are drawn at random on an infinite plane, and a fourth line is drawn at random to intersect them: find the probability of it passing through the triangle formed by the other three” (Ed. Times, Reprint, Vol. V. p. 82), he offers the following solution: “Of the four lines, two must and two must not pass within the triangle formed by the remaining three. Since all are drawn at random, the chance that the last drawn should pass through the triangle formed by the other three is consequently 1/2.”
I quote this solution because it seems to me to illustrate the difficulty to which I want to call attention. As the problem is worded, a triangle is supposed to be assigned by three straight lines. However large it may be, its size bears no finite ratio whatever to the indefinitely larger area outside it; and, so far as I can put any intelligible construction on the supposition, the chance of drawing a fourth random line which should happen to intersect this finite area must be reckoned as zero. The problem Mr Wilson has solved seems to me to be a quite different one, viz. “Given four intersecting straight lines, find the chance that we should, at random, select one that passes through the triangle formed by the other three.”
I mention this solution because it clearly highlights the difficulty I want to address. The problem states that a triangle is to be defined by three straight lines. No matter how large it is, its size has no measurable ratio to the infinitely larger area outside it; and, as far as I can interpret the assumption, the likelihood of drawing a fourth random line that intersects this finite area is effectively zero. The problem that Mr. Wilson has solved appears to be quite different, namely: “Given four intersecting straight lines, what are the chances that we randomly select one that passes through the triangle formed by the other three?”
The same difficulty seems to me to turn up in most other attempts to apply this conception of randomness to real infinity. The following seems an exact analogue of the above problem:—A number is selected at random, find the chance that another number selected at random shall be greater than the former;—the answer surely must be that the chance is unity, viz. certainty, because the range above any assigned number is infinitely greater than that below it. Or, expressed in the only language in which I can understand the term ‘infinity’, what I mean is this. If the first number be m and I am restricted to selecting up to n (n > m) then the chance of exceeding m is n − m : n; if I am restricted to 2n then it is 2n − m : 2n and so on. That is, however large n and m may be the expression is always intelligible; but, m being chosen first, n may be made as 106 much larger than m as we please: i.e. the chance may be made to approach as near to unity as we please.
The same issue seems to arise in most other attempts to apply this idea of randomness to true infinity. The following appears to be a direct parallel to the previous problem: A number is chosen at random; what are the odds that another randomly chosen number will be greater than the first? The answer must be that the probability is one, or certainty, because the range above any given number is infinitely larger than that below it. To put it in the only way I can grasp the concept of ‘infinity’: If the first number is m and I’m limited to selecting up to n (n > m), then the odds of exceeding m are n − m : n; if I’m restricted to 2n, then it’s 2n − m : 2n, and so on. That is, no matter how large n and m may be, the expression is always clear; however, since m is chosen first, n can be made as much larger than m as we want: meaning the odds can get as close to one as we desire.
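The limiting argument can be tabulated with Venn's own expression (a modern sketch): fixing m first and letting the allowed range n grow, the fraction (n − m)/n creeps as near to 1 as we please.

def chance_of_exceeding(m, ns):
    """Chance that a number drawn uniformly from 1..n exceeds a fixed m, for several n."""
    return [(n, (n - m) / n) for n in ns]

for n, p in chance_of_exceeding(m=100, ns=(200, 1_000, 100_000, 10_000_000)):
    print(n, round(p, 6))   # 0.5, 0.9, 0.999, 0.99999 -- approaching certainty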
I cannot but think that there is a similar fallacy in De Morgan's admirably suggestive paper on Infinity (Camb. Phil. Trans. Vol. 11.) when he is discussing the “three-point problem”:—i.e. given three points taken at random find the chance that they shall form an acute-angled triangle. All that he shows is, that if we start with one side as given and consider the subsequent possible positions of the opposite vertex, there are infinitely as many such positions which would form an acute-angled triangle as an obtuse: but, as before, this is solving a different problem.
I can't help but think there's a similar mistake in De Morgan's incredibly insightful paper on Infinity (Camb. Phil. Trans. Vol. 11.) when he talks about the “three-point problem”:—specifically, given three points taken at random, find the chance that they form an acute-angled triangle. All he demonstrates is that if we start with one side fixed and look at the possible positions of the opposite vertex, there are as many positions (infinitely many of each) that would give an acute-angled triangle as would give an obtuse one; but, as mentioned before, this is solving a different problem.
§ 9. The nearest approach I can make towards true indefinite randomness, or random selection from true indefiniteness, is as follows. Suppose a circle with a tangent line extended indefinitely in each direction. Now from the centre draw radii at random; in other words, let the semicircumference which lies towards the tangent be ultimately uniformly intersected by the radii. Let these radii be then produced so as to intersect the tangent line, and consider the distribution of these points of intersection. We shall obtain in the result one characteristic of our random distribution; i.e. no portion of this tangent, however small or however remote, but will find itself in the position ultimately of any small portion of the pavement in our supposed continual rainfall. That is, any such elementary patch will become more and more closely dotted over with the points of intersection. But the other essential characteristic, viz. that of ultimately uniform distribution, will be missing. There will be a special form of distribution,—what in fact will have to be discussed in a future chapter under the designation of a ‘law of error’,—by virtue of which the concentration will tend to be greatest at a certain point (that of contact with the circle), and will thin 107 out from here in each direction according to an easily calculated formula. The existence of such a state of things as this is quite opposed to the conception of true randomness.
§ 9. The closest I can get to true indefinite randomness, or random selection from true indefiniteness, is like this. Imagine a circle with a tangent line extended infinitely in both directions. From the center, draw radii at random; in other words, let the semicircumference toward the tangent be ultimately uniformly intersected by the radii. Then, extend these radii to intersect the tangent line and look at the distribution of these points of intersection. We will find one characteristic of our random distribution; that is, no portion of this tangent, no matter how small or far away, will fail to end up in a position similar to any small area of pavement during our imagined continuous rainfall. In other words, any tiny patch will become increasingly dotted with intersection points. However, the other key characteristic—namely, that of ultimately uniform distribution—will be absent. There will be a specific form of distribution—what we will actually discuss in a future chapter called a ‘law of error’—where the concentration will be greatest at a certain point (the point of contact with the circle) and will decrease from there in each direction according to an easily calculated formula. The existence of this situation is completely contrary to the idea of true randomness.
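In modern language the 'law of error' this construction produces is what is now called the Cauchy distribution: if the radius makes a uniformly distributed angle with the line joining the centre to the point of contact, it meets the tangent at distance r·tan(angle) from that point. A brief sketch (a modern illustration) exhibits both features noted in the text: every stretch of the tangent keeps collecting points, yet equal stretches are struck with very unequal frequency.

import math
import random

def tangent_hits(trials=1_000_000, r=1.0):
    """Uniform angle in (-90, 90) degrees; tally how often three equal intervals are hit."""
    counts = {"(0,1)": 0, "(1,2)": 0, "(10,11)": 0}
    for _ in range(trials):
        theta = random.uniform(-math.pi / 2, math.pi / 2)
        x = r * math.tan(theta)              # where the produced radius meets the tangent
        if 0 < x < 1:
            counts["(0,1)"] += 1
        elif 1 < x < 2:
            counts["(1,2)"] += 1
        elif 10 < x < 11:
            counts["(10,11)"] += 1
    return counts

print(tangent_hits())   # all three intervals fill up, but at very different rates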
§ 10. III. Apart from definitions and what comes of them, perhaps the most important question connected with the conception of Randomness is this: How in any given case are we to determine whether an observed arrangement is to be considered a random one or not? This question will have to be more fully discussed in a future chapter, but we are already in a position to see our way through some of the difficulties involved in it.
§ 10. III. Besides definitions and their implications, one of the most important questions related to the idea of Randomness is this: How do we decide whether a particular arrangement we observe is random or not? We will explore this question in more detail in a future chapter, but we can already begin to navigate some of the challenges it presents.
(1) If the events or objects under consideration are supposed to be continued indefinitely, or if we know enough about the mode in which they are brought about to detect their ultimate tendency,—or even, short of this, if they are numerous enough to be beyond practical counting,—there is no great difficulty. We are simply confronted with a question of fact, to be settled like other questions of fact. In the case of the rain-drops, watch two equal squares of pavement or other surfaces, and note whether they come to be more and more densely uniformly and evenly spotted over: if they do, then the arrangement is what we call a random one. If I want to know whether a tobacco-pipe really breaks at random, and would therefore serve as an illustration of the problem proposed some pages back, I have only to drop enough of them and see whether pieces of all possible lengths are equally represented in the long run. Or, I may argue deductively, from what I know about the strength of materials and the molecular constitution of such bodies, as to whether fractures of small and large pieces are all equally likely to occur.
(1) If the events or objects we're looking at are meant to go on forever, or if we know enough about how they happen to spot their overall trend,—or even if they’re so numerous that counting them isn’t practical,—it’s not that complicated. We’re just faced with a factual question, which can be answered like any other factual question. In the case of raindrops, observe two equal squares of pavement or other surfaces and check if they become more and more evenly and uniformly spotted over time: if they do, then that arrangement is what we call random. If I want to determine whether a tobacco pipe actually breaks randomly, which would illustrate the problem discussed a few pages back, I just need to drop enough of them and see if pieces of every possible length appear equally over time. Alternatively, I could deduce from what I know about material strength and the molecular structure of these objects whether small and large pieces are equally likely to break off.
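The test described here can be written out directly (a modern sketch; real rain would be replaced by actual observations, whereas the simulation below builds the uniformity in by hand and merely shows what the tallying looks like).

import random

def two_square_test(drops=1_000_000):
    """Drops uniform over a 2-by-1 strip: compare the counts in its two unit squares."""
    left = sum(random.uniform(0, 2) < 1 for _ in range(drops))
    right = drops - left
    return left, right, left / right

print(two_square_test())   # the ratio of the two counts settles ever closer to 1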
§ 11. The reader's attention must be carefully directed to a source of confusion here, arising out of a certain cross-division. 108 What we are now discussing is a question of fact, viz. the nature of a certain ultimate arrangement; we are not discussing the particular way in which it is brought about. In other words, the antithesis is between what is and what is not random: it is not between what is random and what is designed. As we shall see in a few moments it is quite possible that an arrangement which is the result,—if ever anything were so,—of ‘design’, may nevertheless present the unmistakeable stamp of randomness of arrangement.
§ 11. The reader's attention needs to be carefully focused on a potential source of confusion here, which comes from a certain cross-division. What we're discussing now is a matter of fact, specifically the nature of a certain ultimate arrangement; we are not talking about the specific way in which it is achieved. In other words, the contrast is between what is and what is not random: it is not between what is random and what is intentional. As we will see shortly, it is entirely possible for an arrangement that results—if anything can be said to do so—from ‘intention’ to still exhibit unmistakable signs of randomness in its arrangement.
Consider a case which has been a good deal discussed, and to which we shall revert again: the arrangement of the stars. The question here is rather complicated by the fact that we know nothing about the actual mutual positions of the stars, all that we can take cognizance of being their apparent or visible places as projected upon the surface of a supposed sphere. Appealing to what alone we can thus observe, it is obvious that the arrangement, as a whole, is not of the random sort. The Milky Way and the other resolvable nebulæ, as they present themselves to us, are as obvious an infraction of such an arrangement as would be the occurrence here and there of patches of ground in a rainfall which received a vast number more drops than the spaces surrounding them. If we leave these exceptional areas out of the question and consider only the stars which are visible by the naked eye or by slight telescopic power, it seems equally certain that the arrangement is, for the most part, a fairly representative random one. By this we mean nothing more than the fact that when we mark off any number of equal areas on the visible sphere these are found to contain approximately the same number of stars.
Consider a case that has been discussed quite a bit, and we will revisit it later: the arrangement of the stars. The situation here is complicated by the fact that we don’t know the actual mutual positions of the stars; all we have are their apparent or visible locations as projected on the surface of an imagined sphere. Based on what we can observe, it's clear that the overall arrangement is not random. The Milky Way and other resolvable nebulae, as we see them, obviously disrupt this arrangement just like patches of ground in rainfall that receive many more drops than the areas around them. If we set aside these exceptional areas and only look at the stars visible to the naked eye or with a small telescope, it also seems that the arrangement is mostly a fairly representative random one. This means that when we divide the visible sphere into equal areas, these areas tend to contain about the same number of stars.
The actual arrangement of the stars in space may also be of the same character: that is, the apparently denser aggregation may be apparent only, arising from the fact that 109 we are looking through regions which are not more thickly occupied but are merely more extensive. The alternative before us, in fact, is this. If the whole volume, so to say, of the starry heavens is tolerably regular in shape, then the arrangement of the stars is not of the random order; if that volume is very irregular in shape, it is possible that the arrangement within it may be throughout of that order.
The actual layout of the stars in space might also be similar: the seemingly denser clusters may just look that way because we’re viewing areas that aren’t actually more crowded, but simply more extensive. The choice we face is this: if the entire expanse of the starry sky is fairly regular in shape, then the placement of the stars isn’t random; however, if that expanse is very irregularly shaped, it’s possible that the arrangement within it could follow that randomness.
§ 12. (2) When the arrangement in question includes but a comparatively small number of events or objects, it becomes much more difficult to determine whether or not it is to be designated a random one. In fact we have to shift our ground, and to decide not by what has been actually observed but by what we have reason to conclude would be observed if we could continue our observation much longer. This introduces what is called ‘Inverse Probability’, viz. the determination of the nature of a cause from the nature of the observed effect; a question which will be fully discussed in a future chapter. But some introductory remarks may be conveniently made here.
§ 12. (2) When the arrangement in question includes only a relatively small number of events or objects, it becomes much harder to decide if it should be labeled as random. In fact, we need to change our approach and base our decision not on what has been directly observed, but on what we can reasonably conclude would be observed if we could continue our observations for a longer period. This leads to what’s known as ‘Inverse Probability’, which is the process of determining the nature of a cause based on the nature of the observed effect; a topic that will be thoroughly explored in a future chapter. However, some introductory comments can be helpful here.
Every problem of Probability, as the subject is here understood, introduces the conception of an ultimate limit, and therefore presupposes an indefinite possibility of repetition. When we have only a finite number of occurrences before us, direct evidence of the character of their arrangement fails us, and we have to fall back upon the nature of the agency which produces them. And as the number becomes smaller the confidence with which we can estimate the nature of the agency becomes gradually less.
Every problem in probability, as we understand it here, involves the idea of an ultimate limit and implies an endless possibility of repetition. When we only have a limited number of occurrences, direct evidence of how they are arranged is lacking, and we have to rely on the nature of the process that produces them. As the number decreases, our confidence in estimating the nature of that process gradually diminishes.
Begin with an intermediate case. There is a small lawn, sprinkled over with daisies: is this a random arrangement? We feel some confidence that it is so, on mere inspection; meaning by this that (negatively) no trace of any regular pattern can be discerned and (affirmatively) that if we take 110 any moderately small area, say a square yard, we shall find much about the same number of the plants included in it. But we can help ourselves by an appeal to the known agency of distribution here. We know that the daisy spreads by seed, and considering the effect of the wind and the continued sweeping and mowing of the lawn we can detect causes at work which are analogous to those by which the dealing of cards and the tossing of dice are regulated.
Start with an intermediate example. There's a small lawn dotted with daisies: is this arrangement random? We can confidently say it seems so, just by looking at it; which means that (negatively) we can't see any clear pattern, and (positively) if we take a moderately small area, like a square yard, we'll find roughly the same number of plants in that space. But we can clarify this by looking at the known factors involved. We know that daisies spread by seed, and considering the effects of the wind and the ongoing mowing of the lawn, we can identify causes at work that are similar to how card dealing and dice tossing are managed.
In the above case the appeal to the process of production was subsidiary, but when we come to consider the nature of a very small succession or group this appeal becomes much more important. Let us be told of a certain succession of ‘heads’ and ‘tails’ to the number of ten. The range here is far too small for decision, and unless we are told whether the agent who obtained them was tossing or designing we are quite unable to say whether or not the designation of ‘random’ ought to be applied to the result obtained. The truth must never be forgotten that though ‘design’ is sure to break down in the long run if it make the attempt to produce directly the semblance of randomness,[4] yet for a short spell it can simulate it perfectly. Any short succession, say of heads and tails, may have been equally well brought about by tossing or by deliberate choice.
In the situation described, the reference to the production process is secondary. However, when we look at a very small series or group, this reference becomes much more significant. Let’s say we have a specific sequence of ‘heads’ and ‘tails’ totaling ten. The sample size is too small to draw a conclusion, and unless we know whether the person conducting the flips was tossing the coin randomly or making a choice, we cannot determine if the result should be labeled as ‘random.’ It’s important to remember that while ‘design’ will eventually fail if it tries to create a true appearance of randomness, in the short term, it can mimic it perfectly. Any brief sequence of heads and tails could have resulted from either tossing or intentional choice.
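The arithmetic behind this remark is worth setting out (a modern sketch): under fair tossing every particular run of ten has the same chance, 1 in 1024, so nothing in a single short run can betray whether it was tossed or chosen.

from itertools import product

runs = list(product("HT", repeat=10))
print(len(runs), 1 / len(runs))            # 1024 possible runs, each with chance 1/1024
designed = tuple("HTHTHTHTHT")             # a run picked deliberately
tossed = tuple("HHTHTTTHHT")               # a run that might have come from tossing
print(designed in runs, tossed in runs)    # each is simply one case out of the 1024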
§ 13. The reader will observe that this question of randomness is being here treated as simply one of ultimate statistical fact. I have fully admitted that this is not the primitive conception, nor is it the popular interpretation, but to adopt it seems the only course open to us if we are to draw inferences such as those contemplated in Probability. When we look to the producing agency of the ultimate arrangement we may find this very various. It may prove itself to be (a few stages back) one of conscious deliberate 111 purpose, as in drawing a card or tossing a die: it may be the outcome of an extremely complicated interaction of many natural causes, as in the arrangement of the flowers scattered over a lawn or meadow: it may be of a kind of which we know literally nothing whatever, as in the case of the actual arrangement of the stars relatively to each other.
§ 13. The reader will notice that the question of randomness is being treated here as simply a matter of statistical fact. I have fully acknowledged that this is not the original idea, nor is it the common interpretation, but adopting this view seems to be the only option available if we want to make inferences like those discussed in Probability. When we consider the causes behind the final arrangement, we might find them to be quite varied. It could be (a few steps back) the result of conscious, intentional action, like drawing a card or rolling a die; it may arise from a highly complex interaction of many natural causes, like the way flowers are arranged across a lawn or meadow; or it could be something we literally know nothing about, such as the actual arrangement of the stars in relation to each other.
This was the state of things had in view when it was said a few pages back that randomness and design would result in something of a cross-division. Plenty of arrangements in which design had a hand, a stage or two back, can be mentioned, which would be quite indistinguishable in their results from those in which no design whatever could be traced. Perhaps the most striking case in point here is to be found in the arrangement of the digits in one of the natural arithmetical constants, such as π or e, or in a table of logarithms. If we look to the process of production of these digits, no extremer instance can be found of what we mean by the antithesis of randomness: every figure has its necessarily pre-ordained position, and a moment's flagging of intention would defeat the whole purpose of the calculator. And yet, if we look to results only, no better instance can be found than one of these rows of digits if it were intended to illustrate what we practically understand by a chance arrangement of a number of objects. Each digit occurs approximately equally often, and this tendency develops as we advance further: the mutual juxtaposition of the digits also shows the same tendency, that is, any digit (say 5) is just as often followed by 6 or 7 as by any of the others. In fact, if we were to take the whole row of hitherto calculated figures, cut off the first five as familiar to us all, and contemplate the rest, no one would have the slightest reason to suppose that these had not come out as the results of a die with ten equal faces.
This was the situation we had in mind when we mentioned a few pages ago that randomness and design would lead to something of a cross-division. There are plenty of arrangements in which design played a role a stage or two back that would be indistinguishable in their outcomes from those in which no design can be found. Perhaps the most notable example is in the arrangement of the digits in one of the natural mathematical constants, like π or e, or in a logarithm table. Looking at how these digits are produced, you won't find a clearer example of what we mean by the opposite of randomness: every digit has its specific, predetermined position, and even a moment of distraction would ruin the whole purpose of the calculation. Yet, if we focus only on the results, no better example exists than one of these rows of digits if we were to use it to illustrate what we generally understand as a random arrangement of objects. Each digit appears approximately equally often, and this pattern continues as we look further: the pairing of the digits also reflects this tendency, meaning that any digit (like 5) is just as likely to be followed by 6 or 7 as by any of the others. In fact, if we were to take the entire row of calculated figures, remove the first five that we all recognize, and examine the rest, no one would have any reason to think these digits didn't come from rolling a die with ten equal sides.
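The statistical character just described can be checked by computation (a modern sketch; the digits are generated here with Gibbons' 'spigot' algorithm rather than copied from a printed table): each digit, and each pair of adjacent digits, turns up with roughly the frequency a fair ten-faced die would give.

from collections import Counter

def pi_digits(n):
    """First n decimal digits of pi (Gibbons' unbounded spigot algorithm)."""
    digits, q, r, t, j = [], 1, 180, 60, 2
    while len(digits) < n:
        u = 3 * (3 * j + 1) * (3 * j + 2)
        y = (q * (27 * j - 12) + 5 * r) // (5 * t)
        digits.append(y)
        q, r, t, j = 10 * q * j * (2 * j - 1), 10 * u * (q * (5 * j - 2) + r - y * t), t * u, j + 1
    return digits

d = pi_digits(2000)
print(Counter(d))                              # each digit appears roughly 200 times
print(Counter(zip(d, d[1:])).most_common(5))   # adjacent pairs are spread fairly evenly too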
§ 14. If it be asked why this is so, a rather puzzling question is raised. Wherever physical causation is involved we are generally understood to have satisfied the demand implied in this question if we assign antecedents which will be followed regularly by the event before us; but in geometry and arithmetic there is no opening for antecedents. What we then commonly look for is a demonstration, i.e. the resolution of the observed fact into axioms if possible, or at any rate into admitted truths of wider generality. I do not know that a demonstration can be given as to the existence of this characteristic of statistical randomness in such successions of digits as those under consideration. But the following remarks may serve to shift the onus of unlikelihood by suggesting that the preponderance of analogy is rather in favour of the existence.
§ 14. If someone asks why this is the case, a rather puzzling question is raised. In situations involving physical causation, we usually meet the expectation behind this question by identifying antecedents that are regularly followed by the event we're examining. However, in geometry and arithmetic, there is no opening for antecedents. What we typically seek instead is a demonstration, meaning the resolution of the observed fact into axioms if possible, or at least into accepted truths of broader generality. I'm not sure a demonstration can be given of the existence of this characteristic of statistical randomness in the sequences of digits we're discussing. However, the following remarks may help shift the burden of unlikelihood by suggesting that the balance of analogy is rather in favour of its existence.
Take the well-known constant π for consideration. This stands for a quantity which presents itself in a vast number of arithmetical and geometrical relations; let us take for examination the best known of these, by regarding it as standing for the ratio of the circumference to the diameter of a circle. So regarded, it is nothing more than a simple case of the measurement of a magnitude by an arbitrarily selected unit. Conceive then that we had before us a rod or line and that we wished to measure it with absolute accuracy. We must suppose—if we are to have a suitable analogue to the determination of π to several hundred figures,—that by the application of continued higher magnifying power we can detect ever finer subdivisions in the graduation. We lay our rod against the scale and find it, say, fall between 31 and 32 inches; we then look at the next division of the scale, viz. that into tenths of an inch. Can we see the slightest reason why the number of these tenths should be other than independent of the number of 113 whole inches? The “piece over” which we are measuring may in fact be regarded as an entirely new piece, which had fallen into our hands after that of 31 inches had been measured and done with; and similarly with every successive piece over, as we proceed to the ever finer and finer divisions.
Consider the well-known constant π. This represents a value that appears in many mathematical and geometrical relationships; let’s focus on the most famous one, which is the ratio of the circumference to the diameter of a circle. When looked at this way, it's simply a case of measuring a quantity using a unit we've chosen. Now imagine we have a rod or a line and we want to measure it with complete accuracy. We need to assume—if we want a proper analogy for determining π to several hundred decimal places—that by using increasingly powerful magnification, we can identify finer and finer subdivisions on the scale. We place our rod against the scale and find it falls between 31 and 32 inches; then we check the next division, which represents tenths of an inch. Is there any reason to think the number of these tenths is related to the number of whole inches? The "extra piece" we are measuring can actually be thought of as a completely new piece that we have after measuring and finishing with the 31 inches; the same goes for each additional piece as we move to smaller and smaller divisions.
Similar remarks may be made about most other incommensurable quantities, such as irreducible roots. Conceive two straight lines at right angles, and that we lay off a certain number of inches along each of these from the point of intersection; say two and five inches, and join the extremities of these so as to form the diagonal of a right-angled triangle. If we proceed to measure this diagonal in terms of either of the other lines we are to all intents and purposes extracting a square root. We should expect, rather than otherwise, to find here, as in the case of π, that incommensurability and resultant randomness of order in the digits was the rule, and commensurability was the exception. Now and then, as when the two sides were three and four, we should find the diagonal commensurable with them; but these would be the occasional exceptions, or rather they would be the comparatively finite exceptions amidst the indefinitely numerous cases which furnished the rule.
Similar comments can be made about most other incommensurable quantities, like irreducible roots. Imagine two straight lines meeting at right angles, and let's measure a certain number of inches along each from the intersection point; for example, two inches and five inches, and then connect the endpoints to form the diagonal of a right triangle. If we try to measure this diagonal in terms of either of the other lines, we are essentially extracting a square root. We would expect, much like with π, that incommensurability and the resulting randomness in the digits would be the norm, while commensurability would be the exception. Occasionally, as when the two sides are three and four, we would find the diagonal commensurable with them; but these instances would be the rare exceptions, or rather the relatively few exceptions among the indefinitely numerous cases that establish the norm.
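The diagonal of this particular example, with sides of two and five inches, has length √29, and its digits can be inspected in the same spirit (a modern sketch, using exact integer square roots so that no rounding disturbs the figures).

from collections import Counter
from math import isqrt

def sqrt_digits(n, places):
    """Decimal digits of sqrt(n), obtained from an exact integer square root."""
    return [int(c) for c in str(isqrt(n * 10 ** (2 * places)))]

d = sqrt_digits(29, 1000)      # 2^2 + 5^2 = 29: the squared diagonal of the example
print(d[:8])                   # 5, 3, 8, 5, 1, 6, 4, 8 ...  (sqrt(29) = 5.3851648...)
print(Counter(d))              # the ten digits occur with roughly equal frequency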
§ 15. The best way perhaps of illustrating the truly random character of such a row of figures is by appealing to graphical aid. It is not easy here, any more than in ordinary statistics, to grasp the import of mere figures; whereas the arrangement of groups of points or lines is much more readily seized. The eye is very quick in detecting any symptoms of regularity in the arrangement, or any tendency to denser aggregation in one direction than in another. How then are we to dispose our figures so as to force them to display their true character? I should suggest that we set about drawing a line at random; and, since we cannot 114 trust our own unaided efforts to do this, that we rely upon the help of such a table of figures to do it for us, and then examine with what sort of efficiency they can perform the task. The problem of drawing straight lines at random, under various limitations of direction or intersection, is familiar enough, but I do not know that any one has suggested the drawing of a line whose shape as well as position shall be of a purely random character. For simplicity we suppose the line to be confined to a plane.
§ 15. One of the best ways to show the truly random nature of a series of numbers is by using a visual representation. Just like with regular statistics, it's not easy to understand what a series of numbers really means; however, when points or lines are arranged visually, it's much easier to grasp. Our eyes are quick to notice any signs of order in the arrangement, or any tendency for points to cluster more in one direction than another. So, how can we arrange our numbers to clearly reveal their true nature? I suggest that we start by drawing a line at random; and, since we can't completely rely on our own abilities to do this, we should use a table of figures to help us, then evaluate how effectively they accomplish the task. The challenge of drawing straight lines at random, under different constraints of direction or intersection, is well known. However, I'm not sure anyone has proposed drawing a line where both the shape and position are purely random. For simplicity, we'll assume the line is limited to a flat plane.
The definition of such a line does not seem to involve any particular difficulty. Phrased in accordance with the ordinary language we should describe it as the path (i.e. any path) traced out by a point which at every moment is as likely to move in any one direction as in any other. That we could not ourselves draw such a line, and that we could not get it traced by any physical agency, is certain. The mere inertia of any moving body will always give it a tendency, however slight, to go on in a straight line at each moment, instead of being instantly responsive to instantaneously varying dictates as to its direction of motion. Nor can we conceive or picture such a line in its ultimate or ideal condition. But it is easy to give a graphical approximation to it, and it is easy also to show how this approximation may be carried on as far as we please towards the ideal in question.
The definition of such a line doesn’t seem to have any real difficulty. In simple terms, we can describe it as the path (i.e. any path) traced out by a point that is equally likely to move in any direction at any moment. It's clear that we couldn’t draw such a line ourselves, nor could we have it traced by any physical method. The basic inertia of any moving object will always make it slightly more likely to continue in a straight line, rather than immediately responding to changing directions at every moment. We also can’t really imagine or visualize such a line in its ultimate or ideal state. However, it’s easy to create a graphical approximation of it, and it’s also straightforward to show how this approximation can be refined as much as we want towards the ideal condition.
We may proceed as follows. Take a sheet of the ordinary ruled paper prepared for the graphical exposition of curves. Select as our starting point the intersection of two of these lines, and consider the eight ‘points of the compass’ indicated by these lines and the bisections of the contained right angles.[5] For suggesting the random selection amongst 115 these directions let them be numbered from 0 to 7, and let us say that a line measured due ‘north’ shall be designated by the figure 0, ‘north-east’ by 1, and so on. The selection amongst these numbers, and therefore directions, at every corner, might be handed over to a die with eight faces; but for the purpose of the illustration in view we select the digits 0 to 7 as they present themselves in the calculated value of π. The sort of path along which we should travel by a series of such steps thus taken at random may be readily conceived; it is given at the end of this chapter.
We can proceed as follows. Take a sheet of standard ruled paper that's designed for graphing curves. Start at the intersection of two of these lines and look at the eight ‘points of the compass’ marked by these lines and the bisected right angles. [5] To randomly choose among these directions, we’ll number them from 0 to 7. Let's say that a line pointing due ‘north’ is labeled 0, ‘north-east’ is 1, and so on. The choice among these numbers, and thus the directions, at each corner could be determined by a die with eight sides; however, for this illustration, we’ll use the digits 0 to 7 as they appear in the calculated value of π. You can easily imagine the path we would take by following a series of these randomly chosen steps; it's detailed at the end of this chapter.
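As a rough modern aid to the construction just described, here is a minimal sketch, assuming the digit-to-direction mapping given above (0 for due north, then clockwise round to 7 for north-west). The names STEPS and random_path, and the short list of opening digits of π at the end, are introduced purely for the illustration; the actual figure described below was built from several hundred digits with the 8s and 9s struck out.

```python
# Each digit 0-7 is read in order and taken as one unit step in the
# corresponding compass direction (0 = north, 1 = north-east, ..., clockwise).
STEPS = {
    0: (0, 1),    # N
    1: (1, 1),    # NE
    2: (1, 0),    # E
    3: (1, -1),   # SE
    4: (0, -1),   # S
    5: (-1, -1),  # SW
    6: (-1, 0),   # W
    7: (-1, 1),   # NW
}

def random_path(digits):
    """Return the list of lattice points visited, starting from the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for d in digits:
        if d > 7:                 # strike out 8 and 9, as in the text
            continue
        dx, dy = STEPS[d]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

# Opening digits of pi (3.14159...), standing in for the long table of digits.
sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4, 6, 2, 6, 4, 3]
print(random_path(sample))
```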
For the purpose with which this illustration was proposed, viz. the graphical display of the succession of digits in any one of the incommensurable constants of arithmetic or geometry, the above may suffice. After actually testing some of them in this way they seem to me, so far as the eye, or the theoretical principles to be presently mentioned, are any guide, to answer quite fairly to the description of randomness.
For the purpose of this illustration, which is to graphically show the sequence of digits in any of the incommensurable constants of arithmetic or geometry, the above may be sufficient. After testing some of them this way, they appear to me, based on visual observation and theoretical principles that will be discussed shortly, to reasonably fit the description of randomness.
§ 16. As we are on the subject, however, it seems worth going farther by enquiring how near we could get to the ideal of randomness of direction. To carry this out completely two improvements must be made. For one thing, instead of confining ourselves to eight directions we must admit an infinite number. This would offer no great difficulty; for instead of employing a small number of digits we should merely have to use some kind of circular teetotum which would rest indifferently in any direction. But in the next place instead of short finite steps we must suppose them indefinitely short. It is here that the actual unattainability makes itself felt. We are familiar enough with the device, employed by Newton, of passing from the discontinuous polygon to the continuous curve. But we can resort to this 116 device because the ideal, viz. the curve, is as easily drawn (and, I should say, as easily conceived or pictured) as any of the steps which lead us towards it. But in the case before us it is otherwise. The line in question will remain discontinuous, or rather angular, to the last: for its angles do not tend even to lose their sharpness, though the fragments which compose them increase in number and diminish in magnitude without any limit. And such an ideal is not conceivable as an ideal. It is as if we had a rough body under the microscope, and found that as we subjected it to higher and higher powers there was no tendency for the angles to round themselves off. Our ‘random line’ must remain as ‘spiky’ as ever, though the size of its spikes of course diminishes without any limit.
§ 16. Since we’re on the topic, it seems worth exploring how close we can get to the ideal of truly random direction. To fully achieve this, we need to make two improvements. First, instead of limiting ourselves to eight directions, we should allow for an infinite number of directions. This wouldn’t be too difficult; we would just need to use some kind of circular spinner that can point in any direction. Secondly, instead of taking short, finite steps, we should imagine them being infinitely small. This is where the actual impossibility becomes apparent. We’re familiar enough with Newton’s technique of moving from a jagged polygon to a smooth curve. We can use this method because the ideal—the curve—is as easy to draw (and, I would say, as easy to think about or visualize) as any of the steps leading toward it. But in our case, it’s different. The line we’re discussing will remain jagged or angular right to the end because its angles don’t even begin to blur, even though the segments that create them increase in number and decrease in size without limit. Such an ideal isn’t imaginable as an ideal. It’s like looking at a rough surface under a microscope and noticing that as we zoom in more and more, the angles don’t get smoother. Our ‘random line’ will stay just as ‘spiky’ as ever, even though the size of its spikes keeps getting smaller without end.
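The first of these two improvements is easy to imitate. The sketch below is only a rough modern illustration of it: each step's direction is drawn uniformly from the whole circle, as the 'circular teetotum' would give, while the step length remains finite, since, as the next paragraph argues, the second improvement admits no drawable limit.

```python
import math
import random

def continuous_walk(n_steps, rng, step=1.0):
    """Walk with each step in a direction chosen uniformly over the full circle."""
    x = y = 0.0
    points = [(x, y)]
    for _ in range(n_steps):
        theta = rng.uniform(0.0, 2.0 * math.pi)   # the "circular teetotum"
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        points.append((x, y))
    return points

rng = random.Random(1)
end_x, end_y = continuous_walk(1000, rng)[-1]
print(f"end point after 1000 unit steps: ({end_x:.2f}, {end_y:.2f})")
```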
The case therefore seems to be this. It is easy, in words, to indicate the conception by speaking of a line which at every instant is as likely to take one direction as another. It is easy moreover to draw such a line with any degree of minuteness which we choose to demand. But it is not possible to conceive or picture the line in its ultimate form.[6] There is in fact no ‘limit’ here, intelligible to the understanding or picturable by the imagination (corresponding to the asymptote of a curve, or the continuous curve to the incessantly developing polygon), towards which we find ourselves continually approaching, and which therefore we are apt to conceive ourselves as ultimately attaining. The usual assumption therefore which underlies the Newtonian infinitesimal geometry and the Differential Calculus, ceases to apply here.
The situation seems to be this. It's easy enough, in words, to convey the idea by speaking of a line that at every instant is as likely to take one direction as another. It's also simple to draw such a line with any level of detail we want. But it's impossible to really imagine or visualize the line in its final form.[6] In reality, there's no "limit" here that we can understand or visualize (as the asymptote is for a curve, or the continuous curve for the endlessly refined polygon) that we are always getting closer to, leading us to think we might eventually reach it. Therefore, the common assumption underlying Newtonian infinitesimal geometry and Differential Calculus doesn't apply here.
§ 17. If we like to consider such a line in one of its approximate stages, as above indicated, it seems to me that 117 some of the usual theorems of Probability, where large numbers are concerned, may safely be applied. If it be asked, for instance, whether such a line will ultimately tend to stray indefinitely far from its starting point, Bernoulli's ‘Law of Large Numbers’ may be appealed to, in virtue of which we should say that it was excessively unlikely that its divergence should be relatively great. Recur to our graphical illustration, and consider first the resultant deviation of the point (after a great many steps) right or left of the vertical line through the starting point. Of the eight admissible motions at each stage two will not affect this relative position, whilst the other six are equally likely to move us a step to the right or to the left. Our resultant ‘drift’ therefore to the right or left will be analogous to the resultant difference between the number of heads and tails after a great many tosses of a penny. Now the well-known outcome of such a number of tosses is that ultimately the proportional approximation to the à priori probability, i.e. to equality of heads and tails, is more and more nearly carried out, but that the absolute deflection is more and more widely displayed.
§ 17. If we consider such a line in one of its approximate stages, as mentioned above, it seems to me that some of the usual theorems of Probability, when large numbers are involved, can be reliably applied. If we ask, for example, whether such a line will eventually wander indefinitely far from its starting point, we can refer to Bernoulli's ‘Law of Large Numbers,’ which suggests that it's highly unlikely that its divergence, relative to the total distance traveled, would be great. Let's look back at our graphical illustration and first think about the overall deviation of the point (after many steps) to the right or left of the vertical line through the starting point. Out of the eight possible moves at each stage, two will not change this relative position, while the other six are equally likely to move us a step right or left. So, our overall ‘drift’ to the right or left will be similar to the difference between the number of heads and tails after many coin tosses. The well-known result of tossing a coin multiple times is that over time, the proportional approximation to the à priori probability, meaning equal numbers of heads and tails, becomes increasingly accurate, but the absolute difference becomes more and more pronounced.
Applying this to the case in point, and remembering that the results apply equally to the horizontal and vertical directions, we should say that after any very great number of such ‘steps’ as those contemplated, the ratio of our distance from the starting point to the whole distance travelled will pretty certainly be small, whereas the actual distance from it would be large. We should also say that the longer we continued to produce such a line the more pronounced would these tendencies become. So far as concerns this test, and that afforded by the general appearance of the lines drawn,—this last, as above remarked, being tolerably trustworthy,—I feel no doubt as to the generally ‘random’ 118 character of the rows of figures displayed by the incommensurable or irrational ratios in question.
Applying this to the situation at hand, and keeping in mind that the results apply equally to both horizontal and vertical directions, we can say that after a very large number of such ‘steps’ as those considered, the ratio of our distance from the starting point to the total distance traveled will likely be small, while the actual distance from it would be significant. We should also note that the longer we continue to create such a line, the more pronounced these tendencies will be. In terms of this test, and the one indicated by the overall appearance of the lines drawn—this last one, as mentioned earlier, being fairly reliable—I have no doubt about the generally ‘random’ nature of the arrays of figures shown by the incommensurable or irrational ratios in question.
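Anyone who cares to check these two tendencies numerically can simulate the horizontal drift directly. The sketch below is a modern illustration under the same assumptions as the text: of the eight equally likely moves at each step, two leave the horizontal position unchanged and the other six split evenly between a step to the right and a step to the left. A single run only suggests the behavior, but it will typically show the absolute drift growing with the number of steps while its ratio to that number shrinks.

```python
import random

def horizontal_drift(n_steps, rng):
    """Net left-right displacement of the eight-direction walk after n_steps."""
    x = 0
    for _ in range(n_steps):
        d = rng.randrange(8)          # a random direction 0..7
        if d in (1, 2, 3):            # NE, E, SE: one step to the right
            x += 1
        elif d in (5, 6, 7):          # SW, W, NW: one step to the left
            x -= 1
        # 0 (N) and 4 (S) leave the horizontal position unchanged
    return x

rng = random.Random(0)
for n in (100, 10_000, 1_000_000):
    drift = abs(horizontal_drift(n, rng))
    print(f"{n:>9} steps: |drift| = {drift:>5}, |drift| / steps = {drift / n:.4f}")
```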
As it may interest the reader to see an actual specimen of such a path I append one representing the arrangement of the eight digits from 0 to 7 in the value of π. The data are taken from Mr Shanks' astonishing performance in the calculation of this constant to 707 places of figures (Proc. of R. S., XXI. p. 319). Of these, after omitting 8 and 9, there remain 568; the diagram represents the course traced out by following the direction of these as the clue to our path. Many of the steps have of course been taken in opposite directions twice or oftener. The result seems to me to furnish a very fair graphical indication of randomness. I have compared it with corresponding paths furnished by rows of figures taken from logarithmic tables, and in other ways, and find the results to be much the same.
As it might interest the reader to see a real example of such a path, I’ve included one that represents the arrangement of the eight digits from 0 to 7 in the value of π. The data is taken from Mr. Shanks' impressive work in calculating this constant to 707 decimal places (Proc. of R. S., XXI. p. 319). Of these, after excluding 8 and 9, 568 remain; the diagram shows the path created by following the sequence of these digits as a guide. Many of the steps have, of course, been taken in opposite directions multiple times. The outcome appears to provide a fair graphical representation of randomness. I've compared it with similar paths derived from rows of figures from logarithmic tables and found the results to be quite similar.

1 According to Prof. Skeat (Etymological Dictionary) the earliest known meaning is that of furious action, as in a charge of cavalry. The etymology, he considers, is connected with the Teutonic word rand (brim), and implies the furious and irregular action of a river full to the brim.
1 According to Prof. Skeat (Etymological Dictionary), the earliest known meaning is that of furious action, like a cavalry charge. He believes the etymology is linked to the Teutonic word rand (brim), suggesting the wild and erratic movement of a river that’s overflowing.
2 See the problem paper of Jan. 18, 1854, in the Cambridge Mathematical Tripos.
2 Check out the problem paper from January 18, 1854, in the Cambridge Mathematical Tripos.
3 As, according to Mr H. Godfray, the majority of the candidates did assume when the problem was once proposed in an examination. See the Educational Times (Reprint, Vol. VII. p. 99.)
3 As Mr. H. Godfray mentioned, most of the candidates did assume when the problem was presented in an exam. See the Educational Times (Reprint, Vol. VII. p. 99.)
5 It would of course be more complete to take ten alternatives of direction, and thus to omit none of the digits; but this is much more troublesome in practice than to confine ourselves to eight.
5 It would definitely be more thorough to consider ten direction options and not leave out any of the numbers; however, this is much more of a hassle in practice than just sticking to eight.
CHAPTER 6.[*]
THE SUBJECTIVE SIDE OF PROBABILITY. MEASUREMENT OF BELIEF.
* Originally written in somewhat of a spirit of protest against what seemed to me the prevalent disposition to follow De Morgan in taking too subjective a view of the science. In reading it through now I cannot find any single sentence to which I could take distinct objection, though I must admit that if I were writing it entirely afresh I should endeavour to express myself with less emphasis, and I have made alterations in that direction. The reader who wishes to see a view not substantially very different from mine, but expressed with a somewhat opposite emphasis, can refer to Mr F. Y. Edgeworth's article on “The Philosophy of Chance” (Mind, Vol. IX.)
* Initially written out of a sense of protest against what I felt was the common tendency to follow De Morgan in adopting too subjective a perspective on the science. As I read it now, I can’t find any specific sentence to which I would strongly object, though I must admit that if I were writing it from scratch, I would try to express myself with less emphasis, and I have made some adjustments in that regard. The reader who wants to see a viewpoint that isn’t fundamentally different from mine, but presented with a somewhat contrasting emphasis, can check out Mr. F. Y. Edgeworth's article on “The Philosophy of Chance” (Mind, Vol. IX.)
§ 1. Having now obtained a clear conception of a certain kind of series, the next enquiry is, What is to be done with this series? How is it to be employed as a means of making inferences? The general step that we are now about to take might be described as one from the objective to the subjective, from the things themselves to the state of our minds in contemplating them.
§ 1. Now that we have a clear understanding of a specific type of series, the next question is, what should we do with this series? How can we use it to draw conclusions? The overall process we are about to undertake can be described as a shift from the objective to the subjective, moving from the things themselves to how we perceive them in our minds.
The reader should observe that a substitution has, in a great number of cases, already been made as a first stage towards bringing the things into a shape fit for calculation. This substitution, as described in former chapters, is, in a measure, a process of idealization. The series we actually meet with are apt to show a changeable type, and the individuals of them will sometimes transgress their licensed irregularity. Hence they have to be pruned a little into shape, as 120 natural objects almost always have before they are capable of being accurately reasoned about. The form in which the series emerges is that of a series with a fixed type. This imaginary or ideal series is the basis of our calculation.
The reader should note that a replacement has, in many cases, already been made as a first step toward organizing things in a way that is suitable for calculations. This replacement, as explained in earlier chapters, is, to some extent, a process of idealization. The series we actually encounter tend to have a variable type, and the individuals within them sometimes exceed their allowed irregularity. Therefore, they need to be slightly adjusted to fit a shape, as natural objects almost always do before they can be accurately analyzed. The form in which the series appears is that of a series with a fixed type. This imaginary or ideal series serves as the foundation for our calculations.
§ 2. It must not be supposed that this is at all at variance with the assertion previously made, that Probability is a science of inference about real things; it is only by a substitution of the above kind that we are enabled to reason about the things. In nature nearly all phenomena present themselves in a form which departs from that rigorously accurate one which scientific purposes mostly demand, so we have to introduce an imaginary series, which shall be free from any such defects. The only condition to be fulfilled is, that the substitution is to be as little arbitrary, that is, to vary from the truth as slightly, as possible. This kind of substitution generally passes without notice when natural objects of any kind are made subjects of exact science. I direct distinct attention to it here simply from the apprehension that want of familiarity with the subject-matter might lead some readers to suppose that it involves, in this case, an exceptional deflection from accuracy in the formal process of inference.
§ 2. It shouldn't be assumed that this contradicts the earlier statement that Probability is a science of reasoning about real things; it's only through this type of substitution that we can reason about them. In nature, almost all phenomena appear in a way that deviates from the precise form typically required for scientific purposes, so we need to introduce a theoretical series that is free from these flaws. The only requirement is that the substitution should be as non-arbitrary as possible, meaning it should differ from the truth as little as possible. This kind of substitution usually goes unnoticed when natural objects are studied in exact science. I’m drawing attention to it here simply because I worry that some readers, unfamiliar with the subject, may think it represents an unusual departure from accuracy in the formal reasoning process.
It may be remarked also that the adoption of this imaginary series offers no countenance whatever to the doctrine criticised in the last chapter, in accordance with which it was supposed that our series possessed a fixed unchangeable type which was merely the “development of the probabilities” of things, to use Laplace's expression. It differs from anything contemplated on that hypothesis by the fact that it is to be recognized as a necessary substitution of our own for the actual series, and to be kept in as close conformity with facts as possible. It is a mere fiction or artifice necessarily resorted to for the purpose of calculation, and for this purpose only.
It should be noted that adopting this imaginary series does not support the theory criticized in the last chapter, which suggested that our series had a fixed, unchangeable type that was simply the “development of the probabilities” of things, as Laplace put it. It differs from anything considered under that assumption because it serves as a necessary replacement for the actual series, and it should align as closely with facts as possible. It’s just a fiction or tool used strictly for calculation, and for that purpose only.
This caution is the more necessary, because in the example 121 that I shall select, and which belongs to the most favourite class of examples in this subject, the substitution becomes accidentally unnecessary. The things, as has been repeatedly pointed out, may sometimes need no trimming, because in the form in which they actually present themselves they are almost idealized. In most cases a good deal of alteration is necessary to bring the series into shape, but in some—prominently in the case of games of chance—we find the alterations, for all practical purposes, needless.
This warning is even more important because, in the example I’ll choose, which is one of the most popular examples on this topic, the substitution turns out to be unexpectedly unnecessary. As has been noted several times, sometimes items may not need any changes at all, because in their current form they are almost perfect. In most cases, a lot of adjustments are needed to get the series in order, but in some—especially in the case of games of chance—we find that the changes, for all practical purposes, are unnecessary.
§ 3. We start then, from such a series as this, upon the enquiry, What kind of inference can be made about it? It may assist the logical reader to inform him that our first step will be analogous to one class of what are commonly known as immediate inferences,—inferences, that is, of the type,—‘All men are mortal, therefore any particular man or men are mortal.’ This case, simple and obvious as it is in Logic, requires very careful consideration in Probability.
§ 3. We begin with a series like this, asking what kind of conclusion can be drawn from it. It might help the logical reader to know that our first step will resemble one type of what are typically called immediate inferences—specifically, inferences like, “All men are mortal; therefore, any particular man or men are mortal.” This case, though it seems simple and obvious in logic, demands careful thought in probability.
It is obvious that we must be prepared to form an opinion upon the propriety of taking the step involved in making such an inference. Hitherto we have had as little to do as possible with the irregular individuals; we have regarded them simply as fragments of a regular series. But we cannot long continue to neglect all consideration of them. Even if these events in the gross be tolerably certain, it is not only in the gross that we have to deal with them; they constantly come before us a few at a time, or even as individuals, and we have to form some opinion about them in this state. An insurance office, for instance, deals with numbers large enough to obviate most of the uncertainty, but each of their transactions has another party interested in it—What has the man who insures to say to their proceedings? for to him this question becomes an individual one. And even the office itself receives its cases singly, and would 122 therefore like to have as clear views as possible about these single cases. Now, the remarks made in the preceding chapters about the subjects which Probability discusses might seem to preclude all enquiries of this kind, for was not ignorance of the individual presupposed to such an extent that even (as will be seen hereafter) causation might be denied, within considerable limits, without affecting our conclusions? The answer to this enquiry will require us to turn now to the consideration of a totally distinct side of the question, and one which has not yet come before us. Our best introduction to it will be by the discussion of a special example.
It's clear that we need to be ready to form an opinion on whether it's appropriate to make such an inference. Until now, we've had as little interaction as possible with the irregular individuals; we've seen them merely as pieces of a regular series. However, we can't keep ignoring them for much longer. Even if the overall events are fairly certain, we don't only deal with them in broad terms; they frequently appear before us one at a time or even individually, and we must form some opinion about them in that context. For example, an insurance company works with large numbers that reduce most of the uncertainty, but each of their transactions involves another party—what does the person who insures have to say about their actions? For that person, this question becomes an individual matter. Moreover, the company itself receives cases individually and would therefore like to have as clear views as possible about these single cases. Now, the comments made in the previous chapters about the topics discussed by Probability might seem to exclude any inquiries of this sort, since the presumption of ignorance regarding the individual was so strong that, as will be shown later, causation could be denied within certain limits without impacting our conclusions. Answering this question will lead us to examine a completely different aspect of the issue that we haven't covered yet. A good way to introduce it will be by discussing a specific example.
§ 4. Let a penny be tossed up a very great many times; we may then be supposed to know for certain this fact (amongst many others) that in the long run head and tail will occur about equally often. But suppose we consider only a moderate number of throws, or fewer still, and so continue limiting the number until we come down to three or two, or even one? We have, as the extreme cases, certainty or something undistinguishably near it, and utter uncertainty. Have we not, between these extremes, all gradations of belief? There is a large body of writers, including some of the most eminent authorities upon this subject, who state or imply that we are distinctly conscious of such a variation of the amount of our belief, and that this state of our minds can be measured and determined with almost the same accuracy as the external events to which they refer. The principal mathematical supporter of this view is De Morgan, who has insisted strongly upon it in all his works on the subject. The clearest exposition of his opinions will be found in his Formal Logic, in which work he has made the view which we are now discussing the basis of his system. He holds that we have a certain amount of belief of every proposition which may be set before us, an amount 123 which in its nature admits of determination, though we may practically find it difficult in any particular case to determine it. He considers, in fact, that Probability is a sort of sister science to Formal Logic,[1] speaking of it in the following words: “I cannot understand why the study of the effect, which partial belief of the premises produces with respect to the conclusion, should be separated from that of the consequences of supposing the former to be absolutely true.”[2] In other words, there is a science—Formal Logic—which investigates the rules according to which one proposition can be necessarily inferred from another; in close correspondence with this there is a science which investigates the rules according to which the amount of our belief of one proposition varies with the amount of our belief of other propositions with which it is connected.
§ 4. If we toss a penny a lot of times, we can definitely say that, over time, heads and tails will show up about the same number of times. But what if we only think about a few tosses, or even just one? We have, at one extreme, absolute certainty and, at the other, complete uncertainty. Between these two extremes, don't we have various levels of belief? Many writers, including some of the most respected experts on this topic, argue that we are clearly aware of this variation in our belief levels, and that this mental state can be measured and assessed almost as accurately as the actual events it relates to. The main mathematical advocate for this idea is De Morgan, who has strongly argued for it in all his works. The clearest explanation of his views can be found in his Formal Logic, in which he makes the concept we’re discussing the foundation of his system. He believes that we have a certain degree of belief in every statement presented to us—an amount that can, in theory, be quantified, even though it might be tough to determine in specific situations. He suggests that Probability is a kind of sister science to Formal Logic, stating: “I cannot understand why the effect that partial belief in the premises has on the conclusion should be studied separately from the consequences of assuming the premises are absolutely true.” In other words, there’s a science—Formal Logic—that looks at the rules that allow one statement to be necessarily derived from another, and alongside this, there’s a science that examines how our belief in one statement shifts depending on our belief in other related statements.
The same view is also supported by another high authority, the late Prof. Donkin, who says (Phil. Mag. May, 1851), “It will, I suppose, be generally admitted, and has often been more or less explicitly stated, that the subject-matter of calculation in the mathematical theory of Probabilities is quantity of belief.”
The same view is also backed by another respected figure, the late Prof. Donkin, who states (Phil. Mag. May, 1851), “I think it will be generally accepted, and has frequently been stated more or less explicitly, that the focus of calculation in the mathematical theory of Probabilities is the quantity of belief.”
§ 5. Before proceeding to criticise this opinion, one remark may be made upon it which has been too frequently overlooked. It should be borne in mind that, even were this view of the subject not actually incorrect, it might be objected to as insufficient for the purpose of a definition, on the ground that variation of belief is not confined to Probability. It is a property with which that science is concerned, no doubt, but it is a property which meets us in other directions as 124 well. In every case in which we extend our inferences by Induction or Analogy, or depend upon the witness of others, or trust to our own memory of the past, or come to a conclusion through conflicting arguments, or even make a long and complicated deduction by mathematics or logic, we have a result of which we can scarcely feel as certain as of the premises from which it was obtained. In all these cases then we are conscious of varying quantities of belief, but are the laws according to which the belief is produced and varied the same? If they cannot be reduced to one harmonious scheme, if in fact they can at best be brought to nothing but a number of different schemes, each with its own body of laws and rules, then it is vain to endeavour to force them into one science.
§ 5. Before criticizing this opinion, it's important to note something that is often overlooked. Even if this viewpoint isn't actually wrong, it could be criticized as insufficient for a definition because variations in belief aren't limited to Probability. That is certainly a property the science is concerned with, but it also comes up in other areas. Whenever we extend our conclusions through Induction or Analogy, rely on others' testimonies, trust our own memories of the past, arrive at conclusions from conflicting arguments, or even make complex deductions using mathematics or logic, the results we reach are often less certain than the premises they originated from. In all these instances, we're aware of differing levels of belief, but are the rules that create and change that belief the same? If they can't be consolidated into one coherent framework and can only be organized into multiple different frameworks, each with its own set of laws and rules, then it's pointless to try to fit them all into one unified science.
This opinion is strengthened by observing that most of the writers who adopt the definition in question do practically dismiss from consideration most of the above-mentioned examples of diminution of belief, and confine their attention to classes of events which have the property discussed in Chap I., viz. ‘ignorance of the few, knowledge of the many.’ It is quite true that considerable violence has to be done to some of these examples, by introducing exceedingly arbitrary suppositions into them, before they can be forced to assume a suitable form. But still there is little doubt that, if we carefully examine the language employed, we shall find that in almost every case assumptions are made which virtually imply that our knowledge of the individual is derived from propositions given in the typical form described in Chap I. This will be more fully proved when we come to consider some common misapplications of the science.
This perspective is reinforced by noticing that most of the authors who adopt the definition in question tend to ignore most of the previously mentioned examples of reduced belief and focus instead on types of events that exhibit the property discussed in Chap I., namely ‘ignorance of the few, knowledge of the many.’ It's true that a lot of manipulation is required for some of these examples by introducing highly arbitrary assumptions into them to make them fit a suitable form. However, it's pretty clear that if we closely analyze the language used, we'll find that in nearly every case, assumptions are made that essentially indicate our knowledge of the individual comes from statements presented in the typical form described in Chap I. This will be further demonstrated when we examine some common misapplications of the science.
§ 6. Even then, if the above-mentioned view of the subject were correct, it would yet, I consider, be insufficient for the purpose of a definition; but it is at least very doubtful whether it is correct. Before we could properly assign to 125 the belief side of the question the prominence given to it by De Morgan and others, certainly before the science could be defined from that side, it would be necessary, it appears, to establish the two following positions, against both of which strong objections can be brought.
§ 6. Even then, if the perspective mentioned above is accurate, I still think it would be inadequate for a proper definition; however, it's quite uncertain if it is accurate. Before we could properly give the belief side of the question the prominence that De Morgan and others have given it, and certainly before the science could be defined from that angle, it seems we would need to establish the following two points, both of which face significant objections.
(1) That our belief of every proposition is a thing which we can, strictly speaking, be said to measure; that there must be a certain amount of it in every case, which we can realize somehow in consciousness and refer to some standard so as to pronounce upon its value.
(1) Our belief in every statement is something that we can, technically speaking, measure. There has to be a certain degree of it in every instance, which we can somehow become aware of in our mind and compare to a standard in order to determine its value.
(2) That the value thus apprehended is the correct one according to the theory, viz. that it is the exact fraction of full conviction that it should be. This statement will perhaps seem somewhat obscure at first; it will be explained presently.
(2) That the value understood here is the right one according to the theory, namely, that it's the exact fraction of complete conviction that it should be. This statement might seem a bit unclear at first; it will be explained shortly.
§ 7. (I.) Now, in the first place, as regards the difficulty of obtaining any measure of the amount of our belief. One source of this difficulty is too obvious to have escaped notice; this is the disturbing influence produced on the quantity of belief by any strong emotion or passion. A deep interest in the matter at stake, whether it excite hope or fear, plays great havoc with the belief-meter, so that we must assume the mind to be quite unimpassioned in weighing the evidence. This is noticed and acknowledged by Laplace and others; but these writers seem to me to assume it to be the only source of error, and also to be of comparative unimportance. Even if it were the only source of error I cannot see that it would be unimportant. We experience hope or fear in so very many instances, that to omit such influences from consideration would be almost equivalent to saying that whilst we profess to consider the whole quantity of our belief we will in reality consider only a portion of it. Very strong 126 feelings are, of course, exceptional, but we should nevertheless find that the emotional element, in some form or other, makes itself felt on almost every occasion. It is very seldom that we cannot speak of our surprise or expectation in reference to any particular event. Both of these expressions, but especially the former, seem to point to something more than mere belief. It is true that the word ‘expectation’ is generally defined in treatises on Probability as equivalent to belief; but it seems doubtful whether any one who attends to the popular use of the terms would admit that they were exactly synonymous. Be this however as it may, the emotional element is present upon almost every occasion, and its disturbing influence therefore is constantly at work.
§ 7. (I.) First of all, there’s the challenge of measuring how much we actually believe. A major source of this challenge is pretty obvious: strong emotions or passions significantly affect our level of belief. A deep interest in what's at stake, whether it brings hope or fear, really messes with our ability to gauge belief accurately, so we need to assume that our minds are neutral when assessing the evidence. This has been noted and acknowledged by Laplace and others; however, these writers seem to think this is the only source of error and that it’s relatively unimportant. Even if it were the only source of error, I don’t see how it could be considered unimportant. We often experience hope or fear, so ignoring these influences would be nearly equivalent to claiming that while we intend to consider the full extent of our belief, we’re actually only considering part of it. Very strong feelings are indeed exceptions, but we still find that emotions, in one form or another, are present almost every time. It’s rare that we can’t express our surprise or expectations regarding a specific event. Both of these terms, but especially the former, suggest something beyond mere belief. While the term ‘expectation’ is often defined in Probability literature as being equivalent to belief, it seems debatable whether anyone who pays attention to how these terms are used in everyday conversation would consider them exactly the same. Regardless, the emotional aspect is present almost every time, and its disruptive effect is always at play.
§ 8. Another cause, which co-operates with the former, is to be found in the extreme complexity and variety of the evidence on which our belief of any proposition depends. Hence it results that our actual belief at any given moment is one of the most fugitive and variable things possible, so that we can scarcely ever get sufficiently clear hold of it to measure it. This is not confined to the times when our minds are in a turmoil of excitement through hope or fear. In our calmest moments we shall find it no easy thing to give a precise answer to the question, How firmly do I hold this or that belief? There may be one or two prominent arguments in its favour, and one or two corresponding objections against it, but this is far from comprising all the causes by which our state of belief is produced. Because such reasons as these are all that can be practically introduced into oral or written controversies, we must not conclude that it is by these only that our conviction is influenced. On the contrary, our conviction generally rests upon a sort of chaotic basis composed of an infinite number of inferences and analogies of every description, and these moreover distorted 127 by our state of feeling at the time, dimmed by the degree of our recollection of them afterwards, and probably received from time to time with varying force according to the way in which they happen to combine in our consciousness at the moment. To borrow a striking illustration from Abraham Tucker, the substructure of our convictions is not so much to be compared to the solid foundations of an ordinary building, as to the piles of the houses of Rotterdam which rest somehow in a deep bed of soft mud. They bear their weight securely enough, but it would not be easy to point out accurately the dependence of the different parts upon one another. Directly we begin to think of the amount of our belief, we have to think of the arguments by which it is produced—in fact, these arguments will intrude themselves without our choice. As each in turn flashes through the mind, it modifies the strength of our conviction; we are like a person listening to the confused hubbub of a crowd, where there is always something arbitrary in the particular sound we choose to listen to. There may be reasons enough to suffice abundantly for our ultimate choice, but on examination we shall find that they are by no means apprehended with the same force at different times. The belief produced by some strong argument may be very decisive at the moment, but it will often begin to diminish when the argument is not actually before the mind. It is like being dazzled by a strong light; the impression still remains, but begins almost immediately to fade away. I think that this is the case, however we try to limit the sources of our conviction.
§ 8. Another reason that works alongside the previous one is the extreme complexity and variety of the evidence on which our belief in any proposition relies. As a result, our actual belief at any given moment is one of the most fleeting and variable things possible, making it hard for us to grasp it clearly enough to measure it. This struggle isn't limited to times when our minds are chaotic with excitement from hope or fear. Even in our calmest moments, it can be challenging to give a precise answer to the question, How strongly do I believe in this or that? There might be a couple of strong arguments supporting it and a few corresponding objections against it, but that's far from covering all the factors influencing our state of belief. Because these reasons are typically what we can practically present in discussions or writings, we shouldn't assume that they are the only things affecting our conviction. On the contrary, our belief generally rests on a sort of chaotic foundation made up of countless inferences and analogies of all kinds, which are also distorted by how we feel at the moment, obscured by how well we remember them later, and likely received with different intensity depending on how they happen to come together in our consciousness at that time. To use a striking example from Abraham Tucker, the foundation of our beliefs is less like the solid base of a regular building and more like the piles of houses in Rotterdam that somehow stand in a deep layer of soft mud. They support their weight securely enough, but it’s not easy to accurately point out how the different parts relate to one another. As soon as we start to think about the strength of our belief, we have to consider the arguments that produce it—in fact, these arguments will impose themselves without our choice. As each one flashes through our mind, it alters the strength of our conviction; we are like someone listening to the chaotic noise of a crowd, where there’s always something random about which sound we choose to focus on. There may be enough reasons to fully support our eventual decision, but upon closer inspection, we’ll find that we don’t perceive them with the same intensity at different times. A belief formed by a strong argument may feel very solid in the moment, but it often starts to fade when the argument isn't actively on our mind. It’s like being blinded by a bright light; the impression lingers, but begins to fade almost immediately. I believe this holds true, no matter how much we try to narrow down the sources of our conviction.
§ 9. (II.) But supposing that it were possible to strike a sort of average of this fluctuating state, should we find this average to be of the amount assigned by theory? In other words, is our natural belief in the happening of two different events in direct proportion to the frequency with which those 128 events happen in the long run? There is a lottery with 100 tickets and ten prizes; is a man's belief that he will get a prize fairly represented by one-tenth of certainty? The mere reference to a lottery should be sufficient to disprove this. Lotteries have flourished at all times, and have never failed to be abundantly supported, in spite of the most perfect conviction, on the part of many, if not of most, of those who put into them, that in the long run all will lose. Deductions should undoubtedly be made for those who act from superstitious motives, from belief in omens, dreams, and so on. But apart from these, and supposing any one to come fortified by all that mathematics can do for him, it is difficult to believe that his natural impressions about single events would be always what they should be according to theory. Are there many who can honestly declare that they would have no desire to buy a single ticket? They would probably say to themselves that the sum they paid away was nothing worth mentioning to lose, and that there was a chance of gaining a great deal; in other words, they are not apportioning their belief in the way that theory assigns.
§ 9. (II.) But if we could take an average of this changing situation, would we find that average matches the amount suggested by theory? In other words, does our natural belief in the occurrence of two different events directly reflect how often those events happen over time? Consider a lottery with 100 tickets and ten prizes; is a person's belief that they will win a prize fairly represented by one-tenth of full certainty? Just mentioning a lottery should be enough to disprove this idea. Lotteries have always thrived and have consistently been popular, even though many, if not most, participants are fully convinced that, in the long run, they will all lose. Certainly, we should take into account those who play for superstitious reasons, believing in omens, dreams, and so on. However, aside from these cases, and even for someone fortified by everything mathematics can do for them, it is difficult to believe that their natural impressions about single events would always be what theory says they should be. Are there many who can honestly say they wouldn't want to buy a single ticket? They probably tell themselves that the amount they spent isn't worth worrying about losing and that there's a chance to win a lot; in other words, they aren't matching their belief with the way theory suggests.
What bears out this view is, that the same persons who would act in this way in single instances would often not think of doing so in any but single instances. In other words, the natural tendency here is to attribute too great an amount of belief where it is or should be small; i.e. to depreciate the risk in proportion to the contingent advantage. They would very likely, when argued with, attach disparaging epithets to this state of feeling, by calling it an unaccountable fascination, or something of that kind, but of its existence there can be little doubt. We are speaking now of what is the natural tendency of our minds, not of that into which they may at length be disciplined by education and thought. If, however, educated persons have succeeded 129 for the most part in controlling this tendency in games of chance, the spirit of reckless speculation has scarcely yet been banished from commerce. On examination, this tendency will be found so prevalent in all ages, ranks, and dispositions, that it would be inadmissible to neglect it in order to bring our supposed instincts more closely into accordance with the commonly received theories of Probability.
What supports this view is that the same people who would behave this way in specific cases often wouldn’t think of doing so in any cases other than those. In other words, the natural inclination here is to assign too much belief where it is or should be minimal; that is, to downplay the risk relative to the potential benefit. They would likely, when challenged, use negative terms to describe this feeling, calling it an unexplainable obsession or something similar, but there’s little doubt about its existence. We are talking now about the natural inclination of our minds, not about what they may eventually be trained to do through education and thought. However, even if educated individuals have generally managed to control this tendency in games of chance, the spirit of reckless speculation has hardly been eliminated from commerce. Upon closer inspection, this tendency is so widespread across all ages, social classes, and personality types that it would be inappropriate to ignore it in order to align our assumed instincts more closely with the traditional theories of Probability.
§ 10. There is another aspect of this question which has been often overlooked, but which seems to deserve some attention. Granted that we have an instinct of credence, why should it be assumed that this must be just of that intensity which subsequent experience will justify? Our instincts are implanted in us for good purposes, and are intended to act immediately and unconsciously. They are, however, subject to control, and have to be brought into accordance with what we believe to be true and right. In other departments of psychology we do not assume that every spontaneous prompting of nature is to be left just as we find it, or even that on the average, omitting individual variations, it is set at that pitch that will be found in the end to be the best when we come to think about it and assign its rules. Take, for example, the case of resentment. Here we have an instinctive tendency, and one that on the whole is good in its results. But moralists are agreed that almost all our efforts at self-control are to be directed towards subduing it and keeping it in its right direction. It is assumed to be given as a sort of rough protection, and to be set, if one might so express oneself, at too high a pitch to be deliberately and consciously acted on in society. May not something of this kind be the case also with our belief? I only make a passing reference to this point here, as on the theory of Probability adopted in this work it does not appear to be at all material to the science. But it seems 130 a strong argument against the expediency of commencing the study of the science from the subjective side, or even of assigning any great degree of prominence to this side.
§ 10. There’s another angle to this question that often gets overlooked, but it seems worth noting. Assuming we have an instinct for belief, why should we think it has to be just the right intensity that later experience will confirm? Our instincts are built into us for good reasons and are meant to function immediately and without thought. However, they can be controlled and need to align with what we believe is true and right. In other areas of psychology, we don’t assume that every natural impulse should be left as it is, or even that, on average—ignoring individual differences—it will naturally be at its optimal level once we think it through and define its guidelines. Take resentment, for example. We have an instinctive tendency here, and on the whole, it tends to yield good results. Yet, moralists agree that most of our self-control efforts should focus on managing it and steering it in the right direction. It's considered a kind of rough protective instinct, set—if I may put it that way—at a level that’s too intense to be acted on consciously in social situations. Could something similar be true regarding our beliefs? I bring this up briefly here, as it doesn’t seem very relevant to the science based on the theory of Probability used in this work. But it does provide a strong argument against starting the study of this science from the subjective perspective or giving it too much emphasis. 130
That men do not believe in exact accordance with this theory must have struck almost every one, but this has probably been considered as mere exception and irregularity; the assumption being made that on the average, and in far the majority of cases, they do so believe. As stated above, it is very doubtful whether the tendency which has just been discussed is not so widely prevalent that it might with far more propriety be called the rule than the exception. And it may be better that this should be so: many good results may follow from that cheerful disposition which induces a man sometimes to go on trying after some great good, the chance of which he overvalues. He will keep on through trouble and disappointment, without serious harm perhaps, when the cool and calculating bystander sees plainly that his ‘measure of belief’ is much higher than it should be. So, too, the tendency also so common, of underrating the chance of a great evil may also work for good. By many men death might be looked upon as an almost infinite evil, at least they would so regard it themselves; suppose they kept this contingency constantly before them at its right value, how would it be possible to get through the practical work of life? Men would be stopping indoors because if they went out they might be murdered or bitten by a mad dog. To say this is not to advocate a return to our instincts; indeed when we have once reached the critical and conscious state, it is hardly possible to do so; but it should be noticed that the advantage gained by correcting them is at best but a balanced one.[3] What is most to our present purpose, it 131 suggests the inexpediency of attempting to found an exact theory on what may afterwards prove to be a mere instinct, unauthorized in its full extent by experience.
That people do not believe in strict accordance with this theory must be obvious to almost everyone, but it's likely been dismissed as just an exception or quirk; the assumption being that, on average and in most cases, they do believe that way. As mentioned earlier, it may well be that the tendency we've just discussed is so widespread that it would more properly be called the rule than the exception. Perhaps it’s better this way: many positive outcomes can arise from that optimistic attitude that motivates a person to keep striving for some significant good, often overestimating their chances. They will persevere through hardship and disappointment, possibly without much harm, while the rational observer can clearly see that their ‘measure of belief’ is much higher than it realistically should be. Similarly, the common tendency to underestimate the likelihood of a major misfortune can also have its benefits. For many, death might be seen as an almost unimaginable evil, at least that's how they would perceive it; if they were to keep this possibility constantly in mind at its actual weight, how could they manage the practical tasks of life? People would stay inside because venturing out could lead to being murdered or bitten by a rabid dog. Saying this isn't meant to suggest we should revert to our instincts; once we've reached a conscious and critical state, it's nearly impossible to do so. However, it's important to recognize that the benefits gained from correcting these instincts are, at best, a trade-off.[3] What matters most for our current discussion is that it highlights the impracticality of trying to base a precise theory on what may later turn out to be merely an instinct, not fully supported by experience.
§ 11. It may be replied, that though people, as a matter of fact, do not apportion belief in this exact way, yet they ought to do so. The purport of this remark will be examined presently; it need only be said here that it grants all that is now contended for. For it admits that the degree of our belief is capable of modification, and may need it. But in accordance with what is the belief to be modified? obviously in accordance with experience; it cannot be trusted to by itself, but the fraction at which it is to be rated must be determined by the comparative frequency of the events to which it refers. Experience then furnishing the standard, it is surely most reasonable to start from this experience, and to found the theory of our processes upon it.
§ 11. One might argue that while people don’t actually distribute their beliefs this way, they *should* do so. This point will be discussed shortly; for now, it’s important to note that it agrees with the current argument. It acknowledges that the level of our belief can change and may need to. But on what basis should this belief be adjusted? Clearly, it should be adjusted based on experience; it can’t be trusted on its own. The level at which it should be evaluated must be determined by how often the events it relates to occur. Since experience provides the standard, it makes the most sense to begin with this experience and to base our understanding of our processes on it.
If we do not do this, it should be observed that we are detaching Probability altogether from the study of things external to us, and making it nothing else in effect than a portion of Psychology. If we refuse to be controlled by experience, but confine our attention to the laws according to which belief is naturally or instinctively compounded and distributed in our minds, we have no right then to appeal to experience afterwards even for illustrations, unless under the 132 express understanding that we do not guarantee its accuracy. Our belief in some single events, for example, might be correct, and yet that in a compound of several (if derived merely from our instinctive laws of belief) very possibly might not be correct, but might lead us into practical mistakes if we determined to act upon it. Even if the two were in accordance, this accordance would have to be proved, which would lead us round, by what I cannot but think a circuitous process, to the point which has been already chosen for commencing with.
If we don't do this, we should notice that we're completely separating Probability from studying things outside of us, turning it into just a part of Psychology. If we choose not to be guided by experience and focus only on the natural or instinctive ways our beliefs are formed and spread in our minds, we can't then rely on experience later for examples unless we clearly state that we don't guarantee its accuracy. For instance, our belief in some individual events might be right, but when it comes to a combination of several events (if it's only based on our instinctive beliefs), it could easily be wrong, leading us to make practical mistakes if we decide to act on it. Even if both were aligned, we'd need to prove that alignment, which I believe would take us through a long and indirect process back to the starting point we initially chose.
§ 12. De Morgan seems to imply that the doctrine criticised above finds a justification from the analogy of Formal Logic. If the laws of necessary inference can be studied apart from all reference to external facts (except by way of illustration), why not those of probable inference? There does not, however, seem to be much force in any such analogy. Formal Logic, at any rate under its modern or Kantian mode of treatment, is based upon the assumption that there are laws of thought as distinguished from laws of things, and that these laws of thought can be ascertained and studied without taking into account their reference to any particular object. Now so long as we are confined to necessary or irreversible laws, as is of course the case in ordinary Formal Logic, this assumption leads to no special difficulties. We mean by this, that no conflict arises between these subjective and objective necessities. The two exist in perfect harmony side by side, the one being the accurate counterpart of the other. So precise is the correspondence between them, that few persons would notice, until study of metaphysics had called their attention to such points, that there were these two sides to the question. They would make their appeal to either with equal confidence, saying indifferently, ‘the thing must be so,’ or, ‘we cannot conceive its being 133 otherwise.’ In fact it is only since the time of Kant that this mental analysis has been to any extent appreciated and accepted. And even now the dominant experience school of philosophy would not admit that there are here two really distinct sides to the phenomenon; they maintain either that the subjective necessity is nothing more than the consequence by inveterate association of the objective uniformity, or else that this so-called necessity (say in the Law of Contradiction) is after all merely verbal, merely a different way of saying the same thing over again in other words. Whatever the explanation adopted, the general result is that fallacies, as real acts of thought, are impossible within the domain of pure logic; error within that province is only possibly by a momentary lapse of attention, that is of consciousness.
§ 12. De Morgan seems to suggest that the doctrine discussed above is justified by the analogy of Formal Logic. If we can study the laws of necessary inference without referencing external facts (except as examples), why not do the same for probable inference? However, this analogy doesn't seem very solid. Formal Logic, especially in its modern or Kantian approach, assumes there are laws of thought that are different from laws of reality, and that these laws of thought can be identified and examined without considering their connection to any specific object. As long as we stick to necessary or irreversible laws, which is typically the case in Formal Logic, this assumption doesn't create significant problems. By this, we mean there's no conflict between subjective and objective necessities. They coexist in perfect harmony, with one accurately reflecting the other. The correspondence is so precise that few people would notice, until they've studied metaphysics, that there are these two aspects to the issue. They appeal to either with equal certainty, saying simply, "the thing must be so," or "we can't imagine it being otherwise." In fact, this mental analysis has only been somewhat recognized and accepted since Kant's time. Even today, the dominant experience-based school of philosophy wouldn't acknowledge that there are truly two distinct sides to this phenomenon; they argue that subjective necessity is just a result of fixed associations with objective uniformity, or that this so-called necessity (as in the Law of Contradiction) is merely verbal, just a different way of restating the same idea. Regardless of the explanation chosen, the overall conclusion is that fallacies, as genuine acts of thought, cannot occur in pure logic; errors in that realm can happen only through a momentary lapse of attention, that is, of consciousness.
§ 13. But though this perfect harmony between subjective and objective uniformities or laws may exist within the domain of pure logic, it is far from existing within that of probability. The moment we make the quantity of our belief an integral part of the subject to be studied, any such invariable correspondence ceases to exist. In the former case, we could not consciously think erroneously even though we might try to do so; in the latter, we not only can believe erroneously but constantly do so. Far from the quantity of our belief being so exactly adjusted in conformity with the facts to which it refers that we cannot even in imagination go astray, we find that it frequently exists in excess or defect of that which subsequent judgment will approve. Our instincts of credence are unquestionably in frequent hostility with experience; and what do we do then? We simply modify the instincts into accordance with the things. We are constantly performing this practice, and no cultivated mind would find it possible to do anything else. No man would think of divorcing his belief from the things on which 134 it was exercised, or would suppose that the former had anything else to do than to follow the lead of the latter. Hence it results that that separation of the subjective necessity from the objective, and that determination to treat the former as a science apart by itself, for which a plausible defence could be made in the case of pure logic, is entirely inadmissible in the case of probability. However we might contrive to ‘think’ aright without appeal to facts, we cannot believe aright without incessantly checking our proceedings by such appeals. Whatever then may be the claims of Formal Logic to rank as a separate science, it does not appear that it can furnish any support to the theory of Probability at present under examination.
§ 13. Although perfect harmony between subjective and objective principles or laws may exist in pure logic, it doesn't hold true in the realm of probability. Once we make the amount of our belief a key part of what we're studying, that consistent relationship falls apart. In the first case, we couldn't possibly think incorrectly, even if we aimed to; in the second, we not only can believe wrongly but often do. Instead of our belief being perfectly aligned with the facts it pertains to—so closely that we could not go astray even in imagination—it often exists in excess or deficiency compared to what later judgment would validate. Our instincts of belief frequently clash with experience; so what do we do? We simply adjust our instincts to match reality. We consistently do this, and no educated person would think of acting otherwise. No one would consider separating their belief from the realities it applies to or assume that belief has any role other than to follow where reality leads. As a result, the division of subjective necessity from the objective, and the idea of treating the former as a separate science—which might have a reasonable defense in pure logic—is completely unacceptable in the context of probability. While we might find a way to ‘think’ correctly without referencing facts, we cannot believe correctly without continuously verifying our actions through those references. Thus, regardless of Formal Logic's claim to be a separate science, it doesn't seem to support the theory of Probability currently being discussed.
§ 14. The point in question is sometimes urged as follows. Suppose a man with two, and only two, alternatives before him, one of which he knows must involve success and the other failure. He knows nothing more about them than this, and he is forced to act. Would he not regard them with absolutely similar and equal feelings of confidence, without the necessity of referring them to any real or imaginary series? If so, is not this equivalent to saying that his belief of either, since one of them must come to pass, is equal to that of the other, and therefore that his belief of each is one-half of full confidence? Similarly if there are more than two alternatives: let it be supposed that there are any number of them, amongst which no distinctions whatever can be discerned except in such particulars as we know for certain will not affect the result; should we not feel equally confident in respect of each of them? and so here again should we not have a fractional estimate of our absolute amount of belief? It is thus attempted to lay the basis of a pure science of Probability, determining the distribution and combination of our belief hypothetically; viz. if the contingencies are exactly alike, then our belief is so apportioned, the question whether the contingencies are equal being of course decided as the objective data of Logic or Mathematics are decided.
§ 14. The issue at hand is sometimes presented like this. Imagine a man faced with two options, and only two, knowing that one of them must lead to success and the other to failure. He has no other information about them, and he must make a choice. Wouldn’t he view them with the same level of confidence, without needing to compare them to any real or imagined series? If that’s the case, doesn’t this mean that his belief in either option is equal, since one must happen, and therefore his belief in each is half of full confidence? Similarly, if there are more than two options: let’s say there are several, among which there are no noticeable differences except for details that we know won’t affect the outcome; shouldn’t we feel equally confident about each of them? And wouldn’t we then arrive at a fractional assessment of our total belief? This is how the foundation of a pure science of Probability is attempted, determining how our beliefs are distributed and combined hypothetically; namely, if the situations are identical, then our belief is divided accordingly, with the question of whether the situations really are alike being settled in the way the objective data of Logic or Mathematics are settled.
To discuss this question fully would require a statement at some length of the reasons in favour of the objective or material view of Logic, as opposed to the Formal or Conceptualist. I shall have to speak on this subject in another chapter, and will not therefore enter upon it here. But one conclusive objection which is applicable more peculiarly to Probability may be offered at once. To pursue the line of enquiry just indicated, is, as already remarked, to desert the strictly logical ground, and to take up that appropriate to psychology; the proper question, in all these cases, being not what do men believe, but what ought they to believe? Admitting, as was done above, that in the case of Formal Logic these two enquiries, or rather those corresponding to them, practically run into one, owing to the fact that men cannot consciously ‘think’ wrongly; it cannot be too strongly insisted on that in Probability the two are perfectly separable and distinct. It is of no use saying what men do or will believe, we want to know what they will be right in believing; and this can never be settled without an appeal to the phenomena themselves.
To fully discuss this question would take a detailed explanation of the reasons for the objective or material view of Logic, compared to the Formal or Conceptualist view. I have to cover this topic in another chapter, so I won’t go into it here. However, there is one clear objection that specifically applies to Probability that I can mention right away. Following the line of inquiry just noted means stepping away from strictly logical principles and moving into the realm of psychology; the key question in all these cases is not what do people believe, but what should they believe? As mentioned earlier, in the case of Formal Logic, these two inquiries—or rather, those corresponding to them—practically blend into one, since people cannot consciously ‘think’ incorrectly. However, it cannot be emphasized enough that in Probability, the two are completely separate and distinct. It’s pointless to say what people do or will believe; we need to know what they should rightly believe, and this can never be determined without looking at the actual phenomena themselves.
§ 15. But apart from the above considerations, this way of putting the case does not seem to me at all conclusive. Take the following example. A man[4] finds himself on the 136 sands of the Wash or Morecambe Bay, in a dense mist, when the spring-tide is coming in; and knows therefore that to be once caught by the tide would be fatal. He hears a church-bell at a distance, but has no means of knowing whether it is on the same side of the water with himself or on the opposite side. He cannot tell therefore whether by following its sound he will be led out into the mid-stream and be lost, or led back to dry land and safety. Here there can be no repetition of the event, and the cases are indistinguishably alike, to him, in the only circumstances which can affect the issue: is not then his prospect of death, it will be said, necessarily equal to one-half? A proper analysis of his state of mind would be a psychological rather than a logical enquiry, and in any case, as above remarked, the decision of this question does not touch our logical position. But according to the best introspection I can give I should say that what really passes through the mind in such a case is something of this kind: In most doubtful positions and circumstances we are accustomed to decide our conduct by a consideration of the relative advantages and disadvantages of each side, that is by the observed or inferred frequency with which one or the other alternative has succeeded. In proportion as these become more nearly balanced, we are more frequently mistaken in the individual cases; that is, it becomes more and more nearly what would be called ‘a mere toss up’ whether we are right or wrong. The case in question seems merely the limiting case, in which it has 137 been contrived that there shall be no appreciable difference between the alternatives, by which to decide in favour of one or other, and we accordingly feel no confidence in the particular result. Having to decide, however, we decide according to the precedent of similar cases which have occurred before. To stand still and wait for better information is certain death, and we therefore appeal to and employ the only rule we know of; or rather we feel, or endeavour to feel, as we have felt before when acting in the presence of alternatives as nearly balanced as possible. But I can neither perceive in my own case, nor feel convinced in that of others, that this appeal, in a case which cannot be repeated,[5] to a rule acted on and justified in cases which can be and are repeated, at all forces us to admit that our state of mind is the same in each case.
§ 15. But aside from the points already mentioned, this way of framing the situation doesn’t seem completely convincing to me. Consider this example. A man finds himself on the sands of the Wash or Morecambe Bay, shrouded in dense mist, as the spring tide is coming in, and he realizes that getting caught by the tide would be deadly. He hears a church bell ringing in the distance but has no way of knowing whether it's on his side of the water or the other side. He can't tell if following the sound will lead him out into the middle of the current and cause him to drown, or guide him back to safety on dry land. In this scenario, there's no opportunity for the situation to repeat itself, and to him, the circumstances that could affect the outcome are indistinguishably the same: doesn’t that mean his chance of dying is, as some might argue, necessarily fifty-fifty? A proper analysis of his mindset would be more of a psychological inquiry than a logical one, and as I mentioned earlier, settling this question doesn’t really affect our logical stance. However, based on my best self-reflection, I’d say that what really goes through the mind in such a situation is something like this: In most uncertain situations, we tend to base our decisions on weighing the relative pros and cons of each option, which is determined by the observed or inferred frequency with which each alternative has succeeded in the past. As these become more evenly matched, we are often more likely to be wrong in specific instances; it becomes increasingly a matter of chance whether we are right or wrong. The current case seems like a limiting scenario, where it’s designed so that there’s no significant difference between the options to help choose one over the other, leading us to feel unsure about the specific outcome. However, when forced to decide, we rely on the precedents of similar situations we’ve encountered before. To stand still and wait for more information would guarantee death, so we turn to and use the only rule we know; or rather, we try to feel, or do feel, as we have when faced with equally balanced alternatives before. But I can’t see in my own experience, nor am I convinced in that of others, that this appeal, in a case that cannot be repeated, to a rule acted on and justified in cases that can be and are repeated, forces us to admit that our state of mind is the same in each case.
§ 16. This example serves to bring out very clearly a point which has been already mentioned, and which will have to be insisted upon again, viz. that all which Probability discusses is the statistical frequency of events, or, if we prefer so to put it, the quantity of belief with which any one of these events should be individually regarded, but leaves all the subsequent conduct dependent upon that frequency, or that belief, to the choice of the agents. Suppose there are two travellers in the predicament in question: shall they keep together, or separate in opposite directions? In either case alike the chance of safety to each is the same, viz. one-half, but clearly their circumstances must decide which course it is preferable to adopt. If they are husband and wife, they will probably prefer to remain together; if they are sole depositaries of an important state secret, they may decide to part. In other words, we have to select here between the two alternatives of the certainty of a single loss, and the even chance 138 of a double loss; alternatives which the common mathematical statement of their chances has a decided tendency to make us regard as indistinguishable from one another. But clearly the decision must be grounded on the desires, feelings, and conscience of the agents. Probability cannot say a word upon this question. As I have pointed out elsewhere, there has been much confusion on this matter in applications of the science to betting, and in the discussion of the Petersburg problem.
§ 16. This example clearly highlights a point that has already been mentioned and will need to be emphasized again: everything Probability talks about is the statistical frequency of events, or, if we express it differently, the level of belief with which any of these events should be considered. However, it leaves all subsequent actions based on that frequency or belief up to the choice of the individuals involved. Imagine two travelers in this situation: should they stay together or go their separate ways? In either case, the chance of safety for each is the same—one-half—but clearly, their circumstances must determine which option is better to choose. If they are husband and wife, they will likely want to stay together; if they are the only ones holding an important state secret, they might decide to split up. In other words, they have to choose between the certainty of one loss and the equal chance of two losses; options that a straightforward mathematical explanation of their chances may lead us to view as essentially the same. But the decision must be based on the desires, feelings, and conscience of the individuals involved. Probability has nothing to say on this issue. As I've pointed out elsewhere, there has been a lot of confusion about this in the application of the science to betting and in discussions about the Petersburg problem.
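To make the point concrete, here is a minimal sketch, keeping the example's assumption that the bell is equally likely to lie toward land or toward the water: the expected number of lives lost is the same whether the travellers keep together or separate, which is exactly why the bare mathematical statement of the chances makes the two courses look interchangeable, even though the possible outcomes are very different.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Keep together: both follow the bell, so with probability 1/2 the bell
# lies across the water and both are lost; otherwise both are safe.
expected_lost_together = half * 2 + half * 0

# Separate: they walk in opposite directions, so whichever side the bell
# is on, exactly one of the two walks toward the water.
expected_lost_separately = half * 1 + half * 1

print(expected_lost_together, expected_lost_separately)  # 1 1
```

The averages coincide; the choice between a certain single loss and an even chance of a double loss is left, as the text says, to the desires and conscience of the agents.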
We have thus examined the doctrine in question with a minuteness which may seem tedious, but in consequence of the eminence of its supporters it would have been presumptuous to have rejected it without the strongest grounds. The objections which have been urged might be summarised as follows:—the amount of our belief of any given proposition, supposing it to be in its nature capable of accurate determination (which does not seem to be the case), depends upon a great variety of causes, of which statistical frequency—the subject of Probability—is but one. That even if we confine our attention to this one cause, the natural amount of our belief is not necessarily what theory would assign, but has to be checked by appeal to experience. The subjective side of Probability therefore, though very interesting and well deserving of examination, seems a mere appendage of the objective, and affords in itself no safe ground for a science of inference.
We have examined the doctrine in question in a detail that might seem tedious, but given the prominence of its supporters, it would have been presumptuous to reject it without the strongest grounds. The objections that have been raised can be summarized as follows: the level of our belief in any specific proposition, assuming it can be accurately determined (which doesn’t appear to be the case), depends on a variety of factors, with statistical frequency—the subject of Probability—being just one. Even if we focus solely on this factor, the natural level of our belief isn’t necessarily what theory predicts but must be verified through experience. Therefore, the subjective aspect of Probability, while very interesting and worthy of analysis, seems to be just an extension of the objective, and doesn’t provide a reliable foundation for a science of inference.
§ 17. The conception then of the science of Probability as a science of the laws of belief seems to break down at every point. We must not however rest content with such merely negative criticism. The degree of belief we entertain of a proposition may be hard to get at accurately, and when obtained may be often wrong, and may need therefore to be checked by an appeal to the objects of belief. Still in 139 popular estimation we do seem to be able with more or less accuracy to form a graduated scale of intensity of belief. What we have to examine now is whether this be possible, and, if so, what is the explanation of the fact?
§ 17. The idea of Probability as a science of the laws of belief seems to fall apart at every turn. However, we shouldn't rest content with merely negative criticism. The level of belief we have in a statement can be difficult to determine accurately, and when we do find it, it can often be incorrect and might need to be validated by looking at the actual objects of belief. Yet, in common understanding, it seems we can somewhat accurately create a scale that measures the strength of our beliefs. Now, what we need to investigate is whether this is actually achievable, and if it is, what explains this phenomenon?
That it is generally believed that we can form such a scale scarcely admits of doubt. There is a whole vocabulary of common expressions such as, ‘I feel almost sure,’ ‘I do not feel quite certain,’ ‘I am less confident of this than of that,’ and so on. When we make use of any one of these phrases we seldom doubt that we have a distinct meaning to convey by means of it. Nor do we feel much at a loss, under any given circumstances, as to which of these expressions we should employ in preference to the others. If we were asked to arrange in order, according to the intensity of the belief with which we respectively hold them, things broadly marked off from one another, we could do it from our consciousness of belief alone, without a fresh appeal to the evidence upon which the belief depended. Passing over the looser propositions which are used in common conversation, let us take but one simple example from amongst those which furnish numerical data. Do I not feel more certain that some one will die this week in the whole town, than in the particular street in which I live? and if the town is known to contain a population one hundred times greater than that in the street, would not almost any one be prepared to assert on reflection that he felt a hundred times more sure of the first proposition than of the second? Or to take a non-numerical example, are we not often able to say unhesitatingly which of two propositions we believe the most, and to some rough degree how much more we believe one than the other, at a time when all the evidence upon which each rests has faded from the mind, so that each has to be judged, as we may say, solely on its own merits?
That people generally believe we can form such a scale hardly admits of doubt. We have a whole set of common phrases like, “I feel almost sure,” “I’m not quite certain,” and “I’m less confident about this than that.” When we use any of these phrases, we usually have a clear meaning to express. We also don’t feel confused about which of these expressions to use in a given situation. If we were asked to rank things based on how strongly we believe in them, we could do it based solely on our awareness of our beliefs, without needing to refer back to the evidence that supports those beliefs. Setting aside the more casual phrases used in everyday conversation, let’s take one simple example that involves numbers. Don’t I feel more certain that someone will die this week in the entire town than in the specific street where I live? And if the town has a population one hundred times greater than that of the street, wouldn’t almost anyone agree, upon reflection, that they feel a hundred times more sure about the first statement than the second? Or, in a non-numerical example, aren’t we often able to confidently say which of two statements we believe more, and roughly how much more we believe one than the other, even when all the evidence for each has faded from memory, so that we judge them solely on their own merits?
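For readers who want to see what the 'hundred times more sure' way of speaking is tracking, here is a rough sketch. The weekly per-person death rate and the population figures below are purely illustrative assumptions, not figures from the text, and deaths are assumed independent; for small chances, the probability that at least one death occurs grows roughly in proportion to the number of people.

```python
def at_least_one_death(n, p):
    """Chance that at least one of n people dies, assuming independence."""
    return 1 - (1 - p) ** n

p = 1 / 1_000_000      # assumed weekly death rate per person (illustrative)
street = 100            # assumed number of people in the street
town = 100 * street     # the town is taken to be a hundred times larger

p_street = at_least_one_death(street, p)
p_town = at_least_one_death(town, p)

print(p_street)           # about 0.0001
print(p_town)             # about 0.00995
print(p_town / p_street)  # about 99.5, close to the factor of 100
# With a larger rate the town figure saturates towards 1 and the simple
# proportionality breaks down, so the 'hundred times' is only a rough way
# of speaking about small chances.
```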
Here then a problem proposes itself. If popular opinion, as illustrated in common language, be correct,—and very considerable weight must of course be attributed to it,—there does exist something which we call partial belief in reference to any proposition of the numerical kind described above. Now what we want to do is to find some test or justification of this belief, to obtain in fact some intelligible answer to the question, Is it correct? We shall find incidentally that the answer to this question will throw a good deal of light upon another question nearly as important and far more intricate, viz. What is the meaning of this partial belief?
Here, then, a problem arises. If public opinion, as shown in everyday language, is correct—and it certainly deserves considerable attention—there is indeed something we refer to as partial belief regarding any proposition of the numerical kind mentioned above. What we want to do is find some test or justification for this belief to provide a clear answer to the question, Is it accurate? We will also discover that answering this question will shed significant light on another question that is almost as important and much more complex, namely, What does this partial belief mean?
§ 18. We shall find it advisable to commence by ascertaining how such enquiries as the above would be answered in the case of ordinary full belief. Such a step would not offer the slightest difficulty. Suppose, to take a simple example, that we have obtained the following proposition,—whether by induction, or by the rules of ordinary deductive logic, does not matter for our present purpose,—that a certain mixture of oxygen and hydrogen is explosive. Here we have an inference, and consequent belief of a proposition. Now suppose there were any enquiry as to whether our belief were correct, what should we do? The simplest way of settling the matter would be to find out by a distinct appeal to experience whether the proposition was true. Since we are reasoning about things, the justification of the belief, that is, the test of its correctness, would be most readily found in the truth of the proposition. If by any process of inference I have come to believe that a certain mixture will explode, I consider my belief to be justified, that is to be correct, if under proper circumstances the explosion always does occur; if it does not occur the belief was wrong.
§ 18. It makes sense to start by figuring out how we would answer such questions in the case of complete belief. This shouldn't be too hard. For a simple example, let’s say we’ve determined that a specific mixture of oxygen and hydrogen is explosive; whether we reached this by induction or by ordinary deductive logic doesn't matter for our present purpose. Here, we have an inference and a resulting belief in a proposition. Now, if there's any question about whether our belief is correct, what should we do? The easiest way to resolve this would be to directly check through experience if the proposition is true. Since we're reasoning about things, the justification for our belief—essentially the test of its correctness—would be best found in the truth of the proposition. If I've inferred that a certain mixture will explode, I consider my belief justified, or correct, if the explosion consistently happens under the right conditions; if it doesn’t occur, then the belief was incorrect.
Such an answer, no doubt, goes but a little way, or rather no way at all, towards explaining what is the nature of belief 141 in itself; but it is sufficient for our present purpose, which is merely that of determining what is meant by the correctness of our belief, and by the test of its correctness. In all inferences about things, in which the amount of our belief is not taken into account, such an explanation as the above is quite sufficient; it would be the ordinary one in any question of science. It is moreover perfectly intelligible, whether the conclusion is particular or universal. Whether we believe that ‘some men die’, or that ‘all men die’, our belief may with equal ease be tested by the appropriate train of experience.
Such an answer, of course, only goes a little way, or rather not at all, in explaining what belief really is; however, it’s enough for our current purpose, which is simply to clarify what we mean by the accuracy of our belief and the way to test that accuracy. In all reasoning about things, where the amount of our belief isn't considered, an explanation like the one above is quite adequate; it would be the standard approach in any scientific inquiry. It's also completely understandable, regardless of whether the conclusion is specific or general. Whether we believe that ‘some men die’ or that ‘all men die,’ our belief can be equally tested through the relevant experiences.
§ 19. But when we attempt to apply the same test to partial belief, we shall find ourselves reduced to an awkward perplexity. A difficulty now emerges which has been singularly overlooked by those who have treated of the subject. As a simple example will serve our purpose, we will take the case of a penny. I am about to toss one up, and I therefore half believe, to adopt the current language, that it will give head. Now it seems to be overlooked that if we appeal to the event, as we did in the case last examined, our belief must inevitably be wrong, and therefore the test above mentioned will fail. For the thing must either happen or not happen: i.e. in this case the penny must either give head, or not give it; there is no third alternative. But whichever way it occurs, our half-belief, so far as such a state of mind admits of interpretation, must be wrong. If head does come, I am wrong in not having expected it enough; for I only half believed in its occurrence. If it does not happen, I am equally wrong in having expected it too much; for I half believed in its occurrence, when in fact it did not occur at all.
§ 19. But when we try to apply the same test to partial belief, we run into a frustrating confusion. A problem arises that has been surprisingly overlooked by those who have discussed this topic. To illustrate our point, let's consider the example of a penny. I’m about to toss it, and so I somewhat believe, using the common phrasing, that it will land on heads. Now it seems to be missed that if we look at the outcome, as we did in the previous case, our belief must inevitably be incorrect, and therefore the aforementioned test will fail. The result must either happen or not happen: that is, in this case, the penny must either show heads or not; there’s no third option. But no matter how it lands, our half-belief, as far as such a mindset can be interpreted, must be mistaken. If heads comes up, I’m wrong for not expecting it enough, because I only half believed it would happen. If it doesn’t happen, I’m equally wrong for expecting it too much, since I half believed it would occur when, in fact, it didn’t happen at all.
The same difficulty will occur in every case in which we attempt to justify our state of partial belief in a single contingent event. Let us take another example, slightly differing from the last. A man is to receive £1 if a die gives six, 142 to pay 1s. if it gives any other number. It will generally be admitted that he ought to give 2s. 6d. for the chance, and that if he does so he will be paying a fair sum. This example only differs from the last in the fact that instead of simple belief in a proposition, we have taken what mathematicians call ‘the value of the expectation’. In other words, we have brought into a greater prominence, not merely the belief, but the conduct which is founded upon the belief. But precisely the same difficulty recurs here. For appealing to the event,—the single event, that is,—we see that one or other party must lose his money without compensation. In what sense then can such an expectation be said to be a fair one?
The same issue arises in every situation where we try to justify our partial belief in a single possible event. Let's look at another example, which is slightly different from the last one. A man will receive £1 if a die shows six, but he has to pay 1s. if it shows any other number. It's generally accepted that he should pay 2s. 6d. for the chance, and if he does, he will be making a fair payment. This example is different from the previous one only in that instead of just believing in a proposition, we're considering what mathematicians call ‘the value of the expectation’. In other words, we are highlighting not just the belief, but also the actions that are based on that belief. However, the same problem arises here. When we focus on the event—the single event—we notice that one of the parties must lose money without any compensation. So in what way can that expectation be considered fair?
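A quick check of the arithmetic behind the 2s. 6d. figure, working in old pence (12 pence to the shilling, 20 shillings to the pound); this is just a sketch of the standard expected-value calculation, not anything found in the original text.

```python
from fractions import Fraction

shilling = 12          # pence
pound = 20 * shilling  # 240 pence

p_six = Fraction(1, 6)

# Value of the prospect: receive £1 on a six, pay 1s. on any other face.
expectation = p_six * pound - (1 - p_six) * shilling

print(expectation)                      # 30 (pence), i.e. 2s. 6d.
print(expectation == 2 * shilling + 6)  # True
```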
§ 20. A possible answer to this, and so far as appears the only possible answer, will be, that what we really mean by saying that we half believe in the occurrence of head is to express our conviction that head will certainly happen on the average every other time. And similarly, in the second example, by calling the sum a fair one it is meant that in the long run neither party will gain or lose. As we shall recur presently to the point raised in this form of answer, the only notice that need be taken of it at this point is to call attention to the fact that it entirely abandons the whole question in dispute, for it admits that this partial belief does not in any strict sense apply to the individual event, since it clearly cannot be justified there. At such a result indeed we cannot be surprised; at least we cannot on the theory adopted throughout this Essay. For bearing in mind that the employment of Probability postulates ignorance of the single event, it is not easy to see how we are to justify any other opinion or statement about the single event than a confession of such ignorance.
§ 20. A possible answer to this, and apparently the only feasible one, is that when we say we half believe in the occurrence of heads, we mean that we’re convinced heads will show up, on average, every other time. Likewise, in the second example, when we describe the sum as fair, we mean that over time neither party will gain or lose. Since we’ll come back to this issue later, it’s important to note right now that this answer gives up the whole point in dispute, as it acknowledges that this partial belief doesn’t strictly apply to an individual event, since it clearly can't be justified there. We shouldn't be surprised by this outcome; at least not under the theory we've been using throughout this essay. Given that the use of Probability assumes a lack of knowledge about the individual event, it’s hard to see how we can validate any other opinion or statement regarding that single event besides admitting our ignorance.
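As an informal illustration of this long-run reading, a short simulation (the number of tosses and the random seed are arbitrary choices made here for illustration) shows the frequency of heads settling near one-half and the average gain of the 'fair' bet settling near zero, while no individual toss is anything other than a head or a tail.

```python
import random

random.seed(0)
tosses = 100_000

heads = 0
net = 0  # +1 when head comes up, -1 when it does not
for _ in range(tosses):
    head = random.random() < 0.5
    heads += head
    net += 1 if head else -1

print(heads / tosses)  # close to 0.5
print(net / tosses)    # average gain per toss, close to 0
# Any single toss, of course, is simply a head or not a head; the
# one-half only shows itself in the aggregate.
```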
§ 21. So far then we do not seem to have made the slightest approximation to a solution of the particular 143 question now under examination. The more closely we have analysed special examples, the more unmistakeably are we brought to the conclusion that in the individual instance no justification of anything like quantitative belief is to be found; at least none is to be found in the same sense in which we expect it in ordinary scientific conclusions, whether Inductive or Deductive. And yet we have to face and account for the fact that common impressions, as attested by a whole vocabulary of common phrases, are in favour of the existence of this quantitative belief. How are we to account for this? If we appeal to an example again, and analyse it somewhat more closely, we may yet find our way to some satisfactory explanation.
§ 21. As of now, it doesn't look like we've made any progress toward solving the specific question we're looking at. The more we analyze individual cases, the clearer it becomes that there’s no real justification for any kind of quantitative belief in these instances; at least not in the same way we expect it in standard scientific conclusions, whether they’re Inductive or Deductive. Still, we have to confront and explain the fact that common perceptions, backed by a whole range of everyday phrases, suggest that this quantitative belief does exist. How can we explain this? If we take another example and analyze it a bit more closely, we might still be able to find a satisfactory explanation.
In our previous analysis (§ 18) we found it sufficient to stop at an early stage, and to give as the justification of our belief the fact of the proposition being true. Stopping however at that stage, we have found this explanation fail altogether to give a justification of partial belief; fail, that is, when applied to the individual instance. The two states of belief and disbelief correspond admirably to the two results of the event happening and not happening respectively, and unless for psychological purposes we saw no reason to analyse further; but to partial belief there is nothing corresponding in the result, for the event cannot partially happen in such cases as we are concerned with. Suppose then we advance a step further in the analysis, and ask again what is meant by the proposition being true? This introduces us, of course, to a very long and intricate path; but in the short distance along it which we shall advance, we shall not, it is to be hoped, find any very serious difficulty. As before, we will illustrate the analysis by first applying it to the case of ordinary full belief.
In our previous analysis (§ 18), we found it sufficient to stop at an early point, justifying our belief by the fact that the proposition is true. However, stopping there, we realized this explanation fails to justify partial belief—specifically when applied to individual instances. The two states of belief and disbelief align perfectly with the outcomes of the event occurring or not occurring, and unless we have psychological reasons, we saw no need to analyze further. However, for partial belief, there’s no corresponding outcome, since the event cannot partially occur in the situations we’re discussing. So, let’s take a step further in our analysis and ask what is meant by the proposition being true. This will lead us down a long and complex path, but in the brief distance we will cover, we hopefully won’t encounter any major challenges. As before, we will illustrate the analysis by first applying it to the case of ordinary full belief.
§ 22. Whatever opinion then may be held about the essential nature of belief, it will probably be admitted that a 144 readiness to act upon the proposition believed is an inseparable accompaniment of that state of mind. There can be no alteration in our belief (at any rate in the case of sane persons) without a possible alteration in our conduct, nor anything in our conduct which is not connected with something in our belief. We will first take an example in connection with the penny, in which there is full belief; we will analyse it a step further than we did before, and then attempt to apply the same analysis to an example of a similar kind, but one in which the belief is partial instead of full.
§ 22. No matter what one thinks about the core nature of belief, it's likely that we can agree that a willingness to act on what we believe is a key part of that mindset. There can't be any change in our beliefs (at least for sane people) without a potential change in our actions, nor can there be any action that isn't related to some belief we hold. First, let's look at an example involving the penny, where belief is complete; we'll analyze it a bit deeper than we did before and then try to use the same analysis on a similar example where the belief is only partial.
Suppose that I am about to throw a penny up, and contemplate the prospect of its falling upon one of its sides and not upon its edge. We feel perfectly confident that it will do so. Now whatever else may be implied in our belief, we certainly mean this; that we are ready to stake our conduct upon its falling thus. All our betting, and everything else that we do, is carried on upon this supposition. Any risk whatever that might ensue upon its falling otherwise will be incurred without fear. This, it must be observed, is equally the case whether we are speaking of a single throw or of a long succession of throws.
Suppose I'm about to toss a penny into the air and I'm thinking about the chance of it landing on one of its sides instead of on its edge. We're completely sure that it will land on a side. Now, whatever else this belief includes, it definitely means that we are willing to base our actions on it landing this way. Everything we bet on and everything else we do is based on this assumption. Any risk that might come from it landing differently is taken on without worry. It’s important to note that this applies whether we're talking about a single toss or a series of tosses.
But now let us take the case of a penny falling, not upon one side or the other, but upon a given side, head. To a certain extent this example resembles the last. We are perfectly ready to stake our conduct upon what comes to pass in the long run. When we are considering the result of a large number of throws, we are ready to act upon the supposition that head comes every other time. If e.g. we are betting upon it, we shall not object to paying £1 every time that head comes, on condition of receiving £1 every time that head does not come. This is nothing else than the translation, as we may call it, into practice, of our belief that head and tail occur equally often.
But now let’s look at the case of a penny falling, not on one side or the other, but specifically on one side, head. In some ways, this example is similar to the previous one. We are fully prepared to base our actions on what happens over time. When we think about the results of many flips, we are willing to assume that head comes up every other time. For instance, if we are betting on it, we wouldn’t mind paying £1 every time head comes up, as long as we receive £1 whenever head doesn’t come up. This is simply a practical application of our belief that heads and tails occur equally often.
Now it will be obvious, on a moment's consideration, that our conduct is capable of being slightly varied: of being varied, that is, in form, whilst it remains identical in respect of its results. It is clear that to pay £1 every time we lose, and to get £1 every time we gain, comes to precisely the same thing, in the case under consideration, as to pay ten shillings every time without exception, and to receive £1 every time that head occurs. It is so, because heads occur, on the average, every other time. In the long run the two results coincide; but there is a marked difference between the two cases, considered individually. The difference is two-fold. In the first place we depart from the notion of a payment every other time, and come to that of one made every time. In the second place, what we pay every time is half of what we get in the cases in which we do get anything. The difference may seem slight; but mark the effect when our conduct is translated back again into the subjective condition upon which it depends, viz. into our belief. It is in consequence of such a translation, as it appears to me, that the notion has been acquired that we have an accurately determinable amount of belief as to every such proposition. To have losses and gains of equal amount, and to incur them equally often, was the experience connected with our belief that the two events, head and tail, would occur equally often. This was quite intelligible, for it referred to the long run. To find that this could be commuted for a payment made every time without exception, a payment, observe, of half the amount of what we occasionally receive, has very naturally been interpreted to mean that there must be a state of half-belief which refers to each individual throw.
Now, it’s clear that we can slightly change our behavior: we can change the way we do things while still getting the same results. Paying £1 every time we lose and receiving £1 every time we win essentially results in the same outcome as paying ten shillings every single time without fail and receiving £1 every time heads comes up. This is because heads comes up, on average, every other time. In the long run, both methods yield the same results; however, there’s a noticeable difference when we look at each case individually. There are two key differences. First, we shift from the idea of making a payment every other time to making a payment every time. Second, the payment we make each time is half of what we receive when we actually do get something. The difference may seem minor; but pay attention to the effect when our actions are translated back into the underlying belief they depend on. It seems to me that this translation leads to the idea that we have a precisely measurable amount of belief regarding each proposition. Experiencing losses and gains of equal amounts, happening equally often, is related to our belief that the two outcomes, heads and tails, would also occur equally often. This was understandable, as it referred to the long term. Realizing that this could be replaced by a payment made every time without exception—a payment, remember, that is half the amount of what we occasionally receive—has understandably been interpreted to mean that there must be a state of half-belief connected to each individual toss.
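One way to cash out the claim that 'in the long run the two results coincide' is that both arrangements are fair, that is, each has an expected gain of zero per throw. Here is a minimal check in shillings (£1 = 20s.), with the flat payment of 10s. appearing as the probability of heads multiplied by the prize; the framing as an expected-value identity is an editorial gloss on the passage above.

```python
from fractions import Fraction

p_head = Fraction(1, 2)
prize = 20  # shillings received whenever head occurs

# Even-money form: gain the prize when head comes, pay the same otherwise.
even_money = p_head * prize - (1 - p_head) * prize

# Commuted form: a flat payment on every throw, the prize whenever head comes.
flat_fee = p_head * prize          # 10 shillings, half of what is received
commuted = p_head * prize - flat_fee

print(flat_fee)              # 10
print(even_money, commuted)  # 0 0
```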
§ 23. One such example, of course, does not go far towards establishing a theory. But the reader will bear in mind that almost all our conduct tends towards the same 146 result; that it is not in betting only, but in every course of action in which we have to count the events, that such a numerical apportionment of our conduct is possible. Hence, by the ordinary principles of association, it would appear exceedingly likely that, not exactly a numerical condition of mind, but rather numerical associations, become inseparably connected with each particular event which we know to occur in a certain proportion of times. Once in six times a die gives ace; a knowledge of this fact, taken in combination with all the practical results to which it leads, produces, one cannot doubt, an inseparable notion of one-sixth connected with each single throw. But it surely cannot be called belief to the amount of one-sixth; at least it admits neither of justification nor explanation in these single cases, to which alone the fractional belief, if such existed, ought to apply.
§ 23. One example, of course, doesn’t really prove a theory. But keep in mind that almost all our actions lead to the same outcome; it’s not just in betting, but in every situation where we have to consider the events, that we can assign a numerical aspect to our behavior. Therefore, based on common principles of association, it seems very likely that numerical associations, rather than a specific numerical mindset, become tightly linked with each event that we know occurs in a certain frequency. Once in six times, a die shows an ace; knowing this fact, combined with all the practical implications it brings, creates an undeniable connection of one-sixth with each single roll. However, this certainly cannot be considered a belief of one-sixth; at least, it doesn’t allow for justification or explanation in these individual cases, where the fractional belief, if it did exist, should only apply.
It is in consequence, I apprehend, of such association that we act in such an unhesitating manner in reference to any single contingent event, even when we have no expectation of its being repeated. A die is going to be thrown up once, and once only. I bet 5 to 1 against ace, not, as is commonly asserted, because I feel one-sixth part of certainty in the occurrence of ace; but because I know that such conduct would be justified in the long run of such cases, and I apply to the solitary individual the same rule that I should apply to it if I knew it were one of a long series. This accounts for my conduct being the same in the two cases; by association, moreover, we probably experience very similar feelings in regard to them both.
It’s because of this association that we act so confidently about any single random event, even when we don’t expect it to happen again. A die is going to be rolled just once. I bet 5 to 1 against rolling an ace, not, as is commonly claimed, because I feel one-sixth of certainty that the ace will occur, but because I know that such conduct would be justified in the long run of similar cases; I treat this individual roll the same way I would treat it if I knew it were part of a long series of rolls. This explains why my behavior is consistent in both cases; in addition, through association, we likely feel very similarly about both situations.
§ 24. And here, on the view of the subject adopted in this Essay, we might stop. We are bound to explain the ‘measure of our belief’ in the occurrence of a single event when we judge solely from the statistical frequency with which such events occur, for such a series of events was our 147 starting-point; but we are not bound to inquire whether in every case in which persons have, or claim to have, a certain measure of belief there must be such a series to which to refer it, and by which to justify it. Those who start from the subjective side, and regard Probability as the science of quantitative belief, are obliged to do this, but we are free from the obligation.
§ 24. And at this point, based on the perspective we've taken in this essay, we could wrap things up. We need to explain the 'level of our belief' in the occurrence of a single event when we judge solely based on how often such events happen, as that series of events was our starting point. However, we don't need to investigate whether every time someone has, or claims to have, a certain level of belief there must be a corresponding series to reference and validate it. Those who approach it from the subjective angle and see Probability as the science of quantifying belief have to do this, but we are not obligated to.
Still the question is one which is so naturally raised in connection with this subject, that it cannot be altogether passed by. I think that to a considerable extent such a justification as that mentioned above will be found applicable in other cases. The fact is that we are very seldom called upon to decide and act upon a single contingency which cannot be viewed as being one of a series. Experience introduces us, it must be remembered, not merely to a succession of events neatly arranged in a single series (as we have hitherto assumed them to be for the purpose of illustration), but to an infinite number belonging to a vast variety of different series. A man is obliged to be acting, and therefore exercising his belief about one thing or another, almost the whole of every day of his life. Any one person will have to decide in his time about a multitude of events, each one of which may never recur again within his own experience. But by the very fact of there being a multitude, though they are all of different kinds, we shall still find that order is maintained, and so a course of conduct can be justified. In a plantation of trees we should find that there is order of a certain kind if we measure them in any one direction, the trees being on an average about the same distance from each other. But a somewhat similar order would be found if we were to examine them in any other direction whatsoever. So in nature generally; there is regularity in a succession of events of the same kind. But there may also be regularity 148 if we form a series by taking successively a number out of totally distinct kinds.
Still, the question is one that naturally comes up in connection with this topic, so it can't be overlooked. I believe that to a large extent, the justification mentioned above will apply to other cases as well. The truth is that we're rarely faced with a single event that can't be seen as part of a larger series. It's important to remember that our experience doesn't just present us with a succession of events neatly arranged in a single series (as we have so far assumed for the purpose of illustration), but with an endless number belonging to a wide variety of different series. A person is almost always making decisions and relying on their beliefs about one thing or another throughout their day. Each individual will have to make choices about many different events, each of which may never happen again in their lifetime. Yet, despite the multitude of different types of events, we can find that order is maintained, which justifies a course of action. In a grove of trees, we would see a certain order if we measure them in one direction, as the trees tend to be roughly the same distance apart. A similar order would appear if we looked at them from any other direction. Similarly, in nature, there is a pattern in a sequence of events of the same kind. However, there can also be patterns if we create a series by taking a number of totally distinct types one after another.
It is in this circumstance that we find an extension of the practical justification of the measure of our belief. A man, say, buys a life annuity, insures his life on a railway journey, puts into a lottery, and so on. Now we may make a series out of these acts of his, though each is in itself a single event which he may never intend to repeat. His conduct, and therefore his belief, measured by the result in each individual instance, will not be justified, but the reverse, as shown in § 19. Could he indeed repeat each kind of action often enough it would be justified; but from this, by the conditions of life, he is debarred. Now it is perfectly conceivable that in the new series, formed by his successive acts of different kinds, there should be no regularity. As a matter of fact, however, it is found that there is regularity. In this way the equalization of his gains and losses, for which he cannot hope in annuities, insurances, and lotteries taken separately, may yet be secured to him out of these events taken collectively. If in each case he values his chance at its right proportion (and acts accordingly) he will in the course of his life neither gain nor lose. And in the same way if, whenever he has the alternative of different courses of conduct, he acts in accordance with the estimate of his belief described above, i.e. chooses the event whose chance is the best, he will in the end gain more in this way than by any other course. By the existence, therefore, of these cross-series, as we may term them, there is an immense addition to the number of actions which may be fairly considered to belong to those courses of conduct which offer many successive opportunities of equalizing gains and losses. All these cases then may be regarded as admitting of justification in the way now under discussion.
It is in this situation that we see an extension of the practical justification of our belief. For example, a person buys a life annuity, insures their life on a train journey, enters a lottery, and so on. We can categorize these actions, even though each one is a unique event that they might not intend to repeat. Their behavior, and therefore their belief, assessed by the outcomes of each individual instance, will not be justified, but the opposite, as shown in § 19. If they could repeat each type of action often enough, it would be justified; however, life’s circumstances prevent them from doing so. It is entirely possible that in the new series formed by their successive actions of different kinds, there would be no regularity. In reality, though, we find that there is regularity. This way, the balancing of their gains and losses, which they cannot expect from annuities, insurance, and lotteries taken individually, may still be achieved from these events considered collectively. If in every case they value their chance accurately (and act accordingly), they will neither gain nor lose during their life. Similarly, if whenever they have the option of different courses of action, they choose based on the assessment of their belief described above, meaning they select the option with the best odds, they will ultimately gain more this way than by any other method. Thus, the existence of these cross-series, as we can call them, significantly increases the number of actions that can justifiably be seen as part of those courses of conduct that provide many successive opportunities to balance gains and losses. All these cases can then be regarded as justifiable in the manner now being discussed.
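A crude numerical sketch of these 'cross-series' may help. The three ventures below are invented stand-ins for a lottery, an insurance, and an ordinary wager; each is priced at its long-run value, and although no single kind is repeated often enough on its own, the pooled ledger of all of them together comes out nearly even.

```python
import random

def fair_venture(rng):
    """One of several unlike but fairly priced ventures, chosen at random.

    Each returns payout minus price; the price equals the long-run average
    payout, so every venture is 'fair' in the sense used above. The figures
    are invented purely for illustration."""
    kind = rng.choice(["lottery", "insurance", "wager"])
    if kind == "lottery":     # a 1-in-1000 chance of 1000, ticket price 1
        return (1000 if rng.random() < 0.001 else 0) - 1
    if kind == "insurance":   # a 1-in-100 chance of recovering 100, premium 1
        return (100 if rng.random() < 0.01 else 0) - 1
    return (2 if rng.random() < 0.5 else 0) - 1   # an even-money wager at stake 1

rng = random.Random(42)
ledger = sum(fair_venture(rng) for _ in range(1_000_000))
# The final balance is small compared with the million unit-priced acts,
# though any one kind of venture, taken alone, might never have evened out.
print(ledger)
```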
§ 25. In the above remarks it will be observed that we 149 have been giving what is to be regarded as a justification of his belief from the point of view of the individual agent himself. If we suppose the existence of an enlarged fellow-feeling, the applicability of such a justification becomes still more extensive. We can assign a very intelligible sense to the assertion that it is 999 to 1 that I shall not get a prize in a lottery, even if this be stated in the form that my belief in my so doing is represented by the fraction 1/1000th of certainty. Properly it means that in a very large number of throws I should gain once in 1000 times. If we include other contingencies of the same kind, as described in the last section, each individual may be supposed to reach to something like this experience within the limits of his own life. He could not do it in this particular line of conduct alone, but he could do it in this line combined with others. Now introduce the possibility of each man feeling that the gain of others offers some analogy to his own gains, which we may conceive his doing except in the case of the gains of those against whom he is directly competing, and the above justification becomes still more extensively applicable.
§ 25. In the earlier points, you’ll notice we’ve been discussing a way to justify his belief from the perspective of the individual agent. If we assume a broader sense of empathy, this justification becomes even more relevant. We can clearly understand the statement that the odds are 999 to 1 against me winning a lottery, even if it's expressed as my belief being represented by a fraction of 1/1000th of certainty. Essentially, it means that over a large number of attempts, I would expect to win once every 1000 tries. If we take into account other similar situations, as explained in the last section, each person can be expected to have experiences like this within their lifetime. They might not experience it solely in this specific behavior but could do so through a mix of different ones. Now, if we add the idea that each person feels that the success of others reflects their own potential gains — which we can reasonably assume except in the case of those they are directly competing against — the justification mentioned above becomes even more widely applicable.
The following would be a fair illustration to test this view. I know that I must die on some day of the week, and there are but seven days. My belief, therefore, that I shall die on a Sunday is one-seventh. Here the contingent event is clearly one that does not admit of repetition; and yet would not the belief of every man have the value assigned it by the formula? It would appear that the same principle will be found to be at work here as in the former examples. It is quite true that I have only the opportunity of dying once myself, but I am a member of a class in which deaths occur with frequency, and I form my opinion upon evidence drawn from that class. If, for example, I had insured my life for £1000, I should feel a certain propriety in demanding 150 £7000 in case the office declared that it would only pay in the event of my dying on a Sunday. I, indeed, for my own private part, might not find the arrangement an equitable one; but mankind at large, in case they acted on such a principle, might fairly commute their aggregate gains in such a way, whilst to the Insurance Office it would not make any difference at all.
The following would be a fair example to test this idea. I know that I will die on some day of the week, and there are only seven days. Therefore, my belief that I will die on a Sunday is one out of seven. Here, the event is clearly one that cannot happen more than once; yet wouldn't everyone's belief hold the value given by the formula? It seems the same principle is at work here as in the previous examples. It's true that I only have the chance to die once, but I belong to a group where deaths happen frequently, and I base my opinion on evidence from that group. For instance, if I had insured my life for £1000, I would feel justified in asking for £7000 if the insurance company stated it would only pay out if I died on a Sunday. I might not personally find that arrangement fair; however, if society at large operated under that principle, they could reasonably adjust their total gains accordingly, while it wouldn't make any difference to the Insurance Office at all.
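The £7000 figure rests on nothing more than the rule that a payout made only one time in seven must be seven times as large to have the same long-run value. A minimal check, using the policy amounts from the text:

```python
# Ordinary policy: £1000 is paid whenever the death occurs.
ordinary_value = 1000

# Sunday-only policy: £7000 is paid, but only in the one case in seven
# in which the death happens to fall on a Sunday.
sunday_only_value = 7000 / 7

assert ordinary_value == sunday_only_value  # both average £1000 per policy
```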
§ 26. The results of the last few sections might be summarised as follows:—the different amounts of belief which we entertain upon different events, and which are recognized by various phrases in common use, have undoubtedly some meaning. But the greater part of their meaning, and certainly their only justification, are to be sought in the series of corresponding events to which they belong; in regard to which it may be shown that far more events are capable of being referred to a series than might be supposed at first sight. The test and justification of belief are to be found in conduct; in this test applied to the series as a whole, there is nothing peculiar, it differs in no way from the similar test when we are acting on our belief about any single event. But so applied, from the nature of the case it is applied successively to each of the individuals of the series; here our conduct generally admits of being separately considered in reference to each particular event; and this has been understood to denote a certain amount of belief which should be a fraction of certainty. Probably on the principles of association, a peculiar condition of mind is produced in reference to each single event. And these associations are not unnaturally retained even when we contemplate any one of these single events isolated from any series to which it belongs. When it is found alone we treat it, and feel towards it, as we do when it is in company with the rest of the series.
§ 26. The results from the last few sections can be summarized as follows: the varying levels of belief we hold about different events, which are described by various commonly used phrases, undoubtedly have some significance. However, most of their meaning, and certainly their only justification, can be found in the series of related events they are part of; regarding this, it can be shown that many more events can be linked to a series than one might initially think. The test and validation of belief are found in our actions; this test, when applied to the series as a whole, is nothing out of the ordinary—it doesn't differ from the similar test we use when acting based on our belief about a single event. But, in this application, it is naturally applied one by one to each individual in the series; here, our conduct can usually be separately considered in relation to each specific event, and this has been understood to represent a certain level of belief that should be a fraction of certainty. Likely due to the principles of association, a unique state of mind is created concerning each individual event. These associations are often retained even when we consider any one of these events apart from the series it belongs to. When it stands alone, we treat it, and feel towards it, just like we do when it is with the rest of the series.
§ 27. We may now see, more clearly than we could 151 before, why it is that we are free from any necessity of assuming the existence of causation, in the sense of necessary invariable sequence, in the case of the events which compose our series. Against such a view it might very plausibly be urged, that we constantly talk of the probability of a single event; but how can this be done, it may reasonably be said, if we once admit the possibility of that event occurring fortuitously? Take an instance from human life; the average duration of the lives of a batch of men aged thirty will be about thirty-four years. We say therefore to any individual of them, Your expectation of life is thirty-four years. But how can this be said if we admit that the train of events composing his life is liable to be destitute of all regular sequence of cause and effect? To this it may be replied that the denial of causation enables us to say neither more nor less than its assertion, in reference to the length of the individual life, for of this we are ignorant in each case alike. By assigning, as above, an expectation in reference to the individual, we mean nothing more than to make a statement about the average of his class. Whether there be causation or not in these individual cases does not affect our knowledge of the average, for this by supposition rests on independent experience. The legitimate inferences are the same on either hypothesis, and of equal value. The only difference is that on the hypothesis of non-causation we have forced upon our attention the impropriety of talking of the ‘proper’ expectation of the individual, owing to the fact that all knowledge of its amount is formally impossible; on the other hypothesis the impropriety is overlooked from the fact of such knowledge being only practically unattainable. As a matter of fact the amount of our knowledge is the same in each case; it is a knowledge of the average, and of that only.[6]
§ 27. We can now see more clearly than before why we don’t need to assume causation, in the sense of a necessary and unchanging sequence, for the events in our series. Some might argue against this perspective by stating that we often talk about the probability of a single event. However, it’s reasonable to ask how we can do this if we accept that the event could occur randomly. For example, the average lifespan of a group of thirty-year-old men is about thirty-four years. Therefore, we might say to any individual in that group, “Your life expectancy is thirty-four years.” But how can we say this if we allow for the possibility that the series of events in his life might not follow any regular cause-and-effect pattern? The response to this is that denying causation allows us to make a statement about the length of an individual’s life that is no more or less than the assertion that causation provides, since we are ignorant of the specifics in each case. By stating an expectation for the individual, we are simply making a claim about the average of his group. Whether causation exists in these individual situations doesn’t change our understanding of the average, which, by nature, is based on independent experience. The valid conclusions are the same under either assumption and hold equal value. The only difference is that with the assumption of non-causation, we are reminded of the inappropriateness of discussing the “correct” expectation for the individual, since we can never formally know its value; while under the other assumption, this inappropriateness is overlooked simply because such knowledge is only practically unreachable. In reality, our amount of knowledge is the same in both cases; it is simply knowledge of the average, and nothing more. [6]
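The claim that the 'expectation' assigned to an individual is merely a statement about the average of his class can be put in a line of arithmetic. In the sketch below the individual lifetimes are invented; only their average, thirty-four years, carries any weight, and that average is all that is asserted of any single member of the batch.

```python
# Hypothetical further lifetimes, in years, of a batch of men aged thirty.
further_lifetimes = [2, 10, 21, 30, 34, 38, 41, 45, 52, 67]

expectation = sum(further_lifetimes) / len(further_lifetimes)
print(expectation)  # 34.0 -- a fact about the class, not about any one life
```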
§ 28. We may conclude, then, that the limits within which we are thus able to justify the amount of our belief are far more extensive than might appear at first sight. Whether every case in which persons feel an amount of belief short of perfect confidence could be forced into the province of Probability is a wider question. Even, however, if the belief could be supposed capable of justification on its principles, its rules could never in such cases be made use of. Suppose, for example, that a father were in doubt whether to give a certain medicine to his sick child. On the one hand, the doctor declared that the child would die unless the medicine were given; on the other, through a mistake, the father cannot feel quite sure that the medicine he has is the right one. It is conceivable that some mathematicians, in their conviction that everything has its definite numerical probability, would declare that the man's belief had some ‘value’ (if they could only find out what it is), say nine-tenths; by which they would mean that in nine cases out of ten in which he entertained a belief of that particular value he proved to be right. So with his belief and doubt on the other side of the question. Putting the two together, there is but one course which, as a prudent man and a good father, he can possibly follow. It may be so, but when (as here) the identification of an event in a series depends on purely subjective conditions, as in this case upon the degree of vividness of his conviction, of which no one else can judge, no test is possible, and therefore no proof can be found.
§ 28. We can conclude that the limits within which we can justify our beliefs are much broader than they might seem at first glance. Whether every situation where people have a belief that falls short of complete confidence can be categorized under the concept of Probability is a more complex question. Even if we could argue that such beliefs could be justified based on their principles, their rules wouldn't be applicable in these cases. For instance, imagine a father uncertain about giving a specific medicine to his sick child. On one hand, the doctor has said the child will die without the medicine; on the other hand, due to a mix-up, the father isn’t completely sure if the medicine he has is the correct one. Some mathematicians, who believe that everything has a specific numerical probability, might assert that the father's belief has some ‘value’ (if they could only determine what it is), let’s say nine-tenths; meaning he is right nine out of ten times when he has that level of belief. The same goes for his doubts regarding the other side of the issue. Weighing both sides, there’s only one responsible action he can take as a cautious person and a caring father. It might be the case, but when (as here) identifying an outcome in a series relies on purely personal factors, like the intensity of his belief—something no one else can assess—there's no possible test, and thus no proof can be established.
§ 29. So much then for the attempts, so frequently made, to found the science on a subjective basis; they can lead, as it has here been endeavoured to show, to no satisfactory result. Still our belief is so inseparably connected with our action, that something of a defence can be made for the attempts described above; but when it is attempted, as is 153 often the case, to import other sentiments besides pure belief, and to find a justification for them also in the results of our science, the confusion becomes far worse. The following extract from Archbishop Thomson's Laws of Thought (§ 122, Ed. II.) will show what kind of applications of the science are contemplated here: “In applying the doctrine of chances to that subject in connexion with which it was invented—games of chance,—the principles of what has been happily termed ‘moral arithmetic’ must not be forgotten. Not only would it be difficult for a gamester to find an antagonist on terms, as to fortune and needs, precisely equal, but also it is impossible that with such an equality the advantage of a considerable gain should balance the harm of a serious loss. ‘If two men,’ says Buffon, ‘were to determine to play for their whole property, what would be the effect of this agreement? The one would only double his fortune, and the other reduce his to naught. What proportion is there between the loss and the gain? The same that there is between all and nothing. The gain of the one is but a moderate sum,—the loss of the other is numerically infinite, and morally so great that the labour of his whole life may not perhaps suffice to restore his property.’ ”
§ 29. So much for the attempts, often made, to base the science on a subjective foundation; as we've tried to show, these efforts lead to no satisfactory outcome. Still, our beliefs are so tightly linked to our actions that there’s some justification for the attempts mentioned above. However, when people try to bring in other feelings beyond pure belief and seek to justify them using the results of our science, things get even more confusing. The following excerpt from Archbishop Thomson's Laws of Thought (§ 122, Ed. II.) will illustrate the kinds of applications of science being considered here: “When applying the doctrine of chances to the subject for which it was created—games of chance—we must not forget the principles of what has been aptly called ‘moral arithmetic.’ Not only would it be challenging for a gambler to find an opponent with equal fortune and needs, but it is also impossible for the advantage of a significant win to offset the damage of a serious loss. ‘If two men,’ says Buffon, ‘were to decide to play for their entire fortune, what would happen? One would merely double his wealth, while the other would lose everything. What is the ratio between loss and gain? It’s the same as all and nothing. The gain for one is just a modest amount—the loss for the other is numerically infinite, and morally so great that the effort of his entire life might not be enough to restore his fortune.’”
As moral advice this is all very true and good. But if it be regarded as a contribution to the science of the subject it is quite inappropriate, and seems calculated to cause confusion. The doctrine of chances pronounces upon certain kinds of events in respect of number and magnitude; it has absolutely nothing to do with any particular person's feelings about these relations. We might as well append a corollary to the rules of arithmetic, to point out that although it is very true that twice two are four it does not follow that four, horses will give twice as much pleasure to the owner as two will. If two men play on equal terms their chances are 154 equal; in other words, if they were often to play in this manner each would lose as frequently as he would gain. That is all that Probability can say; what under the circumstances may be the determination and opinions of the men in question, it is for them and them alone to decide. There are many persons who cannot bear mediocrity of any kind, and to whom the prospect of doubling their fortune would outweigh a greater chance of losing it altogether. They alone are the judges.
As moral advice, this is all very true and valid. But if we look at it as a contribution to the science of the subject, it’s completely inappropriate and seems designed to create confusion. The law of probability deals with certain types of events in terms of number and size; it has nothing to do with anyone's personal feelings about those relationships. We might as well add a note to the rules of arithmetic stating that while it's true that twice two equals four, that doesn't mean four horses will bring twice as much joy to the owner as two will. If two players compete on equal terms, their chances are equal; in other words, if they played this way many times, each would lose as often as they would win. That's all Probability can tell us; what the men involved may decide and feel in that situation is up to them alone. There are many people who can’t tolerate any kind of mediocrity, and for whom the chance to double their wealth would outweigh a higher risk of losing it all. They alone are the judges.
If we will introduce such a balance of pleasure and pain the individual must make the calculation for himself. The supposition is that total ruin is very painful, partial loss painful in a less proportion than that assigned by the ratio of the losses themselves; the inference is therefore drawn that on the average more pain is caused by occasional great losses than by frequent small ones, though the money value of the losses in the long run may be the same in each case. But if we suppose a country where the desire of spending largely is very strong, and where owing to abundant production loss is easily replaced, the calculation might incline the other way. Under such circumstances it is quite possible that more happiness might result from playing for high than for low stakes. The fact is that all emotional considerations of this kind are irrelevant; they are, at most, mere applications of the theory, and such as each individual is alone competent to make for himself. Some more remarks will be made upon this subject in the chapter upon Insurance and Gambling.
If we are going to introduce such a balance between pleasure and pain, each person has to make the calculation for themselves. The assumption is that total ruin is very painful, while partial loss is painful in a smaller proportion than the ratio of the sums lost would suggest. Therefore, it can be inferred that, on average, more pain comes from occasional big losses than from frequent small ones, even if the total money lost is the same in the end. But if we imagine a country where people have a strong desire to spend a lot and where losses are easily replaced thanks to abundant production, the calculation might go the other way. In that case, it’s possible that more happiness would come from gambling with high stakes than with low ones. The truth is that all emotional considerations of this kind are irrelevant here; they're at most mere applications of the theory, and only each individual is competent to make them for themselves. More comments on this will be included in the chapter on Insurance and Gambling.
§ 30. It is by the introduction of such considerations as these that the Petersburg Problem has been so perplexed. Having already given some description of this problem we will refer to it very briefly here. It presents us with a sequence of sets of throws for each of which sets I am to 155 receive something, say a shilling, as the minimum receipt. My receipts increase in proportion to the rarity of each particular kind of set, and each kind is observed or inferred to grow more rare in a certain definite but unlimited order. By the wording of the problem, properly interpreted, I am supposed never to stop. Clearly therefore, however large a fee I pay for each of these sets, I shall be sure to make it up in time. The mathematical expression of this is, that I ought always to pay an infinite sum. To this the objection is opposed, that no sensible man would think of advancing even a large finite sum, say £50. Certainly he would not; but why? Because neither he nor those who are to pay him would be likely to live long enough for him to obtain throws good enough to remunerate him for one-tenth of his outlay; to say nothing of his trouble and loss of time. We must not suppose that the problem, as stated in the ideal form, will coincide with the practical form in which it presents itself in life. A carpenter might as well object to Euclid's second postulate, because his plane came to a stop in six feet on the plank on which he was at work. Many persons have failed to perceive this, and have assumed that, besides enabling us to draw numerical inferences about the members of a series, the theory ought also to be called upon to justify all the opinions which average respectable men might be inclined to form about them, as well as the conduct they might choose to pursue in consequence. It is obvious that to enter upon such considerations as these is to diverge from our proper ground. We are concerned, in these cases, with the actions of men only, as given in statistics; with the emotions they experience in the performance of these actions we have no direct concern whatever. The error is the same as if any one were to confound, in political economy, value in use with value in exchange, and object to measuring the 156 value of a loaf by its cost of production, because bread is worth more to a man when he is hungry than it is just after his dinner.
§ 30. It's the introduction of ideas like these that has complicated the Petersburg Problem. Having already given some description of this problem, we will refer to it only briefly here. It presents us with a sequence of sets of throws, for each of which I am to receive something, say a shilling, as the minimum amount. My earnings increase in proportion to how rare each specific kind of set is, and each kind is observed or inferred to become rarer in a certain definite but unlimited order. According to the problem, properly interpreted, I am never supposed to stop. Clearly, then, no matter how large a fee I pay for each of these sets, I will be sure to make it back in time. The mathematical way to express this is that I should always be willing to pay an infinite amount. The counterargument is that no reasonable person would consider paying even a large finite sum, say £50. Of course they wouldn't; but why? Because neither they nor those who would pay them are likely to live long enough to see throws good enough to cover one-tenth of their outlay, not to mention their trouble and lost time. We shouldn't assume that the problem, as stated in its ideal form, matches the practical form in which it appears in real life. A carpenter might as well object to Euclid's second postulate because his plane came to a stop six feet along the plank he was working on. Many people have failed to see this and have assumed that, besides allowing us to draw numerical conclusions about the members of a series, the theory should also justify all the opinions that average reasonable people might form about them and the actions they might choose to take as a result. It's clear that entering into such considerations is straying from our proper ground. In these cases, we are concerned only with people's actions as shown in statistics; with the emotions they experience in performing those actions we have no direct concern whatever. The mistake is similar to someone confusing, in political economy, value in use with value in exchange and objecting to measuring the value of a loaf by its cost of production because bread is worth more to someone when they are hungry than right after their meal.
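The practical point can be seen in a small simulation of the game in its usual doubling form (an assumption here, since the text leaves the exact schedule of receipts general): a head on the k-th toss pays 2^(k−1) shillings, so the mathematical expectation is infinite, yet the observed average climbs very slowly and stays nowhere near the 1000 shillings (£50) mentioned above.

```python
import random

def petersburg_payout(rng):
    """One play in the usual doubling form: toss until heads appears;
    a head on the k-th toss pays 2**(k - 1) shillings."""
    payout = 1
    while rng.random() < 0.5:   # tails: the prize doubles and we toss again
        payout *= 2
    return payout

rng = random.Random(0)
for n in (1_000, 100_000, 1_000_000):
    average = sum(petersburg_payout(rng) for _ in range(n)) / n
    print(n, round(average, 1))
# The averages grow only roughly with the logarithm of the number of plays,
# which is why no practicable span of play comes close to repaying a £50 fee.
```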
§ 31. One class of emotions indeed ought to be excepted, which, from the apparent uniformity and consistency with which they show themselves in different persons and at different times, do really present some better claim to consideration. In connection with a science of inference they can never indeed be regarded as more than an accident of what is essential to the subject, but compared with other emotions they seem to be inseparable accidents.
§ 31. There's one category of emotions that should be considered separately, as they consistently appear in various people and at different times, giving them a stronger case for attention. When it comes to the science of inference, they can only be seen as incidental to the core of the subject, but when compared to other emotions, they seem to be inseparable elements.
The reader will remember that attention was drawn in the earlier part of this chapter to the compound nature of the state of mind which we term belief. It is partly intellectual, partly also emotional; it professes to rest upon experience, but in reality the experience acts through the distorting media of hopes and fears and other disturbing agencies. So long as we confine our attention to the state of mind of the person who believes, it appears to me that these two parts of belief are quite inseparable. Indeed, to speak of them as two parts may convey a wrong impression; for though they spring from different sources, they so entirely merge in one result as to produce what might be called an indistinguishable compound. Every kind of inference, whether in probability or not, is liable to be disturbed in this way. A timid man may honestly believe that he will be wounded in a coming battle, when others, with the same experience but calmer judgments, see that the chance is too small to deserve consideration. But such a man's belief, if we look only to that, will not differ in its nature from sound belief. His conduct also in consequence of his belief will by itself afford no ground of discrimination; he will make his will as sincerely as a man who is unmistakeably on 157 his death-bed. The only resource is to check and correct his belief by appealing to past and current experience.[7] This was advanced as an objection to the theory on which probability is regarded as concerned primarily with laws of belief. But on the view taken in this Essay in which we are supposed to be concerned with laws of inference about things, error and difficulty from this source vanish. Let us bear clearly in mind that we are concerned with inferences about things, and whatever there may be in belief which does not depend on experience will disappear from notice.
The reader will remember that earlier in this chapter, we talked about the complex nature of what we call belief. It involves both intellectual and emotional aspects; it claims to be based on experience, but in reality, that experience is filtered through the distorting lenses of hopes, fears, and other disruptive factors. As long as we focus on the state of mind of the person who believes, I think these two aspects of belief are pretty inseparable. In fact, describing them as two separate parts might give the wrong impression; even though they come from different sources, they completely merge into a single outcome, creating what could be called an indistinguishable mix. Any kind of inference, whether probable or not, can be affected in this way. A fearful person might genuinely believe they'll be injured in an upcoming battle, while others with the same experience but calmer minds see that the odds are too low to worry about. However, if we only focus on his belief, it won't be qualitatively different from a solid belief. His actions due to his belief would also not provide a basis for differentiation; he would make his will as sincerely as someone who is clearly on their deathbed. The only solution is to check and adjust his belief by referring to past and current experiences.[7] This was presented as a challenge to the theory that probability mainly deals with laws of belief. But in the perspective taken in this Essay, where we focus on the laws of inference about things, mistakes and difficulties from this source disappear. Let's keep in mind that we are focused on inferences about things, and anything in belief that doesn't rely on experience will fade from consideration.
§ 32. These emotions then can claim no notice as an integral portion of any science of inference, and should in strictness be rigidly excluded from it. But if any of them are uniform and regular in their production and magnitude, they may be fairly admitted as accidental and extraneous accompaniments. This is really the case to some extent with our surprise. This emotion does show a considerable degree of uniformity. The rarer any event is the more am I, in common with most other men, surprised at it when it does happen. This surprise may range through all degrees, from the most languid form of interest up to the condition which we term ‘being startled’. And since the surprise seems to be pretty much the same, under similar circumstances, at different times, and in the case of different persons, it is free from that extreme irregularity which is found in most of the other mental conditions which accompany the contemplation 158 of unexpected events. Hence our surprise, though, as stated above, having no proper claim to admission into the science of Probability, is such a constant and regular accompaniment of that which Probability is concerned with, that notice must often be taken of it. References will occasionally be found to this aspect of the question in the following chapters.
§ 32. These emotions shouldn't be regarded as an essential part of any science of inference, and should strictly be kept out of it. However, if any of them are consistent and predictable in their occurrence and intensity, they can be reasonably considered as accidental and external factors. This is somewhat true for our surprise. This emotion tends to show a significant level of consistency. The rarer an event is, the more surprised I, like most people, am when it happens. This surprise can vary widely, from a mild interest to what we call ‘being startled’. And since the surprise appears to be quite similar under similar conditions, at different times, and among different individuals, it lacks the extreme unpredictability seen in most other mental states that accompany the contemplation of unexpected events. Therefore, our surprise, although, as mentioned earlier, it doesn't really fit into the science of Probability, is such a frequent and consistent reaction to what Probability deals with that it often deserves attention. References to this aspect of the issue will occasionally appear in the following chapters.
It may be remarked in passing, for the sake of further illustration of the subject, that this emotional accompaniment of surprise, to which we are thus able to assign something like a fractional value, differs in two important respects from the commonly accepted fraction of belief. In the first place, it has what may be termed an independent existence; it is intelligible by itself. The belief, as we endeavoured to show, needs explanation and finds it in our consequent conduct. Not so with the emotion; this stands upon its own footing, and may be examined in and by itself. Hence, in the second place, it is as applicable, and as capable of any kind of justification, in relation to the single event, as to a series of events. In this respect, as will be remembered, it offers a complete contrast to our state of belief about any one contingent event. May not these considerations help to account for the general acceptance of the doctrine, that we have a certain definite and measurable amount of belief about these events? I cannot help thinking that what is so obviously true of the emotional portion of the belief, has been unconsciously transferred to the other or intellectual portion of the compound condition, to which it is not applicable, and where it cannot find a justification.
It can be noted that this emotional response of surprise, which we can assign a sort of fractional value to, differs in two important respects from the commonly accepted fraction of belief. First, it has what we might call an independent existence; it makes sense on its own. The belief, as we tried to demonstrate, needs to be explained and finds its justification in our subsequent actions. The emotion, on the other hand, stands on its own and can be analyzed independently. Therefore, secondly, it applies to and can be justified for a single event just as much as for a series of events. In this way, it completely contrasts with our state of belief regarding any single contingent event. Could these points help explain the general acceptance of the doctrine that we hold a definite and measurable amount of belief about these events? I can't help but think that what is clearly true for the emotional aspect of belief has been unconsciously transferred to the intellectual aspect of the compound condition, where it doesn't apply and cannot be justified.
§ 33. A further illustration may now be given of the subjective view of Probability at present under discussion.
§ 33. We can now provide another example of the subjective view of Probability that we're currently discussing.
An appeal to common language is always of service, as 159 the employment of any distinct word is generally a proof that mankind have observed some distinct properties in the things, which have caused them to be singled out and have that name appropriated to them. There is such a class of words assigned by popular usage to the kind of events of which Probability takes account. If we examine them we shall find, I think, that they direct us unmistakeably to the two-fold aspect of the question,—the objective and the subjective, the quality in the events and the state of our minds in considering them,—that have occupied our attention during the former chapters.
An appeal to everyday language is always helpful, as using any specific word typically indicates that people have noticed certain distinct characteristics in things, which have led to them being identified and given that name. There is a category of words used by the public that relates to the kinds of events that Probability considers. If we look at these words, I believe we'll find that they clearly point us to the two-fold aspect of the question—the objective and the subjective, the quality in the events and our mindset when thinking about them—that we've been focused on in the previous chapters.
The word ‘extraordinary’, for instance, seems to point to the observed fact, that events are arranged in a sort of ordo or rank. No one of them might be so exactly placed that we could have inferred its position, but when we take a great many into account together, running our eye, as it were, along the line, we begin to see that they really do for the most part stand in order. Those which stand away from the line have this divergence observed, and are called extraordinary, the rest ordinary, or in the line. So too ‘irregular’ and ‘abnormal’ are doubtless used from the appearance of things, when examined in large numbers, being that of an arrangement by rule or measure. This only holds when there are a good many; we could not speak of the single events being so arranged. Again the word ‘law’, in its philosophical sense, has now become quite popularised. How the term became introduced is not certain, but there can be little doubt that it was somewhat in this way:—The effect of a law, in its usual application to human conduct, is to produce regularity where it did not previously exist; when then a regularity began to be perceived in nature, the same word was used, whether the cause was supposed to be the same or not. In each case there 160 was the same generality of agreement, subject to occasional deflection.[8]
The word 'extraordinary', for example, points to the observed fact that events are arranged in a sort of ordo, or rank. None of them might be positioned so precisely that we could infer their place, but when we look at a lot of them together, we start to notice that they mostly do fall into a pattern. Those that stray from this pattern are noted for their divergence and are called extraordinary, while the others are seen as ordinary, or in line. Similarly, 'irregular' and 'abnormal' are likely used based on the appearance of things when viewed in larger quantities, suggesting an arrangement by rule or measure. This only applies when there are many; we couldn't talk about single events being arranged this way. Moreover, the word 'law', in its philosophical sense, has become quite popular. How this term was introduced is unclear, but it likely happened like this: The effect of a law, in its usual context regarding human behavior, is to create regularity where it didn't exist before; so when regularity began to be observed in nature, the same word was applied, regardless of whether the cause was believed to be the same. In both cases, there was a general agreement, subject to occasional deviations.[8]
On the other hand, observe the words ‘wonderful’, ‘unexpected’, ‘incredible’. Their connotation describes states of mind simply; they are of course not confined to Probability, in the sense of statistical frequency, but imply simply that the events they denote are such as from some cause we did not expect would happen, and at which therefore, when they do happen, we are surprised.
On the other hand, take a look at the words 'wonderful', 'unexpected', 'incredible'. Their meaning captures states of mind straightforwardly; they aren’t limited to Probability, in terms of statistical frequency, but simply suggest that the events they refer to are ones we didn’t expect to occur for some reason, and so when they do happen, we’re surprised.
Now when we bear in mind that these two classes of words are in their origin perfectly distinct;—the one denoting simply events of a certain character; the other, though also denoting events, connoting simply states of mind;—and yet that they are universally applied to the same events, so as to be used as perfectly synonymous, we have in this a striking illustration of the two sides under which Probability may be viewed, and of the universal recognition of a close connection between them. The words are popularly used as synonymous, and we must not press their meaning too far; but if it were to be observed, as I am rather inclined to think it could, that the application of the words which denote mental states is wider than that of the others, we should have an illustration of what has been already observed, viz. that the province of Probability is not so extensive as that over which variation of belief might be observed. Probability only considers the case in which this variation is brought about in a certain definite statistical way.
Now, when we consider that these two types of words come from completely different origins—one simply describing events of a particular kind, and the other, while also describing events, specifically indicating states of mind—we see that they are used interchangeably in relation to the same events. This serves as a clear example of the different aspects under which Probability can be understood and the universal acknowledgment of their close relationship. People commonly treat these words as synonyms, and we shouldn’t analyze their meanings too rigidly. However, if it were to be noted, as I believe it could, that the usage of words referring to mental states is broader than that of the others, it would illustrate what has already been stated: that the scope of Probability is not as wide as the area where variations in belief can be observed. Probability only considers situations where this variation is achieved in a specific statistical manner.
§ 34. It will be found in the end both interesting and important to have devoted some attention to this subjective 161 side of the question. In the first place, as a mere speculative inquiry the quantity of our belief of any proposition deserves notice. To study it at all deeply would be to trespass into the province of Psychology, but it is so intimately connected with our own subject that we cannot avoid all reference to it. We therefore discuss the laws under which our expectation and surprise at isolated events increases or diminishes, so as to account for these states of mind in any individual instance, and, if necessary, to correct them when they vary from their proper amount.
§ 34. In the end, it’s both interesting and important to pay some attention to this subjective aspect of the issue. First of all, the level of our belief in any statement deserves attention as a speculative inquiry. Delving too deeply into it would lead us into the field of Psychology, but it’s so closely related to our topic that we can’t entirely avoid mentioning it. Therefore, we examine the principles that affect our expectation and surprise regarding isolated events, so we can understand these mental states in specific cases and, if needed, adjust them when they deviate from what’s appropriate.
But there is another more important reason than this. It is quite true that when the subjects of our discussion in any particular instance lie entirely within the province of Probability, they may be treated without any reference to our belief. We may or we may not employ this side of the question according to our pleasure. If, for example, I am asked whether it is more likely that A. B. will die this year, than that it will rain to-morrow, I may calculate the chance (which really is at bottom the same thing as my belief) of each, find them respectively, one-sixth and one-seventh, say, and therefore decide that my ‘expectation’ of the former is the greater, viz. that this is the more likely event. In this case the process is precisely the same whether we suppose our belief to be introduced or not; our mental state is, in fact, quite immaterial to the question. But, in other cases, it may be different. Suppose that we are comparing two things, of which one is wholly alien to Probability, in the sense that it is hopeless to attempt to assign any degree of numerical frequency to it, the only ground they have in common may be the amount of belief to which they are respectively entitled. We cannot compare the frequency of their occurrence, for one may occur too seldom to judge by, perhaps it may be unique. It has been already 162 said, that our belief of many events rests upon a very complicated and extensive basis. My belief may be the product of many conflicting arguments, and many analogies more or less remote; these proofs themselves may have mostly faded from my mind, but they will leave their effect behind them in a weak or strong conviction. At the time, therefore, I may still be able to say, with some degree of accuracy, though a very slight degree, what amount of belief I entertain upon the subject. Now we cannot compare things that are heterogeneous: if, therefore, we are to decide between this and an event determined naturally and properly by Probability, it is impossible to appeal to chances or frequency of occurrence. The measure of belief is the only common ground, and we must therefore compare this quantity in each case. The test afforded will be an exceedingly rough one, for the reasons mentioned above, but it will be better than none; in some cases it will be found to furnish all we want.
But there's another, more important reason for this. It's true that when the topics we discuss in a specific instance fall entirely within the realm of Probability, we can handle them without considering our beliefs. We can choose whether or not to factor in this perspective. For example, if I’m asked whether it's more likely that A. B. will die this year than it will rain tomorrow, I can calculate the odds (which essentially reflect my belief) of each event, finding them to be one-sixth and one-seventh, say, and decide that my 'expectation' of the first is greater— meaning that it’s the more likely event. In this case, the process remains the same whether we include our belief or not; our mental state is actually irrelevant to the question. However, in other cases, it could be different. Suppose we’re comparing two things where one is completely outside of Probability, meaning it's futile to try to assign any numerical frequency to it; the only common factor may be the level of belief we have in each. We can't compare their frequency of occurrence, since one might happen so rarely that there’s not enough data to base a judgment on, or it may even be unique. It has been noted that our belief in many events is based on a very complex and extensive foundation. My belief could result from various conflicting arguments and many more or less distant analogies; although these pieces of evidence may have mostly faded from my mind, they leave a lingering effect in the form of a weak or strong conviction. So, at that moment, I might still be able to estimate, with some degree of accuracy—though very slight— the level of belief I hold on the matter. Now, we can't compare things that are fundamentally different: therefore, if we need to decide between this and an event that is naturally and properly determined by Probability, it’s impossible to refer to chances or frequency of occurrence. The measure of belief is the only common ground, and we must compare this quantity in each case. The assessment we get will be very rough for the reasons mentioned above, but it’s better than having no assessment at all; in some situations, it will provide everything we need.
Suppose, for example, that one letter in a million is lost in the Post Office, and that in any given instance I wish to know which is more likely, that a letter has been so lost, or that my servant has stolen it? If the latter alternative could, like the former, be stated in a numerical form, the comparison would be simple. But it cannot be reduced to this form, at least not consciously and directly. Still, if we could feel that our belief in the man's dishonesty was greater than one-millionth, we should then have homogeneous things before us, and therefore comparison would be possible.
Suppose, for example, that one letter in a million gets lost in the Post Office, and in any given situation, I want to know which is more likely: that a letter has been lost or that my servant has stolen it? If the second option could be expressed in a numerical way, like the first, the comparison would be straightforward. But it can't be put into that form, at least not directly and consciously. Still, if we felt that our belief in the man's dishonesty was greater than one in a million, we would then have comparable things in front of us, making it possible to compare them.
§ 35. We are now in a position to give a tolerably accurate definition of a phrase which we have frequently been obliged to employ, or incidentally to suggest, and of which the reader may have looked for a definition already, viz. the probability of an event, or what is equivalent to this, the chance of any given event happening. I consider that these 163 terms presuppose a series; within the indefinitely numerous class which composes this series a smaller class is distinguished by the presence or absence of some attribute or attributes, as was fully illustrated and explained in a previous chapter. These larger and smaller classes respectively are commonly spoken of as instances of the ‘event,’ and of ‘its happening in a given particular way.’ Adopting this phraseology, which with proper explanations is suitable enough, we may define the probability or chance (the terms are here regarded as synonymous) of the event happening in that particular way as the numerical fraction which represents the proportion between the two different classes in the long run. Thus, for example, let the probability be that of a given infant living to be eighty years of age. The larger series will comprise all infants, the smaller all who live to eighty. Let the proportion of the former to the latter be 9 to 1; in other words, suppose that one infant in ten lives to eighty. Then the chance or probability that any given infant will live to eighty is the numerical fraction 1/10. This assumes that the series are of indefinite extent, and of the kind which we have described as possessing a fixed type. If this be not the case, but the series be supposed terminable, or regularly or irregularly fluctuating, as might be the case, for instance, in a society where owing to sanitary or other causes the average longevity was steadily undergoing a change, then in so far as this is the case the series ceases to be a subject of science. What we have to do under these circumstances, is to substitute a series of the right kind for the inappropriate one presented by nature, choosing it, of course, with as little deflection as possible from the observed facts. This is nothing more than has to be done, and invariably is done, whenever natural objects are made subjects of strict science.
§ 35. We can now provide a fairly accurate definition of a phrase we've frequently had to use or hint at, and which the reader may have already been looking for a definition of, namely—the probability of an event, or equivalently, the chance of a specific event occurring. I believe these terms imply a series; within the endlessly large class that makes up this series, a smaller class is defined by the presence or absence of certain attributes, as explained in detail in a previous chapter. These larger and smaller classes are typically referred to as examples of the ‘event’ and of ‘it happening in a specific way.’ Using this terminology, which is appropriate with the right explanations, we can define the probability or chance (with these terms considered synonymous here) of the event occurring in that specific way as the numerical fraction that represents the ratio between the two different classes over time. For example, let's consider the probability of a particular infant living to be eighty years old. The larger series would include all infants, while the smaller series would include all those who live to eighty. If the ratio of the former to the latter is 9 to 1, in other words, if one infant in ten lives to eighty, then the chance or probability that any given infant will reach eighty is the numerical fraction 1/10. This assumes the series is indefinitely large and has a fixed type. If this is not the case, and the series is thought to be finite or fluctuating, as might happen in a society where, due to health or other factors, the average lifespan is consistently changing, then in such situations, the series ceases to be a subject of scientific inquiry. What we need to do in these circumstances is to replace the inappropriate series presented by nature with one of the right kind, selecting it with as little deviation as possible from the observed facts. This is no different from what needs to be done, and is always done, whenever natural objects become the focus of rigorous scientific study.
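To make the definition concrete, here is a minimal Python sketch, added purely as a modern illustration, that counts the smaller class within the larger one and reports the proportion as a fraction. The lifespan figures are invented for the example, chosen so that exactly one infant in ten reaches eighty.

```python
from fractions import Fraction

# Invented lifespans (in years) for a small batch of infants; illustrative only.
lifespans = [3, 81, 45, 62, 17, 70, 58, 54, 29, 66]

larger_class = len(lifespans)                             # all infants in the batch
smaller_class = sum(1 for age in lifespans if age >= 80)  # those who live to eighty

probability = Fraction(smaller_class, larger_class)
print(probability)  # 1/10 for this batch; the definition itself refers to the long run
```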
§ 36. A word or two of explanation may be added about the expression employed above, ‘the proportion in the long run.’ The run must be supposed to be very long indeed, in fact never to stop. As we keep on taking more terms of the series we shall find the proportion still fluctuating a little, but its fluctuations will grow less. The proportion, in fact, will gradually approach towards some fixed numerical value, what mathematicians term its limit. This fractional value is the one spoken of above. In the cases in which deductive reasoning is possible, this fraction may be obtained without direct appeal to statistics, from reasoning about the conditions under which the events occur, as was explained in the fourth chapter.
§ 36. A few words of explanation can be added about the phrase used earlier, 'the proportion in the long run.' The "run" is assumed to be very long, essentially never-ending. As we keep taking more terms from the series, we'll notice the proportion still fluctuating a bit, but those fluctuations will decrease. The proportion will gradually get closer to a fixed numerical value, which mathematicians call its limit. This fractional value is what was mentioned earlier. In situations where deductive reasoning is applicable, this fraction can be found without directly relying on statistics, by reasoning about the conditions under which the events happen, as explained in the fourth chapter.
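The behaviour of the 'long run' can be imitated with a small simulation. The sketch below assumes the series behaves like independent draws in which the attribute occurs with a fixed long-run rate of 1/10, which is exactly the idealized fixed type spoken of above; it prints the running proportion at a few checkpoints, and the fluctuations shrink as more terms are taken.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

LIMIT = 0.1     # assumed long-run rate of the attribute
hits = 0
checkpoints = {10, 100, 1_000, 10_000, 100_000}

for n in range(1, 100_001):
    if random.random() < LIMIT:
        hits += 1
    if n in checkpoints:
        print(f"after {n:>6} terms the proportion is {hits / n:.4f}")

# The proportion still fluctuates from checkpoint to checkpoint, but the
# fluctuations grow smaller and the value settles towards the limit 0.1.
```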
Here becomes apparent the full importance of the distinction so frequently insisted on, between the actual irregular series before us and the substituted one of calculation, and the meaning of the assertion (Ch. I. § 13), that it was in the case of the latter only that strict scientific inferences could be made. For how can we have a ‘limit’ in the case of those series which ultimately exhibit irregular fluctuations? When we say, for instance, that it is an even chance that a given person recovers from the cholera, the meaning of this assertion is that in the long run one half of the persons attacked by that disease do recover. But if we examined a sufficiently extensive range of statistics, we might find that the manners and customs of society had produced such a change in the type of the disease or its treatment, that we were no nearer approaching towards a fixed limit than we were at first. The conception of an ultimate limit in the ratio between the numbers of the two classes in the series necessarily involves an absolute fixity of the type. When therefore nature does not present us with this absolute fixity, as she seldom or never does except in games of chance (and 165 not demonstrably there), our only resource is to introduce such a series, in other words, as has so often been said, to substitute a series of the right kind.
Here, the full importance of the distinction that is often emphasized becomes clear: the difference between the actual irregular series we see and the calculated one that is used as a substitute, and the significance of the statement (Ch. I. § 13) that strict scientific inferences can only be drawn from the latter. How can we determine a ‘limit’ for series that ultimately display irregular fluctuations? For example, when we say there's a fifty-fifty chance that someone will recover from cholera, what we mean is that, in the long run, half of the people who get that disease will recover. However, if we looked at a broad enough range of statistics, we might discover that societal habits and practices had changed the nature of the disease or its treatment so much that we still aren't getting closer to a fixed limit. The idea of an ultimate limit in the ratio of the two classes in the series requires a complete consistency in type. So, when nature doesn’t provide us with this kind of absolute consistency, which is rare except in games of chance (and even then, not necessarily), our only option is to create such a series; in other words, as has often been said, to substitute it with a series of the right kind.
§ 37. The above, which may be considered tolerably complete as a definition, might equally well have been given in the last chapter. It has been deferred however to the present place, in order to connect with it at once a proposition involving the conceptions introduced in this chapter; viz. the state of our own minds, in reference to the amount of belief we entertain in contemplating any one of the events whose probability has just been described. Reasons were given against the opinion that our belief admitted of any exact apportionment like the numerical one just mentioned. Still, it was shown that a reasonable explanation could be given of such an expression as, ‘my belief is 1/10th of certainty’, though it was an explanation which pointed unmistakeably to a series of events, and ceased to be intelligible, or at any rate justifiable, when it was not viewed in such a relation to a series. In so far, then, as this explanation is adopted, we may say that our belief is in proportion to the above fraction. This referred to the purely intellectual part of belief which cannot be conceived to be separable, even in thought, from the things upon which it is exercised. With this intellectual part there are commonly associated various emotions. These we can to a certain extent separate, and, when separated, can measure with that degree of accuracy which is possible in the case of other emotions. They are moreover intelligible in reference to the individual events. They will be found to increase and diminish in accordance, to some extent, with the fraction which represents the scarcity of the event. The emotion of surprise does so with some degree of accuracy.
§ 37. The definition above is fairly complete and could have been mentioned in the last chapter. However, it has been put off to this point to connect it with a proposition related to the ideas introduced in this chapter; specifically, the state of our minds regarding the level of belief we have when considering any of the events whose probabilities have just been discussed. Arguments were presented against the idea that our belief could be exactly quantified like the numerical example given earlier. Still, it was demonstrated that a reasonable explanation could be provided for a statement like, ‘my belief is 1/10th of certainty,’ although this explanation clearly points to a series of events and loses its meaning, or at least its justification, when not related to that series. Therefore, if we accept this explanation, we can say that our belief is in proportion to the fraction mentioned. This refers to the purely intellectual aspect of belief, which cannot be thought of as separate from the things it applies to. This intellectual part is often linked with various emotions. We can somewhat separate these emotions and, when they are separated, measure them with a level of accuracy similar to that of other emotions. They can also be understood in relation to the individual events. It will be found that they increase and decrease in relation to the fraction representing the rarity of the event. The emotion of surprise does this with a certain degree of precision.
The above investigation describes, though in a very brief 166 form, the amount of truth which appears to me to be contained in the assertion frequently made, that the fraction expressive of the probability represents also the fractional part of full certainty to which our belief of the individual event amounts. Any further analysis of the matter would seem to belong to Psychology rather than to Probability.
The investigation above outlines, although briefly, the amount of truth that I believe is in the claim often made that the fraction representing probability also indicates the fractional part of complete certainty that our belief in the individual event reaches. Any deeper analysis of this topic seems more relevant to Psychology than to Probability.
1 In the ordinary signification of this term. As De Morgan uses it he makes Formal Logic include Probability, as one of its branches, as indicated in his title “Formal Logic, or the Calculus of Inference, necessary and probable.”
1 In the usual meaning of this term. As De Morgan uses it, he makes Formal Logic include Probability as one of its branches, as shown in his title "Formal Logic, or the Calculus of Inference, necessary and probable."
2 Formal Logic. Preface, page v.
2 Formal Logic. Preface, page v.
3 An illustration of the points here insisted on has recently [1876] been given in a quarter where few would have expected it; I allude, as many readers will readily infer, to J. S. Mill's exceedingly interesting Essays on Theism. It is not within our province here to criticise any of their conclusions, but they have expressed in a very significant way the conviction entertained by him that beliefs which are not justified by evidence, and possibly may not be capable of justification (those for instance of immortality and the existence of the Deity), may nevertheless not only continue to exist in cultivated minds, but may also be profitably encouraged there, at any rate in the shape of hopes, for certain supposed advantages attendant on their retention, irrespective even of their truth.
3 An illustration of the points discussed here was recently provided in a place where few would have expected it; I’m referring, as many readers will likely guess, to J. S. Mill's very engaging Essays on Theism. It’s not our role here to critique any of their conclusions, but they have clearly articulated Mill's belief that ideas not backed by evidence, and possibly unprovable ones (like those about immortality and the existence of God), may still exist in educated minds and can even be beneficially nurtured there, at least as hopes, for certain assumed benefits they bring, regardless of their truth.
4 It is necessary to take an example in which the man is forced to act, or we should not be able to shew that he has any belief on the subject at all. He may declare that he neither knows nor cares anything about the matter, and that therefore there is nothing of the nature of belief to be extracted out of his mental condition. He very likely would take this ground if we asked him, as De Morgan does, with a slightly different reference (Formal Logic, p. 183), whether he considers that there are volcanoes on the unseen side of the moon larger than those on the side turned towards us; or, with Jevons (Principles of Science, Ed. II. p. 212) whether he considers that a Platythliptic Coefficient is positive. These do not therefore seem good instances to illustrate the position that we always entertain a certain degree of belief on every question which can be stated, and that utter inability to give a reason in favour of either alternative corresponds to half belief.
4 We need an example where a person is compelled to take action, or else we won't be able to show that he has any belief about the issue at all. He might insist that he neither knows nor cares about the topic, and that there's nothing resembling belief in his mental state. He would probably respond this way if we asked him, as De Morgan does with a slightly different context (Formal Logic, p. 183), whether he thinks there are volcanoes on the far side of the moon that are larger than those we can see; or, like Jevons (Principles of Science, Ed. II. p. 212), whether he believes a Platythliptic Coefficient is positive. Therefore, these examples don’t effectively illustrate the idea that we always hold some degree of belief on any question that can be posed, and that a complete inability to provide a reason in support of either option corresponds to half belief.
5 Except indeed on the principles indicated further on in §§ 24, 25.
5 Except for the principles mentioned later in §§ 24, 25.
6 For a fuller discussion of this, see the Chapter on Causation.
6 For a more detailed discussion on this, check out the Chapter on Causation.
7 The best example I can recall of the distinction between judging from the subjective and the objective side, in such cases as these, occurred once in a railway train. I met a timid old lady who was in much fear of accidents. I endeavoured to soothe her on the usual statistical ground of the extreme rarity of such events. She listened patiently, and then replied, “Yes, Sir, that is all very well; but I don't see how the real danger will be a bit the less because I don't believe in it.”
7 The best example I can think of that shows the difference between subjective and objective judgment in situations like this happened on a train. I met a nervous old lady who was really afraid of accidents. I tried to calm her down by pointing out the statistics on how rare such events are. She listened patiently and then said, “Yes, Sir, that’s all great; but I don’t understand how the real danger becomes any less just because I don’t believe in it.”
8 This would still hold of empirical laws which may be capable of being broken: we now have very much shifted the word, to denote an ultimate law which it is supposed cannot be broken.
8 This still applies to empirical laws that can be violated: we've now changed the term to refer to an ultimate law that is believed to be unbreakable.
CHAPTER 7.
THE RULES OF INFERENCE IN PROBABILITY.
§ 1. In the previous chapter, an investigation was made into what may be called, from the analogy of Logic, Immediate Inferences. Given that nine men out of ten, of any assigned age, live to forty, what could be inferred about the prospect of life of any particular man? It was shown that, although this step was very far from being so simple as it is frequently supposed to be, and as the corresponding step really is in Logic, there was nevertheless an intelligible sense in which we might speak of the amount of our belief in any one of these ‘proportional propositions,’ as they may succinctly be termed, and justify that amount. We must now proceed to the consideration of inferences more properly so called, I mean inferences of the kind analogous to those which form the staple of ordinary logical treatises. In other words, having ascertained in what manner particular propositions could be inferred from the general propositions which included them, we must now examine in what cases one general proposition can be inferred from another. By a general proposition here is meant, of course, a general proposition of the statistical kind contemplated in Probability. The rules of such inference being very few and simple, their consideration will not detain us long. From the data now in our possession we are 168 able to deduce the rules of probability given in ordinary treatises upon the science. It would be more correct to say that we are able to deduce some of these rules, for, as will appear on examination, they are of two very different kinds, resting on entirely distinct grounds. They might be divided into those which are formal, and those which are more or less experimental. This may be otherwise expressed by saying that, from the kind of series described in the first chapters, some rules will follow necessarily by the mere application of arithmetic; whilst others either depend upon peculiar hypotheses, or demand for their establishment continually renewed appeals to experience, and extension by the aid of the various resources of Induction. We shall confine our attention at present principally to the former class; the latter can only be fully understood when we have considered the connection of our science with Induction.
§ 1. In the previous chapter, we looked into what we can call, by analogy to Logic, Immediate Inferences. If nine out of ten men of any given age live to be forty, what can we infer about the life expectancy of a specific man? It was shown that, although this step is far from being as simple as it is often assumed to be, and as the corresponding step in Logic really is, there is nevertheless an intelligible sense in which we can speak of our level of belief in any of these 'proportional propositions,' as we can briefly call them, and justify that belief. Now we need to move on to more traditional inferences, specifically those that are similar to what you would find in standard logical texts. In other words, after determining how particular propositions can be derived from the general propositions that encompass them, we will now look into the cases where one general proposition can be inferred from another. By a general proposition, I mean a general proposition of the statistical type discussed in Probability. The rules for these inferences are quite few and straightforward, so we won't linger on them for long. With the information we currently have, we can derive the probability rules typically found in standard texts on the subject. It would be more accurate to say that we can derive some of these rules, because, as we will discover upon closer examination, they fall into two very different categories based on entirely different principles. They can be divided into those that are formal and those that are more or less experimental. Put differently, some rules will emerge necessarily from the types of series described in the earlier chapters simply through arithmetic, while others rely on specific hypotheses or require continually renewed appeals to experience, extended with the help of the various resources of Induction. For now, we'll focus mainly on the former group; the latter can only be fully understood once we explore the relationship between our science and Induction.
§ 2. The fundamental rules of Probability strictly so called, that is the formal rules, may be divided into two classes,—those obtained by addition or subtraction on the one hand, corresponding to what are generally termed the connection of exclusive or incompatible events;[1] and those obtained by multiplication or division, on the other hand, corresponding to what are commonly termed dependent events. We will examine these in order.
§ 2. The basic rules of Probability, in the strict sense, which are the formal rules, can be split into two categories: those derived from addition or subtraction, which relate to what are usually called exclusive or incompatible events;[1] and those derived from multiplication or division, which relate to what are typically referred to as dependent events. We will look at these in turn.
(1) We can make inferences by simple addition. If, for instance, there are two distinct properties observable in various members of the series, which properties do not occur in the same individual; it is plain that in any batch the number that are of one kind or the other will be equal to the sum of those of the two kinds separately. Thus 36.4 infants 169 in 100 live to over sixty, 35.4 in 100 die before they are ten;[2] take a large number, say 10,000, then there will be about 3640 who live to over sixty, and about 3540 who do not reach ten; hence the total number who do not die within the assigned limits will be about 2820 altogether. Of course if these proportions were accurately assigned, the resultant sum would be equally accurate: but, as the reader knows, in Probability this proportion is merely the limit towards which the numbers tend in the long run, not the precise result assigned in any particular case. Hence we can only venture to say that this is the limit towards which we tend as the numbers become greater and greater.
(1) We can draw conclusions through simple addition. For example, if there are two distinct characteristics observed in different members of the series, and these characteristics don’t appear in the same individual, it’s clear that in any sample, the number of individuals with one characteristic or the other will equal the sum of those with each characteristic counted separately. So, for every 100 infants, 36.4 live to be over sixty, while 35.4 die before turning ten;[2] if we take a large sample, like 10,000, then about 3,640 will live to over sixty, and around 3,540 will not reach ten. This means the total number who don’t die within the specified limits will be about 2,820 altogether. Of course, if these proportions were perfectly accurate, the resulting total would also be precise: however, as the reader knows, in Probability, this proportion is just the limit that the numbers approach over time, not the exact outcome in any specific scenario. Therefore, we can only say that this is the limit we approach as the sample size gets larger.
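The arithmetic of this example can be re-performed in a few lines of Python; the rates and the batch size are those given in the text, and the only point illustrated is that exclusive classes are combined by simple addition.

```python
batch = 10_000

rate_over_sixty = 36.4 / 100   # proportion living past sixty
rate_under_ten = 35.4 / 100    # proportion dying before ten

over_sixty = round(batch * rate_over_sixty)   # about 3640
under_ten = round(batch * rate_under_ten)     # about 3540

# The two classes are exclusive (no one belongs to both), so the number falling
# under one head or the other is simply their sum.
either = over_sixty + under_ten               # about 7180
remainder = batch - either                    # about 2820, dying between ten and sixty
print(over_sixty, under_ten, either, remainder)
```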
This rule, in its general algebraic form, would be expressed in the language of Probability as follows:—If the chances of two exclusive or incompatible events be respectively 1/m and 1/n the chance of one or other of them happening will be 1/m + 1/n or m + n/mn. Similarly if there were more than two events of the kind in question. On the principles adopted in this work, the rule, when thus algebraically expressed, means precisely the same thing as when it is expressed in the statistical form. It was shown at the conclusion of the last chapter that to say, for example, that the chance of a given event happening in a certain way is 1/6, is only another way of saying that in the long run it does tend to happen in that way once in six times.
This rule, in its general algebraic form, would be expressed in the language of probability as follows: If the chances of two exclusive or incompatible events are respectively 1/m and 1/n, the chance of either one of them happening will be 1/m + 1/n, that is, (m + n)/mn. Similarly, this applies if there are more than two events of the same kind. Based on the principles established in this work, the rule, when expressed algebraically, means exactly the same thing as when it is stated in statistical terms. It was demonstrated at the end of the last chapter that saying, for example, the chance of a specific event happening in a certain way is 1/6, is just another way of saying that, in the long run, it tends to occur in that way once every six times.
It is plain that a sort of corollary to this rule might be 170 obtained, in precisely the same way, by subtraction instead of addition. Stated generally it would be as follows:—If the chance of one or other of two incompatible events be 1/m and the chance of one alone be 1/n, the chance of the remaining one will be 1/m − 1/n or n − m/nm.
It's clear that a kind of corollary to this rule could be obtained, in exactly the same way, by subtraction instead of addition. Stated more generally, it would be as follows: If the probability of one or the other of two incompatible events is 1/m and the probability of one alone is 1/n, the probability of the remaining one will be 1/m − 1/n, that is, (n − m)/mn.
For example, if the chance of any one dying in a year is 1/10, and his chance of dying of some particular disease is 1/100, his chance of dying of any other disease is 9/100.
For example, if the chance of someone dying in a year is 1/10, and their chance of dying from a specific disease is 1/100, then their chance of dying from any other disease is 9/100.
The reader will remark here that there are two apparently different modes of stating this rule, according as we speak of ‘one or other of two or more events happening,’ or of ‘the same event happening in one or other of two or more ways.’ But no confusion need arise on this ground; either way of speaking is legitimate, the difference being merely verbal, and depending (as was shown in the first chapter, § 8) upon whether the distinctions between the ‘ways’ are or are not too deep and numerous to entitle the event to be conventionally regarded as the same.
The reader will notice that there are two seemingly different ways to express this rule, depending on whether we talk about 'one or another of two or more events happening' or 'the same event happening in one or another of two or more ways.' However, there’s no need for confusion here; both ways of speaking are valid, and the difference is only in wording. It relies (as was shown in the first chapter, § 8) on whether the distinctions between the 'ways' are too significant and numerous to consider the event as conventionally the same.
We may also here point out the justification for the common doctrine that certainty is represented by unity, just as any given degree of probability is represented by its appropriate fraction. If the statement that an event happens once in m times, is equivalently expressed by saying that its chance is 1/m, it follows that to say that it happens m times in m times, or every time without exception, is equivalent to saying that its chance is m/m or 1. Now an event that happens every time is of course one of whose occurrence we are 171 certain; hence the fraction which represents the ‘chance’ of an event which is certain becomes unity.
We can also point out here the justification for the common doctrine that certainty is represented by one, just as any given degree of probability is represented by its corresponding fraction. If we say that an event occurs once in m times, it can also be expressed as having a chance of 1/m. This means that if it happens m times in m trials, or every time without fail, it’s the same as saying its chance is m/m or 1. An event that occurs every time is something we are certain about; therefore, the fraction representing the ‘chance’ of an event that is certain becomes one.
It will be equally obvious that given that the chance that an event will happen is 1/m, the chance that it will not happen is 1 − 1/m or m − 1/m.
It will be just as clear that if the probability of an event occurring is 1/m, then the probability of it not occurring is 1 − 1/m, that is, (m − 1)/m.
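This whole family of addition and subtraction rules, including the corollary about certainty and the complement, can be checked with exact fractions. The sketch below uses Python's fractions module; the values 1/6 and 1/7 are arbitrary illustrative chances for two incompatible events, and the 1/10 and 1/100 figures are those of the disease example above.

```python
from fractions import Fraction

# Two incompatible events with chances 1/m and 1/n (illustrative values).
m, n = 6, 7
print(Fraction(1, m) + Fraction(1, n))      # 13/42, i.e. (m + n)/mn

# Subtraction corollary, with the figures from the disease example:
# chance of dying in the year 1/10, of this particular disease 1/100,
# hence of dying of some other cause 9/100.
print(Fraction(1, 10) - Fraction(1, 100))   # 9/100

# Certainty is represented by one, and the chance of the event failing
# is one minus the chance of its happening.
print(Fraction(m, m))                       # 1
print(1 - Fraction(1, m))                   # 5/6, i.e. (m - 1)/m
```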
§ 3. (2) We can also make inferences by multiplication or division. Suppose that two events instead of being incompatible, are connected together in the sense that one is contingent upon the occurrence of the other. Let us be told that a given proportion of the members of the series possess a certain property, and a given proportion again of these possess another property, then the proportion of the whole which possess both properties will be found by multiplying together the two fractions which represent the above two proportions. Of the inhabitants of London, twenty-five in a thousand, say, will die in the course of the year; we suppose it to be known also that one death in five is due to fever; we should then infer that one in 200 of the inhabitants will die of fever in the course of the year. It would of course be equally simple, by division, to make a sort of converse inference. Given the total mortality per cent. of the population from fever, and the proportion of fever cases to the aggregate of other cases of mortality, we might have inferred, by dividing one fraction by the other, what was the total mortality per cent. from all causes.
§ 3. (2) We can also draw conclusions through multiplication or division. Imagine that two events are not incompatible but are connected, meaning one depends on the occurrence of the other. If we know that a certain percentage of a group has a specific characteristic, and a certain percentage of that group has another characteristic, we can find the overall percentage that has both characteristics by multiplying the two fractions that represent those percentages. For example, in London, let's say twenty-five out of a thousand people will die within the year; we also assume it's known that one in five deaths is caused by fever. We would then conclude that one in 200 of the population will die from fever over the course of a year. Similarly, we could simply use division to make a sort of opposite inference. Given the overall mortality percentage of the population due to fever and the proportion of fever cases compared to all other causes of death, we could infer the total mortality percentage from all causes by dividing one fraction by the other.
The rule as given above is variously expressed in the language of Probability. Perhaps the simplest and best statement is that it gives us the rule of dependent events. That is; if the chance of one event is 1/m, and the chance that if it happens another will also happen 1/n, then the chance 172 of the latter is 1/mn. In this case it is assumed that the latter is so entirely dependent upon the former that though it does not always happen with it, it certainly will not happen without it; the necessity of this assumption however may be obviated by saying that what we are speaking of in the latter case is the joint event, viz. both together if they are simultaneous events, or the latter in consequence of the former, if they are successive.
The rule mentioned above is expressed in different ways in the language of Probability. One of the simplest and clearest explanations is that it provides the rule for dependent events. Specifically, if the probability of one event is 1/m, and the probability that another event will also occur if the first happens is 1/n, then the probability of the second event is 1/mn. In this situation, it is assumed that the second event is completely dependent on the first; even though it doesn't always occur with it, it definitely won't happen without it. However, we can avoid needing this assumption by stating that what we are referring to in the second case is the joint event, meaning both events together if they occur at the same time, or the second event as a result of the first, if they happen one after the other.
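The London example, together with the converse inference by division, can be written out with exact fractions; the figures are those given above, and the code simply multiplies and divides them.

```python
from fractions import Fraction

dies_this_year = Fraction(25, 1000)    # 25 inhabitants in 1000 die in the year
fever_given_death = Fraction(1, 5)     # one death in five is due to fever

# Chance of the joint event: dying in the year, and dying of fever.
dies_of_fever = dies_this_year * fever_given_death
print(dies_of_fever)                   # 1/200

# The converse inference divides one fraction by the other, recovering
# the total mortality from the fever mortality.
print(dies_of_fever / fever_given_death)   # 1/40, i.e. 25 in 1000
```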
§ 4. The above inferences are necessary, in the sense in which arithmetical inferences are necessary, and they do not demand for their establishment any arbitrary hypothesis. We assume in them no more than is warranted, and in fact necessitated by the data actually given to us, and make our inferences from these data by the help of arithmetic. In the simple examples given above nothing is required beyond arithmetic in its most familiar form, but it need hardly be added that in practice examples may often present themselves which will require much profounder methods than these. It may task all the resources of that higher and more abstract arithmetic known as algebra to extract a solution. But as the necessity of appeal to such methods as these does not touch the principles of this part of the subject we need not enter upon them here.
§ 4. The conclusions we draw are necessary in the same way that arithmetical conclusions are, and they don’t require any arbitrary assumptions to establish them. We rely only on what is justified and, in fact, required by the information we have, making our conclusions based on that information with the help of arithmetic. In the simple examples mentioned earlier, all that’s needed is arithmetic in its most basic form, but it's worth noting that in real-life situations, we may often encounter problems that call for much deeper methods than these. It may require all the skills of higher and more abstract arithmetic known as algebra to find a solution. However, since the need to use such methods does not affect the principles of this section of the topic, we won’t discuss them here.
§ 5. The formula next to be discussed stands upon a somewhat different footing from the above in respect of its cogency and freedom from appeal to experience, or to hypothesis. In the two former instances we considered cases in which the data were supposed to be given under the conditions that the properties which distinguished the different kinds of events whose frequency was discussed, were respectively known to be disconnected and known to be connected. Let us now suppose that no such conditions are given to us. 173 One man in ten, say, has black hair, and one in twelve is short-sighted; what conclusions could we then draw as to the chance of any given man having one only of these two attributes, or neither, or both? It is clearly possible that the properties in question might be inconsistent with one another, so as never to be found combined in the same person; or all the short-sighted might have black hair; or the properties might be allotted[3] in almost any other proportion whatever. If we are perfectly ignorant upon these points, it would seem that no inferences whatever could be drawn about the required chances.
§ 5. The formula we’re about to discuss takes a different angle from the previous ones in terms of its effectiveness and its lack of reliance on experience or assumptions. In the two earlier instances, we looked at situations where the data was assumed to be provided under conditions where the characteristics distinguishing different types of events were known to be separate or connected. Now, let’s consider a scenario where we don’t have any of those conditions. For example, one in ten people has black hair, and one in twelve is short-sighted; what conclusions can we draw about the likelihood of a specific person having just one of these traits, neither, or both? It’s entirely possible that these traits could be incompatible, meaning they’d never appear in the same individual; or all the short-sighted individuals could have black hair; or the traits might be distributed in almost any other proportion. If we have no information about these aspects, it seems that we can't draw any conclusions about the probabilities we’re interested in.
Inferences however are drawn, and practically, in most cases, quite justly drawn. An escape from the apparent indeterminateness of the problem, as above described, is found by assuming that, not merely will one-tenth of the whole number of men have black hair (for this was given as one of the data), but also that one-tenth alike of those who are and who are not short-sighted have black hair. Let us take a batch of 1200, as a sample of the whole. Now, from the data which were originally given to us, it will easily be seen that in every such batch there will be on the average 120 who have black hair, and therefore 1080 who have not. And here in strict right we ought to stop, at least until we have appealed again to experience; but we do not stop here. From data which we assume, we go on to infer that of the 120, 10 (i.e. one-twelfth of 120) will be short-sighted, and 110 (the remainder) will not. Similarly we infer that of the 1080, 174 90 are short-sighted, and 990 are not. On the whole, then, the 1200 are thus divided:—black-haired short-sighted, 10; short-sighted without black hair, 90; black-haired men who are not short-sighted, 110; men who are neither short-sighted nor have black hair, 990.
Inferences, however, are drawn, and in most cases, they are justly made. We can escape the unclear nature of the problem described above by assuming that not only will one-tenth of the total number of men have black hair (as this was one of the given data points), but also that one-tenth of both short-sighted and non-short-sighted individuals have black hair. Let's take a sample of 1200 as a representation of the whole. Based on the original data given to us, we can see that there will be, on average, 120 individuals with black hair and therefore 1080 without it. At this point, we should ideally pause and refer back to experience; however, we continue. From our assumptions, we can infer that out of the 120, 10 (which is one-twelfth of 120) will be short-sighted, and 110 (the remaining) will not be. Likewise, we infer that of the 1080, 90 will be short-sighted, and 990 will not. Altogether, the 1200 are divided as follows: black-haired and short-sighted, 10; short-sighted without black hair, 90; black-haired men who are not short-sighted, 110; men who are neither short-sighted nor have black hair, 990.
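The division of the batch of 1,200 can be reproduced directly, provided the extra assumption stated above is granted, namely that one person in twelve is short-sighted within the black-haired and the non-black-haired classes alike. A minimal Python sketch:

```python
batch = 1200

black_hair = batch // 10            # one man in ten: 120
not_black = batch - black_hair      # 1080

# The assumed independence: one in twelve is short-sighted within each class alike.
bh_and_ss = black_hair // 12        # 10  black-haired and short-sighted
bh_only = black_hair - bh_and_ss    # 110 black-haired, not short-sighted
ss_only = not_black // 12           # 90  short-sighted, not black-haired
neither = not_black - ss_only       # 990 with neither attribute

print(bh_and_ss, ss_only, bh_only, neither)       # 10 90 110 990
print(bh_and_ss + ss_only + bh_only + neither)    # 1200, the whole batch
```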
This rule, expressed in its most general form, in the language of Probability, would be as follows:—If the chances of a thing being p and q are respectively 1/m and 1/n, then the chance of its being both p and q is 1/mn, p and not q is n − 1/mn, q and not p is m − 1/mn, not p and not q is (m − 1)(n − 1)/mn, where p and q are independent. The sum of these chances is obviously unity; as it ought to be, since one or other of the four alternatives must necessarily exist.
This rule, in its simplest terms and using the language of Probability, can be stated like this: If the chances of something being p and q are 1/m and 1/n, then the chance of it being both p and q is 1/mn, the chance of p and not q is (n − 1)/mn, the chance of q and not p is (m − 1)/mn, and the chance of not p and not q is (m − 1)(n − 1)/mn, where p and q are independent. The total of these chances is obviously one, as it should be, since one or the other of the four options must occur.
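As a check on this general statement, the four chances can be computed with exact fractions for the values used above (m = 10 for black hair, n = 12 for short sight) and shown to sum to one.

```python
from fractions import Fraction

m, n = 10, 12
p, q = Fraction(1, m), Fraction(1, n)

both = p * q                     # 1/(mn)
p_not_q = p * (1 - q)            # (n - 1)/(mn)
q_not_p = (1 - p) * q            # (m - 1)/(mn)
neither = (1 - p) * (1 - q)      # (m - 1)(n - 1)/(mn)

print(both, p_not_q, q_not_p, neither)
print(both + p_not_q + q_not_p + neither)   # 1, since one of the four must occur
```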
§ 6. I have purposely emphasized the distinction between the inference in this case, and that in the two preceding, to an extent which to many readers may seem unwarranted. But it appears to me that where a science makes use, as Probability does, of two such very distinct sources of conviction as the necessary rules of arithmetic and the merely more or less cogent ones of Induction, it is hardly possible to lay too much stress upon the distinction. Few will be prepared to deny that very arbitrary assumptions have been made by many writers on the subject, and none will deny that in the case of what are called ‘inverse probabilities’ assumptions are sometimes made which are at least decidedly open to question. The best course therefore is to make a pause and stringent enquiry at the point at which the possibility of such error and doubtfulness first exhibits itself. These remarks apply to some of the best writers on the subject; in the case of inferior writers, or those who appeal to 175 Probability without having properly mastered its principles, we may go further. It would really not be asserting too much to say that they seem to think themselves justified in assuming that where we know nothing about the distribution of the properties alluded to we must assume them to be distributed as above described, and therefore apportion our belief in the same ratio. This is called ‘assuming the events to be independent,’ the supposition being made that the rule will certainly follow from this independence, and that we have a right, if we know nothing to the contrary, to assume that the events are independent.
§ 6. I have deliberately highlighted the difference between the reasoning in this case and that in the two previous ones, to an extent that may seem excessive to some readers. However, I believe that when a field of study, like Probability, relies on two very different sources of belief— the necessary rules of arithmetic and the more variable rules of Induction—it's important to emphasize this distinction. Few people would argue against the idea that many authors have made rather arbitrary assumptions on this topic, and no one can deny that in cases of what are called ‘inverse probabilities,’ some assumptions are made that are definitely questionable. Therefore, it’s wise to pause and carefully examine the point where the potential for such errors and uncertainties first arises. These comments apply to some of the leading authors in the field; however, if we consider lesser writers, or those who reference Probability without truly understanding its principles, we can be even more critical. It wouldn't be an overstatement to say that they seem to believe that when we know nothing about how the properties are distributed, we should assume they are distributed as described above, and thus set our beliefs accordingly. This is known as ‘assuming the events to be independent,’ based on the idea that this independence will lead to the rule being applied, and that we are justified in assuming the events are independent if we have no contrary information.
The validity of this last claim has already been discussed in the first chapter; it is only another of the attempts to construct à priori the series which experience will present to us, and one for which no such strong defence can be made as for the equality of heads and tails in the throws of a penny. But the meaning to be assigned to the ‘independence’ of the events in question demands a moment's consideration.
The validity of this last claim has already been discussed in the first chapter; it's just another attempt to construct à priori the series that experience will show us, and it doesn’t have as strong a defense as the equality of heads and tails in coin tosses. However, we need to take a moment to consider what we mean by the ‘independence’ of the events in question.
The circumstances of the problem are these. There are two different qualities, by the presence and absence respectively of each of which, amongst the individuals of a series, two distinct pairs of classes of these individuals are produced. For the establishment of the rule under discussion it was found that one supposition was both necessary and sufficient, namely, that the division into classes caused by each of the above distinctions should subdivide each of the classes created by the other distinction in the same ratio in which it subdivides the whole. If the independence be granted and so defined as to mean this, the rule of course will stand, but, without especial attention being drawn to the point, it does not seem that the word would naturally be so understood.
The situation with the problem is as follows. There are two different qualities, and by their presence and absence among individuals in a series, two distinct pairs of classes of these individuals are formed. To establish the rule we're discussing, it was found that one assumption was both necessary and sufficient: the division into classes caused by each of the distinctions should further divide each class created by the other distinction in the same way it divides the whole. If we accept this independence and define it in this way, then the rule will hold. However, unless attention is specifically drawn to it, it doesn’t seem that the term would be naturally understood this way.
§ 7. The above, then, being the fundamental rules of 176 inference in probability, the question at once arises, What is their relation to the great body of formulæ which are made use of in treatises upon the science, and in practical applications of it? The reply would be that these formulæ, in so far as they properly belong to the science, are nothing else in reality than applications of the above fundamental rules. Such applications may assume any degree of complexity, for owing to the difficulty of particular examples, in the form in which they actually present themselves, recourse must sometimes be made to the profoundest theorems of mathematics. Still we ought not to regard these theorems as being anything else than convenient and necessary abbreviations of arithmetical processes, which in practice have become too cumbersome to be otherwise performed.
§ 7. So, these are the basic rules of inference in probability. This leads us to the question: How do they relate to the wide range of formulas used in textbooks on the subject and in real-world applications? The answer is that these formulas, as far as they properly belong to the science, are essentially just applications of the fundamental rules mentioned above. These applications can get quite complex because, due to the challenges posed by specific examples in their actual forms, we sometimes need to rely on advanced mathematical theorems. Nonetheless, we shouldn't see these theorems as anything more than helpful and necessary shortcuts to arithmetic processes that have become too complicated to handle otherwise.
This explanation will account for some of the rules as they are ordinarily given, but by no means for all of them. It will account for those which are demonstrable by the certain laws of arithmetic, but not for those which in reality rest only upon inductive generalizations. And it can hardly be doubted that many rules of the latter description have become associated with those of the former, so that in popular estimation they have been blended into one system, of which all the separate rules are supposed to possess a similar origin and equal certainty. Hints have already been frequently given of this tendency, but the subject is one of such extreme importance that a separate chapter (that on Induction) must be devoted to its consideration.
This explanation will cover some of the rules as they are typically stated, but definitely not all of them. It will address those that can be proven by the established laws of arithmetic, but not those that actually rely only on inductive generalizations. It's hard to deny that many of the latter type of rules have become mixed up with the former, so that in popular belief, they have merged into a single system where all the individual rules are thought to have a similar origin and equal certainty. Hints have already been often provided about this tendency, but the topic is so important that a separate chapter (the one on Induction) will be dedicated to it.
§ 8. In establishing the validity of the above rules, we have taken as the basis of our investigations, in accordance with the general scheme of this work, the statistical frequency of the events referred to; but it was also shown that each formula, when established, might with equal propriety be expressed in the more familiar form of a fraction representing 177 the ‘chance’ of the occurrence of the particular event. The question may therefore now be raised, Can those writers who (as described in the last chapter) take as the primary subject of the science not the degree of statistical frequency, but the quantity of belief, with equal consistency make this the basis of their rules, and so also regard the fraction expressive of the chance as a merely synonymous expression? De Morgan maintains that whereas in ordinary logic we suppose the premises to be absolutely true, the province of Probability is to study ‘the effect which partial belief of the premises produces with respect to the conclusion.’ It would appear therefore as if in strictness we ought on this view to be able to determine this consequent diminution at first hand, from introspection of the mind, that is of the conceptions and beliefs which it entertains; instead of making any recourse to statistics to tell us how much we ought to believe the conclusion.
§ 8. To validate the above rules, we've based our research on the statistical frequency of the events discussed, following the overall framework of this work. However, it's also been demonstrated that each formula can be equally represented in a more familiar fraction that indicates the 'chance' of a specific event occurring. So, we can now consider the question: Can those writers who, as explained in the last chapter, focus on belief quantity rather than statistical frequency consistently use this as the foundation for their rules and view the fraction that expresses chance as just another way to say the same thing? De Morgan argues that while ordinary logic assumes the premises are absolutely true, the field of Probability examines ‘the effect that partial belief in the premises has on the conclusion.’ Thus, it seems that strictly speaking, we should be able to assess this resultant decrease directly, through introspection of the mind—meaning the ideas and beliefs it holds—rather than relying on statistics to inform us how much we should believe in the conclusion.
Any readers who have concurred with me in the general results of the last chapter, will naturally agree in the conclusion that nothing deserving the name of logical science can be extracted from any results of appeal to our consciousness as to the quantity of belief we entertain of this or that proposition. Suppose, for example, that one person in 100 dies on the sea passage out to India, and that one in 9 dies during a 5 years residence there. It would commonly be said that the chance that any one, who is now going out, has of living to start homewards 5 years hence, is 88/100; for his chance of getting there is 99/100; and of his surviving, if he gets there, 8/9; hence the result or dependent event is got by multiplying these fractions together, which gives 88/100. Here the real basis of the reasoning is statistical, and the processes or results are merely translated afterwards into fractions. But can we say the same when we look at the belief side of 178 the question? I quite admit the psychological fact that we have degrees of belief, more or less corresponding to the frequency of the events to which they refer. In the above example, for instance, we should undoubtedly admit on enquiry that our belief in the man's return was affected by each of the risks in question, so that we had less expectation of it than if he were subject to either risk separately; that is, we should in some way compound the risks. But what I cannot recognise is that we should be able to perform the process with any approach to accuracy without appeal to the statistics, or that, even supposing we could do so, we should have any guarantee of the correctness of the result without similar appeal. It appears to me in fact that but little meaning, and certainly no security, can be attained by so regarding the process of inference. The probabilities expressed as degrees of belief, just as those which are expressed as fractions, must, when we are put upon our justification, first be translated into their corresponding facts of statistical frequency of occurrence of the events, and then the inferences must be drawn and justified there. This part of the operation, as we have already shown, is mostly carried on by the ordinary rules of arithmetic. When we have obtained our conclusion we may, if we please, translate it back again into the subjective form, just as we can and do for convenience into the fractional, but I do not see how the process of inference can be conceived as taking place in that form, and still less how any proof of it can thus be given. If therefore the process of inference be so expressed it must be regarded as a symbolical process, symbolical of such an inference about things as has been described above, and it therefore seems to me more advisable to state and expound it in this latter form.
Any readers who agree with me on the overall results of the last chapter will naturally also agree with the conclusion that nothing worthy of being called logical science can be drawn from our conscious beliefs about how much we believe this or that statement. For example, if one person in 100 dies on the sea voyage to India, and one in 9 dies during a 5-year stay there, it’s commonly stated that the chance of someone currently going out living to return home 5 years later is 88/100; because their chance of getting there is 99/100; and the chance of surviving, once there, is 8/9; therefore, the result or dependent event is calculated by multiplying these fractions together, resulting in 88/100. Here, the actual basis of the reasoning is statistical, and the processes or results are simply converted into fractions later. But can we say the same when we consider the belief aspect of the question? I fully acknowledge the psychological fact that we have varying degrees of belief that somewhat align with the frequency of the events they refer to. In the above example, we would certainly admit upon inquiry that our belief in the man's return was influenced by each of the risks involved, leading us to have lower expectations of his return than if he faced either risk individually; in other words, we would somehow combine the risks. However, what I can’t accept is that we could accurately perform this process without referring to the statistics, or that even if we could, we would be guaranteed the correctness of the result without a similar reference. It seems to me that there is little meaning, and definitely no security, in interpreting the inference process this way. The probabilities expressed as degrees of belief, like those expressed as fractions, must, when we need to justify them, first be translated into the corresponding statistical facts about how frequently the events occur, after which the inferences must be made and justified accordingly. This part of the operation, as we’ve already shown, primarily relies on standard arithmetic rules. Once we have reached our conclusion, we can, if we wish, convert it back to the subjective form, just as we do for convenience with the fractional form, but I don’t see how the inference process can be thought to happen in that form, much less how any proof of it could be provided that way. Therefore, if the inference process is expressed this way, it must be understood as a symbolic process, a symbol of such an inference about things as has been described above, and so it seems more advisable to present and explain it in this latter format.
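The compound chance in this example can be verified in a couple of lines; the fractions are those given in the text.

```python
from fractions import Fraction

survives_voyage = 1 - Fraction(1, 100)   # 99/100 reach India
survives_stay = 1 - Fraction(1, 9)       # 8/9 survive the five years there

print(survives_voyage * survives_stay)   # 22/25, that is 88/100
```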
On Inverse Probability and the Rules required for it.
§ 9. It has been already stated that the only fundamental rules of inference in Probability are the two described in §§ 2, 3, but there are of course abundance of derivative rules, the nature and use of which are best obtained from the study of any manual upon the subject. One class of these derivative rules, however, is sufficiently distinct in respect of the questions to which it may give rise, to deserve special examination. It involves the distinction commonly recognised as that between Direct and Inverse Probability. It is thus introduced by De Morgan:—
§ 9. It has already been mentioned that the only basic rules of inference in Probability are the two outlined in §§ 2, 3, but there are certainly plenty of derived rules, which are best understood by studying any manual on the topic. However, one category of these derived rules is distinct enough in terms of the questions it raises to warrant special attention. It relates to the commonly recognized distinction between Direct and Inverse Probability. De Morgan introduces it as follows:—
“In the preceding chapter we have calculated the chances of an event, knowing the circumstances under which it is to happen or fail. We are now to place ourselves in an inverted position: we know the event, and ask what is the probability which results from the event in favour of any set of circumstances under which the same might have happened.”[4] The distinction might therefore be summarily described as that between finding an effect when we are given the causes, and finding a cause when we are given effects.
“In the previous chapter, we calculated the likelihood of an event based on the conditions that could lead to its occurrence or failure. Now, we will switch perspectives: we know the event has happened and will ask what probability arises from that event in support of any conditions under which it could have happened.”[4] The distinction can be briefly described as finding an effect when we know the causes, versus finding a cause when we know the effects.
On the principles of the science involved in the definition which was discussed and adopted in the earlier chapters of this work, the reader will easily infer that no such distinction as this can be regarded as fundamental. One common feature was traced in all the objects which were to be referred to Probability, and from this feature the possible rules of 180 inference can be immediately derived. All other distinctions are merely those of arrangement or management.
Based on the principles of the science discussed and accepted in the earlier chapters of this work, the reader can easily conclude that no distinction like this can be seen as fundamental. One common characteristic was identified in all the objects related to Probability, and from this characteristic, the possible rules of inference can be directly derived. All other distinctions are simply those of organization or handling.
But although the distinction is not by any means fundamental, it is nevertheless true that the practical treatment of such problems as those principally occurring in Inverse Probability, does correspond to a very serious source of ambiguity and perplexity. The arbitrary assumptions which appear in Direct Probability are not by any means serious, but those which invade us in a large proportion of the problems offered by Inverse Probability are both serious and inevitable.
But even though the distinction isn't fundamentally important, it's still true that the practical handling of issues mainly seen in Inverse Probability leads to a significant source of confusion and uncertainty. The arbitrary assumptions found in Direct Probability aren't crucial, but the ones that crop up in many of the problems presented by Inverse Probability are both serious and unavoidable.
§ 10. This will be best seen by the examination of special examples; as any, however simple, will serve our purpose, let us take the two following:—
§ 10. This will be best understood by looking at specific examples; since any example, no matter how simple, will work for us, let’s consider the two following:—
(1) A ball is drawn from a bag containing nine black balls and one white: what is the chance of its being the white ball?
(1) A ball is taken from a bag that has nine black balls and one white ball: what are the odds of it being the white ball?
(2) A ball is drawn from a bag containing ten balls, and is found to be white; what is the chance of there having been but that one white ball in the bag?
(2) A ball is taken from a bag that has ten balls, and it turns out to be white; what are the odds that there was only that one white ball in the bag?
The class of which the first example is a simple instance has been already abundantly discussed. The interpretation of it is as follows: If balls be continually drawn and replaced, the proportion of white ones to the whole number drawn will tend towards the fraction 1/10. The contemplated action is a single one, but we view it as one of the above series; at least our opinion is formed upon that assumption. We conclude that we are going to take one of a series of events which may appear individually fortuitous, but in which, in the long run, those of a given kind are one-tenth of the whole; this kind (white) is then singled out by anticipation. By stating that its chance is 1/10, we merely mean to assert this physical fact, together with such other mental 181 facts, emotions, inferences, &c., as may be properly associated with it.
The class, of which the first example is a straightforward case, has already been discussed at length. Here's the interpretation: If balls are continuously drawn and put back, the ratio of white ones to the total number drawn will approach the fraction 1/10. The action we're considering is a single one, but we look at it as part of the aforementioned series; at least, that's our assumption. We conclude that we're about to take one from a series of events that may seem random individually, but in the long run, those of a specific kind make up one-tenth of the whole; this type (white) is then singled out in advance. By stating that its probability is 1/10, we're simply affirming this physical fact, along with any other mental facts, feelings, inferences, and so on, that might be appropriately related to it.
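As an editorial aside, this long-run reading of the 1/10 chance can be imitated by a short simulation in Python (a sketch only, not part of the original text; the names are ours).

import random

# Sketch: draw with replacement from a bag of nine black balls and one white,
# and watch the proportion of white draws settle near 1/10.
bag = ['black'] * 9 + ['white']
draws = 100_000
whites = sum(random.choice(bag) == 'white' for _ in range(draws))
print(whites / draws)   # tends towards 0.1 as the number of draws grows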
§ 11. Have we to interpret the second example in a different way? Here also we have a single instance, but the nature of the question would seem to decide that the only series to which it can properly be referred is the following:—Balls are continually drawn from different bags each containing ten, and are always found to be white; what is ultimately the proportion of cases in which they will be found to have been taken from bags with only one white ball in them? Now it may be readily shown[5] that time has nothing to do with the question; omitting therefore the consideration of this element, we have for the two series from which our opinions in these two examples respectively are to be formed:—(1) balls of different colours presented to us in a given ultimate ratio; (2) bags with different contents similarly presented. From these data respectively we have to assign their due weight to our anticipations of (1) a white ball; (2) a bag containing but one white ball. So stated the problems would appear to be formally identical.
§ 11. Do we need to interpret the second example differently? Here we also have a single instance, but the nature of the question suggests that the only series it can be properly referred to is as follows:—Balls are continually drawn from different bags, each containing ten, and are always found to be white; what is ultimately the proportion of cases in which they were drawn from bags with only one white ball in them? It can be easily shown [5] that time is irrelevant to the question; therefore, omitting this element, we have two series from which our opinions in these two examples are to be formed:—(1) balls of different colors presented to us in a given ultimate ratio; (2) bags with different contents similarly presented. From these data, we need to assign the appropriate weight to our expectations of (1) a white ball; (2) a bag containing only one white ball. Stated this way, the problems appear to be formally identical.
When, however, we begin the practical work of solving them we perceive a most important distinction. In the first example there is not much that is arbitrary; balls would under such circumstance really come out more or less accurately in the proportion expected. Moreover, in case it should be objected that it is difficult to prove that they will do so, it does not seem an unfair demand to say that the balls are to be ‘well-mixed’ or ‘fairly distributed,’ or to introduce any of the other conditions by which, under the semblance of judging à priori, we take care to secure our prospect of a 182 series of the desired kind. But we cannot say the same in the case of the second example.
When we start actually solving them, we notice a crucial difference. In the first example, there isn't much that is arbitrary; under such circumstances the balls really would come out, more or less accurately, in the expected proportion. Moreover, if someone objects that it's hard to prove they will do so, it doesn't seem unreasonable to require that the balls be 'well-mixed' or 'fairly distributed,' or to introduce any of the other conditions by which, under the guise of judging à priori, we make sure of our prospect of a series of the desired kind. However, we can't say the same for the second example.
§ 12. The line of proof by which it is generally attempted to solve the second example is of this kind;—It is shown that there being one white ball for certain in the bag, the only possible antecedents are of ten kinds, viz. bags, each of which contains ten balls, but in which the white balls range respectively from one to ten in number. This of course imposes limits upon the kind of terms to be found in our series. But we want more than such limitations, we must know the proportions in which these terms are ultimately found to arrange themselves in the series. Now this requires an experience about bags which may not, and indeed in a large proportion of similar cases, cannot, be given to us. If therefore we are to solve the question at all we must make an assumption; let us make the following;—that each of the bags described above occurs equally often,—and see what follows. The bags being drawn from equally often, it does not follow that they will each yield equal numbers of white balls. On the contrary they will, as in the last example, yield them in direct proportion to the number of such balls which they contain. The bag with one white and nine black will yield a white ball once in ten times; that with two white, twice; and so on. The result of this, it will be easily seen, is that in 100 drawings there will be obtained on the average 55 white balls and 45 black. Now with those drawings that do not yield white balls we have, by the question, nothing to do, for that question postulated the drawing of a white ball as an accomplished fact. The series we want is therefore composed of those which do yield white. Now what is the additional attribute which is found in some members, and in some members only, of this series, and which we mentally anticipate? Clearly it is the attribute of 183 having been drawn from a bag which only contained one of these white balls. Of these there is, out of the 55 drawings, but one. Accordingly the required chance is 1/55. That is to say, the white ball will have been drawn from the bag containing only that one white, once in 55 times.
§ 12. The proof typically offered for the second example runs like this: since there is certainly one white ball in the bag, the only possible antecedents are of ten kinds, namely bags that each contain ten balls, in which the number of white balls ranges from one to ten. This sets limits on the kinds of outcomes we might find in our series. However, we need more than just these limits; we need to know the proportions in which these outcomes ultimately arrange themselves in the series. This requires an experience of bags that we may not have, and that in many similar cases we cannot obtain. Therefore, if we are to tackle the question at all, we have to make an assumption; let's assume that each of the bags described above occurs equally often, and see what follows. Even though the bags are drawn from equally often, it doesn't follow that they will each yield equal numbers of white balls. On the contrary, as in the previous example, they will yield them in direct proportion to how many white balls they contain. The bag with one white and nine black will yield a white ball once every ten times; the bag with two white will yield one twice; and so on. The result, as is easily seen, is that in 100 drawings there will be on average 1 + 2 + ... + 10 = 55 white balls and 45 black ones. Now, the drawings that don't yield white balls are no concern of ours, because the question assumes the drawing of a white ball as an accomplished fact. The series we want therefore consists only of the draws that produced a white ball. Now, what is the extra characteristic found in some members of this series, and only in some, which we mentally anticipate? Clearly, it's the characteristic of having been drawn from the bag that contained only that one white ball. Out of the 55 white draws, there is only one such instance. Therefore, the chance we are looking for is 1/55. In other words, the white ball will have come from the bag containing only that one white ball once in 55 times.
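The calculation in § 12 can be checked with a short Python sketch (an editorial illustration of the assumption that each kind of bag occurs equally often; the names are ours, not Venn's).

from fractions import Fraction

# Sketch of § 12: ten kinds of bag, holding 1 to 10 white balls out of ten,
# each assumed to be drawn from equally often; condition on drawing white.
white_rate = {k: Fraction(k, 10) for k in range(1, 11)}   # chance of white from each bag
total_white = sum(white_rate.values())                    # 55/10, i.e. 55 whites per 100 drawings
print(white_rate[1] / total_white)                        # 1/55, as in the text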
§ 13. Now, with the exception of the passage in italics, the process here is precisely the same as in the other example; it is somewhat longer only because we are not able to appeal immediately to experience, but are forced to try to deduce what the result will be, though the validity of this deduction itself rests, of course, ultimately upon experience. But the above passage is a very important one. It is scarcely necessary to point out how arbitrary it is.
§ 13. Now, except for the assumption just emphasized, that each kind of bag occurs equally often, the process here is exactly the same as in the other example; it's just a bit longer because we can't appeal to experience right away and have to deduce what the outcome will be, though the validity of this deduction itself ultimately rests on experience. But that assumption is a very important one, and it's hardly necessary to point out how arbitrary it is.
For is the supposition, that the different specified kinds of bags are equally likely, the most reasonable supposition under the circumstances in question? One man may think it is, another may take a contrary view. In fact in an excellent manual[6] upon the subject a totally different supposition is made, at any rate in one example; it is taken for granted in that instance, not that every possible number of black and white balls respectively is equally likely, but that every possible way of getting each number is equally likely, whence it follows that bags with an intermediate number of black and white balls are far more likely than those with an extreme number of either. On this supposition five black and five white being obtainable in 252 ways against the ten ways of obtaining one white and nine black, it follows that the chance that we have drawn from a bag of the latter description is much less than on the hypothesis first made. The chance, in fact, becomes now 1/512 instead of 1/55. In the one case each distinct result is considered 184 equally likely, in the other every distinct way of getting each result.
For is the assumption that the different specified kinds of bag are equally likely the most reasonable assumption under the circumstances? One person might think so, while another might take the opposite view. In fact, an excellent manual[6] on the subject makes a totally different assumption, at least in one example; in that instance it is taken for granted, not that every possible number of black and white balls is equally likely, but that every possible way of getting each number is equally likely. It follows that bags with an intermediate number of black and white balls are far more likely than those with an extreme number of either. On this assumption, a bag of five black and five white can arise in 252 ways, against only ten ways of getting one white and nine black, so the chance that we drew from a bag of the latter description is much smaller than on the first hypothesis. The chance, in fact, now becomes 1/512 instead of 1/55. In the one case each distinct result is treated as equally likely; in the other, every distinct way of getting each result.
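For comparison, here is the same check under the alternative assumption just described, in which every way of filling the bag is treated as equally likely, so that a bag with k white balls gets a weight equal to the number of ways it can arise (again an editorial sketch, not the manual's own working).

from fractions import Fraction
from math import comb

# Sketch: weight each bag composition by the number of ways it can arise
# (252 ways for five white, 10 ways for one white), then condition on white.
white_mass = {k: Fraction(comb(10, k) * k, 10) for k in range(0, 11)}
print(white_mass[1] / sum(white_mass.values()))   # 1/512, as in the text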
§ 14. Uncertainties of this kind are peculiarly likely to arise in these inverse probabilities, because when we are merely given an effect and told to look out for the chance of some assigned cause, we are often given no clue as to the relative prevalence of these causes, but are left to determine them on general principles. Give us either their actual prevalence in statistics, or the conditions by which such prevalence is brought about, and we know what to do; but without the help of such data we are reduced to guessing. In the above example, if we had been told how the bag had been originally filled, that is by what process, or under what circumstances, we should have known what to do. If it had been filled at random from a box containing equal numbers of black and white balls, the supposition in Mr Whitworth's example is the most reasonable; but in the absence of any such information as this we are entirely in the dark, and the supposition made in § 12 is neither more nor less trustworthy and reasonable than many others, though it doubtless possesses the merit of superior simplicity.[7] If the reader will recur to Ch. V. §§ 4, 5, he will find this particular difficulty fully explained. Everybody practically admits that a certain characteristic arrangement or distribution has to be introduced at some prior stage; and that, as soon as this stage has been selected, there are no further theoretic difficulties to be encountered. But when we come to decide, in examples of the class in question, at what stage it is most reasonable 185 to make our postulate, we are often left without any very definite or rational guidance.
§ 14. Uncertainties like these are especially likely to come up in these inverse probabilities because when we're only given an effect and asked to consider the chance of a specific cause, we often receive no information about how common these causes are. We have to figure them out based on general principles. If we had either their actual prevalence in statistics or the conditions that create such prevalence, we would know what to do; but without that data, we're just guessing. In the above example, if we had been informed how the bag was originally filled, that is, by what process or under what circumstances, we would have known what to do. If it had been filled randomly from a box with equal numbers of black and white balls, the assumption in Mr. Whitworth's example would be the most reasonable. But without any such information, we're completely in the dark, and the assumption made in § 12 is neither more nor less trustworthy and reasonable than many others, though it certainly has the advantage of being simpler.[7] If the reader goes back to Ch. V. §§ 4, 5, they will find this particular issue explained in detail. Practically everyone acknowledges that a certain typical arrangement or distribution must be introduced at some earlier stage; and once this stage is chosen, there are no additional theoretical issues to deal with. However, when it comes to deciding, in examples like this, at what stage it makes the most sense to make our assumption, we often find ourselves without clear or rational guidance.
§ 15. When, however, we take what may be called, by comparison with the above purely artificial examples, instances presented by nature, much of this uncertainty will disappear, and then all real distinction between direct and inverse probability will often vanish. In such cases the causes are mostly determined by tolerably definite rules, instead of being a mere cloud-land of capricious guesses. We may either find their relative frequency of occurrence by reference to tables, or may be able to infer it by examination of the circumstances under which they are brought about. Almost any simple example would then serve to illustrate the fact that under such circumstances the distinction between direct and inverse probability disappears altogether, or merely resolves itself into one of time, which, as will be more fully shown in a future chapter, is entirely foreign to our subject.
§ 15. However, when we look at examples from nature, as opposed to the purely artificial ones mentioned earlier, much of this uncertainty fades away, and the distinction between direct and inverse probability often disappears. In these cases, the causes are usually determined by fairly clear rules, rather than being just random guesses. We can either find their relative frequency using tables or infer it by examining the circumstances that lead to them. Almost any simple example would illustrate that, in such situations, the distinction between direct and inverse probability completely vanishes or simply becomes one of time, which, as will be explained in more detail in a future chapter, is not really relevant to our topic.
It is not of course intended to imply that difficulties similar to those mentioned above do not occasionally invade us here also. As already mentioned, they are, if not inherent in the subject, at any rate almost unavoidable in comparison with the simpler and more direct procedure of determining what is likely to follow from assigned conditions. What is meant is that so long as we confine ourselves within the comparatively regular and uniform field of natural sequences and co-existences, statistics of causes may be just as readily available as those of effects. There will not be much more that is arbitrary in the one than in the other. But of course this security is lost when, as will be almost immediately noticed, what may be called metaphysical rather than natural causes are introduced into the enquiry.
This isn't meant to imply, of course, that difficulties similar to those mentioned above don't occasionally invade us here as well. As stated earlier, they are, if not inherent to the subject, at least almost unavoidable when compared to the simpler and more straightforward process of figuring out what is likely to happen based on given conditions. What is meant is that as long as we stick to the relatively regular and uniform field of natural sequences and co-existences, statistics of causes can be just as easily accessible as statistics of effects. There won't be much more that is arbitrary in one than in the other. However, this sense of security is lost once, as will be noticed almost immediately, metaphysical rather than natural causes are introduced into the inquiry.
For instance, it is known that in London about 20 people 186 die per thousand each year. Suppose it also known that of every 100 deaths there are about 4 attributable to bronchitis. The odds therefore against any unknown person dying of bronchitis in a given year are 1249 to 1. Exactly the same statistics are available to solve the inverse problem:—A man is dead, what is the chance that he died of bronchitis? Here, since the man's death is taken for granted, we do not require to know the general average mortality. All that we want is the proportional mortality from the disease in question as given above. If Probability dealt only with inferences founded in this way upon actual statistics, and these tolerably extensive, it is scarcely likely that any distinction such as this between direct and inverse problems would ever have been drawn.
For example, it's known that in London around 20 people per thousand die each year. Suppose it's also known that of every 100 deaths, about 4 are due to bronchitis. The odds against any unknown person dying of bronchitis in a given year are therefore 1249 to 1. Exactly the same statistics serve to answer the inverse question: a man is dead, what is the chance that he died of bronchitis? Here, since the man's death is taken for granted, we don't need to know the general average mortality. All we require is the proportional mortality from the disease in question, as given above. If Probability dealt only with inferences founded in this way on actual statistics, and those reasonably extensive, it's scarcely likely that any distinction such as this between direct and inverse problems would ever have been drawn.
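The two bronchitis questions can be set side by side in a few lines of Python (an editorial sketch using only the figures given above; the names are ours).

from fractions import Fraction

# Direct question: chance that an unknown Londoner dies of bronchitis this year.
p_dies = Fraction(20, 1000)                  # general yearly mortality
p_bronchitis_given_death = Fraction(4, 100)  # share of deaths due to bronchitis
p_direct = p_dies * p_bronchitis_given_death
print(1 / p_direct - 1)                      # 1249, i.e. odds of 1249 to 1 against

# Inverse question: a man is dead; chance that he died of bronchitis.
print(p_bronchitis_given_death)              # 1/25; the general mortality is not needed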
§ 16. Considered therefore as a contribution to the theory of the subject, the distinction between Direct and Inverse Probability must be abandoned. When the appropriate statistics are at hand the two classes of problems become identical in method of treatment, and when they are not we have no more right to extract a solution in one case than in the other. The discussion however may serve to direct renewed attention to another and far more important distinction. It will remind us that there is one class of examples to which the calculus of Probability is rightfully applied, because statistical data are all we have to judge by; whereas there are other examples in regard to which, if we will insist upon making use of these rules, we may either be deliberately abandoning the opportunity of getting far more trustworthy information by other means, or we may be obtaining solutions about matters on which the human intellect has no right to any definite quantitative opinion.
§ 16. Therefore, when considering it as a contribution to the theory of the subject, we must let go of the distinction between Direct and Inverse Probability. When we have the right statistics available, the two types of problems become the same in how we approach them; and when we don’t have that data, we have no more justification for finding a solution in one case than in the other. However, this discussion may help refocus our attention on another, much more significant distinction. It serves as a reminder that there is a set of examples where the Probability calculus is correctly applied, because we only have statistical data to rely on. In contrast, there are other examples where, if we insist on using these rules, we might be intentionally missing out on more reliable information through other methods, or we might be finding solutions about issues where the human mind has no right to form a definite quantitative opinion.
§ 17. The nearest approach to any practical justification of such judgments that I remember to have seen is afforded 187 by cases of which the following example is a specimen:— “Of 10 cases treated by Lister's method, 7 did well and 3 suffered from blood-poisoning: of 14 treated with ordinary dressings, 9 did well and 5 had blood-poisoning; what are the odds that the success of Lister's method was due to chance?”.[8] Or, to put it into other words, a short experience has shown an actual superiority in one method over the other: what are the chances that an indefinitely long experience, under similar conditions, will confirm this superiority?
§ 17. The closest thing I've seen to a practical justification for such judgments is found in cases like the following example:— “Out of 10 cases treated using Lister's method, 7 had positive outcomes and 3 experienced blood poisoning; out of 14 treated with standard dressings, 9 did well and 5 had blood poisoning. What are the chances that Lister's method was successful purely by chance?” [8] In other words, a brief experience has shown that one method is better than the other: what are the odds that an indefinitely long experience, under the same conditions, will support this advantage?
The proposer treated this as a ‘bag and balls’ problem, analogous to the following: 10 balls from one bag gave 7 white and 3 black, 14 from another bag gave 9 white and 5 black: what is the chance that the actual ratio of white to black balls was greater in the former than in the latter?—this actual ratio being of course considered a true indication of what would be the ultimate proportions of white and black drawings. This seems to me to be the only reasonable way of treating the problem, if it is to be considered capable of numerical solution at all.
The proposer viewed this as a ‘bag and balls’ problem, similar to the following: taking 10 balls from one bag resulted in 7 white and 3 black, while 14 from another bag gave 9 white and 5 black. What’s the chance that the actual ratio of white to black balls was higher in the first bag than in the second?—this actual ratio is seen as a true indicator of what the ultimate proportions of white and black draws would be. This, in my opinion, is the only sensible way to approach the problem if it is to be considered solvable numerically at all.
Of course the inevitable assumption has to be made here about the equal prevalence of the different possible kinds of bag,—or, as the supporters of the justice of the calculation would put it, of the obligation to assume the equal à priori likelihood of each kind,—but I think that in this particular example the arbitrariness of the assumption is less than usual. This is because the problem discusses simply a balance between two extremely similar cases, and there is a certain set-off against each other of the objectionable assumptions 188 on each side. Had one set of experiments only been proposed, and had we been asked to evaluate the probability of continued repetition of them confirming their verdict, I should have felt all the scruples I have already mentioned. But here we have got two sets of experiments carried on under almost exactly similar circumstances, and there is therefore less arbitrariness in assuming that their unknown conditions are tolerably equally prevalent.
Of course, we have to make the inevitable assumption here about the equal chances of the different types of bag—or, as the supporters of the fairness of the calculation would say, about the obligation to assume the equal à priori likelihood of each type—but I think that in this particular example, the arbitrariness of the assumption is less than usual. This is because the problem simply discusses a balance between two very similar cases, and there's a certain offset against each other of the questionable assumptions on either side. If only one set of experiments had been proposed, and we were asked to evaluate the probability of them continuing to confirm their conclusion, I would have felt all the concerns I've already mentioned. But here we have two sets of experiments conducted under almost exactly similar circumstances, so there's less arbitrariness in assuming that their unknown conditions are fairly equally prevalent.
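One modern way to make the 'bag and balls' reading of this problem explicit is a small Monte Carlo sketch in Python. This is an editorial illustration only: it assumes a continuous uniform prior over each unknown success rate (one of the equal-likelihood assumptions the text warns is arbitrary) and estimates the chance that Lister's true rate of doing well exceeds that of the ordinary dressings; as the footnote notes, the original contributors made different assumptions and reached different figures.

import random

# Sketch under a uniform prior: the posterior for a rate with s successes and
# f failures is Beta(s + 1, f + 1); compare samples from the two posteriors.
def sample_rate(successes, failures):
    return random.betavariate(successes + 1, failures + 1)

trials = 200_000
wins = sum(sample_rate(7, 3) > sample_rate(9, 5) for _ in range(trials))
print(wins / trials)   # rough estimate of the chance Lister's method really is better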
§ 18. Examples of the description commonly introduced seem objectionable enough, but if we wish to realize to its full extent the vagueness of some of the problems submitted to this Inverse Probability, we have not far to seek. In natural as in artificial examples, where statistics are unattainable the enquiry becomes utterly hopeless, and all attempts at laying down rules for calculation must be abandoned. Take, for instance, the question which has given rise to some discussion,[9] whether such and such groups of stars are or are not to be regarded as the results of an accidental distribution; or the still wider and vaguer question, whether such and such things, or say the world itself, have been produced by chance?
§ 18. Examples of the descriptions usually presented seem quite problematic, but if we want to fully grasp the ambiguity of some of the issues addressed by this Inverse Probability, it's not hard to find. In both natural and artificial examples, where statistics are missing, the inquiry becomes completely futile, and any attempts to establish calculation rules must be discarded. Take, for example, the question that has sparked some debate, whether certain groups of stars should be considered the result of random distribution; or the even broader and more ambiguous question of whether certain things, or even the world itself, came about by chance?
In cases of this kind the insuperable difficulty is in determining what sense exactly is to be attached to the words ‘accidental’ and ‘random’ which enter into the discussion. Some account was given, in the fourth chapter, of their scientific and conventional meaning in Probability. There seem to be the same objections to generalizing them out of such relation, as there is in metaphysics to talking of the Infinite or the Absolute. Infinite magnitude, or infinite 189 power, one can to some extent comprehend, or at least one may understand what is being talked about, but ‘the infinite’ seems to me a term devoid of meaning. So of anything supposed to have been produced at random: tell us the nature of the agency, the limits of its randomness and so on, and we can venture upon the problem, but without such data we know not what to do. The further consideration of such a problem might, I think, without arrogance be relegated to the Chapter on Fallacies. Accordingly any further remarks which I have to make upon the subject will be found there, and at the conclusion of the chapter on Causation and Design.
In situations like this, the main challenge is figuring out what exactly we mean by the terms 'accidental' and 'random' that come up in the discussion. In the fourth chapter, we discussed their scientific and conventional meanings in Probability. There appear to be similar issues with trying to generalize these terms outside that context, just as there are in metaphysics when talking about the Infinite or the Absolute. While we can somewhat grasp infinite magnitude or infinite power, or at least have an idea of what is being talked about, 'the infinite' seems like a term without meaning to me. The same goes for anything thought to be produced at random: if you tell us the nature of the process, the limits of its randomness, and so on, then we can tackle the problem, but without that information, we don’t know what to do. I believe that further exploration of such a problem could reasonably be assigned to the Chapter on Fallacies. Therefore, any additional comments I have on the topic will be found there, as well as at the end of the chapter on Causation and Design.
1 It might be more accurate to speak of ‘incompatible hypotheses with respect to any individual case’, or ‘mutually exclusive classes of events’.
1 It might be more accurate to talk about ‘incompatible hypotheses regarding any individual case’ or ‘mutually exclusive categories of events’.
2 The examples, of this kind, referring to human mortality are taken from the Carlisle tables. These differ considerably, as is well known, from other tables, but we have the high authority of De Morgan for regarding them as the best representative of the average mortality of the English middle classes at the present day.
2 The examples referring to human mortality are taken from the Carlisle tables. These differ quite a bit from other tables, but we have the strong endorsement of De Morgan for considering them as the best representation of the average mortality of the English middle classes today.
3 I say, almost any proportion, because, as may easily be seen, arithmetic imposes certain restrictions upon the assumptions that can be made. We could not, for instance, suppose that all the black-haired men are short-sighted, for in any given batch of men the former are more numerous. But the range of these restrictions is limited, and their existence is not of importance in the above discussion.
3 I say almost any proportion because, as is easily seen, arithmetic imposes certain restrictions on the assumptions that can be made. We could not, for instance, suppose that all the black-haired men are short-sighted, since in any given group of men the black-haired are more numerous than the short-sighted. But the range of these restrictions is limited, and their existence isn't important to the discussion above.
4 Essay on Probabilities, p. 53. I have been reminded that in his article on Probability in the Encyclopædia Metropolitana he has stated that such rules involve no new principle.
4 Essay on Probabilities, p. 53. I’ve been reminded that in his article on Probability in the Encyclopædia Metropolitana, he stated that these rules don’t involve any new principles.
5 This point will be fully discussed in a future chapter, after the general stand-point of an objective system of logic has been explained and illustrated.
5 This point will be discussed in detail in a later chapter, once the overall perspective of an objective system of logic has been explained and illustrated.
6 Whitworth's Choice and Chance, Ed. II., p. 123. See also Boole's Laws of Thought, p. 370.
6 Whitworth's Choice and Chance, 2nd ed., p. 123. See also Boole's Laws of Thought, p. 370.
7 Opinions differ about the defence of such suppositions, as they do about the nature of them. Some writers, admitting the above assumption to be doubtful, call it the most impartial hypothesis. Others regard it as a sort of mean hypothesis.
7 People have different opinions about defending these assumptions, just as they do about what those assumptions actually are. Some writers, who consider the assumption to be questionable, refer to it as the most unbiased hypothesis. Others see it as a kind of average hypothesis.
8 Educational Times; Reprint, Vol. xxxvii. p. 40. The question was proposed by Dr. Macalister and gave rise to considerable controversy. As usual with problems of this inverse kind hardly any two of the writers were in agreement as to the assumptions to be made, or therefore as to the numerical estimate of the odds.
8 Educational Times; Reprint, Vol. xxxvii. p. 40. The question was raised by Dr. Macalister and sparked a lot of debate. As is often the case with inverse problems of this kind, hardly any two of the contributors agreed on the assumptions to be made, or therefore on the numerical estimate of the odds.
9 See Todhunter's History, pp. 333, 4.
9 See Todhunter's History, pp. 333–4.
There is an interesting discussion upon this question by the late J. D. Forbes in a paper in the Philosophical Magazine for Dec. 1850. It was replied to in a subsequent number by Prof. Donkin.
There’s an interesting discussion on this question by the late J. D. Forbes in a paper in the Philosophical Magazine from December 1850. Prof. Donkin replied in a later issue.
CHAPTER 8.
THE RULE OF SUCCESSION.[*]
* A word of apology may be offered here for the introduction of a new name. The only other alternative would have been to entitle the rule one of Induction. But such a title I cannot admit, for reasons which will be almost immediately explained.
* A word of apology may be offered here for bringing in a new name. The only other option would have been to call the rule one of Induction. But I can't accept that title, for reasons that will be explained almost immediately.
§ 1. In the last chapter we discussed at some length the nature of the kinds of inference in Probability which correspond to those termed, in Logic, immediate and mediate inferences. We ascertained what was the meaning of saying, for example, that the chance of any given man A. B. dying in a year is 1/3, when concluded from the general proposition that one man out of three in his circumstances dies. We also discussed the nature and evidence of rules of a more completely inferential character. But to stop at this point would be to take a very imperfect view of the subject. If Probability is a science of real inference about things, it must surely lead up to something more than such merely formal conclusions; we must be able, if not by means of it, at any rate by some means, to step beyond the limits of what has been actually observed, and to draw conclusions about what is as yet unobserved. This leads at once to the question, What is the connection of Probability with Induction? This is a question into which it will be necessary to enter now with some minuteness.
§ 1. In the last chapter, we talked at length about the types of inference in Probability that correspond to what are called, in Logic, immediate and mediate inferences. We clarified what it means, for instance, to say that the chance of a specific person A.B. dying within a year is 1/3, based on the general statement that one in three men in similar circumstances dies. We also explored the nature and evidence of rules that are more wholly inferential. However, stopping here would provide an incomplete understanding of the topic. If Probability is truly a science of real inference about phenomena, it must lead us to more than just formal conclusions; we should be able, if not through it, then through some means, to go beyond what has been observed and make predictions about what has not yet been seen. This immediately raises the question, What is the relationship between Probability and Induction? This is a topic we need to delve into now with some detail.
That there is a close connection between Probability and Induction, must have been observed by almost every one 191 who has treated of either subject; I have not however seen any account of this connection that seemed to me to be satisfactory. An explicit description of it should rather be sought in treatises upon the narrower subject, Probability; but it is precisely here that the most confusion is to be found. The province of Probability being somewhat narrow, incursions have been constantly made from it into the adjacent territory of Induction. In this way, amongst the arithmetical rules discussed in the last chapter, others have been frequently introduced which ought not in strictness to be classed with them, as they rest on an entirely different basis.
There is a clear link between Probability and Induction that almost everyone who has studied either topic has noticed. However, I haven't come across any explanation of this connection that I found satisfactory. A detailed description should ideally be found in works focused specifically on Probability, but that's exactly where the most confusion arises. Since the field of Probability is somewhat limited, there have been ongoing overlaps into the related area of Induction. As a result, among the mathematical rules discussed in the last chapter, others have often been included that shouldn't strictly be categorized with them, as they are based on a completely different foundation.
§ 2. The origin of such confusion is easy of explanation; it arises, doubtless, from the habit of laying undue stress upon the subjective side of Probability, upon that which treats of the quantity of our belief upon different subjects and the variations of which that quantity is susceptible. It has been already urged that this variation of belief is at most but a constant accompaniment of what is really essential to Probability, and is moreover common to other subjects as well. By defining the science therefore from this side these other subjects would claim admittance into it; some of these, as Induction, have been accepted, but others have been somewhat arbitrarily rejected. Our belief in a wider proposition gained by Induction is, prior to verification, not so strong as that of the narrower generalization from which it is inferred. This being observed, a so-called rule of probability has been given by which it is supposed that this diminution of assent could in many instances be calculated.
§ 2. The cause of this confusion is easy to explain; it likely comes from focusing too much on the subjective aspect of Probability, which deals with how much we believe in different subjects and the way that belief can change. It has already been pointed out that this change in belief is mostly just a constant feature of what is actually important to Probability, and it also appears in other areas. By defining the science from this perspective, these other areas would want to be included; some of them, like Induction, have been accepted, but others have been rejected somewhat arbitrarily. Our belief in a broader proposition gained through Induction is, before it gets verified, not as strong as that of the narrower generalization it comes from. Noticing this, a so-called rule of probability has been proposed to supposedly calculate this decrease in agreement in many cases.
But time also works changes in our conviction; our belief in the happening of almost every event, if we recur to it long afterwards, when the evidence has faded from the mind, is 192 less strong than it was at the time. Why are not rules of oblivion inserted in treatises upon Probability? If a man is told how firmly he ought to expect the tide to rise again, because it has already risen ten times, might he not also ask for a rule which should tell him how firm should be his belief of an event which rests upon a ten years' recollection?[1] The infractions of a rule of this latter kind could scarcely be more numerous and extensive, as we shall see presently, than those of the former confessedly are. The fact is that the agencies, by which the strength of our conviction is modified, are so indefinitely numerous that they cannot all be assembled into one science; for purposes of definition therefore the quantity of belief had better be omitted from consideration, or at any rate regarded as a mere appendage, and the science, defined from the other or statistical side of the subject, in which, as has been shown, a tolerably clear boundary-line can be traced.
But time also changes our beliefs; our confidence in almost every event, when we think back on it long after, when the evidence has faded from memory, is weaker than it was at the moment. Why aren't rules of forgetfulness included in discussions about Probability? If someone is advised how confidently he should expect the tide to rise again because it has come in ten times before, might he also want a guideline that tells him how certain he should feel about an event based on a ten-year-old memory?[1] The violations of a rule like this could hardly be more numerous and extensive, as we will see shortly, than those of the former. The truth is that the factors that influence the strength of our beliefs are so countless that they can't all be grouped into one science; for the sake of definition, it's better for the level of belief to be left out of consideration or at least seen as a simple addition, with the science defined from the other, more statistical aspect of the subject, where, as we've demonstrated, a fairly clear boundary can be established.
§ 3. Induction, however, from its importance does merit a separate discussion; a single example will show its bearing upon this part of our subject. We are considering the prospect of a given man, A. B. living another year, and we find that nine out of ten men of his age do survive. In forming an opinion about his surviving, however, we shall find that there are in reality two very distinct causes which aid in determining the strength of our conviction; distinct, but in practice so intimately connected that we are very apt to overlook one, and attribute the effect entirely to the other.
§ 3. Induction is important enough to deserve its own discussion; one example will highlight its relevance to our topic. We are looking at the likelihood of a specific man, A.B., living for another year, and we observe that nine out of ten men his age do survive. However, when we form an opinion about his chances of surviving, we discover that there are actually two very different factors that influence our level of certainty; these factors are distinct, but in practice, they are so closely linked that we often overlook one and attribute the outcome solely to the other.
(I.) There is that which strictly belongs to Probability; 193 that which (as was explained in Chap VI.) measures our belief of the individual case as deduced from the general proposition. Granted that nine men out of ten of the kind to which A. B. belongs do live another year, it obviously does not follow at all necessarily that he will. We describe this state of things by saying, that our belief of his surviving is diminished from certainty in the ratio of 10 to 9, or, in other words, is measured by the fraction 9/10.
(I.) There is the part that strictly belongs to Probability; the part which (as explained in Chapter VI.) measures our belief in the individual case as deduced from the general proposition. Granted that nine out of ten people of the kind to which A. B. belongs live another year, it doesn't follow at all necessarily that he will. We describe this situation by saying that our belief in his survival is diminished from certainty in the ratio of 10 to 9, or, in other words, is measured by the fraction 9/10.
(II.) But are we certain that nine men out of ten like him will live another year? we know that they have so survived in time past, but will they continue to do so? Since A. B. is still alive it is plain that this proposition is to a certain extent assumed, or rather obtained by Induction. We cannot however be as certain of the inductive inference as we are of the data from which it was inferred. Here, therefore, is a second cause which tends to diminish our belief; in practice these two causes always accompany each other, but in thought they can be separated.
(II.) But are we certain that nine out of ten men like him will live another year? We know they have survived in that proportion in the past, but will they continue to do so? Since A. B. is still alive, it's clear that this proposition is to some extent assumed, or rather obtained by Induction. We can't, however, be as certain of the inductive inference as we are of the data from which it was inferred. Here, then, is a second cause that tends to weaken our belief; in practice these two causes always accompany each other, but in thought they can be separated.
The two distinct causes described above are very liable to be confused together, and the class of cases from which examples are necessarily for the most part drawn increases this liability. The step from the statement ‘all men have died in a certain proportion’ to the inference ‘they will continue to die in that proportion’ is so slight a step that it is unnoticed, and the diminution of conviction that should accompany it is unsuspected. In what are called à priori examples the step is still slighter. We feel so certain about the permanence of the laws of mechanics, that few people would think of regarding it as an inference when they believe that a die will in the long run turn up all its faces equally often, because other dice have done so in time past.
The two distinct causes mentioned above are very likely to get mixed up, and the types of cases from which examples are mostly drawn make this confusion even more likely. The leap from the statement ‘all men have died in a certain proportion’ to the conclusion ‘they will continue to die in that proportion’ is such a small leap that it often goes unnoticed, and the reduced certainty that should come with it is overlooked. In what are called à priori examples, the leap is even smaller. We feel so confident about the consistency of the laws of mechanics that few people would consider it an inference when they believe that a die will, over time, show all its faces equally often, simply because other dice have done so in the past.
§ 4. It has been already pointed out (in Chapter VI.) 194 that, so far as concerns that definition of Probability which regards it as the science which discusses the degree and modifications of our belief, the question at issue seems to be simply this:—Are the causes alluded to above in (II.) capable of being reduced to one simple coherent scheme, so that any universal rules for the modification of assent can be obtained from them? If they are, strong grounds will have been shown for classing them with (I.), in other words, for considering them as rules of probability. Even then they would be rules practically of a very different kind, contingent instead of necessary (if one may use these terms without committing oneself to any philosophical system), but this objection might perhaps be overruled by the greater simplicity secured by classing them together. This view is, with various modifications, generally adopted by writers on Probability, or at least, as I understand the matter, implied by their methods of definition and treatment. Or, on the other hand, must these causes be regarded as a vast system, one might almost say a chaos, of perfectly distinct agencies; which may indeed be classified and arranged to some extent, but from which we can never hope to obtain any rules of perfect generality which shall not be subject to constant exception? If so, but one course is left; to exclude them all alike from Probability. In other words, we must assume the general proposition, viz. that which has been described throughout as our starting-point, to be given to us; it may be obtained by any of the numerous rules furnished by Induction, or it may be inferred deductively, or given by our own observation; its value may be diminished by its depending upon the testimony of witnesses, or its being recalled by our own memory. Its real value may be influenced by these causes or any combinations of them; but all these are preliminary questions with which we have nothing directly 195 to do. We assume our statistical proposition to be true, neglecting the diminution of its value by the process of attainment; we take it up first at this point and then apply our rules to it. We receive it in fact, if one may use the expression, ready-made, and ask no questions about the process or completeness of its manufacture.
§ 4. It has already been pointed out (in Chapter VI.) that, as far as the definition of Probability as the science of the degree and changes of our belief is concerned, the main question seems to be this: are the causes mentioned above in (II.) capable of being reduced to one clear, coherent scheme, so that we can derive universal rules for the modification of assent from them? If they can, it would provide a solid basis for classing them with (I.), that is, for considering them as rules of probability. Even then, they would be rules of a practically very different kind, contingent rather than necessary (if we can use these terms without committing to any particular philosophical system), but this objection might be outweighed by the simplicity gained by grouping them together. This perspective is, with various modifications, generally adopted by writers on Probability, or at least, as I understand it, implied in how they define and approach the subject. Or, alternatively, must we see these causes as a vast system, almost a chaos, of completely separate agencies, which can indeed be classified and organized to some extent, but from which we can never expect to derive universally applicable rules that won't have constant exceptions? If that's the case, only one course is left: to exclude them all alike from Probability. In other words, we must take the general proposition, namely what has been described throughout as our starting point, as given; it might come from any of the many rules provided by Induction, or be inferred deductively, or be given by our own observation; its value may be lessened by its depending on the testimony of witnesses, or by its being recalled from our own memory. Its real value may be influenced by these causes or any combination of them; but all these are preliminary questions with which we have nothing directly to do. We assume our statistical proposition to be true, ignoring the reduction of its value through the process by which it was reached; we take it up first at this point and then apply our rules to it. We receive it, in fact, if one may put it this way, ready-made, and ask no questions about the process or completeness of its manufacture.
§ 5. It is not to be supposed, of course, that any writers have seriously attempted to reduce to one system of calculation all the causes mentioned above, and to embrace in one formula the diminution of certainty to which the inclusion of them subjects us. But on the other hand, they have been unwilling to restrain themselves from all appeal to them. From an early period in the study of the science attempts have been made to proceed, by the Calculus of Probability, from the observed cases to adjacent and similar cases. In practice, as has been already said, it is not possible to avoid some extension of this kind. But it should be observed, that in these instances the divergence from the strict ground of experience is not in reality recognized, at least not as a part of our logical procedure. We have, it is true, wandered somewhat beyond it, and so obtained a wider proposition than our data strictly necessitated, and therefore one of less certainty. Still we assume the conclusion given by induction to be equally certain with the data, or rather omit all notice of the divergence from consideration. It is assumed that the unexamined instances will resemble the examined, an assumption for which abundant warrant may exist; the theory of the calculation rests upon the supposition that there will be no difference between them, and the practical error is insignificant simply because this difference is small.
§ 5. It's not to be believed that any writers have seriously tried to simplify all the factors mentioned above into a single calculation system or to condense the uncertainty involved in incorporating them into one formula. However, they have been unwilling to completely refrain from referencing them. From an early stage in studying this science, people have attempted to use the Calculus of Probability to make inferences from observed cases to similar ones. In practice, as previously mentioned, it's impossible to avoid some level of extension like this. However, it's important to note that, in these cases, the departure from strict empirical grounds is not actually acknowledged, at least not as part of our logical process. It's true that we've strayed somewhat from it, resulting in a broader conclusion than our data strictly requires, and therefore one that carries less certainty. Nevertheless, we treat the conclusion reached through induction as equally certain as the data, or we simply overlook the divergence. We assume that unexamined instances will resemble those we have examined, an assumption that may have ample basis; the theory of calculation is based on the idea that there will be no difference between them, and the practical error is considered minor simply because this difference is small.
§ 6. But the rule we are now about to discuss, and which may be called the Rule of Succession, is of a very different kind. It not only recognizes the fact that we are 196 leaving the ground of past experience, but takes the consequences of this divergence as the express subject of its calculation. It professes to give a general rule for the measure of expectation that we should have of the reappearance of a phenomenon that has been already observed any number of times. This rule is generally stated somewhat as follows: “To find the chance of the recurrence of an event already observed, divide the number of times the event has been observed, increased by one, by the same number increased by two.”
§ 6. The rule we’re about to discuss, known as the Rule of Succession, is quite different. It acknowledges that we are stepping away from past experience and considers the implications of this shift as the main focus of its calculations. It aims to provide a general guideline for how we should expect a phenomenon that has been observed multiple times to reappear. This rule is typically summarized like this: “To determine the likelihood of an event that has already been observed occurring again, take the number of times the event has been observed, add one, and divide that by the same number plus two.”
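To see the arithmetic of this rule concretely, here is a minimal sketch in Python (ours, not Venn's); the function name and the sample values of m are chosen purely for illustration.

    from fractions import Fraction

    def rule_of_succession(m: int) -> Fraction:
        # Chance of recurrence after m observed occurrences: (m + 1) / (m + 2).
        return Fraction(m + 1, m + 2)

    # One observation gives 2/3, i.e. odds of 2 to 1 in favour of recurrence;
    # ten observations give 11/12, i.e. odds of 11 to 1.
    for m in (1, 10, 100):
        p = rule_of_succession(m)
        print(m, p, f"odds {p.numerator} to {p.denominator - p.numerator}")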
§ 7. It will be instructive to point out the origin of this rule; if only to remind the reader of the necessity of keeping mathematical formulæ to their proper province, and to show what astonishing conclusions are apt to be accepted on the supposed warrant of mathematics. Revert then to the example of Inverse Probability on p. 182. We saw that under certain assumptions, it would follow that when a single white ball had been drawn from a bag known to contain 10 balls which were white or black, the chance could be determined that there was only one white ball in it. Having done this we readily calculate ‘directly’ the chance that this white ball will be drawn next time. Similarly we can reckon the chances of there being two, three, &c. up to ten white balls in it, and determine on each of these suppositions the chance of a white ball being drawn next time. Adding these together we have the answer to the question:—a white ball has been drawn once from a bag known to contain ten balls, white or black; what is the chance of a second time drawing a white ball?
§ 7. It's important to highlight where this rule comes from; it's a good reminder of the need to keep mathematical formulas within their proper limits and to illustrate how surprising conclusions can be accepted based solely on math. Let's go back to the example of Inverse Probability on p. 182. We saw that under certain assumptions, it follows that when a single white ball is drawn from a bag known to have 10 balls that are either white or black, we can determine the chance that there is only one white ball in the bag. Once we do this, we can easily calculate ‘directly’ the chance that this white ball will be drawn again. Likewise, we can figure out the chances of there being two, three, and so on, up to ten white balls in the bag and determine the chance of drawing a white ball next time for each of these scenarios. By adding these probabilities together, we can answer the question: a white ball has been drawn once from a bag containing ten balls that are either white or black; what is the chance of drawing a white ball again?
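The two-step calculation just described can be sketched in a few lines of Python. The sketch is ours, not Venn's, and it rests on two assumptions that the passage leaves implicit: every number of white balls from 0 to 10 is taken as equally likely beforehand, and the drawn ball is returned to the bag before the second draw.

    from fractions import Fraction

    n = 10
    hypotheses = range(n + 1)                       # possible counts of white balls in the bag
    prior = {k: Fraction(1, n + 1) for k in hypotheses}

    # Inverse step: chance of each count, given that one white ball has been drawn.
    likelihood = {k: Fraction(k, n) for k in hypotheses}
    evidence = sum(prior[k] * likelihood[k] for k in hypotheses)
    posterior = {k: prior[k] * likelihood[k] / evidence for k in hypotheses}

    # Direct step: on each supposition, the chance that the next draw is white,
    # weighted by the posterior and summed.
    p_next_white = sum(posterior[k] * Fraction(k, n) for k in hypotheses)
    print(p_next_white)                             # 7/10 under these assumptions

With only ten balls the answer under these assumptions, 7/10, differs slightly from the 2/3 that the Rule of Succession would give for a single observation; the rule corresponds to letting the number of balls in the bag grow without limit.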
So far only arithmetic is required. For the next step we need higher mathematics, and by its aid we solve this problem:—A white ball has been drawn m times from a 197 bag which contains any number, we know not what, of balls each of which is white or black, find the chance of the next drawing also yielding a white ball. The answer is (m + 1)/(m + 2).
So far, only basic math is needed. For the next step, we need advanced math, and with its help, we tackle this problem: A white ball has been drawn m times from a bag that contains an unknown number of balls, each of which is either white or black. Find the probability that the next draw will also be a white ball. The answer is (m + 1)/(m + 2).
Thus far mathematics. Then comes in the physical assumption that the universe may be likened to such a bag as the above, in the sense that the above rule may be applied to solve this question:—an event has been observed to happen m times in a certain way, find the chance that it will happen in that way next time. Laplace, for instance, has pointed out that at the date of the writing of his Essai Philosophique, the odds in favour of the sun's rising again (on the old assumption as to the age of the world) were 1,826,214 to 1. De Morgan says that a man who standing on the bank of a river has seen ten ships pass by with flags should judge it to be 11 to 1 that the next ship will also carry a flag.
So far, we've covered mathematics. Next comes the idea that the universe might be compared to a bag, in the sense that we can use the same principle to answer this question: an event has been observed to happen m times in a specific way; what's the chance it will happen that way again? For example, Laplace noted that when he wrote his Essai Philosophique, the odds of the sun rising again (based on the old belief about the earth's age) were 1,826,214 to 1. De Morgan mentions that a person standing by a river who has seen ten ships pass by with flags should think there's an 11 to 1 chance that the next ship will also have a flag.
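Both figures are applications of the same ratio. The sketch below (ours) reproduces them; the figure of 5,000 years for the age of the world, used here to recover the number of past sunrises behind Laplace's odds, is an assumption made purely to reproduce his count.

    from fractions import Fraction

    def recurrence_odds(m: int) -> int:
        # With chance (m + 1)/(m + 2), the odds in favour of recurrence are (m + 1) to 1.
        return Fraction(m + 1, m + 2).numerator

    # De Morgan's ships: ten flagged ships seen, so 11 to 1 on the next carrying a flag.
    print(recurrence_odds(10))              # 11

    # Laplace's sunrise: taking the world's age as 5,000 years (an assumption here),
    # the number of observed sunrises is 5000 * 365.2426 = 1,826,213,
    # which gives odds of 1,826,214 to 1 on the sun rising again.
    past_sunrises = round(5000 * 365.2426)
    print(recurrence_odds(past_sunrises))   # 1826214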
§ 8. It is hard to take such a rule as this seriously, for there does not seem to be even that moderate confirmation of it which we shall find to hold good in the case of the application of abstract formulæ to the estimation of the evidence of witnesses. If however its validity is to be discussed there appear to be two very distinct lines of enquiry along which we may be led.
§ 8. It's difficult to treat a rule like this with seriousness because there doesn’t seem to be even the minimal support for it that we find when applying abstract formulas to evaluate witness testimony. However, if we are going to discuss its validity, there seem to be two clear directions for investigation we can take.
(1) In the first place we may take it for what it professes to be, and for what it is commonly understood to be, viz. a rule which assigns the measure of expectation we ought to entertain of the recurrence of the event under the circumstances in question. Of course, on the view adopted in this work, we insist on enquiring whether it is really true that on the average events do thus repeat their performance in accordance with this law. Thus tested, no 198 one surely would attempt to defend such a formula. So far from past occurrence being a ground for belief in future recurrence, there are (as will be more fully pointed out in the Chapter on Fallacies) plenty of cases in which the direct contrary holds good. Then again a rule of this kind is subject to the very serious perplexity to be explained in our next chapter, arising out of the necessary arbitrariness of such inverse reference. That is, when an event has happened but a few times, we have no certain guide; and when it has happened but once,[2] we have no guide whatever, as to the class of cases to which it is to be referred. In the example above, about the flags, why did we stop short at this notion simply, instead of specifying the size, shape, &c. of the flags?
(1) First of all, we can consider it for what it claims to be and what it is generally understood to be, namely, a rule that determines the level of expectation we should have for the recurrence of the event in question. From the perspective taken in this work, we emphasize the need to investigate whether it is really true that, on average, events do tend to repeat their occurrence according to this law. When examined this way, surely no one would attempt to defend such a formula. Far from believing that past events justify expecting future occurrences, there are (as will be elaborated in the Chapter on Fallacies) many instances where the opposite is true. Additionally, a rule of this kind is subject to the very serious perplexity, to be explained in our next chapter, arising from the unavoidable arbitrariness of such inverse reference. Specifically, when an event has only occurred a few times, we have no reliable guide; and if it has happened just once,[2] we have no guidance at all regarding the category of cases it should be classified under. In the earlier example about the flags, why did we stop at this idea alone instead of detailing the size, shape, and so on, of the flags?
De Morgan, it must be remembered, only accepts this rule in a qualified sense. He regards it as furnishing a minimum value for the amount of our expectation. He terms it “the rule of probability of a pure induction,” and says of it, “The probabilities shown by the above rules are merely minima which may be augmented by other sources of knowledge.” That is, he recognizes only those instances in which our belief in the Uniformity of Nature and in the existence of special laws of causation comes in 199 to supplement that which arises from the mere frequency of past occurrence. This however does not meet those cases in which past occurrence is a positive ground of disbelief in future recurrence.
De Morgan, it’s important to remember, only accepts this rule to a certain extent. He sees it as providing a minimum value for our expectations. He calls it “the rule of probability of a pure induction,” and he states, “The probabilities indicated by the above rules are merely minima which can be increased by other sources of knowledge.” This means he only acknowledges those situations where our belief in the Uniformity of Nature and the existence of specific laws of causation help to support what comes from the simple frequency of previous occurrences. However, this does not address instances where past occurrences actually lead to a strong doubt about future ones.
§ 9. (2) There is however another and very different view which might be taken of such a rule. It is one, an obscure recognition of which has very likely had much to do with the acceptance which the rule has received.
§ 9. (2) However, there is another, quite different perspective that might be taken of such a rule. A vague recognition of this perspective has very likely had much to do with the acceptance the rule has received.
What we might suppose ourselves to be thus expressing is,—not the measure of rational expectation which might be held by minds sufficiently advanced to be able to classify and to draw conscious inferences, but,—the law according to which the primitive elements of belief were started and developed. Of course such an interpretation as this would be equivalent to quitting the province of Logic altogether and crossing over into that of Psychology; but it would be a perfectly valid line of enquiry. We should be attempting nothing more than a development of the researches of Fechner and his followers in psychophysical measurement. Only then we ought, like them, not to start with any analogy of a ballot box and its contents, but to base our enquiry on careful determination of the actual mental phenomena experienced. We know how the law has been determined in accordance with which the intensity of the feeling of light varies with that of its objective source. We see how it is possible to measure the growth of memory according to the number of repetitions of a sentence or a succession of mere syllables. In this latter case, for instance, we just try experiments, and determine how much better a man can remember any utterances after eight hearings than after seven.[3]
What we might suppose ourselves to be expressing is not the level of rational expectation held by minds advanced enough to classify and draw conscious inferences, but the law according to which the basic elements of belief were first formed and developed. Of course, this kind of interpretation would mean leaving the realm of Logic and entering that of Psychology; however, it would still be a completely valid line of inquiry. We would be doing nothing more than expanding on the research of Fechner and his followers in psychophysical measurement. Just like them, though, we shouldn’t start from the analogy of a ballot box and its contents, but should base our inquiry on a careful examination of the actual mental experiences involved. We know how the law has been determined according to which the intensity of the sensation of light varies with that of its objective source. We see how we can measure the growth of memory based on the number of times a sentence or a series of mere syllables is repeated. In this latter case, for instance, we simply conduct experiments to find out how much better someone can remember certain phrases after hearing them eight times compared to seven.[3]
Now this case furnishes a very close parallel to our supposed attempt to measure the increase of intensity of belief after repeated recurrence. That is, if it were possible to experiment in this order of mental phenomena, we ought simply to repeat a phenomenon a certain number of times and then ascertain by actual introspection or by some simple test, how fast the belief was increasing. Thus viewed the problem seems to me a hopeless one. The difficulties are serious enough, when we are trying to measure our simple sensations, of laying aside the effects of past training, and of attempting, as it were, to leave the mind open and passive to mere reception of stimuli. But if we were to attempt in this way to measure our belief these difficulties would become quite insuperable. We can no more divest ourselves of past training here than we can of intelligence or thought. I do not see how any one could possibly avoid classing the observed recurrences with others which he had experienced, and of being thus guided by special analogies and inductions instead of trusting solely to De Morgan's ‘pure induction’. The same considerations tend to rebut another form of defence for the rule in question. It is urged, for instance, that we may at least resort to it in those cases in which we are in entire ignorance as to the number and nature of the antecedents. This is a position to which I can hardly conceive it possible that we should ever be reduced. However remote or exceptional may be the phenomenon selected we may yet bring it into relation with some accepted generalizations and thus draw our conclusions from these rather than from purely à priori considerations.
Now this situation presents a very close parallel to our supposed attempt to measure the increase of belief intensity after repeated occurrences. That is, if it were possible to experiment with this type of mental phenomenon, we should simply repeat an event a certain number of times and then determine through actual introspection or some straightforward test how quickly the belief was growing. Viewed this way, the problem seems impossible to me. The challenges are already significant when we try to measure our basic sensations, as we must set aside the effects of past experiences and strive to keep our minds open and passive to merely receiving stimuli. But if we tried to measure our beliefs this way, those challenges would become truly insurmountable. We can’t strip ourselves of our past experiences any more than we can of intelligence or thought. I don’t see how anyone could avoid comparing the observed recurrences to others they’ve experienced and being influenced by specific analogies and inferences instead of relying solely on De Morgan's ‘pure induction’. The same arguments also counter another defense for the rule in question. It is suggested, for example, that we might at least use it in cases where we have no idea about the number and nature of the antecedents. This is a scenario I can hardly imagine we would ever face. No matter how rare or exceptional the chosen phenomenon may be, we can still relate it to accepted generalizations and draw our conclusions from those rather than from purely à priori considerations.
§ 10. Since then past acquisitions cannot be laid aside or allowed for, the only remaining resource would be to experiment upon the infant mind. One would not like 201 to pronounce that any line of enquiry is impossible; but the difficulties would certainly be enormous. And interesting as the facts would be, supposing that we had succeeded in securing them, they would not be of the slightest importance in Logic. However the question were settled:—whether, for instance, we proved that the sentiment or emotion of belief grew up slowly and gradually from a sort of zero point under the impress of repetition of experience; or whether we proved that a single occurrence produced complete belief in the repetition of the event, so that experience gradually untaught us and weakened our convictions;—in no case would the mature mind gain any aid as to what it ought to believe.
§ 10. Since past experiences can’t be ignored or set aside, the only option left would be to experiment with the developing mind. One wouldn't want to claim that any line of inquiry is impossible; however, the challenges would definitely be significant. And while the findings would be intriguing if we were able to capture them, they wouldn't hold any real significance in Logic. Regardless of how the question is resolved:—whether we show that the feeling or emotion of belief develops slowly and gradually from a sort of starting point under the influence of repeated experiences; or whether we demonstrate that a single occurrence leads to complete belief in the repetition of the event, causing experience to gradually weaken our convictions;—in either case, the mature mind wouldn’t receive any guidance on what it should believe.
I cannot but think that some such view as this must occasionally underlie the acceptance which this rule has received. For instance, Laplace, though unhesitatingly adopting it as a real, that is, objective rule of inference, has gone into so much physiological and psychological matter towards the end of his discussion (Essai philosophique) as to suggest that what he had in view was the natural history of belief rather than its subsequent justification.
I can't help but think that some version of this idea must sometimes underlie the acceptance of this rule. For example, Laplace, while confidently treating it as a real, or objective, rule of inference, delves into so much physiological and psychological content towards the end of his discussion (Essai philosophique) that it suggests he was more focused on the natural history of belief than on its later justification.
Again, the curious doctrine adopted by Jevons, that the principles of Induction rest entirely upon the theory of Probability,—a very different doctrine from that which is conveyed by saying that all knowledge of facts is probable only, i.e. not necessary,—seems unintelligible except on some such interpretation. We shall have more to say on this subject in our next chapter. It will be enough here to remark that in our present reflective and rational stage we find that every inference in Probability involves some appeal to, or support from, Induction, but that it is impossible to base either upon the other. However far back we try to push our way, and however disposed we might be 202 to account for our ultimate beliefs by Association, it seems to me that so long as we consider ourselves to be dealing with rules of inference we must still distinguish between Induction and Probability.
Once again, the curious doctrine adopted by Jevons, that the principles of Induction rest entirely on the theory of Probability—a doctrine quite different from stating that all knowledge of facts is probable only, meaning not necessary—seems unintelligible except on some such interpretation as this. We will discuss this topic more in our next chapter. For now, it’s important to note that in our current reflective and rational phase, we find that every inference in Probability relies on some form of support from Induction, yet it’s impossible to base either one upon the other. No matter how far back we try to trace our reasoning or how inclined we are to explain our fundamental beliefs through Association, I believe that as long as we view ourselves as engaging with rules of inference, we must still draw a distinction between Induction and Probability.
1 John Craig, in his often named work, Theologiæ Christianæ Principia Mathematica (Lond. 1699) attempted something in this direction when he proposed to solve such problems as:—Quando evanescet probabilitas cujusvis Historiæ, cujus subjectum est transiens, vivâ tantum voce transmissæ, determinare.
1 John Craig, in his frequently cited work, Theologiæ Christianæ Principia Mathematica (London, 1699), attempted something in this direction when he proposed to solve problems such as: to determine when the probability of any history whose subject is transient, and which is transmitted only by word of mouth, will vanish.
2 When m = 1 the fraction becomes 2/3; i.e. the odds are 2 to 1 in favour of recurrence. And there are writers who accept this result. For instance, Jevons (Principles of Science p. 258) says “Thus on the first occasion on which a person sees a shark, and notices that it is accompanied by a little pilot fish, the odds are 2 to 1 that the next shark will be so accompanied.” To say nothing of the fact that recognizing and naming the fish implies that they have often been seen before, how many of the observed characteristics of that single ‘event’ are to be considered essential? Must the pilot precede; and at the same distance? Must we consider the latitude, the ocean, the season, the species of shark, as matter also of repetition on the next occasion? and so on. I cannot see how the Inductive problem can be even intelligibly stated, for quantitative purposes, on the first occurrence of any event.
2 When m = 1 the fraction becomes 2/3; that is, the odds are 2 to 1 in favor of it happening again. Some writers accept this outcome. For example, Jevons (Principles of Science p. 258) states, “So on the first time someone sees a shark and observes that it’s accompanied by a small pilot fish, the odds are 2 to 1 that the next shark will be accompanied in the same way.” Not to mention that recognizing and naming the fish suggests they have been seen before, how many of the observed traits of that single 'event' should be considered essential? Does the pilot need to come first, and at the same distance? Should we take into account the latitude, the ocean, the season, the species of shark, as factors for repetition during the next occurrence? And so forth. I can't see how the inductive problem can even be clearly stated for quantitative purposes when it comes to the first occurrence of any event.
CHAPTER 9.
INDUCTION AND ITS CONNECTION WITH PROBABILITY.
§ 1. We were occupied, during the last chapter, with the examination of a rule, the object of which was to enable us to make inferences about instances as yet unexamined. It was professedly, therefore, a rule of an inductive character. But, in the form in which it is commonly expressed, it was found to fail utterly. It is reasonable therefore to enquire at this point whether Probability is entirely a formal or deductive science, or whether, on the other hand, we are able, by means of it, to make valid inferences about instances as yet unexamined. This question has been already in part answered by implication in the course of the last two chapters. It is proposed in the present chapter to devote a fuller investigation to this subject, and to describe, as minutely as limits will allow, the nature of the connection between Probability and Induction. We shall find it advisable for clearness of conception to commence our enquiry at a somewhat early stage. We will travel over the ground, however, as rapidly as possible, until we approach the boundary of what can properly be termed Probability.
§ 1. We spent the last chapter exploring a rule aimed at helping us draw conclusions about situations we haven't yet examined. So, it was clearly meant to be an inductive rule. However, in the form in which it is commonly expressed, it was found to fail completely. At this point, it's reasonable to ask whether Probability is solely a formal or deductive science, or if we can use it to make legitimate inferences about unexamined instances. This question has already been touched upon implicitly in the previous two chapters. In this chapter, we plan to investigate this topic more thoroughly and explain, as much as the limits allow, the relationship between Probability and Induction. For clarity, we'll start our inquiry at a somewhat early stage. However, we will move over the ground as quickly as possible until we approach the boundary of what can properly be called Probability.
§ 2. Let us then conceive some one setting to work to investigate nature, under its broadest aspect, with the view of systematizing the facts of experience that are known, and thence (in case he should find that this is possible) discovering 204 others which are at present unknown. He observes a multitude of phenomena, physical and mental, contemporary and successive. He enquires what connections are there between them? what rules can be found, so that some of these things being observed I can infer others from them? We suppose him, let it be observed, deliberately resolving to investigate the things themselves, and not to be turned aside by any prior enquiry as to there being laws under which the mind is compelled to judge of the things. This may arise either from a disbelief in the existence of any independent and necessary mental laws, and a consequent conviction that the mind is perfectly competent to observe and believe anything that experience offers, and should believe nothing else, or simply from a preference for investigations of the latter kind. In other words, we suppose him to reject Formal Logic, and to apply himself to a study of objective existences.
§ 2. Let’s imagine someone starting to investigate nature in its broadest sense, aiming to organize the known facts of experience and, if possible, discover those that are currently unknown. They observe many phenomena, both physical and mental, that occur at the same time or sequentially. They ask what connections exist between them? What rules can be found so that by observing some things, we can infer others? We assume this person is intentionally choosing to study the things themselves and is not distracted by any previous inquiries into whether there are laws that compel the mind to judge these things. This may stem from a disbelief in any independent and necessary mental laws, leading to a belief that the mind can fully observe and accept anything that experience presents without believing anything else, or simply from a preference for this type of investigation. In other words, we assume they reject Formal Logic and focus on studying objective realities.
It must not for a moment be supposed that we are here doing more than conceiving a fictitious case for the purpose of more vividly setting before the reader the nature of the inductive process, the assumptions it has to make, and the character of the materials to which it is applied. It is not psychologically possible that any one should come to the study of nature with all his mental faculties in full perfection, but void of all materials of knowledge, and free from any bias as to the uniformities which might be found to prevail around him. In practice, of course, the form and the matter—the laws of belief or association, and the objects to which they are applied—act and react upon one another, and neither can exist in any but a low degree without presupposing the existence of the other. But the supposition is perfectly legitimate for the purpose of calling attention to the requirements of such a system of Logic, and is indeed nothing more 205 than what has to be done at almost every step in psychological enquiry.[1]
It shouldn't be assumed for a moment that we are doing anything more than imagining a fictional scenario to help the reader better understand the nature of the inductive process, the assumptions it has to make, and the types of materials it uses. It's not psychologically possible for anyone to approach the study of nature with all their mental abilities fully intact, yet lacking any knowledge and free from any biases regarding the patterns they might observe around them. In practice, the form and the content—the laws of belief or association and the objects to which they apply—interact with each other, and neither can exist to any significant degree without assuming the presence of the other. However, this assumption is perfectly valid for highlighting the needs of such a system of Logic and is essentially what must be done at nearly every stage of psychological inquiry.[1]
§ 3. His task at first might be conceived to be a slow and tedious one. It would consist of a gradual accumulation of individual instances, as marked out from one another by various points of distinction, and connected with one another by points of resemblance. These would have to be respectively distinguished and associated in the mind, and the consequent results would then be summed up in general propositions, from which inferences could afterwards be drawn. These inferences could, of course, contain no new facts, they would only be repetitions of what he or others had previously observed. All that we should have so far done would have been to make our classifications of things and then to appeal to them again. We should therefore be keeping well within the province of ordinary logic, the processes of which (whatever their ultimate explanation) may of course always be expressed, in accordance with Aristotle's Dictum, as ways of determining whether or not we can show that one given class is included wholly or partly within another, or excluded from it, as the case may be.
§ 3. At first, his job might seem slow and boring. It would involve gradually gathering individual examples, distinguished by various characteristics, while also linking them through similarities. These examples would need to be identified and associated in the mind, and the results would then be summarized into general statements from which conclusions could be drawn later. These conclusions wouldn’t introduce any new information; they would simply repeat what he or others had already observed. So far, we would have just organized our classifications of things and then referred back to them. Therefore, we would be staying well within the realm of basic logic, whose processes (regardless of their ultimate explanation) could always be described, following Aristotle's principle, as methods of determining whether one specific class is entirely or partially included in another, or excluded from it, depending on the situation.
§ 4. But a very short course of observation would suggest the possibility of a wide extension of his information. Experience itself would soon detect that events were connected together in a regular way; he would ascertain that there are ‘laws of nature.’ Coming with no à priori necessity of believing in them, he would soon find that as a matter 206 of fact they do exist, though he could not feel any certainty as to the extent of their prevalence. The discovery of this arrangement in nature would at once alter the plan of his proceedings, and set the tone to the whole range of his methods of investigation. His main work now would be to find out by what means he could best discover these laws of nature.
§ 4. But a brief period of observation would suggest that his knowledge could be greatly expanded. Experience would quickly reveal that events are linked in a consistent manner; he would come to understand that there are ‘laws of nature.’ Without any à priori need to believe in them, he would soon realize that they actually exist, even though he might not be entirely sure how widespread they are. Recognizing this order in nature would immediately change his approach and influence all of his investigative methods. His primary focus would now be to determine the best ways to uncover these laws of nature.
An illustration may assist. Suppose I were engaged in breaking up a vast piece of rock, say slate, into small pieces. I should begin by wearily working through it inch by inch. But I should soon find the process completely changed owing to the existence of cleavage. By this arrangement of things a very few blows would do the work—not, as I might possibly have at first supposed, to the extent of a few inches—but right through the whole mass. In other words, by the process itself of cutting, as shown in experience, and by nothing else, a constitution would be detected in the things that would make that process vastly more easy and extensive. Such a discovery would of course change our tactics. Our principal object would thenceforth be to ascertain the extent and direction of this cleavage.
An illustration might help clarify. Imagine I'm working to break apart a huge piece of rock, like slate, into smaller pieces. I would start by laboriously chipping away at it inch by inch. However, I would quickly notice that the process changes drastically because of the cleavage. With this natural feature, just a few strikes would accomplish the task—not just a few inches, but all the way through the entire block. In other words, through the act of cutting, as we know from experience, we would find a structure in the rock that makes this process much simpler and more efficient. This discovery would certainly change our approach. From then on, our main goal would be to determine the extent and direction of this cleavage.
Something resembling this is found in Induction. The discovery of laws of nature enables the mind to dart with its inferences from a few facts completely through a whole class of objects, and thus to acquire results the successive individual attainment of which would have involved long and wearisome investigation, and would indeed in multitudes of instances have been out of the question. We have no demonstrative proof that this state of things is universal; but having found it prevail extensively, we go on with the resolution at least to try for it everywhere else, and we are not disappointed. From propositions obtained in this way, or rather from the original facts on which these propositions rest, we can make new inferences, not indeed with absolute 207 certainty, but with a degree of conviction that is of the utmost practical use. We have gained the great step of being able to make trustworthy generalizations. We conclude, for instance, not merely that John and Henry die, but that all men die.
Something like this can be found in Induction. The discovery of natural laws allows the mind to leap with its inferences from a few facts through an entire class of objects, helping us achieve results that would have otherwise required long and tedious investigation, and in many cases would have been impossible. We don't have conclusive proof that this situation is universal; however, after seeing it happen extensively, we proceed with the intent to look for it everywhere else, and we are not let down. From the conclusions drawn this way, or rather from the original facts these conclusions are based on, we can make new inferences, not with complete certainty, but with a level of confidence that is extremely useful. We have made significant progress in being able to form reliable generalizations. We conclude, for example, not just that John and Henry die, but that all men die.
§ 5. The above brief investigation contains, it is hoped, a tolerably correct outline of the nature of the Inductive inference, as it presents itself in Material or Scientific Logic. It involves the distinction drawn by Mill, and with which the reader of his System of Logic will be familiar, between an inference drawn according to a formula and one drawn from a formula. We do in reality make our inference from the data afforded by experience directly to the conclusion; it is a mere arrangement of convenience to do so by passing through the generalization. But it is one of such extreme convenience, and one so necessarily forced upon us when we are appealing to our own past experience or to that of others for the grounds of our conclusion, that practically we find it the best plan to divide the process of inference into two parts. The first part is concerned with establishing the generalization; the second (which contains the rules of ordinary logic) determines what conclusions can be drawn from this generalization.
§ 5. The brief investigation above aims to provide a reasonably accurate outline of what Inductive inference is, as it appears in Material or Scientific Logic. It highlights the distinction made by Mill, which readers of his System of Logic will recognize, between an inference made according to a formula and one made from a formula. In reality, we make our inference from the data provided by experience directly to the conclusion; using generalization is merely a convenient arrangement. However, this convenience is so significant and necessary, especially when we rely on our past experiences or those of others to support our conclusions, that it makes practical sense to split the inference process into two parts. The first part focuses on establishing the generalization, while the second part (which includes the rules of ordinary logic) determines what conclusions can be drawn from that generalization.
§ 6. We may now see our way to ascertaining the province of Probability and its relation to kindred sciences. Inductive Logic gives rules for discovering such generalizations as those spoken of above, and for testing their correctness. If they are expressed in universal propositions it is the part of ordinary logic to determine what inferences can be made from and by them; if, on the other hand, they are expressed in proportional propositions, that is, propositions of the kind described in our first chapter, they are handed over to Probability. We find, for example, that three infants 208 out of ten die in their first four years. It belongs to Induction to say whether we are justified in generalizing our observation into the assertion, All infants die in that proportion. When such a proposition is obtained, whatever may be the value to be assigned to it, we recognize in it a series of a familiar kind, and it is at once claimed by Probability.
§ 6. We can now clarify the scope of Probability and how it relates to similar fields. Inductive Logic provides guidelines for finding generalizations like the ones mentioned above and for verifying their accuracy. If these are stated as universal propositions, it falls to traditional logic to figure out what conclusions can be drawn from them. However, if they are presented as proportional propositions, like the ones we described in the first chapter, they are assigned to Probability. For instance, we see that three out of ten infants die within their first four years. It is the role of Induction to determine if we can reasonably generalize this observation into the statement, All infants die in that proportion. Once such a statement is made, regardless of its assigned significance, we recognize it as a familiar series, and it is immediately classified under Probability.
In this latter case the division into two parts, the inductive and the ratiocinative, seems decidedly more than one of convenience; it is indeed imperatively necessary for clearness of thought and cogency of treatment. It is true that in almost every example that can be selected we shall find both of the above elements existing together and combining to determine the degree of our conviction, but when we come to examine them closely it appears to me that the grounds of their cogency, the kind of conviction they produce, and consequently the rules which they give rise to, are so entirely distinct that they cannot possibly be harmonized into a single consistent system.
In this case, splitting it into two parts, the inductive and the deductive, seems not just a matter of convenience; it's actually necessary for clarity of thought and effective analysis. It’s true that in almost every example we can choose, we’ll find both elements working together to shape our level of belief. However, when we look at them closely, it seems to me that the reasons behind their effectiveness, the type of belief they create, and the rules they lead to are so fundamentally different that they can’t really be integrated into a single, consistent system.
The opinion therefore according to which certain Inductive formulæ are regarded as composing a portion of Probability, and which finds utterance in the Rule of Succession criticised in our last chapter, cannot, I think, be maintained. It would be more correct to say, as stated above, that Induction is quite distinct from Probability, yet co-operates in almost all its inferences. By Induction we determine, for example, whether, and how far, we can safely generalize the proposition that four men in ten live to be fifty-six; supposing such a proposition to be safely generalized, we hand it over to Probability to say what sort of inferences can be deduced from it.
The opinion, therefore, that certain inductive formulas make up a portion of Probability, an opinion which finds expression in the Rule of Succession criticized in our last chapter, cannot, I think, be upheld. It’s more accurate to say, as mentioned earlier, that Induction is completely separate from Probability, yet it co-operates in almost all of Probability's inferences. For instance, through Induction, we determine whether, and how far, we can confidently generalize the claim that four out of ten men live to be fifty-six; assuming that claim can be safely generalized, we then hand it over to Probability to say what kinds of inferences can be drawn from it.
§ 7. So much then for the opinion which tends to regard pure Induction as a subdivision of Probability. By the majority of philosophical and logical writers a widely different 209 view has of course been entertained. They are mostly disposed to distinguish these sciences very sharply from, not to say to contrast them with, one another; the one being accepted as philosophical or logical, and the other rejected as mathematical. This may without offence be termed the popular prejudice against Probability.
§ 7. That covers the idea of viewing pure Induction as a part of Probability. Most philosophical and logical writers have, of course, taken a very different view. They generally prefer to separate these sciences sharply, if not to contrast them outright; one is accepted as philosophical or logical, while the other is dismissed as mathematical. This may fairly be called the popular prejudice against Probability.
A somewhat different view, however, must be noticed here, which, by a sort of reaction against the latter, seems even to go beyond the former; and which occasionally finds expression in the statement that all inductive reasoning of every kind is merely a matter of Probability. Two examples of this may be given.
A slightly different perspective, however, needs to be acknowledged here, which, in a sort of reaction against the previous view, appears to surpass it; and this perspective is sometimes expressed in the claim that all types of inductive reasoning are simply a matter of Probability. Two examples of this can be provided.
Beginning with the older authority, there is an often quoted saying by Butler at the commencement of his Analogy, that ‘probability is the very guide of life’; a saying which seems frequently to be understood to signify that the rules or principles of Probability are thus all-prevalent when we are drawing conclusions in practical life. Judging by the drift of the context, indeed, this seems a fair interpretation of his meaning, in so far of course as there could be said to be any such thing as a science of Probability in those days. Prof. Jevons, in his Principles of Science (p. 197), has expressed a somewhat similar view, of course in a way more consistent with the principles of modern science, physical and mathematical. He says, “I am convinced that it is impossible to expound the methods of induction in a sound manner, without resting them on the theory of Probability. Perfect knowledge alone can give certainty, and in nature perfect knowledge would be infinite knowledge, which is clearly beyond our capacities. We have, therefore, to content ourselves with partial knowledge,—knowledge mingled with ignorance, producing doubt.”[2]
Starting with the earlier authority, there’s a well-known quote by Butler at the beginning of his Analogy, stating that 'probability is the very guide of life'; this is often interpreted to mean that the rules or principles of probability are universally applicable when we draw conclusions in everyday life. Judging by the context, this seems to accurately reflect his intent, especially since there was hardly an established science of probability in those times. Professor Jevons, in his Principles of Science (p. 197), expresses a similar viewpoint, though in a way that aligns more closely with modern scientific principles, both physical and mathematical. He states, “I am convinced that it is impossible to explain the methods of induction properly without basing them on the theory of probability. Only perfect knowledge can provide certainty, and achieving perfect knowledge in nature would mean having infinite knowledge, which is clearly beyond our capabilities. Therefore, we must settle for partial knowledge—knowledge intermixed with ignorance, leading to doubt.”[2]
§ 8. There are two senses in which this disposition to merge the two sciences into one may be understood. Using the word Probability in its vague popular signification, nothing more may be intended than to call attention to the fact, that in every case alike our conclusions are nothing more than ‘probable,’ that is, that they are not, and cannot be, absolutely certain. This must be fully admitted, for of course no one acquainted with the complexity of physical and other evidence would seriously maintain that absolute ideal certainty can be attained in any branch of applied logic. Hypothetical certainty, in abstract science, may be possible, but not absolute certainty in the domain of the concrete. This has been already noticed in a former chapter, where, however, it was pointed out that whatever justification may exist, on the subjective view of logic, for regarding this common prevalence of absence of certainty as warranting us in fusing the sciences into one, no such justification is admitted when we take the objective view.
§ 8. There are two ways to understand the tendency to combine the two sciences into one. When using the term Probability in its broad, everyday sense, it simply means to highlight that our conclusions are always just ‘probable,’ which means they aren’t and can’t be absolutely certain. This is something that must be fully acknowledged, as anyone familiar with the complexity of physical and other forms of evidence would not seriously argue that absolute certainty can be achieved in any area of applied logic. Hypothetical certainty might be possible in abstract science, but absolute certainty is not achievable in the realm of concrete phenomena. This point was mentioned in an earlier chapter, where it was also noted that, although there might be some justification, from a subjective viewpoint of logic, for considering this common lack of certainty as a reason to merge the sciences, no such justification holds when adopting an objective perspective.
§ 9. What may be meant, however, is that the grounds of this absence of certainty are always of the same general character. This argument, if admitted, would have real force, and must therefore be briefly noticed. We have seen abundantly that when we say of a conclusion within the strict province of Probability, that it is not certain, all that we mean is that in some proportion of cases only will such conclusion be right, in the other cases it will be wrong. Now when we say, in reference to any inductive conclusion, that we feel uncertain about its absolute cogency, are we conscious of the same interpretation? It seems to me that we are not. It is indeed quite possible that on ultimate analysis it might be proved that experience of failure in the past employment of our methods of investigation was the main cause of our present want of perfect confidence in 211 them. But this, as we have repeatedly insisted, does not belong to the province of logical, but to that of Psychological enquiry. It is surely not the case that we are, as a rule, consciously guided by such occasional or repeated instances of past failure. In so far as they are at all influential, they seem to do their work by infusing a vague want of confidence which cannot be referred to any statistical grounds for its justification, at least not in a quantitative way. Part of our want of confidence is derived sympathetically from those who have investigated the matter more nearly at first hand. Here again, analysis might detect that a given proportion of past failures lay at the root of the distrust, but it does not show at the surface. Moreover, one reason why we cannot feel perfectly certain about our inductions is, that the memory has to be appealed to for some of our data; and will any one assert that the only reason why we do not place absolute reliance on our memory of events long past is that we have been deceived in that way before?
§ 9. What might be suggested here is that the reason for this lack of certainty is always of a similar nature. If we accept this argument, it holds significant weight and deserves a brief mention. We have repeatedly observed that when we state a conclusion within the strict realm of Probability is uncertain, we only mean that in some cases, it will be correct, while in others it will be incorrect. Now, when we refer to any inductive conclusion and express uncertainty about its absolute validity, do we interpret it the same way? It seems to me we do not. It could indeed be possible that upon deeper analysis, we might find that previous failures in our investigative methods largely contribute to our current lack of complete confidence in them. However, as we have often emphasized, this falls under Psychological rather than logical inquiry. It is certainly not the case that we are typically consciously influenced by such occasional or repeated instances of past failures. To the extent that they do have an impact, it seems to create a vague sense of doubt that cannot be justified by any statistical rationale, at least not quantitatively. Part of our uncertainty comes empathetically from those who have investigated the topic more closely. Again, while analysis might reveal that a certain number of past failures contribute to this distrust, it doesn't clearly manifest. Moreover, one reason we can't feel fully confident in our inductions is that we have to rely on memory for some of our data; and can anyone claim that the sole reason we don't fully trust our memory of events long ago is merely because we've been misled in that way previously?
In any other sense, therefore, than as a needful protest against attaching too great demonstrative force to the conclusions of Inductive Logic, it seems decidedly misleading to speak of its reasonings as resting upon Probability.
In any other way, then, except as a necessary protest against placing too much emphasis on the conclusions of Inductive Logic, it seems definitely misleading to say that its reasoning relies on Probability.
§ 10. We may now see clearly the reasons for the limits within which causation[3] is necessarily required, but beyond which it is not needed. To be able to generalize a formula so as to extend it from the observed to the unobserved, it is clearly essential that there should be a certain permanence in the order of nature; this permanence is one form of what is implied in the term causation. If the 212 circumstances under which men live and die remaining the same, we did not feel warranted in inferring that four men out of ten would continue to live to fifty, because in the case of those whom we had observed this proportion had hitherto done so, it is clear that we should be admitting that the same antecedents need not be followed by the same consequents. This uniformity being what the Law of Causation asserts, the truth of the law is clearly necessary to enable us to obtain our generalizations: in other words, it is necessary for the Inductive part of the process. But it seems to be equally clear that causation is not necessary for that part of the process which belongs to Probability. Provided only that the truth of our generalizations is secured to us, in the way just mentioned, what does it matter to us whether or not the individual members are subject to causation? For it is not in reality about these individuals that we make inferences. As this last point has been already fully treated in Chapter VI., any further allusion to it need not be made here.
§ 10. We can now clearly see the reasons for the limits within which causation[3] is needed and beyond which it isn't. To generalize a formula from what we've observed to what we haven't, it's essential that there is some permanence in the order of nature; this permanence is one aspect of what causation means. If, with the conditions under which people live and die remaining the same, we did not feel justified in concluding that four out of ten men would continue to live to fifty, given that this proportion has held for those we have observed, we would in effect be admitting that the same causes need not lead to the same effects. This uniformity is what the Law of Causation asserts, and the validity of this law is necessary for us to make our generalizations; in other words, it is vital for the inductive part of the process. However, it's also clear that causation isn't necessary for the probabilistic part of the process. As long as we can ensure the accuracy of our generalizations in the way previously mentioned, it doesn't matter to us whether individual cases are subject to causation. Because in reality, our inferences aren't about these individual cases. Since this point has already been thoroughly discussed in Chapter VI, there's no need to elaborate further here.
§ 11. The above description, or rather indication, of the process of obtaining these generalizations must suffice for the present. Let us now turn and consider the means by which we are practically to make use of them when they are obtained. The point which we had reached in the course of the investigations entered into in the sixth and seventh chapters was this:—Given a series of a certain kind, we could draw inferences about the members which composed it; inferences, that is, of a peculiar kind, the value and meaning of which were fully discussed in their proper place.
§ 11. The description or rather overview of how to obtain these generalizations will have to do for now. Let’s shift our focus to how we can practically apply them once we have them. The point we reached in the investigations covered in the sixth and seventh chapters was this: given a certain type of series, we could make inferences about the components that make it up; these inferences are of a specific kind, the significance and implications of which were fully discussed in their respective sections.
We must now shift our point of view a little; instead of starting, as in the former chapters, with a determinate series supposed to be given to us, let us assume that the individual only is given, and that the work is imposed upon us of finding out the appropriate series. How are we to set about the 213 task? In the former case our data were of this kind:—Eight out of ten men, aged fifty, will live eleven years more, and we ascertained in what sense, and with what certainty, we could infer that, say, John Smith, aged fifty, would live to sixty-one.
We need to change our perspective a bit; instead of starting, like we did in the previous chapters, with a specific series that we think is already given, let’s assume that only the individual is known, and it’s our job to figure out the right series. How should we approach this task? In the previous case, our data looked like this: Eight out of ten men who are fifty years old will live for another eleven years, and we determined how and with what certainty we could predict that, for example, John Smith, who is fifty, would live to be sixty-one.
§ 12. Let us then suppose, instead, that John Smith presents himself, how should we in this case set about obtaining a series for him? In other words, how should we collect the appropriate statistics? It should be borne in mind that when we are attempting to make real inferences about things as yet unknown, it is in this form that the problem will practically present itself.
§ 12. Now, let’s imagine that John Smith shows up. How would we go about gathering data for him? In other words, how do we collect the right statistics? It's important to remember that when we're trying to draw real conclusions about things we don't yet know, this is how the problem will actually appear.
At first sight the answer to this question may seem to be obtained by a very simple process, viz. by counting how many men of the age of John Smith, respectively do and do not live for eleven years. In reality however the process is far from being so simple as it appears. For it must be remembered that each individual thing has not one distinct and appropriate class or group, to which, and to which alone, it properly belongs. We may indeed be practically in the habit of considering it under such a single aspect, and it may therefore seem to us more familiar when it occupies a place in one series rather than in another; but such a practice is merely customary on our part, not obligatory. It is obvious that every individual thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things. By belonging to any one class it of course becomes at the same time a member of all the higher classes, the genera, of which that class was a species. But, moreover, by virtue of each accidental attribute which it possesses, it becomes a member of a class intersecting, so to say, some of the other classes. John Smith 214 is a consumptive man say, and a native of a northern climate. Being a man he is of course included in the class of vertebrates, also in that of animals, as well as in any higher such classes that there may be. The property of being consumptive refers him to another class, narrower than any of the above; whilst that of being born in a northern climate refers him to a new and distinct class, not conterminous with any of the rest, for there are things born in the north which are not men.
At first glance, the answer to this question might seem simple: just count how many men of John Smith's age live for eleven years and how many do not. However, the reality is that the process is far more complicated than it seems. It's important to remember that each individual doesn't belong to just one specific class or group. We might be accustomed to thinking of it in a single way, making it seem more familiar when it fits into one category over another, but that's just a habit, not a rule. It's clear that every individual thing or event has countless observable properties or attributes, allowing it to fit into numerous different classes. By belonging to one class, it also becomes part of all the higher classes to which that class belongs. Additionally, each unique attribute it has can connect it to a class that overlaps with other classes. For example, John Smith is a man who suffers from consumption and is from a northern climate. As a man, he falls into the class of vertebrates and animals, along with any higher classes beyond those. The attribute of being consumptive places him in a more specific class, while being born in a northern climate links him to a distinct class that doesn't overlap with the others since there are non-human entities born in the north.
§ 13. When therefore John Smith presents himself to our notice without, so to say, any particular label attached to him informing us under which of his various aspects he is to be viewed, the process of thus referring him to a class becomes to a great extent arbitrary. If he had been indicated to us by a general name, that, of course, would have been some clue; for the name having a determinate connotation would specify at any rate a fixed group of attributes within which our selection was to be confined. But names and attributes being connected together, we are here supposed to be just as much in ignorance what name he is to be called by, as what group out of all his innumerable attributes is to be taken account of; for to tell us one of these things would be precisely the same in effect as to tell us the other. In saying that it is thus arbitrary under which class he is placed, we mean, of course, that there are no logical grounds of decision; the selection must be determined by some extraneous considerations. Mere inspection of the individual would simply show us that he could equally be referred to an indefinite number of classes, but would in itself give no inducement to prefer, for our special purpose, one of these classes to another.
§ 13. So when John Smith comes to our attention without any specific label identifying which aspect of him we should focus on, deciding how to classify him becomes largely arbitrary. If he had been given a general name, that would have been a clue, as a name that carries a specific meaning would narrow down a set of traits we should consider. But since names and traits are connected, we're just as clueless about what name he goes by as we are about which of his many traits we should focus on; telling us one is just like telling us the other. When we say it's arbitrary which category he falls into, we mean there aren't any logical reasons to decide; the choice has to be based on some outside factors. Just looking at the individual would show that he could belong to countless categories, but it wouldn't give us a reason to choose one over another for our specific purpose.
This variety of classes to which the individual may be referred owing to his possession of a multiplicity of attributes, 215 has an important bearing on the process of inference which was indicated in the earlier sections of this chapter, and which we must now examine in more special reference to our particular subject.
This range of categories that a person may belong to, due to having multiple attributes, is significant for the reasoning process outlined in the earlier sections of this chapter, which we now need to examine more closely in relation to our particular subject.
§ 14. It will serve to bring out more clearly the nature of some of those peculiarities of the step which we are now about to take in the case of Probability, if we first examine the form which the corresponding step assumes in the case of ordinary Logic. Suppose then that we wished to ascertain whether a certain John Smith, a man of thirty, who is amongst other things a resident in India, and distinctly affected with cancer, will continue to survive there for twenty years longer. The terms in which the man is thus introduced to us refer him to different classes in the way already indicated. Corresponding to these classes there will be a number of propositions which have been obtained by previous observations and inductions, and which we may therefore assume to be available and ready at hand when we want to make use of them. Let us conceive them to be such as these following:—Some men live to fifty; some Indian residents live to fifty; no man suffering thus from cancer lives for five years. From the first and second of these premises nothing whatever can be inferred, for they are both[4] particular propositions, and therefore lead to no conclusion in this case. The third answers our enquiry decisively.
§ 14. To clarify the nature of some specific aspects of the step we're about to take regarding Probability, let's first look at how this step appears in ordinary Logic. Imagine we want to determine whether a certain John Smith, a thirty-year-old man living in India who is notably affected by cancer, will survive for another twenty years there. The way this man is introduced refers him to different categories, as previously mentioned. For these categories, there are several propositions derived from earlier observations and inductions that we can assume are available for our use. Let's consider these propositions: Some men live to fifty; some residents of India live to fifty; no man suffering from cancer lives for five years. From the first two propositions, we can't draw any conclusions, since they are both[4] particular rather than universal propositions and so lead us to no conclusion in this case. The third proposition answers our question definitively.
To the logical reader it will hardly be necessary to point out that the process here under consideration is that of finding middle terms which shall serve to connect the subject and predicate of our conclusion. This subject and predicate in the case in question, are the individual before 216 us and his death within the stated period. Regarded by themselves there is nothing in common between them, and therefore no link by which they may be connected or disconnected with each other. The various classes above referred to are a set of such middle terms, and the propositions belonging to them are a corresponding set of major premises. By the help of any one of them we are enabled, under suitable circumstances, to connect together the subject and predicate of the conclusion, that is, to infer whether the man will or will not live twenty years.
To a logical reader, it’s pretty obvious that the process we’re discussing is about finding middle terms that connect the subject and predicate of our conclusion. In this case, the subject and predicate are the individual in front of us and his death within the given timeframe. When looked at separately, there’s nothing in common between them, so there’s no way to connect or disconnect them. The various classes mentioned earlier serve as a set of these middle terms, and the propositions related to them are a relevant set of major premises. With any one of those, we can, under the right conditions, link the subject and predicate of the conclusion, meaning we can figure out if the man will live for twenty more years or not.
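For readers who like to see the procedure in concrete form, the search for a middle term can be caricatured with a short sketch in Python. The premises, the class names, and the helper function below are all invented for illustration; only a universal premise ("all" or "no") about a class that contains the individual settles the question, exactly as described above.

```python
# A minimal sketch of the middle-term search in ordinary (non-probabilistic) logic.
# Each major premise states, of a whole class, whether its members survive the
# period in question: "all", "no", or merely "some".  Only a universal premise
# about a class the individual belongs to yields any conclusion at all.

major_premises = {
    "man": "some",                # some men live to fifty
    "resident_in_India": "some",  # some Indian residents live to fifty
    "cancer_patient": "no",       # no man so suffering lives five years
}

john_smith = {"man", "resident_in_India", "cancer_patient"}

def will_survive(premises, classes):
    for middle_term in classes:
        quantity = premises.get(middle_term)
        if quantity == "all":
            return True      # survival follows with certainty
        if quantity == "no":
            return False     # non-survival follows with certainty
    return None              # only particular premises apply: nothing follows

print(will_survive(major_premises, john_smith))   # -> False
```

Notice that the answer, when one can be given at all, is given with certainty; the sketch either returns a definite yes or no, or it returns nothing.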
§ 15. Now in the performance of such a logical process there are two considerations to which the reader's attention must for a moment be directed. They are simple enough in this case, but will need careful explanation in the corresponding case in Probability. In the first place, it is clear that whenever we can make any inference at all, we can do so with absolute certainty. Logic, within its own domain, knows nothing of hesitation or doubt. If the middle term is appropriate it serves to connect the extremes in such a way as to preclude all uncertainty about the conclusion; if it is not, there is so far an end of the matter: no conclusion can be drawn, and we are therefore left where we were. Assuming our premises to be correct, we either know our conclusion for certain, or we know nothing whatever about it. In the second place, it should be noticed that none of the possible alternatives in the shape of such major premises as those given above can ever contradict any of the others, or be at all inconsistent with them. Regarded as isolated propositions, there is of course nothing to secure such harmony; they have very different predicates, and may seem quite out of each other's reach for either support or opposition. But by means of the other premise they are in each case brought into relation with one another, and the general 217 interests of truth and consistency prevent them therefore from contradicting one another. As isolated propositions it might have been the case that all men live to fifty, and that no Indian residents do so, but having recognised that some men are residents in India, we see at once that these premises are inconsistent, and therefore that one or other of them must be rejected. In all applied logic this necessity of avoiding self-contradiction is so obvious and imperious that no one would think it necessary to lay down the formal postulate that all such possible major premises are to be mutually consistent. To suppose that this postulate is not complied with, would be in effect to make two or more contradictory assumptions about matters of fact.
§ 15. Now, in carrying out this logical process, there are two things that the reader needs to consider for a moment. They are straightforward here but will require careful explanation in the context of Probability. First, it is obvious that whenever we can draw any inference at all, we do so with complete certainty. Logic, in its realm, has no room for hesitation or doubt. If the middle term is appropriate, it links the extremes in a way that eliminates all uncertainty about the conclusion; if it isn’t, that’s the end of the discussion: no conclusion can be drawn, and we remain in the same position. Assuming our premises are correct, we either know our conclusion for sure or we know nothing at all about it. Secondly, it’s important to note that none of the possible alternatives in the form of major premises like the ones given above can ever contradict or be inconsistent with one another. Viewed as isolated statements, there’s nothing to ensure such harmony; they have very different predicates and may seem far removed from each other in terms of support or opposition. However, through the other premise, they are each connected, and the overarching interests of truth and consistency prevent them from contradicting each other. As isolated statements, it might be possible to say that all men live to fifty, and that no Indian residents do so, but once we acknowledge that some men are residents in India, we quickly see that these premises are inconsistent, meaning one must be rejected. In all applied logic, this need to avoid self-contradiction is so clear and compelling that no one would think it necessary to establish the formal principle that all such possible major premises should be mutually consistent. To assume this principle is not observed would effectively mean making two or more contradictory claims about facts.
§ 16. But now observe the difference when we attempt to take the corresponding step in Probability. For ordinary propositions, universal or particular, substitute statistical propositions of what we have been in the habit of calling the ‘proportional’ kind. In other words, instead of asking whether the man will live for twenty years, let us ask whether he will live for one year? We shall be unable to find any universal propositions which will cover the case, but we may without difficulty obtain an abundance of appropriate proportional ones. They will be of the following description:—Of men aged 30, 98 in 100 live another year; of residents in India a smaller proportion survive, let us for example say 90 in 100; of men suffering from cancer a smaller proportion still, let us say 20 in 100.
§ 16. But now notice the difference when we try to take the same step in Probability. Instead of ordinary propositions, whether universal or specific, let's use statistical propositions that we typically refer to as the ‘proportional’ kind. In other words, instead of asking whether a man will live for twenty years, let's ask whether he will live for one year. We won't be able to find any universal propositions that apply, but we can easily gather a lot of relevant proportional ones. They will look something like this: Of men aged 30, 98 out of 100 live another year; among residents in India, a smaller proportion survives—let's say 90 out of 100; and among men with cancer, an even smaller proportion survives, say 20 out of 100.
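If the same search is made with the proportional statements just quoted, every class now returns an answer, and the answers disagree. The figures below are simply the illustrative ones used in the text, and the idea of looking each class up in a table is only a sketch of the situation, not a recommended procedure.

```python
# The proportional premises of the text, read as one-year survival rates.
survival_rate = {
    "man_aged_30": 98 / 100,        # 98 in 100 live another year
    "resident_in_India": 90 / 100,  # 90 in 100 live another year
    "cancer_patient": 20 / 100,     # 20 in 100 live another year
}

john_smith = ["man_aged_30", "resident_in_India", "cancer_patient"]

# Unlike the universal premises of ordinary logic, every one of these classes
# gives a qualified answer for the very same individual, and the answers differ.
for reference_class in john_smith:
    print(reference_class, survival_rate[reference_class])   # 0.98, 0.9, 0.2
```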
Now in both of the respects to which attention has just been drawn, propositions of this kind offer a marked contrast with those last considered. In the first place, they do not, like ordinary propositions, either assert unequivocally yes or no, or else refuse to open their lips; but they give instead a sort of qualified or hesitating answer concerning 218 the individuals included in them. This is of course nothing more than the familiar characteristic of what may be called ‘probability propositions.’ But it leads up to, and indeed renders possible, the second and more important point; viz. that these various answers, though they cannot directly and formally contradict each other (this their nature as proportional propositions, will not as a rule permit), may yet, in a way which will now have to be pointed out, be found to be more or less in conflict with each other.
Now, regarding both aspects we've just talked about, propositions of this kind stand in stark contrast to the ones we just examined. Firstly, they don't straightforwardly say yes or no like regular propositions, nor do they simply stay silent; instead, they provide a sort of qualified or uncertain response about the individuals involved. This is really just a common feature of what we can call ‘probability propositions.’ However, this leads to the second and more significant point; namely, that these different answers, although they can't directly and formally contradict each other (due to their nature as proportional propositions), may still, as we will now explain, be found to be somewhat conflicting with one another.
Hence it follows that in the attempt to draw a conclusion from premises of the kind in question, we may be placed in a position of some perplexity; but it is a perplexity which may present itself in two forms, a mild and an aggravated form. We will notice them in turn.
Hence it follows that when we try to draw a conclusion from these kinds of premises, we might find ourselves a bit confused; but this confusion can show up in two ways, one mild and one more intense. Let’s look at them one at a time.
§ 17. The mild form occurs when the different classes to which the individual case may be appropriately referred are successively included one within another; for here our sets of statistics, though leading to different results, will not often be found to be very seriously at variance with one another. All that comes of it is that as we ascend in the scale by appealing to higher and higher genera, the statistics grow continually less appropriate to the particular case in point, and such information therefore as they afford becomes gradually less explicit and accurate.
§ 17. The mild form happens when the different categories that apply to a specific case are nested within each other; in this situation, our sets of statistics, even though they lead to different outcomes, usually aren't found to be significantly at odds with one another. The result is that as we move up the hierarchy by referring to broader categories, the statistics become less relevant to the specific case, and the information they provide becomes gradually less clear and precise.
The question that we originally wanted to determine, be it remembered, is whether John Smith will die within one year. But all knowledge of this fact being unattainable, owing to the absence of suitable inductions, we felt justified (with the explanation, and under the restrictions mentioned in Chap VI.), in substituting, as the only available equivalent for such individual knowledge, the answer to the following statistical enquiry, What proportion of men in his circumstances die?
The question we initially wanted to answer is whether John Smith will die within a year. However, since we can't get that information due to the lack of suitable evidence, we believed it was reasonable (with the explanation and under the limitations noted in Chap VI.) to replace that individual knowledge with the answer to this statistical inquiry: What percentage of men in his situation die?
§ 18. But then at once there begins to arise some doubt and ambiguity as to what exactly is to be understood by his circumstances. We may know very well what these circumstances are in themselves, and yet be in perplexity as to how many of them we ought to take into account when endeavouring to estimate his fate. We might conceivably, for a beginning, choose to confine our attention to those properties only which he has in common with all animals. If so, and statistics on the subject were attainable, they would presumably be of some such character as this, Ninety-nine animals out of a hundred die within a year. Unusual as such a reference would be, we should, logically speaking, be doing nothing more than taking a wider class than the one we were accustomed to. Similarly we might, if we pleased, take our stand at the class of vertebrates, or at that of mammalia, if zoologists were able to give us the requisite information. Of course we reject these wide classes and prefer a narrower one. If asked why we reject them, the natural answer is that they are so general, and resemble the particular case before us in so few points, that we should be exceedingly likely to go astray in trusting to them. Though accuracy cannot be insured, we may at least avoid any needless exaggeration of the relative number and magnitude of our errors.
§ 18. But then doubts and uncertainties start to emerge about what exactly we should understand regarding his circumstances. We may know quite well what these circumstances are on their own, but still feel confused about how many of them we should consider when trying to estimate his fate. For instance, we could choose to focus only on the characteristics he shares with all animals. If we did that, and if statistics on the subject were available, they might look something like this: Ninety-nine animals out of a hundred die within a year. Although this reference might seem unusual, logically, we would simply be considering a broader category than usual. Likewise, we could choose to refer to the class of vertebrates or mammals, assuming zoologists could provide us with the necessary information. However, we tend to dismiss these broad categories and opt for a more specific one. If asked why we reject them, the natural response is that they are too general and share very few similarities with the specific case we’re dealing with, making it quite likely that we would go wrong by relying on them. Although we can’t guarantee accuracy, we can at least avoid unnecessarily exaggerating the relative number and significance of our mistakes.
§ 19. The above answer is quite valid; but whilst cautioning us against appealing to too wide a class, it seems to suggest that we cannot go wrong in the opposite direction, that is in taking too narrow a class. And yet we do avoid any such extremes. John Smith is not only an Englishman; he may also be a native of such a part of England, be living in such a Presidency, and so on. An indefinite number of such additional characteristics might be brought out into notice, many of which at any rate have some bearing upon 220 the question of vitality. Why do we reject any consideration of these narrower classes? We do reject them, but it is for what may be termed a practical rather than a theoretical reason. As was explained in the first chapters, it is essential that our series should contain a considerable number of terms if they are to be of any service to us. Now many of the attributes of any individual are so rare that to take them into account would be at variance with the fundamental assumption of our science, viz. that we are properly concerned only with the averages of large numbers. The more special and minute our statistics the better, provided only that we can get enough of them, and so make up the requisite large number of instances. This is, however, impossible in many cases. We are therefore obliged to neglect one attribute after another, and so to enlarge the contents of our class; at the avowed risk of somewhat increased variety and unsuitability in the members of it, for at each step of this kind we diverge more and more from the sort of instances that we really want. We continue to do so, until we no longer gain more in quantity than we lose in quality. We finally take our stand at the point where we first obtain statistics drawn from a sufficiently large range of observation to secure the requisite degree of stability and uniformity.
§ 19. The answer above is definitely valid; however, while warning us against appealing to overly broad categories, it seems to imply that we can't go wrong by going too narrow. Yet we avoid such extremes. John Smith isn’t just an Englishman; he might also be from a specific part of England, living in a particular area, and so on. An unlimited number of additional traits could be highlighted, many of which indeed relate to the issue of vitality. Why do we ignore these narrower categories? We do dismiss them, but for what could be called practical rather than theoretical reasons. As explained in the earlier chapters, it’s crucial that our series includes a significant number of terms if they’re going to be useful. A lot of an individual’s characteristics are so uncommon that considering them would contradict the fundamental premise of our science, which is that we’re really only focused on the averages of large groups. The more detailed and specific our statistics, the better, as long as we have enough of them to create the needed large number of cases. However, this often isn’t feasible. Consequently, we must overlook one trait after another and broaden our category, knowing there’s a risk of increasing variety and less relevance in its members, as each step we take leads us further away from the types of examples we actually want. We keep doing this until the gain in quantity no longer outweighs the loss in quality. We ultimately settle at the point where we first gather statistics from a sufficiently broad observation range to achieve the necessary stability and consistency.
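The stopping rule described in this section, namely to keep dropping attributes until the class is populated enough to yield a stable proportion, can be caricatured as follows. The observation counts and the threshold are invented purely for illustration; the text itself fixes no such numbers.

```python
# A caricature of the reference-class stopping rule: start from the most
# specific description of the case and widen it, attribute by attribute,
# until the class contains enough observed instances to give a stable rate.

classes = [  # (description, (deaths within a year, total observed)); invented counts
    ("English, consumptive, northern-born, aged 50", (3, 7)),
    ("English, consumptive, aged 50", (40, 130)),
    ("English, aged 50", (2_100, 10_400)),
    ("men, aged 50", (19_800, 99_000)),
]

MIN_OBSERVATIONS = 1_000   # an invented stand-in for "sufficiently large"

def chosen_class(candidates, minimum):
    for name, (deaths, total) in candidates:
        if total >= minimum:
            return name, deaths / total
    name, (deaths, total) = candidates[-1]   # fall back on the widest class
    return name, deaths / total

print(chosen_class(classes, MIN_OBSERVATIONS))
# -> ('English, aged 50', 0.2019...): the first class wide enough to be stable,
#    even though the narrower classes match John Smith more closely.
```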
§ 20. In such an example as the one just mentioned, where one of the successive classes—man—is a well-defined natural kind or species, there is such a complete break in each direction at this point, that every one is prompted to take his stand here. On the one hand, no enquirer would ever think of introducing any reference to the higher classes with fewer attributes, such as animal or organized being: and on the other hand, the inferior classes, created by our taking notice of his employment or place of residence, &c., do not as a rule differ sufficiently in their characteristics 221 from the class man to make it worth our while to attend to them.
§ 20. In the example just mentioned, where one of the sequential categories—man—is a clearly defined natural kind or species, there's such a notable distinction in both directions at this point that everyone feels compelled to take a position here. On one side, no researcher would even consider referencing the broader categories with fewer traits, like animal or organized being; and on the other side, the lower categories, which are identified by factors like job or place of residence, generally don't differ enough in their traits from the class man to make it worthwhile for us to focus on them.
Now and then indeed these characteristics do rise into importance, and whenever this is the case we concentrate our attention upon the class to which they correspond, that is, the class which is marked off by their presence. Thus, for instance, the quality of consumptiveness separates any one off so widely from the majority of his fellow-men in all questions pertaining to mortality, that statistics about the lives of consumptive men differ materially from those which refer to men in general. And we see the result; if a consumptive man can effect an insurance at all, he must do it for a much higher premium, calculated upon his special circumstances. In other words, the attribute is sufficiently important to mark off a fresh class or series. So with insurance against accident. It is not indeed attempted to make a special rate of insurance for the members of each separate trade, but the differences of risk to which they are liable oblige us to take such facts to some degree into account. Hence, trades are roughly divided into two or three classes, such as the ordinary, the hazardous, and the extra-hazardous, each having to pay its own rate of premium.
Now and then, these characteristics indeed become significant, and when that happens, we focus our attention on the class they represent, which is defined by their presence. For example, the trait of being consumptive sets someone apart significantly from most of their peers regarding mortality, so the statistics on the lives of consumptive individuals differ greatly from those of the general population. As a result, if a consumptive person can even get insurance, it will be at a much higher premium based on their specific situation. In other words, this trait is important enough to create a new class or category. The same goes for accident insurance. While we don't typically create a specific insurance rate for members of each individual trade, the varying risks they face require us to consider such factors to some extent. Therefore, trades are broadly categorized into two or three classes, like ordinary, hazardous, and extra-hazardous, each responsible for paying its respective premium rate.
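How such a coarse classification feeds into a premium can be indicated with a toy calculation. The claim probabilities, the sum assured, and the bare pure-premium pricing (no expenses, no profit loading) are all assumptions made for illustration, not the practice of any actual office.

```python
# Toy pure-premium pricing: the one-year premium for each risk class is simply
# that class's claim probability multiplied by the sum assured.

sum_assured = 100.0

claim_probability = {        # invented rates, for illustration only
    "ordinary": 0.010,
    "hazardous": 0.025,
    "extra-hazardous": 0.060,
    "consumptive": 0.080,    # important enough an attribute to form its own class
}

for risk_class, p in claim_probability.items():
    print(f"{risk_class:16s} premium {p * sum_assured:6.2f}")
```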
§ 21. Where one or other of the classes thus corresponds to natural kinds, or involves distinctions of co-ordinate importance with those of natural kinds, the process is not difficult; there is almost always some one of these classes which is so universally recognised to be the appropriate one, that most persons are quite unaware of there being any necessity for a process of selection. Except in the cases where a man has a sickly constitution, or follows a dangerous employment, we seldom have occasion to collect statistics for him from any class but that of men in general of his age in the country.
§ 21. When one of these categories aligns with natural types, or involves equally important distinctions like those of natural types, the process isn't hard; there's almost always a category that is so widely recognized as the right one that most people don't even realize there's a need for selection. Except in cases where someone has fragile health or works in a risky job, we rarely need to gather statistics for him from any group other than that of men in general of his age in the country.
When, however, these successive classes are not ready marked out for us by nature, and thence arranged in easily distinguishable groups, the process is more obviously arbitrary. Suppose we were considering the chance of a man's house being burnt down, with what collection of attributes should we rest content in this instance? Should we include all kinds of buildings, or only dwelling-houses, or confine ourselves to those where there is much wood, or those which have stoves? All these attributes, and a multitude of others may be present, and, if so, they are all circumstances which help to modify our judgment. We must be guided here by the statistics which we happen to be able to obtain in sufficient numbers. Here again, rough distinctions of this kind are practically drawn in Insurance Offices, by dividing risks into ordinary, hazardous, and extra-hazardous. We examine our case, refer it to one or other of these classes, and then form our judgment upon its prospects by the statistics appropriate to its class.
When these different categories aren't clearly defined by nature and easily grouped, the process seems more random. Let's say we’re looking at the likelihood of someone's house burning down; what set of characteristics should we consider in this case? Should we look at all types of buildings, just residential homes, only those made mostly of wood, or those that have stoves? All these features, along with many others, could be involved. If they are, they all influence our judgment. We need to rely on the statistics we can gather in enough quantity. Similarly, Insurance Companies make rough distinctions by classifying risks as ordinary, hazardous, or extra-hazardous. We evaluate our situation, categorize it into one of these classes, and then base our judgment about its risks on the statistics relevant to that category.
§ 22. So much for what may be called the mild form in which the ambiguity occurs; but there is an aggravated form in which it may show itself, and which at first sight seems to place us in far greater perplexity.
§ 22. That covers what can be considered the mild form of ambiguity; however, there is a more intense form that can emerge, and at first glance, it appears to put us in much greater confusion.
Suppose that the different classes mentioned above are not included successively one within the other. We may then be quite at a loss which of the statistical tables to employ. Let us assume, for example, that nine out of ten Englishmen are injured by residence in Madeira, but that nine out of ten consumptive persons are benefited by such a residence. These statistics, though fanciful, are conceivable and perfectly compatible. John Smith is a consumptive Englishman; are we to recommend a visit to Madeira in his case or not? In other words, what inferences are we to draw about the probability of his death? Both of the statistical 223 tables apply to his case, but they would lead us to directly contradictory conclusions. This does not mean, of course, contradictory precisely in the logical sense of that word, for one of these propositions does not assert that an event must happen and the other deny that it must; but contradictory in the sense that one would cause us in some considerable degree to believe what the other would cause us in some considerable degree to disbelieve. This refers, of course, to the individual events; the statistics are by supposition in no degree contradictory. Without further data, therefore, we can come to no decision.
Suppose the different classes mentioned above are not nested within each other. We might then be unsure which statistical tables to use. Let's say, for instance, that nine out of ten Englishmen are harmed by living in Madeira, but at the same time, nine out of ten people with tuberculosis benefit from living there. These statistics, while hypothetical, are possible and perfectly compatible. John Smith is an Englishman with tuberculosis; should we recommend he visit Madeira or not? In other words, what conclusions can we draw about the likelihood of his death? Both statistical tables apply to his situation, but they would lead us to completely contradictory conclusions. This doesn't mean they're contradictory in a strict logical sense, since one proposition doesn't claim that an event must occur while the other denies it; rather, they are contradictory in that one would lead us to believe something that the other would lead us to disbelieve. This refers specifically to individual cases; the statistics themselves aren't inherently contradictory. Therefore, without more information, we can't make a decision.
§ 23. Practically, of course, if we were forced to a decision with only these data before us, we should make our choice by the consideration that the state of a man's lungs has probably more to do with his health than the place of his birth has; that is, we should conclude that the duration of life of consumptive Englishmen corresponds much more closely with that of consumptive persons in general than with that of their healthy countrymen. But this is, of course, to import empirical considerations into the question. The data, as they are given to us, and if we confine ourselves to them, leave us in absolute uncertainty upon the point. It may be that the consumptive Englishmen almost all die when transported into the other climate; it may be that they almost all recover. If they die, this is in obvious accordance with the first set of statistics; it will be found in accordance with the second set through the fact of the foreign consumptives profiting by the change of climate in more than what might be termed their due proportion. A similar explanation will apply to the other alternative, viz. to the supposition that the consumptive Englishmen mostly recover. The problem is, therefore, left absolutely indeterminate, for we cannot here appeal to any general rule 224 so simple and so obviously applicable as that which, in a former case, recommended us always to prefer the more special statistics, when sufficiently extensive, to those which are wider and more general. We have no means here of knowing whether one set is more special than the other.
§ 23. In practical terms, if we had to make a decision based solely on this information, we would likely choose based on the idea that a person's lung condition probably affects their health more than where they were born. In other words, we would think that how long consumptive Englishmen live is much more similar to the life expectancy of consumptive people in general than to that of their healthy peers. However, this brings empirical considerations into the discussion. The data, as presented, leaves us completely uncertain on this point. It could be that nearly all consumptive Englishmen die when moved to another climate; or it might be the case that they mostly recover. If they die, that agrees directly with the first set of statistics; and it can still agree with the second set, because the foreign consumptives may profit from the change of climate in more than what might be called their due proportion. A similar explanation applies to the other scenario, namely the assumption that consumptive Englishmen mostly recover. Therefore, the issue remains completely unresolved, because we cannot appeal here to any general rule as simple and as obviously applicable as the one that, in the earlier case, told us always to prefer the more specific statistics, when sufficiently extensive, over broader, more general ones. We have no way of knowing if one set is more specific than the other.
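That the two quoted statistics are compatible, and yet leave the overlapping class undetermined, can be checked directly. The toy population below is invented solely to satisfy the two proportions; in one arrangement every consumptive Englishman is harmed, in the other every one benefits, and both arrangements honour the same figures.

```python
# Two toy populations, each satisfying exactly:
#   9 in 10 Englishmen are injured by residence in Madeira,
#   9 in 10 consumptives are benefited by it,
# while the 50 consumptive Englishmen fare oppositely in the two cases.

def marginal_rates(harmed_eng_noncons, harmed_eng_cons,
                   benefited_foreign_cons, benefited_eng_cons):
    english = 1000        # 950 non-consumptive + 50 consumptive
    consumptives = 1000   # 950 foreign + 50 English
    harmed_english = harmed_eng_noncons + harmed_eng_cons
    benefited_consumptives = benefited_foreign_cons + benefited_eng_cons
    return harmed_english / english, benefited_consumptives / consumptives

# Arrangement 1: all 50 consumptive Englishmen are harmed.
print(marginal_rates(850, 50, 900, 0))   # -> (0.9, 0.9)

# Arrangement 2: all 50 consumptive Englishmen benefit.
print(marginal_rates(900, 0, 850, 50))   # -> (0.9, 0.9)
```

In the first arrangement the foreign consumptives profit in more than their due proportion, which is exactly the reconciliation sketched in the paragraph above.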
And in this no difficulty can be found, so long as we confine ourselves to a just view of the subject. Let me again recall to the reader's mind what our present position is; we have substituted for knowledge of the individual (finding that unattainable) a knowledge of what occurs in the average of similar cases. This step had to be taken the moment the problem was handed over to Probability. But the conception of similarity in the cases introduces us to a perplexity; we manage indeed to evade it in many instances, but here it is inevitably forced upon our notice. There are here two aspects of this similarity, and they introduce us to two distinct averages. Two assertions are made as to what happens in the long run, and both of these assertions, by supposition, are verified. Of their truth there need be no doubt, for both were supposed to be obtained from experience.
And in this, there's no difficulty as long as we maintain a fair perspective on the topic. Let me remind the reader of our current situation; we've replaced the understanding of the individual (since that's impossible to achieve) with knowledge of what happens in the average of similar cases. This change was necessary the moment we handed the problem over to Probability. However, the idea of similarity between cases brings us to a confusion; we can dodge it in many situations, but here it's unavoidable. There are two sides to this similarity, leading us to two different averages. Two claims are made about what happens in the long run, and both of these claims are, by supposition, verified. There can be no doubt about their truth, as both were assumed to be derived from experience.
§ 24. It may perhaps be supposed that such an example as this is a reductio ad absurdum of the principle upon which Life and other Insurances are founded. But a moment's consideration will show that this is quite a mistake, and that the principle of insurance is just as applicable to examples of this kind as to any other. An office need find no difficulty in the case supposed. They might (for a reason to be mentioned presently, they probably would not) insure the individual without inconsistency at a rate determined by either average. They might say to him, “You are an Englishman. Out of the multitude of English who come to us nine in ten die if they go to Madeira. We will insure 225 you at a rate assigned by these statistics, knowing that in the long run all will come right so far as we are concerned. You are also consumptive, it is true, and we do not know what proportion of the English are consumptive, nor what proportion of English consumptives die in Madeira. But this does not really matter for our purpose. The formula, nine in ten die, is in reality calculated by taking into account these unknown proportions; for, though we do not know them in themselves, statistics tell us all that we care to know about their results. In other words, whatever unknown elements may exist, must, in regard to all the effects which they can produce, have been already taken into account, so that our ignorance about them cannot in the least degree invalidate such conclusions as we are able to draw. And this is sufficient for our purpose.” But precisely the same language might be held to him if he presented himself as a consumptive man; that is to say, the office could safely carry on its proceedings upon either alternative.
§ 24. It might be thought that an example like this is a reductio ad absurdum of the principle on which Life and other Insurances are based. However, a moment's thought will reveal that this is a misunderstanding, and that the insurance principle applies to these examples just as well as to any others. An office wouldn't have any trouble with the scenario described. They might (though they probably wouldn't, for reasons I'll explain shortly) insure the person, without inconsistency, at a rate determined by either of the two averages. They might say to him, “You are an Englishman. Out of the many English who come to us, nine out of ten die if they go to Madeira. We will insure you at a rate based on these statistics, knowing that in the long run, it will all even out for us. You also have a lung condition, and while we don't know the percentage of English people with lung conditions, nor how many of them die in Madeira, that doesn't really matter for us. The statistic, nine out of ten die, actually takes these unknown percentages into account; because, even though we don't know them specifically, statistics give us all the information we need about the outcomes. In other words, whatever unknown factors might be out there have already been considered regarding all the effects they can have, so our lack of knowledge about them doesn't invalidate the conclusions we can draw. And that is enough for our purpose.” But the same reasoning could be applied if he presented himself as someone with a lung condition; the office could confidently proceed with either option.
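The office's point, that the unknown proportion of consumptives is already folded into the aggregate nine in ten, is just the observation that an overall rate is a mixture of its subgroup rates weighted by the subgroup shares. The subgroup figures below are invented; whatever they are, pricing on the aggregate breaks even so long as applicants keep arriving in the same mix as when the statistics were collected.

```python
# The aggregate English death rate is a mixture of its (unknown) subgroup rates:
#   P(die | English) = P(die | consumptive English) * share
#                    + P(die | other English) * (1 - share)
# The office never needs the pieces separately, only the observed aggregate.

share_consumptive = 0.05      # invented and, to the office, irrelevant
die_consumptive = 0.50        # invented subgroup rate
die_other = (0.90 - die_consumptive * share_consumptive) / (1 - share_consumptive)

aggregate = die_consumptive * share_consumptive + die_other * (1 - share_consumptive)
print(round(aggregate, 10))   # 0.9, by construction

sum_assured = 100.0
premium = 0.90 * sum_assured               # priced on the aggregate alone
expected_claim = aggregate * sum_assured
print(round(premium - expected_claim, 6))  # 0.0: break-even, whatever the split was
```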
This would, of course, be a very imperfect state for the matter to be left in. The only rational plan would be to isolate the case of consumptive Englishmen, so as to make a separate calculation for their circumstances. This calculation would then at once supersede all other tables so far as they were concerned; for though, in the end, it could not arrogate to itself any superiority over the others, it would in the mean time be marked by fewer and slighter aberrations from the truth.
This would definitely be a pretty flawed situation to leave things in. The only sensible approach would be to focus on the case of Englishmen with tuberculosis, so we could make a separate assessment based on their specific situation. This assessment would then replace all other tables as far as they were concerned; because while, in the end, it couldn't claim to be better than the others, for now it would show fewer and slighter deviations from the truth.
§ 25. The real reason why the Insurance office could not long work on the above terms is of a very different kind from that which some readers might contemplate, and belongs to a class of considerations which have been much neglected in the attempts to construct sciences of the different branches of human conduct. It is nothing else than 226 that annoying contingency to which prophets since the time of Jonah have been subject, of uttering suicidal prophecies; of publishing conclusions which are perfectly certain when every condition and cause but one have been taken into account, that one being the effect of the prophecy itself upon those to whom it refers.
§ 25. The real reason the Insurance office couldn’t continue operating under those terms for long is quite different from what some readers might assume, and it relates to a set of considerations that have been largely overlooked in efforts to create sciences around various aspects of human behavior. It comes down to that frustrating situation that prophets have faced since Jonah’s time—issuing suicidal prophecies; publishing conclusions that seem completely certain when all conditions and causes except one are considered, and that one is the impact of the prophecy itself on those it concerns.
In our example above, the office (in so far as the particular cases in Madeira are concerned) would get on very well until the consumptive Englishmen in question found out what much better terms they could make by announcing themselves as consumptives, and paying the premium appropriate to that class, instead of announcing themselves as Englishmen. But if they did this they would of course be disturbing the statistics. The tables were based upon the assumption that a certain fixed proportion (it does not matter what proportion) of the English lives would continue to be consumptive lives, which, under the supposed circumstances, would probably soon cease to be true. When it is said that nine Englishmen out of ten die in Madeira, it is meant that of those who come to the office, as the phrase is, at random, or in their fair proportions, nine-tenths die. The consumptives are supposed to go there just like red-haired men, or poets, or any other special class. Or they might go in any proportions greater or less than those of other classes, so long as they adhered to the same proportion throughout. The tables are then calculated on the continuance of this state of things; the practical contradiction is in supposing such a state of things to continue after the people had once had a look at the tables. If we merely make the assumption that the publication of these tables made no such alteration in the conduct of those to whom it referred, no hitch of this kind need occur.
In our example above, the office (regarding the specific cases in Madeira) would function quite well until the consumptive Englishmen realized they could negotiate much better terms by identifying themselves as consumptives and paying the appropriate premium for that classification, rather than identifying themselves simply as Englishmen. However, if they did that, they would be disrupting the statistics. The tables were based on the assumption that a consistent percentage (it doesn't matter what percentage) of the English lives would remain consumptive lives, which, under the given conditions, would likely soon no longer be valid. When it is stated that nine out of ten Englishmen die in Madeira, it means that among those who approach the office, so to speak, randomly or in their expected proportions, nine-tenths die. Consumptives are thought to go there just like red-haired individuals, poets, or any other specific group. Or they might go in any percentage more or less than those of other groups, as long as they maintained the same ratio throughout. The tables are then calculated based on the continuation of this situation; the practical contradiction lies in supposing such a situation could persist after people had seen the tables. If we simply assume that the publication of these tables did not change the behavior of those involved, then this kind of issue wouldn't arise.
§ 26. The assumptions here made, as has been said, are 227 not in any way contradictory, but they need some explanation. It will readily be seen that, taken together, they are inconsistent with the supposition that each of these classes is homogeneous, that is, that the statistical proportions which hold of the whole of either of them will also hold of any portion of them which we may take. There are certain individuals (viz. the consumptive Englishmen) who belong to each class, and of course the two different sets of statistics cannot both be true of them taken by themselves. They might coincide in their characteristics with either class, but not with both; probably in most practical cases they will coincide with neither, but be of a somewhat intermediate character. Now when it is said of any such heterogeneous body that, say, nine-tenths die, what is meant (or rather implied) is that the class might be broken up into smaller subdivisions of a more homogeneous character, in some of which, of course, more than nine-tenths die, whilst in others less, the differences depending upon their character, constitution, profession, &c.; the number of such divisions and the amount of their divergence from one another being perhaps very considerable.
§ 26. The assumptions we've made, as mentioned, aren't contradictory, but they do require some clarification. It's clear that, when combined, they contradict the idea that each of these groups is uniform, meaning that the statistical proportions that apply to the entire group will also apply to any portion we might select. There are specific individuals (like the consumptive Englishmen) who belong to each group, and naturally, the two different sets of statistics can't both be accurate for them alone. They might share characteristics with either group, but not both; in most practical instances, they likely align with neither and fall into a somewhat mixed category. So when we say that, for a mixed group, for example, nine-tenths die, what is implied is that the group could be divided into smaller, more uniform sections, some of which have a death rate higher than nine-tenths, while others have lower rates, with these differences being based on their traits, makeup, profession, etc.; the number of these divisions and how much they differ from one another could be quite significant.
Now when we speak of either class as a whole and say that nine-tenths die, the most natural and soundest meaning is that that would be the proportion if all without exception went abroad, or (what comes to the same thing) if each of these various subdivisions was represented in fair proportion to its numbers. Or it might only be meant that they go in some other proportion, depending upon their tastes, pursuits, and so on. But whatever meaning be adopted one condition is necessary, viz. that the proportion of each class that went at the time the statistics were drawn up must be adhered to throughout. When the class is homogeneous this is not needed, but when it is heterogeneous the 228 statistics would be interfered with unless this condition were secured.
Now, when we talk about either class as a whole and say that nine out of ten die, the most straightforward and sensible interpretation is that this would be the ratio if everyone without exception went abroad, or (which amounts to the same thing) if each of these different groups was represented fairly according to their numbers. It could also mean that they go abroad in some other ratio, depending on their interests, activities, and so on. But whatever interpretation we choose, one condition is essential, namely that the proportion of each class going abroad at the time the statistics were gathered must stay the same throughout. When the class is uniform, this isn't necessary, but when it's diverse, the statistics would be skewed unless this condition is met.
We are here supposed to have two sets of statistics, one for the English and one for the consumptives, so that the consumptive English are in a sense counted twice over. If their mortality is of an intermediate amount, therefore, they serve to keep down the mortality of one class and to keep up that of the other. If the statistics are supposed to be exhaustive, by referring to the whole of each class, it follows that actually the same individuals must be counted each time; but if representatives only of each class are taken, the same individuals need not be inserted in each set of tables.
We are supposed to have two sets of statistics, one for the English and one for those with consumption, meaning that English people with consumption are basically counted twice. If their mortality rate is intermediate between the two, they lower the mortality rate for one group while raising it for the other. If the statistics are meant to be complete, considering everyone in each group, then the same individuals have to be counted each time; but if only a sample from each group is used, the same individuals don't have to be included in both sets of tables.
§ 27. When therefore they come to insure (our remarks are still confined to our supposed Madeira case), we have some English consumptives counted as English, and paying the high rate; and others counted as consumptives and paying the low rate. Logically indeed we may suppose them all entered in each class, and paying therefore each rate. What we have said above is that any individual may be conceived to present himself for either of these classes. Conceive that some one else pays his premium for him, so that it is a matter of indifference to him personally at which rate he insures, and there is nothing to prevent some of the class (or for that matter all) going to one class, and others (or all again) going to the other class.
§ 27. So when they come to get insurance (we're still talking about our hypothetical Madeira case), we have some English people with tuberculosis classified as English and paying the higher rate, while others are classified as consumptives and paying the lower rate. Logically, indeed, we may suppose them all entered in both classes, and therefore paying both rates. What we've mentioned above is that any individual could choose to present himself for either class. Imagine someone else paying his premium for him, so that it makes no difference to him personally at which rate he insures; then there is nothing to stop some of the class (or, for that matter, all of them) from going into one category, and others (or, again, all of them) from going into the other.
So long therefore as we make the logically possible though practically absurd supposition that some men will continue to pay a higher rate than they need, there is nothing to prevent the English consumptives (some or all) from insuring in each category and paying its appropriate premium. As soon as they gave any thought to the matter, of course they would, in the case supposed, all prefer to insure as consumptives. But their doing this would disturb each set 229 of statistics. The English mortality in Madeira would instantly become heavier, so far as the Insurance company was concerned, by the loss of all their best lives; whilst the consumptive statistics (unless all the English consumptives had already been taken for insurance) would be in the same way deteriorated.[5] A slight readjustment therefore of each scale of insurance would then be needed; this is the disturbance mentioned just above. It must be clearly understood, however, that it is not our original statistics which have proved to be inconsistent, but simply that there were practical obstacles to carrying out a system of insurance upon them.
As long as we entertain the logically possible yet practically absurd idea that some people will keep paying more than they need to, there's nothing stopping English individuals with tuberculosis (some or all) from getting insurance in each category and paying the corresponding premium. Once they gave the matter any thought, they would of course all choose to get insured as tuberculosis patients. However, doing this would disturb each set of statistics. The mortality rate of English individuals in Madeira would immediately increase, as far as the insurance company was concerned, due to the loss of all their best lives; meanwhile, the statistics for tuberculosis patients (unless all English individuals with tuberculosis had already been insured) would deteriorate in the same way.[5] A slight readjustment of each scale of insurance would then be needed; this is the disturbance mentioned just above. It must be clearly understood, however, that it is not our original statistics that have proven inconsistent, but simply that there were practical obstacles to carrying out a system of insurance based on them.
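The disturbance just described can be mimicked with a toy calculation. Every figure below is invented: the consumptive Englishmen are given a Madeira mortality midway between the two class averages, and they are assumed to have been split evenly between the two insurance classes when the tables were first drawn up.

```python
# Toy illustration of the self-defeating tables of Sections 25 to 27.

def pool_mortality(groups):
    """Mortality of a mixed insured pool, given (count, mortality) pairs."""
    deaths = sum(n * m for n, m in groups)
    lives = sum(n for n, _ in groups)
    return round(deaths / lives, 3)

english_other = (950, 0.921)    # non-consumptive English (invented)
english_consum = (50, 0.500)    # the overlapping class (invented)
foreign_consum = (950, 0.079)   # consumptives who are not English (invented)

# Published tables: 25 of the consumptive Englishmen insured in each class.
print(pool_mortality([english_other, (25, 0.500)]))    # about 0.91, the English table
print(pool_mortality([foreign_consum, (25, 0.500)]))   # about 0.09, the consumptive table

# After publication every consumptive Englishman takes the cheaper rate: the
# English pool has lost its best lives, the consumptive pool has gained the
# heaviest of its class, and both published rates are now too favourable.
print(pool_mortality([english_other]))                  # 0.921
print(pool_mortality([foreign_consum, english_consum])) # 0.1
```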
§ 28. Examples subject to the difficulty now under consideration will doubtless seem perplexing to the student unacquainted with the subject. They are difficult to reconcile with any other view of the science than that insisted on throughout this Essay, viz. that we are only concerned with averages. It will perhaps be urged that there are two different values of the man's life in these cases, and that they cannot both be true. Why not? The ‘value’ of his life is simply the number of years to which men in his circumstances do, on the average, attain; we have the man set before us under two different circumstances; what wonder, therefore, that these should offer different averages? In such an objection it is forgotten that we have had to substitute for the unattainable result about the individual, the really attainable result about a set of men as much like him as possible. The difficulty and apparent contradiction only arise when people will try to find some justification for their belief in the individual case. What can we possibly conclude, 230 it may be asked, about this particular man John Smith's prospects when we are thus offered two different values for his life? Nothing whatever, it must be replied; nor could we in reality draw a conclusion, be it remembered, in the former case, when we were practically confined to one set of statistics. There also we had what we called the ‘value’ of his life, and since we only knew of one such value, we came to regard it as in some sense appropriate to him as an individual. Here, on the other hand, we have two values, belonging to different series, and as these values are really different it may be complained that they are discordant, but such a complaint can only be made when we do what we have no right to do, viz. assign a value to the individual which shall admit of individual justification.
§ 28. Examples related to the difficulty currently being discussed will likely seem confusing to students who aren’t familiar with the topic. They are hard to align with any other perspective in this field except for the one emphasized throughout this Essay, which is that we are only focused on averages. It might be argued that there are two different values of a person's life in these instances, and that both cannot be accurate. Why not? The 'value' of a person’s life is simply the number of years that people in similar situations typically live; we are looking at the person under two different scenarios, so it’s not surprising that these would yield different averages. This objection overlooks that we have replaced the unattainable result pertaining to the individual with the achievable result regarding a group of people as similar to him as possible. The confusion and apparent contradiction arise only when people seek to justify their beliefs in individual cases. What can we really conclude about this specific man John Smith’s future when we are presented with two different values for his life? The answer is nothing at all; we couldn’t actually draw a conclusion in the earlier case either, when we were limited to just one set of data. In that instance, we also had what we called the 'value' of his life, and since we only knew of one such value, we came to consider it somewhat related to him as an individual. In this scenario, however, we have two values from different sets, and while it might seem that these values are incompatible, this complaint is valid only when we improperly try to assign a value to the individual that can be justified on an individual basis.
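For a modern reader, a short computational sketch may make the point about the two ‘values’ concrete. The survival figures below are invented purely for illustration; the sketch only shows how one man, by belonging to two different classes, inherits two different averages, neither of which is ‘his’ value in any individual sense.

```python
# Hypothetical one-year survival figures for two overlapping classes.
# An English consumptive living in Madeira belongs to both of them.

# Class A: English residents in Madeira (healthy and consumptive together)
english_in_madeira = {"survivors": 940, "total": 1000}

# Class B: consumptives in Madeira (English and others together)
consumptives_in_madeira = {"survivors": 870, "total": 1000}

def survival_rate(group):
    """Average one-year survival frequency observed in a class."""
    return group["survivors"] / group["total"]

value_as_english = survival_rate(english_in_madeira)
value_as_consumptive = survival_rate(consumptives_in_madeira)

# The same man is offered two different 'values' of his life, because each
# is an average over a different series of men, not a fact about him alone.
print(f"Reckoned as English:       {value_as_english:.1%} survive the year")
print(f"Reckoned as a consumptive: {value_as_consumptive:.1%} survive the year")
```

Neither figure is contradicted by the other; each simply reports the average of a different series to which the same man happens to belong.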
§ 29. Is it then perfectly arbitrary what series or class of instances we select by which to judge? By no means; it has been stated repeatedly that in choosing a series, we must seek for one the members of which shall resemble our individual in as many of his attributes as possible, subject only to the restriction that it must be a sufficiently extensive series. What is meant is, that in the above case, where we have two series, we cannot fairly call them contradictory; the only valid charge is one of incompleteness or insufficiency for their purpose, a charge which applies in exactly the same sense, be it remembered, to all statistics which comprise genera unnecessarily wider than the species with which we are concerned. The only difference between the two different classes of cases is, that in the one instance we are on a path which we know will lead at the last, through many errors, towards the truth (in the sense in which truth can be attained here), and we took it for want of a better. In the other instance we have two such paths, perfectly different paths, either of which however will lead us towards the truth 231 as before. Contradiction can only seem to arise when it is attempted to justify each separate step on our paths, as well as their ultimate tendency.
§ 29. Is it completely arbitrary which series or group of examples we choose to judge by? Not at all; it's been said many times that when selecting a series, we should look for one where the members share as many attributes as possible with our individual case, as long as the series is extensive enough. In the situation mentioned, where we have two series, we can't truly call them contradictory; the only legitimate concern is one of being incomplete or inadequate for their purpose, a concern that similarly applies to all statistics that involve categories that are unnecessarily broader than the specific case we're dealing with. The only distinction between the two types of cases is that in one, we're following a path that we know will ultimately lead us, despite many mistakes, towards the truth (in the sense of truth that can be achieved here), and we chose it because we had no better option. In the other case, we have two completely different paths, either of which will still guide us towards the truth as before. Contradictions seem to arise only when we try to justify each individual step we take on our paths, as well as their overall direction.
Still it cannot be denied that these objections are a serious drawback to the completeness and validity of any anticipations which are merely founded upon statistical frequency, at any rate in an early stage of experience, when but few statistics have been collected. Such knowledge as Probability can give is not in any individual case of a high order, being subject to the characteristic infirmity of repeated error; but even when measured by its own standard it commences at a very low stage of proficiency. The errors are then relatively very numerous and large compared with what they may ultimately be reduced to.
Still, it can't be denied that these objections are a significant drawback to the completeness and validity of any predictions based solely on statistical frequency, especially at an early stage of experience, when only a few statistics have been gathered. The knowledge that Probability can provide isn't particularly high in quality for any individual case, as it suffers from the inherent flaw of repeated error; however, even by its own standards, it starts at a very low level of proficiency. The errors at that early stage are relatively numerous and substantial compared to what they can eventually be reduced to.
§ 30. Here as elsewhere there is a continuous process of specialization going on. The needs of a gradually widening experience are perpetually calling upon us to subdivide classes which are found to be too heterogeneous. Sometimes the only complaint that has to be made is that the class to which we are obliged to refer is found to be somewhat too broad to suit our purpose, and that it might be subdivided with convenience. This is the case, as has been shown above, when an Insurance office finds that its increasing business makes it possible and desirable to separate off the men who follow some particular trades from the rest of their fellow-countrymen. Similarly in every other department in which statistics are made use of. This increased demand for specificness leads, in fact, as naturally in this direction, as does the progress of civilization to the subdivision of trades in any town or country. So in reference to the other kind of perplexity mentioned above. Nothing is more common in those sciences or practical arts, in which deduction is but little available, and where in consequence our knowledge is 232 for the most part of the empirical kind, than to meet with suggestions which point more or less directly in contrary directions. Whenever some new substance is discovered or brought into more general use, those who have to deal with it must be familiar with such a state of things. The medical man who has to employ a new drug may often find himself confronted by the two distinct recommendations, that on the one hand it should be employed for certain diseases, and that on the other hand it should not be tried on certain constitutions. A man with such a constitution, but suffering from such a disease, presents himself; which recommendation is the doctor to follow? He feels at once obliged to set to work to collect narrower and more special statistics, in order to escape from such an ambiguity.
§ 30. Here, as in other areas, there's an ongoing process of specialization. The needs of an expanding experience continually prompt us to break down classes that are too diverse. Sometimes the only issue is that the class we must reference is a bit too broad for our purpose, and it could be subdivided more conveniently. This is evident, as mentioned earlier, when an insurance office realizes that its growing business allows for separating individuals in specific trades from the rest of their peers. The same applies in every field where statistics are utilized. This increased demand for specificity leads as naturally in this direction as the progress of civilization leads to the subdivision of trades in any town or country. The same holds for the other kind of perplexity mentioned earlier. It's very common in those sciences or practical arts where deduction isn't very useful, and where our knowledge mostly comes from empirical evidence, to encounter suggestions that point in conflicting directions. Whenever a new substance is discovered or becomes widely used, those working with it must understand this situation. A doctor who has to use a new drug may often find himself faced with two conflicting recommendations: on one hand, it should be used for certain diseases, and on the other hand, it shouldn't be tried on certain constitutions. A patient with such a constitution but suffering from that specific disease comes in; which advice should the doctor follow? He immediately feels compelled to gather more focused and specialized statistics to resolve such ambiguity.
§ 31. In this and a multitude of analogous cases afforded by the more practical arts it is not of course necessary that numerical data should be quoted and appealed to; it is sufficient that the judgment is more or less consciously determined by them. All that is necessary to make the examples appropriate is that we should admit that in their case statistical data are our ultimate appeal in the present state of knowledge. Of course if the empirical laws can be resolved into their component causes we may appeal to direct deduction, and in this case the employment of statistics, and consequently the use of the theory of Probability, may be superseded.
§ 31. In this and many similar situations in practical fields, it's not essential to quote specific numerical data; it's enough that our judgment is somewhat consciously influenced by it. To make the examples relevant, we just need to agree that, given our current understanding, statistical data is our final reference. If we can break down empirical laws into their basic causes, we can rely on direct reasoning, and in that case, the use of statistics and the theory of Probability might be unnecessary.
In this direction therefore, as time proceeds, the advance of statistical refinement by the incessant subdivision of classes to meet the developing wants of man is plain enough. But if we glance backwards to a more primitive stage, we shall soon see in what a very imperfect state the operation commences. At this early stage, however, Probability and Induction are so closely connected together as to be very apt to 233 be merged into one, or at any rate to have their functions confounded.
In this regard, as time goes on, the advance of statistical refinement through the constant subdivision of categories to meet the evolving needs of humanity is clear enough. However, if we look back to a more primitive stage, we'll quickly see in how imperfect a state the process begins. At this early stage, Probability and Induction are so closely linked that they often end up being merged into one or, at the very least, their roles are confused.
§ 32. Since the generalization of our statistics is found to belong to Induction, this process of generalization may be regarded as prior to, or at least independent of, Probability. We have, moreover, already discussed (in Chapter VI.) the step corresponding to what are termed immediate inferences, and (in Chapter VII.) that corresponding to syllogistic inferences. Our present position therefore is that in which we may consider ourselves in possession of any number of generalizations, but wish to employ them so as to make inferences about a given individual; just as in one department of common logic we are engaged in finding middle terms to establish the desired conclusion. In this latter case the process is found to be extremely simple, no accumulation of different middle terms being able to lead to any real ambiguity or contradiction. In Probability, however, the case is different. Here, if we attempt to draw inferences about the individual case before us, as often is attempted—in the Rule of Succession for example—we shall encounter the full force of this ambiguity and contradiction. Treat the question, however, fairly, and all difficulty disappears. Our inference really is not about the individuals as individuals, but about series or successions of them. We wished to know whether John Smith will die within the year; this, however, cannot be known. But John Smith, by the possession of many attributes, belongs to many different series. The multiplicity of middle terms, therefore, is what ought to be expected. We can know whether a succession of men, residents in India, consumptives, &c. die within a year. We may make our selection, therefore, amongst these, and in the long run the belief and consequent conduct of ourselves and other persons (as described in Chapter VI.) will become 234 capable of justification. With regard to choosing one of these series rather than another, we have two opposing principles of guidance. On the one hand, the more special the series the better; for, though not more right in the end, we shall thus be more nearly right all along. But, on the other hand, if we try to make the series too special, we shall generally meet the practical objection arising from insufficient statistics.
§ 32. Since generalizing our statistics is part of the process of Induction, we can see this process of generalization as occurring before or at least separately from Probability. Additionally, we've already talked about (in Chapter VI.) the step related to what are called immediate inferences, and (in Chapter VII.) that related to syllogistic inferences. Our current position is one where we have a number of generalizations but want to use them to make inferences about a specific individual; similar to how, in certain areas of common logic, we look for middle terms to reach the desired conclusion. In this latter case, the process is very straightforward, as no combination of different middle terms typically leads to any real confusion or contradiction. However, in Probability, the situation is different. Here, if we try to make inferences about the individual case at hand, as is often done—in the Rule of Succession for example—we'll face significant ambiguity and contradiction. If we address the question fairly, all difficulties disappear. Our inference is not actually about the individuals themselves, but about groups or sequences of them. We want to know if John Smith will die within the year; however, we can't know that for sure. But John Smith, through his various attributes, belongs to many different groups. Therefore, we should expect a multiplicity of middle terms. We can know whether a group of men, living in India, who have tuberculosis, etc. die within a year. We can make our selection among these groups, and over time, the beliefs and resulting actions of ourselves and others (as described in Chapter VI.) will become justifiable. When it comes to choosing one group over another, we have two conflicting guiding principles. On one hand, the more specific the group, the better; even though it might not end up being more accurate, we will be closer to being right throughout. On the other hand, if we make the group too specific, we will often face practical challenges due to insufficient statistics.
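The closing trade-off can also be sketched numerically. In the toy simulation below, every rate and sample size is assumed purely for illustration: the broad class supplies plenty of data but only loosely fits the individual, while the narrow class fits him closely but rests on too few observations to yield a steady average.

```python
# A sketch of the trade-off between specialness and sufficiency of a series.
# All numbers are invented: the rate for men exactly like John Smith is taken
# to be 0.10, while a broad class that merely contains him averages 0.06.
import random

random.seed(1)

def observed_rate(true_rate, n):
    """Relative frequency of deaths seen in a sample of n lives."""
    return sum(random.random() < true_rate for _ in range(n)) / n

broad_estimate = observed_rate(true_rate=0.06, n=10_000)   # plentiful but too general
narrow_estimate = observed_rate(true_rate=0.10, n=50)      # apt but statistically thin

print(f"Broad class  (n=10000): {broad_estimate:.3f}  (steady, yet not really 'his' rate)")
print(f"Narrow class (n=50):    {narrow_estimate:.3f}  (nearer his case, but unreliable)")
```

Re-running the narrow estimate with a different seed shows how widely it can swing, which is exactly the practical objection raised by insufficient statistics.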
1 Some of my readers may be familiar with a very striking digression in Buffon's Natural History (Natural Hist. of Man, § VIII.), in which he supposes the first man in full possession of his faculties, but with all his experience to gain, and speculates on the gradual acquisition of his knowledge. Whatever may be thought of his particular conclusions the passage is very interesting and suggestive to any student of Psychology.
1 Some of my readers might know about a notable digression in Buffon's Natural History (Natural Hist. of Man, § VIII.), where he imagines the first man fully capable but with all his knowledge still to acquire, and he speculates on how he gradually learns. Regardless of what one might think about his specific conclusions, this passage is quite interesting and thought-provoking for anyone studying Psychology.
2 See also Dugald Stewart (Ed. by Hamilton; VII. pp. 115–119).
2 See also Dugald Stewart (Ed. by Hamilton; VII. pp. 115–119).
3 Required that is for purposes of logical inference within the limits of Probability; it is not intended to imply any doubts as to its actual universal prevalence, or its all-importance for scientific purposes. The subject is more fully discussed in a future chapter.
3 Required for logical reasoning within the bounds of Probability; this does not suggest any skepticism about its widespread reality or its critical significance for scientific purposes. The topic will be explored in more detail in a future chapter.
4 As particular propositions they are both of course identical in form. The fact that the ‘some’ in the former corresponds to a larger proportion than in the latter, is a distinction alien to pure Logic.
4 As particular propositions, they are of course identical in form. The fact that the ‘some’ in the first refers to a larger proportion than the ‘some’ in the second is a distinction that has no place in pure Logic.
5 The reason is obvious. The healthiest English lives in Madeira (viz. the consumptive ones) have now ceased to be reckoned as English; whereas the worst consumptive lives there (viz. the English) are now increased in relative numbers.
5 The reason is clear. The healthiest English lives in Madeira (namely, the consumptive ones) have now stopped being counted as English, while the worst consumptive lives there (namely, the English ones) have now increased in relative numbers.
CHAPTER 10.
CHANCE AS OPPOSED TO CAUSATION AND DESIGN.
§ 1. The remarks in the previous chapter will have served to clear the way for an enquiry which probably excites more popular interest than any other within the range of our subject, viz. the determination whether such and such events are to be attributed to Chance on the one hand, or to Causation or Design on the other. As the principal difficulty seems to arise from the ambiguity with which the problem is generally conceived and stated, owing to the extreme generality of the conceptions involved, it becomes necessary to distinguish clearly between the several distinct issues which are apt to be involved.
§ 1. The comments made in the previous chapter have likely paved the way for an inquiry that probably sparks more public interest than any other topic related to our subject, specifically the question of whether certain events can be attributed to Chance on one side or to Causation or Design on the other. Since the main challenge seems to stem from the confusion surrounding how the problem is typically understood and expressed, due to the broadness of the concepts involved, it is important to clearly differentiate between the various distinct issues that may be at play.
I. There is, to begin with, a very old objection, founded on the assumption which our science is supposed to make of the existence of Chance. The objection against chance is of course many centuries older than the Theory of Probability; and as it seems a nearly obsolete objection at the present day we need not pause long for its consideration. If we spelt the word with a capital C, and maintained that it was representative of some distinct creative or administrative agency, we should presumably be guilty of some form of Manicheism. But the only rational meaning of the objection 236 would appear to be that the principles of the science compel us to assume that events (some events, only, that is) happen without causes, and are thereby removed from the customary control of the Deity. As repeatedly pointed out already this is altogether a mistake. The science of Probability makes no assumption whatever about the way in which events are brought about, whether by causation or without it. All that we undertake to do is to establish and explain a body of rules which are applicable to classes of cases in which we do not or cannot make inferences about the individuals. The objection therefore must be somewhat differently stated, and appears finally to reduce itself to this:—that the assumptions upon which the science of Probability rests, are not inconsistent with a disbelief in causation within certain limits; causation being of course understood simply in the sense of regular sequence. So stated the objection seems perfectly valid, or rather the facts on which it is based must be admitted; though what connection there would be between such lack of causation and absence of Divine superintendence I quite fail to see.
I. To start with, there’s a very old argument based on the assumption that our science presumes the existence of Chance. This objection to chance predates the Theory of Probability by many centuries, and since it seems nearly outdated today, we won’t dwell on it too long. If we capitalized the word and argued that it represented some unique creative or administrative force, we would likely be guilty of some form of Manicheism. However, the only reasonable interpretation of the objection is that the principles of the science require us to believe that some events occur without causes, and thus fall outside the usual control of the Deity. As has been pointed out before, this is entirely mistaken. The science of Probability makes no claims about how events occur, whether through causation or otherwise. Our goal is simply to establish and explain a set of rules that apply to categories of cases where we can’t or don’t make inferences about the individual instances. Therefore, the objection should be rephrased somewhat and essentially comes down to this: the assumptions on which the science of Probability is built do not contradict a disbelief in causation within certain limits, with causation understood merely as a regular sequence. When stated this way, the objection seems perfectly valid, or rather, the facts it rests upon must be acknowledged; still, I don’t see any clear connection between such a lack of causation and the absence of Divine oversight.
As this Theological objection died away the men of physical science, and those who sympathized with them, began to enforce the same protest; and similar cautions are still to be found from time to time in modern treatises. Hume, for instance, in his short essay on Probability, commences with the remark, “though there be no such thing as chance in the world, our ignorance of the real cause of any event has the same influence on the understanding, &c.” De Morgan indeed goes so far as to declare that “the foundations of the theory of Probability have ceased to exist in the mind that has formed the conception,” “that anything ever did happen or will happen without some particular reason why it should have been precisely what it was and 237 not anything else.”[1] Similar remarks might be quoted from Laplace and others.
As this theological objection faded away, scientists and their supporters began to express the same concerns, and similar warnings can still be found in modern writings from time to time. Hume, for example, in his brief essay on Probability, starts with the statement, “though there is no such thing as chance in the world, our ignorance of the real cause of any event has the same effect on understanding, &c.” De Morgan even goes so far as to claim that “the foundations of the theory of Probability have ceased to exist in the mind that has formed the conception,” “that anything ever did happen or will happen without some specific reason for why it was exactly what it was and not something else.”[1] Similar statements could be cited from Laplace and others.
§ 2. In the particular form of the controversy above referred to, and which is mostly found in the region of the natural and physical sciences, the contention that chance and causation are irreconcileable occupies rather a defensive position; the main fact insisted on being that, whenever in these subjects we may happen to be ignorant of the details we have no warrant for assuming as a consequence that the details are uncaused. But this supposed irreconcileability is sometimes urged in a much more aggressive spirit in reference to social enquiries. Here the attempt is often made to prove causation in the details, from the known and admitted regularity in the averages. A considerable amount of controversy was excited some years ago upon this topic, in great part originated by the vigorous and outspoken support of the necessitarian side by Buckle in his History of Civilization.
§ 2. In the specific type of debate mentioned earlier, which mainly occurs in the natural and physical sciences, the argument that chance and causation cannot coexist tends to take a defensive stance. The key point emphasized is that whenever we lack knowledge about the details in these fields, we have no basis for concluding that the details are without cause. However, this supposed incompatibility is sometimes presented more aggressively in social research. In this context, there are frequent attempts to demonstrate causation in the specifics based on the known and accepted consistency in the averages. A significant amount of debate arose on this issue a few years ago, largely sparked by Buckle's strong and candid support for the necessitarian view in his History of Civilization.
It should be remarked that in these cases the attempt is sometimes made as it were to startle the reader into acquiescence by the singularity of the examples chosen. Instances are selected which, though they possess no greater logical value, are, if one may so express it, emotionally more effective. Every reader of Buckle's History, for instance, will remember the stress which he laid upon the observed fact, that the number of suicides in London remains about the same, year by year; and he may remember also the sort of panic with which the promulgation of this fact was accompanied in many quarters. So too the way in which Laplace notices that the number of undirected letters annually sent to the Post Office remains about the same, and the comments of Dugald Stewart upon this particular uniformity, seem to imply that they regarded 238 this instance as more remarkable than many analogous ones taken from other quarters.
It should be noted that in these situations, there's often an attempt to shock the reader into agreement by using unusual examples. Cases are picked that, while they don't have any more logical significance, are, in a way, more emotionally impactful. Every reader of Buckle's History, for example, will recall how he emphasized the fact that the number of suicides in London remains roughly the same each year; they might also remember the kind of panic that arose when this fact was brought up in various circles. Similarly, Laplace points out that the number of misdirected letters sent to the Post Office each year stays about the same, and Dugald Stewart's comments on this specific consistency suggest that they saw this case as more noteworthy than many similar ones from different contexts.
That there is a certain foundation of truth in the reasonings in support of which the above examples are advanced, cannot be denied, but their authors appear to me very much to overrate the sort of opposition that exists between the theory of Chances and the doctrine of Causation. As regards first that wider conception of order or regularity which we have termed uniformity, anything which might be called objective chance would certainly be at variance with this in one respect. In Probability ultimate regularity is always postulated; in tossing a die, if not merely the individual throws were uncertain in their results, but even the average also, owing to the nature of the die, or the number of the marks upon it, being arbitrarily interfered with, of course no kind of science would attempt to take any account of it.
There’s definitely some truth to the reasoning behind the examples mentioned above, but I think their authors greatly overestimate the opposition between the theory of Chance and the concept of Causation. First, regarding the broader idea of order or regularity we refer to as uniformity, anything that could be considered objective chance would clash with this in one way. In Probability, ultimate regularity is always assumed; for instance, when tossing a die, if not only the individual throws were unpredictable in their outcomes but also the average results—because of the die's nature or the number of markings on it being arbitrarily altered—no scientific approach would bother to account for it.
§ 3. So much must undoubtedly be granted; but must the same admission be made as regards the succession of the individual events? Can causation, in the sense of invariable succession (for we are here shifting on to this narrower ground), be denied, not indeed without suspicion of scientific heterodoxy, but at any rate without throwing uncertainty upon the foundations of Probability? De Morgan, as we have seen, strongly maintains that this cannot be so. I find myself unable to agree with him here, but this disagreement springs not so much from differences of detail, as from those of the point of view in which we regard the science. He always appears to incline to the opinion that the individual judgment in probability is to admit of justification; that when we say, for instance, that the odds in favour of some event are three to two, that we can explain and justify our statement without any necessary reference to a series or class of such events. It is not easy to see how this can be 239 done in any case, but the obstacles would doubtless be greater even than they are, if knowledge of the individual event were not merely unattained, but, owing to the absence of any causal connection, essentially unattainable. On the theory adopted in this work we simply postulate ignorance of the details, but it is not regarded as of any importance on what sort of grounds this ignorance is based. It may be that knowledge is out of the question from the nature of the case, the causative link, so to say, being missing. It may be that such links are known to exist, but that either we cannot ascertain them, or should find it troublesome to do so. It is the fact of this ignorance that makes us appeal to the theory of Probability, the grounds of it are of no importance.
§ 3. It must certainly be acknowledged; but does the same need to be accepted regarding the sequence of individual events? Can we deny causation, in the sense of a consistent sequence (since we are narrowing our focus here), without raising doubts about scientific integrity, or at least without creating uncertainty about the basics of Probability? De Morgan, as we've noted, strongly argues that this isn't the case. I can't fully agree with him here, but my disagreement doesn't stem from specific differences; rather, it arises from differing perspectives on the science. He seems to lean toward the view that individual judgments in probability can be justified; that when we state, for example, that the odds of a certain event are three to two, we can explain and justify this assertion without needing to reference a series or category of such events. It's not easy to see how this can be done in any situation, but the challenges would likely be even greater if knowledge of the individual event weren't just unattained but essentially unreachable due to the lack of any causal link. In the theory presented in this work, we simply assume ignorance of the specifics, but we don't consider it significant what type of basis this ignorance rests on. It may be that knowledge is impossible due to the nature of the situation, as the causal link is absent. It could also be the case that such links are acknowledged to exist, but we either can't identify them or would find it inconvenient to do so. It is this ignorance that leads us to rely on the theory of Probability; the reasons for it are insignificant.
§ 4. On the view here adopted we are concerned only with averages, or with the single event as deduced from an average and conceived to form one of a series. We start with the assumption, grounded on experience, that there is uniformity in this average, and, so long as this is secured to us, we can afford to be perfectly indifferent to the fate, as regards causation, of the individuals which compose the average. The question then assumes the following form:—Is this assumption, of average regularity in the aggregate, inconsistent with the admission of what may be termed causeless irregularity in the details? It does not seem to me that it would be at all easy to prove that this is so. As a matter of fact the two beliefs have constantly co-existed in the same minds. This may not count for much, but it suggests that if there be a contradiction between them it is by no means palpable and obvious. Millions, for instance, have believed in the general uniformity of the seasons taken one with another, who certainly did not believe in, and would very likely have been ready distinctly to deny, the existence 240 of necessary sequences in the various phenomena which compose what we call a season. So with cards and dice; almost every gambler must have recognized that judgment and foresight are of use in the long run, but writers on chance seem to think that gamblers need a good deal of reasoning to convince them that each separate throw is in its nature essentially predictable.
§ 4. In the view we've taken, we're only focused on averages, or on a single event that comes from an average and is seen as part of a series. We start with the assumption, based on experience, that there's consistency in this average, and as long as we have that, we can be completely indifferent to the individual cases that make up the average in terms of causation. The question then becomes: Is this assumption of average consistency in the overall data incompatible with what might be called random irregularity in the specifics? It doesn't seem easy to prove that it is. In fact, these two beliefs have often coexisted in the same minds. While this alone might not mean much, it suggests that if there's a contradiction, it's not at all clear or obvious. For example, millions of people have believed in the general consistency of the seasons overall, even if they didn't believe in or would likely have outright denied the existence of necessary connections in the various phenomena that make up what we call a season. The same goes for cards and dice; almost every gambler must recognize that judgment and foresight help in the long run, but writers on chance seem to think gamblers need a lot of convincing to understand that each individual throw is, by its nature, essentially predictable.
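The coexistence of the two beliefs described here can be illustrated with a very small simulation; this is a sketch added for the modern reader, not anything drawn from the original text. No rule in the program predicts any single toss, yet the long-run proportion of heads settles down all the same.

```python
# Individual tosses are left entirely to the random generator, yet the
# aggregate exhibits the regularity that Probability postulates.
import random

random.seed(2)

tosses = [random.choice("HT") for _ in range(100_000)]

for n in (10, 100, 1_000, 10_000, 100_000):
    proportion_heads = tosses[:n].count("H") / n
    print(f"first {n:>6} tosses: proportion of heads = {proportion_heads:.3f}")
```

The printed proportions drift steadily towards one half, even though nothing whatever can be said beforehand about any particular toss.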
§ 5. In its application to moral and social subjects, what gives this controversy its main interest is its real or supposed bearing upon the vexed question of the freedom of the will; for in this region Causation, and Fatalism or Necessitarianism, are regarded as one and the same thing.
§ 5. In its discussion of moral and social issues, what makes this debate particularly engaging is its real or perceived connection to the ongoing debate about free will; in this area, causation and fatalism or necessitarianism are viewed as essentially the same thing.
Here, as in the last case, that wide and somewhat vague kind of regularity that we have called Uniformity, must be admitted as a notorious fact. Statistics have put it out of the power of any reasonably informed person to feel any hesitation upon this point. Some idea has already been gained, in the earlier chapters, of the nature and amount of the evidence which might be furnished of this fact, and any quantity more might be supplied from the works of professed writers upon the subject. If, therefore, Free-will be so interpreted as to imply such essential irregularity as defies prediction both in the average, and also in the single case, then the negation of free-will follows, not as a remote logical consequence, but as an obvious inference from indisputable facts of experience.
Here, just like in the previous case, that broad and somewhat unclear type of regularity that we've referred to as Uniformity must be recognized as a clear fact. Statistics have made it impossible for any reasonably informed person to doubt this. Some understanding has already been gained in the earlier chapters about the nature and extent of the evidence that can support this fact, and much more could be provided from the works of experts on the topic. If, therefore, Free will is interpreted in such a way that it implies a fundamental irregularity that cannot be predicted, both on average and in individual cases, then the rejection of free will follows not as a distant logical conclusion, but as an obvious result of undeniable facts from experience.
Few persons, however, would go so far as to interpret it in this sense. All that troubles them is the fear that somehow this general regularity may be found to carry with it causation, certainly in the sense of regular invariable sequence, and probably also with the further association of compulsion. Rejecting the latter association as utterly unphilosophical, I cannot even see that the former consequence 241 can be admitted as really proved, though it doubtless gains some confirmation from this source.
Few people, however, would go as far as to interpret it this way. What troubles them is the fear that this general regularity might imply causation, definitely in the sense of a consistent sequence, and probably also with the added idea of compulsion. While I reject the latter idea as completely unphilosophical, I don't believe the former outcome can be considered truly proven, although it certainly gets some support from this source.
§ 6. The nature of the argument against free-will, drawn from statistics, at least in the form in which it is very commonly expressed, seems to me exceedingly defective. The antecedents and consequents, in the case of our volitions, must clearly be supposed to be very nearly immediately in succession, if anything approaching to causation is to be established: whereas in statistical enquiries the data are often widely separate, if indeed they do not apply merely to single groups of actions or results. For instance, in the case of the misdirected letters, what it is attempted to prove is that each writer was so much the ‘victim of circumstances’ (to use a common but misleading expression) that he could not have done otherwise than he did under his circumstances. But really no accumulation of figures to prove that the number of such letters remains the same year by year, can have much bearing upon this doctrine, even though they were accompanied by corresponding figures which should connect the forgetfulness thus indicated with some other characteristics in the writers. So with the number of suicides. If 250 people do, or lately did, annually put an end to themselves in London, the fact, as it thus stands by itself, may be one of importance to the philanthropist and statesman, but it needs bringing into much closer relation with psychological elements if it is to convince us that the actions of men are always instances of inflexible order. In fact, instead of having secured our A and B here in closest intimacy of succession to one another,—to employ the symbolic notation commonly used in works on Inductive Logic to illustrate the causal connection,—we find them separated by a considerable interval; often indeed we merely have an A or a B by itself.
§ 6. The argument against free will based on statistics, at least in the way it's usually put forward, seems really flawed to me. The causes and effects of our decisions must be assumed to follow each other very closely if we want to show any kind of causation. However, in statistical studies, the data are often widely spaced out and may only relate to specific groups of actions or outcomes. For example, in the case of the misdirected letters, what is being argued is that each writer was so much a ‘victim of circumstances’ (using a common but misleading phrase) that they couldn't have acted differently given their situation. But honestly, no amount of data showing that the number of these letters stays the same year after year can seriously support this idea, even if they were paired with figures that linked this forgetfulness to other traits in the writers. The same goes for suicides. If 250 people do, or recently did, take their own lives each year in London, that fact alone may be significant for philanthropists and politicians, but it needs to be connected much more closely with psychological factors to convince us that people's actions are always examples of rigid order. In reality, instead of having our A and B here in the closest succession to one another (to borrow the symbolic notation commonly used in works on Inductive Logic to illustrate causal connection), we often find them separated by a significant gap; sometimes, we only have an A or a B on its own.
§ 7. Again, another deficiency in such reasoning seems to be the laying undue weight upon the mere regularity or persistency of the statistics. These may lead to very important results, but they are not exactly what is wanted for the purpose of proving anything against the freedom of the will; it is not indeed easy to see what connection this has with such facts as that the annual number of thefts or of suicides remains at pretty nearly the same figure. Statistical uniformity seems to me to establish nothing else, at least directly, in the case of human actions, than it does in that of physical characteristics. Take but one instance, that of the misdirected letters. We were already aware that the height, weight, chest measurement, and so on, of a large number of persons preserved a tolerably regular average amidst innumerable deflections, and we were prepared by analogy to anticipate the same regularity in their mental characteristics. All that we gain, by counting the numbers of letters which are posted without addresses, is a certain amount of direct evidence that this is the case. Just as observations of the former kind had already shown that statistics of the strength and stature of the human body grouped themselves about a mean, so do those of the latter that a similar state of things prevails in respect of the readiness and general trustworthiness of the memory. The evidence is not so direct and conclusive in the latter case, for the memory is not singled out and subjected to measurement by itself, but is taken in combination with innumerable other influencing circumstances. Still there can be little doubt that the statistics tell on the whole in this direction, and that by duly varying and extending them they may obtain considerable probative force.
§ 7. Once again, another flaw in this reasoning seems to be putting too much emphasis on the regularity or consistency of the statistics. While these can lead to very important conclusions, they aren’t necessarily what’s needed to prove anything against free will; it’s hard to see how the fact that the annual number of thefts or suicides stays around the same number is related. Statistical consistency seems to show nothing more, at least directly, in the case of human actions than it does in physical traits. Take, for example, the issue of misdirected letters. We already knew that the height, weight, chest measurements, and so on, of many individuals maintained a fairly regular average despite countless variations, and we expected the same consistency in their mental traits. All we gain from counting the number of letters posted without addresses is some direct evidence supporting this idea. Just as previous observations showed that statistics of the strength and height of the human body cluster around an average, so do those suggesting a similar pattern exists regarding the reliability and overall trustworthiness of memory. The evidence in the latter case isn’t as direct and conclusive because memory isn't measured in isolation; it’s considered along with countless other influencing factors. Still, there’s little doubt that the statistics generally suggest this direction, and with careful variation and expansion, they can achieve significant evidential power.
The fact is that Probability has nothing more to do with Natural Theology, either in its favour or against it, than the 243 general principles of Logic or Induction have. It is simply a body of rules for drawing inferences about classes of events which are distinguished by a certain quality. The believer in a Deity will, by the study of nature, be led to form an opinion about His works, and so to a certain extent about His attributes. But it is surely unreasonable to propose that he should abandon his belief because the sequence of events,—not, observe, their general tendency towards happiness or misery, good or evil,—is brought about in a way different from what he had expected; whether it be by displaying order where he had expected irregularity, or by involving the machinery of secondary causes where he had expected immediate agency.
The truth is that Probability is no more related to Natural Theology, either supporting or opposing it, than the general principles of Logic or Induction are. It’s just a set of rules for making inferences about groups of events that have a specific characteristic. A believer in a Deity will, through studying nature, develop an opinion about His creations and, to some degree, about His qualities. But it’s unreasonable to suggest that he should give up his belief just because the order of events—note that this isn’t about their overall trend toward happiness or suffering, good or bad—occurs differently than he expected; whether that’s revealing order when he anticipated chaos or involving the workings of secondary causes when he expected direct intervention.
§ 8. It is both amusing and instructive to consider what very different feelings might have been excited in our minds by this co-existence of, what may be called, ignorance of individuals and knowledge of aggregates, if they had presented themselves to our observation in a reverse order. Being utterly unable to make assured predictions about a single life, or the conduct of individuals, people are sometimes startled, and occasionally even dismayed, at the unexpected discovery that such predictions can be confidently made when we are speaking of large numbers. And so some are prompted to exclaim, This is denying Providence! it is utter Fatalism! But let us assume, for a moment, that our familiarity with the subject had been experienced, in the first instance, in reference to the aggregates instead of the individual lives. It is difficult, perhaps, to carry out such a supposition completely; though we may readily conceive something approaching to it in the case of an ignorant clerk in a Life Assurance Office, who had never thought of life, except as having such a ‘value’ at such an age, and who had hardly estimated it except in the form of averages. Might 244 we not suppose him, in some moment of reflectiveness, being astonished and dismayed at the sudden realization of the utter uncertainty in which the single life is involved? And might not his exclamation in turn be, Why this is denying Providence! It is utter chaos and chance! A belief in a Creator and Administrator of the world is not confined to any particular assumption about the nature of the immediate sequence of events, but those who have been accustomed hitherto to regard the events under one of the aspects above referred to, will often for a time feel at a loss how to connect them with the other.
§ 8. It’s both amusing and enlightening to think about how different our feelings might be about the coexistence of what we might call ignorance of individuals and knowledge of groups if they had appeared to us in the opposite order. When people can't reliably predict the outcomes of a single life or the behavior of individuals, they are sometimes shocked and even troubled when they discover that such predictions can be confidently made regarding large groups. As a result, some people might exclaim, "This denies Providence! It’s pure Fatalism!" But let’s imagine, for a moment, that our understanding of the subject started with groups rather than individual lives. It might be hard to fully grasp this idea, but we can picture something similar in the case of an uninformed clerk at a Life Assurance Office, who thinks solely about life in terms of a 'value' at a certain age, hardly considering it any way other than through averages. Could we not imagine him, in a moment of reflection, feeling shocked and troubled by the sudden realization of the complete uncertainty surrounding individual lives? And might his response then be, "This denies Providence! It’s total chaos and chance!" Belief in a Creator and Ruler of the world isn’t limited to any specific viewpoint about how events unfold. However, those who have previously viewed events through one of the lenses mentioned above may often feel confused about how to link them with the other perspective for a time.
§ 9. So far we have been touching on a very general question; viz. the relation of the fundamental postulates of Probability to the conception of Order or Uniformity in the world, physical or moral. The difficulties which thence arise are mainly theological, metaphysical or psychological. What we must now consider are problems of a more detailed or logical character. They are prominently these two; (1) the distinction between chance arrangement and causal arrangement in physical phenomena; and (2) the distinction between chance arrangement and designed arrangement where we are supposed to be contemplating rational agency as acting on one side at least.
§ 9. So far, we've been discussing a very broad question: the relationship between the fundamental principles of Probability and the idea of Order or Uniformity in the world, whether physical or moral. The challenges that come from this are mainly theological, metaphysical, or psychological. Now, we need to look at problems of a more specific or logical nature. These are primarily two issues: (1) the difference between random arrangement and causal arrangement in physical events; and (2) the difference between random arrangement and designed arrangement when we consider rational agency as influencing at least one side.
II. The first of these questions raises the antithesis between chance and causation, not as a general characteristic pervading all phenomena, but in reference to some specified occurrence:—Is this a case of chance or not? The most strenuous supporters of the universal prevalence of causation and order admit that the question is a relevant one, and they must therefore be supposed to have some rule for testing the answers to it.
II. The first of these questions highlights the contrast between chance and causation, not as a broad characteristic that applies to all phenomena, but in relation to a specific event:—Is this a case of chance or not? The strongest advocates for the universal existence of causation and order acknowledge that this question is a valid one, and they must therefore have some method for assessing the answers to it.
Suppose, for instance, a man is seized with a fit in a house where he has gone to dine, and dies there; and some 245 one remarks that that was the very house in which he was born. We begin to wonder if this was an odd coincidence and nothing more. But if our informant goes on to tell us that the house was an old family one, and was occupied by the brother of the deceased, we should feel at once that these facts put the matter in a rather different light. Or again, as Cournot suggests, if we hear that two brothers have been killed in battle on the same day, it makes a great difference in our estimation of the case whether they were killed fighting in the same engagement or whether one fell in the north of France and the other in the south. The latter we should at once class with mere coincidences, whereas the former might admit of explanation.
Suppose a man has a seizure while dining at a house and ends up dying there; someone then mentions that it was the same house where he was born. We might think it’s just a strange coincidence. But if the person adds that it was an old family home and was occupied by the deceased's brother, we would immediately see the situation differently. Or again, as Cournot points out, if we learn that two brothers were killed in battle on the same day, it matters greatly whether they died in the same fight or if one was in northern France and the other in the south. The latter would likely be seen as just a coincidence, while the former could have a deeper explanation.
§ 10. The problem, as thus conceived, seems to be one rather of Inductive Logic than of Probability, because there is not the slightest attempt to calculate chances. But it deserves some notice here. Of course no accurate thinker who was under the sway of modern physical notions would for a moment doubt that each of the two elements in question had its own ‘cause’ behind it, from which (assuming perfect knowledge) it might have been confidently inferred. No more would he doubt, I apprehend, that if we could take a sufficiently minute and comprehensive view, and penetrate sufficiently far back into the past, we should reach a stage at which (again assuming perfect knowledge) the co-existence of the two events could equally have been foreseen. The employment of the word casual therefore does not imply any rejection of a cause; but it does nevertheless correspond to a distinction of some practical importance. We call a coincidence casual, I apprehend, when we mean to imply that no knowledge of one of the two elements, which we can suppose to be practically attainable, would enable us to expect the other. We know of no generalization which covers them 246 both, except of course such as are taken for granted to be inoperative. In such an application it seems that the word ‘casual’ is not used in antithesis to ‘causal’ or to ‘designed’, but rather to that broader conception of order or regularity to which I should apply the term Uniformity. The casual coincidence is one which we cannot bring under any special generalization; certain, probable, or even plausible.
§ 10. The issue, as presented, appears to be more about Inductive Logic than Probability, since there's no attempt to calculate odds at all. Still, it's worth mentioning. Clearly, no serious thinker influenced by modern scientific ideas would doubt for a second that each of the two factors in question has its own ‘cause’ behind it, from which (if we had perfect knowledge) it could have been confidently inferred. Likewise, I believe he wouldn't question that if we could take a sufficiently detailed and broad perspective, looking far back into the past, we would arrive at a point where (again, assuming perfect knowledge) the occurrence of both events could have been predicted. Therefore, the use of the word casual doesn't imply a denial of a cause; however, it does reflect a distinction that holds practical significance. We refer to a coincidence as casual, I think, when we intend to suggest that no amount of knowledge about one of the two factors, which we consider practically attainable, would allow us to anticipate the other. We aren't aware of any general rule that encompasses both, except those that are assumed to be ineffective. In this context, it seems that the term ‘casual’ isn't used in opposition to ‘causal’ or ‘intentional,’ but instead to a broader concept of order or consistency, which I would term Uniformity. A casual coincidence is one that we can't categorize under any specific generalization, whether certain, probable, or even plausible.
A slightly different way of expressing this distinction is to regard these ‘mere coincidences’ as being simply cases in point of independent events, in the sense in which independence was described in a former chapter. We saw that any two events, A and B, were so described when each happens with precisely the same relative statistical frequency whether the other happens or not. This state of things seems to hold good of the successions of heads and tails in tossing coins, as in that of male and female births in a town, or that of the digits in many mathematical tables. Thus we suppose that when men are picked up in the street and taken into a house to die, there will not be in the long run any preferential selection for or against the house in which they were born. And all that we necessarily mean to claim when we deny of such an occurrence, in any particular case, that it is a mere coincidence, is that that particular case must be taken out of the common list and transferred to one in which there is some such preferential selection.
A slightly different way to express this distinction is to see these ‘mere coincidences’ as examples of independent events, in the sense defined in a previous chapter. We observed that any two events, A and B, are described as independent when each occurs with exactly the same relative statistical frequency, regardless of whether the other occurs or not. This situation seems to apply to the outcomes of heads and tails when tossing coins, as well as to the ratio of male and female births in a town, or to the digits in many mathematical tables. Therefore, we assume that when people are picked up in the street and taken into a house to die, there will not be, in the long run, any preference for or against the house they were born in. What we mean when we specifically deny that such an occurrence is merely a coincidence is that this particular case should be removed from the general list and placed in one where there is some kind of preferential selection involved.
§ 11. III. The next problem is a somewhat more intricate one, and will therefore require rather careful subdivision. It involves the antithesis between Chance and Design. That is, we are not now (as in the preceding case) considering objects in their physical aspect alone, and taking account only of the relative frequency of their co-existence or sequence; but we are considering the agency by which they are produced, and we are enquiring whether that agency 247 trusted to what we call chance, or whether it employed what we call design.
§ 11. III. The next issue is a bit more complex, so we'll need to break it down carefully. It involves the contrast between Chance and Design. Here, we aren't just looking at objects in their physical form as before and only focusing on how often they occur together or in sequence; instead, we are examining the process by which they are created and questioning whether that process relies on what we refer to as chance, or if it uses what we call design.
The reader must clearly understand that we are not now discussing the mere question of fact whether a certain assigned arrangement is what we call a chance one. This, as was fully pointed out in the fourth chapter, can be settled by mere inspection, provided the materials are extensive enough. What we are now proposing to do is to carry on the enquiry from the point at which we then had to leave it off, by solving the question, Given a certain arrangement, is it more likely that this was produced by design, or by some of the methods commonly called chance methods? The distinction will be obvious if we revert to the succession of figures which constitute the ratio π. As I have said, this arrangement, regarded as a mere succession of digits, appears to fulfil perfectly the characteristics of a chance arrangement. If we were to omit the first four or five digits, which are familiar to most of us, we might safely defy any one to whom it was shown to say that it was not got at by simply drawing figures from a bag. He might look at it for his whole life without detecting that it was anything but the result of such a chance selection. And rightly so, because regarded as a mere arrangement it is a chance one: it fulfils all the requirements of such an arrangement.[2] The question 248 we are now proceeding to discuss is this: Given any such arrangement how are we to determine the process by which it was arrived at?
The reader needs to understand that we’re not just talking about whether a specific arrangement is what we call random. As explained in the fourth chapter, this can be figured out just by looking at it, as long as the materials are extensive enough. What we’re going to do now is continue the investigation from where we left off, by addressing the question: Given a specific arrangement, is it more likely that this was created intentionally, or by what we typically refer to as random methods? The difference will be clear if we look back at the sequence of numbers that make up the ratio π. As I mentioned, this arrangement, when seen as just a series of digits, seems to perfectly fit the characteristics of a random arrangement. If we were to leave out the first four or five digits, which most of us know, we could confidently challenge anyone shown this arrangement to say it wasn’t just made by pulling numbers out of a bag. They could stare at it their entire life without realizing it was anything other than a random selection. And they would be right, because seen as just an arrangement, it is random: it meets all the criteria for such an arrangement.[2] The question we are now discussing is this: Given any such arrangement, how do we determine the process by which it was created?
We are supposed to have some event before us which might have been produced in either of two alternative ways, i.e. by chance or by some kind of deliberate design; and we are asked to determine the odds in favour of one or other of these alternatives. It is therefore a problem in Inverse Probability and is liable to all the difficulties to which problems of this class are apt to be exposed.
We are supposed to have before us some event that could have come about in one of two ways: either by chance or through some intentional design; and we need to figure out the odds in favour of one or the other. This makes it a problem in Inverse Probability and comes with all the challenges that such problems often face.
§ 12. For the theoretic solution of such a question we require the two following data:—
§ 12. For the theoretical solution of this question, we need the following two pieces of information:—
(1) The relative frequency of the two classes of agencies, viz. that which is to act in a chance way and that which is to act designedly.
(1) The relative frequency of the two types of agencies, namely that which acts randomly and that which acts intentionally.
(2) The probability that each of these agencies, if it were the really operative one, would produce the event in question.
(2) The likelihood that each of these agencies, if it were the one actually in charge, would cause the event in question.
The latter of these data can generally be secured without any difficulty. The determination of the various contingencies on the chance hypothesis ought not, if the example were a suitable one, to offer any other than arithmetical difficulties. And as regards the design alternative, it is generally taken for granted that if this had been operative it would certainly have produced the result aimed at. For instance, if ten pence are found on a table, all with head uppermost, and it be asked whether chance or design had been at work here; we feel no difficulty up to a certain point. Had the pence been tossed we should have got ten heads only once in 1024 throws; but had they been placed designedly the result would have been achieved with certainty.
The second of these two pieces of information is usually easy to obtain. Determining the various contingencies based on the chance hypothesis should, if the example is appropriate, only present arithmetical challenges. As for the design alternative, it’s generally assumed that if it were at play, it would definitely have produced the desired result. For example, if ten pence are found on a table, all heads up, and someone asks whether chance or design was involved, we encounter no difficulty up to a certain point. If the coins were tossed, we would only expect to get ten heads once in 1,024 throws; however, if they were intentionally placed, the result would be guaranteed.
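To make this arithmetic explicit, here is a minimal Python sketch (an editorial addition, not part of the original essay); the only inputs are the figures just quoted, and a fair coin is assumed for the chance hypothesis.

# The two likelihoods in the ten-pence example.
p_chance = (1 / 2) ** 10      # probability of ten heads if the pence were tossed
p_design = 1.0                # probability of ten heads if they were placed on purpose
print(p_chance)               # 1/1024, roughly 0.00098
print(p_design / p_chance)    # likelihood ratio in favour of design: 1024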
But the other postulate, viz. that of the relative prevalence of these two classes of agencies, opens up a far more serious class of difficulties. Cases can be found no doubt, though they are not very frequent, in which this question can be answered approximately, and then there is no further trouble. For instance, if in a school class-list I were to see the four names Brown, Jones, Robinson, Smith, standing in this order, it might occur to me to enquire whether this arrangement were alphabetical or one of merit. In our enlarged sense of the terms this is equivalent to chance and design as the alternatives; for, since the initial letter of 250 a boy's name has no known connection with his attainments, the successive arrangement of these letters on any other than the alphabetical plan will display the random features, just as we found to be the case with the digits of an incommensurable magnitude. The odds are 23 to 1 against 4 names coming undesignedly in alphabetical order; they are equivalent to certainty in favour of their doing so if this order had been designed. As regards the relative frequency of the two kinds of orders in school examinations I do not know that statistics are at hand, though they could easily be procured if necessary, but it is pretty certain that the majority adopt the order of merit. Put for hypothesis the proportion as high as 9 to 1, and it would still be found more likely than not that in the case in question the order was really an alphabetical one.
But the other piece of information, namely the relative prevalence of these two types of agencies, brings up a much more serious set of issues. There are certainly some cases, although they aren't very common, where this question can be answered approximately, and then there's no further issue. For example, if I were to look at a school class list and saw the four names Brown, Jones, Robinson, Smith in that order, I might wonder whether this arrangement was alphabetical or based on merit. In our broader understanding of the terms, this is equivalent to chance and design as the two options; since the first letter of a boy's name has no known connection to his accomplishments, the order of these names, unless arranged alphabetically, would show random characteristics, just like we found with the digits of an irrational number. The odds are 23 to 1 against four names appearing in alphabetical order by accident; whereas if that order had been designed, it would have been produced with certainty. As for the relative frequency of these two types of orders in school exams, I don't know if there are any available statistics, but they could be easily gathered if needed. However, it's pretty clear that most lists use the order of merit. Even if we assume the proportion is as high as 9 to 1 in favour of merit, it would still be more likely than not that, in this situation, the order was actually alphabetical.
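The claim in the last sentence can be checked with a short inverse-probability calculation. The following Python sketch is an editorial illustration, not Venn's own working; the 9 to 1 prior is the assumption named in the text.

# Posterior odds that the list was alphabetical, given that it appears in alphabetical order.
from fractions import Fraction

prior_merit_to_alpha = Fraction(9, 1)    # assumed: merit lists nine times as common
p_order_given_merit = Fraction(1, 24)    # 4 names fall alphabetically by accident 1 time in 24
p_order_given_alpha = Fraction(1, 1)     # an alphabetical plan yields this order with certainty

posterior_merit_to_alpha = prior_merit_to_alpha * p_order_given_merit / p_order_given_alpha
print(posterior_merit_to_alpha)                    # 3/8, i.e. odds of 8 to 3 in favour of alphabetical
print(float(1 / (1 + posterior_merit_to_alpha)))   # about 0.73, the probability the order was alphabetical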
§ 13. But in the vast majority of cases we have no such statistics at hand, and then we find ourselves exposed to very serious ambiguities. These may be divided into two distinct classes, the nature of which will best be seen by the discussion of examples.
§ 13. But in most cases, we don’t have those statistics available, which leaves us facing some significant uncertainties. These can be categorized into two different types, which will be clearer through the discussion of examples.
In the first place we are especially liable to the drawback already described in a former chapter as rendering mere statistics so untrustworthy, which consists in the fact that the proportions are so apt to be disturbed almost from moment to moment by the possession of fresh hints or information. We saw for instance why it was that statistics of mortality were so very unserviceable in the midst of a disease or in the crisis of a battle. Suppose now that on coming into a room I see on the table ten coins lying face uppermost, and am asked what was the likelihood that the arrangement was brought about by design. Everything turns upon special knowledge of the circumstances of the case. Who had been in the room? Were they children, 251 or coin-collectors, or persons who might have been supposed to have indulged in tossing for sport or for gambling purposes? Were the coins new or old ones? a distinction of this kind would be very pertinent when we were considering the existence of any motive for arranging them the same way uppermost. And so on; we feel that our statistics are at the mercy of any momentary fragment of information.
First of all, we are particularly vulnerable to the issue mentioned in a previous chapter that makes plain statistics very unreliable. This issue arises from the fact that the proportions can easily change from moment to moment as new hints or information come to light. For example, we saw why mortality statistics are so unhelpful during a disease outbreak or in the heat of a battle. Let’s say that when I enter a room, I see ten coins lying heads up on the table and I'm asked what the chances are that this arrangement was intentional. Everything hinges on specific knowledge about the situation. Who had been in the room? Were they children, coin collectors, or perhaps people who might have been tossing them for fun or for gambling? Were the coins new or old? This kind of distinction would be very relevant when considering any motive for arranging them in that way. And so on; we realize that our statistics are vulnerable to any fleeting piece of information.
§ 14. But there is another consideration besides this. Not only should we be thus influenced by what may be called external circumstances of a general kind, such as the character and position of the agents, we should also be influenced by what we supposed to be the conventional[3] estimate with which this or that particular chance arrangement was then regarded. Thus from time to time as new games of cards become popular new combinations acquire significance; and therefore when the question of design takes the form of possible cheating a knowledge of the current estimate of such combinations becomes exceedingly important.
§ 14. But there’s another factor to consider. We shouldn’t just be influenced by general external circumstances, like the character and position of the agents, but also by what we believe to be the conventional estimate of the specific chance arrangements at that time. As new card games gain popularity, new combinations take on significance; therefore, when the issue of design involves the possibility of cheating, understanding the current perception of these combinations becomes extremely important.
§ 15. The full significance of these difficulties will best be apprehended by the discussion of a case which is not fictitious or invented for the purpose, but which has actually given rise to serious dispute. Some years ago Prof. Piazzi Smyth published a work[4] upon the great pyramid of Ghizeh, the general object of which was to show that that building contained, in its magnitude, proportions and contents, a number of almost imperishable natural standards of length, volume, &c. Amongst other things it was determined that 252 the value of π was accurately (the degree of accuracy is not, I think, assigned) indicated by the ratio of the sides to the height. The contention was that this result could not be accidental but must have been designed.
§ 15. The full importance of these challenges will be best understood through a real case that has led to serious debate. A few years ago, Prof. Piazzi Smyth published a work[4] about the Great Pyramid of Giza, aiming to demonstrate that this structure holds a number of almost permanent natural standards of length, volume, etc., through its size, proportions, and contents. Among other findings, it was concluded that the value of π was accurately (the specific degree of accuracy isn’t mentioned, I believe) represented by the ratio of the sides to the height. The argument was made that this outcome couldn't be a coincidence but must have been intentional.
As regards the estimation of the value of the chance hypothesis the calculation is not quite so clear as in the case of dice or cards. We cannot indeed suppose that, for a given length of base, any height can be equally possible. We must limit ourselves to a certain range here; for if too high the building would be insecure, and if too low it would be ridiculous. Again, we must decide to how close an approximation the measurements are made. If they are guaranteed to the hundredth of an inch the coincidence would be of a quite different order from one where the guarantee extended only to an inch. Suppose that this has been decided, and that we have ascertained that out of 10,000 possible heights for a pyramid of given base just that one has been selected which would most nearly yield the ratio of the radius to the circumference of a circle.
When it comes to estimating the value of the chance hypothesis, the calculations aren't as straightforward as they are with dice or cards. We can't really assume that, for any given base size, any height is equally possible. We need to limit ourselves to a certain range; if the height is too tall, the building would be unstable, and if it's too short, it would look odd. Additionally, we have to decide how precise our measurements are. If they're accurate down to the hundredth of an inch, the outcome would be very different compared to a scenario where the accuracy is only to the nearest inch. Let's say we've made those decisions and figured out that out of 10,000 potential heights for a pyramid with a specific base, we've chosen the one that most closely matches the ratio of the radius to the circumference of a circle.
The remaining consideration would be the relative frequency of the ‘design’ alternative,—what is called its à priori probability,—that is, the relative frequency with which such builders can be supposed to have aimed at that ratio; with the obvious implied assumption that if they did aim at it they would certainly secure it. Considering our extreme ignorance of the attainments of the builders it is obvious that no attempt at numerical appreciation is here possible. If indeed the ‘design’ was interpreted to mean conscious resolve to produce that ratio, instead of mere resolve to employ some method which happened to produce it, few persons would feel much hesitation. Not only do we feel tolerably certain that the builders did not know the value of π, except in the rude way in which all artificers 253 must know it; but we can see no rational motive, if they did know it, which should induce them to perpetuate it in their building. If, however, to adopt an ingenious suggestion,[5] we suppose that the builder may have proceeded in the following fashion, the matter assumes a different aspect. Suppose that having decided on the height of his pyramid he drew a circle with that as radius: that, laying down a cord along the line of this circle, he drew this cord out into a square, which square marked the base of the building. Hardly any simpler means could be devised in a comparatively rude age; and it is obvious that the circumference of the base, being equal to the length of the cord, would bear exactly the admitted ratio to the height. In other words, the exact attainment of a geometric value does not imply a knowledge of that ratio, but merely of some method which involves and displays it. A teredo can bore, as well as any of us, a hole which displays the geometric properties of a circle, but we do not credit it with corresponding knowledge.
The main thing to consider now is the frequency of the ‘design’ alternative—known as its à priori probability—which refers to how often builders likely aimed for that specific ratio. It's assumed that if they aimed for it, they would definitely achieve it. Given our lack of understanding about the builders' skills, it's clear that we can't put any numerical value on this. If 'design' means a conscious decision to create that ratio rather than just using a method that happened to produce it, most people wouldn't hesitate much. We are fairly confident that the builders didn't actually know the value of π, except in a rough way like all craftsmen do. Furthermore, there seems to be no rational reason for them to incorporate it into their construction if they had known it. However, if we adopt an ingenious suggestion[5] and suppose that the builder proceeded as follows, the matter takes on a different look: after deciding on the height of his pyramid, he drew a circle with that height as the radius. Then, by laying a cord along the edge of this circle, he stretched it out into a square, marking the base of the building. This is one of the simplest methods that could have been used in a relatively primitive time; and it's clear that the circumference of the base, which equals the length of the cord, would bear exactly the accepted ratio to the height. In other words, achieving a precise geometric value doesn't mean having knowledge of that ratio but rather understanding some method that results in and reveals it. A teredo can bore a hole that demonstrates the geometric properties of a circle, but we wouldn't attribute that knowledge to it.
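The cord construction can be put into a few lines of Python as an editorial sketch (the height used below is arbitrary; any value gives the same ratios). It shows that the "pyramid ratio" falls out of the method without any knowledge of π.

# Take the intended height as the radius of a circle, then open the cord that
# measures its circumference out into a square base.
import math

height = 100.0                       # illustrative only; the result is independent of this choice
circumference = 2 * math.pi * height
side = circumference / 4             # the cord, re-laid as a square, gives the base side
print(side / height)                  # pi/2, about 1.5708
print(2 * side / height)              # pi: twice the base side to the height, the ratio claimed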
As before said, all numerical appreciation of the likelihood of the design alternative is out of the question. But, if the precision is equal to what Mr Smyth claimed, I suppose that most persons (with the above suggestion before them) will think it somewhat more likely that the coincidence was not a chance one.
As mentioned earlier, it's impossible to assess the probability of the design alternative numerically. But, if the precision is what Mr. Smyth claimed, I think most people (with the above suggestion in mind) will find it somewhat more likely that the coincidence wasn't a chance one.
§ 16. There still remains a serious, and highly interesting speculative consideration. In the above argument we took it for granted, in calculating the chance alternative, that only one of the 10,000 possible values was favourable; that is, we took it for granted that the ratio π was the only one whose claims, so to say, were before the court. But it is clear that if we had obtained just double this ratio the result would have been of similar significance, for it would have been simply the ratio of the circumference to the diameter. In fact, Mr Smyth's selected ratio,—the height to twice the breadth of the base as compared with the diameter to the circumference,—is obviously only one of a plurality of ratios. Again; if the measured results had shown that the ratio of the height to one side of the base was 1 : √2 (i.e. that of a side to a diagonal of a square) or 1 : √3 (i.e. that of a side to a diagonal of a cube) would not such results equally show evidence of design? Proceeding in this way, we might suggest one known mathematical ratio after another until most of the 10,000 supposed possible values had been taken into account. We might then argue thus: since almost every possible height of the pyramid would correspond to some mathematical ratio, a builder, ignorant of them all alike, would be not at all unlikely to stumble upon one or other of them: why then attribute design to him in one case rather than another?
§ 16. There remains a serious and fascinating speculative point to consider. In the argument above, we assumed that only one of the 10,000 possible values was favorable; specifically, we assumed that the ratio π was the only one whose claims, so to speak, were being evaluated. However, it’s clear that if we had found a ratio just double this, the result would still have significance, as it would simply represent the ratio of the circumference to the diameter. In fact, Mr. Smyth's chosen ratio—the height to twice the width of the base compared to the diameter to the circumference—is clearly just one of many possible ratios. Furthermore, if the measured results had shown that the ratio of the height to one side of the base was 1:√2 (i.e., that of a side to a diagonal of a square) or 1:√3 (i.e., that of a side to a diagonal of a cube), wouldn’t such results also indicate evidence of design? By continuing in this manner, we could propose one known mathematical ratio after another until we had considered most of the 10,000 supposedly possible values. We could then argue: since almost every potential height of the pyramid would correspond to some mathematical ratio, a builder, unaware of all of them, would likely stumble upon one or the other. So why attribute design to him in one instance rather than another?
§ 17. The answer to this objection has been already hinted at. Everything turns upon the conventional estimate of one result as compared with another. Revert, for simplicity to the coins. Ten heads is just as likely as alternate heads and tails, or five heads followed by five tails; or, in fact, as any one of the remaining 1023 possible cases. But universal convention has picked out a run of ten as being remarkable. Here, of course, the convention seems a very 255 natural and indeed inevitable one, but in other cases it is wholly arbitrary. For instance, in cards, “queen of spades and knave of diamonds” is exactly as uncommon as any other such pair: moreover, till bezique was introduced it offered presumably no superior interest over any other specified pair. But during the time when that game was very popular this combination was brought into the category of coincidences in which interest was felt; and, given dishonesty amongst the players, its chance of being designed stood at once on a much better footing.[6]
§ 17. The answer to this objection has already been hinted at. Everything depends on the conventional value placed on one outcome compared to another. To keep it simple, let's go back to the coins. Getting ten heads in a row is just as likely as getting alternating heads and tails, or five heads followed by five tails; in fact, it's just as likely as any of the other 1023 possible outcomes. However, the general consensus has singled out a run of ten as being remarkable. Here, this convention seems very reasonable and even inevitable, but in other cases, it's completely arbitrary. For example, in cards, the "queen of spades and knave of diamonds" is just as rare as any other specific pair: besides, until bezique was introduced, it likely didn't hold any more interest than any other designated pair. But when that game was very popular, this combination became part of the coincidences that piqued interest; and with dishonesty among players, the chance of it being planned suddenly seemed much more plausible.[6]
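The point about convention can be made concrete with a small enumeration. The sketch below is an editorial addition: every particular sequence of ten tosses is equally likely, and what differs is only how many sequences fall into the class that convention happens to single out.

# Each specific sequence of ten tosses has probability 1/1024; the classes differ in size.
from itertools import product
from math import comb

sequences = list(product("HT", repeat=10))
print(len(sequences))              # 1024 equally likely sequences
print(comb(10, 10), comb(10, 5))   # 1 way to get ten heads, 252 ways to get five heads and five tails
print(1 / 1024, 252 / 1024)        # probability of each class as a whole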
Returning then to the pyramid, we see that in balancing the claims of chance and design we must, in fairness to the latter, reckon to its account several other values as well as that of π, e.g. √2 and √3, and a few more such simple and familiar ratios, as well as some of their multiples. But though the number of such values which might be reckoned, on the ground that they are actually known to us, is infinite, yet the number that ought to be reckoned, on the ground that they could have been familiar to the builders of a pyramid, are very few. The order of probability for or against the existence of design will not therefore be seriously altered here by such considerations.[7]
Returning to the pyramid, we see that in balancing the claims of chance and design, we must fairly acknowledge several other values besides π, such as √2 and √3, along with a few more simple and familiar ratios and some of their multiples. While the number of values that could be considered, because they are known to us, is infinite, the number that should be considered, based on what might have been familiar to the builders of a pyramid, is very limited. Therefore, the odds for or against the existence of design won't be significantly impacted by these factors.[7]
§ 18. Up to this point it will be observed that what we have been balancing against each other are two forms of agency,—of human agency, that is,—one acting through chance, and the other by direct design. In this case we know where we are, for we can thoroughly understand agency of this kind. The problem is indeed but seldom numerically soluble, and in most cases not soluble at all, but it is at any rate capable of being clearly stated. We know the kind of answer to be expected and the reasons which would serve to determine it, if they were attainable.
§ 18. Up to this point, it's clear that we've been weighing two types of human action against each other—one driven by chance and the other by intentional design. In this situation, we have clarity because we can fully grasp this kind of action. The problem is rarely solvable with numbers, and in many cases, it can't be solved at all, but it can certainly be stated clearly. We understand the type of answer we should anticipate and the reasons that would guide it, if we could access them.
The next stage in the enquiry would be that of balancing ordinary human chance agency against,—I will not call it direct spiritualist agency, for that would be narrowing the hypothesis unnecessarily,—but against all other possible causes. Some of the investigations of the Society for Psychical Research will furnish an admirable illustration of what is intended by this statement. There is a full discussion of these applications in a recent essay by Mr F. Y. Edgeworth;[8] but as his account of the matter is connected with other calculations and diagrams I can only quote it in part. But I am in substantial agreement with him.
The next step in the investigation will be to weigh ordinary human chance agency against—I'm not going to call it direct spiritualist influence, since that would limit the hypothesis unnecessarily—but against all other possible causes. Some of the studies by the Society for Psychical Research provide an excellent example of what I mean by this. There's a comprehensive discussion of these applications in a recent essay by Mr. F. Y. Edgeworth;[8] but since his explanation relates to other calculations and diagrams, I can only quote part of it. However, I largely agree with him.
“It is recorded that 1833 guesses were made by a ‘percipient’ as to the suit of cards which the ‘agent’ had fixed upon. The number of successful guesses was 510, considerably above 458, the number which, as being the quarter of 1833, would, on the supposition of pure chance, be more likely than any other number. Now, by the Law of Error, we are able approximately to determine the probability of 257 such an excess occurring by chance. It is equal to the extremity of the tail of a probability-curve such as [those we have already had occasion to examine]…. The proportion of this extremity of the tail to the whole body is 0.003 to 1. That fraction, then, is the probability of a chance shot striking that extremity of the tail; the probability that, if the guessing were governed by pure chance, a number of successful guesses equal or greater than 510 would occur”: odds, that is, of about 332 to 1 against such occurrence.
“It is recorded that 1,833 guesses were made by a ‘percipient’ regarding the suit of cards that the ‘agent’ had chosen. The number of successful guesses was 510, considerably above 458, the number which, being a quarter of 1,833, would, on the supposition of pure chance, be more likely than any other number. Now, according to the Law of Error, we can roughly determine the probability of such an excess happening by chance. It is equal to the extreme end of the tail of a probability curve similar to the ones we have previously examined…. The ratio of this extreme end of the tail to the whole is 0.003 to 1. That fraction, then, is the probability of a chance shot striking that extremity of the tail; the likelihood that, if guessing were purely random, there would be a number of successful guesses equal to or greater than 510”: odds, that is, of about 332 to 1 against such an occurrence.
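Edgeworth's figure can be roughly reproduced with the normal approximation to the binomial (the "Law of Error" of the quotation). The Python sketch below is an editorial check, not part of the original; the exact value depends slightly on how the approximation is applied.

# A rough check of the quoted probability of so large an excess of successes.
import math

n, p = 1833, 1 / 4
mean = n * p                               # 458.25 successes expected
sd = math.sqrt(n * p * (1 - p))            # about 18.5
z = (510 - mean) / sd                      # about 2.8 standard deviations of excess
tail = 0.5 * math.erfc(z / math.sqrt(2))   # upper-tail probability
print(mean, sd, z)
print(tail)                                # a small fraction, close to the 0.003 quoted
print(1 / tail)                            # roughly one chance in a few hundred, of the order of 332 to 1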
§ 19. Mr Edgeworth holds, as strongly as I do, that for purposes of calculation, in any strict sense of the word, we ought to have some determination of the data on the non-chance side of the hypothesis. We ought to know its relative frequency of occurrence, and the relative frequency with which it attains its aims. I am also in agreement with him that “what that other cause may be,—whether some trick, or unconscious illusion, or thought-transference of the sort which is vindicated by the investigators—it is for common-sense and ordinary Logic to consider.”
§ 19. Mr. Edgeworth firmly believes, just as I do, that for the sake of accurate calculation, we need to have some clear understanding of the data on the non-chance side of the hypothesis. We should know how often it occurs and how often it achieves its goals. I also agree with him that "whatever that other cause might be—whether it's some trick, an unconscious illusion, or thought-transference as validated by the researchers—it's up to common sense and ordinary logic to figure that out."
I am in agreement therefore with those who think that though we cannot form a quantitative opinion we can in certain cases form a tolerably decisive one. Of course if we allow the last word to the supporters of the chance hypothesis we can never reach proof, for it will always be open to them to revise and re-fix the antecedent probability of the counter hypothesis. What we may fairly require is that those who deny the chance explanation should assign some sort of minimum value to the probability of occurrence on the other supposition, and we can then try to surmount this by increasing the rarity of the actually produced phenomenon on the chance hypothesis. If, for instance, they declare that in their estimation the odds against any other than the chance agency being at work are greater than 332 to 1, we must 258 try to secure a yet uncommoner occurrence than that in question. If the supporters of thought-transference have the courage of their convictions,—as they most assuredly have,—they would not shrink from accepting this test. I am inclined to think that even at present, on such evidence as that above, the probability that the results were got at by ordinary guessing is very small.
I agree with those who believe that while we can't provide a precise quantitative assessment, we can, in some cases, form a fairly clear opinion. Of course, if we let the advocates of the chance theory have the final say, we can never establish proof, because they can always adjust and redefine the initial probability of the opposing theory. What we should reasonably ask is that those who reject the chance explanation assign some minimum value to the likelihood of an occurrence based on the other theory, and then we can attempt to surpass this by demonstrating that the actual phenomenon is even rarer under the chance hypothesis. For example, if they assert that in their view the odds of anything other than random chance being responsible are greater than 332 to 1, we must strive to identify an even rarer event than the one in question. If the proponents of thought-transference truly believe in their position—as they undoubtedly do—they shouldn't hesitate to accept this test. I tend to think that even now, based on the evidence mentioned, the likelihood that the results were achieved through ordinary guessing is quite low.
§ 20. The problems discussed in the preceding sections are at least intelligible even if they are not always resolvable. But before finishing this chapter we must take notice of some speculations upon this part of the subject which do not seem to keep quite within the limits of what is intelligible. Take for instance the question discussed by Arbuthnott (in a paper in the Phil. Transactions, Vol. XXVII.) under the title “An Argument for Divine Providence, taken from the constant Regularity observed in the birth of both sexes.” Had his argument been of the ordinary teleological kind; that is, had he simply maintained that the existent ratio of approximate equality, with a six per cent. surplusage of males, was a beneficent one, there would have been nothing here to object against. But what he contemplated was just such a balance of alternate hypotheses between chance and design as we are here considering. His conclusion in his own words is, “it is art, not chance, that governs.”
§ 20. The issues mentioned in the previous sections are at least understandable, even if they aren't always solvable. However, before we wrap up this chapter, we need to address some theories related to this topic that don’t quite stay within the bounds of what makes sense. For example, consider the question raised by Arbuthnott (in a paper in the Phil. Transactions, Vol. XXVII.) under the title “An Argument for Divine Providence, taken from the constant Regularity observed in the birth of both sexes.” If his argument had been the usual teleological one—that is, if he had simply argued that the current ratio of nearly equal numbers with a six percent surplus of males is a positive thing—there would be nothing to criticize. But what he actually looked at was a balance of alternative hypotheses between chance and design, which is what we’re discussing here. His conclusion, in his own words, is, “it is art, not chance, that governs.”
It is difficult to render such an argument precise without rendering it simply ridiculous. Strictly understood it can surely bear only one of two interpretations. On the one hand we may be personifying Chance: regarding it as an agent which must be reckoned with as being quite capable of having produced man, or at any rate having arranged the proportion of the sexes. And then the decision must be drawn, as between this agent and the Creator, which of the two produced the existent arrangement. If so, and Chance 259 be defined as any agent which produces a chance or random arrangement, I am afraid there can be little doubt that it was this agent that was at work in the case in question. The arrangement of male and female births presents, so far as we can see, one of the most perfect examples of chance: there is ultimate uniformity emerging out of individual irregularity: all the ‘runs’ or successions of each alternative are duly represented: the fact of, say, five sons having been already born in a family does not seem to have any certain effect in diminishing the likelihood of the next being a son, and so on. Such a nearly perfect instance of ‘independent events’ is comparatively very rare in physical phenomena. It is all that we can claim from a chance arrangement.[9] The only other interpretation I can see is to suggest that there was but one agent who might, like any one of us, have either tossed up or designed, and we have to ascertain which course he probably adopted in the case in question. Here too, if we are to judge of his mode of action by the tests we should apply to any work of our own, it would certainly look very much as if he had adopted some scheme of tossing.
It's hard to make this argument precise without making it seem ridiculous. Strictly understood, it can only have one of two meanings. On one hand, we might be personifying Chance, seeing it as an agent that's capable of creating humans or at least arranging the ratio of the sexes. Then we have to decide between this agent and the Creator, determining which of the two is responsible for the existing arrangement. If that's the case, and Chance is defined as any agent that produces a random arrangement, there's no doubt that this agent was at work here. The pattern of male and female births shows, as far as we can tell, one of the best examples of chance: uniformity ultimately emerges from individual irregularities. All the ‘runs’ or sequences of each alternative are adequately represented; for instance, the fact that, say, five sons have already been born in a family doesn't seem to have any real impact on the likelihood of the next one being a son, and so on. Such a nearly perfect case of ‘independent events’ is quite rare in physical phenomena. That's all we can claim from a chance arrangement.[9] The only other interpretation I can see is to suppose that there was a single agent who, like any one of us, might have either tossed up or designed, and that we have to work out which course he probably took in the case in question. Here too, if we judge his way of acting by the tests we would apply to any work of our own, it would certainly look very much as if he had adopted some scheme of tossing.
§ 21. The simple fact is that any rational attempt to 260 decide between chance and design as agencies must be confined to the case of finite intelligences. One of the important determining elements here, as we have seen, is the state of knowledge of the agent, and the conventional estimate entertained about this or that particular arrangement; and these can be appreciated only when we are dealing with beings like ourselves.
§ 21. The reality is that any logical effort to decide between chance and design as influences has to be limited to the situation of finite intelligences. One key factor here, as we've seen, is the knowledge level of the agent, and the general opinion held about this or that specific arrangement; these can only be understood when we're talking about beings like us.
For instance, to return to that much debated question about the arrangement of the stars, there can hardly be any doubt that what Mitchell,—who started the discussion,—had in view was the decision between Chance and Design. He says (Trans. Roy. Soc. 1767) “The argument I intend to make use of… is of that kind which infers either design or some general law from a general analogy and from the greatness of the odds against things having been in the present situation if it was not owing to some such cause.” And he concludes that had the stars “been scattered by mere chance as it might happen” there would be “odds of near 500,000 to 1 that no six stars out of that number [1500], scattered at random in the whole heavens, would be within so small a distance from each other as the Pleiades are.” Under any such interpretation the controversy seems to me to be idle. I do not for a moment dispute that there is some force in the ordinary teleological argument which seeks to trace signs of goodness and wisdom in the general tendency of things. But what do we possibly understand about the nature of creation, or the designs of the Creator, which should enable us to decide about the likelihood of his putting the stars in one shape rather than in another, or which should allow any significance to “mere chance” as contrasted with his supposed all-pervading agency?
For example, to revisit that highly debated question about how the stars are arranged, it’s hard to deny that what Mitchell, who sparked the discussion, really meant was the choice between Chance and Design. He states (Trans. Roy. Soc. 1767) “The argument I plan to use… is the kind that infers either design or some general law from a broad analogy and from the extremely long odds against things being in their current position if it weren’t due to some cause.” He concludes that if the stars “had been scattered by mere chance as could happen,” there would be “odds of nearly 500,000 to 1 that no six stars out of that number [1500], randomly scattered across the whole sky, would be within such a small distance of each other as the Pleiades are.” Under any interpretation, this controversy seems pointless to me. I don’t doubt for a second that there is some merit in the typical teleological argument that looks for signs of goodness and wisdom in the general trend of things. But what do we really understand about the nature of creation or the intentions of the Creator that would allow us to decide the probability of him arranging the stars in one way over another, or give any importance to “mere chance” as opposed to his supposed all-encompassing influence?
§ 22. Reduced to intelligible terms the two following questions seem to me to emerge from the controversy:—
§ 22. Put simply, two clear questions seem to come out of the debate:—
(I.) The stars being distributed through space, some of them would of course be nearly in a straight line behind others when looked at from our planet. Supposing that they were tolerably uniformly distributed, we could calculate about how many of them would thus be seen in apparent close proximity to one another. The question is then put, Are there more of them near to each other, two and two, than such calculation would account for? The answer is that there are many more. So far as I can see the only direct inference that can be drawn from this is that they are not uniformly distributed, but have a tendency to go in pairs. This, however, is a perfectly sound and reasonable application of the theory. Any further conclusions, such as that these pairs of stars will form systems, as it were, to themselves, revolving about one another, and for all practical purposes unaffected by the rest of the sidereal system, are of course derived from astronomical considerations.[10] Probability confines itself to the simple answer that the distribution is not uniform; it cannot pretend to say whether, and by what physical process, these binary systems of stars have been ‘caused’.[11]
(I.) The stars are spread out across space, so some of them are naturally going to appear almost in a straight line behind others when viewed from our planet. Assuming they are fairly evenly distributed, we could estimate how many of them would seem to be in close proximity to one another. This raises the question: Are there more stars close together, in pairs, than our calculations would suggest? The answer is yes, there are many more. From what I can tell, the only conclusion we can directly draw from this is that they are not uniformly distributed, but seem to prefer pairing up. This, however, is a perfectly valid and logical application of the theory. Any further conclusions, like the idea that these pairs of stars might form their own systems, orbiting each other and largely unaffected by the rest of the galaxy, come from astronomical insights.[10] Probability simply tells us that the distribution is not uniform; it can't claim to know whether, or by what physical process, these binary star systems have come to be.[11]
§ 23. (II.) The second question is this, Does the distribution of the stars, after allowing for the case of the binary 262 stars just mentioned, resemble that which would be produced by human agency sprinkling things ‘at random’? (We are speaking, of course, of their distribution as it appears to us, on the visible heavens, for this is nearly all that we can observe; but if they extend beyond the telescopic range in every direction, this would lead to practically much the same discussion as if we considered their actual arrangement in space.) We have fully discussed, in a former chapter, the meaning of ‘randomness.’ Applying it to the case before us, the question becomes this, Is the distribution tolerably uniform on the whole, but with innumerable individual deflections? That is, when we compare large areas, are the ratios of the number of stars in each equal area approximately equal, whilst, as we compare smaller and smaller areas, do the relative numbers become more and more irregular? With certain exceptions, such as that of the Milky Way and other nebular clusters, this seems to be pretty much the case, at any rate as regards the bulk of the stars.[12]
§ 23. (II.) The second question is this: Does the distribution of stars, after accounting for the binary stars mentioned earlier, look like what would happen if humans sprinkled them ‘at random’? (We’re talking about their distribution as we see it in the night sky, since that’s mostly what we can observe; however, if they go beyond what we can see with telescopes in every direction, it would lead to a discussion that's practically the same as if we considered their actual arrangement in space.) We have thoroughly discussed the meaning of ‘randomness’ in a previous chapter. Applying that to our situation, the question is: Is the distribution fairly uniform overall, but with countless individual variations? In other words, when we compare large areas, are the ratios of stars in each equal area roughly equal, while smaller and smaller areas show more and more irregularities? With some exceptions, like the Milky Way and other nebular clusters, this seems to generally hold true, at least when it comes to the majority of the stars.[12]
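The test described here (rough equality over large areas, growing irregularity over small ones) can be illustrated by simulation. The following Python sketch is an editorial addition; the square "sky", the number of points, and the cell sizes are all assumptions made purely for illustration.

# Sprinkle points 'at random' over a unit square, then compare counts over large and small cells.
import random

random.seed(1)
points = [(random.random(), random.random()) for _ in range(10000)]

def cell_counts(k):
    # number of points in each of the k*k equal cells
    grid = [[0] * k for _ in range(k)]
    for x, y in points:
        grid[min(int(x * k), k - 1)][min(int(y * k), k - 1)] += 1
    return [c for row in grid for c in row]

for k in (2, 20):
    cells = cell_counts(k)
    mean = sum(cells) / len(cells)
    spread = max(cells) / min(cells) if min(cells) else float("inf")
    print(k * k, "cells: mean count", mean, "max/min ratio about", round(spread, 2))
# With 4 large cells the counts are nearly equal; with 400 small ones the relative
# irregularity is much greater, which is what 'randomness' means in this context.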
All further questions: the decision, for instance, for or against any form of the Nebular Hypothesis: or, admitting this, the decision whether such and such parts of the visible heavens have sprung from the same nebula, must be left to Astronomy to adjudicate.
All further questions: the decision, for example, for or against any version of the Nebular Hypothesis; or, if we accept this, the decision on whether specific parts of the visible universe originated from the same nebula, must be left for Astronomy to decide.
NOTE ON THE PROPORTIONS OF THE SEXES.
The following remarks were rather too long for convenient insertion on p. 259, and are therefore appended here.
The following comments were a bit too lengthy to fit comfortably on p. 259, so they're added here instead.
The ‘random’ character of male and female births has generally been rested almost entirely on statistics of place and time. But what is more wanted, surely, is the proportion displayed when we compare a number of families. This seems so obvious that I cannot but suppose that the investigation must have been already made somewhere, though I have not found any trace of it in the most likely quarters. Thus Prof. Lexis (Massenerscheinungen) when supporting his view that the proportion between the sexes at birth is almost the only instance known to him, in natural phenomena, of true normal dispersion about a mean, rests his conclusions on the ordinary statistics of the registers of different countries.
The "random" nature of male and female births has usually relied almost entirely on statistics related to location and time. However, what we really need is the ratio shown when we examine a number of families. This seems so obvious that I can't help but think that this investigation must have been done somewhere, although I haven't found any evidence of it in the most likely sources. For instance, Prof. Lexis (Massenerscheinungen) supports his argument that the ratio of sexes at birth is nearly the only known case in natural phenomena of true normal distribution around a mean, basing his conclusions on the standard statistics from the records of different countries.
It certainly needs proof that the same characteristics will hold good when the family is taken as the unit, especially as some theories (e.g. that of Sadler) would imply that ‘runs’ of boys or girls would be proportionally commoner than pure chance would assign. Lexis has shown that this is most markedly the case with twins: i.e., to use an obviously intelligible notation, (M for male, F for female), that M.M. and F.F. are very much commoner in proportion than M.F.
It definitely needs to be proven that the same characteristics will hold when the family is taken as the unit, especially since some theories (like Sadler's) would imply that ‘runs’ of boys or girls are proportionally more common than pure chance would assign. Lexis has demonstrated that this is especially true for twins: that is, using an easy-to-understand notation (M for male, F for female), M.M. and F.F. are significantly more common in proportion than M.F.
I have collected statistics including over 13,000 male and female births, arranged in families of four and upwards. They were taken from the pedigrees in the Herald's Visitations, and therefore represent as a rule a somewhat select class, viz. the families of the eldest sons of English country gentlemen in the sixteenth century. They are not sufficiently extensive yet for publication, but I give a summary of the results to indicate their tendency so far. The upper line of figures in each case gives the observed results: i.e. 264 in the case of a family of four, the numbers which had four male, three male and one female, two male and two female, and so on. The lower line gives the calculated results; i.e. the corresponding numbers which would have been obtained had batches of M.s and F.s been drawn from a bag in which they were mixed in the ratio assigned by the total observed numbers for those families.
I have gathered data from over 13,000 male and female births, organized into families of four or more. These were sourced from the pedigrees in the Herald's Visitations and typically represent a somewhat exclusive group, specifically the families of the eldest sons of English country gentlemen in the sixteenth century. They aren't extensive enough yet for publication, but I’m providing a summary of the findings to show their tendencies so far. The upper line of figures in each case presents the observed results: for instance, in a family of four, the numbers that had four males, three males and one female, two males and two females, and so on. The lower line gives the calculated results; that is, the corresponding numbers we would have seen if groups of males and females had been drawn from a bag where they were mixed in the ratio indicated by the total observed numbers for those families.
[Table: the observed and calculated numbers for each family composition could not be recovered from the source text.]
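For readers who want to see how the 'calculated' line of such a table would be produced, here is an editorial Python sketch. The number of families and the proportion of male births below are purely illustrative assumptions, not Venn's data.

# Expected composition of four-child families if children were 'drawn from one bag'.
from math import comb

families = 1000   # assumed number of four-child families
p_male = 0.52     # assumed overall proportion of male births in those families

for boys in range(4, -1, -1):
    expected = families * comb(4, boys) * p_male ** boys * (1 - p_male) ** (4 - boys)
    print(boys, "boys,", 4 - boys, "girls: expected about", round(expected, 1))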
The numbers for the larger families are as yet too small to be worth giving, but they show the same tendency. It will be seen that in every case the observed central values are less than the calculated; and that the observed extreme values are much greater than the calculated. The results seem to suggest (so far) that a family cannot be likened to a chance drawing of the requisite number from one bag. A better analogy would be to suppose two bags, one with M.s in excess and the other with F.s in less excess, and that some persons draw from one and some from the other. But fuller statistics are needed.
The numbers for the larger families are still too small to be significant, but they show the same trend. It’s clear that in every case, the observed central values are less than the calculated ones, and the observed extreme values are much higher than the calculated ones. The results seem to suggest (for now) that a family can't be compared to a random selection of the necessary number from one bag. A better analogy might be to imagine two bags, one with an excess of males and the other with a lesser excess of females, and that some people draw from one bag and some from the other. However, more detailed statistics are needed.
It will be observed that the total excess of male births is large. This may arise from undue omission of females; but I have carefully confined myself to the two or three last generations, in each pedigree, for greater security.
It will be noted that the overall excess of male births is large. This might be due to an undue omission of females from the pedigrees; however, I have carefully confined myself to the last two or three generations in each pedigree, for greater security.
1 Essay on Probabilities, p. 114.
1 Essay on Probabilities, p. 114.
2 Doubts have been expressed about the truly random character of the digits in this case (v. De Morgan, Budget of Paradoxes, p. 291), and Jevons has gone so far as to ask (Principles of Science, p. 529), “Why should the value of π, when expressed to a great number of figures, contain the digit 7 much less frequently than any other digit!” I do not quite understand what this means. If such a question were asked in relation to any unusual divergence from the à priori chance in a case of throwing dice, say, we should probably substitute for it the following, as being more appropriate to our science:—Assign the degree of improbability of the event in question; i.e. its statistical rarity. And we should then proceed to judge, in the way indicated in the text, whether this improbability gave rise to any grounds of suspicion.
2 Some people have questioned the truly random nature of the digits in this case (v. De Morgan, Budget of Paradoxes, p. 291), and Jevons even went so far as to ask (Principles of Science, p. 529), “Why does the value of π, when expressed with a large number of figures, contain the digit 7 much less frequently than any other digit?” I’m not sure I fully grasp what this means. If a question like that were raised regarding any unusual deviation from the à priori chance in a dice-throwing scenario, we would probably rephrase it to be more suitable for our field:—Determine the degree of improbability of the event in question; i.e. its statistical rarity. Then we would assess, as suggested in the text, whether this improbability raises any grounds for suspicion.
The calculation is simple. The actual number of 7's, in the 708 digits, is 53: whilst the fair average would be 71. The question is, What is the chance of such a departure from the average in 708 turns? By the usual methods of calculation (v. Galloway on Probability) the chances against an excess or defect of 18 are about 44 : 1, in respect of any specified digit. But of course what we want to decide are the chances against some one of the ten showing this divergence. This I estimate as being approximately determined by the fraction (44/45)^10, viz. 0.8. This represents odds of only about 4 : 1 against such an occurrence, which is nothing remarkable. As a matter of fact several digits in the two other magnitudes which Mr Shanks had calculated to the same length, viz. Tan−1 1/5 and Tan−1 1/239, show the same divergencies (v. Proc. Roy. Soc. xxi. 319).
The calculation is straightforward. The actual number of 7's in the 708 digits is 53, while the fair average would be 71. The question is, what are the chances of this deviation from the average occurring in 708 turns? By the usual methods of calculation (see Galloway on Probability), the odds against an excess or deficit of 18 are about 44 to 1 for any specific digit. However, what we really want to determine is the odds against some one or other of the ten digits showing this divergence. I estimate this to be roughly determined by the fraction (44/45)^10, which is about 0.8. This represents odds of only about 4 to 1 against such an occurrence, which is not particularly remarkable. In fact, several digits in the two other quantities that Mr. Shanks calculated to the same number of places, specifically Tan−1 1/5 and Tan−1 1/239, show the same divergences (see Proc. Roy. Soc. xxi. 319).
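The arithmetic of this footnote can be checked with a brief sketch, assuming a binomial model for the count of a specified digit among 708 independent decimal places; the last step treats the ten digit-counts as independent, which is only approximately true since they must sum to 708.

```python
from math import comb

n, p = 708, 0.1                      # 708 digits, chance 1/10 for any specified digit
fair_average = 71                    # Venn's rounding of n * p = 70.8

def binom_pmf(k):
    """Probability that a specified digit occurs exactly k times."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Chance that a specified digit departs from the fair average by more than 18
# (an excess or a defect); the result is of the same order as Venn's 44 : 1,
# the exact figure depending a little on the cutoff convention.
p_one = sum(binom_pmf(k) for k in range(n + 1) if abs(k - fair_average) > 18)
print(f"one specified digit: about {(1 - p_one) / p_one:.0f} : 1 against")

# Chance that no digit at all shows such a departure (Venn's (44/45)^10 step),
# which comes out near 0.8, i.e. only about 4 : 1 against some digit diverging.
p_none = (1 - p_one) ** 10
print(f"some digit or other: p_none = {p_none:.2f}, "
      f"about {p_none / (1 - p_none):.1f} : 1 against")
```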
I may call attention here to a point which should have been noticed in the chapter on Randomness. We must be cautious when we decide upon the random character by mere inspection. It is very instructive here to compare the digits in π with those within the ‘period’ of a circulating decimal of very long period. That of 1 ÷ 7699, which yields the full period of 7698 figures, was calculated some years ago by two Cambridge graduates (Mr Lunn and Mr Suffield), and privately printed. If we confine our examination to a portion of the succession the random character seems plausible; i.e. the digits, and their various combinations, come out in nearly, but not exactly, equal numbers. So if we take batches of 10; the averages hover nicely about 45. But if we took the whole period which ‘circulates,’ we should find these characteristics overdone, and the random character would disappear. That is, instead of a merely ultimate approximation to equality we should have (as far as this is possible) an absolute attainment of it.
I want to highlight a point that should have been addressed in the chapter on Randomness. We need to be careful when determining randomness based just on what we see. It’s really informative to compare the digits in π with those in the ‘period’ of a long circulating decimal. For example, 1 ÷ 7699, which gives the full period of 7698 digits, was calculated a few years ago by two Cambridge graduates (Mr. Lunn and Mr. Suffield) and printed privately. If we only look at a portion of the sequence, the randomness appears convincing; that is, the digits and their various combinations show up in nearly, but not quite, equal amounts. So when we take groups of 10, the averages tend to settle around 45. However, if we look at the entire circulating period, we’d find that these traits become exaggerated, and the randomness would fade away. Instead of just being a rough approximation to equality, we would achieve an absolute equality as far as that’s possible.
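The contrast described here can be reproduced directly: the repeating block of 1/7699 is generated by ordinary long division, after which its digit frequencies, and the sums of 'batches of ten' (whose average must be close to 45, since the mean digit is 4.5), can be inspected. A minimal sketch:

```python
from collections import Counter

def decimal_period_digits(denominator):
    """Digits of the repeating block of 1/denominator, found by long division."""
    digits, remainder, seen = [], 1, set()
    while remainder and remainder not in seen:
        seen.add(remainder)
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits

period = decimal_period_digits(7699)
print(len(period))                 # 7698, the full period Venn cites
print(Counter(period))             # digit frequencies over the whole repeating block
print(sum(period[:10]))            # one 'batch of ten'; batch sums average about 45
```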
3 Of course this conventional estimate is nothing different in kind from that which may attach to any order or succession. Ten heads in succession is intrinsically or objectively indistinguishable in character from alternate heads and tails, or seven heads and three tails, &c. Its distinction only consists in its almost universal acceptance as remarkable.
3 Of course, this standard estimate is no different in kind from what can apply to any order or sequence. Ten heads in a row is intrinsically or objectively the same as alternating heads and tails, or seven heads and three tails, etc. Its uniqueness only comes from its almost universal recognition as noteworthy.
4 Our Inheritance in the Great Pyramid, Ed. III. 1877.
4 Our Inheritance in the Great Pyramid, Ed. III. 1877.
5 Made in Nature (Jan. 24, 1878) by Mr J. G. Jackson. It must be remarked that Mr Smyth's alternative statement of his case leads up to that explanation:—“The vertical height of the great pyramid is the radius of a theoretical circle the length of whose curved circumference is exactly equal to the sum of the lengths of the four straight sides of the actual and practical square base.” As regards the alternatives of chance and design, here, it must be remembered in justice to Mr Smyth's argument that the antithesis he admits to chance is not human, but divine design.
5 Made in Nature (Jan. 24, 1878) by Mr. J. G. Jackson. It should be noted that Mr. Smyth's alternative explanation supports this:—“The vertical height of the Great Pyramid is the radius of a theoretical circle whose curved circumference length is exactly equal to the total lengths of the four straight sides of the actual and practical square base.” Regarding the options of chance versus design, it must be acknowledged, in fairness to Mr. Smyth's argument, that the contrast he presents to chance is not human, but divine design.
6 See Cournot, Essai sur les fondements de nos connaissances. Vol. I. p. 71.
6 See Cournot, Essay on the Foundations of Our Knowledge, Vol. I., p. 71.
7 It deserves notice that considerations of this kind have found their way into the Law Courts though of course without any attempt at numerical valuation. Thus, in the celebrated De Ros trial, in so far as the evidence was indirect, one main ground of suspicion seems to have been that Lord De Ros, when dealing at whist, obtained far more court cards than chance could be expected to assign him; and that in consequence his average gains for several years in succession were unusually large. The counsel for the defence urged that still larger gains had been secured by other players without suspicion of unfairness,—(I cannot find that it was explained over how large an area of experience these instances had been sought; nor how far the magnitude of the stakes, as distinguished from the number of successes, accounted for that of the actual gains),—and that large allowance must be made for skill where the actual gains were computed. (See the Times’ report, Feb. 11, 1837.)
7 It's worth noting that thoughts like these have made their way into the courts, though, of course, without any attempt at quantifying them. In the famous De Ros trial, one main reason for suspicion seemed to be that Lord De Ros, while playing whist, received far more court cards than could reasonably be expected by chance; as a result, his average winnings over several years were unusually high. The defense argued that other players had made even larger winnings without any hint of cheating—(I can't find any explanation of how extensive this experience was; nor how much the size of the stakes, compared to the number of wins, influenced the actual gains)—and that a significant consideration should be given to skill when calculating the actual earnings. (See the Times’ report, Feb. 11, 1837.)
8 Metretike. At the end of this volume will be found a useful list of a number of other publications by the same author on allied topics.
8 Metretike. At the end of this volume, you'll find a helpful list of several other publications by the same author on related topics.
9 That is, if we look simply to statistical results, as Arbuthnott did, and as we should do if we were examining the tosses of a penny. If the remarkable theory of Dr Düsing (Die Regulierung des Geschlechts-verhältnisses… Jena, 1884) be confirmed, the matter would assume a somewhat different aspect. He attempts to show, both on physiological grounds, and by analysis of statistics referring to men and animals, that there is a decidedly compensatory process at work. That is, if for any cause either sex attains a preponderance, agencies are at once set in motion which tend to redress the balance. This is a modification and improvement of the older theory, that the relative age of the parents has something to do with the sex of the offspring.
9 That is, if we just look at statistical results, like Arbuthnott did, and like we should if we were analyzing the tosses of a coin. If Dr. Düsing's remarkable theory (Die Regulierung des Geschlechts-verhältnisses… Jena, 1884) is confirmed, the situation would take on a slightly different perspective. He tries to demonstrate, based on physiological arguments and by analyzing statistics related to both humans and animals, that there is definitely a compensatory process in play. In other words, if for any reason one sex becomes predominant, mechanisms are immediately triggered that work to restore balance. This is a refinement and enhancement of the older theory that the relative age of the parents influences the sex of the offspring.
Quetelet (Letters, p. 61) has attempted to prove a proposition about the succession of male and female births by certain experiments supposed to be tried upon an urn with black and white balls in it. But this is going too far. (See the note at the end of this chapter.)
Quetelet (Letters, p. 61) has tried to prove a claim about the order of male and female births through experiments supposedly conducted with an urn filled with black and white balls. But this is going too far. (See the note at the end of this chapter.)
10 It is precisely analogous to the conclusion that the flowers of the daisies (as distinguished from the plants, v. p. 109) are not distributed at random, but have a tendency to go in groups of two or more. Mere observation shows this: and then, from our knowledge of the growth of plants we may infer that these little groups spring from the same root.
10 It’s exactly like the conclusion that the flowers of the daisies (as opposed to the plants, v. p. 109) aren’t spread out randomly, but tend to grow in clusters of two or more. Just by observing, we can see this: and from what we know about how plants grow, we can deduce that these small clusters come from the same root.
11 In this discussion, writers often speak of the probability of a “physical connection” between these double stars. The phrase seems misleading, for on the usual hypothesis of universal gravitation all stars are physically connected, by gravitation. It is therefore better, as above, to make it simply a question of relative proximity, and to leave it to astronomy to infer what follows from unusual proximity.
11 In this discussion, writers often talk about the likelihood of a “physical connection” between these binary stars. The term seems misleading, because according to the standard theory of universal gravitation, all stars are physically connected through gravitational forces. It's better, as mentioned earlier, to frame it as a matter of relative proximity and let astronomy determine what conclusions can be drawn from unusual closeness.
12 Professor Forbes in the paper in the Philosophical Magazine already referred to (Ch. VII. § 18) gave several diagrams to show what were the actual arrangements of a random distribution. He scattered peas over a chess-board, and then counted the number which rested on each square. His figures seem to show that the general appearance of the stars is much the same as that produced by such a plan of scattering.
12 Professor Forbes, in the paper in the Philosophical Magazine already referred to (Ch. VII. § 18), provided several diagrams to illustrate the actual arrangements of a random distribution. He spread peas over a chessboard and then counted the number that landed on each square. His findings suggest that the overall look of the stars is very similar to what results from this method of scattering.
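Forbes's pea-scattering experiment is easy to imitate numerically: scatter points uniformly at random over a grid of squares and count how many fall in each. The board size and number of 'peas' below are arbitrary choices for illustration.

```python
import random
from collections import Counter

random.seed(1)
ROWS = COLS = 8                    # a chess-board of 64 squares
N_PEAS = 640                       # ten peas per square on average

hits = Counter()
for _ in range(N_PEAS):
    hits[(random.randrange(ROWS), random.randrange(COLS))] += 1

per_square = [hits.get((r, c), 0) for r in range(ROWS) for c in range(COLS)]
print(min(per_square), max(per_square))   # some squares sparse, some crowded
print(Counter(per_square))                # how many squares received 0, 1, 2, ... peas
```

Even with a perfectly uniform scattering rule, the counts per square vary considerably, which is Forbes's point about what a genuinely random arrangement of stars looks like.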
Some recent investigations by Mr R. A. Proctor seem to show, however, that there are at least two exceptions to this tolerably uniform distribution. (1) He has ascertained that the stars are decidedly more thickly aggregated in the Milky Way than elsewhere. So far as this is to be relied on the argument is the same as in the case of the double stars; it tends to prove that the proximity of the stars in the Milky Way is not merely apparent, but actual. (2) He has ascertained that there are two large areas, in the North and South hemispheres, in which the stars are much more thickly aggregated than elsewhere. Here, it seems to me, Probability proves nothing: we are simply denying that the distribution is uniform. What may follow in the way of inferences as to the physical process of causation by which the stars have been disposed is a question for the Astronomer. See Mr Proctor's Essays on Astronomy, p. 297. Also a series of Essays in The Universe and the coming Transits.
Some recent investigations by Mr. R. A. Proctor seem to show, however, that there are at least two exceptions to this fairly uniform distribution. (1) He has found that the stars are definitely more concentrated in the Milky Way than in other regions. If this is reliable, the argument is similar to that regarding double stars; it suggests that the closeness of the stars in the Milky Way is not just an illusion, but actually real. (2) He has identified two large areas, in the Northern and Southern hemispheres, where the stars are much more densely packed than elsewhere. Here, it seems to me, probability proves nothing: we are simply stating that the distribution is not uniform. What may come from this in terms of conclusions about the physical processes that led stars to be arranged this way is a question for the astronomer. See Mr. Proctor's Essays on Astronomy, p. 297. Also a series of essays in The Universe and the Coming Transits.
CHAPTER 11.
ON CERTAIN CONSEQUENCES OF THE OBJECTIVE TREATMENT OF A SCIENCE OF INFERENCE.[*]
* In the previous edition a large part of this chapter was devoted to the general consideration of the distinction between a Material and a Conceptualist view of Logic. I have omitted most of this here, as also a large part of a chapter devoted to the detailed discussion of the Law of Causation, as I hope before very long to express my opinions on these subjects more fully, and more appropriately, in a treatise on the general principles of Inductive Logic.
* In the previous edition, a significant portion of this chapter focused on the general distinction between a Material and a Conceptualist view of Logic. I’ve left out most of that here, along with a large section of a chapter that discussed the Law of Causation in detail, as I plan to share my thoughts on these topics more completely and appropriately in a treatise on the general principles of Inductive Logic soon.
§ 1. Students of Logic are familiar with that broad distinction between the two methods of treatment to which the names of Material and Conceptualist may be applied. The distinction was one which had been gradually growing up under other names before it was emphasized, and treated as a distinction within the field of Logic proper, by the publication of Mill's well known work. No one, for instance, can read Whewell's treatises on Induction, or Herschel's Discourse, without seeing that they are treating of much the same subject-matter, and regarding it in much the same way, as that which Mill discussed under the name of Logic, though they were not disposed to give it that name. That is, these writers throughout took it for granted that what they had to do was to systematise the facts of nature in their objective form, and under their widest possible treatment, and to expound the principal modes of inference and the principal practical aids in the investigation of these 266 modes of inference, which reason could suggest and which experience could justify. What Mill did was to bring these methods into close relation with such portions of the old scholastic Logic as he felt able to retain, to work them out into much fuller detail, to systematize them by giving them a certain philosophical and psychological foundation,—and to entitle the result Logic.
§ 1. Students of Logic know about the broad distinction between the two treatment methods known as Material and Conceptualist. This distinction had been developing under different names before it was highlighted and treated as a key difference in the field of Logic, particularly through Mill's well-known work. For example, no one can read Whewell's treatises on Induction or Herschel's Discourse without noticing that they are addressing very similar topics and viewing them similarly to how Mill approached them under the term Logic, even though they didn’t choose to call it that. These writers assumed that their task was to organize the facts of nature in their objective form, with the most comprehensive approach possible, and to explain the main methods of inference and the key practical supports for investigating these methods of inference that reason could propose and experience could validate. Mill's contribution was to closely connect these methods with parts of the old scholastic Logic that he felt he could incorporate, to develop them in greater detail, to systematize them by providing a philosophical and psychological foundation, and to label the outcome Logic.
The practical treatment of a science will seldom correspond closely to the ideal which its supporters propose to themselves, and still seldomer to that which its antagonists insist upon demanding from the supporters. If we were to take our account of the distinction between the two views of Logic expounded respectively by Hamilton and by Mill, from Mill and Hamilton respectively, we should certainly not find it easy to bring them under one common definition. By such a test, the material Logic would be regarded as nothing more than a somewhat arbitrary selection from the domain of Physical Science in general, and the conceptualist Logic nothing more than a somewhat arbitrary selection from the domain of Psychology. The former would omit all consideration of the laws of thought and the latter all consideration of the truth or falsehood of our conclusions.
The practical application of a science rarely matches the ideal that its supporters envision for themselves, and even less often aligns with what its critics demand from them. If we were to compare the distinctions between the two perspectives on Logic explained by Hamilton and Mill, taking insights from Mill and Hamilton respectively, we would definitely struggle to fit them under a single definition. According to this measure, material Logic would be seen as just a somewhat random selection from the broader field of Physical Science, while conceptualist Logic would be viewed as merely a somewhat random selection from the realm of Psychology. The former would ignore all consideration of the laws of thought, while the latter would overlook any assessment of the truth or falsehood of our conclusions.
Of course, in practice, such extremes as these are soon seen to be avoidable, and in spite of all controversial exaggerations the expounders of the opposite views do contrive to retain a large area of speculation in common. I do not propose here to examine in detail the restrictions by which this accommodation is brought about, or the very real and important distinctions of method, aim, tests, and limits which in spite of all approach to agreement are still found to subsist. To attempt this would be to open up rather too wide an enquiry to be suitable in a treatise on one subdivision only of the general science of Inference.
Of course, in reality, these extremes are quickly seen to be avoidable, and despite all the heated debates, those with opposing views do manage to keep a significant amount of speculation in common. I don’t intend to dive deeply into the limitations that lead to this compromise, or the real and important differences in methods, goals, tests, and boundaries that, despite efforts to agree, still exist. To do so would open up a discussion that’s too broad for a work focused solely on one subdivision of the general science of Inference.
§ 2. One subdivision of this enquiry is however really forced upon our notice. It does become important to consider the restrictions to which the ultra-material account of the province of Logic has to be subjected, because we shall thus have our attention drawn to an aspect of the subject which, slight and fleeting as it is within the region of Induction becomes very prominent and comparatively permanent in that of Probability. According to this ultra-material view, Inductive Logic would generally be considered to have nothing to do with anything but objective facts: its duty is to start from facts and to confine itself to such methods as will yield nothing but facts. What is doubtful it either establishes or it lets alone for the present, what is unattainable it rejects, and in this way it proceeds to build up by slow accretion a vast fabric of certain knowledge.
§ 2. One part of this inquiry really demands our attention. It's important to consider the limits that the ultra-material perspective on Logic has to follow because it highlights an aspect of the topic that, although minimal and brief in Induction, becomes much more significant and lasting in Probability. According to this ultra-material view, Inductive Logic is generally seen as dealing only with objective facts: its role is to start with facts and to stick to methods that yield only facts. Anything uncertain it either clarifies or sets aside for now, and anything unattainable it dismisses. In this way, it gradually builds up a vast structure of certain knowledge.
But of course all this is supposed to be done by human minds, and therefore if we enquire whether notions or concepts,—call them what we will,—have no place in such a scheme it must necessarily be admitted that they have some place. The facts which form our starting point must be grasped by an intelligent being before inference can be built upon them; and the ‘facts’ which form the conclusion have often, at any rate for some time, no place anywhere else than in the mind of man. But no one can read Mill's treatise, for instance, without noticing how slight is his reference to this aspect of the question. He remarks, in almost contemptuous indifference, that the man who digs must of course have a notion of the ground he digs and of the spade he puts into it, but he evidently considers that these ‘notions’ need not much more occupy the attention of the speculative logician, in so far as his mere inferences are concerned, than they occupy that of the husbandman.
But of course all this is supposed to be done by human minds, so if we ask whether ideas or concepts—call them what you want—have no place in such a system, we must admit that they do have some role. The facts that form our starting point must be understood by an intelligent being before we can draw inferences from them; and the 'facts' that make up the conclusion often exist, at least for a time, only in the mind of a person. However, no one can read Mill's treatise, for example, without noticing how little he addresses this aspect of the issue. He notes, almost with contempt, that the person who digs must have an idea of the ground they’re digging into and the spade they’re using, but he clearly thinks that these 'notions' don't need to concern the speculative logician much more than they concern the farmer.
§ 3. It must be admitted that there is some warrant 268 for this omission of all reference to the subjective side of inference so long as we are dealing with Inductive Logic. The inductive discoverer is of course in a very different position. If he is worthy of the name his mind at every moment will be teeming with notions which he would be as far as any one from calling facts: he is busy making them such to the best of his power. But the logician who follows in his steps, and whose business it is to explain and justify what his leader has discovered, is rather apt to overlook this mental or uncertain stage. What he mostly deals in are the ‘complete inductions’ and ‘well-grounded generalizations’ and so forth, or the exploded errors which contradict them: the prisoners and the corpses respectively, which the real discoverer leaves on the field behind him whilst he presses on to complete his victory. The whole method of science,—expository as contrasted with militant,—is to emphasize the distinction between fact and non-fact, and to treat of little else but these two. In other words a treatise on Inductive Logic can be written without any occasion being found to define what is meant by a notion or concept, or even to employ such terms.
§ 3. It must be acknowledged that there's some justification for not mentioning the subjective aspect of inference, especially when we're focusing on Inductive Logic. The inductive discoverer is, of course, in a completely different situation. If he truly deserves the title, his mind will constantly be filled with ideas that he wouldn't refer to as facts; he's actively working to turn them into facts to the best of his ability. However, the logician who follows in his footsteps, tasked with explaining and justifying what the discoverer has found, often tends to overlook this mental or uncertain stage. His main focus is on 'complete inductions' and 'well-grounded generalizations,' or the debunked errors that contradict them: the prisoners and the corpses, respectively, that the real discoverer leaves behind as he moves forward to achieve his goals. The entire method of science—expository rather than combative—is to highlight the difference between fact and non-fact and to primarily address these two categories. In other words, a treatise on Inductive Logic can be written without ever needing to define what a notion or concept means or even use those terms at all.
§ 4. And yet, when we come to look more closely, signs may be detected even within the field of Inductive Logic, of an occasional breaking down of the sharp distinction in question; we may meet now and then with entities (to use the widest term attainable) in reference to which it would be hard to say that they are either facts or conceptions. For instance, Inductive Logic has often occasion to make use of Hypotheses: to which of the above two classes are these to be referred? They do not seem in strictness to belong to either; nor are they, as will presently be pointed out, by any means a solitary instance of the kind.
§ 4. However, when we take a closer look, we can see signs within the field of Inductive Logic that occasionally blur the clear distinction in question; we may encounter entities (using the broadest term possible) that it would be difficult to classify as either facts or concepts. For example, Inductive Logic frequently relies on Hypotheses: which of the two categories do these belong to? They don’t seem to strictly fit into either one; nor, as will be explained shortly, are they a unique case of this nature.
It is true that within the province of Inductive Logic 269 these hypotheses do not give much trouble on this score. However vague may be the form in which they first present themselves to the philosopher's mind, they have not much business to come before us in our capacity of logicians until they are well on their way, so to say, towards becoming facts: until they are beginning to harden into that firm tangible shape in which they will eventually appear. We generally have some such recommendations given to us as that our hypotheses shall be well-grounded and reasonable. This seems only another way of telling us that however freely the philosopher may make his guesses in the privacy of his own study, he had better not bring them out into public until they can with fair propriety be termed facts, even though the name be given with some qualification, as by terming them ‘probable facts.’ The reason, therefore, why we do not take much account of this intermediate state in the hypothesis, when we are dealing with the inductive processes, is that here at any rate it plays only a temporary part; its appearance in that guise is but very fugitive. If the hypothesis be a sound one, it will soon take its place as an admitted fact; if not, it will soon be rejected altogether. Its state as a hypothesis is not a normal one, and therefore we have not much occasion to scrutinize its characteristics. In so saying, it must of course be understood that we are speaking as inductive logicians; the philosopher in his workshop ought, as already remarked, to be familiar enough with the hypothesis in every stage of its existence from its origin; but the logician's duty is different, dealing as he does with proof rather than with the processes of original investigation and discovery.
It's true that in the realm of Inductive Logic, these hypotheses don't create many issues on this front. No matter how vague they may appear at first to the philosopher's mind, they shouldn't really concern us as logicians until they are well on their way to becoming facts—until they start taking on the solid, tangible form in which they will ultimately emerge. We typically receive recommendations like our hypotheses should be well-grounded and reasonable. This is just another way of saying that while the philosopher can freely make guesses in the privacy of their own study, they should hold off on sharing them publicly until they can legitimately be called facts, even if they're labeled as "probable facts." The reason we don't pay much attention to this intermediate state of the hypothesis when dealing with inductive processes is that it plays only a temporary role; its presence in this form is very fleeting. If the hypothesis is sound, it will quickly be recognized as a fact; if not, it will soon be entirely dismissed. Its status as a hypothesis isn't a normal one, so we don't have much reason to examine its characteristics closely. When we say this, it's important to understand that we're speaking as inductive logicians; the philosopher in their workshop should be quite familiar with the hypothesis at every stage of its development from the beginning. However, the logician's role is different, focusing on proof rather than the processes of original investigation and discovery.
We might indeed even go further, and say that in many cases the hypothesis does not present itself to the reader, that is to the recipient of the knowledge, until it has ceased 270 to deserve that name at all. It may be first suggested to him along with the proof which establishes it, he not having had occasion to think of it before. It thus comes at a single step out of the obscurity of the unknown into the full possession of its rights as a fact, skipping practically the intermediate or hypothetical stage altogether. The original investigator himself may have long pondered over it, and kept it present to his mind, in this its dubious stage, but finally have given it to the world with that amount of evidence which raises it at once in the minds of others to the level of commonly accepted facts.
We might even go further and say that in many cases, the hypothesis doesn’t actually present itself to the reader, who is the recipient of the knowledge, until it no longer deserves that label at all. It may first be suggested to them along with the proof that establishes it, without them having thought about it before. It thus emerges directly from the obscurity of the unknown into full recognition as a fact, skipping the intermediate or hypothetical stage entirely. The original investigator may have spent a long time contemplating it and kept it in mind during this uncertain phase, but ultimately presented it to the world with enough evidence to elevate it in the minds of others to the status of commonly accepted facts.
Still this doubtful stage exists in every hypothesis, though for logical purposes, and to most minds, it exists in a very fugitive way only. When attention has been directed to it, it may be also detected elsewhere in Logic. Take the case, for instance, of the reference of names. Mill gives the examples of the sun, and a battle, as distinguished from the ideas of them which we, or children, may entertain. Here the distinction is plain and obvious enough. But if, on the other hand, we take the case of things whose existence is doubtful or disputed, the difficulty above mentioned begins to show itself. The case of merely extinct things, or such as have not yet come into existence, offers indeed no trouble, since of course actually present existence is not necessary to constitute a fact. The usual distinction may even be retained also in the case of mythical existences. Centaur and Griffin have as universally recognised a significance amongst the poets, painters, and heralds as lion and leopard have. Hence we may claim, even here, that our conceptions shall be ‘truthful,’ ‘consistent with fact,’ and so on, by which we mean that they are to be in accordance with universal convention upon such subjects. Necessary and universal accordance is sometimes claimed 271 to be all that is meant by ‘objective,’ and since universal accordance is attainable in the case of the notoriously fictitious, our fundamental distinction between fact and conception, and our determination that our terms shall refer to what is objective rather than to what is subjective, may with some degree of strain be still conceived to be tenable even here.
Still, this uncertain stage exists in every hypothesis, though for logical purposes, and to most people, it exists in a very fleeting way. Once attention has been drawn to it, it may also be found elsewhere in Logic. Take the example of the reference of names. Mill gives the examples of the sun and a battle, distinguishing them from the ideas we or children might have about them. Here, the distinction is clear enough. But if we consider things whose existence is uncertain or debated, the previously mentioned difficulty starts to emerge. Cases of merely extinct things or those that haven’t come into existence pose no real issues, since actual present existence is not required to constitute a fact. The usual distinction can also be kept in the case of mythical beings. Centaurs and Griffins have as widely recognized a significance among poets, painters, and heralds as lions and leopards do. Therefore, we can assert, even here, that our ideas should be ‘truthful,’ ‘consistent with fact,’ and so on, meaning that they should align with universal conventions on such topics. Necessary and universal agreement is sometimes said to be all that is meant by ‘objective,’ and since universal agreement can be achieved in the case of clearly fictional entities, our basic distinction between fact and conception, along with our assertion that our terms should refer to what is objective rather than what is subjective, can still be considered defensible, albeit with some struggle, even in this context.
§ 5. But when we come to the case of disputed phenomena the difficulty re-emerges. A supposed planet or new mineral, a doubtful fact in history, a disputed theological doctrine, are but a few examples out of many that might be offered. What some persons strenuously assert, others as strenuously deny, and whatever hope there may be of speedy agreement in the case of physical phenomena, experience shows that there is not much prospect of this in the case of those which are moral and historical, to say nothing of theological. So long as those who are in agreement confine their intercourse to themselves, their ‘facts’ are accepted as such, but as soon as they come to communicate with others all distinction between fact and conception is lost at once, the ‘facts’ of one party being mere groundless ‘conceptions’ to their opponents. There is therefore, I think, in these cases a real difficulty in carrying out distinctly and consistently the account which the Materialist logician offers as to the reference of names. It need hardly be pointed out that what thus applies to names or terms applies equally to propositions in which particular or general statements are made involving names.
§ 5. However, when we look at cases of disputed phenomena, the difficulty comes back. A supposed planet or a new mineral, a questionable historical fact, a debated theological doctrine—these are just a few examples among many. What some people insist on, others firmly deny, and while there might be hope for quick agreement regarding physical phenomena, experience shows that there’s little chance for this when it comes to moral and historical issues, not to mention theological ones. As long as those who agree only interact with each other, their 'facts' are accepted as such. But as soon as they try to communicate with others, the line between fact and belief disappears, with one group's 'facts' being seen as baseless 'beliefs' by their opponents. Therefore, I believe there is a genuine difficulty in clearly and consistently applying the explanations that the Materialist logician provides regarding the references of names. It hardly needs stating that what applies to names or terms also applies equally to propositions that include specific or general statements involving those names.
§ 6. But when we step into Probability, and treat this from the same material or Phenomenal point of view, we can no longer neglect the question which is thus presented to us. The difficulty cannot here be rejected, as referring to what is merely temporary or occasional. The intermediate condition between conjecture and fact, so far from being temporary 272 or occasional only, is here normal. It is just the condition which is specially characteristic of Probability. Hence it follows that however decidedly we may reject the Conceptualist theory we cannot altogether reject the use of Conceptualist language. If we can prove that a given man will die next year, or attain sufficiently near to proof to leave us practically certain on the point, we may speak of his death as a (future) fact. But if we merely contemplate his death as probable? This is the sort of inference, or substitute for inference, with which Probability is specially concerned. We may, if we so please, speak of ‘probable facts,’ but if we examine the meaning of the words we may find them not merely obscure, but self-contradictory. Doubtless there are facts here, in the fullest sense of the term, namely the statistics upon which our opinion is ultimately based, for these are known and admitted by all who have looked into the matter. The same language may also be applied to that extension of these statistics by induction which is involved in the assertion that similar statistics will be found to prevail elsewhere, for these also may rightfully claim universal acceptance. But these statements, as was abundantly shown in the earlier chapters, stand on a very different footing from a statement concerning the individual event; the establishment and discussion of the former belong by rights to Induction, and only the latter to Probability.
§ 6. When we delve into Probability and look at it from the same material or Phenomenal perspective, we can’t ignore the questions it raises. The challenge here can’t be dismissed as something merely temporary or occasional. The state that exists between conjecture and fact is not just temporary or occasional; it's actually the norm. This state is specifically what defines Probability. Therefore, even though we may strongly refute the Conceptualist theory, we can’t completely disregard the use of Conceptualist language. If we can prove that a person will die next year, or get close enough to that proof to make us practically certain, we can refer to his death as a (future) fact. But what if we only think of his death as probable? This is the kind of inference, or substitute for inference, that Probability focuses on. We could refer to 'probable facts,' but if we analyze what those words mean, they may turn out to be not only unclear but also contradictory. There are certainly facts in the fullest sense of the word, namely the statistics our opinion is ultimately based on, as these are known and accepted by everyone who has looked into it. The same language can also apply to extending these statistics through induction, which involves claiming that similar statistics will be found elsewhere, as these too deserve universal acceptance. However, these statements, as clearly shown in earlier chapters, are on a very different foundation from statements about individual events; the creation and discussion of the former rightfully belong to Induction, while only the latter belongs to Probability.
§ 7. It is true that for want of appropriate terms to express such things we are often induced, indeed compelled, to apply the same name of ‘facts’ to such individual contingencies. We should not, for instance, hesitate to speak of the fact of the man dying being probable, possible, unlikely, or whatever it might be. But I cannot help regarding such expressions as a strictly incorrect usage arising out of a 273 deficiency of appropriate technical terms. It is doubtless certain that one or other of the two alternatives must happen, but this alternative certainty is not the subject of our contemplation; what we have before us is the single alternative, which is notoriously uncertain. It is this, and this only, which is at present under notice, and whose occurrence has to be estimated. We have surely no right to dignify this with the name of a fact, under any qualifications, when the opposite alternative has claims, not perhaps actually equal to, but at any rate not much inferior to its own. Such language, as already remarked, may be quite right in Inductive logic, where we are only concerned with conjectures of such a high degree of likelihood that their non-occurrence need not be taken into practical account, and which are moreover regarded as merely temporary. But in Probability the conjecture may have any degree of likelihood about it; it may be just as likely as the other alternative, nay it may be much less likely. In these latter cases, for instance, if the chances are very much against the man's death, it is surely an abuse of language to speak of the ‘fact’ of his dying, even though we qualify it by declaring it to be highly improbable. The subject-matter essential to Probability being the uncertain, we can never with propriety employ upon it language which in its original and correct application is only appropriate to what is actually or approximately certain.
§ 7. It's true that because there aren’t appropriate terms to express these ideas, we often find ourselves, even forced to, using the same term ‘facts’ for these individual situations. For example, we shouldn’t hesitate to talk about the fact that a man dying is probable, possible, unlikely, or whatever it may be. However, I still see such phrases as strictly incorrect usage due to a lack of suitable technical terms. It’s quite certain that one of the two possibilities must occur, but this certainty of alternatives isn't what we’re focusing on; what we have in front of us is the single alternative, which is inherently uncertain. It’s this, and only this, that we’re currently considering, and whose likelihood we need to evaluate. We really shouldn’t elevate this to the level of a fact, under any circumstances, when the opposite possibility has claims that are, if not exactly equal, at least not much lower. As mentioned earlier, this kind of language might be appropriate in Inductive logic, where we deal with conjectures that are so likely that we don’t need to practically consider their non-occurrence and are seen as temporary. But in Probability, a conjecture can have any level of likelihood; it can be just as likely as the opposing possibility, or even much less likely. In these cases, for example, if the odds are strongly against the man's death, it’s certainly a misuse of language to refer to the ‘fact’ of his dying, even if we characterize it as highly improbable. Since the essence of Probability is uncertainty, we can never properly use language that, in its original and correct context, is only suitable for what is actually or nearly certain.
§ 8. It should be remembered also that this state of things, thus characteristic of Probability, is permanent there. So long as they remain under the treatment of that science our conjectures, or whatever we like to call them, never develop into facts. I calculate, for instance, the chance that a die will give ace, or that a man will live beyond a certain age. Such an approximation to knowledge as is thus acquired is as much as we can ever afterwards hope to get, 274 unless we resort to other methods of enquiry. We do not, as in Induction, feel ourselves on the brink of some experimental or other proof which at any moment may raise it into certainty. It is nothing but a conjecture of a certain degree of strength, and such it will ever remain, so long as Probability is left to deal with it. If anything more is ever to be made out of it we must appeal to direct experience, or to some kind of inductive proof. As we have so often said, individual facts can never be determined here, but merely ultimate tendencies and averages of many events. I may, indeed, by a second appeal to Probability improve the character of my conjecture, through being able to refer it to a narrower and better class of statistics; but its essential nature remains throughout what it was.
§ 8. It's important to remember that this situation, which is characteristic of Probability, is permanent. As long as we rely on that science, our guesses, or whatever we choose to call them, never turn into facts. For example, I calculate the likelihood of a die showing an ace, or that a person will live past a certain age. This type of approximation to knowledge is about all we can hope for, unless we explore other methods of investigation. Unlike Induction, we don’t feel like we’re on the verge of some experimental proof that might elevate it to certainty. It’s nothing more than a guess of a certain strength, and it will always stay that way as long as Probability is in charge. If we want to get anything more out of it, we need to turn to direct experience or some sort of inductive proof. As we've often noted, individual facts can never be pinpointed here, only overarching tendencies and averages of multiple events. I can, of course, improve the character of my conjecture by a second appeal to Probability, referring it to a narrower and better class of statistics; but its essential nature remains throughout what it was.
It appears to me therefore that the account of the Materialist view of logic indicated at the commencement of this chapter, though substantially sound, needs some slight reconsideration and re-statement. It answers admirably so far as ordinary Induction is concerned, but needs some revision if it is to be equally applicable to that wider view of the nature and processes of acquiring knowledge wherein the science of logic is considered to involve Probability also as well as Induction.
It seems to me that the explanation of the Materialist view of logic mentioned at the beginning of this chapter, while fundamentally sound, requires a bit of reevaluation and restatement. It works well for standard Induction, but needs some adjustments to apply equally to the broader perspective on the nature and processes of gaining knowledge, where the science of logic is seen to encompass both Probability and Induction.
§ 9. Briefly then it is this. We regard the scientific thinker, whether he be the original investigator who discovers, or the logician who analyses and describes the proofs that may be offered, as surrounded by a world of objective phenomena extending indefinitely both ways in time, and in every direction in space. Most of them are, and always will remain, unknown. If we speak of them as facts we mean that they are potential objects of human knowledge, that under appropriate circumstances men could come to determinate and final agreement about them. The scientific or 275 material logician has to superintend the process of converting as much as possible of these unknown phenomena into what are known, of aggregating them, as we have said above, about the nucleus of certain data which experience and observation had to start with. In so doing his principal resources are the Methods of Induction, of which something has been said in a former chapter; another resource is found in the Theory of Probability, and another in Deduction.
§ 9. In short, here's the idea. We view the scientific thinker, whether they are the original researcher who discovers or the logician who analyzes and explains the proofs presented, as surrounded by a universe of objective phenomena that extend endlessly both into the past and the future, and in every direction in space. Most of these phenomena are, and will always remain, unknown. When we refer to them as facts, we mean that they are potential subjects of human knowledge, and that, under the right conditions, people could reach a clear and final understanding of them. The scientific or material logician is responsible for overseeing the process of transforming as many of these unknown phenomena as possible into known ones, collecting them, as stated earlier, around the central core of certain data that experience and observation have provided. In this process, their main tools are the Methods of Induction, which we've discussed in a previous chapter; another tool is the Theory of Probability, and a third is Deduction.
Now, however such language may be objected to as savouring of Conceptualism, I can see no better compendious way of describing these processes than by saying that we are engaged in getting at conceptions of these external phenomena, and as far as possible converting these conceptions into facts. What is the natural history of ‘facts’ if we trace them back to their origin? They first come into being as mere guesses or conjectures, as contemplated possibilities whose correspondence with reality is either altogether disbelieved or regarded as entirely doubtful. In this stage, of course, their contrast with facts is sharp enough. How they arise it does not belong to Logic but to Psychology to say. Logic indeed has little or nothing to do with them whilst they are in this form. Everyone is busy all his life in entertaining such guesses upon various subjects, the superiority of the philosopher over the common man being mainly found in the quality of his guesses, and in the skill and persistence with which he sifts and examines them. In the next stage they mostly go by the name of theories or hypotheses, when they are comprehensive in their scope, or are in any way on a scale of grandeur and importance: when however they are of a trivial kind, or refer to details, we really have no distinctive or appropriate name for them, and must be content therefore to call them ‘conceptions.’ Through this stage they flit with great rapidity in Inductive Logic; often the 276 logician keeps them back until their evidence is so strong that they come before the world at once in the full dignity of facts. Hence, as already remarked, this stage of their career is not much dwelt upon in Logic. But the whole business of Probability is to discuss and estimate them at this point. Consequently, so far as this science is concerned, the explanation of the Material logician as to the reference of names and propositions has to be modified.
Now, although some might argue that this language leans towards Conceptualism, I can't think of a clearer way to describe these processes than by saying we're trying to understand concepts of external phenomena and, as much as we can, turning these concepts into facts. What’s the natural history of ‘facts’ if we trace them back to their origin? They start as mere guesses or conjectures—ideas we think might be possible, but whose connection to reality is often doubted or completely disbelieved. At this point, they stand in sharp contrast to facts. It’s not Logic but Psychology that explains how they come about. Logic actually has little to do with them while they’re in this form. Everyone spends their life entertaining such guesses on various topics, with a philosopher's edge being mostly found in the quality of their guesses and in the skill and persistence they use to analyze and test them. In the next stage, these guesses are often called theories or hypotheses when they encompass broad ideas or carry significant importance; however, when they are trivial or refer to small details, we don’t really have a proper name for them, so we settle on calling them ‘conceptions.’ During this phase, they move quickly in Inductive Logic; often, the logician holds them back until the evidence is strong enough for them to be presented to the world as full-fledged facts. Thus, as noted earlier, this stage of their development isn’t emphasized much in Logic. But the entire purpose of Probability is to discuss and assess them at this point. Therefore, regarding this science, the explanation from the Material logician about how names and propositions refer to each other needs to be adjusted.
§ 10. The best way therefore of describing our position in Probability is as follows:—We are entertaining a conception of some event, past, present, or future. From the nature of the case this conception is all that can be actually entertained by the mind. In its present condition it would be incorrect to call it a fact, though we would willingly, if we could, convert it into such by making certain of it one way or the other. But so long as our conclusions are to be effected by considerations of Probability only, we cannot do this. The utmost we can do is to estimate or evaluate it. The whole function of Probability is to give rules for so doing. By means of reference to statistics or by direct deduction, as the case may be, we are enabled to say how much this conception is to be believed, that is in what proportion out of the total number of cases we shall be right in so doing. Our position, therefore, in these cases seems distinctly that of entertaining a conception, and the process of inference is that of ascertaining to what extent we are justified in adding this conception to the already received body of truth and fact.
§ 10. The best way to describe our position in Probability is as follows:—We are considering an idea about some event, whether it's from the past, present, or future. Given the situation, this idea is all we can actually hold in our minds. Right now, it wouldn't be correct to call it a fact, even though we would gladly turn it into one if we could confirm it either way. But as long as we are relying solely on Probability for our conclusions, we can’t do that. The most we can do is to estimate or evaluate it. The entire purpose of Probability is to provide guidelines for that process. By referencing statistics or making direct deductions, we can determine how credible this idea is, meaning what portion of the total cases we can expect to be correct by doing so. Thus, our role in these situations clearly involves considering an idea, and the process of inference involves figuring out to what extent we are justified in adding this idea to the existing body of truth and fact.
So long, then, as we are confined to Probability these conceptions remain such. But if we turn to Induction we see that they are meant to go a step further. Their final stage is not reached until they have ripened into facts, and so taken their place amongst uncontested truths. This is 277 their final destination in Logic, and our task is not accomplished until they have reached it.
As long as we stick to Probability, these ideas stay as they are. However, when we shift to Induction, we see that they're intended to evolve further. They don't reach their ultimate stage until they develop into facts and establish themselves among accepted truths. This is their final goal in Logic, and our work isn't done until they achieve it.
§ 11. Such language as this in which we speak of our position in Probability as being that of entertaining a conception, and being occupied in determining what degree of belief is to be assigned to it, may savour of Conceptualism, but is in spirit perfectly different from it. Our ultimate reference is always to facts. We start from them as our data, and reach them again eventually in our results whenever it is possible. In Probability, of course, we cannot do this in the individual result, but even then (as shown in Ch. VI.) we always justify our conclusions by appeal to facts, viz. to what happens in the long run.
§ 11. The way we talk about our position in Probability, as if we are contemplating an idea and figuring out what level of belief it should have, might seem like Conceptualism. However, it is fundamentally different. Our ultimate reference is always to facts. We begin with them as our basis and, whenever possible, return to them in our findings. In Probability, we can't always do this for individual outcomes, but even then (as shown in Ch. VI.) we always justify our conclusions by referencing the facts, specifically to what occurs over the long term.
The discussion which has been thus given to this part of the subject may seem somewhat tedious, but it was so obviously forced upon us when considering the distinction between the two main views of Logic, that it was impossible to pass it over without fear of misapprehension and confusion. Moreover, as will be seen in the course of the next chapter, several important conclusions could not have been properly explained and justified without first taking pains to make this part of our ground perfectly plain and satisfactory.
The discussion we've had on this part of the topic might seem a bit tedious, but it was clearly necessary when looking at the difference between the two main views of Logic. We couldn't skip it without risking misunderstanding and confusion. Additionally, as will be explained in the next chapter, several important conclusions wouldn't have been properly clarified and justified without first ensuring that this part of our foundation is completely clear and satisfactory.
CHAPTER 12.
CONSEQUENCES OF THE FOREGOING DISTINCTIONS.
§ 1. We are now in a position to explain and justify some important conclusions which, if not direct consequences of the distinctions laid down in the last chapter, will at any rate be more readily appreciated and accepted after that exposition.
§ 1. We can now explain and justify some important conclusions that, while not necessarily direct outcomes of the distinctions made in the last chapter, will at least be easier to understand and accept after that explanation.
In the first place, it will be seen that in Probability time has nothing to do with the question; in other words, it does not matter whether the event, whose probability we are discussing, be past, present, or future. The problem before us, in its simplest form, is this:—Statistics (extended by Induction, and practically often gained by Deduction) inform us that a certain event has happened, does happen, or will happen, in a certain way in a certain proportion of cases. We form a conception of that event, and regard it as possible; but we want to do more; we want to know how much we ought to expect it (under the explanations given in a former chapter about quantity of belief). There is therefore a sort of relative futurity about the event, inasmuch as our knowledge of the fact, and therefore our justification or otherwise of the correctness of our surmise, almost necessarily comes after the surmise was formed; but the futurity is only relative. The evidence by which the question is to be settled may not be forthcoming yet, or we may have it by 279 us but only consult it afterwards. It is from the fact of the futurity being, as above described, only relative, that I have preferred to speak of the conception of the event rather than of the anticipation of it. The latter term, which in some respects would have seemed more intelligible and appropriate, is open to the objection, that it does rather, in popular estimation, convey the notion of an absolute as opposed to a relative futurity.
In the first place, it’s clear that in Probability, time isn’t relevant to the question; in other words, it doesn’t matter whether the event we’re discussing happened in the past, is happening now, or will happen in the future. The problem we’re dealing with, in its simplest form, is this: Statistics (extended by Induction, and in practice often obtained by Deduction) tell us that a certain event has occurred, is occurring, or will occur in a specific way and in a certain proportion of cases. We form an idea of that event and consider it possible; but we want to go further—we want to know how much we should expect it (based on the explanations provided in a previous chapter about the quantity of belief). There is, therefore, a sort of relative futurity concerning the event because our understanding of the fact, and consequently our justification for whether our assumption is correct, almost always comes after the assumption was made; but the futurity is only relative. The evidence needed to resolve the question might not be available yet, or we might have it on hand but only consult it later. It is due to the fact that the futurity, as described above, is only relative that I chose to refer to the idea of the event rather than the anticipation of it. The latter term, which might have seemed clearer and more fitting in some respects, is problematic because it tends to suggest, in common understanding, a notion of absolute instead of relative futurity.
§ 2. For example; a die is thrown. Once in six times it gives ace; if therefore we assume, without examination, that the throw is ace, we shall be right once in six times. In so doing we may, according to the usual plan, go forwards in time; that is, form our opinion about the throw beforehand, when no one can tell what it will be. Or we might go backwards; that is, form an opinion about dice that had been cast on some occasion in time past, and then correct our opinion by the testimony of some one who had been a witness of the throws. In either case the mental operation is precisely the same; an opinion formed merely on statistical grounds is afterwards corrected by specific evidence. The opinion may have been formed upon a past, present, or future event; the evidence which corrects it afterwards may be our own eyesight, or the testimony of others, or any kind of inference; by the evidence is merely meant such subsequent examination of the case as is assumed to set the matter at rest. It is quite possible, of course, that this specific evidence should never be forthcoming; the conception in that case remains as a conception, and never obtains that degree of conviction which qualifies it to be regarded as a ‘fact.’ This is clearly the case with all past throws of dice the results of which do not happen to have been recorded.
§ 2. For instance, when a die is rolled, it shows an ace once out of every six times. If we simply assume, without checking, that the result will be an ace, we'll only be right one out of six times. We can approach this in two ways: we can go forwards in time, meaning we form our opinion about the result before it happens, when no one knows what it will be. Or we can go backwards; that is, we can form an opinion about dice that were rolled in the past and then adjust our opinion based on someone else's account of the results. In both cases, the mental process is exactly the same: an opinion based only on statistics is later corrected by specific evidence. This opinion can be about an event that has occurred, is happening, or will happen. The evidence that corrects this opinion could come from our own observation, someone else's testimony, or any type of deduction; by evidence, we mean any follow-up examination of the situation that is expected to clarify the matter. It’s entirely possible, of course, that this specific evidence might never appear; in that case, the idea remains just an idea and doesn't reach the level of certainty required to be considered a 'fact.' This is clearly the situation with any past dice rolls that have not been documented.
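Put as a bare proportion, the arithmetic behind such a guess may be sketched as:

\[
P(\text{ace}) = \tfrac{1}{6}, \qquad P(\text{not ace}) = \tfrac{5}{6},
\]

so that a standing policy of always asserting ‘ace’, whether before or after the throw, would be right in the long run in one case out of six and wrong in the other five.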
In discussing games of chance there are obvious advantages in confining ourselves to what is really, as well as 280 relatively, future, for in that case direct information concerning the contemplated result being impossible, all persons are on precisely the same footing of comparative ignorance, and must form their opinion entirely from the known or inferred frequency of occurrence of the event in question. On the other hand, if the event be passed, there is almost always evidence of some kind and of some value, however slight, to inform us what the event really was; if this evidence is not actually at hand, we can generally, by waiting a little, obtain something that shall be at least of some use to us in forming our opinion. Practically therefore we generally confine ourselves, in anticipations of this kind, to what is really future, and so in popular estimation futurity becomes indissolubly associated with probability.
When talking about games of chance, it makes sense to focus on what is really, as well as relatively, future. In that case, since we can't get direct information about the expected outcome, everyone is on the same level of comparative ignorance and must base their opinions solely on the known or inferred frequency of the event happening. On the flip side, if the event has already occurred, there’s usually some sort of evidence, even if it’s minimal, that tells us what actually happened. If we don’t have this evidence right away, we can often wait a bit and still get something that will help us form our opinion. So, in practice, we usually confine anticipations of this kind to what is genuinely in the future, which means that in popular belief, futurity becomes inseparably linked with probability.
§ 3. There is however an error closely connected with the above view of the subject, or at least an inaccuracy of expression which is constantly liable to lead to error, which has found wide acceptance, and has been sanctioned by writers of the greatest authority. For instance, both Butler, in his Analogy, and Mill, have drawn attention, under one form of expression or another, to the distinction between improbability before the event and improbability after the event, which they consider to be perfectly different things. That this phraseology indicates a distinction of importance cannot be denied, but it seems to me that the language in which it is often expressed requires to be amended.
§ 3. However, there is an error closely related to the view mentioned above, or at least an inaccurate way of expressing it that often leads to misunderstanding. This error has gained widespread acceptance and has been endorsed by highly respected authors. For example, both Butler in his Analogy and Mill have pointed out, in one way or another, the difference between improbability before an event and improbability after it, which they believe are fundamentally different concepts. While the terminology used does highlight an important distinction, I think the way it’s often presented needs to be revised.
Butler's remarks on this subject occur in his Analogy, in the chapter on miracles. Admitting that there is a strong presumption against miracles (his equivalent for the ordinary expression, an ‘improbability before the event’) he strives to obtain assent for them by showing that other events, which also have a strong presumption against them, are received on what is in reality very slight evidence. He 281 says, “There is a very strong presumption against common speculative truths, and against the most ordinary facts, before the proof of them; which yet is overcome by almost any proof. There is a presumption of millions to one against the story of Cæsar, or of any other man. For, suppose a number of common facts so and so circumstanced, of which one had no kind of proof, should happen to come into one's thoughts, every one would without any possible doubt conclude them to be false. And the like may be said of a single common fact.”
Butler's comments on this topic appear in his Analogy, in the chapter about miracles. He acknowledges that there is a strong presumption against miracles (his version of the usual phrase, an ‘improbability before the event’) and tries to gain acceptance for them by demonstrating that other occurrences, which also have a strong presumption against them, are accepted based on what is actually very weak evidence. He states, “There is a very strong presumption against common speculative truths and against the most ordinary facts before we have proof of them; yet this presumption is overcome by almost any evidence. There is a presumption of millions to one against the story of Cæsar or of any other man. For instance, if a number of common facts in specific circumstances, of which one had no kind of proof, happened to come into one's thoughts, everyone would without any doubt conclude them to be false. The same can be said about a single common fact.”
§ 4. These remarks have been a good deal criticized, and they certainly seem to me misleading and obscure in their reference. If one may judge by the context, and by another passage in which the same argument is afterwards referred to,[1] it would certainly appear that Butler drew no distinction between miraculous accounts, and other accounts which, to use any of the various expressions in common use, are unlikely or improbable or have a presumption against them; and concluded that since some of the latter were instantly accepted upon somewhat mediocre testimony, it was altogether irrational to reject the former when similarly or better supported.[2] This subject will come again under our notice, and demand fuller discussion, in the chapter on the Credibility of extraordinary stories. It will suffice here to 282 remark that, however satisfactory such a view of the matter might be to some theologians, no antagonist of miracles would for a moment accept it. He would naturally object that, instead of the miraculous element being (as Butler considers) “a small additional presumption” against the narrative, it involved the events in a totally distinct class of incredibility; that it multiplied, rather than merely added to, the difficulties and objections in the way of accepting the account.
§ 4. These remarks have faced quite a bit of criticism, and they certainly seem misleading and unclear in their references. If we look at the context and at another passage where the same argument is referred to later,[1] it appears that Butler made no distinction between miraculous accounts and other accounts that, to use any of the common terms, are unlikely or improbable or have a presumption against them. He concluded that since some of these latter accounts were readily accepted based on relatively weak evidence, it was entirely irrational to reject the former when they were supported equally well or better.[2] This topic will come up again and be discussed in more detail in the chapter on the Credibility of extraordinary stories. For now, it's enough to note that, while some theologians might find this perspective satisfactory, no opponent of miracles would accept it for even a moment. They would likely argue that instead of the miraculous element being, as Butler suggests, “a small additional presumption” against the narrative, it places the events in a completely different category of incredibility; that it multiplies, rather than merely adds to, the difficulties and objections to accepting the account.
Mill's remarks (Logic, Bk. III. ch. XXV. § 4) are of a different character. Discussing the grounds of disbelief he speaks of people making the mistake of “overlooking the distinction between (what may be called) improbability before the fact, and improbability after it, two different properties, the latter of which is always a ground of disbelief, the former not always.” He instances the throwing of a die. It is improbable beforehand that it should turn up ace, and yet afterwards, “there is no reason for disbelieving it if any credible witness asserts it.” So again, “the chances are greatly against A. B.'s dying, yet if any one tells us that he died yesterday we believe it.”
Mill's comments (Logic, Bk. III. ch. XXV. § 4) are different in nature. When discussing the reasons for disbelief, he points out that people often make the mistake of “overlooking the distinction between (what might be called) improbability before the fact and improbability after it, which are two different properties; the latter is always a reason for disbelief, while the former isn't always.” He uses the example of rolling a die. It's unlikely beforehand that it will come up as an ace, but after it happens, “there’s no reason to disbelieve it if a credible witness says it did.” Similarly, “the odds are stacked against A.B. dying, yet if someone tells us he died yesterday, we believe it.”
§ 5. That there is some difficulty about such problems as these must be admitted. The fact that so many people find them a source of perplexity, and that such various explanations are offered to solve the perplexity, are a sufficient proof of this.[3] The considerations of the last chapter, 283 however, over-technical and even scholastic as some of the language in which it was expressed may have seemed to the reader, will I hope guide us to a more satisfactory way of regarding the matter.
§ 5. It's clear that there are challenges with problems like these. The fact that so many people struggle with them and that there are various explanations proposed to address this confusion is enough proof of that.[3] The points made in the last chapter, although they might have sounded overly technical or even academic to the reader, will hopefully lead us to a clearer way of understanding the issue.
When we speak of an improbable event, it must be remembered that, objectively considered, an event can only be more or less rare; the extreme degree of rarity being of course that in which the event does not occur at all. Now, as was shown in the last chapter, our position, when forming judgments of the time in question, is that of entertaining a conception or conjecture (call it what we will), and assigning a certain weight of trustworthiness to it. The real distinction, therefore, between the two classes of examples respectively, which are adduced both by Butler and by Mill, consists in the way in which those conceptions are obtained; they being obtained in one case by the process of guessing, and in the other by that of giving heed to the reports of witnesses.
When we talk about an unlikely event, we need to keep in mind that, from an objective perspective, an event can only be more or less rare; the ultimate level of rarity being the situation where the event doesn’t happen at all. As shown in the last chapter, our stance when forming judgments at the time in question is one of entertaining a concept or hypothesis (whatever you want to call it) and assigning a certain level of reliability to it. Thus, the real difference between the two sets of examples provided by Butler and Mill lies in how those concepts are formed; in one case, they emerge from guessing, while in the other, they come from considering the accounts of witnesses.
§ 6. Take Butler's instance first. In the ‘presumption before the proof’ we have represented to us a man thinking of the story of Cæsar, that is, making a guess about certain historical events without any definite grounds for it, and then speculating as to what value is to be attached to the probability of its truth. Such a guess is of course, as he says, concluded to be false. But what does he understand by the ‘presumption after the proof’? That a story not adopted at random, but actually suggested and supported by witnesses, should be true. The latter might be accepted, whilst the former would undoubtedly be rejected; but all that this proves, or rather illustrates, is that the testimony 284 of almost any witness is in most cases vastly better than a mere guess.[4] We may in both cases alike speak of ‘the event’ if we will; in fact, as was admitted in the last chapter, common language will not readily lend itself to any other way of speaking. But it should be clearly understood that, phrase it how we will, what is really present to the man's mind, and what is to have its probable value assigned to it, is the conception of an event, in the sense in which that expression has already been explained. And surely no two conceptions can have a much more important distinction put between them than that which is involved in supposing one to rest on a mere guess, and the other on the report of a witness. Precisely the same remarks apply to the example given by Mill. Before A. B.'s death our opinion upon the subject was nothing but a guess of our own founded upon life statistics; after his death it was founded upon the evidence of some one who presumably had tolerable opportunities of knowing what the facts really were.
§ 6. Let's first consider Butler's example. In the 'presumption before the proof,' we have a person reflecting on the story of Caesar, making a guess about certain historical events without any solid evidence, and then speculating on the likelihood of its truth. As he points out, this guess is ultimately deemed false. But what does he mean by 'presumption after the proof'? It refers to the presumption that a story not picked at random, but actually suggested and backed by witnesses, should be true. The former would typically be dismissed, while the latter could be accepted; but all this proves, or rather illustrates, is that the testimony of almost any witness is generally far more reliable than just a guess.[4] In both cases, we may refer to ‘the event’ if we choose; indeed, as noted in the last chapter, common language doesn't easily offer alternative expressions. However, it's essential to understand that, regardless of how we phrase it, what truly occupies the man's mind, and what is assigned probable value, is the idea of an event, as previously explained. There’s a significant distinction between a conception based on mere speculation and one based on a witness's account. The same points apply to Mill's example. Before A. B.'s death, our view on the matter was just our guess based on life statistics; after his death, it was based on evidence from someone who presumably had a fair opportunity of knowing what the facts really were.
§ 7. That the distinction before us has no essential connection whatever with time is indeed obvious on a moment's consideration. Conceive for a moment that some one had opportunities of knowing whether A. B. would die or not. If he told us that A. B. would die to-morrow, we should in that case be just as ready to believe him as when he tells us that A. B. has died. If we continued to feel any doubt about the statement (supposing always that we had full 285 confidence about his veracity in matters into which he had duly enquired), it would be because we thought that in his case, as in ours, it was equivalent to a guess, and nothing more. So with the event when past, the fact of its being past makes no difference whatever; until the credible witness informs us of what he knows to have occurred, we should doubt it if it happened to come into our minds, just as much as if it were future.
§ 7. The distinction we're discussing has no real connection to time, which is pretty clear after just a moment’s thought. Imagine someone had the ability to know whether A. B. would die or not. If he told us that A. B. would die tomorrow, we would be just as willing to believe him as when he says that A. B. has died. If we still had any doubt about the statement (assuming we trust him completely in matters he has investigated), it would be because we thought that, in his case as in ours, it was essentially a guess and nothing more. The same goes for an event that has already happened; the fact that it’s in the past doesn’t change anything. Until the credible witness tells us what he knows actually occurred, we would doubt it if it suddenly came to mind, just as we would if it were something in the future.
The distinction, therefore, between probability before the event and probability after the event seems to resolve itself simply into this;—before the event we often have no better means of information than to appeal to statistics in some form or other, and so to guess amongst the various possible alternatives; after the event the guess may most commonly be improved or superseded by appeal to specific evidence, in the shape of testimony or observation. Hence, naturally, our estimate in the latter case is commonly of much more value. But if these characteristics were anyhow inverted; if, that is, we were to confine ourselves to guessing about the past, and if we could find any additional evidence about the future, the respective values of the different estimates would also be inverted. The difference between these values has no necessary connection with time, but depends entirely upon the different grounds upon which our conception or conjecture about the event in question rests.
The distinction between probability before an event and probability after an event essentially comes down to this: before the event, we often have no better way to gather information than to rely on statistics in some form, which leads us to guess among various possible options; after the event, our guess can usually be enhanced or replaced by specific evidence, like testimony or observation. As a result, our assessment in the latter case is generally much more valuable. However, if these roles were somehow reversed—meaning if we were limited to guessing about the past and had access to additional evidence about the future—the relative values of the different estimates would also be flipped. The difference between these values isn’t inherently tied to time but is entirely based on the different foundations of our understanding or speculation about the event in question.
§ 8. The following imaginary example will serve to bring out the point indicated above. Conceive a people with very short memories, and who preserved no kind of record to perpetuate their hold upon the events which happened amongst them.[5] The whole region of the past would then be 286 to them what much of the future is to us; viz. a region of guesses and conjectures, one in reference to which they could only judge upon general considerations of probability, rather than by direct and specific evidence. But conceive also that they had amongst them a race of prophets who could succeed in foretelling the future with as near an approach to accuracy and trustworthiness as our various histories, and biographies, and recollections, can attain in respect to the past. The present and usual functions of direct evidence or testimony, and of probability, would then be simply inverted; and so in consequence would the present accidental characteristics of improbability before and after the event. It would then be the latter which would by comparison be regarded as ‘not always a ground of disbelief,’ whereas in the case of the former we should then have it maintained that it always was so.
§ 8. The following fictional example will help illustrate the point made above. Imagine a people with very short memories who kept no records to remember the events that happened among them.[5] The entire past would then seem to them like much of the future does to us; that is, a realm of guesses and speculations, where they could only make judgments based on general probabilities rather than direct and specific evidence. Now, imagine they also had a group of prophets who could predict the future with a level of accuracy and reliability similar to what our histories, biographies, and memories achieve concerning the past. The present and usual functions of direct evidence or testimony, and of probability, would then simply be inverted; as a result, the usual characteristics of improbability before and after an event would be inverted as well. It would be the latter that would be seen as ‘not always a reason for disbelief,’ while conversely, the former would be argued to always be so.
§ 9. The origin of the mistake just discussed is worth enquiring into. I take it to be as follows. It is often the case, as above remarked, when we are speculating about a future event, and almost always the case when that future event is taken from a game of chance, that all persons are in precisely the same condition of ignorance in respect to it. The limit of available information is confined to statistics, and amounts to the knowledge that the unknown event must assume some one of various alternative forms. The conjecture, therefore, of any one man about it is as valuable as that of any other. But in regard to the past the case is very different. Here we are not in the habit of relying upon statistical information. Hence the conjectures of different men are of extremely different values; in the case of many they amount to what we call positive knowledge. 287 This puts a broad distinction, in popular estimation, between what may be called the objective certainty of the past and of the future, a distinction, however, which from the standing-point of a science of inference ought to have no existence.
§ 9. It's worth looking into the origin of the mistake just mentioned. I believe it goes like this: It's often true, as mentioned earlier, that when we speculate about a future event—especially one involving chance—everyone has the same level of ignorance about it. The limit of available information is based on statistics, which means we only know that the unknown event could take on one of several alternative forms. Therefore, the guess of one person is just as valuable as the guess of another. However, when it comes to the past, the situation is very different. We don't typically rely on statistical information for past events. As a result, the guesses from different people can differ greatly in value, with many effectively amounting to what we call positive knowledge. This creates a significant distinction in public perception between the objective certainty of the past and the future, a distinction that, from the standpoint of a science of inference, shouldn't really exist.
In consequence of this, when we apply to the past and the future respectively the somewhat ambiguous expression ‘the chance of the event,’ it commonly comes to bear very different significations. Applied to the future it bears its proper meaning, namely, the value to be assigned to a conjecture upon statistical grounds. It does so, because in this case hardly any one has more to judge by than such conjectures. But applied to the past it shifts its meaning, owing to the fact that whereas some men have conjectures only, others have positive knowledge. By the chance of the event is now often meant, not the value to be assigned to a conjecture founded on statistics, but to such a conjecture derived from and enforced by any body else's conjecture, that is by his knowledge and his testimony.
As a result, when we refer to the past and the future using the somewhat unclear term “the chance of the event,” it typically takes on very different meanings. When applied to the future, it holds its intended meaning, which is the value assigned to a guess based on statistical evidence. This is because, in this situation, most people rely solely on those guesses. However, when it's applied to the past, the meaning changes, since some people have only guesses, while others have actual knowledge. Therefore, “the chance of the event” often refers not to the value assigned to a statistical guess, but rather to a guess that is influenced and supported by someone else's guess, that is, by their knowledge and testimony.
§ 10. There is a class of cases in apparent opposition to some of the statements in this chapter, but which will be found, when examined closely, decidedly to confirm them. I am walking, say, in a remote part of the country, and suddenly meet with a friend. At this I am naturally surprised. Yet if the view be correct that we cannot properly speak about events in themselves being probable or improbable, but only say this of our conjectures about them, how do we explain this? We had formed no conjecture beforehand, for we were not thinking about anything of the kind, but yet few would fail to feel surprise at such an incident.
§ 10. There's a group of cases that seem to contradict some of the statements in this chapter, but if you take a closer look, they actually support them. Imagine I'm walking in a remote area and suddenly run into a friend. Naturally, I'm surprised by this. However, if we accept the idea that we can't really talk about events themselves being likely or unlikely, but only about our guesses regarding them, how do we account for this? We hadn’t made any guesses beforehand since we weren’t thinking about anything like that, yet most people would definitely feel surprised by such an encounter.
The reply might fairly be made that we had formed such anticipations tacitly. On any such occasion every one unconsciously divides things into those which are known to him and those which are not. During a considerable 288 previous period a countless number of persons had met us, and all fallen into the list of the unknown to us. There was nothing to remind us of having formed the anticipation or distinction at all, until it was suddenly called out into vivid consciousness by the exceptional event. The words which we should instinctively use in our surprise seem to show this:—‘Who would have thought of seeing you here?’ viz. Who would have given any weight to the latent thought if it had been called out into consciousness beforehand? We put our words into the past tense, showing that we have had the distinction lurking in our minds all the time. We always have a multitude of such ready-made classes of events in our minds, and when a thing happens to fall into one of those classes which are very small we cannot help noticing the fact.
The response could reasonably be that we had, in fact, formed such expectations without realizing it. In moments like these, everyone subconsciously sorts things into what they know and what they don’t. For a long time, countless people had encountered us, and all of them ended up in the category of the unknown for us. There was no reminder of having made that distinction at all until it was suddenly brought to our attention by an extraordinary event. The words we instinctively use in our surprise seem to illustrate this: ‘Who would have thought of seeing you here?’—in other words, who would have given any weight to that hidden thought if it had been brought to consciousness beforehand? We phrase our words in the past tense, indicating that we’ve had that distinction lingering in our minds all along. We always have a variety of these ready-made categories of events in our minds, and when something happens to fit into one of those very small categories, we can’t help but notice it.
Or suppose I am one of a regiment into which a shot flies, and it strikes me, and me only. At this I am surprised, and why? Our common language will guide us to the reason. ‘How strange that it should just have hit me of all men!’ We are thinking of the very natural two-fold division of mankind into, ourselves, and everybody else; our surprise is again, as it were, retrospective, and in reference to this division. No anticipation was distinctly formed, because we did not think beforehand of the event, but the event, when it has happened, is at once assigned to its appropriate class.
Or imagine I’m part of a regiment and a bullet flies in, hitting me and no one else. I’m surprised at this, but why? Our common language helps us understand why. ‘How strange that it hit me of all people!’ We’re considering the very natural split between ourselves and everyone else; our surprise is, in a sense, looking back, based on this division. There was no clear expectation formed because we didn’t anticipate the event beforehand, but once it has happened, it is at once assigned to its appropriate class.
§ 11. This view is confirmed by the following considerations. Tell the story to a friend, and he will be a little surprised, but less so than we were, his division in this particular case being,—his friends (of whom we are but one), and the rest of mankind. It is not a necessary division, but it is the one which will be most likely suggested to him.
§ 11. This perspective is backed by the following points. Share the story with a friend, and they might be a bit surprised, but not as much as we were, their division in this case being their friends (of whom we are just one) and everyone else. It's not a required division, but it’s the one that’s most likely to come to mind for them.
Tell it again to a perfect stranger, and his division being 289 different (viz. we falling into the majority) we shall fail to make him perceive that there is anything at all remarkable in the event.
Tell it again to a total stranger, and since his division is different (we now fall into the majority), we won't be able to make him see that there's anything remarkable about the event.
It is not of course attempted in these remarks to justify our surprise in every case in which it exists. Different persons might be differently affected in the cases supposed, and the examples are therefore given mainly for illustration. Still on principles already discussed (Ch. VI. § 32) we might expect to find something like a general justification of the amount of surprise.
It’s not the intention of these comments to justify our surprise in every situation where it occurs. Different people might react differently in the cases mentioned, so these examples are mainly for illustration. However, based on the principles we've already talked about (Ch. VI. § 32), we might expect to find some overall justification for the level of surprise.
§ 12. The answer commonly given in these cases is confined to attempting to show that the surprise should not arise, rather than to explaining how it does arise. It takes the following form,—‘You have no right to be surprised, for nothing remarkable has really occurred. If this particular thing had not happened something equally improbable must. If the shot had not hit you or your friend, it must have hit some one else who was à priori as unlikely to be hit.’
§ 12. The typical response in these situations is focused on arguing that the surprise is unwarranted rather than clarifying how it happens. It usually goes like this: "You shouldn’t be surprised, because nothing unusual has really happened. If this specific event hadn’t taken place, something just as unlikely would have. If the shot hadn’t hit you or your friend, it would’ve hit someone else who was just as unlikely to be hit."
For one thing this answer does not explain the fact that almost every one is surprised in such cases, and surprised somewhat in the different proportions mentioned above. Moreover it has the inherent unsatisfactoriness of admitting that something improbable has really happened, but getting over the difficulty by saying that all the other alternatives were equally improbable. A natural inference from this is that there is a class of things, in themselves really improbable, which can yet be established upon very slight evidence. Butler accepted this inference, and worked it out to the strange conclusion given above. Mill attempts to avoid it by the consideration of the very different values to be assigned to improbability before and after the event. Some further discussion of this point will be 290 found in the chapter on Fallacies, and in that on the Credibility of Extraordinary Stories.
This answer doesn't really explain why almost everyone is surprised in these situations, and why the level of surprise varies as mentioned earlier. Additionally, it’s inherently unsatisfying to admit that something unlikely has actually happened, only to dismiss the issue by claiming that all other possibilities were equally unlikely. A natural conclusion from this is that there is a class of things, genuinely improbable in themselves, that can still be established on very slight evidence. Butler accepted this inference and worked it out to the strange conclusion given above. Mill tries to avoid it by considering the very different values assigned to improbability before and after an event. More discussion on this topic can be found in the chapter on Fallacies and in the one on the Credibility of Extraordinary Stories.
§ 13. In connection with the subject at present under discussion we will now take notice of a distinction which we shall often find insisted on in works on Probability, but to which apparently needless importance has been attached. It is frequently said that probability is relative, in the sense that it has a different value to different persons according to their respective information upon the subject in question. For example, two persons, A and B, are going to draw a ball from a bag containing 4 balls: A knows that the balls are black and white, but does not know more; B knows that three are black and one white. It would be said that the probability of a white ball to A is 1/2, and to B 1/4.
§ 13. Regarding the topic we are currently discussing, we will now address a distinction that often appears in works on Probability, although it seems to have been given more significance than it deserves. It's commonly stated that probability is relative, meaning it can hold different values for different people based on their knowledge about the matter at hand. For instance, two individuals, A and B, are about to draw a ball from a bag containing 4 balls: A knows the balls are black and white but lacks further details; B knows that three are black and one is white. It would be said that the probability of drawing a white ball for A is 1/2, while for B, it is 1/4.
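In modern notation, the two estimates may be sketched as probabilities conditioned on the two bodies of information:

\[
P(\text{white} \mid \text{A's information}) = \tfrac{1}{2}, \qquad
P(\text{white} \mid \text{B's information}) = \tfrac{1}{4},
\]

the first resting only on the bare symmetry between the two colours, which is all that A knows of, the second on the composition of one white ball among four, which B knows of.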
When however we regard the subject from the material standing point, there really does not seem to me much more in this than the principle, equally true in every other science, that our inferences will vary according to the data we assume. We might on logical grounds with almost equal propriety speak of the area of a field or the height of a mountain being relative, and therefore having one value to one person and another to another. The real meaning of the example cited above is this: A supposes that he is choosing white at random out of a series which in the long run would give white and black equally often; B supposes that he is choosing white out of a series which in the long run would give three black to one white. By the application, therefore, of a precisely similar rule they draw different conclusions; but so they would under the same circumstances in any other science. If two men are measuring the height of a mountain, and one supposes his base to be 1000 feet, whilst the other takes it to be 1001, they would of course 291 form different opinions about the height. The science of mensuration is not supposed to have anything to do with the truth of the data, but assumes them to have been correctly taken; why should not this be equally the case with Probability, making of course due allowance for the peculiar character of the data with which it is concerned?
When we look at the subject from a material standpoint, it seems to me that there isn’t much more to this than the principle, which is equally true in every other science, that our conclusions will change based on the data we assume. We could reasonably say, on logical grounds, that the area of a field or the height of a mountain is relative, having one value for one person and a different value for another. The real meaning of the example mentioned above is this: A believes he is randomly choosing white from a series that would ultimately yield white and black equally often; B believes he is choosing white from a series that would give three black for every one white in the long run. Therefore, by applying the same rule, they come to different conclusions; but they would do the same in any other science under similar circumstances. If two people are measuring the height of a mountain, and one assumes his base is 1000 feet while the other assumes it’s 1001, they would certainly have different opinions about the height. The science of measurement doesn't concern itself with the accuracy of the data but assumes it has been taken correctly; why shouldn’t the same apply to Probability, making the necessary allowances for the unique nature of the data it deals with?
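To make the analogy concrete, suppose (purely for illustration) that each observer computes the height as his assumed horizontal base multiplied by the tangent of the same measured angle of elevation \(\theta\):

\[
h_1 = 1000 \tan\theta \text{ feet}, \qquad h_2 = 1001 \tan\theta \text{ feet},
\]

so the two results differ by exactly \(\tan\theta\) feet, not because the rules of mensuration disagree, but because the assumed data differ.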
§ 14. This view of the relativeness of probability is connected, as it appears to me, with the subjective view of the science, and is indeed characteristic of it. It seems a fair illustration of the weak side of that view, that it should lead us to lay any stress on such an expression. As was fully explained in the last chapter, in proportion as we work out the Conceptualist principle we are led away from the fundamental question of the material logic, viz. Is our belief actually correct, or not? and, if the former, to what extent and degree is it correct? We are directed rather to ask, What belief does any one as a matter of fact hold? And, since the belief thus entertained naturally varies according to the circumstances and other sources of information of the person in question, its relativeness comes to be admitted as inevitable, or at least it is not to be wondered at if such should be the case.
§ 14. This perspective on the relativeness of probability is, as I see it, tied to the subjective approach of the science, and it is indeed a defining feature of it. It is a fair illustration of that view's weak side that it should lead us to lay any stress on such an expression. As I fully explained in the last chapter, as we further develop the Conceptualist principle, we drift away from the essential question of material logic: Is our belief actually correct, or not? And if it is correct, to what extent and degree is it accurate? Instead, we are encouraged to ask, What belief does someone actually hold? And since the belief held can fluctuate based on the individual's circumstances and other sources of information, its relativeness comes to be seen as unavoidable, or at least it's not surprising if that's the case.
On our view of Probability, therefore, its ‘relativeness’ in any given case is a misleading expression, and it will be found much preferable to speak of the effect produced by variations in the nature and amount of the data which we have before us. Now it must be admitted that there are frequently cases in our science in which such variations are peculiarly likely to be found. For instance, I am expecting a friend who is a passenger in an ocean steamer. There are a hundred passengers on board, and the crew also numbers a hundred. I read in the papers that one person was lost by falling overboard; my anticipation that it was my friend who 292 was lost is but small, of course. On turning to another paper, I see that the man who was lost was a passenger, not one of the crew; my slight anxiety is at once doubled. But another account adds that it was an Englishman, and on that line at that season the English passengers are known to be few; I at once begin to entertain decided fears. And so on, every trifling bit of information instantly affecting my expectations.
In our view of Probability, then, its ‘relativeness’ in any given case is a misleading expression, and it's much better to talk about the effect produced by changes in the type and amount of data we have. It must be acknowledged that there are often instances in our field where such variations are particularly common. For example, I'm waiting for a friend who's on an ocean liner. There are a hundred passengers on board, and the crew also numbers a hundred. I read in the news that one person fell overboard; of course, my expectation that it was my friend who was lost is pretty low. When I check another paper, I see that the person who fell overboard was a passenger, not a crew member; my slight worry immediately doubles. But then another article notes that the person was English, and on that line at that season English passengers are known to be few; suddenly, I start to feel real worry. And so it goes, every little piece of information instantly changing my expectations.
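As a rough numerical sketch of this process, each report narrows the class from which the lost person is known to be drawn:

\[
P(\text{my friend}) = \tfrac{1}{200} \;\longrightarrow\; P(\text{my friend} \mid \text{a passenger}) = \tfrac{1}{100} \;\longrightarrow\; P(\text{my friend} \mid \text{an English passenger}) = \tfrac{1}{n},
\]

where \(n\) is the small number of English passengers aboard; if, purely for illustration, only five of the passengers were English, the estimate would rise to one in five.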
§ 15. Now since it is peculiarly characteristic of Probability, as distinguished from Induction, to be thus at the mercy, so to say, of every little fact that may be floating about when we are in the act of forming our opinion, what can be the harm (it may be urged) of expressing this state of things by terming our state of expectation relative?
§ 15. Now, since it is peculiarly characteristic of Probability, as opposed to Induction, to be at the mercy of every little fact that happens to be floating about when we form our opinion, what harm can there be (it may be asked) in calling our state of expectation relative?
There seem to me to be two objections. In the first place, as just mentioned, we are induced to reject such an expression on grounds of consistency. It is inconsistent with the general spirit and treatment of the subject hitherto adopted, and tends to divorce Probability from Inductive logic instead of regarding them as cognate sciences. We are aiming at truth, as far as that goal can be reached by our road, and therefore we dislike to regard our conclusions as relative in any other sense than that in which truth itself may be said to be relative.
There seem to me to be two objections. First, as previously mentioned, we’re pushed to dismiss such an expression for consistency reasons. It doesn’t align with the overall approach and perspective on the topic we've taken so far, and it tends to separate Probability from Inductive logic instead of seeing them as related fields. We’re striving for truth, to the best of our ability, and because of that, we don’t like to view our conclusions as relative in any way other than the way truth itself can be considered relative.
In the second place, this condition of unstable assent, this constant liability to have our judgment affected, to any degree and at any moment, by the accession of new knowledge, though doubtless characteristic of Probability, does not seem to me characteristic of it in its sounder and more legitimate applications. It seems rather appropriate to a precipitate judgment formed in accordance with the rules, than a strict example of their natural employment. Such precipitate judgments may occur in the case of ordinary deductive 293 conclusions. In the practical exigencies of life we are constantly in the habit of forming a hasty opinion with nearly full confidence, at any rate temporarily, upon the strength of evidence which we must well know at the time cannot be final. We wait a short time, and something else turns up which induces us to alter our opinion, perhaps to reverse it. Here our conclusions may have been perfectly sound under the given circumstances, that is, they may be such as every one else would have drawn who was bound to make up his mind upon the data before us, and they are unquestionably ‘relative’ judgments in the sense now under discussion. And yet, I think, every one would shrink from so terming them who wished systematically to carry out the view that Logic was to be regarded as an organon of truth.
In the second place, this state of unstable assent, this constant risk of having our judgment influenced, at any degree and at any moment, by new information, while clearly a feature of Probability, doesn’t seem to me to reflect its more solid and legitimate uses. It seems more appropriate to a rushed judgment made according to the rules than to a true example of their natural application. Such hasty judgments can happen with ordinary deductive conclusions. In everyday life, we often form quick opinions with considerable confidence, at least temporarily, based on evidence we know isn’t definitive. We wait a bit, and something new comes up that leads us to change our minds, maybe even completely reverse our opinion. Here, our conclusions may have been perfectly reasonable given the circumstances, meaning they are conclusions anyone else would have drawn who had to decide based on the available data, and they are undoubtedly ‘relative’ judgments in the sense we are discussing. Yet, I believe, anyone who wanted to consistently uphold the view that Logic is a tool for finding truth would hesitate to call them that.
§ 16. In the examples of Probability which we have hitherto employed, we have for the most part assumed that there was a certain body of statistics set before us on which our conclusion was to rest. It was assumed, on the one hand, that no direct specific evidence could be got, so that the judgment was really to be one of Probability, and to rest on these statistics; in other words, that nothing better than them was available for us. But it was equally assumed, on the other hand, that these statistics were open to the observation of every one, so that we need not have to put up with anything inferior to them in forming our opinion. In other words, we have been assuming that here, as in the case of most other sciences, those who have to draw a conclusion start from the same footing of opportunity and information. This, for instance, clearly is or ought to be the case when we are concerned with games of chance; ignorance or misapprehension of the common data is never contemplated there. So with the statistics of life, or other insurance: so long as our judgment is to be accurate (after its fashion) or 294 justifiable, the common tables of mortality are all that any one has to go by.
§ 16. In the examples of Probability we've used so far, we've mostly assumed that we had a certain set of statistics to rely on for our conclusions. We assumed that there wasn't any specific direct evidence available, which meant that our judgment was based on Probability and those statistics; in other words, that nothing better was available to us. But it was also assumed that these statistics were accessible to everyone, so we shouldn't have to settle for anything less when forming our opinions. In other words, we’ve been assuming that, just like in most other fields of study, everyone making a conclusion starts with the same level of opportunity and information. This should definitely be the case when dealing with games of chance; ignorance or misunderstanding of the common data isn't considered there. The same applies to life statistics or other insurance: as long as our judgment needs to be accurate (in its own way) or justifiable, the standard mortality tables are all anyone needs to reference.
§ 17. It is true that in the case of a man's prospect of death we should each qualify our judgment by what we knew or reasonably supposed as to his health, habits, profession, and so on, and should thus arrive at varying estimates. But no one could justify his own estimate without appealing explicitly or implicitly to the statistical grounds on which he had relied, and if these were not previously available to other persons, he must now set them before their notice. In other words, the judgments we entertain, here as elsewhere, are only relative so long as we rest them on grounds peculiar to ourselves. The process of justification, which I consider to be essential to logic, has a tendency to correct such individualities of judgment, and to set all observers on the same basis as regards their data.
§ 17. It’s true that when it comes to a man's likelihood of dying, we should each adjust our judgment based on what we know or reasonably assume about his health, habits, profession, and so on, leading us to different assessments. However, no one can justify their own assessment without referring explicitly or implicitly to the statistical reasons they relied on, and if these are not already accessible to others, they must now present them for consideration. In other words, the judgments we make, here as elsewhere, are only relative as long as we base them on grounds unique to ourselves. The process of justification, which I believe is essential to logic, tends to correct these individual biases in judgment and align all observers on the same level regarding their data.
It is better therefore to regard the conclusions of Probability as being absolute and objective, in the same sense as, though doubtless in a far less degree than, they are in Induction. Fully admitting that our conclusions will in many cases vary exceedingly from time to time by fresh accessions of knowledge, it is preferable to regard such fluctuations of assent as partaking of the nature of precipitate judgments, founded on special statistics, instead of depending only on those which are common to all observers. In calling such judgments precipitate it is not implied that there is any blame in entertaining them, but simply that, for one reason or another, we have been induced to form them without waiting for the possession of the full amount of evidence, statistical or otherwise, which might ultimately be looked for. This explanation will suit the facts equally well, and is more consistent with the general philosophical position maintained in this work.
It’s better to see the conclusions of Probability as absolute and objective, in the same sense as, though certainly to a far lesser degree than, they are in Induction. While we fully acknowledge that our conclusions can vary greatly over time due to new information, it’s preferable to view these fluctuations of assent as hasty judgments founded on special statistics, rather than on those common to all observers. Calling these judgments hasty doesn’t imply that there’s any fault in having them; it simply means that, for one reason or another, we’ve been led to form them without waiting for all the evidence, statistical or otherwise, that we might ultimately expect. This explanation fits the facts just as well and is more in line with the overall philosophical stance taken in this work.
1 “Is it not self-evident that internal improbabilities of all kinds weaken external proof? Doubtless, but to what practical purpose can this be alleged here, when it has been proved before, that real internal improbabilities, which rise even to moral certainty, are overcome by the most ordinary testimony.” Part II. ch. III.
1 “Isn't it obvious that internal improbabilities of all kinds weaken external evidence? Certainly, but what's the point of mentioning that here, when it's already been shown that real internal improbabilities, even those that rise to moral certainty, can be overcome by the most ordinary testimony.” Part II. ch. III.
2 “Miracles must not be compared to common natural events; or to events which, though uncommon, are similar to what we daily experience; but to the extraordinary phenomena of nature. And then the comparison will be between the presumption against miracles, and the presumption against such uncommon appearances, suppose as comets,”…. Part II. ch. II.
2 “Miracles shouldn't be compared to everyday natural events or to rare occurrences that are similar to things we see daily; instead, they should be compared to extraordinary phenomena in nature. Then, the comparison will be between the presumption against miracles and the presumption against such unusual appearances as comets,”…. Part II. ch. II.
3 For instance, Sir J. F. Stephen explains it by drawing a distinction between chances and probabilities, which he says that Butler has confused together; “the objection that very ordinary proof will overcome a presumption of millions to one is based upon a confusion between probabilities and chances. The probability of an event is its capability of being proved. Its chance is the numerical proportion between the number of possible cases—supposed to be equally favourable—favourable to its occurrence; and the number of possible cases unfavourable to its occurrence” (General view of the Criminal Law of England, p. 255). Donkin, again (Phil. Magazine, June, 1851), employs the terms improbability and incredibility to mark the same distinction.
3 For example, Sir J. F. Stephen explains this by distinguishing between chances and probabilities, claiming that Butler has mixed them up; “the objection that very ordinary proof can overcome a presumption of millions to one is based on a confusion between probabilities and chances. The probability of an event is its capability of being proved. Its chance is the numerical ratio between the number of possible cases—assumed to be equally favorable—supporting its occurrence and the number of possible cases against its occurrence” (General view of the Criminal Law of England, p. 255). Donkin, similarly (Phil. Magazine, June, 1851), uses the terms improbability and incredibility to mark the same distinction.
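To put Stephen's numerical sense of 'chance' in figures (a minimal illustration, with the letters f and u introduced here only for convenience): if an event can occur in f equally likely ways and fail in u ways, its chance is the ratio of f to u, while the corresponding fraction of favorable cases is f/(f + u); a chance of a million to one against an event thus answers to a fraction of roughly one case in a million.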
4 In the extreme case of the witness himself merely guessing, or being as untrustworthy as if he merely guessed, the two stories will of course stand on precisely the same footing. This case will be noticed again in Chapter XVII. It may be remarked that there are several subtleties here which cannot be adequately noticed without some previous investigation into the question of the credibility of witnesses.
4 In the extreme case where the witness is just guessing, or is as unreliable as if he were only guessing, both accounts will obviously be treated the same way. This situation will be addressed again in Chapter XVII. It's worth mentioning that there are several nuances here that can't be fully understood without some prior exploration into the issue of witness credibility.
5 According to Dante, something resembling this prevailed amongst the occupants of the Inferno. The cardinals and others whom he there meets are able to give information about many events which were yet to happen upon earth, but they had to ask it for many events which actually had happened.
5 According to Dante, a situation like this existed among the residents of the Inferno. The cardinals and others he encounters there can share information about many future events on earth, but they had to inquire about many events that had already occurred.
CHAPTER 13.
ON THE CONCEPTION AND TREATMENT OF MODALITY.
§ 1. The reader who knows anything of the scholastic Logic will have perceived before now that we have been touching in a variety of places upon that most thorny and repulsive of districts in the logical territory;—modality. It will be advisable, however, to put together, somewhat more definitely, what has to be said upon the subject. I propose, therefore, to devote this chapter to a brief account of the principal varieties of treatment which the modals have received at the hands of professed logicians.
§ 1. The reader who knows anything about scholastic Logic will have noticed by now that we've been touching, at various points, on one of the thorniest and most forbidding districts of logical territory: modality. However, it would be wise to bring together, somewhat more definitely, what needs to be said on this topic. Therefore, I plan to dedicate this chapter to a brief overview of the main ways that professed logicians have treated the modals.
It must be remarked at the outset that the sense in which modality and modal propositions have been at various times understood, is by no means fixed and invariably the same. This diversity of view has arisen partly from corresponding differences in the view taken of the province and nature of logic, and partly from differences in the philosophical and scientific opinions entertained as to the constitution and order of nature. In later times, moreover, another very powerful agent in bringing about a change in the treatment of the subject must be recognized in the gradual and steady growth of the theory of Probability, as worked out by the mathematicians from their own point of view.
It should be noted right from the start that the way modality and modal propositions have been understood at different times is not fixed and doesn't always mean the same thing. This variation in perspective has come about partly due to differing views on the scope and nature of logic, and partly from varying philosophical and scientific beliefs about the makeup and organization of nature. Additionally, in more recent times, a significant factor influencing changes in how this topic is addressed has been the gradual and consistent development of Probability theory, as explored by mathematicians from their own perspective.
§ 2. In spite, however, of these differences of treatment, there has always been some community of subject-matter in the discussions upon this topic. There has almost always 296 been some reference to quantity of belief; enough perhaps to justify De Morgan's[1] remark, that Probability was “the unknown God whom the schoolmen ignorantly worshipped when they so dealt with this species of enunciation, that it was said to be beyond human determination whether they most tortured the modals, or the modals them.” But this reference to quantity of belief has sometimes been direct and immediate, sometimes indirect and arising out of the nature of the subject-matter of the proposition. The fact is, that that distinction between the purely subjective and purely objective views of logic, which I have endeavoured to bring out into prominence in the eleventh chapter, was not by any means clearly recognized in early times, nor indeed before the time of Kant, and the view to be taken of modality naturally shared in the consequent confusion. This will, I hope, be made clear in the course of the following chapter, which is intended to give a brief sketch of the principal different ways in which the modality of propositions has been treated in logic. As it is not proposed to give anything like a regular history of the subject, there will be no necessity to adhere to any strict sequence of time, or to discuss the opinions of any writers, except those who may be taken as representative of tolerably distinct views. The outcome of such investigation will be, I hope, to convince the reader (if, indeed, he had not come to that conviction before), that the logicians, after having had a long and fair trial, have failed to make anything satisfactory out of this subject of the modals by their methods of enquiry and treatment; and that it ought, therefore, to be banished entirely from that science, and relegated to Probability.
§ 2. Despite these differences in approach, there has always been some common ground in the discussions on this topic. There has almost always been a reference to the amount of belief; enough, perhaps, to support De Morgan's[1] remark that Probability was “the unknown God whom the schoolmen unknowingly worshipped when dealing with this type of statement, to the point that it was said to be beyond human understanding whether they were torturing the modals or the modals were torturing them.” However, this reference to the amount of belief has at times been direct and immediate, and at other times indirect, stemming from the nature of the proposition itself. The truth is that the distinction between purely subjective and purely objective views of logic, which I have tried to highlight in the eleventh chapter, was not clearly recognized in earlier times, nor indeed before Kant, and this confusion naturally extended to the understanding of modality. I hope to make this clear in the course of the following chapter, which is designed to provide a brief overview of the main ways in which the modality of propositions has been addressed in logic. Since this isn't meant to be a comprehensive history of the subject, there is no need to follow a strict chronological order or discuss the views of all writers, except for those who can be seen as representative of reasonably distinct perspectives. The aim of this exploration will be to persuade the reader (if they haven't already reached that conclusion) that logicians, after a long and fair examination, have not succeeded in making anything satisfactory out of the subject of modals through their methods of inquiry and treatment; therefore, it should be completely removed from that field and assigned to Probability.
§ 3. From the earliest study of the syllogistic process it was seen that, complete as that process is within its own 297 domain, the domain, at any rate under its simplest treatment, is a very limited one. Propositions of the pure form,—All (or some) A is (or is not) B,—are found in practice to form but a small portion even of our categorical statements. We are perpetually meeting with others which express the relation of B to A with various degrees of necessity or probability; e.g. A must be B, A may be B; or the effect of such facts upon our judgment, e.g. I am perfectly certain that A is B, I think that A may be B; with many others of a more or less similar type. The question at once arises, How are such propositions to be treated? It does not seem to have occurred to the old logicians, as to some of their successors in modern times, simply to reject all consideration of this topic. Their faith in the truth and completeness of their system of inference was far too firm for them to suppose it possible that forms of proposition universally recognized as significant in popular speech, and forms of inference universally recognized there as valid, were to be omitted because they were inconvenient or complicated.
§ 3. From the earliest studies of the syllogistic process, it was clear that, while this process is complete within its own framework, that framework, at least in its simplest form, is very limited. Propositions in the pure form—All (or some) A is (or is not) B—are found to be only a small part of our categorical statements in practice. We constantly encounter others that express the relation of B to A with varying degrees of necessity or probability; for example, A must be B, A may be B; or the impact of such facts on our judgment, for instance, I am completely certain that A is B, I think that A may be B; along with many others that are somewhat similar. This raises the question: How should we handle such propositions? It doesn't seem to have occurred to the old logicians, as it did to some of their successors in modern times, simply to reject all consideration of this topic. Their confidence in the truth and completeness of their inference system was too strong for them to believe that forms of propositions widely recognized as significant in everyday conversation, and forms of inference generally accepted as valid, could be disregarded just because they were inconvenient or complex.
§ 4. One very simple plan suggests itself, and has indeed been repeatedly advocated, viz. just to transfer all that is characteristic of such propositions into that convenient receptacle for what is troublesome elsewhere, the predicate.[2] Has not another so-called modality been thus got rid of?[3] 298 and has it not been attempted by the same device to abolish the distinctive characteristic of negative propositions, viz. by shifting the negative particle into the predicate? It must be admitted that, up to a certain point, something may be done in this way. Given the reasoning, ‘Those who take arsenic will probably die; A has taken it, therefore he will probably die;’ it is easy to convert this into an ordinary syllogism of the pure type, by simply wording the major, ‘Those who take arsenic are people-who-will-probably-die,’ when the conclusion follows in the same form, ‘A is one who-will-probably-die.’ But this device will only carry us a very little way. Suppose that the minor premise also is of the same modal description, e.g. ‘A has probably taken arsenic,’ and it will be seen that we cannot relegate the modality here also to the predicate without being brought to a stop by finding that there are four terms in the syllogism.
§ 4. There's a straightforward idea that comes to mind and has been suggested many times, which is to move everything distinctive about these propositions into the convenient spot for what's problematic elsewhere—the predicate.[2] Haven't we gotten rid of another so-called modality this way?[3] And hasn't the same technique been used to eliminate the unique feature of negative propositions, specifically by moving the negative particle into the predicate? It's true that, up to a point, this approach can work. Take the reasoning, "Those who take arsenic will probably die; A has taken it, so he will probably die;" it's easy to turn this into a standard syllogism of the pure type by simply phrasing the major premise as "Those who take arsenic are people-who-will-probably-die," leading to the conclusion, "A is one who-will-probably-die." But this method will only take us so far. If the minor premise is also similar in modality, for example, "A has probably taken arsenic," we will see that we can't just push the modality into the predicate without running into trouble: the middle term of the major premise, 'those who take arsenic,' no longer matches the qualified predicate of the minor, 'one-who-has-probably-taken-arsenic,' so the syllogism is left with four terms.
But even if there were not this particular objection, it does not appear that anything is to be gained in the way of intelligibility or method by such a device as the above. For 299 what is meant by a modal predicate, by the predicate ‘probably mortal,’ for instance, in the proposition ‘All poisonings by arsenic are probably mortal’? If the analogy with ordinary pure propositions is to hold good, it must be a predicate referring to the whole of the subject, for the subject is distributed. But then we are at once launched into the difficulties discussed in a former chapter (Ch. VI. §§ 19–25), when we attempt to justify or verify the application of the predicate. We have to enquire (at least on the view adopted in this work) whether the application of the predicate ‘probably mortal’ to the whole of the subject, really means at bottom anything else than that the predicate ‘mortal’ is to be applied to a portion (more than half) of the members denoted by the subject. When the transference of the modality to the predicate raises such intricate questions as to the sense in which the predicate is to be interpreted, there is surely nothing gained by the step.
But even if there wasn't this specific objection, it doesn't seem like anything is gained in terms of clarity or method by using a device like the one mentioned above. For what do we mean by a modal predicate, like the predicate 'probably mortal,' for example, in the statement 'All poisonings by arsenic are probably mortal'? If we want the analogy with regular pure propositions to hold, it needs to be a predicate that refers to the whole of the subject since the subject is distributed. However, this immediately brings us back to the issues discussed in a previous chapter (Ch. VI. §§ 19–25), when we try to justify or confirm the application of the predicate. We have to consider (at least based on the perspective taken in this work) whether the application of the predicate 'probably mortal' to the whole of the subject really means anything different than that the predicate 'mortal' is to be applied to a portion (more than half) of the members represented by the subject. When transferring modality to the predicate raises such complicated questions about how to interpret the predicate, there’s clearly nothing gained by taking that step.
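A rough numerical illustration may make this reading concrete (the figure is invented purely for the sake of example): if, say, 60 out of every 100 poisonings by arsenic proved fatal, then on the view taken in this work the proposition 'All poisonings by arsenic are probably mortal' would assert no more than that the predicate 'mortal' attaches to those 60 cases, that is, to more than half of the members denoted by the subject.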
§ 5. A second, and more summary way of shelving all difficulties of the subject, so far at least as logic, or the writers upon logic, are concerned, is found by simply denying that modality has any connection whatever with logic. This is the course adopted by many modern writers, for instance, by Hamilton and Mansel, in reference to whom one cannot help remarking that an unduly large portion of their logical writings seems occupied with telling us what does not belong to logic. They justify their rejection on the ground that the mode belongs to the matter, and must be determined by a consideration of the matter, and therefore is extralogical. To a certain extent I agree with their grounds of rejection, for (as explained in Chapter VI.) it is not easy to see how the degree of modality of any proposition, whether premise or conclusion, can be justified without appeal to the matter. But then questions of justification, in any adequate sense of 300 the term, belong to a range of considerations somewhat alien to Hamilton's and Mansel's way of regarding the science. The complete justification of our inferences is a matter which involves their truth or falsehood, a point with which these writers do not much concern themselves, being only occupied with the consistency of our reasonings, not with their conformity with fact. Were I speaking as a Hamiltonian I should say that modality is formal rather than material, for though we cannot justify the degree of our belief of a proposition without appeal to the matter, we can to a moderate degree of accuracy estimate it without any such appeal; and this would seem to be quite enough to warrant its being regarded as formal.
§ 5. A quicker and more straightforward way to avoid all the issues related to the subject, at least so far as logic, or the writers on logic, are concerned, is to simply claim that modality has no connection to logic whatsoever. Many modern writers, like Hamilton and Mansel, take this approach. It's interesting to note that a surprisingly large part of their writings on logic seems focused on explaining what does not relate to logic. They defend their stance by stating that the mode belongs to the matter of the proposition and must be determined by examining that matter, which is why it's seen as extralogical. To some extent, I agree with their reasoning for rejection, since (as explained in Chapter VI.) it's not clear how the extent of modality in any proposition, whether it's a premise or conclusion, can be justified without appealing to the matter. However, questions of justification, in any meaningful sense, belong to a set of considerations that are somewhat different from how Hamilton and Mansel view the science. Fully justifying our inferences involves determining their truth or falsehood, a topic these writers don't address much, as they focus only on the consistency of our reasoning, rather than its alignment with reality. If I were speaking from a Hamiltonian perspective, I would argue that modality is formal rather than material, because even though we can't justify how strongly we believe a proposition without appealing to the matter, we can estimate it with a reasonable degree of accuracy without any such appeal; and that seems sufficient to classify it as formal.
It must be admitted that Hamilton's account of the matter when he is recommending the rejection of the modals, is not by any means clear and consistent. He not only fails, as already remarked, to distinguish between the formal and the material (in other words, the true and the false) modality; but when treating of the former he fails to distinguish between the extremely diverse aspects of modality when viewed from the Aristotelian and the Kantian stand-points. Of the amount and significance of this difference we shall speak presently, but it may be just pointed out here that Hamilton begins (Vol. I. p. 257) by rejecting the modals on the ground that the distinctions between the necessary, the contingent, the possible, and the impossible, must be wholly rested on an appeal to the matter of the propositions, in which he is, I think, quite correct. But then a little further on (p. 260), in explaining ‘the meaning of three terms which are used in relation to pure and modal propositions,’ he gives the widely different Kantian, or three-fold division into the apodeictic, the assertory, and the problematic. He does not take the precaution of pointing out to his hearers the very 301 different general views of logic from which these two accounts of modality spring.[4]
It has to be acknowledged that Hamilton's explanation when he recommends rejecting the modals is not at all clear or consistent. He not only fails, as mentioned earlier, to differentiate between formal and material (in other words, true and false) modality; but when discussing the former, he does not distinguish between the very different aspects of modality from the Aristotelian and Kantian perspectives. We will discuss the significance of this difference shortly, but it is important to note here that Hamilton starts (Vol. I. p. 257) by dismissing the modals on the basis that the distinctions among necessary, contingent, possible, and impossible must entirely rely on the matter of the propositions, which I think is quite correct. However, a little later (p. 260), while explaining "the meaning of three terms used in relation to pure and modal propositions," he presents the widely different Kantian, or three-fold division into apodeictic, assertory, and problematic. He does not take the precaution of informing his audience about the very different overall views of logic from which these two accounts of modality originate.[4]
§ 6. There is one kind of modal syllogism which it would seem unreasonable to reject on the ground of its not being formal, and which we may notice in passing. The premise ‘Any A is probably B,’ is equivalent to ‘Most A are B.’ Now it is obvious that from two such premises as ‘Most A are B,’ ‘Most A are C,’ we can deduce the consequence, ‘Some C are B.’ Since this holds good whatever may be the nature of A, B, and C, it is, according to ordinary usage of the term, a formal syllogism. Mansel, however, refuses to admit that any such syllogisms belong to formal logic. His reasons are given in a rather elaborate review[5] and criticism of some of the logical works of De Morgan, to whom the introduction of ‘numerically definite syllogisms’ is mainly due. Mansel does not take the particular example given above, as he is discussing a somewhat more comprehensive algebraic form. He examines it in a special numerical example:[6]—18 out of 21 Ys are X; 15 out of 21 Ys are Z; the conclusion that 12 Zs are X is rejected from formal logic on the ground that the arithmetical judgment involved is synthetical, not analytical, and rests upon an intuition of quantity. We cannot enter upon any examination of these 302 reasons here; but it may merely be remarked that his criticism demands the acceptance of the Kantian doctrines as to the nature of arithmetical judgments, and that it would be better to base the rejection not on the ground that the syllogism is not formal, but on the ground that it is not analytical.
§ 6. There's one type of modal syllogism that it seems unreasonable to reject on the ground that it is not formal, and we can mention it in passing. The premise 'Any A is probably B' is the same as saying 'Most A are B.' It's clear that from two premises like 'Most A are B' and 'Most A are C,' we can conclude that 'Some C are B.' This is true regardless of what A, B, and C are, so according to the usual definition of the term, this is a formal syllogism. However, Mansel argues that such syllogisms shouldn't be considered part of formal logic. He provides his reasons in a detailed review[5] and critique of some of the logical works of De Morgan, who is chiefly responsible for introducing 'numerically definite syllogisms.' Mansel doesn't specifically discuss the example given above since he's looking at a broader algebraic form. Instead, he analyzes a specific numerical instance:[6]—18 out of 21 Ys are X; 15 out of 21 Ys are Z; and he dismisses the conclusion that 12 Zs are X from formal logic by arguing that the arithmetic judgment involved is synthetic, not analytic, and relies on an intuition of quantity. We can't delve into the details of these reasons here, but it's worth mentioning that his critique requires acceptance of Kantian views on the nature of arithmetic judgments, and it might be more accurate to base the rejection not on the ground that the syllogism is not formal, but on the ground that it is not analytical.
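The counting that underlies Mansel's figures can be made explicit (a minimal worked sketch of the arithmetic): of the 21 Ys, at most 21 − 18 = 3 fail to be X, and at most 21 − 15 = 6 fail to be Z, so at most 3 + 6 = 9 Ys fail of one or the other; at least 21 − 9 = 12 Ys are therefore both X and Z, and hence at least 12 Zs are X. In general, from 'm of n Ys are X' and 'k of n Ys are Z' it follows that at least m + k − n of the Zs are X, whenever that number is positive.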
§ 7. There is another and practical way of getting rid of the perplexities of modal reasoning which must be noticed here. It is the resource of ordinary reasoners rather than the decision of professed logicians,[7] and, like the first method of evasion already pointed out in this chapter, is of very partial application. It consists in treating the premises, during the process of reasoning, as if they were pure, and then reintroducing the modality into the conclusion, as a sort of qualification of its full certainty. When each of the premises is nearly certain, or when from any cause we are not concerned with the extent of their departure from full certainty, this rough expedient will answer well enough. It is, I apprehend, the process which passes through the minds of most persons in such cases, in so far as they reason consciously. They would, presumably, in such an example as that previously given (§ 4), proceed as if the premises that ‘those who take arsenic will die,’ and that ‘the man in question has taken it,’ were quite true, instead of being only probably true, and they would consequently draw the conclusion that ‘he would die.’ But bearing in mind that the premises are not certain, they would remember that the conclusion was only to be held with a qualified assent. This they would 303 express quite correctly, if the mere nature and not the degree of that assent is taken into account, by saying that ‘he is likely to die.’ In this case the modality is rejected temporarily from the premises to be reintroduced into the conclusion.
§ 7. There's another practical way to deal with the challenges of modal reasoning that needs to be mentioned here. This approach is more a tool of everyday reasoners than a method endorsed by professed logicians,[7] and like the first evasion mentioned in this chapter, it has limited applicability. It involves treating the premises as if they were absolutely true during the reasoning process, then reintroducing the modality in the conclusion as a way to qualify its certainty. When each of the premises is almost certain, or when for any reason we aren't concerned with how far they stray from being completely certain, this rough approach works reasonably well. I believe this is the process that passes through most people's minds in such cases, so far as they reason consciously. For instance, in the example previously given (§ 4), they would proceed as if the premises that 'those who take arsenic will die' and 'the man in question has taken it' were entirely true, rather than just probably true, and they would consequently conclude that 'he will die.' However, keeping in mind that the premises aren't certain, they would remember that the conclusion is only to be held with a qualified assent. They would express this quite correctly, so far as the mere nature and not the degree of that assent is concerned, by saying that 'he is likely to die.' In this case, the modality is temporarily removed from the premises and then reintroduced into the conclusion.
It is obvious that such a process as this is of a very rough and imperfect kind. It does, in fact, omit from accurate consideration just the one point now under discussion. It takes no account of the varying shades of expression by which the degree of departure from perfect conviction is indicated, which is of course the very thing with which modality is intended to occupy itself. At best, therefore, it could only claim to be an extremely rude way of deciding questions, the accurate and scientific methods of treating which are demanded of us.
It’s clear that this process is quite rough and imperfect. In fact, it completely overlooks the very point we're discussing. It doesn’t consider the different shades of expression that show how far someone departs from perfect conviction, which is exactly what modality is meant to address. At best, it can only be seen as a very crude way of resolving issues that should be handled with more precise and scientific methods.
§ 8. In any employment of applied logic we have of course to go through such a process as that just mentioned. Outside of pure mathematics it can hardly ever be the case that the premises from which we reason are held with absolute conviction. Hence there must be a lapse from absolute conviction in the conclusion. But we reason on the hypothesis that the premises are true, and any trifling defection from certainty, of which we may be conscious, is mentally reserved as a qualification to the conclusion. But such considerations as these belong rather to ordinary applied logic; they amount to nothing more than a caution or hint to be borne in mind when the rules of the syllogism, or of induction, are applied in practice. When, however, we are treating of modality, the extent of the defection from full certainty is supposed to be sufficiently great for our language to indicate and appreciate it. What we then want is of course a scientific discussion of the principles in accordance with which this departure is to be measured and expressed, 304 both in our premises and in our conclusion. Such a plan therefore for treating modality, as the one under discussion, is just as much a banishment of it from the field of real logical enquiry, as if we had determined avowedly to reject it from consideration.
§ 8. In any application of logic, we naturally need to go through a process like the one just mentioned. Outside of pure mathematics, it's rarely the case that the premises we use are completely convincing. Because of this, there will be a gap in certainty regarding the conclusion. However, we reason based on the assumption that the premises are true, and any minor doubts we might have are mentally noted as a qualification to the conclusion. These types of considerations belong more to basic applied logic; they serve as nothing more than a caution or reminder to keep in mind when we apply the rules of syllogism or induction in practice. However, when we discuss modality, the degree of uncertainty is assumed to be significant enough for our language to express and acknowledge it. What we really need is a scientific examination of the principles that will help us measure and articulate this departure, both in our premises and in our conclusion. Thus, a plan for addressing modality, like the one we're discussing, effectively excludes it from the realm of serious logical inquiry, just as if we had explicitly chosen to ignore it.
§ 9. Before proceeding to a discussion of the various ways in which modality may be treated by those who admit it into logic, something must be said to clear up a possible source of confusion in this part of the subject. In the cases with which we have hitherto been mostly concerned, in the earlier chapters of this work, the characteristic of modality (for in this chapter we may with propriety use this logical term) has generally been found in singular and particular propositions. It presented itself when we had to judge of individual cases from a knowledge of the average, and was an expression of the fact that the proposition relating to these individuals referred to a portion only of the whole class from which the average was taken. Given that of men of fifty-five, three out of five will die in the course of twenty years, we have had to do with propositions of the vague form, ‘It is probable that AB (of that age) will die,’ or of the more precise form, ‘It is three to two that AB will die,’ within the specified time. Here the modal proposition naturally presents itself in the form of a singular or particular proposition.
§ 9. Before moving on to discuss the different ways modality can be approached by those who include it in logic, it's important to clarify a potential source of confusion regarding this part of the topic. In the cases we've mostly focused on in the earlier chapters of this work, the characteristic of modality (as we can appropriately call this logical term in this chapter) has typically been found in singular and particular propositions. It emerged when we needed to make judgments about individual cases based on an understanding of the average, reflecting the fact that the proposition about these individuals pertains only to a subset of the entire class from which the average was drawn. For example, given that out of men aged fifty-five, three out of five will die over the next twenty years, we've dealt with propositions of the vague form, ‘It is probable that AB (of that age) will die,’ or in the more precise form, ‘It is three to two that AB will die,’ within that timeframe. Here, the modal proposition naturally appears as a singular or particular proposition.
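A small arithmetical gloss on the figures just quoted may help tie the two forms of statement together: if three men in five of that age die within the twenty years, the probability of death is 3/5 and the probability of survival 2/5, so the odds are three cases of death against two of survival; this is all that is meant by saying 'it is three to two that AB will die' within the specified time.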
§ 10. But when we turn to ordinary logic we may find universal propositions spoken of as modal. This must mostly be the case with those which are termed necessary or impossible, but it may also be the case with the probable. We may meet with the form ‘All X is probably Y.’ Adopting the same explanation here as has been throughout adopted in analogous cases, we must say that what is meant by the modality of such a proposition is the proportional number of 305 times in which the universal proposition would be correctly made. And in this there is, so far, no difficulty. The only difference is that whereas the justification of the former, viz. the particular or individual kind of modal, was obtainable within the limits of the universal proposition which included it, the justification of the modality of a universal proposition has to be sought in a group or succession of other propositions. The proposition has to be referred to some group of similar ones and we have to consider the proportion of cases in which it will be true. But this distinction is not at all fundamental.
§ 10. However, when we look at everyday logic, we might find that universal propositions are described as modal. This mostly applies to those that are called necessary or impossible, but it can also apply to the probable. We might come across the phrase ‘All X is probably Y.’ Using the same explanation we've applied in similar situations, we need to say that the modality of such a proposition refers to the proportion of times the universal proposition would be correctly stated. So far, this presents no issues. The only difference is that while the justification for the former, specifically the particular or individual type of modal, could be derived from the universal proposition that encompasses it, the justification for the modality of a universal proposition must be found in a group or series of other propositions. The proposition needs to be compared to a set of similar ones, and we must look at the ratio of cases in which it will be true. But this distinction is not really fundamental.
It is quite true that universal propositions from their nature are much less likely than individual ones to be justified, in practice, by such appeal. But, as has been already frequently pointed out, we are not concerned with the way in which our propositions are practically obtained, nor with the way in which men might find it most natural to test them; but with that ultimate justification to which we appeal in the last resort, and which has been abundantly shown to be of a statistical character. When, therefore, we say that ‘it is probable that all X is Y,’ what we mean is, that in more than half the cases we come across we should be right in so judging, and in less than half the cases we should be wrong.
It is indeed true that universal statements are generally less likely to be justified than individual ones when put to the test. However, as has been pointed out many times before, we aren't focused on how our statements are practically arrived at or how people might naturally test them; rather, we are concerned with that ultimate justification we appeal to in the end, which has been clearly shown to be based on statistical evidence. So, when we say that “it is probable that all X is Y,” we mean that in more than half the cases we encounter, we would be correct in that judgment, and in less than half the cases, we would be wrong.
§ 11. It is at this step that the possible ambiguity is encountered. When we talk of the chance that All X is Y, we contemplate or imply the complementary chance that it is not so. Now this latter alternative is not free from ambiguity. It might happen, for instance, in the cases of failure, that no X is Y, or it might happen that some X, only, is not Y; for both of these suppositions contradict the original proposition, and are therefore instances of its failure. In practice, no doubt, we should have various recognized rules and 306 inductions to fall back upon in order to decide between these alternatives, though, of course, the appeal to them would be in strictness extralogical. But the mere existence of such an ambiguity, and the fact that it can only be cleared up by appeal to the subject-matter, are in themselves no real difficulty in the application of the conception of modality to universal propositions as well as to individual ones.
§ 11. This is where we encounter potential ambiguity. When we discuss the probability that All X is Y, we also reference the complementary possibility that it is not. However, this latter option is not free from ambiguity. It could be that no X is Y, or it could be that only some X are not Y; both of these scenarios contradict the original statement and therefore represent its failure. In practice, we likely have various established rules and inductions to rely on for deciding between these alternatives, although, technically speaking, referring to them would be outside the scope of logic. Yet the mere presence of such ambiguity, and the fact that it can only be resolved by looking at the subject matter, do not actually pose a real problem in applying the concept of modality to both universal and individual propositions.
§ 12. Having noticed some of the ways in which the introduction of modality into logic has been evaded or rejected, we must now enter into a brief account of its treatment by those who have more or less deliberately admitted its claims to acceptance.
§ 12. After observing some of the ways modality has been avoided or dismissed in logic, we must now provide a brief overview of how those who have somewhat intentionally acknowledged its validity have treated it.
The first enquiry will be, What opinions have been held as to the nature of modality? that is, Is it primarily an affection of the matter of the proposition, and, if not, what is it exactly? In reference to this enquiry it appears to me, as already remarked, that amongst the earlier logicians no such clear and consistent distinction between the subjective and objective views of logic as is now commonly maintained, can be detected.[8] The result of this appears in their treatment of modality. This always had some reference to the subjective side of the proposition, viz. in this case to the nature or quantity of the belief with which it was entertained; but it is equally clear that this characteristic was not estimated at first hand, so to say, and in itself, but rather from a consideration of the matter determining what it should be. The commonly accepted scholastic or Aristotelian division, for instance, is into the necessary, the contingent, the possible, and the impossible. This is clearly a division according to 307 the matter almost entirely, for on the purely mental side the necessary and the impossible would be just the same; one implying full conviction of the truth of a proposition, and the other of that of its contradictory. So too, on the same side, it would not be easy to distinguish between the contingent and the possible. On the view in question, therefore, the modality of a proposition was determined by a reference to the nature of the subject-matter. In some propositions the nature of the subject-matter decided that the predicate was necessarily joined to the subject; in others that it was impossible that they should be joined; and so on.
The first question is: What beliefs have been held about the nature of modality? That is, is it mainly an aspect of the content of the proposition, and if not, what exactly is it? Regarding this question, it seems to me, as already mentioned, that among earlier logicians there wasn’t such a clear and consistent distinction between the subjective and objective views of logic as is generally accepted today. The outcome of this is evident in their approach to modality. This has always had some connection to the subjective aspect of the proposition, specifically regarding the nature or amount of belief with which it was held. However, it is also evident that this characteristic was not initially assessed on its own, but rather through a consideration of the matter determining what it should be. The commonly accepted scholastic or Aristotelian classification, for example, is divided into necessary, contingent, possible, and impossible. This division is primarily based on the content, because from a purely mental perspective, necessary and impossible are essentially the same; one reflecting complete conviction in the truth of a proposition, and the other in the truth of its opposite. Similarly, it is also difficult to differentiate between the contingent and the possible on that same level. Therefore, under this view, the modality of a proposition was determined with reference to the nature of the subject matter. In some propositions, the nature of the subject matter dictated that the predicate was necessarily linked to the subject; in others, that it was impossible for them to be linked; and so on.
§ 13. The artificial character of such a four-fold division will be too obvious to modern minds for it to be necessary to criticize it. A very slight study of nature and consequent appreciation of inductive evidence suffice to convince us that those uniformities upon which all connections of phenomena, whether called necessary or contingent, depend, demand extremely profound and extensive enquiry; that they admit of no such simple division into clearly marked groups; and that, therefore, the pure logician had better not meddle with them.[9]
§ 13. The artificial nature of this four-part division will be obvious to modern readers, so there’s no need for critique. Just a little exploration of nature and a basic understanding of inductive evidence is enough to show us that the patterns underlying all connections of events, whether labeled necessary or contingent, require very deep and extensive investigation; that they cannot be neatly divided into clearly defined groups; and that, therefore, a pure logician should probably avoid getting involved with them.[9]
The following extract from Grote's Aristotle (Vol. I. p. 192) will serve to show the origin of this four-fold division, its conformity with the science of the day, and consequently its utter want of conformity with that of our own time:—“The distinction of Problematical and Necessary Propositions corresponds, in the mind of Aristotle, to that capital and characteristic doctrine of his Ontology and Physics, already touched on in this chapter. He thought, as we have seen, that in the vast circumferential region of the Kosmos, from 308 the outer sidereal sphere down to the lunar sphere, celestial substance was a necessary existence and energy, sempiternal and uniform in its rotations and influence; and that through its beneficent influence, pervading the concavity between the lunar sphere and the terrestrial centre (which included the four elements with their compounds) there prevailed a regularizing tendency called Nature; modified, however, and partly counteracted by independent and irregular forces called Spontaneity and Chance, essentially unknowable and unpredictable. The irregular sequences thus named by Aristotle were the objective correlate of the Problematical Proposition in Logic. In these sublunary sequences, as to future time, may or may not, was all that could be attained, even by the highest knowledge; certainty, either of affirmation or negation, was out of the question. On the other hand, the necessary and uniform energies of the celestial substance, formed the objective correlate of the Necessary Proposition in Logic; this substance was not merely an existence, but an existence necessary and unchangeable… he considers the Problematical Proposition in Logic to be not purely subjective, as an expression of the speaker's ignorance, but something more, namely, to correlate with an objective essentially unknowable to all.”
The following extract from Grote's Aristotle (Vol. I. p. 192) shows the origin of this four-part division, how it aligns with the science of its time, and its complete disconnect with our understanding today: “In Aristotle's mind, the distinction between Problematical and Necessary Propositions relates to the main and defining idea of his Ontology and Physics, which we've already mentioned in this chapter. He believed, as we have seen, that in the vast expanse of the Kosmos, from the outer celestial sphere down to the lunar sphere, celestial substance was a necessary existence and energy, eternal and consistent in its rotations and effects. Through its beneficial influence filling the area between the lunar sphere and the Earth (which included the four elements and their combinations), there existed a regularizing force called Nature. However, this was modified and partly opposed by independent and chaotic forces known as Spontaneity and Chance, which are fundamentally unknowable and unpredictable. The irregular events labeled by Aristotle were the objective counterpart of the Problematical Proposition in Logic. Regarding these sublunary events concerning the future, the best we could achieve was may or may not, even with the highest knowledge; certainty, whether affirmation or negation, was impossible. Conversely, the necessary and consistent energies of celestial substance formed the objective counterpart of the Necessary Proposition in Logic; this substance was not just an existence but a necessary and unchangeable existence… he sees the Problematical Proposition in Logic as not purely subjective, merely reflecting the speaker's ignorance, but something more significant, that correlates with an objective that is essentially unknowable to everyone.”
§ 14. Even after this philosophy began to pass away, the divisions of modality originally founded upon it might have proved, as De Morgan has remarked,[10] of considerable service in mediæval times. As he says, people were much more frequently required to decide in one way or the other upon a single testimony, without there being a sufficiency of specific knowledge to test the statements made. The old logician “did not know but that any day of the week might bring from Cathay or Tartary an account of men who ran on 309 four wheels of flesh and blood, or grew planted in the ground, like Polydorus in the Æneid, as well evidenced as a great many nearly as marvellous stories.” Hence, in default of better inductions, it might have been convenient to make rough classifications of the facts which were and which were not to be accepted on testimony (the necessary, the impossible, &c.), and to employ these provisional inductions (which is all we should now regard them) as testing the stories which reached him. Propositions belonging to the class of the impossible might be regarded as having an antecedent presumption against them so great as to prevail over almost any testimony worth taking account of, and so on.
§ 14. Even after this philosophy started to fade away, the categories of modality that were originally based on it could have been quite useful during medieval times, as De Morgan noted.[10] People often had to make decisions based on a single testimony, without enough specific knowledge to verify the claims being made. The old logician “did not know but that any day of the week might bring from Cathay or Tartary a report of men who ran on four wheels of flesh and blood, or who were rooted in the ground, like Polydorus in the Æneid, as well documented as a lot of other similarly incredible stories.” Therefore, in the absence of better inductions, it might have been helpful to create rough classifications of the facts that should or shouldn't be accepted on testimony (the necessary, the impossible, etc.), and to use these provisional inductions (which is all we should now consider them to be) to test the stories that reached him. Claims belonging to the class of the impossible might be regarded as carrying an antecedent presumption against them so strong as to prevail over almost any testimony worth taking into account, and so on.
§ 15. But this old four-fold division of modals continued to be accepted and perpetuated by the logicians long after all philosophical justification for it had passed away. So far as I have been able to ascertain, scarcely any logician of repute or popularity before Kant, was bold enough to make any important change in the way of regarding them.[11] Even the Port-Royal Logic, founded as it is on Cartesianism, repeats the traditional statements, though with extreme brevity. This adherence to the old forms led, it need not be remarked, to considerable inconsistency and confusion in many cases. These forms were founded, as we have seen, on an objective view of the province of logic, and this view was 310 by no means rigidly carried out in many cases. In fact it was beginning to be abandoned, to an extent and in directions which we have not opportunity here to discuss, before the influence of Kant was felt. Many, for instance, added to the list of the four, by including the true and the false; occasionally also the probable, the supposed, and the certain were added. This seems to show some tendency towards abandoning the objective for the subjective view, or at least indicates a hesitation between them.
§ 15. However, this old four-part division of modals continued to be accepted and maintained by logicians long after there was any philosophical reason for it. From what I can tell, hardly any well-known or popular logician before Kant was brave enough to make any significant changes in how they viewed them.[11] Even the Port-Royal Logic, which is based on Cartesianism, repeats the traditional statements, albeit very briefly. This sticking to old forms led, as you might expect, to significant inconsistency and confusion in many instances. These forms were established, as we've seen, based on an objective perspective of logic, and this perspective was not consistently applied in many cases. In fact, it was starting to be abandoned, to some extent and in ways we can't discuss here, even before Kant's influence was felt. For example, many added to the original four by including the true and the false; sometimes they also included the probable, the supposed, and the certain. This seems to indicate a shift away from the objective view towards a subjective one, or at least reflects uncertainty between the two.
§ 16. With Kant's view of modality almost every one is familiar. He divides judgments, under this head, into the apodeictic, the assertory, and the problematic. We shall have to say something about the number and mutual relations of these divisions presently; we are now only concerned with the general view which they carry out. In this respect it will be obvious at once what a complete change of position has been reached. The ‘necessary’ and the ‘impossible’ demanded an appeal to the matter of a proposition in order to recognize them; the ‘apodeictic’ and the ‘assertory’, on the other hand, may be true of almost any matter, for they demand nothing but an appeal to our consciousness in order to distinguish between them. Moreover, the distinction between the assertory and the problematic is so entirely subjective and personal, that it may vary not only between one person and another, but in the case of the same person at different times. What one man knows to be true, another may happen to be in doubt about. The apodeictic judgment is one which we not only accept, but which we find ourselves unable to reverse in thought; the assertory is simply accepted; the problematic is one about which we feel in doubt.
§ 16. Most people are familiar with Kant's perspective on modality. He categorizes judgments into three types: apodeictic, assertory, and problematic. We'll discuss the relationship and number of these categories later; for now, we’re focused on the overall perspective they represent. It’s clear that a significant shift has occurred. The terms ‘necessary’ and ‘impossible’ required examining the content of a proposition to identify them; however, the ‘apodeictic’ and ‘assertory’ can apply to almost anything, as they only require an appeal to our awareness to differentiate between them. Additionally, the difference between the assertory and the problematic is so subjective that it can vary not just from person to person, but also for the same person at different times. What one person considers true, another may be unsure about. An apodeictic judgment is one that we not only accept but also cannot contradict in our thoughts; an assertory judgment is simply accepted; a problematic judgment is one we feel uncertain about.
This way of looking at the matter is the necessary outcome of the conceptualist or Kantian view of logic. It has 311 been followed by many logicians, not only by those who may be called followers of Kant, but by almost all who have felt his influence. Ueberweg, for instance, who is altogether at issue with Kant on some fundamental points, adopts it.
This perspective is the natural result of the conceptualist or Kantian view of logic. Many logicians have embraced it, not just those who are direct followers of Kant, but almost everyone influenced by him. For example, Ueberweg, who fundamentally disagrees with Kant on several key issues, also takes this approach.
§ 17. The next question to be discussed is, How many subdivisions of modality are to be recognized? The Aristotelian or scholastic logicians, as we have seen, adopted a four-fold division. The exact relations of some of these to each other, especially the possible and the contingent, is an extremely obscure point, and one about which the commentators are by no means agreed. As, however, it seems tolerably clear that it was not consciously intended by the use of these four terms to exhibit a graduated scale of intensity of conviction, their correspondence with the province of modern probability is but slight, and the discussion of them, therefore, becomes rather a matter of special or antiquarian interest. De Morgan, indeed (Formal Logic, p. 232), says that the schoolmen understood by contingent more likely than not, and by possible less likely than not. I do not know on what authority this statement rests, but it credits them with a much nearer approach to the modern views of probability than one would have expected, and decidedly nearer than that of most of their successors.[12] The general conclusion at which I have arrived, after a reasonable amount of investigation, is that there were two prevalent views on the subject. Some (e.g. Burgersdyck, Bk. I. ch. 32) admitted that there were at bottom only two kinds of modality; the contingent and the possible being equipollent, as also the necessary and the impossible, provided the one asserts and the other denies. This is the view to which those would naturally be led who looked mainly to the nature of the subject-matter. 312 On the other hand, those who looked mainly at the form of expression, would be led by the analogy of the four forms of proposition, and the necessity that each of them should stand in definite opposition to each other, to insist upon a distinction between the four modals.[13] They, therefore, endeavoured to introduce a distinction by maintaining (e.g. Crackanthorpe, Bk. III. ch. 11) that the contingent is that which now is but may not be, and the possible that which now is not but may be. A few appear to have made the distinction correspondent to that between the physically and the logically possible.
§ 17. The next question to discuss is, how many subdivisions of modality should we recognize? The Aristotelian or scholastic logicians, as we've seen, used a four-fold division. The exact relationships between some of these, especially the possible and the contingent, are extremely unclear, and the commentators do not agree on this point. However, it seems fairly clear that these four terms were not intentionally used to represent a graduated scale of conviction, so their connection to modern probability is quite limited, making the discussion more of a specialized or historical interest. De Morgan, indeed (Formal Logic, p. 232), claims that the schoolmen understood "contingent" to mean more likely than not, and "possible" to mean less likely than not. I'm unsure on what authority this statement is based, but it suggests that they were closer to modern views of probability than expected, and definitely closer than most of their successors.[12] My general conclusion, after some investigation, is that there were two main perspectives on this topic. Some (e.g. Burgersdyck, Bk. I. ch. 32) accepted that there were fundamentally only two kinds of modality; that the contingent and the possible are equivalent, as well as the necessary and the impossible, provided one affirms and the other denies. This view naturally arises for those who focus on the nature of the subject matter. Conversely, those who primarily focus on the form of expression are led, by the analogy of the four forms of proposition and the need for each to oppose one another, to insist on distinguishing between the four modals.[13] They, therefore, sought to introduce a distinction by asserting (e.g. Crackanthorpe, Bk. III. ch. 11) that the contingent is what currently exists but may not continue to do so, and the possible is what currently does not exist but could. A few seemed to align the distinction with that between the physically and the logically possible.
§ 18. When we get to the Kantian division we have reached much clearer ground. The meaning of each of these terms is quite explicit, and it is also beyond doubt that they have a more definite tendency in the direction of assigning a graduated scale of conviction. So long as they are regarded from a metaphysical rather than a logical standing point, there is much to be said in their favour. If we use introspection merely, confining ourselves to a study of the judgments themselves, to the exclusion of the grounds on which they rest, there certainly does seem a clear and well-marked distinction between judgments which we cannot even conceive to be reversed in thought; those which we could reverse, but which we accept as true; and those which we merely entertain as possible.
§ 18. When we reach the Kantian division, we find ourselves on much clearer ground. The meaning of each of these terms is quite clear, and it’s also undeniable that they have a more definite tendency toward establishing a graded scale of certainty. As long as we look at them from a metaphysical rather than a logical perspective, there’s a lot to support them. If we rely only on introspection, focusing just on the judgments themselves and ignoring the reasons behind them, there definitely seems to be a clear and distinct difference between judgments that we can't even imagine could be reversed in thought; those that we could reverse but accept as true; and those that we just consider as possible.
Regarded, however, as a logical division, Kant's arrangement seems to me of very little service. For such logical purposes indeed, as we are now concerned with, it really seems to resolve itself into a two-fold division. The distinction between the apodeictic and the assertory will be 313 admitted, I presume, even by those who accept the metaphysical or psychological theory upon which it rests, to be a difference which concerns, not the quantity of belief with which the judgments are entertained, but rather the violence which would have to be done to the mind by the attempt to upset them. Each is fully believed, but the one can, and the other cannot, be controverted. The belief with which an assertory judgment is entertained is full belief, else it would not differ from the problematic; and therefore in regard to the quantity of belief, as distinguished from the quality or character of it, there is no difference between it and the apodeictic. It is as though, to offer an illustration, the index had been already moved to the top of the scale in the assertory judgment, and all that was done to convert this into an apodeictic one, was to clamp it there. The only logical difference which then remains is that between problematic and assertory, the former comprehending all the judgments as to the truth of which we have any degree of doubt, and the latter those of which we have no doubt. The whole range of the former, therefore, with which Probability is appropriately occupied, is thrown undivided into a single compartment. We can hardly speak of a ‘division’ where one class includes everything up to the boundary line, and the other is confined to that boundary line. Practically, therefore, on this view, modality, as the mathematical student of Probability would expect to find it, as completely disappears as if it were intended to reject it.
However, when considered as a logical division, Kant's arrangement seems to offer very little help. For the logical purposes we're focused on right now, it really breaks down into a two-fold division. I assume the distinction between the apodeictic and assertory will be accepted even by those who support the metaphysical or psychological theory behind it, as a difference that doesn't concern the amount of belief we have in the judgments but rather the effort it would take to challenge them. Both are fully believed, but one can be contested while the other cannot. The belief in an assertory judgment is complete; otherwise, it wouldn’t be different from the problematic. Therefore, regarding the quantity of belief, as opposed to the quality or nature of it, there is no difference between assertory and apodeictic. It's as though, to offer an illustration, the index had already been moved to the top of the scale in the assertory judgment, and all that was needed to turn it into an apodeictic one was to clamp it there. The only logical difference that remains is between problematic and assertory, with the former encompassing all judgments we have some doubt about, and the latter being those we have no doubt about. Hence, the entire range of the former, which Probability deals with appropriately, gets grouped into one single category. We can hardly consider it a ‘division’ when one class includes everything up to the limit and the other is restricted to that limit. Practically speaking, under this perspective, modality, as the mathematical student of Probability would expect to find it, vanishes as completely as if it had been deliberately rejected.
§ 19. By less consistent and systematic thinkers, and by those in whom ingenuity was an over prominent feature, a variety of other arrangements have been accepted or proposed. There is, of course, some justification for such attempts in the laudable desire to bring our logical forms into better harmony with ordinary thought and language. In practice, 314 as was pointed out in an earlier chapter, every one recognizes a great variety of modal forms, such as ‘likely,’ ‘very likely,’ ‘almost certainly,’ and so on almost without limit in each direction. It was doubtless supposed that, by neglecting to make use of technical equivalents for some of these forms, we should lose our logical control over certain possible kinds of inference, and so far fall short even of the precision of ordinary thought.
§ 19. Less consistent and systematic thinkers, along with those whose creativity was particularly pronounced, have accepted or suggested various other arrangements. There is, of course, some justification for these attempts, rooted in the commendable desire to align our logical structures more closely with everyday thought and language. In practice, as noted in an earlier chapter, everyone acknowledges a wide range of modal forms, like ‘likely,’ ‘very likely,’ ‘almost certainly,’ and so on, nearly without limit in either direction. It was likely assumed that by not using technical equivalents for some of these forms, we would lose our logical grip on certain possible types of inference, thereby compromising even the precision of ordinary thinking.
With regard to such additional forms, it appears to me that all those which have been introduced by writers who were uninfluenced by the Theory of Probability, have done little else than create additional confusion, as such writers do not attempt to marshal their terms in order, or to ascertain their mutual relations. Omitting, of course, forms obviously of material modality, we have already mentioned the true and the false; the probable, the supposed, and the certain. These subdivisions seem to have reached their climax at a very early stage in Occam (Prantl, III. 380), who held that a proposition might be modally affected by being ‘vera, scita, falsa, ignota, scripta, prolata, concepta, credita, opinata, dubitata.’
Regarding these additional forms, I think that all those introduced by writers who weren't influenced by the Theory of Probability have mostly just added to the confusion, since these writers don’t try to organize their terms or figure out how they relate to each other. Excluding, of course, forms that are clearly related to material modality, we’ve already talked about the true and the false; the probable, the supposed, and the certain. These categories seem to have peaked at an early stage with Occam (Prantl, III. 380), who believed that a proposition could be modally affected by being ‘vera, scita, falsa, ignota, scripta, prolata, concepta, credita, opinata, dubitata.’
§ 20. Since the growth of the science of Probability, logicians have had better opportunities of knowing what they had to aim at; and, though it cannot be said that their attempts have been really successful, these are at any rate a decided improvement upon those of their predecessors. Dr Thomson,[14] for instance, gives a nine-fold division. He says that, arranging the degrees of modality in an ascending scale, we find that a judgment may be either possible, doubtful, probable, morally certain for the thinker himself, morally certain for a class or school, morally certain for all, physically certain with a limit, physically certain without 315 limitation, and mathematically certain. Many other divisions might doubtless be mentioned, but, as every mathematician will recognize, the attempt to secure any general agreement in such a matter of arrangement is quite hopeless. It is here that the beneficial influence of the mathematical theory of Probability is to be gratefully acknowledged. As soon as this came to be studied it must have been perceived that in attempting to mark off clearly from one another certain gradations of belief, we should be seeking for breaches in a continuous magnitude. In the advance from a slight presumption to a strong presumption, and from that to moral certainty, we are making a gradual ascent, in the course of which there are no natural halting-places. The proof of this continuity need not be entered upon here, for the materials for it will have been gathered from almost every chapter of this work. The reader need merely be reminded that the grounds of our belief, in all cases which admit of number and measurement, are clearly seen to be of this description; and that therefore unless the belief itself is to be divorced from the grounds on which it rests, what thus holds as to their characteristics must hold also as to its own.
§ 20. Since the development of the science of Probability, logicians have had better chances to understand their objectives; and while it can't be said their efforts have been wholly successful, they are certainly a significant step up from those of their predecessors. Dr. Thomson, [14] for example, presents a nine-fold classification. He states that, when organizing the degrees of modality in an increasing order, we find that a judgment can be categorized as possible, doubtful, probable, morally certain for the thinker, morally certain for a group or school, morally certain for everyone, physically certain with limits, physically certain without limits, and mathematically certain. Many other classifications could be mentioned, but as every mathematician will understand, trying to achieve any general consensus on such arrangements is quite futile. This is where we should acknowledge the positive impact of the mathematical theory of Probability. Once this began to be studied, it became clear that in trying to distinctly separate various levels of belief, we would be looking for breaks in a continuous spectrum. Moving from slight presumption to strong presumption, and then to moral certainty involves a gradual progression, where there are no clear stopping points. We don't need to prove this continuity here, as the evidence has been gathered throughout nearly every chapter of this work. The reader simply needs to remember that the basis of our beliefs, in all matters that can be quantified and measured, clearly falls into this category; and unless the belief is separated from the foundations it stands on, what applies to their characteristics must also apply to the belief itself.
It follows, therefore, that modality in the old sense of the word, wherein an attempt was made to obtain certain natural divisions in the scale of conviction, must be finally abandoned. All that it endeavoured to do can now be done incomparably better by the theory of Probability, with its numerical scale which admits of indefinite subdivision. None of the old systems of division can be regarded as a really natural one; those which admit but few divisions being found to leave the whole range of the probable in one unbroken class, and those which adopt many divisions lapsing into unavoidable vagueness and uncertainty.
It follows that the old concept of modality, where an effort was made to establish specific natural divisions in the hierarchy of belief, must be completely abandoned. Everything it tried to achieve can now be done much better through the theory of Probability, which uses a numerical scale that allows for endless subdivision. None of the old division systems can be considered genuinely natural; those with only a few divisions end up grouping the entire range of the probable into one continuous category, while those with many divisions become inevitably vague and uncertain.
§ 21. Corresponding to the distinction between pure 316 and modal propositions, but even more complicated and unsatisfactory in its treatment, was that between pure and modal syllogisms. The thing discussed in the case of the latter was, of course, the effect produced upon the conclusion in respect of modality, by the modal affection of one or both premises. It is only when we reach such considerations as these that we are at all getting on to the ground appropriate to Probability; but it is obvious that very little could be done with such rude materials, and the inherent clumsiness and complication of the whole modal system come out very clearly here. It was in reference probably to this complication that some of the bitter sayings[15] of the schoolmen and others which have been recorded, were uttered.
§ 21. Corresponding to the difference between pure and modal propositions, but even more complex and unsatisfactory in its treatment, was the distinction between pure and modal syllogisms. The discussion regarding the latter was, of course, about how the modality of one or both premises affects the conclusion. It's only when we start considering these aspects that we really move into the realm of Probability; however, it's clear that there's not much that can be achieved with such rough materials, and the inherent awkwardness and complexity of the entire modal system become very evident here. It was probably in reference to this complexity that some of the recorded harsh remarks[15] of the schoolmen and others were uttered.
Aristotle has given an intricate investigation of this subject, and his followers naturally were led along a similar track. It would be quite foreign to my purpose in the slight sketch in this chapter to attempt to give any account of these enquiries, even were I competent to do so; for, as has been pointed out, the connection between the Aristotelian modals and the modern view of the nature of Probability, though real, is exceedingly slight. It need only be remarked that what was complicated enough with four modals to be taken account of, grows intricate beyond all endurance when such as the ‘probable’ and the ‘true’ and the ‘false’ have also to be assigned a place in the list. The following examples[16] will show the kind of discussions with which the logicians 317 exercised themselves. ‘Whether, with one premise certain, and the other probable, a certain conclusion may be inferred’: ‘Whether, from the impossible, the necessary can be inferred’; ‘Whether, with one premise necessary and the other de inesse, the conclusion is necessary’, and so on, endlessly.
Aristotle conducted a detailed examination of this topic, and his followers naturally followed a similar path. It wouldn't really serve my purpose, in this chapter's brief sketch, to try to summarize these inquiries, even if I were capable of doing so; for, as has been noted, the link between the Aristotelian modals and the modern understanding of Probability, while real, is very minimal. It's worth mentioning that what was already complicated with four modals becomes incredibly intricate when terms like ‘probable,’ ‘true,’ and ‘false’ also need to be included in the discussion. The following examples[16] will illustrate the types of debates that logicians engaged in. ‘Can a certain conclusion be drawn with one certain premise and one probable premise?’ ‘Can the necessary be inferred from the impossible?’ ‘Is the conclusion necessary if one premise is necessary and the other is de inesse?’, and so on, endlessly.
§ 22. On the Kantian view of modality the discussion of such kinds of syllogisms becomes at once decidedly more simple (for here but three modes are recognized), and also somewhat more closely connected with strict Probability, (for the modes are more nearly of the nature of gradations of conviction). But, on the other hand, there is less justification for their introduction, as logicians might really be expected to know that what they are aiming to effect by their clumsy contrivances is the very thing which Probability can carry out to the highest desired degree of accuracy. The former methods are as coarse and inaccurate, compared with the latter, as were the roughest measurements of Babylonian night-watchers compared with the refined calculations of the modern astronomer. It is indeed only some of the general adherents of the Kantian Logic who enter upon any such considerations as these; some, such as Hamilton and Mansel, entirely reject them, as we have seen. By those who do treat of the subject, such conclusions as the following are laid down; that when both premises are apodeictic the conclusion will be the same; so when both are assertory or problematic. If one is apodeictic and the other assertory, the latter, or ‘weaker,’ is all that is to be admitted for the conclusion; and so on. The English reader will find some account of these rules in Ueberweg's Logic.[17]
§ 22. According to the Kantian perspective on modality, the discussion of these types of syllogisms becomes noticeably simpler (since only three modes are recognized) and somewhat more aligned with strict probability (as the modes resemble varying degrees of certainty). However, on the flipside, there's less reason to introduce them, since logicians should realize that what they aim to achieve with their awkward approaches is exactly what probability can accomplish with much greater accuracy. The earlier methods are as rough and inaccurate compared to the latter as the most basic measurements of Babylonian stargazers were compared to the sophisticated calculations of modern astronomers. Indeed, it's primarily a subset of Kantians who consider these matters; others, like Hamilton and Mansel, completely dismiss them, as we've noted. Among those who do address the topic, conclusions such as the following are established: when both premises are apodeictic, the conclusion will align; similarly, when both are assertory or problematic. If one is apodeictic and the other is assertory, then only the latter, or 'weaker,' premise should be accepted for the conclusion; and so forth. The English reader can find some explanation of these rules in Ueberweg's Logic.[17]
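As a purely illustrative aid, the short sketch below expresses the "weaker premise governs the conclusion" rule just described for the three Kantian modes. Nothing in it is drawn from the logicians under discussion beyond that rule; the numeric ordering of the modes and the names used are assumptions made for the example.

```python
# A minimal sketch, assuming a numeric ordering of the three Kantian modes,
# of the "weaker premise governs the conclusion" rule described above.
# The ordering problematic < assertory < apodeictic and the function name
# are illustrative assumptions, not anything laid down by these logicians.

MODES = {"problematic": 0, "assertory": 1, "apodeictic": 2}

def conclusion_mode(premise_a: str, premise_b: str) -> str:
    """Return the mode of the conclusion: the weaker of the two premises."""
    return min(premise_a, premise_b, key=MODES.__getitem__)

print(conclusion_mode("apodeictic", "apodeictic"))  # apodeictic
print(conclusion_mode("apodeictic", "assertory"))   # assertory
print(conclusion_mode("assertory", "problematic"))  # problematic
```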
§ 23. But although those modals, regarded as instruments of accurate thought, have been thus superseded by the 318 precise arithmetical expressions of Probability, the question still remains whether what may be termed our popular modal expressions could not be improved and adapted to more accurate use. It is true that the attempt to separate them from one another by any fundamental distinctions is futile, for the magnitude of which they take cognizance is, as we have remarked, continuous; but considering the enormous importance of accurate terminology, and of recognizing numerical distinctions wherever possible, it would be a real advance if any agreement could be arrived at with regard to the use of modal expressions. We have already noticed (Ch. II. § 16) some suggestions by Mr Galton as to the possibility of a natural system of classification, resting upon the regularity with which most kinds of magnitudes tend to group themselves about a mean. It might be proposed, for instance, that we should agree to apply the term ‘good’ to the first quarter, measuring from the best downwards; ‘indifferent’ to the middle half, and ‘bad’ to the last quarter. There seems no reason why a similarly improved terminology should not some day be introduced into the ordinary modal language of common life. It might be agreed, for instance, that ‘very improbable’ should as far as possible be confined to those events which had odds of (say) more than 99 to 1 against them; and so on, with other similar expressions. There would, no doubt, be difficulties in the way, for in all applications of classification we have to surmount the two-fold obstacles which lie in the way, firstly (to use Kant's expression) of the faculty of making rules, and secondly of that of subsumption under rules. That is to say, even if we had agreed upon our classes, there would still be much doubt and dispute, in the case of things which did not readily lend themselves to be counted or measured, as to whether the odds were more or less than the assigned quantity.
§ 23. Although these modal expressions, seen as tools for clear thinking, have been replaced by precise mathematical expressions of Probability, the question remains whether our everyday modal expressions could be improved and made more accurate. It's true that trying to separate them through any fundamental differences is pointless, since the measurements they address are, as we've noted, continuous. However, given the huge importance of precise terminology and recognizing numerical differences whenever possible, it would be a genuine improvement if we could reach an agreement on how to use modal expressions. We’ve already mentioned (Ch. II. § 16) some of Mr. Galton's ideas about creating a natural classification system based on the regularity with which many types of measurements tend to cluster around a mean. For example, we might propose that we designate ‘good’ for the top quarter, measuring from the best downward; ‘indifferent’ for the middle half; and ‘bad’ for the bottom quarter. There doesn’t seem to be any reason why a similarly improved terminology shouldn’t someday be adopted into the everyday modal language we use. For instance, we could agree that ‘very improbable’ should primarily apply to events that have odds of (let’s say) more than 99 to 1 against them, and so forth with other similar phrases. Certainly, there would be challenges to overcome, as in all classification efforts we have to deal with the two-fold barriers described by Kant: the ability to make rules and the ability to categorize things under those rules. In other words, even if we did agree on our categories, there would still be much uncertainty and disagreement regarding items that can’t easily be counted or measured, on whether the odds are more or less than the specified amount.
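The conventions suggested here lend themselves to a simple mechanical statement, and the hypothetical sketch below shows one way of writing it down. Only the quartile labels and the 99 to 1 threshold for 'very improbable' come from the passage itself; the remaining labels and cut-points are assumptions added to complete the example.

```python
# An illustrative sketch only: one hypothetical way the verbal conventions
# proposed above could be made mechanical. The quartile labels and the
# 99-to-1 threshold for 'very improbable' come from the passage; the other
# labels and cut-points are assumptions added to round out the example.

def quality_label(rank_fraction: float) -> str:
    """rank_fraction: 0.0 = best of the group, 1.0 = worst."""
    if rank_fraction <= 0.25:
        return "good"          # first quarter, measuring from the best downwards
    if rank_fraction <= 0.75:
        return "indifferent"   # middle half
    return "bad"               # last quarter

def probability_label(p: float) -> str:
    """Attach a modal term to an event with probability p of occurring."""
    if p < 1 / 100:            # odds of more than 99 to 1 against
        return "very improbable"
    if p < 1 / 2:
        return "improbable"    # assumed additional convention
    if p < 99 / 100:
        return "probable"      # assumed additional convention
    return "very probable"     # assumed additional convention

print(quality_label(0.10))       # good
print(probability_label(0.005))  # very improbable
```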
It is true that when we know the odds for or against an event, we can always state them explicitly without the necessity of first agreeing as to the usage of terms which shall imply them. But there would often be circumlocution and pedantry in so doing, and as long as modal terms are in practical use it would seem that there could be no harm, and might be great good, in arriving at some agreement as to the degree of probability which they should be generally understood to indicate. Bentham, as is well known, in despair of ever obtaining anything accurate out of the language of common life on this subject, was in favour of a direct appeal to the numerical standard. He proposed the employment, in judicial trials, of an instrument, graduated from 0 to 10, on which scale the witness was to be asked to indicate the degree of his belief of the facts to which he testified: similarly the judge might express the force with which he held his conclusion. The use of such a numerical scale, however, was to be optional only, not compulsory, as Bentham admitted that many persons might feel at a loss thus to measure the degree of their belief. (Rationale of Judicial Evidence, Bk. I., Ch. VI.)
It's true that when we know the odds for or against an event, we can always state them clearly without needing to first agree on the terms that imply them. However, doing so often involves unnecessary complexity and formalism, and as long as modal terms are in practical use, it seems there can be no harm—and potentially much benefit—in reaching some agreement on the degree of probability they should commonly represent. Bentham, as is widely known, was frustrated with the imprecision of everyday language on this topic and advocated for a direct reliance on numerical standards. He suggested using a tool in legal trials, scaled from 0 to 10, where witnesses would indicate the strength of their belief regarding the facts they were testifying about; similarly, judges could express the strength of their conclusions. However, the adoption of such a numerical scale was to be optional, not mandatory, as Bentham acknowledged that many people might struggle to quantify their belief in this way. (Rationale of Judicial Evidence, Bk. I., Ch. VI.)
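Bentham's proposal, as reported above, specifies only the instrument and its 0 to 10 scale; it says nothing about how marks on the scale should correspond to numerical probability. The sketch below assumes a simple linear correspondence, purely as an illustration of how such a scale could be tied to the numerical standard he favoured.

```python
# Bentham's proposal, as reported above, specifies only a 0-to-10 scale on
# which the witness marks the strength of belief; it gives no rule relating
# the marks to probability. The linear mapping below is an assumption made
# purely to illustrate how the scale could be tied to a numerical standard.

def bentham_mark(probability: float) -> int:
    """Map a degree of belief, expressed as a probability, to a 0-10 mark."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return round(probability * 10)

print(bentham_mark(0.5))  # 5  -- even chances
print(bentham_mark(0.9))  # 9  -- strong but not complete conviction
```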
§ 24. Throughout this chapter we have regarded the modals as the nearest counterpart to modern Probability which was afforded by the old systems of logic. The reason for so regarding them is, that they represented some slight attempt, rude as it was, to recognize and measure certain gradations in the degree of our conviction, and to examine the bearing of such considerations upon our logical inferences.
§ 24. In this chapter, we have seen modals as the closest equivalent to modern Probability offered by the old systems of logic. The reason for this perspective is that they represented a basic effort, as crude as it may have been, to acknowledge and quantify certain levels of our conviction, and to explore how these factors influence our logical conclusions.
But although it is amongst the modals that the germs of the methods of Probability are thus to be sought; the true subject-matter of our science, that is, the classes of objects with which it is most appropriately concerned, are rather represented by another part of the scholastic logic. This 320 was the branch commonly called Dialectic, in the old sense of that term. Dialectic, according to Aristotle, seems to have been a sort of sister art to Rhetoric. It was concerned with syllogisms differing in no way from demonstrative syllogisms, except that their premises were probable instead of certain. Premises of this kind he termed topics, and the syllogisms which dealt with them enthymemes. They were said to start from ‘signs and likelihoods’ rather than from axioms.[18]
But even though the foundations of Probability can be found among the modals, the main focus of our science—meaning the types of objects we deal with most appropriately—is actually represented by another part of scholastic logic. This was the branch commonly referred to as Dialectic in the old sense of the term. According to Aristotle, Dialectic was like a sibling art to Rhetoric. It dealt with syllogisms that were no different from demonstrative syllogisms, except that their premises were probable instead of certain. He called these premises topics, and the syllogisms that used them enthymemes. They were said to start from ‘signs and likelihoods’ rather than from axioms.[18]
§ 25. The terms in which such reasonings are commonly described sound very much like those applicable to Probability, as we now understand it. When we hear of likelihood, and of probable syllogisms, our first impression might be that the inferences involved would be of a similar character.[19] This, however, would be erroneous. In the 321 first place the province of this Dialectic was much too wide, for it covered in addition the whole field of what we should now term Scientific or Material Induction. The distinctive characteristic of the dialectic premises was their want of certainty, and of such uncertain premises Probability (as I have frequently insisted) takes account of one class only, Induction concerning itself with another class. Again, not the slightest attempt was made to enter upon the enquiry, How uncertain are the premises? It is only when this is attempted that we can be considered to enter upon the field of Probability, and it is because, after a rude fashion, the modals attempted to grapple with this problem, that we have regarded them as in any way occupied with our special subject-matter.
§ 25. The way these arguments are usually described sounds a lot like how we understand Probability today. When we hear terms like likelihood and probable syllogisms, we might first think that the inferences involved are similar.[19] However, that would be a mistake. In the first place, the scope of this Dialectic was much broader, since it also covered the whole field of what we would now call Scientific or Material Induction. The key distinction of the dialectic premises was their uncertainty, and Probability (as I have often pointed out) only considers one class of uncertain premises, while Induction deals with a different class. Additionally, there was no effort made to explore the question, How uncertain are the premises? It's only when this question is explored that we can be said to enter the realm of Probability, and it's because the modals tried to tackle this issue, albeit in a rough way, that we see them as having any relevance to our specific topic.
§ 26. Amongst the older logics with which I have made any acquaintance, that of Crackanthorpe gives the fullest discussion upon this subject. He divides his treatment of the syllogism into two parts, occupied respectively with the ‘demonstrative’ and the ‘probable’ syllogism. To the latter a whole book is devoted. In this the nature and consequences of thirteen different ‘loci’[20] are investigated, though it is not very clear in what sense they can every one of them be regarded as being ‘probable.’
§ 26. Among the older logics I'm familiar with, Crackanthorpe offers the most comprehensive discussion on this topic. He splits his analysis of the syllogism into two sections: one focused on the 'demonstrative' syllogism and the other on the 'probable' syllogism. The latter gets an entire book dedicated to it. This book explores the nature and implications of thirteen different 'loci',[20] although it's not entirely clear how each of them can be considered 'probable.'
It is doubtless true, that if the old logicians had been in possession of such premises as modern Probability is concerned with, and had adhered to their own way of treating them, they would have had to place them amongst such loci, and thus to make the consideration of them a part of their Dialectic. But inasmuch as there does not seem to have been the slightest attempt on their part to do more here than recognize the fact of the premises being probable; that is, since it was not attempted to measure their probability and that of the conclusion, I cannot but regard this part of Logic as having only the very slightest relation to Probability as now conceived. It seems to me little more than one of the ways (described at the commencement of this chapter) by which the problem of Modality is not indeed rejected, but practically evaded.
It’s certainly true that if the old logicians had access to the kinds of premises that modern Probability deals with and had stuck to their own methods, they would have had to categorize them as such loci and make examining them a part of their Dialectic. However, since it appears they made no real effort to do anything beyond acknowledging that the premises were probable—that is, since they didn’t try to measure their probability or that of the conclusion—I can only see this part of Logic as having very little connection to Probability as we understand it now. To me, it seems like just one of the ways (mentioned at the beginning of this chapter) that the issue of Modality is not outright dismissed, but effectively avoided.
§ 27. As Logic is not the only science which is directly and prominently occupied with questions about belief and evidence, so the difficulties which have arisen there have been by no means unknown elsewhere. In respect of the modals, this seems to have been manifestly the case in Jurisprudence. Some remarks, therefore, may be conveniently made here upon this application of the subject, though of course with the brevity suitable on the part of a layman who has to touch upon professional topics.
§ 27. Logic isn’t the only field that deals directly with questions about belief and evidence, and the challenges that have come up in this area are definitely found in other fields too. When it comes to modals, it seems pretty clear that this has been particularly evident in law. So, I’d like to make a few comments on this application of the topic, while keeping it brief since I'm just an outsider discussing professional matters.
Recall for a moment what are the essentials of modality. These I understand to be the attempt to mark off from one another, without any resort to numerical notation, varying degrees of conviction or belief, and to determine the consequent effect of premises, thus affected, upon our conclusions. Moreover, as we cannot construct or retain a scale of any kind without employing a standard from and by which to measure it, the attainment and recognition of a standard of certainty, or of one of the other degrees of conviction, is 323 almost inseparably involved in the same enquiry. In this sense of the term, modal difficulties have certainly shown themselves in the department of Law. There have been similar attempts here, encountered by similar difficulties, to come to some definite agreement as to a scale of arrangement of the degrees of our assent. It is of course much more practicable to secure such agreement in the case of a special science, confined more or less to the experts, than in subjects into which all classes of outsiders have almost equal right of entry. The range of application under the former circumstances is narrower, and the professional experts have acquired habits and traditions by which the standards may be retained in considerable integrity. It does not appear, however, according to all accounts, as if any very striking success had been attained in this direction by the lawyers.
Think for a moment about the basics of modality. I see these as the effort to distinguish between different degrees of certainty or belief without using numbers, and to figure out how these varying degrees impact our conclusions. Also, since we can't create or maintain a scale without having a standard to measure against, finding and recognizing a standard of certainty, or any other degree of belief, is closely tied to this inquiry. In this context, modal challenges have definitely arisen in the field of Law. Similar attempts have been made here, facing similar issues, to reach some definite agreement on how to arrange the degrees of our assent. It's certainly easier to achieve that consensus in specialized sciences that are more or less limited to experts than in fields where almost anyone can participate. The scope of application in the former case is smaller, and the professionals have developed habits and traditions that help maintain those standards fairly well. However, it seems, based on all reports, that the lawyers haven't made very significant progress in this area.
§ 28. The difficulty in its scientific, or strictly jurisprudential shape, seems to have shown itself principally in the attempt to arrange legal evidence into classes in respect of the degree of its cogency. This, I understand, was the case in the Roman law, and in some of the continental systems of jurisprudence which took their rise from the Roman law. “The direct evidence of so many witnesses was plena probatio. Then came minus plena probatio, then semiplenâ major and semiplenâ minor; and by adding together a certain number of half-proofs—for instance, by the production of a tradesman's account-books, plus his supplementary oath—full proof might be made out. It was on this principle that torture was employed to obtain a confession. The confession was evidence suppletory to the circumstances which were held to justify its employment.”[21]
§ 28. The challenge in its scientific or legal form seems to have mainly appeared in the effort to categorize legal evidence based on how convincing it is. I understand this was the case in Roman law and in some of the legal systems in Europe that developed from Roman law. “The direct evidence from several witnesses was plena probatio. Then came minus plena probatio, followed by semiplenâ major and semiplenâ minor; and by combining a certain number of half-proofs—such as presenting a tradesman's account books, plus his additional oath—full proof could be established. It was based on this principle that torture was used to obtain a confession. The confession served as supplementary evidence to the circumstances that were considered to justify its use.”[21]
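The scheme just quoted is in essence additive: graded pieces of evidence are combined until full proof is reached. The sketch below expresses that idea under assumed numerical weights for each grade, since the sources describe the grades but attach no figures to them.

```python
# A rough sketch of the additive scheme quoted above, in which graded pieces
# of evidence are combined until 'full proof' is reached. The numerical
# weights assigned to each grade are assumptions made for illustration only;
# the sources describe the grades but attach no figures to them.

PROOF_WEIGHTS = {
    "plena probatio": 1.0,    # full proof
    "semiplena major": 0.5,   # greater half-proof (weight assumed)
    "semiplena minor": 0.25,  # lesser half-proof (weight assumed)
}

def is_full_proof(evidence) -> bool:
    """True if the combined weight of the evidence amounts to full proof."""
    return sum(PROOF_WEIGHTS[item] for item in evidence) >= 1.0

# e.g. a tradesman's account-books plus his supplementary oath, each treated
# here (by assumption) as a greater half-proof:
print(is_full_proof(["semiplena major", "semiplena major"]))  # True
print(is_full_proof(["semiplena minor"]))                     # False
```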
According to Bentham,[22] the corresponding scale in the 324 English school was:—Positive proof, Violent presumption. Probable presumption, Light or Rash presumption. Though admitted by Blackstone and others, I understand that these divisions are not at all generally accepted at the present day.
According to Bentham,[22] the corresponding scale in the English school was:—Positive proof, Strong presumption, Likely presumption, Weak or Reckless presumption. Even though Blackstone and others recognized them, I believe these divisions aren't widely accepted today.
§ 29. In the above we are reminded rather of modal syllogisms. The principal practical form in which the difficulty underlying the simple modal propositions presents itself, is in the attempt to obtain some criterion of judicial certainty. By ‘certainty’ here we mean, of course, not what the metaphysicians term apodeictic,[23] for that can seldom or never be secured in practical affairs, but such a degree of conviction, short of this, as every reasonable person will feel to be sufficient for all his wants. Here again, one would think, the quest must appear, to accurate thinkers, an utterly hopeless one; an effort to discover natural breaks in a continuous magnitude. There cannot indeed be the least doubt that, amongst limited classes of keen and practised intellects, a standard of certainty, as of everything else, might be retained and handed down with considerable accuracy: this is possible in matters of taste and opinion where personal peculiarities of judgment are far more liable to cause disagreement and confusion. But then such a consensus is almost entirely an affair of tact and custom; whereas what is wanted in the case in question is some criterion to which the comparatively uninitiated may be able to appeal. The standard, therefore, must not merely be retained by recollection, but be generally recognizable by its characteristics. If such a criterion could 325 be secured, its importance could hardly be overrated. But so far as one may judge from the speeches of counsel, the charges of judges, and the verdicts of juries, nothing really deserving the name is ever attained.
§ 29. The previous discussion brings to mind modal syllogisms. The main practical issue related to simple modal propositions lies in trying to establish some criterion for judicial certainty. By ‘certainty,’ we don't mean the kind that metaphysicians call apodeictic, since that is rarely, if ever, achievable in real-life situations. Instead, we refer to a level of confidence that any reasonable person would consider adequate for their needs. It might seem to precise thinkers that this quest is utterly futile, akin to trying to find natural breaks in a continuous scale. There is no doubt that, among small groups of sharp and experienced minds, a standard of certainty could be preserved and passed down with significant accuracy. This holds true even in matters of taste and opinion, where personal judgment differences can lead to discord and confusion. However, achieving such a consensus is mainly about instinct and tradition; what is needed here is a criterion that less experienced individuals can reference. Therefore, the standard must not only be remembered but also be easily recognized by its features. If a criterion like that could be established, its significance would be immense. Yet, from what can be gathered from lawyers' arguments, judges' instructions, and jury verdicts, nothing that truly warrants the title is ever achieved.
§ 30. The nearest approach, perhaps, to a recognized standard is to be found in the frequent assurance that juries are not bound to convict only in case they have no doubt of the guilt of the accused; for the absolute exclusion of all doubt, the utter impossibility of suggesting any counter hypothesis which this assumes, is unattainable in human affairs. But, it is frequently said, they are to convict if they have no ‘reasonable doubt,’ no such doubt, that is, as would be ‘a hindrance to acting in the important affairs of life.’ As a caution against seeking after unattainable certainty, such advice may be very useful; but it need hardly be remarked that the certainty upon which we act in the important affairs of life is no fixed standard, but varies exceedingly according to the nature of those affairs. The greater the reward at stake, the greater the risk we are prepared to run, and conversely. Hardly any degree of certainty can exist, upon the security of which we should not be prepared to act under appropriate circumstances.[24]
§ 30. The closest thing to a recognized standard is the common reassurance that juries aren’t required to convict unless they have no doubt about the defendant’s guilt; achieving complete certainty, with no possibility of suggesting an alternative explanation, is impossible in human matters. However, it’s often stated that they should convict if they have no ‘reasonable doubt,’ meaning no doubt that would interfere with making important life decisions. While this caution against chasing impossible certainty can be very helpful, it’s worth noting that the certainty we rely on in significant decisions isn’t a fixed benchmark—it varies widely depending on the situation. The higher the reward at stake, the more risk we’re willing to take, and vice versa. There’s hardly any level of certainty that we wouldn’t be willing to act on in the right circumstances.[24]
Some writers indeed altogether deny that any standard, in the common sense of the word, either is, or ought to be, aimed at in legal proceedings. For instance, Sir J. F. 326 Stephen, in his work on English Criminal Law,[25] after noticing and rejecting such standards as that last indicated, comes to the conclusion that the only standard recognized by our law is that which induces juries to convict:—“What is judicial proof? That which being permitted by law to be given in evidence, induces twelve men, chosen according to the Jury Act, to say that, having heard it, their minds are satisfied of the truth of the proposition which it affirms. They may be prejudiced, they may be timid, they may be rash, they may be ignorant; but the oath, the number, and the property qualification, are intended, as far as possible, to neutralize these disadvantages, and answer precisely to the conditions imposed upon standards of value or length.” (p. 263.)
Some writers completely reject the idea that any standard, in the common sense of the word, either is or ought to be aimed for in legal proceedings. For instance, Sir J. F. Stephen, in his work on English Criminal Law,[25] after noticing and dismissing such standards as previously mentioned, concludes that the only standard our law recognizes is the one that leads juries to convict:—“What is judicial proof? It’s what, when allowed by law as evidence, leads twelve jurors, selected according to the Jury Act, to declare that, after hearing it, they believe the proposition is true. They might be biased, fearful, impulsive, or uninformed; however, the oath, the number, and the property qualification are designed to neutralize these drawbacks as much as possible and align with the criteria set for standards of value or measurement.” (p. 263.)
To admit this is much about the same thing as to abandon such a standard as unattainable. Evidence which induces a jury to convict may doubtless be a standard to me and others of what we ought to consider ‘reasonably certain,’ provided of course that the various juries are tolerably uniform in their conclusions. But it clearly cannot be proposed as a standard to the juries themselves; if their decisions are to be consistent and uniform, they want some external indication to guide them. When a man is asking, How certain ought I to feel? to give such an answer as the above is, surely, merely telling him that he is to be as certain as 327 he is. If, indeed, juries composed a close profession, they might, as was said above, retain a traditional standard. But being, as they are, a selection from the ordinary lay public, their own decisions in the past can hardly be held up to them as a direction what they are to do in future.
To say this is pretty much the same as giving up on a standard as impossible to reach. Evidence that leads a jury to convict might be a standard for me and others regarding what we should think of as ‘reasonably certain,’ as long as different juries are reasonably consistent in their conclusions. However, it clearly can't serve as a standard for the juries themselves; if their decisions are going to be consistent and uniform, they need some external guidance. When someone is asking, How certain should I feel? to suggest the answer above is just telling them they should feel as certain as they do. If juries were a close-knit profession, they might maintain a traditional standard, as mentioned earlier. But since they are made up of a mix from the average public, their past decisions can hardly be used as a guide for what they should do moving forward.
§ 31. It would appear therefore that we may fairly say that the English law, at any rate, definitely rejects the main assumption upon which the logical doctrine of modality and its legal counterpart are based: the assumption, namely, that different grades of conviction can be marked off from one another with sufficient accuracy for us to be able to refer individual cases to their corresponding classes. And that with regard to the collateral question of fixing a standard of certainty, it will go no further than pronouncing, or implying, that we are to be content with nothing short of, but need not go beyond, ‘reasonable certainty.’
§ 31. It seems that we can confidently state that English law, at least, clearly rejects the main assumption underlying the logical doctrine of modality and its legal equivalent: the idea that different levels of certainty can be distinguished from one another with enough precision to categorize individual cases into their appropriate groups. Regarding the related issue of establishing a standard of certainty, it will only declare, or suggest, that we should accept nothing less than, but need not exceed, ‘reasonable certainty.’
This is a statement of the standard, with which the logician and scientific man can easily quarrel; and they may with much reason maintain that it has not the slightest claim to accuracy, even if it had one to strict intelligibility. If a man wishes to know whether his present degree of certainty is reasonable, whither is he to appeal? He can scarcely compare his mental state with that which is experienced in ‘the important affairs of life,’ for these, as already remarked, would indicate no fixed value. At the same time, one cannot suppose that such an expression is destitute of all signification. People would not continue to use language, especially in matters of paramount importance and interest, without meaning something by it. We are driven therefore to conclude that ‘reasonable certainty’ does in a rude sort of way represent a traditional standard to which it is attempted to adhere. As already remarked, this is perfectly practicable in the case of any class of professional 328 men, and therefore not altogether impossible in the case of those who are often and closely brought into connection with such a class. Though it is hard to believe that any such expressions, when used for purposes of ordinary life, attain at all near enough to any conventional standard to be worth discussion; yet in the special case of a jury, acting under the direct influence of a judge, it seems quite possible that their deliberate assertion that they are ‘fully convinced’ may reach somewhat more nearly to a tolerably fixed standard than ordinary outsiders would at first think likely.
This is a statement of the standard that logicians and scientists might easily dispute; they could reasonably argue that it has no claims to accuracy, even if it were to have some claim to strict understanding. If someone wants to know whether their current level of certainty is reasonable, where should they turn? It's difficult to compare their mental state with what is felt in "the important affairs of life," since these, as mentioned before, would show no fixed value. However, one cannot assume that such an expression is completely meaningless. People wouldn’t keep using language, especially about crucial issues, without having some meaning behind it. Thus, we have to conclude that "reasonable certainty" does, in a rough way, reflect a traditional standard that people try to stick to. As mentioned earlier, this is certainly achievable for any group of professionals, and it's not entirely impossible for those who often work closely with such a group. Though it’s hard to believe that such expressions, when used in everyday life, come anywhere near a conventional standard to be worth discussing, it does seem quite possible that in the specific situation of a jury, acting under a judge's direct influence, their careful claim that they are "fully convinced" could get somewhat closer to a reasonably fixed standard than ordinary outsiders would initially assume.
§ 32. Are there then any means by which we could ascertain what this standard is; in other words, by which we could determine what is the real worth, in respect of accuracy, of this ‘reasonable certainty’ which the juries are supposed to secure? In the absence of authoritative declarations upon the subject, the student of Logic and Probability would naturally resort to two means, with a momentary notice of which we will conclude this enquiry.
§ 32. So, are there any ways we can figure out what this standard is; in other words, how we can determine the actual value, in terms of accuracy, of this ‘reasonable certainty’ that juries are expected to provide? Without any official statements on the topic, someone studying Logic and Probability would likely turn to two methods, and a brief mention of these will wrap up this inquiry.
The first of these would aim at determining the standard of judicial certainty indirectly, by simply determining the statistical frequency with which the decisions (say) of a jury were found to be correct. This may seem to be a hopeless task; and so indeed it is, but not so much on any theoretic insufficiency of the determining elements as on account of the numerous arbitrary assumptions which attach to most of the problems which deal with the probability of testimony and judgments. It is not necessary for this purpose that we should have an infallible superior court which revised the decisions of the one under consideration;[26] it is sufficient if a 329 large number of ordinary representative cases are submitted to a court consisting even of exactly similar materials to the one whose decisions we wish to test. Provided always that we make the monstrous assumption that the judgments of men about matters which deeply affect them are ‘independent’ in the sense in which the tosses of pence are independent, then the statistics of mere agreement and disagreement will serve our purpose. We might be able to say, for instance, that a jury of a given number, deciding by a given majority, were right nine times out of ten in their verdict. Conclusions of this kind, in reference to the French courts, are what Poisson has attempted at the end of his great work on the Probability of Judgments; though I do not suppose that he attached much numerical accuracy to his results.
The first approach would try to figure out the standard of judicial certainty indirectly by simply assessing how often jury decisions were found to be correct. This might seem like an impossible task; and it really is, but not so much because of any theoretical shortcomings in the determining elements, but rather due to the many arbitrary assumptions tied to most issues concerning the probability of testimony and judgments. For this purpose, we don't need an infallible higher court to review the decisions we're looking at; it's enough if a large number of ordinary representative cases are presented to a court with materials similar to those whose decisions we want to evaluate. As long as we make the outrageous assumption that people's judgments about matters that greatly affect them are ‘independent’ in the same way that coin tosses are, then the statistics of agreement and disagreement will meet our needs. We could say, for example, that a jury of a certain number, reaching a specific majority, was correct nine times out of ten in their verdict. Conclusions like this regarding the French courts are what Poisson has attempted at the end of his significant work on the Probability of Judgments; although I doubt he assigned much numerical accuracy to his findings.
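The sort of calculation gestured at here can be made explicit under the independence assumption just mentioned. If each juror were independently right with some fixed probability, the number of correct jurors would follow the binomial law, and statements of the form 'right nine times out of ten' would be computable. The sketch below uses assumed figures (twelve jurors, each right three times in four) only to show the shape of such a calculation; they are not values from Poisson or from the text.

```python
# A worked sketch of the kind of calculation gestured at above. Under the
# 'monstrous assumption' that each juror is independently right with a fixed
# probability p, the number of correct jurors follows the binomial law.
# The particular values below (twelve jurors, each right three times in
# four) are assumptions chosen to illustrate the form of the result, not
# figures taken from Poisson or from the text.

from math import comb

def prob_at_least(n: int, k: int, p: float) -> float:
    """P(at least k of n independent jurors are right, each with probability p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

print(round(prob_at_least(12, 12, 0.75), 3))  # 0.032 -- chance all twelve are right
print(round(prob_at_least(12, 7, 0.75), 3))   # 0.946 -- chance at least a bare majority is right
```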
A scarcely more hopeful means would be found by a reference to certain cases of legal ‘presumptions.’ A ‘conclusive presumption’ is defined as follows:—“Conclusive, or as they are elsewhere termed imperative or absolute presumptions of law, are rules determining the quantity of evidence requisite for the support of any particular averment which is not permitted to be overcome by any proof that the fact is otherwise.”[27] A large number of such presumptions will be found described in the text-books, but they seem to refer to matters far too vague, for the most part, to admit of any reduction to statistical frequency of occurrence. It is indeed maintained by some authorities that any assignment of degree of Probability is not their present object, but that they are simply meant to exclude the troublesome delays 330 that would ensue if everything were considered open to doubt and question. Moreover, even if they did assign a degree of certainty this would rather be an indication of what legislators or judges thought reasonable than of what was so considered by the juries themselves.
A barely more promising approach would be to look at certain instances of legal 'presumptions.' A 'conclusive presumption' is defined as follows:—“Conclusive, or as they are also called imperative or absolute presumptions of law, are rules that determine the amount of evidence needed to support any specific claim that cannot be challenged by any proof that the fact is different.”[27] Many of these presumptions can be found in legal textbooks, but they tend to refer to issues that are too vague for any statistical analysis of frequency. Some experts argue that assigning a level of probability isn’t the main goal; rather, they are simply intended to eliminate the annoying delays that would happen if everything were treated as if it were up for debate. Furthermore, even if they did provide a level of certainty, it would more reflect what lawmakers or judges considered reasonable rather than what juries truly thought.
There are indeed presumptions as to the time after which a man, if not heard of, is supposed to be dead (capable of disproof, of course, by his reappearance). If this time varied with the age of the man in question, we should at once have some such standard as we desire, for a reference to the Life tables would fix his probable duration of life, and so determine indirectly the measure of probability which satisfied the law. But this is not the case; the period chosen is entirely irrespective of age. The nearest case in point (and that does not amount to much) which I have been able to ascertain is that of the age after which it has been presumed that a woman was incapable of bearing children. This was the age of 53. A certain approach to a statistical assignment of the chances in this case is to be found in Quetelet's Physique Sociale (Vol. I. p. 184, note). According to the authorities which he there quotes it would seem that in about one birth in 5500 the mother was of the age of 50 or upwards. This does not quite assign the degree of what may be called the à priori chance against the occurrence of a birth at that age, because the fact of having commenced a family at an early age represents some diminution of the probability of continuing it into later life. But it serves to give some indication of what may be called the odds against such an event.
There are indeed presumptions about the time after which a man, if not heard from, is considered dead (though this can be disproven if he reappears). If this period varied with the man's age, we would at once have the kind of standard we're looking for, since a reference to the Life tables would fix his probable remaining lifespan and so indirectly determine the measure of probability that satisfies the law. However, that's not the case; the chosen period is completely independent of age. The closest example I've been able to find (which isn't very significant) is the age after which it was presumed a woman could no longer bear children. This age was 53. A rough statistical estimate for this situation can be found in Quetelet's Physique Sociale (Vol. I, p. 184, note). According to the sources he cites, it seems that in about one birth in 5,500, the mother was 50 or older. This doesn't quite give the prior odds against a birth at that age, because having started a family at a younger age somewhat lowers the probability of continuing it later in life. But it does give a sense of the odds against such an event occurring.
It need not be remarked that any such clues as these to the measure of judicial certainty are far too slight to be of any real value. They only deserve passing notice as a possible logical solution of the problem in question, or rather as 331 an indication of the mode in which, in theory, such a solution would have to be sought, were the English law, on those subjects, a perfectly consistent scheme of scientific evidence. This is the mode in which one would, under those circumstances, attempt to extract from its proceedings an admission of the exact measure of that standard of certainty which it adopted, but which it declined openly to enunciate.
It hardly needs pointing out that clues like these to the measure of judicial certainty are far too slight to be of any real value. They only merit a brief mention as a potential logical solution to the problem at hand, or rather as an indication of how, in theory, such a solution would need to be pursued if English law on those subjects were a completely consistent system of scientific evidence. This is how one would, under those circumstances, try to draw from its proceedings an admission of the exact level of certainty that it accepted, but declined to state openly.
1 Formal Logic, p. 232.
1 Formal Logic, p. 232.
2 This appears to be the purport of some statements in a very confused passage in Whately's Logic (Bk. II., ch. IV. § 1). “A modal proposition may be stated as a pure one by attaching the mode to one of the terms, and the proposition will in all respects fall under the foregoing rules;… ‘It is probable that all knowledge is useful;’ ‘probably useful’ is here the predicate.” He draws apparently no such distinction as that between the true and false modality referred to in the next note. What is really surprising is that even Hamilton puts the two (the true and the false modality) upon the same footing. “In regard to these [the former] the case is precisely the same; the mode is merely a part of the predicate.” Logic, I. 257.
2 This seems to be the meaning of some statements in a very confusing section in Whately's Logic (Bk. II., ch. IV. § 1). “A modal proposition can be expressed as a pure one by linking the mode to one of the terms, and the proposition will fully adhere to the previous rules;… ‘It is likely that all knowledge is useful;’ here ‘likely useful’ is the predicate.” He apparently makes no distinction between the true and false modality mentioned in the next note. What is truly surprising is that even Hamilton treats the two (the true and false modality) the same. “In regard to these [the former] the situation is exactly the same; the mode is simply part of the predicate.” Logic, I. 257.
3 I allude of course to such examples as ‘A killed B unjustly,’ in which the killing of B by A was sometimes said to be asserted not simply but with a modification. (Hamilton's Logic, I. 256.) It is obvious that the modification in such cases is by rights merely a part of the predicate, there being no formal distinction between ‘A is the killer of B’ and ‘A is the unjust killer of B.’ Indeed some logicians who were too conservative to reject the generic name of modality in this application adopted the common expedient of introducing a specific distinction which did away with its meaning, terming the spurious kind ‘material modality’ and the genuine kind ‘formal modality’. The former included all the cases in which the modification belonged by right either to the predicate or to the subject; the latter was reserved for the cases in which the modification affected the real conjunction of the predicate with the subject. (Keckermann, Systema Logicæ, Lib. II. ch. 3.) It was, I believe, a common scholastic distinction.
3 I’m referring to examples like ‘A killed B unjustly,’ where the act of B being killed by A is sometimes said to be asserted not straightforwardly, but with some modification. (Hamilton's Logic, I. 256.) It's clear that this modification is essentially just a part of the predicate, with no formal difference between ‘A is the killer of B’ and ‘A is the unjust killer of B.’ In fact, some logicians who were too traditional to completely discard the general term of modality in this context came up with the common solution of creating a specific distinction that diluted its meaning, labeling the false type as ‘material modality’ and the true type as ‘formal modality’. The former covered all instances where the modification rightfully belonged to either the predicate or the subject; the latter was set aside for cases where the modification actually influenced the real connection between the predicate and the subject. (Keckermann, Systema Logicæ, Lib. II. ch. 3.) I think this was a standard scholastic distinction.
For some account of the dispute as to whether the negative particle was to be considered to belong to the copula or to the predicate, see Hamilton's Logic, I. 253.
For details on the debate about whether the negative particle is associated with the copula or the predicate, check out Hamilton's Logic, I. 253.
4 He has also given a short discussion of the subject elsewhere (Discussions, Ed. II. p. 702), in which a somewhat different view is taken. The modes are indeed here admitted into logic, but only in so far as they fall by subdivision under the relation of genus and species, which is of course tantamount to their entire rejection; for they then differ in no essential way from any other examples of that relation.
4 He has also provided a brief discussion on the topic elsewhere (Discussions, Ed. II. p. 702), where a slightly different perspective is presented. The modes are acknowledged in logic, but only to the extent that they are categorized under the relationship of genus and species, which effectively means they are entirely dismissed; as this makes them no different from any other instances of that relationship.
5 Letters, Lectures and Reviews, p. 61. Elsewhere in the review (p. 45) he gives what appears to me a somewhat different decision.
5 Letters, Lectures and Reviews, p. 61. Elsewhere in the review (p. 45) he offers what seems to be a somewhat different conclusion.
6 It must be remembered that this is not one of the proportional propositions with which we have been concerned in previous chapters: it is meant that there are exactly 21 Ys, of which just 18 are X, not that on the average 18 out of 21 may be so regarded.
6 It should be noted that this is not one of the proportional propositions we've discussed in earlier chapters: it means that there are exactly 21 Ys, of which just 18 are X, not that, on average, 18 out of 21 can be seen that way.
8 The distinction is however by no means entirely neglected. Thus Smiglecius, when discussing the modal affections of certainty and necessity, says, “certitudo ad cognitionem spectat: necessitas vero est in re” (Disputationes; Disp. XIII., Quæst. XII.).
8 However, this distinction isn't completely overlooked. For instance, Smiglecius, when talking about the modal aspects of certainty and necessity, states, “certainty relates to knowledge; necessity, on the other hand, exists in reality” (Disputationes; Disp. XIII., Quæst. XII.).
9 It may be remarked that Whately (Logic, Bk. II. ch. II. § 2) speaks of necessary, impossible and contingent matter, without any apparent suspicion that they belong entirely to an obsolete point of view.
9 It's worth noting that Whately (Logic, Bk. II. ch. II. § 2) discusses necessary, impossible, and contingent issues without showing any indication that these ideas are completely outdated.
10 Formal Logic, p. 233.
10 Formal Logic, p. 233.
11 The subject was sometimes altogether omitted, as by Wolf. He says a good deal however about probable propositions and syllogisms, and, like Leibnitz before him, looked forward to a “logica probabilium” as something new and desirable. I imagine that he had been influenced by the writers on Chances, as of the few who had already treated that subject nearly all the most important are referred to in one passage (Philosophia Rationalis sive Logica, § 593).
11 Sometimes the topic was left out altogether, as it was by Wolf. He nevertheless says quite a bit about probable propositions and syllogisms, and, like Leibnitz before him, anticipated a "logic of probabilities" as something new and valuable. I suspect he had been influenced by the authors on chances, since nearly all the most important of the few who had previously addressed that topic are mentioned in one section (Philosophia Rationalis sive Logica, § 593).
Lambert stands quite apart. In this respect, as in most others where mathematical conceptions and symbols are involved, his logical attitude is thoroughly unconventional. See, for instance, his chapter ‘Von dem Wahrscheinlichen’, in his Neues Organon.
Lambert is quite different. In this area, as in most others that involve mathematical ideas and symbols, his way of thinking is completely unconventional. For example, check out his chapter ‘Von dem Wahrscheinlichen’ in his Neues Organon.
12 I cannot find the slightest authority for the statement in the elaborate history of Logic by Prantl.
12 I can't find any support for the statement in Prantl's detailed history of Logic.
13 “Hi quatuor modi magnam censeri solent analogiam habere cum quadruplici propositionum in quantitate et qualitate varietate” (Wallis's Instit. Logic. Bk. II. ch. 8).
13 “These four modes are usually considered to bear a close analogy to the fourfold variety of propositions in quantity and quality” (Wallis's Instit. Logic. Bk. II. ch. 8).
14 Laws of Thought, § 118.
14 Laws of Thought, § 118.
15 “Haud scio magis ne doctrinam modalium scholastici exercuerint, quam ea illos vexarit. Certe usque adeo sudatum hic fuit, ut dicterio locus sit datus; De modalibus non gustabit asinus.” Keckermann, Syst. Log. Bk. II. ch. 3.
15 “I hardly know whether the schoolmen worked the doctrine of modals harder than it tormented them. Certainly so much sweat was spent on it that it gave rise to the saying: the ass will never taste the modals.” Keckermann, Syst. Log. Bk. II. ch. 3.
16 Smiglecii Disputationes, Ingolstadt, 1618.
16 Smiglecii Disputationes, Ingolstadt, 1618.
See also Prantl's Geschichte der Logik (under Occam and Buridan) for accounts of the excessive complication which the subtlety of those learned schoolmen evolved out of such suitable materials.
See also Prantl's Geschichte der Logik (under Occam and Buridan) for discussions on the excessive complexity that the cleverness of those educated scholars developed from such appropriate materials.
18 “The εἰκòς and σημεῖον themselves are propositions; the former stating a general probability, the latter a fact, which is known to be an indication, more or less certain, of the truth of some further statement, whether of a single fact, or of a general belief. The former is a general proposition, nearly, though not quite, universal; as ‘most men who envy hate’; the latter is a singular proposition, which however is not regarded as a sign, except relatively to some other proposition, which it is supposed may by inferred from it.” (Mansel's Aldrich; Appendix F, where an account will be found of the Aristotelian enthymeme, and dialectic syllogism. Also, of course, Grote's Aristotle, Topics and elsewhere.)
18 “The εἰκòς and σημεῖον are both propositions; the first indicates a general probability, while the second states a fact that serves as a more or less certain indication of the truth of another statement, either about a specific fact or a broader belief. The first is a general proposition, almost universal; like ‘most people who envy also hate’; the second is a singular proposition, which is only seen as a sign in relation to another proposition that is assumed to be inferred from it.” (Mansel's Aldrich; Appendix F, where an account will be found of the Aristotelian enthymeme, and dialectic syllogism. Also, of course, Grote's Aristotle, Topics and elsewhere.)
19 “Nam in hoc etiam differt demonstratio, sen demonstrativa argumentatio, à probabili, quia in illâ tam conclusio quam præmissæ necessariæ sunt; in probabili autem argumentatione sicut conclusio ut probabilis infertur ita præmissæ ut probabiles afferuntur” (Crackanthorpe, Bk. V., Ch. 1); almost the words with which De Morgan distinguishes between logic and probability in a passage already cited (see Ch. VI. § 3).
19 “For in this too demonstration, or demonstrative argumentation, differs from probable argumentation: in the former both the conclusion and the premises are necessary, while in probable argumentation the conclusion is inferred as probable and the premises are put forward as probable” (Crackanthorpe, Bk. V., Ch. 1); nearly the same words De Morgan uses to differentiate between logic and probability in a previously cited passage (see Ch. VI. § 3).
Perhaps it was a development of some such view as this that Leibnitz looked forward to. “J'ai dit plus d'une fois qu'il faudrait une nouvelle espèce de Logique, qui traiteroit des degrés de Probabilité, puisqu'Aristote dans ses Topiques n'a rien moins fait que cela” (Nouveaux essais, Lib. IV. ch. XVI). It is possible, indeed, that he had in his mind more what we now understand by the mathematical theory of Probability, but in the infancy of a science it is of course hard to say whether any particular subject is definitely contemplated or not. Leibnitz (as Todhunter has shown in his history) took the greatest interest in such chance problems as had yet been discussed.
Perhaps it was a development of a view like this that Leibnitz anticipated. “I have said more than once that we need a new kind of logic that would treat degrees of probability, since Aristotle in his Topics did nothing of the kind” (Nouveaux essais, Lib. IV. ch. XVI). It’s possible that he was thinking more along the lines of what we now refer to as the mathematical theory of probability, but in the early stages of a science, it’s hard to determine whether a specific topic is clearly being considered or not. Leibnitz (as Todhunter has pointed out in his history) was very interested in the chance problems that had already been discussed.
20 By loci were understood certain general classes of premises. They stood, in fact, to the major premise in somewhat the same relation that the Category or Predicament did to the term. Crackanthorpe says of them, “sed duci a loco probabiliter arguendi, hoc vere proprium est Argumentationis probabilis; et in hoc a Demonstratione differt, quia Demonstrator utitur solummodo quatuor Locis eisque necessariis…. Præter hos autem, ex quibus quoque probabiliter arguere licet, sunt multo plures Loci arguendi probabiliter; ut a Genere, a Specie, ab Adjuncto, ab Oppositis, et similia” (Logica, Lib. V., ch. II.).
20 By loci, we understood certain general categories of premises. They stood to the major premise in somewhat the same relation that the Category or Predicament did to the term. Crackanthorpe says of them, “but to argue from a locus of probable reasoning: this is truly what is proper to probable Argumentation; and in this it differs from Demonstration, because the Demonstrator uses only four Loci, and those necessary ones…. Besides these, however, from which one may also argue probably, there are many more Loci for arguing probably; such as from the Genus, from the Species, from the Adjunct, from Opposites, and the like” (Logica, Lib. V., ch. II.).
21 Stephen's General View of the Criminal Law of England, p. 241.
21 Stephen's General View of the Criminal Law of England, p. 241.
23 Though this is claimed by some Kantian logicians;—Nie darf an einem angeblichen Verbrecher die gesetzliche Strafe vollzogen werden, bevor er nicht selbst das Verbrechen eingestanden. Denn wenn auch alle Zeugnisse und die übrigen Anzeigen wider ihn wären, so bleibt doch das Gegentheil immer möglich” (Krug, Denklehre, § 131).
23 Though this is claimed by some Kantian logicians;—“The legal penalty may never be carried out on an alleged criminal before he has himself confessed to the crime. For even if every testimony and every other indication were against him, the contrary always remains possible” (Krug, Denklehre, § 131).
24 As Mr C. J. Monro puts it: “Suppose that a man is suspected of murdering his daughter. Evidence which would not convict him before an ordinary jury might make a grand jury find a true bill; evidence which would not do this might make a coroner's jury bring in a verdict against him; evidence which would not do this would very often prevent a Chancery judge from appointing the man guardian to a ward of the court; evidence which would not affect the judge's mind might make a father think twice on his death-bed before he appointed the man guardian to his daughter.”
24 As Mr. C. J. Monro states: “Imagine a man suspected of murdering his daughter. Evidence that wouldn’t convict him in front of a regular jury might still persuade a grand jury to indict him; evidence that wouldn’t do that could lead a coroner's jury to rule against him; evidence that wouldn’t sway this jury might often stop a Chancery judge from making him the guardian of a ward in the court; evidence that wouldn’t influence the judge could still make a father reconsider, on his deathbed, before choosing this man as guardian to his daughter.”
25 The portions of this work which treat of the nature of proof in general, and of judicial proof in particular, are well worth reading by every logical student. It appears to me, however, that the author goes much too far in the direction of regarding proof as subjective, that is as what does satisfy people, rather than as what should satisfy them. He compares the legislative standard of certainty with that of value; this latter is declared to be a certain weight of gold, irrespective of the rarity or commonness of that metal. So with certainty; if people grow more credulous the intrinsic value of the standard will vary.
25 The parts of this work that discuss the nature of proof in general, and judicial proof in particular, are definitely worth reading for any logical student. However, I think the author goes too far in viewing proof as subjective, meaning it’s based on what does satisfy people, instead of what should satisfy them. He compares the legislative standard of certainty to that of value; the latter is defined as a specific weight of gold, regardless of how rare or common that metal is. The same goes for certainty; if people become more gullible, the intrinsic value of the standard will change.
26 The question will be more fully discussed in a future chapter, but a few words may be inserted here by way of indication. Reduce the case to the simplest possible elements by supposing only two judges or courts, of the same average correctness of decision. Let this be indicated by x. Then the chance of their agreeing is x² + (1 − x)², for they agree if both are right or both wrong. If the statistical frequency of this agreement is known, that is, the frequency with which the first judgment is confirmed by the second, we have the means of determining x.
26 The question will be discussed in more detail in a future chapter, but a few points can be made here for clarity. Simplify the case to the most basic elements by considering just two judges or courts, each with the same average accuracy in decision-making. Let's represent this as x. The probability of them agreeing is x² + (1 − x)², since they agree only if both are right or both are wrong. If we know how often this agreement occurs, that is, the frequency with which the first judgment is confirmed by the second, we can determine x.
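As a rough numerical illustration of the calculation sketched in this note (not part of the original text), the short Python fragment below recovers x from an observed rate of agreement; the function name and the 82 per cent figure are illustrative assumptions only.

```python
import math

def accuracy_from_agreement(agreement_rate):
    """Recover the common accuracy x of two equally reliable, independent
    courts from the observed frequency with which they agree, using
    agreement = x**2 + (1 - x)**2 (they agree when both are right or both wrong).
    The quadratic has two roots symmetric about 1/2; we take the one above 1/2,
    i.e. we assume the courts do better than tossing a coin."""
    if agreement_rate < 0.5:
        raise ValueError("x**2 + (1 - x)**2 can never fall below 1/2")
    return 0.5 * (1 + math.sqrt(2 * agreement_rate - 1))

# If a second, similarly constituted court confirms the first 82% of the time,
# the implied accuracy of each is 0.5 * (1 + sqrt(0.64)) = 0.9,
# i.e. 'right nine times out of ten'.
print(accuracy_from_agreement(0.82))  # ≈ 0.9
```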
CHAPTER 14.
FALLACIES.
§ 1. In works on Logic a chapter is generally devoted to the discussion of Fallacies, that is, to the description and classification of the different ways in which the rules of Logic may be transgressed. The analogy of Probability to Logic is sufficiently close to make it advisable to adopt the same plan here. In describing his own opinions an author is, of course, perpetually obliged to describe and criticise those of others which he considers erroneous. But some of the most widely spread errors find no supporters worth mentioning, and exist only in vague popular misapprehension. It will be found the best arrangement, therefore, at the risk of occasional repetition, to collect a few of the errors that occur most frequently, and as far as possible to trace them to their sources; but it will hardly be worth the trouble to attempt any regular system of arrangement and classification. We shall mainly confine ourselves, in accordance with the special province of this work, to problems which involve questions of logical interest, or to those which refer to the application of Probability to moral and social science. We shall avoid the discussion of isolated problems in games of chance and skill except when some error of principle seems to be involved in them.
§ 1. In discussions about Logic, there's usually a chapter dedicated to Fallacies, which means looking at and categorizing the different ways the rules of Logic can be violated. The relationship between Probability and Logic is close enough that it makes sense to follow a similar approach here. When sharing his views, an author inevitably needs to describe and critique the views of others that he thinks are incorrect. However, some of the most common errors don’t have any real defenders and only exist in vague public misunderstandings. Therefore, it seems best, despite the potential for some repetition, to gather a few of the most frequently encountered errors and try to trace them back to their origins; but it’s probably not worth the effort to create a formal system of organization and classification. We will mainly focus, as this work intends, on issues that involve questions of logical interest, or those related to applying Probability to moral and social sciences. We will steer clear of discussing isolated issues in games of chance and skill unless a fundamental error seems to be at play.
§ 2. (I.) One of the most fertile sources of error and confusion upon the subject has been already several times alluded to, and in part discussed in a previous chapter. This consists in choosing the class to which to refer an event, and therefore judging of the rarity of the event and the consequent improbability of foretelling it, after it has happened, and then transferring the impressions we experience to a supposed contemplation of the event beforehand. The process in itself is perfectly legitimate (however unnecessary it may be), since time does not in strictness enter at all into questions of Probability. No error therefore need arise in this way, if we were careful as to the class which we thus selected; but such carefulness is often neglected.
§ 2. (I.) One of the biggest sources of error and confusion on this topic has already been mentioned a few times and partially discussed in a previous chapter. This issue involves selecting the category to which an event belongs, which in turn affects our judgment about how rare the event is and how unlikely it is to predict it, after it has occurred, and then applying those impressions to a hypothetical consideration of the event beforehand. The process itself is completely legitimate (though it may be unnecessary), since time doesn't really factor into questions of Probability. Therefore, no error should arise from this if we are careful about the class we choose; however, that carefulness is often overlooked.
An illustration may afford help here. A man once pointed to a small target chalked upon a door, the target having a bullet hole through the centre of it, and surprised some spectators by declaring that he had fired that shot from an old fowling-piece at a distance of a hundred yards, His statement was true enough, but he suppressed a rather important fact. The shot had really been aimed in a general way at the barn-door, and had hit it; the target was afterwards chalked round the spot where the bullet struck. A deception analogous to this is, I think, often practised unconsciously in other matters. We judge of events on a similar principle, feeling and expressing surprise in an equally unreasonable way, and deciding as to their occurrence on grounds which are really merely a subsequent adjunct of our own. Butler's remarks about ‘the story of Cæsar,’ discussed already in the twelfth chapter, are of this character. He selects a series of events from history, and then imagines a person guessing them correctly who at the time had not the history before him. As I have already pointed out, it is one 334 thing to be unlikely to guess an event rightly without specific evidence; it is another and very different thing to appreciate the truth of a story which is founded partly or entirely upon evidence. But it is a great mistake to transfer to one of these ways of viewing the matter the mental impressions which properly belong to the other. It is like drawing the target afterwards, and then being surprised to find that the shot lies in the centre of it.
An example might help here. A man once pointed to a small target drawn on a door, which had a bullet hole right in the middle, and surprised some onlookers by claiming that he had shot it from an old shotgun at a distance of a hundred yards. His statement was technically true, but he left out an important detail. The shot was generally aimed at the barn door and happened to hit it; the target was later drawn around where the bullet had struck. A similar kind of deception often happens unconsciously in other situations. We judge events based on a similar principle, showing surprise in an equally unreasonable way, and making decisions about their occurrence based on information that is really just an afterthought of our own. Butler's comments about ‘the story of Cæsar,’ discussed earlier in the twelfth chapter, are an example of this. He takes a series of historical events and imagines a person guessing them right without having the history in front of them at the time. As I've pointed out before, it's one thing to be unlikely to correctly guess an event without specific evidence; it's another, completely different thing to understand the truth of a story that is based partly or entirely on evidence. But it's a big mistake to transfer the mental impressions from one perspective to the other. It's like drawing the target afterward and then being surprised to find that the shot is in the center of it.
§ 3. One aspect of this fallacy has been already discussed, but it will serve to clear up difficulties which are often felt upon the subject if we reexamine the question under a somewhat more general form.
§ 3. One aspect of this fallacy has already been discussed, but it will help clarify the difficulties often encountered on the subject if we take another look at the question in a slightly broader context.
In the class of examples under discussion we are generally presented with an individual which is not indeed definitely referred to a class, but in regard to which we have no great difficulty in choosing the appropriate class. Now suppose we were contemplating such an event as the throwing of sixes with a pair of dice four times running. Such a throw would be termed a very unlikely event, as the odds against its happening would be 36 × 36 × 36 × 36 − 1 to 1 or 1679615 to 1. The meaning of these phrases, as has been abundantly pointed out, is simply that the event in question occurs very rarely; that, stated with numerical accuracy, it occurs once in 1679616 times.
In the examples we're discussing, we usually come across an individual case that isn’t explicitly assigned to a class, but for which choosing the appropriate class isn’t too hard. Now, let’s say we're thinking about an event like throwing sixes with a pair of dice four times in a row. Such a throw would be called an extremely unlikely event, since the odds against it happening are 36 × 36 × 36 × 36 − 1 to 1, that is, 1,679,615 to 1. What these phrases mean, as has been clearly pointed out, is simply that this event happens very rarely; stated with numerical accuracy, it occurs once in 1,679,616 times.
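The arithmetic behind these figures can be checked in a few lines; the fragment below (added here purely for illustration) simply reproduces the numbers in the text.

```python
# Throwing double sixes with a pair of dice four times running.
single_throw = 36                # equally likely outcomes for one throw of the pair
sequences = single_throw ** 4    # equally likely four-throw sequences

print(sequences)                 # 1679616 -> the event occurs once in this many times
print(sequences - 1)             # 1679615 -> odds against are 1,679,615 to 1
print(1 / sequences)             # ~5.95e-07, the probability of the throw
```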
§ 4. But now let us make the assumption that the throw has actually occurred; let us put ourselves into the position of contemplating sixes four times running when it is known or reported that this throw has happened. The same phrase, namely that the event is a very unlikely one, will often be used in relation to it, but we shall find that this phrase may be employed to indicate, on one occasion or another, extremely different meanings.
§ 4. Now, let’s assume that the throw has really taken place; let’s imagine considering rolling sixes four times in a row when it’s already known or reported that this throw occurred. The same phrase, that this event is very unlikely, is often used in relation to it, but we will see that this phrase can have very different meanings depending on the context.
(1) There is, firstly, the most correct meaning. The 335 event, it is true, has happened, and we know what it is, and therefore, we have not really any occasion to resort to the rules of Probability; but we can nevertheless conceive ourselves as being in the position of a person who does not know, and who has only Probability to appeal to. By calling the chances 1679615 to 1 against the throw we then mean to imply the fact, that inasmuch as such a throw occurs only once in 1679616 times, our guess, were we to guess, would be correct only once in the same number of times; provided, that is, that it is a fair guess, based simply on these statistical grounds.
(1) First, there’s the most accurate meaning. The event, indeed, has happened, and we know what it is, which means we really don’t have a reason to rely on the rules of Probability. However, we can still imagine ourselves in the shoes of someone who doesn’t know and has to depend solely on Probability. By stating that the odds are 1,679,615 to 1 against that throw, we imply that since such a throw happens only once in 1,679,616 attempts, our guess, if we were to guess, would be correct only once out of that many times, assuming it’s a fair guess based purely on these statistical grounds.
§ 5. (2) But there is a second and very different conception sometimes introduced, especially when the event in question is supposed to be known, not as above by the evidence of our experience, but by the report of a witness. We may then mean by the ‘chances against the event’ (as was pointed out in Chapter XII.) not the proportional number of times we should be right in guessing the event, but the proportional number of times the witness will be right in reporting it. The bases of our inference are here shifted on to new ground. In the former case the statistics were the throws and their respective frequency, now they are the witnesses' statements and their respective truthfulness.
§ 5. (2) However, there is a second and quite different idea that sometimes comes up, especially when the event in question is assumed to be known, not through our own experiences, but by a witness's account. In this case, when we refer to the 'chances against the event' (as mentioned in Chapter XII.), we don't mean the proportion of times we would be right in guessing the event, but the proportion of times the witness will be accurate in reporting it. The basis for our conclusions has now shifted to a new perspective. In the first scenario, the statistics were about the attempts and their respective frequencies; now they are about the witness's statements and their respective accuracy.
§ 6. (3) But there is yet another meaning sometimes intended to be conveyed when persons talk of the chances against such an event as the throw in question. They may mean—not, Here is an event, how often should I have guessed it?—nor, Here is a report, how often will it be correct?—but something different from either, namely, Here is an event, how often will it be found to be produced by some one particular kind of cause?
§ 6. (3) However, there’s another meaning that people sometimes convey when discussing the odds against an event like the one in question. They might not mean, "Here’s an event, how often would I have predicted it?" or "Here’s a report, how often will it be accurate?" Instead, they're referring to something different: "Here’s an event, how often is it caused by a specific type of cause?"
When, for example, a man hears of dice giving the same throw several times running, and speaks of this as very 336 extraordinary, we shall often find that he is not merely thinking of the improbability of his guess being right, or of the report being true, but, that along with this, he is introducing the question of the throw having been produced by fair dice. There is, of course, no reason whatever why such a question as this should not also be referred to Probability, provided always that we could find the appropriate statistics by which to judge. These statistics would be composed, not of throws of the particular dice, nor of reports of the particular witness, but of the occasions on which such a throw as the one in question respectively had, and had not, been produced fairly. The objection to entering upon this view of the question would be that no such statistics are obtainable, and that if they were, we should prefer to form our opinion (on principles to be described in Chapter XVI.) from the special circumstances of the case rather than from an appeal to the average.
When, for instance, a man hears about dice giving the same throw several times in a row and speaks of this as very extraordinary, we often find that he’s not just considering the slim chance of his guess being correct or the report being accurate, but is also introducing the question of whether the throw was produced by fair dice. Of course, there’s no reason why this question shouldn’t also be referred to Probability, provided we could find the right statistics to judge by. These statistics would be made up, not of throws of the particular dice, nor of reports of the particular witness, but of the occasions on which such a throw as the one in question had, and had not, been produced by fair dice. The objection to taking this view of the question is that no such statistics are obtainable, and that even if they were, we would prefer to form our opinion (based on the principles outlined in Chapter XVI.) from the specific details of the situation rather than relying on averages.
§ 7. The reader will easily be able to supply examples in illustration of the distinctions just given; we will briefly examine but one. I hide a banknote in a certain book in a large library, and leave the room. A person tells me that, after I went out, a stranger came in, walked straight up to that particular book, and took it away with him. Many people on hearing this account would reply, How extremely improbable! On analysing the phrase, I think we shall find that certainly two, and possibly all three, of the above meanings might be involved in this exclamation. (1) What may be meant is this,—Assuming that the report is true, and the stranger innocent, a rare event has occurred. Many books might have been thus taken without that particular one being selected. I should not therefore have expected the event, and when it has happened I am surprised. Now a man has a perfect right to be surprised, but he has no 337 logical right (so long as we confine ourselves to this view) to make his surprise a ground for disbelieving the event. To do this is to fall into the fallacy described at the commencement of this chapter. The fact of my not having been likely to have guessed a thing beforehand is no reason in itself for doubting it when I am informed of it. (2) Or I may stop short of the events reported, and apply the rules of Probability to the report itself. If so, what I mean is that such a story as this now before me is of a kind very generally false, and that I cannot therefore attach much credit to it now. (3) Or I may accept the truth of the report, but doubt the fact of the stranger having taken the book at random. If so, what I mean is, that of men who take books in the way described, only a small proportion will be found to have taken them really at random; the majority will do so because they had by some means ascertained, or come to suspect, what there was inside the book.
§ 7. The reader can easily think of examples that illustrate the distinctions just mentioned; we will briefly look at just one. I hide a banknote in a certain book in a large library and then leave the room. Someone tells me that after I left, a stranger came in, walked straight up to that specific book, and took it with him. Many people, upon hearing this story, would say, "That's highly unlikely!" Analyzing this phrase, I think we can identify that certainly two, and possibly all three, of the above interpretations might be present in this reaction. (1) One possible meaning is this: assuming the report is true and the stranger is innocent, an unusual event has occurred. Many books could have been taken, yet that particular one was chosen. Therefore, I wouldn't have expected this event, and I’m surprised when it happens. Now, a person has every right to be surprised, but they don’t have a logical basis (as long as we stick to this view) to use their surprise as a reason to disbelieve the event. Falling into that logic is the same error described at the start of this chapter. The fact that I wouldn’t have predicted something beforehand doesn’t justify doubting it once I hear about it. (2) Alternatively, I could stop before the reported events and apply probability rules to the claim itself. If so, what I mean is that a story like this one is generally false, and I can’t give it much credibility. (3) Or I might accept the report as true, but question whether the stranger really took the book at random. In that case, what I mean is that among those who take books as described, only a small fraction actually do so randomly; most would have figured out or suspected what was inside the book.
Each of the above three meanings is a possible and a legitimate meaning. The only requisite is that we should be careful to ascertain which of them is present to the mind, so as to select the appropriate statistics. The first makes in itself the most legitimate use of Probability; the drawback being that at the time in question the functions of Probability are superseded by the event being otherwise known. The second or third, therefore, is the more likely meaning to be present to the mind, for in these cases Probability, if it could be practically made use of, would, at the time in question, be a means of drawing really important inferences. The drawbacks are the difficulty of finding such statistics, and the extreme disturbing influence upon these statistics of the circumstances of the special case.
Each of the three meanings mentioned above is valid and acceptable. The only requirement is that we should be careful to determine which one is being considered, so we can choose the right statistics. The first meaning represents the most legitimate application of probability; however, the downside is that during that specific time, the functions of probability are overridden by the fact that the event is already known. Therefore, the second or third meaning is more likely to be in mind, as in these cases, probability, if it could be practically applied, would serve as a way to draw genuinely important conclusions. The challenges are finding such statistics and the significant impact that the specific circumstances have on these statistics.
§ 8. (II.) Closely connected with the tendency just mentioned is that which prompts us to confound a true 338 chance selection with one which is more or less picked. When we are dealing with familiar objects in a concrete way, especially when the greater rarity corresponds to superiority of quality, almost every one has learnt to recognize the distinction. No one, for instance, on observing a fine body of troops in a foreign town, but would be prompted to ask whether they came from an average regiment or from one that was picked. When however the distinction refers to unfamiliar objects, and especially when only comparative rarity seems to be involved, the fallacy may assume a rather subtle and misleading form, and seems to deserve special notice by the consideration of a few examples.
§ 8. (II.) Closely related to the tendency just mentioned is the one that leads us to confuse a true random selection with one that is somewhat curated. When we’re dealing with familiar objects in a tangible way, especially when greater rarity indicates higher quality, almost everyone has learned to recognize the difference. For example, anyone observing a well-trained group of soldiers in a foreign city would likely wonder whether they were from an average regiment or from an elite one. However, when the distinction involves unfamiliar objects, and particularly when only relative rarity seems at play, the mistake can become quite subtle and misleading, warranting special attention through a few examples.
Sometimes the result is not so much an actual fallacy as a slight misreckoning of the order of probability of the event under consideration. For instance, in the Pyramid question, we saw that it made some difference whether we considered that π alone was to be taken into account or whether we put this constant into a class with a small number of other similar ones. In deciding, however, whether or not there is anything remarkable in the actual falling short of the representation of the number 7 in the evaluation of π (v. p. 247) the whole question turns upon considerations of this kind. The only enquiry raised is whether there is anything remarkable in this departure from the mean, and the answer depends upon whether we suppose that we are referring to a predetermined digit, or to whatever digit of the ten happens to be most above or below the average. Or, take the case raised by Cournot (Exposition de la Théorie des Chances, §§ 102, 114), that a certain deviation from the mean in the case of Departmental returns of the proportion between male and female births is significant and indicative of a difference in kind, provided that we select at random a single French Department; but that the same deviation may 339 be accidental if it is the maximum of the respective returns for several Departments.[1] The answer may be given one way or the other according as we bear this consideration in mind.
Sometimes the result isn't exactly a fallacy but more a slight misreckoning of the order of probability of the event in question. For example, in the Pyramid question, we noticed that it mattered whether we took π by itself or included it in a class with a few other similar constants. When deciding whether there’s anything noteworthy about the actual shortfall in the representation of the digit 7 in the evaluation of π (v. p. 247), the entire issue revolves around these kinds of considerations. The only question raised is whether this departure from the mean is remarkable, and the answer depends on whether we’re thinking of a specific predetermined digit or of whichever of the ten digits happens to be furthest above or below the average. Or consider Cournot's case (Exposition de la Théorie des Chances, §§ 102, 114), where a certain deviation from the mean in Departmental returns of male and female births is significant and suggests a real difference, as long as we randomly select a single French Department; however, that same deviation might just be accidental if it’s the largest among the returns for several Departments. The answer can be given one way or the other depending on whether we keep this consideration in mind.
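The difference between the two ways of putting the question can be made concrete by a small simulation. The sketch below is an illustration added here, with made-up sample sizes and thresholds rather than figures from Venn or Cournot: it compares the chance that one digit fixed in advance strays from its expected count by a given amount with the chance that some digit out of the ten does so.

```python
import random

def deviation_rates(n_digits=700, threshold=12, trials=2000, seed=0):
    """Monte-Carlo comparison of two questions about random decimal digits:
    (a) how often does the count of a digit chosen beforehand (here '7')
        deviate from its expected value by at least `threshold`?
    (b) how often does the most deviant of the ten digits do so?
    The second rate is always the larger, which is the point at issue."""
    rng = random.Random(seed)
    expected = n_digits / 10
    fixed_hits = any_hits = 0
    for _ in range(trials):
        counts = [0] * 10
        for _ in range(n_digits):
            counts[rng.randrange(10)] += 1
        deviations = [abs(c - expected) for c in counts]
        if deviations[7] >= threshold:      # the digit named in advance
            fixed_hits += 1
        if max(deviations) >= threshold:    # whichever digit strays most
            any_hits += 1
    return fixed_hits / trials, any_hits / trials

print(deviation_rates())   # the second figure is several times the first
```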
§ 9. We are peculiarly liable to be misled in this way when we are endeavouring to determine the cause of some phenomenon, by mere statistics, in entire ignorance as to the direction in which the cause should be expected. In such cases an ingenious person who chooses to look about over a large field can never fail to hit upon an explanation which is plausible in the sense that it fits in with the hitherto observed facts. With a tithe of the trouble which Mr Piazzi Smyth expended upon the measurement of the great pyramid, I think I would undertake to find plausible intimations of several of the important constants and standards which he discovered there, in the dimensions of the desk at which I am writing. The oddest instance of this sort of conclusion is perhaps to be found in the researches of a writer who has discovered[2] that there is a connection of a striking kind between the respective successes of the Oxford and the Cambridge boat in the annual race, and the greater and less frequency of sun-spots.
§ 9. We are particularly prone to being misled in this way when we try to figure out the cause of a phenomenon using only statistics, completely unaware of where we should expect the cause to come from. In these cases, a clever person who looks across a broad area can always find an explanation that seems reasonable because it aligns with the facts we've observed so far. With just a fraction of the effort that Mr. Piazzi Smyth put into measuring the Great Pyramid, I believe I could find believable hints of several important constants and standards he identified there, right in the dimensions of the desk where I'm writing. The most peculiar example of this kind of conclusion might be found in the work of a writer who discovered that there's a striking correlation between the success of the Oxford and Cambridge boats in their annual race and the more or less frequent occurrence of sunspots.
Of course our usual practical resource in such cases is to make appeal to our previous knowledge of the subject in question, which enables us to reject as absurd a great number of hypotheses which can nevertheless make a fair show when they are allowed to rest upon a limited amount of adroitly selected instances. But it must be remembered that if any theory chooses to appeal to statistics, to statistics it must be suffered to go for judgment. Even the boat race theory 340 could be established (if sound) on this ground alone. That is, if it really could be shown that experience in the long run confirmed the preponderance of successes on one side or the other according to the relative frequency of the sun-spots, we should have to accept the fact that the two classes of events were not really independent. One of the two, whichever it may be, must be suspected of causing or influencing the other; or both must be caused or influenced by some common circumstances.
Of course, our usual practical approach in these situations is to draw on what we already know about the subject, which allows us to dismiss many hypotheses that might seem plausible when based on a limited selection of cleverly chosen examples. However, it’s important to remember that if any theory wants to rely on statistics, it must be evaluated based on statistical evidence. Even the boat race theory could be validated (if it’s valid) on this basis alone. That is, if it could genuinely be demonstrated that over time, the success rates leaned towards one side or the other in relation to the frequency of sunspots, we would have to accept the reality that these two types of events are not actually independent. One of them, whichever it is, must be suspected of affecting or influencing the other; or both could be influenced by some shared factors.
§ 10. (III.) The fallacy described at the commencement of this chapter arose from determining to judge of an observed or reported event by the rules of Probability, but employing a wrong set of statistics in the process of judging. Another fallacy, closely connected with this, arises from the practice of taking some only of the characteristics of such an event, and arbitrarily confining to these the appeal to Probability. Suppose I toss up twelve pence and find that eleven of them give heads. Many persons on witnessing such an occurrence would experience a feeling which they would express by the remark, How near that was to getting all heads! And if any thing very important were staked on the throw they would be much excited at the occurrence. But in what sense were we near to twelve? There is a not uncommon error, I apprehend, which consists in unconsciously regarding the eleven heads as a thing which is already somehow secured, so that one might as it were keep them, and then take our chance for securing the remaining one. The eleven are mentally set aside, looked upon as certain (for they have already happened), and we then introduce the notion of chance merely for the twelfth. But this twelfth, having also happened, has no better claim to such a distinction than any of the others. If we will introduce the notion of chance in the case of the one that gave tail we must do the same in 341 the case of all the others as well. In other words, if the tosser be dissatisfied at the appearance of the one tail, and wish to cancel it and try his luck again, he must toss up the whole lot of pence again fairly together. In this case, of course, so far from his having a better prospect for the next throw he may think himself in very good luck if he makes again as good a throw as the one he rejected. What he is doing is confounding this case with that in which the throws are really successive. If eleven heads have been tossed up in turn, we are of course within an even chance of getting a twelfth; but the circumstances are quite different in the instance proposed.
§ 10. (III.) The fallacy mentioned at the beginning of this chapter comes from deciding to evaluate an observed or reported event using Probability rules, but using the wrong statistics for that evaluation. Another related fallacy occurs when only some characteristics of that event are considered, while limiting the appeal to Probability to just those traits. For example, if I toss twelve coins and get eleven heads, many people witnessing this might say, "Wow, that was so close to getting all heads!" If something really important was at stake in that toss, they would be very excited about this outcome. But how were we actually close to getting twelve heads? There is a common mistake where people unconsciously think of the eleven heads as a result that's already guaranteed, so they might as well keep those and just take a chance on the twelfth. The eleven are mentally set aside, seen as certain (since they have already happened), and then we only consider chance for the twelfth. However, since the twelfth has also happened, it doesn't deserve any special treatment compared to the others. If we’re going to consider the one tail as a matter of chance, we have to apply the same thinking to all the others too. In other words, if the person tossing the coins is unhappy about getting one tail and wants to redo the toss, they have to throw all the coins together again fairly. In this case, rather than having a better chance on the next toss, they should feel lucky if they get a result as good as the one they're rejecting. What they're doing is mixing this situation up with one where the tosses are genuinely successive. If eleven heads were tossed one after the other, we would have an even chance of getting a twelfth; but the situation is quite different in this example.
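The difference between the two situations can be put into figures. In the sketch below (added for illustration) the first number is the chance of getting eleven or more heads when the whole dozen is tossed afresh; the second is the chance of the twelfth head when eleven particular coins already lie heads up.

```python
from math import comb

n = 12

# Toss all twelve pence again, fairly and together:
p_eleven_or_more = (comb(n, 11) + comb(n, 12)) / 2 ** n
print(p_eleven_or_more)   # 13/4096 ≈ 0.0032 -- he is lucky to do as well again

# Eleven heads already secured, only the last coin left to chance:
print(1 / 2)              # 0.5 -- the 'even chance of a twelfth' in the successive case
```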
§ 11. In the above example the error is transparent. But in forming a judgment upon matters of greater complexity than dice and pence, especially in the case of what are called ‘narrow escapes,’ a mistake of an analogous kind is, I apprehend, far from uncommon. A person, for example, who has just experienced a narrow escape will often be filled with surprise and anxiety amounting almost to terror. The event being past, these feelings are, at the time, in strictness inappropriate. If, as is quite possible, they are merely instinctive, or the result of association, they do not fall within the province of any kind of Logic. If, however, as seems more likely, they partially arise from a supposed transference of ourselves into that point of past time at which the event was just about to happen, and the production by imagination of the feelings we should then expect to experience, this process partakes of the nature of an inference, and can be right or wrong. In other words, the alarm may be proportionate or disproportionate to the amount of danger that might fairly have been reckoned upon in such a hypothetical anticipation. If the supposed transfer were completely carried out, there would be no fallacy; but it is often very 342 incompletely done, some of the component parts of the event being supposed to be determined or ‘arranged’ (to use a sporting phrase) in the form in which we now know that they actually have happened, and only the remaining ones being fairly contemplated as future chances.
§ 11. In the example above, the error is clear. However, when making judgments about more complex issues than dice and money, especially regarding what we call 'close calls,' I think similar mistakes are quite common. For instance, someone who just had a close call often feels a mix of surprise and anxiety, almost to the point of fear. Once the event is over, these feelings are, strictly speaking, inappropriate. If they are instinctive or based on associations, they don't fit into any form of Logic. But if, as seems more likely, they stem from a perceived transfer of ourselves to the moment just before the event occurred and the imagination producing the feelings we would expect to feel then, this process resembles an inference and can be right or wrong. In other words, the panic may be either appropriate or excessive considering the actual danger that could have been anticipated. If the supposed transfer were fully realized, there would be no fallacy; however, it's often only partially done, with some parts of the event assumed to be fixed or 'arranged' (to use a sports term) in the manner we know they actually occurred, while the remaining parts are seen as future possibilities.
A man, for example, is out with a friend, whose rifle goes off by accident, and the bullet passes through his hat. He trembles with anxiety at thinking what might have happened, and perhaps remarks, ‘How very near I was to being killed!’ Now we may safely assume that he means something more than that a shot passed very close to him. He has some vague idea that, as he would probably say, ‘his chance of being killed then was very great.’ His surprise and terror may be in great part physical and instinctive, arising simply from the knowledge that the shot had passed very near him. But his mental state may be analysed, and we shall then most likely find, at bottom, a fallacy of the kind described above. To speak or think of chance in connection with the incident, is to refer the particular incident to a class of incidents of a similar character, and then to consider the comparative frequency with which the contemplated result ensues. Now the series which we may suppose to be most naturally selected in this case is one composed of shooting excursions with his friend; up to this point the proceedings are assumed to be designed, beyond it only, in the subsequent event, was there accident. Once in a thousand times perhaps on such occasions the gun will go off accidentally; one in a thousand only of those discharges will be directed near his friend's head. If we will make the accident a matter of Probability, we ought by rights in this way (to adopt the language of the first example), to ‘toss up again’ fairly. But we do not do this; we seem to assume for certain that the shot goes within an inch of our heads, detach 343 that from the notion of chance at all, and then begin to introduce this notion again for possible deflections from that saving inch.
A man is out with a friend when his friend's rifle accidentally goes off, and the bullet whizzes through his hat. He feels a surge of anxiety thinking about what could have happened, perhaps saying, ‘I was so close to being killed!’ It's safe to say he means more than just a shot being close to him. He has a vague sense that, as he might put it, ‘the chance of being killed was very high.’ His surprise and fear might largely be physical and instinctive, simply because he knows the shot was really close. However, we can analyze his mental state and probably find that, at its core, it involves a mistake of reasoning as described earlier. To talk or think about chance related to the incident is to link that specific event to a group of similar incidents and then consider how often the expected outcome happens. The series that would most likely come to mind here includes shooting trips with his friend; so far, all actions have been intentional, and only afterward did an accident occur. Perhaps, once in a thousand times, the gun might go off accidentally; only one in a thousand of those shots would be aimed near his friend's head. If we treat the accident as a matter of probability, we should logically ‘toss the dice again’ fairly. But we don’t do that; we seem to assume it’s certain the shot will go within an inch of our heads, removing it from the concept of chance entirely, and then we start to reintroduce the idea of chance regarding any deviations from that safe inch.
§ 12. (IV.) We will now notice a fallacy connected with the subjects of betting and gambling. Many or most of the popular misapprehensions on this subject imply such utter ignorance and confusion as to the foundations of the science that it would be needless to discuss them here. The following however is of a far more plausible kind, and has been a source of perplexity to persons of considerable acuteness.
§ 12. (IV.) Now, let's talk about a misconception related to betting and gambling. Many, if not most, of the common misunderstandings about this topic show a complete lack of knowledge and confusion regarding the basics of the subject, so it wouldn’t make sense to go over them here. However, the following point is much more convincing and has confused some fairly sharp individuals.
The case, put into the simplest form, is as follows.[3] Suppose that a person A is playing against B, B being either another individual or a group of individuals, say a gambling bank. They begin by tossing for a shilling, and A maintains that he is in possession of a device which will insure his winning. If he does win on the first occasion he has clearly gained his point so far. If he loses, he stakes next time two shillings instead of one. The result of course is that if he wins on the second occasion he replaces his former loss, and is left with one shilling profit as well. So he goes on, doubling his stake after every loss, with the obvious result that on the first occasion of success he makes good all his previous losses, and is left with a shilling over. But such an occasion must come sooner or later, by the assumptions of chance on which the game is founded. Hence it follows that he can insure, sooner or later, being left a final winner. Moreover he may win to any amount; firstly from the obvious consideration that he might make his initial stake as large as he pleased, a hundred pounds, for 344 instance, instead of a shilling; and secondly, because what he has done once he may do again. He may put his shilling by, and have a second spell of play, long or short as the case may be, with the same termination to it. Accordingly by mere persistency he may accumulate any sum of money he pleases, in apparent defiance of all that is meant by luck.
The case, put simply, is as follows.[3] Imagine that a person A is competing against B, where B is either another person or a group, like a gambling bank. They start by flipping a coin for a shilling, and A claims he has a method that guarantees his victory. If he wins the first time, he’s clearly achieved his goal. If he loses, he bets two shillings next instead of one. The outcome is that if he wins the second time, he recovers his previous loss and nets a shilling profit. He continues this pattern, doubling his stake after every loss, which means that the first time he wins, he compensates for all his earlier losses and ends up with a shilling profit. And, by the assumptions of chance on which the game is founded, that winning moment must come sooner or later. Thus, it's clear that he can eventually secure a definite win. Furthermore, he could potentially win any amount; first, because he could set his initial stake as high as he wanted—say, a hundred pounds instead of a shilling—and second, because whatever he has done once, he can do again. He could set aside his shilling and have another round of play, whether it's long or short, with the same outcome. Therefore, through simple persistence, he could accumulate any sum of money he desires, seemingly challenging the concept of luck.
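To make the device concrete, here is a minimal Python sketch of the doubling scheme just described. It is not part of Venn's text; the fair-coin odds, the one-shilling opening stake, and the number of repetitions are arbitrary choices made purely for illustration.

```python
# Illustrative sketch only: A doubles his stake after every loss and stops at
# his first win, with unlimited credit assumed throughout.
import random

def play_one_cycle(initial_stake=1):
    """Return (net_gain, tosses_needed) for one run of the doubling scheme."""
    stake, losses, tosses = initial_stake, 0, 0
    while True:
        tosses += 1
        if random.random() < 0.5:           # A wins this toss
            return stake - losses, tosses   # the gain is always +initial_stake
        losses += stake                      # A loses the stake just laid...
        stake *= 2                           # ...and doubles it for the next toss

random.seed(0)
results = [play_one_cycle() for _ in range(10_000)]
print({gain for gain, _ in results})          # {1}: the profit per cycle is certain
print(max(tosses for _, tosses in results))   # only the waiting time is left to chance
```

However long the run of losses, each cycle closes with a gain of exactly the opening stake; chance decides nothing but how many tosses the cycle takes.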
§ 13. I have classed this opinion among fallacies, as the present is the most convenient opportunity of discussing it, though in strictness it should rather be termed a paradox, since the conclusion is perfectly sound. The only fallacy consists in regarding such a way of obtaining the result as mysterious. On the contrary, there is nothing more easy than to insure ultimate success under the given conditions. The point is worth enquiry, from the principles it involves, and because the answers commonly given do not quite meet the difficulty. It is sometimes urged, for instance, that no bank would or does allow the speculator to choose at will the amount of his stake, but puts a limit to the amount for which it will consent to play. This is quite true, but is of course no answer to the hypothetical enquiry before us, which assumes that such a state of things is allowed. Again, it has been urged that the possibility in question turns entirely upon the fact that credit must be supposed to be given, for otherwise the fortune of the player may not hold out until his turn of luck arrives:—that, in fact, sooner or later, if he goes on long enough, his fortune will not hold out long enough, and all his gains will be swept away. It is quite true that credit is a condition of success, but it is in no sense the cause. We may suppose both parties to agree at the outset that there shall be no payments until the game be ended, A having the right to decide when it shall be considered to be ended. It still remains true that whereas in ordinary gambling, i.e. with fixed or haphazard stakes, A could 345 not ensure winning eventually to any extent, he can do so if he adopt such a scheme as the one in question. And this is the state of things which seems to call for explanation.
§ 13. I’ve categorized this opinion as a fallacy since this is the best time to discuss it, though technically it should be called a paradox, as the conclusion is absolutely valid. The only fallacy lies in viewing this method of reaching the conclusion as mysterious. In reality, ensuring eventual success under the given conditions is incredibly straightforward. The topic is worth exploring due to the principles it raises and because the usual answers don't quite address the issue. For instance, it's often claimed that no bank would let the speculator arbitrarily choose the amount of their stake, instead imposing a limit on how much they are willing to gamble. This is true, but it doesn't answer the hypothetical question at hand, which assumes that such a situation is permitted. It has also been suggested that the possibility we’re discussing entirely depends on the assumption of credit being given, as otherwise the player might run out of resources before their luck changes: that eventually, if they keep playing long enough, their resources will dwindle, leading to a complete loss of their gains. It is accurate that credit is a condition for success, but it is not the cause. We can imagine both parties agreeing from the start that there will be no payments until the game is over, with A having the authority to decide when it is officially finished. It remains true that while in typical gambling—i.e., with fixed or random stakes—A couldn't guarantee winning to any degree, they can do so if they adopt the scheme we're discussing. This scenario is what seems to require explanation.
§ 14. What causes perplexity here is the supposed fact that in some mysterious way certainty has been conjured out of uncertainty; that in a game where the detailed events are utterly inscrutable, and where the average, by supposition, shows no preference for either side, one party is nevertheless succeeding somehow in steadily drawing the luck his own way. It looks as if it were a parallel case with that of a man who should succeed by some device in permanently securing more than half of the tosses with a penny which was nevertheless to be regarded as a perfectly fair one.
§ 14. What’s confusing here is the idea that somehow certainty has emerged from uncertainty; that in a game where the specific outcomes are completely unknown, and where, theoretically, there’s no advantage for either side, one party is still managing to consistently get the odds in their favor. It seems similar to a situation where someone could somehow ensure that they win more than half of the flips of a coin that is supposed to be completely fair.
This is quite a mistake. The real fact is that A does not expose his gains to chance at all; all that he so exposes is the number of times he has to wait until he gains. Put such a case as this. I offer to give a man any sum of money he chooses to mention provided he will at once give it back again to me with one pound more. It does not need much acuteness to see that it is a matter of indifference to me whether he chooses to mention one pound, or ten, or a hundred. Now suppose that instead of leaving it to his choice which of these sums is to be selected each time, the two parties agree to leave it to chance. Let them, for instance, draw a number out of a bag each time, and let that be the sum which A gives to B under the prescribed conditions. The case is not altered. A still gains his pound each time, for the introduction of the element of chance has not in any way touched this. All that it does is to make this pound the result of an uncertain subtraction, sometimes 10 minus 9, sometimes 50 minus 49, and so on. It is these numbers only, not their difference, which he submits to luck, and this is of no consequence whatever.
This is quite a mistake. The truth is that A doesn’t leave his gains up to chance at all; all he exposes is how many times he has to wait until he gains. Consider this scenario: I offer to give a person any amount of money they choose, as long as they immediately give it back to me with one extra pound. It doesn’t take much insight to see that it doesn’t matter to me whether they choose one pound, ten, or a hundred. Now suppose instead of letting them choose which amount to pick each time, the two parties agree to leave it up to chance. For instance, they can draw a number from a bag each time, and that will be the amount A gives to B under the agreed conditions. The situation hasn’t changed. A still gains his pound each time, because introducing chance hasn’t affected this at all. All it does is turn that pound into the result of an uncertain subtraction, sometimes 10 minus 9, sometimes 50 minus 49, and so on. It’s these numbers alone, not their difference, that he leaves up to luck, and that doesn’t matter at all.
To suggest to any individual or company that they should consent to go on playing upon such terms as these would be too barefaced a proposal. And yet the case in question is identical in principle, and almost identical in form, with this. To offer to give a man any sum he likes to name provided he gives you back again that same sum plus one, and to offer him any number of terms he pleases of the series 1, 2, 4, 8, 16, &c., provided you have the next term of the set, are equivalent. The only difference is that in the latter case the result is attained with somewhat more of arithmetical parade. Similarly equivalent are the processes in case we prefer to leave it to chance, instead of to choice, to decide what sum or what number of terms shall be fixed upon. This latter is what is really done in the case in question. A man who consents to go on doubling his stake every time he wins, is leaving nothing else to chance than the determination of the particular number of terms of such a geometrical series which shall be allowed to pass before he stops.
To suggest to anyone or any company that they should agree to keep playing under these terms would be an incredibly bold proposal. Yet, the situation we're discussing is essentially the same in principle and nearly the same in form. Offering to give someone any amount they choose as long as they return that same amount plus one, and offering them any number of terms they want from the sequence 1, 2, 4, 8, 16, etc., as long as you have the next term in the series, are equivalent. The only difference is that in the second case, the result is achieved with a bit more mathematical showmanship. The processes are similarly equivalent if we leave the decision of what sum or how many terms to chance instead of choice. This is exactly what happens in the scenario we are discussing. A person who agrees to keep doubling their stake after every loss is leaving nothing to chance other than how many terms of that geometric series will pass before they stop.
§ 15. It may be added that there is no special virtue in the particular series in question, viz. that in accordance with which the stake is doubled each time. All that is needed is that the last term of the series should more than balance all the preceding ones. Any other series which increased faster than this geometrical one, would answer the purpose as well or better. Nor is it necessary, again, that the game should be an even or ‘fair’ one. Chance, be it remembered, affects nothing here but the number of terms to which the series attains on each occasion, its final result being always arithmetically fixed. When a penny is tossed up it is only on one of every two occasions that the series runs to more than two terms, and so his fixed gains come in pretty regularly. But unless he was playing for a limited time only, it would 347 not affect him if the series ran to two hundred terms; it would merely take him somewhat longer to win his stakes. A man might safely, for instance, continue to lay an even bet that he would get the single prize in a lottery of a thousand tickets, provided he thus doubled, or more than doubled, his stake each time, and unlimited credit was given.
§ 15. It’s worth noting that there’s no special advantage in the specific series being discussed, namely, the one where the stake is doubled every time. What really matters is that the last term of the series outweighs all the previous ones. Any other series that grows faster than this geometric series would work just as well or even better. It’s also important to understand that it doesn’t have to be a fair game. Chance only influences how many terms the series reaches each time; the final result is always mathematically determined. When a coin is tossed, it’s only on one out of every two times that the series extends beyond two terms, so his fixed gains come in fairly consistently. However, unless he’s playing for a limited time, it wouldn’t matter if the series went up to two hundred terms; it would just take him a bit longer to win his bets. For example, a person could safely keep making an even bet that he would win the single prize in a lottery with a thousand tickets, as long as he doubled his stake each time or more, and had unlimited credit.
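As a rough check on this point, the same doubling device can be applied to a decidedly 'unfair' bet. The sketch below is illustrative only; the one-in-a-thousand lottery, the even-money payoff, and the number of cycles are assumed figures.

```python
# Sketch: an even bet on a 1-in-1,000 event, doubling the stake after each loss.
# Unlimited credit and unlimited time are assumed; Python's integers cope with
# the enormous stakes that a long losing run produces.
import random

def unfair_cycle(p_win=1/1000, initial_stake=1):
    stake, losses, rounds = initial_stake, 0, 0
    while True:
        rounds += 1
        if random.random() < p_win:
            return stake - losses, rounds   # still +initial_stake, as before
        losses += stake
        stake *= 2                          # any faster-growing series would also do

random.seed(1)
gains, waits = zip(*(unfair_cycle() for _ in range(200)))
print(set(gains))                # {1}: the final result is fixed arithmetically
print(sum(waits) / len(waits))   # roughly 1,000 rounds per win, on average
```

The unfairness of the game only lengthens the wait and swells the interim stakes; it never touches the fixed final gain.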
§ 16. So regarded, the problem is simple enough, but there are two points in it to which attention may conveniently be directed.
§ 16. Viewed this way, the problem is actually pretty straightforward, but there are two aspects of it that we should focus on.
In the first place, it serves very pointedly to remind us of the distinction between a series of events (in this case the tosses of the penny) which really are subjects of chance, and our conduct founded upon these events, which may or may not be so subject.[4] It is quite possible that this latter may be so contrived as to be in many respects a matter of absolute certainty,—a consideration, I presume, familiar enough to professional betting men. Why is the ordinary way of betting on the throws of a penny fair to both parties? Because a ‘fair’ series is ‘fairly’ treated. The heads and tails occur at random, but on an average equally often, and the stakes are either fixed or also arranged at random. If a man backs heads every time for the same amount, he will of course in the long run neither win nor lose. Neither will he if he varies the stake every time, provided he does not vary it in such a way as to make its amount dependent on the fact of his having won or lost the time before. But he may, if he pleases, and the other party consents, so arrange his stakes (as in the case in question) that Chance, if one might so express it, does not get a fair chance. Here the human elements of choice and design have been so brought to bear upon a series of events which, regarded by themselves, 348 exhibit nothing but the physical characteristics of chance, that the latter elements disappear, and we get a result which is arithmetically certain. Other analogous instances might be suggested, but the one before us has the merit of most ingeniously disguising the actual process.
First of all, it really highlights the difference between a series of events (in this case, the coin tosses) that are truly random and our actions based on those events, which could be random or not.[4] It’s entirely possible that our approach can be arranged in such a way that it becomes a matter of complete certainty—something, I assume, that professional gamblers know well. Why is the usual method of betting on coin tosses fair for both sides? Because a 'fair' series is treated 'fairly.' Heads and tails come up randomly but on average equally often, and the bets are either fixed or also arranged randomly. If someone bets on heads every time for the same amount, they will, over time, neither win nor lose. Nor will they win or lose if they change the stake each time—as long as they don't base the amount on whether they won or lost previously. However, they can choose to arrange their bets (like in the case we’re discussing) so that Chance, if I may phrase it this way, doesn’t get a fair shot. Here, the human aspects of choice and strategy have been applied to a series of events that on their own display only the physical traits of randomness, so that the element of chance drops out and we get a mathematically certain outcome. Other similar examples could be mentioned, but the one at hand is particularly clever in hiding the actual process.
§ 17. The meaning of the remark just made will be better seen by a comparison with the following case. It has been attempted[5] to explain the preponderance of male births over female by assuming that the chances of the two are equal, but that the general desire to have a male heir tends to induce many unions to persist until the occurrence of this event, and no longer. It is supposed that in this way there would be a slight preponderance of families which consisted of one son only, or of two sons and one daughter, and so forth.
§ 17. The meaning of the remark just made will be clearer when compared to the following case. It has been attempted[5] to explain the greater number of male births compared to female ones by suggesting that the chances of both are equal, but that the general wish for a male heir causes many couples to keep having children until a son arrives, and no longer. It is believed that as a result, there would be a slight majority of families with only one son or with two sons and one daughter, and so on.
This is quite fallacious (as had been noticed by Laplace, in his Essai); and there could not be a better instance chosen than this to show just what we can do and what we cannot do in the way of altering the luck in a real chance-succession of events. To suppose that the number of actual births could be influenced in the way in question is exactly the same thing as to suppose that a number of gamblers could increase the ratio of heads to tails, to something over one-half, by each handing the coin to his neighbour as soon as he had thrown a head: that they have only to leave off as soon as head has appeared; an absurdity which we need not pause to explain at this stage. The essential point about the ‘Martingale’ is that, whereas the occurrence of the events on which the stakes are laid is unaffected, the stakes themselves can be so adjusted as to make the luck swing one way.
This is quite misleading (as noted by Laplace in his Essay); and there's no better example to illustrate what we can and cannot do when it comes to changing luck in a series of random events. Assuming that the actual number of births could be affected in this way is just like thinking that a group of gamblers could raise the ratio of heads to tails above fifty percent by passing the coin to their neighbor each time they flipped a head and stopping as soon as they got a head—it's an absurdity we don’t need to explain right now. The key point about the 'Martingale' is that, while the outcome of the events being bet on remains unchanged, the stakes can be adjusted to make luck tilt in a certain direction.
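A short simulation makes it plain why the stopping rule cannot shift the ratio. This is an editorial illustration, not part of the original; the 50/50 birth chance and the number of families are assumed figures.

```python
# Each family keeps having children until the first boy, then stops.  Births
# are independent 50/50 events, so the stopping rule leaves the overall
# proportion of boys among all births at one half.
import random

random.seed(2)
boys = girls = 0
for _ in range(100_000):            # 100,000 hypothetical families
    while True:
        if random.random() < 0.5:   # a boy arrives and the family stops
            boys += 1
            break
        girls += 1                  # a girl arrives and the family continues
print(boys, girls, round(boys / (boys + girls), 3))   # the ratio stays close to 0.5
```

Every family does end with a boy, but that surplus of boy-final families is exactly paid for by the girls born along the way, just as the gamblers handing on the coin cannot raise the proportion of heads.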
§ 18. In the second place, this example brings before us what has had to be so often mentioned already, namely, that the series of Probability are in strictness supposed to be interminable. If therefore we allow either party to call upon us to stop, especially at a point which just happens to suit him, we may get results decidedly opposed to the integrity of the theory. In the case before us it is a necessary stipulation for A that he may be allowed to leave off when he wishes, that is at one of the points at which the throw is in his favour. Without this stipulation he may be left a loser to any amount.
§ 18. Secondly, this example highlights something that has been frequently mentioned, namely that the series with which Probability deals are, strictly speaking, supposed to be endless. If we let either party call for a stop, especially at a point that conveniently suits them, we may end up with results that are clearly against the integrity of the theory. In this case, it’s essential for A to be allowed to stop whenever he wants, that is, at one of the points where the outcome is in his favor. Without this option, he could end up losing a considerable amount.
Introduce the supposition that one party may arbitrarily call for a stoppage when it suits him and refuse to permit it sooner, and almost any system of what would be otherwise fair play may be converted into a very one-sided arrangement. Indeed, in the case in question, A need not adopt this device of doubling the stakes every time he loses. He may play with a fixed stake, and nevertheless insure that one party shall win any assigned sum, assuming that the game is even and that he is permitted to play on credit.
Introduce the idea that one party can arbitrarily decide to stop the game whenever it suits them and refuse to stop it earlier, and almost any system that would normally be fair can become very biased. In fact, in this situation, A doesn’t have to choose the tactic of doubling the stakes every time he loses. He can play with a fixed stake and still ensure that one party will win a specific amount, assuming that the game is fair and that he’s allowed to play on credit.
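The following sketch shows the fixed-stake version of the trick: play a fair game on credit and simply refuse to settle until the balance happens to reach a chosen figure. It is illustrative only; the target, the one-unit stake, and the number of trials are arbitrary.

```python
# Fixed one-unit stakes on a fair toss; A stops only when he is 'target' ahead.
# A fair game, played on unlimited credit, reaches any positive balance sooner
# or later, though individual runs can occasionally be very long.
import random

def play_until_ahead(target=5):
    balance, tosses = 0, 0
    while balance < target:
        tosses += 1
        balance += 1 if random.random() < 0.5 else -1
    return tosses

random.seed(3)
waits = sorted(play_until_ahead() for _ in range(100))
print(waits[0], waits[len(waits) // 2], waits[-1])   # shortest, median and longest waits
```

The certainty again comes from the stopping rule, not from any hold over the tosses themselves; without the right to stop at a moment of his own choosing, A could equally well be left a loser to any amount.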
§ 19. (V.) A common mistake is to assume that a very unlikely thing will not happen at all. It is a mistake which, when thus stated in words, is too obvious to be committed, for the meaning of an unlikely thing is one that happens at rare intervals; if it were not assumed that the event would happen sometimes it would not be called unlikely, but impossible. This is an error which could scarcely occur except in vague popular misapprehension, and is so abundantly refuted in works on Probability, that it need only be touched upon briefly here. It follows of course, from our definition of Probability, that to speak of a very rare combination of events as one that is ‘sure never to happen,’ is to use language incorrectly. Such a phrase may pass current as a 350 loose popular exaggeration, but in strictness it involves a contradiction. The truth about such rare events cannot be better described than in the following quotation from De Morgan:[6]—
§ 19. (V.) A common mistake is to think that something very unlikely will never happen. This assumption, when clearly stated, seems too obvious to be made, because something unlikely is defined as occurring only rarely; if we didn't believe the event could happen at all, it wouldn't be considered unlikely, but impossible. This error usually arises from vague public misunderstandings and is so well addressed in Probability literature that we only need to mention it briefly here. According to our definition of Probability, to refer to an extremely rare combination of events as one that is ‘sure never to happen’ is incorrect. While such a phrase might be acceptable as a loose exaggeration in everyday conversation, it is technically contradictory. The truth regarding these rare events is best expressed in the following quotation from De Morgan:[6]—
“It is said that no person ever does arrive at such extremely improbable cases as the one just cited [drawing the same ball five times running out of a bag containing twenty balls]. That a given individual should never throw an ace twelve times running on a single die, is by far the most likely; indeed, so remote are the chances of such an event in any twelve trials (more than 2,000,000,000 to 1 against it) that it is unlikely the experience of any given country, in any given century, should furnish it. But let us stop for a moment, and ask ourselves to what this argument applies. A person who rarely touches dice will hardly believe that doublets sometimes occur three times running; one who handles them frequently knows that such is sometimes the fact. Every very practised user of those implements has seen still rarer sequences. Now suppose that a society of persons had thrown the dice so often as to secure a run of six aces observed and recorded, the preceding argument would still be used against twelve. And if another society had practised long enough to see twelve aces following each other, they might still employ the same method of doubting as to a run of twenty-four; and so on, ad infinitum. The power of imagining cases which contain long combinations so much exceeds that of exhibiting and arranging them, that it is easy to assign a telegraph which should make a separate signal for every grain of sand in a globe as large as the visible universe, upon the hypothesis of the most space-penetrating astronomer. The fallacy of the preceding objection lies in supposing events in number beyond our experience, composed 351 entirely of sequences such as fall within our experience. It makes the past necessarily contain the whole, as to the quality of its components; and judges by samples. Now the least cautious buyer of grain requires to examine a handful before he judges of a bushel, and a bushel before he judges of a load. But relatively to such enormous numbers of combinations as are frequently proposed, our experience does not deserve the title of a handful as compared with a bushel, or even of a single grain.”
“It’s said that no one ever does encounter such incredibly unlikely scenarios as the one just mentioned [drawing the same ball five times in a row out of a bag containing twenty balls]. The chances of a person rolling an ace twelve times in a row on a single die are incredibly low; in fact, the odds of that happening in any twelve attempts are more than 2,000,000,000 to 1 against it, making it unlikely that any country, in any century, would actually witness it. But let’s pause for a moment and consider what this argument applies to. Someone who rarely uses dice is unlikely to believe that doubles can appear three times in a row; however, someone who frequently uses them knows that it can happen. Every seasoned player has seen even more unusual sequences. Now imagine a group of people who have rolled the dice so often that they’ve recorded a run of six aces; the same argument would still be used against the idea of rolling twelve. And if another group had practiced long enough to roll twelve aces in a row, they could still doubt the likelihood of rolling twenty-four; and so on, ad infinitum. The ability to conceive of scenarios involving long combinations far surpasses the ability to display and arrange them, making it easy to imagine a telegraph that could signal each grain of sand in a globe as large as the observable universe, according to the most far-reaching astronomer. The flaw in the earlier argument lies in assuming events occurring in numbers beyond our experience are made up entirely of sequences we have experienced. It implies the past must encompass the whole, regarding the quality of its parts; and judges based on samples. Yet even the least cautious buyer of grain checks a handful before judging a bushel, and a bushel before judging a load. In relation to the enormous numbers of combinations typically proposed, our experience barely qualifies as a handful compared to a bushel, or even a single grain.”
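De Morgan's figure is easy to verify. The one-line check below is an editorial aside, not part of the quotation, and assumes an ordinary fair six-sided die.

```python
# Chance of throwing an ace twelve times running with a single fair die.
p = (1 / 6) ** 12
print(round(1 / p))   # 2176782336, i.e. rather more than 2,000,000,000 to 1 against
```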
§ 20. The origin of this inveterate mistake is not difficult to be accounted for. It arises, no doubt, from the exigencies of our practical life. No man can bear in mind every contingency to which he may be exposed. If therefore we are ever to do anything at all in the world, a large number of the rarer contingencies must be left entirely out of account. And the necessity of this oblivion is strengthened by the shortness of our life. Mathematically speaking, it would be said to be certain that any one who lives long enough will be bitten by a mad dog, for the event is not an impossible, but only an improbable one, and must therefore come to pass in time. But this and an indefinite number of other disagreeable contingencies have on most occasions to be entirely ignored in practice, and thence they come almost necessarily to drop equally out of our thought and expectation. And when the event is one in itself of no importance, like a rare throw of the dice, a great effort of imagination may be required, on the part of persons not accustomed to abstract mathematical calculation, to enable them to realize the throw as being even possible.
§ 20. The reason for this persistent mistake is not hard to figure out. It likely comes from the demands of our everyday lives. No one can keep track of every possible situation they might face. Therefore, if we are ever going to do anything in the world, we have to completely overlook a lot of rare situations. This need to forget is made even more pressing by the brevity of our lives. Mathematically speaking, we could say it’s certain that anyone who lives long enough will eventually be bitten by a rabid dog, since that event is not impossible, just unlikely, and must happen eventually. However, this and countless other unpleasant situations usually have to be completely ignored in practice, which leads us to forget about them almost entirely. When the event is something trivial, like a rare throw of the dice, it takes a significant stretch of the imagination for people who aren’t used to abstract mathematical thinking to even grasp that the outcome is possible.
Attempts have sometimes been made to estimate what extremity of unlikelihood ought to be considered as equivalent to this practical zero point of belief. In so far as such attempts are carried out by logicians, or by those who 352 are unwilling to resort to mathematical valuation of chances, they must be regarded as merely a special form of the modal difficulties discussed in the last chapter, and need not therefore be reconsidered here; but a word or two may be added concerning the views of some who have looked at the matter from the mathematician's point of view.
Attempts have sometimes been made to determine what level of unlikelihood should be seen as equivalent to this practical zero point of belief. To the extent that such attempts are made by logicians or by those who don't want to use mathematical evaluation of probabilities, they should be seen as just a specific version of the modal challenges discussed in the last chapter, so there’s no need to go over them again here; however, a few words might be added about the perspectives of some who have examined the issue from a mathematician's viewpoint.
The principal of these is perhaps Buffon. He has arrived at the estimate (Arithmétique Morale § VIII.) that this practical zero is equivalent to a chance of 1/10,000. The grounds for selecting this fraction are found in the fact that, according to the tables of mortality accessible to him, it represents the chance of a man of 56 dying in the course of the next day. But since no man under common circumstances takes the chance into the slightest consideration, it follows that it is practically estimated as having no value.
The main one of these is probably Buffon. He has calculated (in Arithmétique Morale § VIII.) that this practical zero is equivalent to a chance of 1/10,000. The reason for choosing this fraction comes from the fact that, according to the mortality tables available to him, it reflects the chance of a 56-year-old man dying within the next day. However, since no man in normal circumstances considers this risk at all, it is practically regarded as having no value.
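For a sense of scale, here is a brief editorial sketch using assumed round figures rather than Buffon's own table: a daily chance of 1/10,000 corresponds to an annual chance of roughly 3.6 per cent.

```python
# If the chance of dying on any given day is 1/10,000, the chance of dying at
# some point in a 365-day year is 1 - (1 - 1/10,000)^365.
p_day = 1 / 10_000
p_year = 1 - (1 - p_day) ** 365
print(round(p_year, 4))   # about 0.0359, i.e. roughly 3.6 per cent in the year
```

Whether that figure matches the tables Buffon actually used is beside the point here; the sketch only shows what the chosen fraction amounts to over a year.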
It is obvious that this result is almost entirely arbitrary, and in fact his reasons cannot be regarded as anything more than a slender justification from experience for adopting a conveniently simple fraction; a justification however which would apparently have been equally available in the case of any other fractions lying within wide limits of the one selected.[7]
It’s clear that this figure is almost entirely arbitrary, and his reasoning can’t really be seen as more than a weak justification based on experience for choosing a conveniently simple fraction; a justification that, it seems, could just as easily apply to any other fractions within a broad range of the one chosen.[7]
§ 21. There is one particular form of this error, which, from the importance occasionally attached to it, deserves perhaps more special examination. As stated above, there can be no doubt that, however unlikely an event may be, if we (loosely speaking) vary the circumstances sufficiently, or 353 if, in other words, we keep on trying long enough, we shall meet with such an event at last. If we toss up a pair of dice a few times we shall get doublets; if we try longer with three we shall get triplets, and so on. However unusual the event may be, even were it sixes a thousand times running, it will come some time or other if we have only patience and vitality enough. Now apply this result to the letters of the alphabet. Suppose that one letter at a time is drawn from a bag which contains them all, and is then replaced. If the letters were written down one after another as they occurred, it would commonly be expected that they would be found to make mere nonsense, and would never arrange themselves into the words of any language known to men. No more they would in general, but it is a commonly accepted result of the theory, and one which we may assume the reader to be ready to admit without further discussion, that, if the process were continued long enough, words making sense would appear; nay more, that any book we chose to mention,—Milton's Paradise Lost or the plays of Shakespeare, for example,—would be produced in this way at last. It would take more days than we have space in this volume to represent in figures, to make tolerably certain of obtaining the former of these works by thus drawing letters out of a bag, but the desired result would be obtained at length.[8] 354 Now many people have not unnaturally thought it derogatory to genius to suggest that its productions could have also been obtained by chance, whilst others have gone on to argue, If this be the case, might not the world itself in this manner have been produced by chance?
§ 21. There’s a specific version of this mistake that, because of its occasional significance, deserves a closer look. As mentioned earlier, it’s clear that no matter how unlikely an event is, if we change the circumstances enough, or if we keep trying long enough, we will eventually encounter that event. If we roll a pair of dice a few times, we'll get doubles; if we keep going with three dice, we’ll get triples, and so on. No matter how rare the event may be, even if it were rolling sixes a thousand times in a row, it will eventually happen if we just have enough patience and energy. Now, let’s apply this idea to the letters of the alphabet. Imagine drawing one letter at a time from a bag that contains all of them, then putting it back. If we wrote down the letters as they were drawn, we would usually expect them to form nothing but gibberish, never arranging themselves into the words of any known language. And in general they would not, but it is a widely accepted conclusion of the theory, which we can assume the reader is willing to accept without further debate, that if we kept this process going long enough, meaningful words would eventually show up; moreover, any book we could mention—like Milton’s Paradise Lost or the plays of Shakespeare—would ultimately be produced this way. To be reasonably sure of obtaining the former of these works by drawing letters out of a bag would take a number of days too large to write out in figures within this volume, but the desired result would come at last.[8] Now, many people understandably find it belittling to genius to suggest that its creations could also be the result of chance, while others have argued, if that’s true, couldn’t the world itself have been created by chance in the same way?
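To see how quickly the waiting times outrun anything writable, here is a small sketch. It is an editorial illustration; the 26-letter alphabet, the sample lengths, and the neglect of spaces and punctuation are simplifying assumptions.

```python
# Drawing letters at random, with replacement, from a 26-letter bag: the
# expected number of draws before a chosen text of n letters first appears is
# of the order 26**n, so its size is easiest to report as a count of digits.
import math

for n in (8, 60, 450_000):      # a word, a line of verse, very roughly a long epic
    digits = n * math.log10(26)
    print(f"{n:>7} letters: about 10^{digits:,.0f} draws needed")
```

Even the eight-letter case calls for hundreds of billions of draws; the figure for a whole book is a number with some six hundred thousand digits.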
§ 22. We will begin with the comparatively simple, determinate, and intelligible problem of the possible production of the works of a great human genius by chance. With regard to this possibility, it may be a consolation to some timid minds to be reminded that the power of producing the works of a Shakespeare, in time, is not confined to consummate genius and to mere chance. There is a third alternative, viz. that of purely mechanical procedure. Any one, down almost to an idiot, might do it, if he took sufficient time about the task. For suppose that the required number of letters were procured and arranged, not by chance, but designedly, and according to rules suggested by the theory of permutations: the letters of the alphabet and the number of them to be employed being finite, every order in which they could occur would come in its due turn, and therefore every thing which can be expressed in language would be arrived at some time or other.
§ 22. Let's start with the relatively simple, clear, and understandable issue of whether the works of a great human genius could come about by chance. For those who are a bit anxious, it might be reassuring to remember that the ability to produce works like Shakespeare's, over time, isn’t just limited to extraordinary talent and sheer luck. There's a third possibility, namely a purely mechanical procedure. Anyone, even someone with limited intellect, could potentially do it if they took enough time. Imagine if you gathered the necessary letters and arranged them not randomly, but purposefully, using rules based on permutation theory: since the alphabet has a finite number of letters, and only a finite number of them are to be used, every order in which they could occur would come up in its due turn, allowing everything that can be expressed in language to be reached at some point.
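A sketch of such a mechanical enumeration follows; it is illustrative only, with a three-letter alphabet and short lengths chosen so the output stays readable.

```python
# List every arrangement of a finite alphabet, length by length and in a fixed
# order.  With the full alphabet and enough patience the same loop would, in
# its due turn, print any text one cares to name.
from itertools import product

alphabet = "abc"
for length in range(1, 4):
    for combo in product(alphabet, repeat=length):
        print("".join(combo))
```

Nothing here is left to chance; the only question is how long one is prepared to wait.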
There is really nothing that need shock any one in such a result. Its possibility arises from the following cause. 355 The number of letters, and therefore of words, at our disposal is limited; whatever therefore we may desire to express in language necessarily becomes subject to corresponding limitation. The possible variations of thought are literally infinite, so are those of spoken language (by intonation of the voice, &c.); but when we come to words there is a limitation, the nature of which is distinctly conceivable by the mind, though the restriction is one that in practice will never be appreciable, owing to the fact that the number of combinations which may be produced is so enormous as to surpass all power of the imagination to realize.[9] The answer therefore is plain, and it is one that will apply to many other cases as well, that to put a finite limit upon the number of ways in which a thing can be done, is to determine that any one who is able and willing to try long enough shall succeed in doing it. If a great genius condescends to perform it under these circumstances, he must submit to the possibility of having his claims rivalled or disputed by the chance-man and idiot. If Shakespeare were limited to the use of eight or nine assigned words, the time within which the latter agents might claim equality with him would not be very great. As it is, having had the range of the English language at his disposal, his reputation is not in danger of being assailed by any such methods.
There’s really nothing surprising about such a result. Its possibility stems from the following reason. The number of letters, and therefore words, we have to work with is limited; so whatever we want to express in words is inevitably subject to this limitation. The potential variations of thought are literally endless, and so are those in spoken language (like tone of voice, etc.); but when it comes to actual words, there is a limit, which can be clearly understood, even though in practice, it won’t seem significant because the number of combinations possible is so vast that it exceeds anyone’s imagination to truly grasp.[9] The answer is straightforward, and it applies to many other situations as well: putting a finite limit on the number of ways something can be done means that anyone who is able and willing to try long enough will eventually succeed at it. If a great genius chooses to perform it under these circumstances, they must accept the possibility of their claims being challenged or disputed by an average person or a fool. If Shakespeare had only eight or nine specific words to use, the time before others could claim equality with him wouldn’t be very long. As it is, having had access to the full range of the English language, his reputation isn’t at risk from such comparisons.
§ 23. The case of the possible production of the world by chance leads us into an altogether different region of discussion. We are not here dealing with figures the nature and use of which are within the fair powers of the understanding, however the imagination may break down in attempting to realize the smallest fraction of their full significance. 356 The understanding itself is wandering out of its proper province, for the conditions of the problem cannot be assigned. When we draw letters out of a bag we know very well what we are doing; but what is really meant by producing a world by chance? By analogy of the former case, we may assume that some kind of agent is presupposed;—perhaps therefore the following supposition is less absurd than any other. Imagine some being, not a Creator but a sort of Demiurgus, who has had a quantity of materials put into his hands, and he assigns them their collocations and their laws of action, blindly and at haphazard: what are the odds that such a world as we actually experience should have been brought about in this way?
§ 23. The idea of the world coming into existence by chance leads us into a completely different discussion. We are no longer dealing with numbers whose nature and use lie within the fair grasp of the understanding, however much the imagination may fail to realize even the smallest fraction of their full meaning. The understanding itself is going beyond its proper limits, as we cannot clearly define the conditions of the problem. When we pull letters from a bag, we know exactly what we're doing; but what does it really mean to create a world by chance? By analogy to the previous case, we might assume that some kind of agent is involved; therefore, the following idea might be less far-fetched than others. Imagine a being, not a Creator but more like a Demiurge, who has been given a collection of materials and randomly assigns their combinations and laws of action. What are the chances that such a world as we actually experience came about in this way?
If it were worth while seriously to set about answering such a question, and if some one would furnish us with the number of the letters of such an alphabet, and the length of the work to be written with them, we could proceed to indicate the result. But so much as this may surely be affirmed about it;—that, far from merely finding the length of this small volume insufficient for containing the figures in which the adverse odds would be given, all the paper which the world has hitherto produced would be used up before we had got far on our way in writing them down.
If it were worthwhile to seriously try to answer such a question, and if someone could provide us with the number of letters in that alphabet and the length of the work that needs to be written using them, we could then show the outcome. But at the very least, we can say this: far from just finding the length of this small volume too short to contain the figures reflecting the unfavorable odds, all the paper that the world has produced up to now would be exhausted before we even made much progress in writing them down.
§ 24. The most seductive form in which the difficulty about the occurrence of very rare events generally presents itself is probably this. ‘You admit (some persons will be disposed to say) that such an event may sometimes happen; nay, that it does sometimes happen in the infinite course of time. How then am I to know that this occasion is not one of these possible occurrences?’ To this, one answer only can be given,—the same which must always be given where statistics and probability are concerned,—‘The present may be such an occasion, but it is inconceivably unlikely that it 357 should be one. Amongst countless billions of times in which you, and such as you, urge this, one person only will be justified; and it is not likely that you are that one, or that this is that occasion.’
§ 24. The most tempting way the challenge about the occurrence of very rare events usually shows up is probably like this. ‘You agree (some people might say) that such an event can sometimes happen; in fact, it does happen sometimes over an infinite amount of time. So how can I be sure that this moment isn’t one of those possible occurrences?’ To this, there’s only one response to give—the same one that should always be given when it comes to statistics and probability—‘This might be such a moment, but it is incredibly unlikely that it is one. Out of countless billions of times you and others like you raise this, only one person will be justified; and it’s not likely that you are that one, or that this is that moment.’
§ 25. There is another form of this practical inability to distinguish between one high number and another in the estimation of chances, which deserves passing notice from its importance in arguments about heredity. People will often urge an objection to the doctrine that qualities, mental and bodily, are transmitted from the parents to the offspring, on the ground that there are a multitude of instances to the contrary, in fact a great majority of such instances. To raise this objection implies an utter want of appreciation of the very great odds which possibly may exist, and which the argument in support of heredity implies do exist against any given person being distinguished for intellectual or other eminence. This is doubtless partly a matter of definition, depending upon the degree of rarity which we consider to be implied by eminence; but taking any reasonable sense of the term, we shall readily see that a very great proportion of failures may still leave an enormous preponderance of evidence in favour of the heredity doctrine. Take, for instance, that degree of eminence which is implied by being one of four thousand. This is a considerable distinction, though, since there are about two thousand such persons to be found amongst the total adult male population of Great Britain, it is far from implying any conspicuous genius. Now suppose that in examining the cases of a large number of the children of such persons, we had found that 199 out of 200 of them failed to reach the same distinction. Many persons would conclude that this was pretty conclusive evidence against any hereditary transmission. To be able to adduce only one favourable, as against 199 hostile instances, would 358 to them represent the entire break-down of any such theory. The error, of course, is obvious enough, and one which, with the figures thus before him, hardly any one could fail to avoid. But if one may judge from common conversation and other such sources of information, it is found in practice exceedingly difficult adequately to retain the conviction that even though only one in 200 instances were favourable, this would represent odds of about 20 to 1 in favour of the theory. If hereditary transmission did not prevail, only one in 4000 sons would thus rival their fathers; but we find actually, let us say (we are of course taking imaginary proportions here), that one in 200 does. Hence, if the statistics are large enough to be satisfactory, there has been some influence at work which has improved the chances of mere coincidence in the ratio of 20 to 1. We are in fact so little able to realise the meaning of very large numbers,—that is, to retain the ratios in the mind, where large numbers are concerned,—that unless we repeatedly check ourselves by arithmetical considerations we are too apt to treat and estimate all beyond certain limits as equally vast and vague.
§ 25. There is another way this practical inability to tell one high number from another in assessing chances shows up, and it’s important in discussions about heredity. People often argue against the idea that traits, both mental and physical, are passed down from parents to children because there are many cases that suggest otherwise, in fact, a majority of them. Raising this argument shows a complete lack of understanding of the significant odds that might exist and which the argument supporting heredity suggests do exist against any individual being recognized for intellectual or other exceptional qualities. This is definitely partly a matter of definition, depending on how rare we consider something to be when we say it’s exceptional; but using any reasonable interpretation of the term, it becomes clear that a large number of failures can still leave strong evidence in favor of the heredity argument. For example, consider the level of distinction represented by being one in four thousand. This is quite an achievement, but since there are about two thousand people who fit this description among the total adult male population of Great Britain, it doesn’t necessarily imply any remarkable genius. Now, if we look at a large number of the children of such individuals and find that 199 out of 200 of them did not achieve the same level of distinction, many would conclude that this serves as solid evidence against any hereditary transmission. To them, having only one favorable case against 199 unfavorable instances would represent the complete collapse of any such theory. The mistake, of course, is pretty obvious, and with those numbers in front of them, hardly anyone could miss it. But judging by everyday conversations and similar sources of information, it becomes incredibly challenging to truly hold onto the belief that even if only one in 200 cases is favorable, this still represents odds of about 20 to 1 in favor of the theory. If hereditary transmission didn’t occur, only one in 4000 sons would rival their fathers. Yet, we find that, let’s say (we are obviously using hypothetical figures here), one in 200 does. Therefore, if the statistics are large enough to be reliable, something has influenced the chances of mere coincidence in a 20 to 1 ratio. In fact, we struggle so much to grasp the significance of very large numbers—that is, to retain the ratios in our minds when dealing with large figures—that unless we keep checking ourselves with arithmetic, we tend to view and assess everything beyond certain limits as equally vast and vague.
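The arithmetic behind the 20 to 1 can be set out in two lines, using the avowedly imaginary proportions of the text.

```python
# Eminence is taken as a distinction reached by 1 man in 4,000; the sons of
# eminent fathers are supposed to reach it at a rate of 1 in 200.
baseline = 1 / 4000        # chance for a man taken at random
observed = 1 / 200         # assumed rate among the sons in question
print(observed / baseline) # 20.0 -- the sons do 20 times better than chance alone
```

So a record of 199 'failures' in every 200 sons, far from refuting the doctrine, is just what strong odds in its favour would look like when the distinction itself is rare.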
§ 26. (VI.) In discussing the nature of the connexion between Probability and Induction, we examined the claims of a rule commonly given for inferring the probability that an event which had been repeatedly observed would recur again. I endeavoured to show that all attempts to obtain and prove such a rule were necessarily futile; if these reasons were conclusive the employment of such a rule must of course be regarded as fallacious. A few examples may conveniently be added here, tending to show how instead of there being merely a single rule of succession we might better divide the possible forms into three classes.
§ 26. (VI.) In discussing the relationship between Probability and Induction, we looked into the validity of a rule often used to determine the likelihood that an event, which has been frequently observed, will happen again. I tried to demonstrate that all attempts to formulate and validate such a rule were ultimately pointless; if my reasoning is sound, then relying on such a rule must be seen as misleading. A few examples can be added here to illustrate that rather than just having one straightforward rule of succession, we could more effectively categorize the possible forms into three groups.
(1) In some cases when a thing has been observed to happen several times it becomes in consequence more likely 359 that the thing should happen again. This agrees with the ordinary form of the rule, and is probably the case of most frequent occurrence. The necessary vagueness of expression when we talk of the ‘happening of a thing’ makes it quite impossible to tolerate the rule in this general form, but if we specialize it a little we shall find it assume a more familiar shape. If, for example, we have observed two or more properties to be frequently associated together in a succession of individuals, we shall conclude with some force that they will be found to be so connected in future. The strength of our conviction however will depend not merely on the number of observed coincidences, but on far more complicated considerations; for a discussion of which the reader must be referred to regular treatises on Inductive evidence. Or again, if we have observed one of two events succeed the other several times, the occurrence of the former will excite in most cases some degree of expectation of the latter. As before, however, the degree of our expectation is not to be assigned by any simple formula; it will depend in part upon the supposed intimacy with which the events are connected. To attempt to lay down definite rules upon the subject would lead to a discussion upon laws of causation, and the circumstances under which their existence may be inferred, and therefore any further consideration of the matter must be abandoned here.
(1) In some cases, when something has been observed to happen multiple times, it becomes more likely that it will happen again. This aligns with the usual interpretation of the rule and is probably the most common situation. The necessary vagueness when we refer to the ‘happening of a thing’ makes it difficult to accept the rule in such a broad form, but if we narrow it down a bit, it takes on a more familiar shape. For example, if we have seen two or more properties frequently associated together in a series of individuals, we will strongly conclude that they will likely be connected in the future. However, the strength of our belief will depend not just on the number of observed coincidences, but on much more complex factors; for a detailed discussion, the reader should refer to standard texts on Inductive evidence. Alternatively, if we have seen one of two events follow the other multiple times, the occurrence of the first event will generally create some level of expectation for the second. Again, the level of our expectation can’t be determined by a simple formula; it will partly rely on how closely the events are thought to be related. Trying to establish definitive rules on the subject would lead to a discussion about laws of causation and the conditions under which their existence can be inferred, so any further exploration of this topic will be set aside here.
§ 27. (2) Or, secondly, the past recurrence may in itself give no valid grounds for inference about the future; this is the case which most properly belongs to Probability.[10] 360 That it does so belong will be easily seen if we bear in mind the fundamental conception of the science. We are there introduced to a series,—for purposes of inference an indefinitely extended series,—of terms, about the details of which, information, except on certain points, is not given; our knowledge being confined to the statistical fact, that, say, one in ten of them has some attribute which we will call X. Suppose now that five of these terms in succession have been X, what hint does this give about the sixth being also an X? Clearly none at all; this past fact tells us nothing; the formula for our inference is still precisely what it was before, that one in ten being X it is one to nine that the next term is X. And however many terms in succession had been of one kind, precisely the same formula would still be given.
§ 27. (2) Alternatively, the fact that something has happened in the past doesn't necessarily give us any valid reason to believe it will happen again in the future; this case fits best within the realm of Probability.[10] It's clear that this concept belongs to Probability if we remember the basic idea behind the science. We're introduced to a series—a series that can extend indefinitely for the sake of inference—where details are mostly unknown, except for a few specific points. Our knowledge is limited to the statistical fact that, say, one in ten of these items has a certain attribute we’ll label as X. Now, if we find that five of these items in a row have been X, what does that tell us about the sixth item also being X? Clearly, it tells us nothing at all; this past occurrence provides no information. The formula for our inference remains exactly the same as it was before: with one in ten being X, the odds are one to nine that the next item is X. No matter how many items in a row were the same, the formula would still be the same.
§ 28. The way in which events will justify the answer given by this formula is often misunderstood. For the benefit therefore of those unacquainted with some of the conceptions familiar to mathematicians, a few words of explanation may be added. Suppose then that we have had X twelve times in succession. This is clearly an anomalous state of things. To suppose anything like this continuing to occur would be obviously in opposition to the statistics, which assert that in the long run only one in ten is X. But how is this anomaly got over? In other words, how do we obviate the conclusion that X's must occur more frequently than once in ten times, after such a long succession of them as we have now had? Many people seem to believe that there must be a diminution of X's afterwards to counterbalance their past preponderance. This however would be quite a mistake; the proportion in which they occur in future must remain the same throughout; it cannot be altered if we are to adhere to our statistical formula. The fact is that the rectification of the exceptional disturbance 361 in the proportion will be brought about simply by the continual influx of fresh terms in the series. These will in the long run neutralize the disturbance, not by any special adaptation, as it were, for the purpose, but by the mere weight of their overwhelming numbers. At every stage therefore, in the succession, whatever might have been the number and nature of the preceding terms, it will still be true to say that one in ten of the terms will be an X.
§ 28. The way events validate the answer given by this formula is often misunderstood. So, for the benefit of those who aren't familiar with some concepts common to mathematicians, a brief explanation is in order. Let's say we observe X happening twelve times in a row. This clearly represents an unusual situation. To think this could continue would clearly contradict the statistics, which indicate that over time, only one in ten is X. But how do we account for this anomaly? In other words, how do we avoid concluding that X must occur more often than once in ten times after such a long streak of them? Many people seem to think that there must be a decrease in X occurrences afterward to balance out their previous abundance. However, this would be completely incorrect; the ratio of their occurrences in the future must remain consistent. It can't change if we stick to our statistical formula. The truth is that the correction of the exceptional imbalance in the ratio will simply happen through the continuous addition of new terms to the series. These will eventually neutralize the disturbance, not through any specific adjustment, but just due to the sheer volume of their numbers. Therefore, at every stage in the sequence, no matter what the number and nature of the previous terms were, it will still hold true that one in ten of the terms will be an X.
If we had to do only with a finite number of terms, however large that number might be, such a disturbance as we have spoken of would, it is true, need a special alteration in the subsequent proportions to neutralize its effects. But when we have to do with an infinite number of terms, this is not the case; the ‘limit’ of the series, which is what we then have to deal with, is unaffected by these temporary disturbances. In the continued progress of the series we shall find, as a matter of fact, more and more of such disturbances, and these of a more and more exceptional character. But whatever the point we may occupy at any time, if we look forward or backward into the indefinite extension of the series, we shall still see that the ultimate limit to the proportion in which its terms are arranged remains the same; and it is with this limit, as above mentioned, that we are concerned in the strict rules of Probability.
If we were only dealing with a finite number of terms, no matter how large that number is, the disturbance we discussed would indeed require a special adjustment in the subsequent proportions to counteract its effects. However, when dealing with an infinite number of terms, this isn’t the case; the ‘limit’ of the series, which is what we’re concerned with, is not affected by these temporary disturbances. As the series continues to progress, we will actually encounter more and more of such disturbances, and they will become increasingly unusual. But regardless of where we are positioned at any moment, if we look forward or backward into the endless extension of the series, we will still see that the ultimate limit of how its terms are arranged remains unchanged; and it is this limit, as mentioned earlier, that we focus on in the strict rules of Probability.
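For a modern reader, a short simulation (not part of the original text; the figures are invented for illustration) may make the 'swamping' argument concrete. We start a series with an anomalous run of twelve X's and then let fresh terms flow in, each being an X with the stated chance of one in ten; the running proportion drifts back towards one tenth purely by dilution, with no compensating shortage of X's ever appearing.

    # Illustrative sketch only: an opening run of twelve X's is diluted,
    # not compensated, as ordinary one-in-ten terms accumulate.
    import random

    random.seed(1)
    p = 0.1
    series = [1] * 12                      # the anomalous opening run of X's
    for n in (100, 1_000, 100_000):
        while len(series) < n:
            series.append(1 if random.random() < p else 0)
        print(n, sum(series) / n)          # roughly 0.21, then 0.11, then 0.10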
The most familiar example, perhaps, of this kind is that of tossing up a penny. Suppose we have had four heads in succession; people[11] have tolerably realized by now that ‘head the fifth time’ is still an even chance, as ‘head’ was each 362 time before, and will be ever after. The preceding paragraph explains how it is that these occasional disturbances in the average become neutralized in the long run.
The most familiar example of this type is probably flipping a coin. Let’s say we’ve gotten four heads in a row; people[11] have generally figured out by now that getting ‘heads’ the fifth time is still a 50-50 chance, just like it was every time before and will be in the future. The previous paragraph explains how these occasional fluctuations in the average even out over time.
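The coin case can be checked in the same spirit. The sketch below (illustrative only) counts, over many simulated five-flip sequences, how often the fifth flip is a head given that the first four all were; the answer stays close to one half.

    # Illustrative only: 'head the fifth time' is still an even chance
    # after four heads in a row.
    import random

    random.seed(2)
    runs_of_four, fifth_is_head = 0, 0
    for _ in range(1_000_000):
        flips = [random.random() < 0.5 for _ in range(5)]
        if all(flips[:4]):                 # condition on four heads in succession
            runs_of_four += 1
            fifth_is_head += flips[4]
    print(fifth_is_head / runs_of_four)    # close to 0.5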
§ 29. (3) There are other cases which, though rare, are by no means unknown, in which such an inference as that obtained from the Rule of Succession would be the direct reverse of the truth. The oftener a thing happens, it may be, the more unlikely it is to happen again. This is the case whenever we are drawing things from a limited source (as balls from a bag without replacing them), or whenever the act of repetition itself tends to prevent the succession (as in giving false alarms).
§ 29. (3) There are other situations that, while rare, are not completely unheard of, where the conclusion derived from the Rule of Succession is actually the opposite of what’s true. Sometimes, the more frequently something occurs, the less likely it is to happen again. This happens when we’re taking items from a limited source (like drawing balls from a bag without putting them back), or when the act of repetition itself makes the outcome less likely (like in the case of false alarms).
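A tiny worked example (with invented numbers) shows the third class at work. Drawing without replacement from a bag of three white and seven black balls, each white ball drawn makes a further white draw less likely, the reverse of what the rule of succession would suggest.

    # Illustrative sketch: the more often 'white' has happened,
    # the less likely it is to happen again.
    white, black = 3, 7
    drawn_white = 0
    while white > 0:
        print(drawn_white, white / (white + black))   # 0.300, then 0.222, then 0.125
        white -= 1                                    # suppose a white ball was in fact drawn
        drawn_white += 1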
I am quite ready to admit that we believe the results described in the last two classes on the strength of some such general Inductive rule, or rather principle, as that involved in the first. But it would be a great error to confound this with an admission of the validity of the rule in each special instance. We are speaking about the application of the rule to individual cases, or classes of cases; this is quite a distinct thing, as was pointed out in a previous chapter, from giving the grounds on which we rest the rule itself. If a man were to lay it down as a universal rule, that the testimony of all persons was to be believed, and we adduced an instance of a man having lied, it would not be considered that he saved his rule by showing that we believed that it was a lie on the word of other persons. But it is perfectly consistent to give as a merely general, but not universal, rule, that the testimony of men is credible; then to separate off a second class of men whose word is not to be trusted, and finally, if any one wants to know our ground for the second rule, to rest it upon the first. If we were speaking of necessary laws, such a conflict as this would be as hopeless as the old ‘Cretan’ puzzle in logic; but in instances 363 of Inductive and Analogical extension it is perfectly harmless.
I’m ready to acknowledge that we believe the results discussed in the last two classes based on some general inductive principle, like the one mentioned first. However, it would be a big mistake to confuse this with accepting that the rule is valid for every specific case. We're talking about applying the rule to individual cases or groups of cases; this is a completely different matter, as pointed out in a previous chapter, from explaining the reasons behind the rule itself. If someone were to claim that it's a universal rule to believe everyone’s testimony, and we presented an example of a person lying, it wouldn’t hold up for him to say that we believed it was a lie based on the testimony of others. But it’s totally reasonable to state as a general, though not universal, rule that men's testimony is credible; then to set aside a second category of men whose words can't be trusted, and finally, if anyone wants to know the basis for the second rule, to support it with the first. If we were discussing necessary laws, a conflict like this would be as impossible as the old ‘Cretan’ logic puzzle; but in cases of inductive and analogical reasoning, it poses no real issue.
§ 30. A familiar example will serve to bring out the three different possible conclusions mentioned above. We have observed it rain on ten successive days. A and B conclude respectively for and against rain on the eleventh day; C maintains that the past rain affords no data whatever for an opinion. Which is right? We really cannot determine à priori. An appeal must be made to direct observation, or means must be found for deciding on independent grounds to which class we are to refer the instance. If, for example, it were known that every country produces its own rain, we should choose the third rule, for it would be a case of drawing from a limited supply. If again we had reasons to believe that the rain for our country might be produced anywhere on the globe, we should probably conclude that the past rainfall threw no light whatever on the prospect of a continuance of wet weather, and therefore take the second. Or if, finally, we knew that rain came in long spells or seasons, as in the tropics, then the occurrence of ten wet days in succession would make us believe that we had entered on one of these seasons, and that therefore the next day would probably resemble the preceding ten.
§ 30. A familiar example will illustrate the three different possible conclusions mentioned earlier. We have observed it rain for ten consecutive days. A and B draw different conclusions—one for rain on the eleventh day and the other against it; C argues that past rain offers no basis for an opinion. Who is correct? We really can’t determine that à priori. We must rely on direct observation, or we need to find independent reasons to decide which category we should attribute this situation to. For instance, if it were known that every region produces its own rain, we would choose the third option, as it would be a case of drawing from a limited supply. If we had reasons to believe that the rain in our area might come from anywhere in the world, we might conclude that past rainfall provides no insight into the likelihood of continued wet weather, and therefore lean towards the second conclusion. Finally, if we knew that rain comes in long spells or seasons, like in the tropics, then ten consecutive rainy days would lead us to believe that we had entered one of those seasons and that the next day would likely be similar to the previous ten.
Since then all these forms of such an Inductive rule are possible, and we have often no à priori grounds for preferring one to another, it would seem to be unreasonable to attempt to establish any universal formula of anticipation. All that we can do is to ascertain what are the circumstances under which one or other of these rules is, as a matter of fact, found to be applicable, and to make use of it under those circumstances.
Since all these types of Inductive rules are possible, and we often have no à priori reasons to prefer one over another, it seems unreasonable to try to establish any universal formula for anticipation. All we can do is figure out the circumstances under which each of these rules actually applies and use it in those situations.
§ 31. (VII.) In the cases discussed in (V.) the almost infinitely small chances with which we were concerned were 364 rightly neglected from all practical consideration, however proper it might be, on speculative grounds, to keep our minds open to their actual existence. But it has often occurred to me that there is a common error in neglecting to take them into account when they may, though individually small, make up for their minuteness by their number. As the mathematician would express it, they may occasionally be capable of being integrated into a finite or even considerable magnitude.
§ 31. (VII.) In the cases discussed in (V.) the extremely tiny chances we were concerned with were justifiably disregarded in practical terms, although it might be appropriate, from a theoretical standpoint, to remain open to their actual existence. However, I’ve often thought that there’s a common mistake in ignoring them when they might, although individually small, compensate for that minuteness through their sheer number. As a mathematician would put it, they can sometimes be integrated into a finite or even significant magnitude.
For instance, we may be confronted with a difficulty out of which there appears to be only one appreciably possible mode of escape. The attempt is made to force us into accepting this, however great the odds apparently are against it, on the ground that improbable as it may seem, it is at any rate vastly more probable than any of the others. I can quite admit that, on practical grounds, we may often find it reasonable to adopt this course; for we can only act on one supposition, and we naturally and rightly choose, out of a quantity of improbabilities, the least improbable. But when we are not forced to act, no such decisive preference is demanded of us. It is then perfectly reasonable to refuse assent to the proposed explanation; even to say distinctly that we do not believe it, and at the same time to decline, at present, to accept any other explanation. We remain, in fact, in a state of suspense of judgment, a state perfectly right and reasonable so long as no action demanding a specific choice is forced upon us. One alternative may be decidedly probable as compared with any other individually, but decidedly improbable as compared with all others collectively. This in itself is intelligible enough; what people often fail to see is that there is no necessary contradiction between saying and feeling this, and yet being prepared vigorously to act, when action is forced upon us, as though this alternative were really the true one.
For example, we might face a situation where it seems there's only one clear way out. There’s pressure for us to accept this option, no matter how unlikely it seems, because, even if it’s improbable, it's still considered much more likely than the others. I can acknowledge that, in practical terms, it often makes sense for us to take this route; after all, we can only act based on one assumption, and we naturally choose the least unlikely option from a bunch of improbable choices. However, when we aren’t forced to act, we don’t have to make such a clear decision. It’s perfectly reasonable to refuse to agree with the suggested explanation; we can even say outright that we don’t believe it and choose not to accept any other explanation for now. We actually remain in a state of withholding judgment, which is completely valid and reasonable as long as no specific action requiring a choice is imposed on us. One option may be quite likely compared to any single alternative, but still unlikely when considered against all the others combined. This concept is understandable; what people often overlook is that there's no inherent contradiction in expressing and feeling this, while still being ready to act decisively when action is required, as if this alternative were genuinely the correct one.
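A small numerical sketch (the probabilities are invented purely for illustration) shows how one alternative can be more probable than any single rival and yet less probable than the rivals taken together.

    # Hypothetical figures: one favoured explanation at 0.30 against
    # ten rivals at 0.07 each (the eleven together exhaust the cases).
    favoured = 0.30
    rivals = [0.07] * 10
    print(favoured > max(rivals))   # True: more probable than each rival singly
    print(favoured > sum(rivals))   # False: less probable than the rivals collectively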
§ 32. To take a specific instance, this way of regarding the matter has often occurred to me in disputes upon ‘Spiritualist’ manifestations. Assent is urged upon us because, it is said, no other possible solution can be suggested. It may be quite true that apparently overwhelming difficulties may lie as against each separate alternative solution; but is it always sufficiently realized how numerous such solutions may be? No matter that each individually may be almost incredible: they ought all to be massed together and thrown into the scale against the proffered solution, when the only question asked is, Are we to accept this solution? There is no unfairness in such a course. We are perfectly ready to adopt the same plan against any other individual alternative, whenever any person takes to claiming this as the solution of the difficulty. We are looking at the matter from a purely logical point of view, and are quite willing, so far, to place every solution, spiritualist or otherwise, upon the same footing. The partisans of every alternative are in somewhat the same position as the members of a deliberative assembly, in which no one will support the motion of any other member. Every one can aid effectively in rejecting every other motion, but no one can succeed in passing his own. Pressure of urgent necessity may possibly force them out of this state of practical inaction, by, so to say, breaking through the opposition at some point of least resistance; but unless aided by some such pressure they are left in a state of hopeless dead-lock.
§ 32. For example, this perspective has often come to my mind during debates about 'Spiritualist' phenomena. We are urged to agree because it is claimed that no other possible explanation can be provided. While it may be true that there seem to be significant challenges against each individual alternative solution, is it fully recognized how many such solutions might exist? Even if each one seems almost unbelievable, they should all be considered together and weighed against the proposed solution when the only question is, Should we accept this solution? There’s no unfairness in this approach. We are fully prepared to apply the same method against any other single alternative whenever someone claims this as the answer to the issue. We view the situation from a purely logical standpoint and are entirely open to placing all solutions, spiritualist or otherwise, on equal ground. Supporters of each alternative find themselves in a position similar to members of a deliberative body, where no one will back anyone else’s proposal. Everyone can effectively help in dismissing every other proposal, but no one can successfully pass their own. A pressing necessity might possibly push them out of this state of practical inaction by finding a point of least resistance; however, without such pressure, they remain stuck in a hopeless standstill.
§ 33. Assuming that the spiritualistic solution admits of, and is to receive, scientific treatment, this, it seems to me, is the conclusion to which one might sometimes be led in the face of the evidence offered. We might have to say to every individual explanation, It is incredible, I cannot accept it; and unless circumstances should (which it is hardly possible 366 that they should) force us to a hasty decision,—a decision, remember, which need indicate no preference of the judgment beyond what is just sufficient to turn the scale in its favour as against any other single alternative,—we leave the matter thus in abeyance. It will very likely be urged that one of the explanations (assuming that all the possible ones had been included) must be true; this we readily admit. It will probably also be urged that (on the often-quoted principle of Butler) we ought forthwith to accept the one which, as compared with the others, is the most plausible, whatever its absolute worth may be. This seems distinctly an error. To say that such and such an explanation is the one we should accept, if circumstances compelled us to anticipate our decision, is quite compatible with its present rejection. The only rational position surely is that of admitting that the truth is somewhere amongst the various alternatives, but confessing plainly that we have no such preference for one over another as to permit our saying anything else than that we disbelieve each one of them.
§ 33. If we assume that the spiritualistic solution can and should be examined scientifically, it seems to me that this is the conclusion we might occasionally reach when considering the evidence presented. We could find ourselves saying to each individual explanation, "It's unbelievable, I can't accept it." Unless circumstances should (which is unlikely) force us into a quick decision—remember, a decision that reflects no greater preference than what’s necessary to favor it over any other single alternative—we'll leave the matter open. It’s likely that someone will argue that one of the explanations (assuming all possible ones have been considered) must be true; we agree with that. It may also be suggested that, according to the often-cited principle of Butler, we should immediately accept the one that is the most plausible compared to the others, regardless of its absolute value. This clearly seems to be a mistake. Saying that a particular explanation is the one we should accept, if circumstances forced us to make a premature decision, is consistent with currently rejecting it. The only reasonable stance is to acknowledge that the truth lies somewhere among the different options, while honestly admitting that we have no strong preference for one over another, allowing us to say only that we disbelieve each one of them.
§ 34. (VIII.) The very common fallacy of ‘judging by the event,’ as it is generally termed, deserves passing notice here, as it clearly belongs to Probability rather than to Logic; though its nature is so obvious to those who have grasped the general principles of our science, that a very few words of remark will suffice. In one sense every proposition must consent to be judged by the event, since this is merely, in other words, submitting it to the test of experience. But there is the widest difference between the test appropriate to a universal proposition and that appropriate to a merely proportional or statistical one. The former is subverted by a single exception; the latter not merely admits exceptions, but implies them. Nothing, however, is more common than to blame advice (in others) because it has happened to turn 367 out unfortunately, or to claim credit for it (in oneself) because it has happened to succeed. Of course if the conclusion was avowedly one of a probable kind we must be prepared with complacency to accept a hostile event, or even a succession of them; it is not until the succession shows a disposition to continue over long that suspicion and doubt should arise, and then only by a comparison of the degree of the assigned probability, and the magnitude of the departure from it which experience exhibits. For any single failure the reply must be, ‘the advice was sound’ (supposing, that is, that it was to be justified in the long run), ‘and I shall offer it again under the same circumstances.’
§ 34. (VIII.) The common mistake of ‘judging by the outcome,’ as it's usually called, is worth mentioning here, as it clearly relates more to Probability than to Logic; although it’s so obvious to those who understand the general principles of our field that a few words will be enough. In one way, every statement must agree to be judged by the outcome, since that’s just another way of saying it’s being put to the test of experience. However, there’s a big difference between the test suitable for a universal statement and that for a simple proportional or statistical one. The former can be disproved by a single exception, while the latter not only allows exceptions but actually expects them. Yet it’s very common to criticize advice (in others) because it turned out poorly or to take credit for it (in oneself) because it happened to work out. If the conclusion was explicitly one of a probable nature, then we must calmly accept a negative outcome, or even a series of them; suspicion and doubt should only arise if this series continues for a long time, and then only by comparing the level of assigned probability with the degree of deviation shown by experience. For any single failure, the response must be, ‘the advice was sound’ (assuming, of course, that it can be justified in the long run), ‘and I will offer it again under the same circumstances.’
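The point about when a run of failures should begin to raise doubt can be illustrated with invented figures. Suppose the advice is claimed to succeed nine times in ten in the long run; a single failure is then exactly what the claim itself predicts one time in ten, while three failures in ten trials is already a fairly rare event under that claim.

    # Sketch with hypothetical numbers: how surprising is a given number
    # of failures if the advice really does fail only one time in ten?
    from math import comb

    p_fail = 0.1

    def chance_of_at_least(k, n):
        # probability of k or more failures in n independent trials
        return sum(comb(n, j) * p_fail**j * (1 - p_fail)**(n - j) for j in range(k, n + 1))

    print(chance_of_at_least(1, 1))     # 0.10: one failure proves nothing
    print(chance_of_at_least(3, 10))    # about 0.07: grounds for suspicion begin to appear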
§ 35. The distinction drawn in the above instance deserves careful consideration; for owing to the wide difference between the kind of propositions dealt with in Probability and in ordinary Logic, and the consequent difference in the nature of the proof offered, it is quite possible for arguments of the same general appearance to be valid in the former and fallacious in the latter, and conversely.
§ 35. The distinction made in the previous example deserves close attention; because of the significant difference between the types of statements addressed in Probability and those in ordinary Logic, and the resulting difference in the nature of the proof provided, it is entirely possible for arguments that look similar to be valid in one case and misleading in the other, and vice versa.
For instance, take the well-known fallacy which consists in simply converting a universal affirmative, i.e. in passing from All A is B to All B is A. When, as in common Logic, the conclusion is to be as certain as the premise, there is not a word to be said for such a step. But if we look at the process with the more indulgent eye of Induction or Probability we see that a very fair case may sometimes be made out for it. The mere fact that ‘Some B is A’ raises a certain presumption that any particular B taken at random will be an A. There is some reason, at any rate, for the belief, though in the absence of statistics as to the relative frequency of A and B we are unable to assign a value to this belief. I suspect that there may be many cases in which a man has inferred that some particular B is an A on the 368 ground that All A is B, who might justly plead in his behalf that he never meant it to be a necessary, but only a probable inference. The same remarks will of course apply also to the logical fallacy of Undistributed Middle.
For example, consider the well-known fallacy that involves simply switching a universal affirmative, meaning going from All A is B to All B is A. In traditional Logic, if the conclusion is supposed to be just as certain as the premise, there’s no justification for making that leap. However, if we view the process through the more lenient lens of Induction or Probability, we can see that sometimes a reasonable argument can be made for it. The fact that ‘Some B is A’ suggests some likelihood that any random B will be an A. There’s some basis for that belief, but without statistics on the relative occurrences of A and B, we can't really quantify it. I suspect there are many instances where someone has concluded that a specific B is an A based on the premise that All A is B, and they could reasonably argue that they never intended for it to be seen as a necessary conclusion, only as a probable one. The same points obviously apply to the logical fallacy of Undistributed Middle.
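How much presumption the converted proposition carries depends entirely on the relative frequencies of A and B, which is exactly the statistical information the text says we usually lack. A sketch with made-up frequencies:

    # Hypothetical frequencies: suppose every A is a B, A makes up 2% of
    # things and B makes up 5%. Then a B taken at random is an A with
    # chance 0.02 / 0.05 = 0.4; had B made up half of all things, only 0.04.
    p_A = 0.02
    p_B = 0.05
    print(p_A / p_B)     # P(A given B) when all A is B
    print(p_A / 0.50)    # the same presumption if B were far more common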
Now for a case of the opposite kind, i.e. one in which Probability fails us, whereas the circumstances seem closely analogous to those in which ordinary inference would be able to make a stand. Suppose that I know that one letter in a million is lost when in charge of the post. I write to a friend and get no answer. Have I any reason to suppose that the fault lies with him? Here is an event (viz. the loss of the letter) which has certainly happened; and we suppose that, of the only two causes to which it can be assigned, the ‘value,’ i.e. statistical frequency, of one is accurately assigned, does it not seem natural to suppose that something can be inferred as to the likelihood that the other cause had been operative? To say that nothing can be known about its adequacy under these circumstances looks at first sight like asserting that an equation in which there is only one unknown term is theoretically insoluble.
Now let's consider a different scenario, where Probability lets us down, even though the situation seems similar to those where common reasoning would work. Imagine I know that one letter in a million gets lost in the mail. I write to a friend and don’t receive a reply. Should I assume the problem is with him? There's an event (the loss of the letter) that definitely occurred; and if we assume that, of the only two causes to which it can be assigned, the ‘value,’ or statistical frequency, of one is accurately known, doesn’t it seem reasonable to think we can infer something about the likelihood that the other cause was at play? Claiming that we can’t know anything about its adequacy in this case initially seems like saying that an equation with just one unknown is unsolvable in theory.
As examples of this kind have been amply discussed in the chapter upon Inverse rules of Probability I need do no more here than remind the reader that no conclusion whatever can be drawn as to the likelihood that the fault lay with my friend rather than with the Post Office. Unless we either know, or make some assumption about, the frequency with which he neglects to answer the letters he receives, the problem remains insoluble.
As examples of this kind have been thoroughly covered in the chapter on Inverse Rules of Probability, I only need to remind the reader here that no conclusion whatever can be drawn about the likelihood that the fault lay with my friend rather than with the Post Office. Unless we know, or make some assumption about, how often he fails to respond to the letters he gets, the problem can't be solved.
The reason why the apparent analogy, indicated above, to an equation with only one unknown quantity, fails to hold good, is that for the purposes of Probability there are really two unknown quantities. What we deal with are proportional or statistical propositions. Now we are only told that 369 in the instance in question the letter was lost, not that they were found to be lost in such and such a proportion of cases. Had this latter information been given to us we should really have had but one unknown quantity to determine, viz. the relative frequency with which my correspondent neglects to answer his letters, and we could then have determined this with the greatest ease.
The reason the analogy mentioned earlier, comparing it to an equation with just one unknown, doesn’t work is that, in terms of Probability, there are actually two unknowns involved. What we’re looking at are proportional or statistical statements. Right now, we only know that in this case, the letter was lost, but we don’t know how often this happens in other cases. If we had that information, we would really only have one unknown quantity to figure out, which is the relative frequency at which my correspondent doesn’t respond to his letters, and then we could determine it quite easily.
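In modern notation the point is that the posterior chance that the Post Office was at fault depends on the second unknown, the friend's habit of not answering, and swings from practically zero to an even chance as that assumption varies. The figures for the friend's neglect below are invented; only the one-in-a-million loss rate comes from the text.

    # Sketch: P(letter lost | no answer) under different assumed rates
    # at which the friend simply neglects to reply.
    p_lost = 1e-6                              # given: one letter in a million is lost
    for p_neglect in (0.5, 0.01, 1e-6):        # assumed, not given
        p_fault_post = p_lost / (p_lost + (1 - p_lost) * p_neglect)
        print(p_neglect, p_fault_post)         # about 0.000002, 0.0001, 0.5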
1 Discussed by Mr F. Y. Edgeworth, in the Phil. Mag. for April, 1887.
1 Discussed by Mr. F. Y. Edgeworth, in the Phil. Mag. for April, 1887.
2 Journal of the Statistical Soc. (Vol. XLII. p. 328) Dare one suspect a joke?
2 Journal of the Statistical Soc. (Vol. XLII. p. 328) Can we even entertain the idea of a joke?
3 It appears to have been long known to gamblers under the name of the Martingale. There is a paper by Babbage (Trans. of Royal Soc. of Edinburgh, for 1823) which discusses certain points connected with it, but scarcely touches on the subject of the sections which follow.
3 It seems to have been widely recognized by gamblers as the Martingale. There's a paper by Babbage (Trans. of Royal Soc. of Edinburgh, for 1823) that talks about some related points, but it hardly addresses the topics covered in the sections that follow.
4 Attention will be further directed to this distinction in the chapter on Insurance and Gambling.
4 Attention will be focused further on this distinction in the chapter about Insurance and Gambling.
5 As by Prévost in the Bibliothèque Universelle de Genève, Oct. 1829. The explanation is noted, and apparently accepted, by Quetelet (Physique Sociale, I. 171).
5 As by Prévost in the Bibliothèque Universelle de Genève, Oct. 1829. The explanation is noted and seems to be agreed upon by Quetelet (Physique Sociale, I. 171).
6 Essay on Probabilities, p. 126.
6 Essay on Probabilities, p. 126.
7 This theoretical or absolute neglect of what is very rare must not be confused with the practical neglect sometimes recommended by astronomical and other observers. A criterion, known as Chauvenet's, for indicating the limits of such rejection will be found described in Mr Merriman's Least Squares (p. 166). But this rests on the understanding that a smaller balance of error would thus result in the long run. The very rare event is deliberately rejected, not overlooked.
7 This theoretical or absolute disregard for what is extremely rare should not be confused with the practical neglect sometimes suggested by astronomers and other observers. A criterion, known as Chauvenet's, for indicating the limits of such rejection is described in Mr. Merriman's Least Squares (p. 166). However, this is based on the assumption that a smaller balance of error would result in the long run. The very rare event is intentionally rejected, not ignored.
8 The process of calculation may be readily indicated. There are, say, about 350,000 letters in the work in question. Since any of the 26 letters of the alphabet may be drawn each time, the possible number of combinations would be 26^350,000; a number which, as may easily be inferred from a table of logarithms, would demand for its expression nearly 500,000 figures. Only one of these combinations is favourable, if we reject variations of spelling. Hence unity divided by this number would represent the chance of getting the desired result by successive random selection of the required number of 350,000 letters.
8 The calculation process can be easily explained. There are about 350,000 letters in the work in question. Since any of the 26 letters of the alphabet can be chosen each time, the total number of combinations would be 26^350,000; a figure that, as can be easily seen from a logarithm table, would require nearly 500,000 digits to express. Out of all these combinations, only one is correct if we ignore spelling variations. Therefore, one divided by this number would represent the likelihood of achieving the desired outcome through successive random selections of the required 350,000 letters.
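The digit count in this footnote is easy to verify: the number of decimal figures in 26^350,000 is 350,000 multiplied by log10 of 26, rounded up, which comes to a little under 500,000. A one-line check:

    # Checking the footnote's figure for the number of digits in 26^350,000.
    import math
    print(int(350_000 * math.log10(26)) + 1)   # about 495,000 digits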
If this chance is thought too small, and any one asks how often the above random selection must be repeated in order to give him odds of 2 to 1 in favour of success, this also can be easily shown. If the chance of an event on each occasion is 1/n, the chance of getting it once at least in n trials is 1 − ((n − 1)/n)^n; for we shall do this unless we fail n times running. When (as in the case in question) n is very large, this may be shown algebraically to be equivalent to odds of about 2 to 1. That is, when we have drawn the requisite quantity of letters a number of times equal to the inconceivably great number above represented, it is still only 2 to 1 that we shall have secured what we want:—and then we have to recognize it.
If this chance seems too small, and someone asks how many times the random selection needs to be repeated to give them 2 to 1 odds in favor of success, this can be easily explained. If the chance of an event happening each time is 1/n, the chance of getting it at least once in n trials is 1 − ((n − 1)/n)^n; we’ll succeed unless we fail n times in a row. When (as in this case) n is very large, this can be shown mathematically to be equivalent to odds of about 2 to 1. This means that after drawing the required number of letters a number of times equal to the incredibly large number mentioned earlier, it’s still only 2 to 1 that we have what we want; and then we still need to recognize it.
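The 'about 2 to 1' claim can likewise be checked numerically: for large n the expression 1 − ((n − 1)/n)^n approaches 1 − 1/e, roughly 0.632, which corresponds to odds of a little under 2 to 1 in favour.

    # Verifying that 1 - ((n-1)/n)^n tends to 1 - 1/e for large n.
    import math
    for n in (10, 1_000, 1_000_000):
        print(n, 1 - ((n - 1) / n) ** n)       # 0.651..., 0.632..., 0.632...
    print(1 - 1 / math.e)                      # the limit, about 0.6321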
9 The longest life which could reasonably be attributed to any language would of course dwindle into utter insignificance in the face of such periods of time as are being here arithmetically contemplated.
9 The longest lifespan that can realistically be assigned to any language would inevitably shrink into complete insignificance when compared to the time periods that are being mathematically considered here.
10 We are here assuming of course that the ultimate limit to which our average tends is known, either from knowledge of the causes or from previous extensive experience. We are assuming that e.g. the die is known to be a fair one; if this is not known but a possible bias has to be inferred from its observed performances, the case falls under the former head.
10 We're assuming, of course, that we know the ultimate limit to which our average tends, either because we understand the causes or have extensive experience from the past. We're assuming that, for example, the die is known to be fair; if that isn't the case and we have to infer a possible bias from its observed outcomes, then it falls under the previous category.
11 Except indeed the gamblers. According to a gambling acquaintance whom Houdin, the conjurer, describes himself as having met at Spa, “the oftener a particular combination has occurred the more certain it is that it will not be repeated at the next coup: this is the groundwork of all theories of probabilities and is termed the maturity of chances” (Card-sharping exposed, p. 85).
11 Except for the gamblers. According to a gambling friend that Houdin, the magician, mentions meeting at Spa, “the more often a certain combination has happened, the less likely it is to happen again on the next coup: this is the foundation of all probability theories and is called the maturity of chances” (Card-sharping exposed, p. 85).
CHAPTER 15.
INSURANCE AND GAMBLING.
§ 1. If the reader will recall to mind the fundamental postulate of the Science of Probability, established and explained in the first few chapters, and so abundantly illustrated since, he will readily recognize that the two opposite characteristics of individual irregularity and average regularity will naturally be differently estimated by different minds. To some persons the elements of uncertainty may be so painful, either in themselves or in their consequences, that they are anxious to adopt some means of diminishing them. To others the ultimate regularity of life, at any rate within certain departments, its monotony as they consider it, may be so wearisome that they equally wish to effect some alteration and improvement in its characteristics. We shall discuss briefly these mental tendencies, and the most simple and obvious modes of satisfying them.
§ 1. If the reader recalls the basic principle of the Science of Probability, which was established and explained in the first few chapters and has been illustrated extensively since, they will easily see that the two opposing traits of individual irregularity and average regularity will be perceived differently by different people. For some, the elements of uncertainty may be so distressing, whether in themselves or in their outcomes, that they feel compelled to find ways to lessen them. For others, the ultimate regularity of life, at least in certain areas, and its monotony, as they see it, may be so tedious that they also want to bring about some change and improvement in its characteristics. We will briefly discuss these mental tendencies and the simplest and most obvious ways to address them.
To some persons, as we have said, the world is all too full of change and irregularity and consequent uncertainty. Civilization has done much to diminish these characteristics in certain directions, but it has unquestionably aggravated them in other directions, and it might not be very easy to say with certainty in which of these respects its operation has been, at present, on the whole most effective. The diminution of irregularity is exemplified, amongst other things, in the case of the staple products which supply our necessary food and 371 clothing. With respect to them, famine and scarcity are by comparison almost unknown now, at any rate in tolerably civilized communities. As a consequence of this, and of the vast improvements in the means of transporting goods and conveying intelligence, the fluctuations in the price of such articles are much less than they once were. In other directions, however, the reverse has been the case. Fashion, for instance, now induces so many people in every large community simultaneously to desire the same thing, that great fluctuations in value may ensue. Moreover a whole group of causes (to enter upon any discussion of which would be to trench upon the ground of Political Economy) combine to produce great and frequent variations in matters concerning credit and the currency, which formerly had no existence. Bankruptcy, for instance, is from the nature of the case, almost wholly a creation of modern times. We will not attempt to strike any balance between these opposite results of modern civilization, beyond remarking that in matters of prime importance the actual uncertainties have been probably on the whole diminished, whereas in those which affect the pocket rather than the life, they have been rather increased. It might also be argued with some plausibility that in cases where the actual uncertainties have not become greater, they have for all practical purposes done so, by their consequences frequently becoming more serious, or by our estimate of these consequences becoming higher.
For some people, as we’ve mentioned, the world is filled with change and unpredictability, leading to uncertainty. Civilization has done a lot to reduce these traits in some areas, but it has undeniably increased them in others. It may not be easy to determine where its impact has been most effective overall. The reduction of unpredictability is seen, among other things, in staple products that provide our essential food and clothing. Regarding these items, famine and scarcity are nearly unheard of now, especially in reasonably developed societies. As a result, combined with significant improvements in shipping goods and sharing information, price fluctuations for these products are less extreme than they used to be. In other areas, however, the opposite is true. Fashion, for example, drives many people in each large community to want the same things at the same time, leading to significant value fluctuations. Additionally, a whole range of factors (discussing them would delve into Political Economy) contributes to frequent and large variations in issues related to credit and currency that didn’t exist before. Bankruptcy, for example, is predominantly a modern phenomenon. We won’t try to weigh the positive and negative outcomes of modern civilization, except to note that in crucial matters, actual uncertainties have likely decreased overall, whereas those affecting finances rather than life have increased. It could also be argued with some plausibility that in situations where actual uncertainties haven’t increased, they have essentially become greater due to the more serious consequences or our heightened perception of these consequences.
§ 2. However the above question, as to the ultimate balance of gain or loss, should be decided, there can be no doubt that many persons find the present amount of uncertainty in some of the affairs of life greater than suits their taste. How are they to diminish it? Something of course may be done, as regards the individual cases, by prudence and foresight. Our houses may be built with a view not to 372 take fire so readily, or precautions may be taken that there shall be fire-engines at hand. In the warding off of death from disease and accident, something may be done by every one who chooses to live prudently. Precautions of the above kind, however, do not introduce any questions of Probability. These latter considerations only come in when we begin to invoke the regularity of the average to save us from the irregularities of the details. We cannot, it is true, remove the uncertainty in itself, but we can so act that the consequences of that uncertainty shall be less to us, or to those in whom we are interested. Take the case of Life Insurance. A professional man who has nothing but the income he earns to depend upon, knows that the whole of that income may vanish in a moment by his death. This is a state of things which he cannot prevent; and if he were the only one in such a position, or were unable or unwilling to combine with his fellow-men, there would be nothing more to be done in the matter except to live within his income as much as possible, and so leave a margin of savings.
§ 2. Regardless of how the above question about potential gains or losses is answered, it's clear that many people find the current level of uncertainty in various aspects of life more than they would like. How can they reduce it? Some progress can be made in individual cases through careful planning and foresight. We can design our homes to be less vulnerable to fire or make sure that fire engines are available. When it comes to avoiding death from illness or accidents, everyone can take steps if they choose to live cautiously. However, these types of precautions don’t touch on issues of Probability. Those considerations only come into play when we start relying on averages to protect us from the unpredictability of the details. It’s true that we can’t completely eliminate uncertainty, but we can act in a way that minimizes its impact on us or on those we care about. Take Life Insurance, for example. A professional who relies solely on their earned income knows that all of it could disappear suddenly due to their death. This is an unavoidable situation, and if they were the only one affected or were unable or unwilling to join forces with others, the only option left would be to live within their means as much as possible and try to save whatever they can.
§ 3. There is however an easy mode of escape for him. All that he has to do is to agree with a number of others, who are in the same position as himself, to make up, so to say, a common purse. They may resolve that those of their number who live to work beyond the average length of life shall contribute to support the families of those who die earlier. If a few only concurred in such a resolution they would not gain very much, for they would still be removed by but a slight step from that uncertainty which they are seeking to escape. What is essential is that a considerable number should thus combine so as to get the benefit of that comparative regularity which the average, as is well known, almost always tends to exhibit.
§ 3. However, there is an easy way out for him. All he needs to do is agree with a group of others in the same situation to form, so to speak, a shared fund. They can decide that those who live to work beyond the average lifespan will contribute to support the families of those who pass away sooner. If only a few participate in such a plan, they won’t gain much, as they would only be a small step away from the uncertainty they are trying to avoid. What’s important is that a significant number should come together to benefit from the consistent pattern that averages tend to show.
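A toy simulation (all figures invented) makes the benefit of combining visible. Suppose each member's estate is, with equal chance, either nothing or one unit, according as he dies early or late; standing alone his outcome is all or nothing, but when a pool of members shares equally, each share clusters ever more tightly round the average of one half as the pool grows.

    # Illustrative sketch: the spread of an individual share shrinks
    # as the number of members in the common purse grows.
    import random, statistics

    random.seed(3)
    for members in (1, 10, 1_000):
        shares = []
        for _ in range(5_000):
            pool = sum(random.random() < 0.5 for _ in range(members))
            shares.append(pool / members)
        print(members, round(statistics.pstdev(shares), 3))   # about 0.5, 0.16, 0.016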
§ 4. The above simple considerations really contain the 373 essence of all insurance. Such points as the fact that the agreement for indemnity extends only to a certain definite sum of money; and that instead of calling for an occasional general contribution at the time of the death of each member they substitute a fixed annual premium, out of the proceeds of which the payment is to be made, are merely accidents of convenience and arrangement. Insurance is simply equivalent to a mutual contract amongst those who dread the consequences of the uncertainty of their life or employment, that they will employ the aggregate regularity to neutralize as far as possible the individual irregularity. They know that for every one who gains by such a contract another will lose as much; or if one gains a great deal many must have lost a little. They know also that hardly any of their number can expect to find the arrangement a ‘fair’ one, in the sense that they just get back again what they have paid in premiums, after deducting the necessary expenses of management; but they deliberately prefer this state of things. They consist of a body of persons who think it decidedly better to leave behind them a comparatively fixed fortune, rather than one which is extremely uncertain in amount; although they are perfectly aware that, owing to the unavoidable expenses of managing the affairs of such a society, the comparatively fixed sum, so to be left, will be a trifle less than the average fortunes which would have been left had no such system of insurance been adopted.
§ 4. The points mentioned above really capture the essence of all insurance. For example, the fact that the indemnity agreement is limited to a specific sum of money and that instead of asking for a general contribution when a member dies, they require a fixed annual premium, which will fund the payment, are just practical arrangements. Insurance serves as a mutual contract among those who fear the unpredictability of their life or employment, allowing them to use the group’s consistency to offset individual uncertainties. They recognize that for every person who benefits from this contract, another will suffer an equal loss; or if one person gains significantly, many others must have lost a smaller amount. They also understand that it’s unlikely any of them will see the arrangement as “fair,” meaning they won’t just get back what they paid in premiums after covering management costs; yet they actively choose this setup. They are a group of individuals who believe it’s better to leave behind a relatively fixed financial benefit rather than one that is highly uncertain, even though they know that, because of mandatory management expenses, the fixed amount left will be slightly less than the average fortunes that would have been bequeathed without such an insurance system in place.
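A rough sketch with invented figures shows why each member expects to get back a trifle less than he pays in: the premium must cover the expected payout plus the expenses of management.

    # Hypothetical figures: a 1-in-100 yearly chance of a 10,000 claim,
    # with management expenses taking 5% of each premium.
    claim, p_claim, expense_rate = 10_000, 0.01, 0.05
    expected_payout = claim * p_claim                 # 100 per member per year
    premium = expected_payout / (1 - expense_rate)    # about 105.3
    print(premium, expected_payout / premium)         # each expects back about 95% of what he pays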
As this is not a regular treatise upon Insurance no more need be said upon the exact nature of such societies, beyond pointing out that they are of various different kinds. Sometimes they really are what we have compared them with, viz. mutual agreements amongst a group of persons to make up each other's losses to a certain extent. Into this category fall the Mutual Insurance Societies, Benefit Societies, Trades 374 Unions (in respect of some of their functions), together with innumerable other societies which go by various names. Sometimes they are companies worked by proprietors or shareholders for a profit, like any other industrial enterprise. This is the case, I believe, with the majority of the ordinary Life Insurance Societies. Sometimes, again, it is the State which undertakes the management, as in the case of our Post Office Insurance business.
Since this isn't a standard discussion on insurance, there's no need to go into detail about the exact nature of these societies other than to note that they come in various types. Sometimes, they genuinely resemble what we’ve compared them to, which are mutual agreements among a group of people to cover each other's losses to some extent. This includes Mutual Insurance Societies, Benefit Societies, and Trade Unions (in relation to some of their functions), along with countless other groups that go by different names. Other times, they are companies run by owners or shareholders for profit, just like any other business. I believe this applies to most typical Life Insurance Societies. Additionally, there are instances where the State manages the operations, as seen with our Post Office Insurance service.
§ 5. It is clear that there is no necessary limit to the range of application of this principle.[1] It is quite conceivable that the majority of the inhabitants of some nation might be so enamoured of security that they should devise a grand insurance society to cover almost every concern in life. They could not indeed abolish uncertainty, for the conditions of life are very far from permitting this, but they could without much difficulty get rid of the worst of the consequences of it. They might determine to insure not merely their lives, houses, ships, and other things in respect of which sudden and total loss is possible, but also to insure their business; in the sense of avoiding not only 375 bankruptcy, but even casual bad years, on the same principle of commutation. Unfamiliar as such an aim may appear when introduced in this language, it is nevertheless one which under a name of suspicious import to the conservative classes has had a good deal of attention directed to it. It is really scarcely anything else than Communism, which might indeed be defined as a universal and compulsory[2] insurance society which is to take account of all departments of business, and, in some at least of its forms, to invade the province of social and domestic life as well.
§ 5. It’s clear that there’s no necessary limit to how this principle can be applied.[1] It's quite possible that most people in a nation could become so focused on security that they create a massive insurance organization to cover nearly every aspect of life. While they couldn’t eliminate uncertainty—since life conditions make that impossible—they could certainly minimize the worst effects of it. They might decide to insure not just their lives, homes, ships, and other valuables at risk of sudden loss, but also their businesses, aiming to prevent not only bankruptcy but also the occasional bad years, following the same principle of risk-sharing. Although this may sound unusual when put this way, it’s an idea that, under a name often viewed suspiciously by traditionalists, has received considerable attention. In essence, it’s not much different from Communism, which could indeed be described as a universal and mandatory[2] insurance society covering all areas of business and, in some of its forms, encroaching on social and domestic life as well.
Although nothing so comprehensive as this is likely to be practically carried out on any very large scale, it deserves notice that the principle itself is steadily spreading in every direction in matters of detail. It is, for instance, the great complaint against Trades Unions that they too often seek to secure these results in respect of the equalization of the workmen's wages, thus insuring to some degree against incompetence, as they rightly and wisely do against illness and loss of work. Again, there is the Tradesman's Mutual Protection Society, which insures against the occasional loss entailed by the necessity of having to conduct prosecutions at law. There are societies in many towns for the prosecution of petty thefts, with the object of escaping the same uncertain and perhaps serious loss. Amongst instances of insurance for the people rather than by them, there is of course the giant example of the English Poor Law, in which the resemblance to an initial Communistic system becomes very marked. The poor are insured against loss 376 of work arising not only from illness and old age, but from any cause except wilful idleness. They do not, it is true, pay the whole premium, but since they mostly bear some portion of the burden of municipal and county taxation they must certainly be considered as paying a part of the premium. In some branches also of the public and private services the system is adopted of deducting a percentage from the wage or salary, for the purpose of a semi-compulsory insurance against death, illness or superannuation.
Although nothing as extensive as this is likely to be practically implemented on a very large scale, it’s worth noting that the principle itself is steadily being adopted in various details. For example, a major criticism of Trade Unions is that they often try to ensure equal wages for workers, which somewhat protects against incompetence, just as they rightly and wisely protect against illness and job loss. There’s also the Tradesman's Mutual Protection Society, which provides insurance against the occasional costs of having to pursue legal actions. Many towns have societies dedicated to prosecuting petty thefts, aiming to avoid uncertain and potentially significant losses. Among the examples of insurance for the people instead of by them, there’s the prominent case of the English Poor Law, where the similarities to an initial Communistic system are quite clear. The poor are insured against the loss of work due to not only illness and old age but any cause aside from willful idleness. While it’s true that they do not pay the full premium, since they primarily bear some of the costs associated with municipal and county taxes, they should certainly be seen as contributing to part of the premium. In some sectors of public and private services, there is also a system of deducting a percentage from wages or salaries for a semi-compulsory insurance against death, illness, or retirement.
§ 6. Closely connected with Insurance, as an application of Probability, though of course by contrast, stands Gambling. Though we cannot, in strictness, term either of these practices the converse of the other, it seems nevertheless correct to say that they spring from opposite mental tendencies. Some persons, as has been said, find life too monotonous for their taste, or rather the region of what can be predicted with certainty is too large and predominant in their estimation. They can easily adopt two courses for securing the changes they desire. They may, for one thing, aggravate and intensify the results of events which are comparatively incapable of prevision, these events not being in themselves of sufficient importance to excite any strong emotions. The most obvious way of doing this is by betting upon them. Or again, they may invent games or other pursuits, the individual contingencies of which are entirely removed from all possible human prevision, and then make heavy money consequences depend upon these contingencies. This is gambling proper, carried on mostly by means of cards and dice and the roulette.
§ 6. Closely related to Insurance, as a use of Probability, but in contrast, is Gambling. While we can't strictly say that one is the opposite of the other, it does seem accurate to say they come from different mental approaches. Some people, as mentioned, find life too dull for their liking, or they feel that what can be predicted with certainty takes up too much space in their lives. They have two main options for creating the excitement they want. They can choose to amplify the outcomes of events that are somewhat unpredictable, as those events might not be dramatic enough to provoke strong feelings on their own. The easiest way to do this is by placing bets on them. Alternatively, they might come up with games or activities whose individual outcomes are completely beyond any human prediction, and then tie significant financial stakes to those outcomes. This is true gambling, primarily conducted through cards, dice, and roulette.
The gambling spirit, as we have said, seeks for the excitement of uncertainty and variety. When therefore people make a long continued practice of playing, especially if the 377 stakes for which they play are moderate in comparison with their fortune, this uncertainty from the nature of the case begins to diminish. The thoroughly practised gambler, if he possesses more than usual skill (in games where skill counts for something), must be regarded as a man following a profession, though a profession for the most part of a risky and exciting kind, to say nothing of its ignoble and often dishonest character. If, on the other hand, his skill is below the average, or the game is one in which skill does not tell and the odds are slightly in favour of his antagonist, as in the gaming tables, one light in which he can be regarded is that of a man who is following a favourite amusement; if this amusement involves a constant annual outlay on his part, that is nothing more than what has to be said of most other amusements.
The gambling spirit, as we've mentioned, craves the thrill of uncertainty and variety. So, when people consistently engage in playing—especially if the stakes are reasonable compared to their wealth—the uncertainty naturally starts to decrease. A highly skilled gambler, if they have more than average talent in games where skill matters, should be seen as someone pursuing a profession, albeit a risky and thrilling one, not to mention its often unprincipled and dishonest nature. On the other hand, if their skill is below average, or if the game doesn’t require much skill and the odds slightly favor their opponent, like at gaming tables, they can be viewed as someone enjoying a favorite pastime; if this pastime requires a steady annual investment from them, that's just the same as what can be said for most other hobbies.
§ 7. We cannot, of course, give such a rational explanation as the above in every case. There are plenty of novices, and plenty of fanatics, who go on steadily losing in the full conviction that they will eventually come out winners. But it is hard to believe that such ignorance, or such intellectual twist, can really be so widely prevalent as would be requisite to constitute them the rule rather than the exception. There must surely be some very general impulse which is gratified by such resources, and it is not easy to see what else this can be than a love of that variety and consequent excitement which can only be found in perfection where exact prevision is impossible.
§ 7. We can't always provide such a clear explanation as the one above in every case. There are many beginners and many enthusiasts who keep losing, convinced that they'll eventually come out ahead. However, it's hard to believe that such ignorance or distorted thinking could really be as common as would be needed to make it the norm rather than the exception. There must be some strong underlying desire that is satisfied by these actions, and it's not easy to see what else this could be but a love for the variety and excitement that can only be found in situations where exact predictions are impossible.
It is of course very difficult to make any generalization here as to the comparative prevalence of various motives amongst mankind; but when one considers what is the difference which most quiet ordinary whist players feel between a game for ‘love’ and one in which there is a small stake, one cannot but assign a high value to the 378 influence of a wish to emphasize the excitement of loss and gain.
It’s really hard to make broad statements about which motives are most common among people; however, when you think about the difference that most quiet, ordinary whist players feel between a game played just for fun and one that has a small stake, it’s clear that the desire to heighten the thrill of winning and losing is quite significant.
I would not for a moment underrate the practical dangers which are found to attend the practice of gambling. It is remarked that the gambler, if he continues to play for a long time, is under an almost irresistible impulse to increase his stakes, and so re-introduce the element of uncertainty. It is in fact this tendency to be thus led on, which makes the principal danger and mischief of the practice. Risk and uncertainty are still such normal characteristics of even civilized life, that the mere extension of such tendencies into new fields does not in itself offer any very alarming prospect. It is only to be deprecated in so far as there is a danger, which experience shows to be no trifling one, that the fascination found in the pursuit should lead men into following it up into excessive lengths.[3]
I wouldn't underestimate the real dangers that come with gambling. It's noted that if someone keeps gambling for a long time, they develop an almost irresistible urge to increase their bets, bringing back the element of uncertainty. This tendency to get drawn in is what makes gambling particularly risky and harmful. Risk and uncertainty are typical aspects of even civilized life, so just extending these tendencies into new areas isn’t inherently alarming. It only becomes concerning because, as experience shows, there's a significant risk that the thrill of the chase can lead people to pursue it excessively.[3]
§ 8. The above general treatment of Gambling and Insurance seems to me the only rational and sound principle of division;—namely, that on which the different practices which, under various names, are known as gambling or insurance, are arranged in accordance with the spirit of which they are the outcome, and therefore of the results which they are designed to secure. If we were to attempt 379 to judge and arrange them according to the names which they currently bear, we should find ourselves led to no kind of systematic division whatever; the fact being that since they all alike involve, as their essential characteristic, payments and receipts, one or both of which are necessarily uncertain in their date or amount, the names may often be interchanged.
§ 8. The treatment of Gambling and Insurance given above seems to me the only logical and sensible principle of division; that is, organizing the different practices known as gambling or insurance according to the underlying spirit that they arise from, and therefore the outcomes they aim to achieve. If we tried to judge and categorize them based on their current names, we would end up with no systematic organization at all; the reality is that since they all share the core characteristic of involving payments and receipts—one or both of which are inherently uncertain in their timing or amount—the names can often be swapped.
For instance, a lottery and an ordinary insurance society against accident, if we merely look to the processes performed in them, are to all intents and purposes identical. In each alike there is a small payment which is certain in amount, and a great receipt which is uncertain in amount. A great many persons pay the small premium, whereas a few only of their number obtain a prize, the rest getting no return whatever for their outlay. In each case alike, also, the aggregate receipts and losses are intended to balance each other, after allowing for the profits of those who carry on the undertaking. But of course when we take into account the occasions upon which the insurers get their prizes, we see that there is all the difference in the world between receiving them at haphazard, as in a lottery, and receiving them as a partial set-off to a broken limb or injured constitution, as in the insurance society.
For example, a lottery and a typical accident insurance company are, in essence, quite similar when we look at how they operate. In both, there's a small, fixed payment, and a large, uncertain payout. Many people pay the small premium, but only a few win a prize, while the rest get nothing in return for their investment. In both cases, the total money received and lost is meant to balance out, after accounting for the profits of those running the operation. However, when we consider the circumstances under which the insurers receive their prizes, there's a huge difference between getting them randomly, like in a lottery, and receiving them as compensation for a broken limb or health issues, like in the insurance company.
Again, the language of betting may be easily made to cover almost every kind of insurance. Indeed De Morgan has described life insurance as a bet which the individual makes with the company, that he will not live beyond a certain age. If he dies young, he is pecuniarily a gainer, if he dies late he is a loser.[4] Here, too, though the expression 380 is technically quite correct (since any such deliberate risk of money, upon an unproductive venture, may fall under the definition of a bet), there is the broadest distinction between betting with no other view whatever than that of risking money, and betting with the view of diminishing risk and loss as much as possible. In fact, if the language of sporting life is to be introduced into the matter, we ought, I presume, to speak of the insurer as ‘hedging’ against his death.
Again, the language of betting can easily apply to almost any kind of insurance. In fact, De Morgan described life insurance as a bet that a person makes with the company, wagering that they will not live beyond a certain age. If the individual dies young, they gain financially; if they die later, they lose.[4] Here, even though the expression is technically accurate (since any intentional risk of money on a non-productive venture fits the definition of a bet), there is a clear difference between betting just for the sake of risking money and betting to reduce risk and loss as much as possible. In fact, if we are going to use the terminology of sports betting in this context, we might as well refer to the insurer as ‘hedging’ against their own death.
§ 9. Again, in Tontines we have a system of what is often called Insurance, and in certain points rightly so, but which is to all intents and purposes simply and absolutely a gambling transaction. They have been entirely abandoned, I believe, for some time, but were once rather popular, especially in France. On this plan the State, or whatever society manages the business, does not gain anything until the last member of the Tontine is dead. As the number of the survivors diminishes, the same sum-total of annuities still continues to be paid amongst them, as long as any are left alive, so that each receives a gradually increasing sum. Hence those who die early, instead of receiving the most, as on the ordinary plan, receive the least; for at the death of each member the annuity ceases absolutely, so far as he and his relations are concerned. The whole affair therefore is to all intents and purposes a gigantic system of betting, to see which can live the longest; the State being the common stake-holder, and receiving a heavy commission for its superintendence, this commission being naturally its sole motive for encouraging such a transaction. It is recorded of one of the French Tontines[5] that a widow of 97 was left, as the last survivor, to receive an annuity of 73,500 livres during the rest of the life which she could manage to drag on after that age;—she having originally subscribed a 381 single sum of 300 livres only. It is obvious that such a system as this, though it may sometimes go by the name of insurance, is utterly opposed to the spirit of true insurance, since it tends to aggravate existing inequalities of fortune instead of to mitigate them. The insurer here bets that he will die old; in ordinary insurance he bets that he will die young.
§ 9. Again, in Tontines, we have a system that’s often referred to as insurance, and in some respects it is, but in reality, it’s just a form of gambling. They have, I believe, been entirely abandoned for some time now, but were once quite popular, especially in France. Under this system, the State, or the organization overseeing it, doesn’t benefit until the last member of the Tontine passes away. As the number of survivors decreases, the total amount of annuities continues to be shared among them, as long as any are still alive, meaning each person receives a progressively larger amount. So, those who die early, instead of getting the most, actually get the least; when a member dies, their annuity stops completely for them and their family. Thus, this whole arrangement is essentially a massive betting system to see who lives the longest, with the State acting as the common stake-holder and taking a hefty commission for overseeing the process, which is clearly its main incentive for promoting such a scheme. It’s recorded of one of the French Tontines[5] that a 97-year-old widow was the last survivor, receiving an annuity of 73,500 livres for however long she could manage to live after that age—having initially contributed just a single payment of 300 livres. It’s clear that this kind of system, though it may sometimes be labeled as insurance, is completely contrary to the essence of true insurance, as it tends to exacerbate existing financial inequalities instead of alleviating them. In this case, the insurer bets they will die old; in standard insurance, they bet they’ll die young.
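For readers who want to see the mechanism in miniature, here is a rough Python sketch of a tontine. The 73,500-livre annuity is taken from the passage; the number of members and the flat yearly death rate are my own illustrative assumptions, not historical data. It shows how each survivor's share of the fixed annuity grows as the group thins out, until the last member left collects the whole sum.

```python
import random

def simulate_tontine(members=1000, total_annuity=73_500,
                     annual_death_prob=0.04, seed=1):
    """Toy model of a tontine: a fixed total annuity is divided equally
    among the survivors each year until the last member dies.  The number
    of members and the flat mortality rate are illustrative guesses."""
    rng = random.Random(seed)
    alive = members
    shares = []  # each year's payment per surviving member
    while alive > 0:
        shares.append(total_annuity / alive)
        # every survivor independently lives through the year or not
        alive = sum(1 for _ in range(alive) if rng.random() > annual_death_prob)
    return shares

shares = simulate_tontine()
print(f"ran for {len(shares)} years; first-year share {shares[0]:,.0f} livres, "
      f"last survivor's share {shares[-1]:,.0f} livres")
```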
Again, to take one final instance, common opinion often regards the bank or company which keeps a rouge et noir table, and the individuals who risk their money at it, as being both alike engaged in gambling. So they may be, technically, but for all practical purposes such a bank is as sure and safe a business as that of any ordinary insurance society, and probably far steadier in its receipts than the majority of ordinary trades in a manufacturing or commercial city. The bank goes in for many and small transactions, in proportion to its capital; their customers, very often, in proportion to their incomes go in for very heavy transactions. That the former comes out a gainer year after year depends, of course, upon the fact that the tables are notoriously slightly in their favour. But the steadiness of these gains when compared with the unsteadiness of the individual losses depends simply upon,—in fact, is merely an illustration of,—the one great permanent contrast which lies at the basis of all reasoning in Probability.
Again, to take one final example, people often see the bank or company that runs a rouge et noir table and the individuals who gamble their money as being equally involved in gambling. Technically, they may be, but for all practical purposes, that bank operates as a reliable and secure business, just like an ordinary insurance company, and is likely more stable in its income than most typical businesses in a manufacturing or commercial city. The bank engages in many small transactions relative to its capital; its customers, very often, stake heavily in proportion to their incomes. The bank’s consistent profits year after year are due to the fact that the odds are notoriously slightly in its favor. However, the consistency of these profits compared to the unpredictability of individual losses simply illustrates the fundamental contrast that underpins all reasoning in Probability.
§ 10. We have so far regarded Insurance and Gambling as being each the product of a natural impulse, and as having each, if we look merely to experience, a great mass of human judgment in its favour. The popular moral judgment, however, which applauds the one and condemns the other rests in great part upon an assumption, which has doubtless much truth in it, but which is often interpreted with an absoluteness which leads to error in each direction;—the duty of insurance being 382 too peremptorily urged upon every one, and the practice of gambling too universally regarded as involving a sacrifice of real self-interest, as being in fact little better than a persistent blunder. The assumption in question seems to be extracted from the acknowledged advantages of insurance, and then invoked to condemn the practice of gambling. But in so doing the fact does not seem to be sufficiently recognized that the latter practice, if we merely look to the extent and antiquity of the tacit vote of mankind in its favour, might surely claim to carry the day.
§ 10. Until now, we’ve viewed Insurance and Gambling as each the product of a natural impulse, and each supported by a great mass of human judgment based on experience. However, the common moral perspective that praises one and condemns the other relies largely on an assumption that, while it has some truth, is often interpreted too rigidly, leading to mistakes in both directions—the duty of insurance being urged too forcefully on everyone, while gambling is too universally regarded as a sacrifice of real self-interest, little better than a persistent blunder. This assumption seems to be drawn from the recognized benefits of insurance and then used to criticize gambling. However, it doesn’t seem to be fully acknowledged that the practice of gambling, considering the long-standing and widespread tacit approval from humanity, might well claim to carry the day.
It is of course obvious that in all cases with which we are concerned, the aggregate wealth is unaltered; money being merely transferred from one person to another. The loss of one is precisely equivalent to the gain of another. At least this is the approximation to the truth with which we find it convenient to start.[6] Now if the happiness which is yielded by wealth were always in direct proportion to its amount, it is not easy to see why insurance should be advocated or gambling condemned. In the case of the latter this is obvious enough. I have lost £50, say, but others (one or more as the case may be) have gained it, and the increase of their happiness would exactly balance the diminution of mine. In the case of Insurance there is a slight complication, arising from the fact that the falling in of the policy does not happen at random (otherwise, as already pointed 383 out, it would be simply a lottery), but is made contingent upon some kind of loss, which it is intended as far as possible to balance. I insure myself on a railway journey, break my leg in an accident, and, having paid threepence for my ticket, receive say £200 compensation from the insurance company. The same remarks, however, apply here; the happiness I acquire by this £200 would only just balance the aggregate loss of the 16,000 who have paid their threepences and received no return for them, were happiness always directly proportional to wealth.
It’s of course obvious that in all the cases we’re discussing, the total wealth remains unchanged; money is just shifting from one person to another. The loss for one person is exactly equal to the gain for someone else. At least, that’s the starting point we find convenient.[6] Now, if the happiness provided by wealth were always directly proportional to its amount, it’s hard to understand why insurance would be encouraged and gambling would be criticized. In the case of gambling, this is pretty straightforward. If I lose £50, for example, others (one or more) have gained that amount, and the increase in their happiness would completely offset my loss. With insurance, there’s a slight twist since the payout doesn’t happen randomly (otherwise, as mentioned, it would just be a lottery), but rather is linked to a specific loss that it aims to balance out. I insure myself for a train journey, break my leg in an accident, and after paying threepence for my ticket, I might receive £200 from the insurance company. However, the same points apply here; the happiness I gain from that £200 would only just offset the overall loss of the 16,000 people who paid their threepences and got nothing in return, assuming happiness were always directly proportional to wealth.
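The arithmetic implied by this example is easy to verify: at 240 pence to the pound, 16,000 threepenny premiums come to exactly £200, the single compensation payment. A minimal check, using only the figures given in the passage:

```python
PENCE_PER_POUND = 20 * 12    # 20 shillings to the pound, 12 pence to the shilling

passengers = 16_000          # ticket-holders who each paid threepence
premium_pence = 3
compensation_pounds = 200    # paid to the one injured traveller

total_premiums_pounds = passengers * premium_pence / PENCE_PER_POUND
print(total_premiums_pounds)                          # 200.0
print(total_premiums_pounds == compensation_pounds)   # True: receipts just cover the payout
```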
§ 11. The practice of Insurance does not, I think, give rise to many questions of theoretic interest, and need not therefore detain us longer. The fact is that it has hardly yet been applied sufficiently long and widely, or to matters which admit of sufficiently accurate statistical treatment, except in one department. This, of course, is Life Insurance; but the subject is one which requires constant attention to details of statistics, and is (rightly) mainly carried out in strict accordance with routine. As an illustration of this we need merely refer to the works of De Morgan,—a professional actuary as well as a writer on the theory of Probability,—who has found but little opportunity to aid his speculative treatment of Probability by examples drawn from this class of considerations.
§ 11. I don't think the practice of insurance raises many interesting theoretical questions, so we don't need to spend much more time on it. The reality is that it hasn't been applied long or widely enough, or to issues that allow for accurate statistical analysis, except in one area. That area is Life Insurance; however, it requires constant attention to statistical details and is (rightly) primarily done according to strict routines. To illustrate this, we can simply refer to the works of De Morgan—a professional actuary and a writer on the theory of probability—who has found little opportunity to support his theoretical work on probability with examples from this field.
With Gambling it is otherwise. Not only have a variety of interesting single problems been discussed (of which the Petersburg problem is the best known) but several speculative questions of considerable importance have been raised. One of these concerns the disadvantages of the practice of gambling. There have been a number of writers who, not content with dwelling upon the obvious moral and indirect mischief which results, in the shape of over-excitement, consequent greed, withdrawal from the steady business 384 habits which alone insure prosperity in the long run, diversion of wealth into dishonest hands, &c., have endeavoured to demonstrate the necessary loss caused by the practice.
With gambling, it's different. Not only have various interesting individual issues been explored (with the Petersburg problem being the most famous), but several significant speculative questions have also come up. One of these is about the drawbacks of gambling. Many authors, not satisfied with just discussing the obvious moral issues and indirect problems like excessive excitement, greed, disengagement from steady work that is essential for long-term success, and the shifting of wealth into dishonest hands, have tried to show the unavoidable losses that come from this practice.
§ 12. These attempts may be divided into two classes. There are (1) those which appeal to merely numerical considerations, and (2) those which introduce what is called the ‘moral’ as distinguished from the mathematical value of a future contingency.
§ 12. These attempts can be split into two categories. There are (1) those that focus on numerical factors, and (2) those that incorporate what is referred to as the ‘moral’ value, as opposed to the mathematical value of a future outcome.
(1) For instance, an ingenious attempt has been made by Mr Whitworth to prove that gambling is necessarily disadvantageous on purely mathematical grounds.
(1) For example, Mr. Whitworth has cleverly tried to show that gambling is always detrimental from a purely mathematical perspective.
When two persons play against each other one of the two must be ruined sooner or later, even though the game be a fair one, supposing that they go on playing long enough; the one with the smaller income having of course the worst chance of being the lucky survivor. If one of them has a finite, and the other an infinite income, it must clearly be the former who will be the ultimate sufferer if they go on long enough. It is then maintained that this is in fact every individual gambler's position, “no one is restricted to gambling with one single opponent; the speculator deals with the public at large, with a world whose resources are practically unlimited. There is a prospect that his operations may terminate to his own disadvantage, through his having nothing more to stake; but there is no prospect that it will terminate to his advantage through the exhaustion of the resources of the world. Every one who gambles is carrying on an unequal warfare: he is ranged with a restricted capital against an adversary whose means are infinite.”[7]
When two people compete against each other, one of the two must eventually be ruined, even if the game is a fair one, provided they keep playing long enough; the one with the smaller income obviously has the worst chance of being the lucky survivor. If one has a limited income and the other has an unlimited income, it’s clear that the person with the limited income will be the one who ultimately suffers if they keep playing. It’s argued that this is the situation for every individual gambler: “No one is limited to gambling against just one opponent; the speculator engages with the general public, with a world that has virtually unlimited resources. There’s a chance that their operations may end unfavorably for them because they have nothing left to bet; however, there’s no chance it will end in their favor due to the depletion of the world’s resources. Every gambler is fighting an unequal battle: they are up against an adversary with infinite means while having only limited capital.”[7]
In the above argument it is surely overlooked that the adversaries against whom he plays are not one body with a common purse, like the bank in a gambling establishment. 385 Each of these adversaries is in exactly the same position as he himself is, and a precisely similar proof might be employed to show that each of them must be eventually ruined which is of course a reduction to absurdity. Gambling can only transfer money from one player to another, and therefore none of it can be actually lost.
In the argument above, it’s clearly missed that the opponents he faces aren’t a single entity with a shared pot, like the bank in a casino. Each of these opponents is in the same situation he is, and a similar argument could be made to show that each of them will eventually face ruin, which is obviously absurd. Gambling merely shifts money from one player to another, so no actual money is lost.
§ 13. What really becomes of the money, when they play to extremity, is not difficult to see. First suppose a limited number of players. If they go on long enough, the money will at last all find its way into the pocket of some one of their number. If their fortunes were originally equal, each stands the same chance of being the lucky survivor; in which case we cannot assert, on any numerical grounds, that the prospect of the play is disadvantageous to any one of them. If their fortunes were unequal, the one who had the largest sum to begin with can be shown to have the best chance, according to some assignable law, of being left the final winner; in which case it must be just as advantageous for him, as it was disadvantageous for his less wealthy competitors.
§ 13. It’s not hard to see what really becomes of the money when they play on to the end. First, let’s assume a limited number of players. If they keep playing long enough, the money will eventually end up in one person’s hands. If they all started with the same amount, each player has an equal chance of being the lucky survivor; in that situation, we can’t assert, on any numerical grounds, that the prospect of play is a bad deal for anyone. However, if they started with different amounts, the player who had the most at the beginning can be shown, according to an assignable law, to have the best chance of being left the final winner; in that case, the game is just as advantageous for him as it is disadvantageous for his less wealthy competitors.
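The "assignable law" alluded to here is what is now usually called the gambler's-ruin result: in a fair game played for fixed unit stakes, a player starting with a units against an opponent holding b units ends up with everything with probability a/(a+b). The passage does not state the formula, but a short simulation (my own sketch, with arbitrary starting fortunes) agrees with it:

```python
import random

def plays_out(a, b, rng):
    """Fair unit-stake game continued until one side is ruined.
    Returns True if the player starting with `a` ends with everything."""
    while a > 0 and b > 0:
        if rng.random() < 0.5:
            a, b = a + 1, b - 1
        else:
            a, b = a - 1, b + 1
    return b == 0

rng = random.Random(0)
a, b, trials = 30, 10, 10_000        # arbitrary starting fortunes
wins = sum(plays_out(a, b, rng) for _ in range(trials))
print(f"simulated chance the richer player survives: {wins / trials:.3f}")
print(f"a / (a + b) = {a / (a + b):.3f}")   # 0.750
```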
When, instead of a limited number of players, we suppose an unlimited number, each as he is ruined retiring from the table and letting another come in, the results are more complicated, but their general tendency can be readily distinguished. If we supposed that no one retired except when he was ruined, we should have a state of things in which all the old players were growing gradually richer. In this case the prospect before the new comers would steadily grow worse and worse, for their chance of winning against such rich opponents would be exceedingly small. But as this is an unreasonable supposition, we ought rather to assume that not only do the ruined victims retire, but also that those who have gained fortunes of a certain amount 386 retire also, so that the aggregate and average wealth of the gambling body remains pretty steady. What chance any given player has of being ruined, and how long he may expect to hold out before being ruined, will depend of course upon the initial incomes of the players, the rules of the game, the stakes for which they play, and other considerations. But it is clear that for all that is lost by one, a precisely equal sum must be gained by others, and that therefore any particular gambler can only be cautioned beforehand that his conduct is not to be recommended, by appealing to some such suppositions as those already mentioned in a former section.
When we imagine not just a limited number of players but an unlimited number, where each player who is ruined leaves the table for someone else to take their place, the outcomes become more complex, but we can still clearly see the overall trend. If we assume that no one leaves except when they have lost everything, we would observe a situation where all the long-standing players are gradually getting wealthier. In this scenario, the prospects for newcomers would continuously worsen, as their chances of winning against such wealthy opponents would be extremely slim. However, since this assumption is unrealistic, we should instead consider that not only do those who lose everything leave, but also that those who have won fortunes of a certain size leave as well, so that the total and average wealth of the players remains relatively stable. The likelihood of any given player getting ruined, and how long they can expect to last before that happens, will naturally depend on factors like the initial incomes of the players, the rules of the game, the stakes they are playing for, and other considerations. It’s clear that for every loss one player incurs, an exactly equal amount must be gained by others. Therefore, any particular gambler can only be cautioned in advance that his conduct is not to be recommended by appealing to suppositions such as those already mentioned in an earlier section.
§ 14. As an additional justification of this view the reader may observe that the state of things in the last example is one which, expressed in somewhat different language and with a slight alteration of circumstances, is being incessantly carried on upon a gigantic scale upon every side of us. Call it the competition of merchants and traders in a commercial country, and the general results are familiar enough. It is true that in so far as skill comes into the question, they are not properly gamblers; but in so far as chance and risk do, they may be fairly so termed, and in many branches of business this must necessarily be the case to a very considerable extent. Whenever business is carried on in a reckless way, the comparison is on general grounds fair enough. In each case alike we find some retiring ruined, and some making their fortunes; and in each case alike also the chances, cœteris paribus, lie with those who have the largest fortunes. Every one is, in a sense, struggling against the collective commercial world, but since each of his competitors is doing the same, we clearly could not caution any of them (except indeed the poorer ones) that their efforts must finally end in disadvantage.
§ 14. To further support this perspective, the reader may notice that the situation described in the last example is one that, expressed in slightly different terms and with a few changes in circumstances, is constantly happening on a massive scale all around us. Call it the competition among merchants and traders in a commercial nation, and the overall outcomes are well-known. It’s true that when it comes to skill, they aren’t exactly gamblers; however, when it comes to chance and risk, they can definitely be seen as such, and in many business sectors, this is often fundamentally the case. Whenever business is conducted carelessly, the comparison holds up reasonably well. In both scenarios, we see some individuals going bankrupt while others strike it rich; and similarly, in each case, the odds, cœteris paribus, favor those with the most significant resources. Everyone is, in a way, fighting against the broader commercial environment, but since each of their competitors is doing the same, we clearly couldn’t warn any of them (except perhaps the less fortunate) that their efforts will ultimately lead to a disadvantage.
§ 15. If we wish to see this result displayed in its most decisive form we may find a good analogy in a very different class of events, viz. in the fate of surnames. We are all gamblers in this respect, and the game is carried out to the last farthing with a rigour unknown at Newmarket or Monte Carlo. In its complete treatment the subject is a very intricate one,[8] but a simple example will serve to display the general tendency. Suppose a colony comprising 1000 couples of different surnames, and suppose that each of these has four children who grow up to marry. Approximately, one in 16 of these families will consist of girls only; and therefore, under ordinary conventions, about 62 of the names will have disappeared for ever after the next generation. Four again out of 16 will have but one boy, each of whom will of course be in the same position as his father, viz. the sole representative of his name. Accordingly in the next generation one in 16 of these names will again drop out, and so the process continues. The number which disappears in each successive generation becomes smaller, as the stability of the survivors becomes greater owing to their larger numbers. But there is no check to the process.
§ 15. If we want to see this result shown in its most clear form, we can find a good analogy in a very different set of events, namely, the fate of surnames. We're all players in this regard, and the game carries on to the last penny with a strictness that you wouldn't find at Newmarket or Monte Carlo. The topic is quite complex when fully explored, but a simple example will highlight the general trend. Imagine a community made up of 1,000 couples with different surnames, and suppose each couple has four children who grow up to marry. Roughly, one in 16 of these families will consist of girls only; therefore, under usual conventions, about 62 of the names will have vanished entirely after the next generation. Another four out of 16 will have just one boy, who will be in the same situation as his father, that is, the sole representative of his name. Consequently, in the next generation, one in 16 of these names will also fade away, and this pattern continues. The number of names that disappears in each generation becomes smaller as the stability of the survivors increases due to their larger numbers. However, there’s no halt to this process.
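The surname example is an early statement of what is now called a branching process, and it is easy to imitate numerically. The sketch below follows the passage's assumptions: 1,000 surnames, four children to every married male, each child a boy or a girl with equal chance; the number of generations simulated is an arbitrary choice of mine. Roughly one name in sixteen (about 62) should vanish in the first generation, with smaller absolute losses in each generation after it.

```python
import random

def surviving_names(n_names=1000, generations=8, children=4, seed=0):
    """Follow Venn's surname example: every married male has `children`
    children, each independently a boy with probability 1/2, and a surname
    survives only through the boys.  Returns the count of surviving names
    after each generation."""
    rng = random.Random(seed)
    males = [1] * n_names            # one male line per surname to begin with
    counts = [n_names]
    for _ in range(generations):
        males = [sum(1 for _ in range(m * children) if rng.random() < 0.5)
                 for m in males]
        counts.append(sum(1 for m in males if m > 0))
    return counts

print(surviving_names())
# Roughly 1000/16, i.e. about 62 names, vanish in the first generation,
# and a smaller number in each generation after that.
```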
§ 16. The analogy here is a very close one, the names which thus disappear corresponding to the gamblers who retire ruined and those which increase in number corresponding to the lucky winners. The ultimate goal in each case alike,—of course an exceedingly remote one,—is the exclusive survival of one at the expense of all the others. That one surname does thus drop out after another must have struck every one who has made any enquiry into family 388 genealogy, and various fanciful accounts have been given by those unfamiliar with the theory of probability. What is often apt to be overlooked is the extreme slightness of what may be termed the “turn of the tables” in favour of the survival at each generation. In the above numerical example we have made an extravagantly favourable supposition, by assuming that the population doubles at every generation. In an old and thickly populated country where the numbers increase very slowly, we should be much nearer the mark in assuming that the average effective family,—that is, the average number of children who live to marry,—was only two. In this case every family which was represented at any time by but a single male would have but three chances in four of surviving extinction, and of course the process of thinning out would be a more rapid one.
§ 16. The comparison here is very similar, with the disappearing names representing the gamblers who leave after losing everything, and the increasing names representing the lucky winners. The ultimate goal in both cases—though it’s incredibly unlikely—is for one name to survive exclusively at the cost of all the others. The fact that one surname drops out after another must be noticeable to anyone who has researched family genealogy, and various imaginative explanations have been offered by those unfamiliar with probability theory. What often gets overlooked is just how minimal the “turn of the tables” is in favor of survival in each generation. In the numerical example above, we’ve made an excessively optimistic assumption that the population doubles with each generation. In an old, densely populated country where numbers grow slowly, we’d be much closer to reality by assuming that the average effective family—that is, the average number of children who survive to marry—was only two. In this scenario, any family represented by just one male would only have a three out of four chance of not going extinct, and naturally, the process of decline would occur more quickly.
§ 17. The most interesting class of attempts to prove the disadvantages of gambling appeal to what is technically called ‘moral expectation’ as distinguished from ‘mathematical expectation.’ The latter may be defined simply as the average money value of the venture in question; that is, it is the product of the amount to be gained (or lost) and the chance of gaining (or losing) it. For instance, if I bet four to one in sovereigns against the occurrence of ace with a single die there would be, on the average of many throws, a loss of four pounds against a gain of five pounds on each set of six occurrences; i.e. there would be an average gain of three shillings and fourpence on each throw. This is called the true or mathematical expectation. The so-called ‘moral expectation’, on the other hand, is the subjective value of this mathematical expectation. That is, instead of reckoning a money fortune in the ordinary way, as what it is, the attempt is made to reckon it at what it is felt to be. The elements of computation 389 therefore become, not pounds and shillings, but sums of pleasure enjoyed actually or in prospect. Accordingly when reckoning the present value of a future gain, we must now multiply, not the objective but the subjective value, by the chance we have of securing that gain.
§ 17. The most interesting class of attempts to prove the disadvantages of gambling appeals to what’s known as ‘moral expectation’ in contrast to ‘mathematical expectation.’ The latter can be simply defined as the average monetary outcome of a given gamble; it’s the product of the potential gain (or loss) and the likelihood of achieving (or losing) it. For example, if I bet four to one in sovereigns against rolling an ace with a single die, over many rolls I would, on average, lose four pounds against a gain of five pounds for every set of six rolls; that is, there would be an average gain of three shillings and fourpence for each roll. This is referred to as the true or mathematical expectation. On the other hand, the so-called ‘moral expectation’ is the personal value of this mathematical expectation. Instead of calculating a fortune in the usual way, as it actually is, the attempt is made to reckon it at what it is felt to be. As a result, the elements of calculation become, not pounds and shillings, but amounts of pleasure actually experienced or anticipated. Therefore, when calculating the present value of a future gain, we now need to multiply, not the objective but the subjective value, by the probability of obtaining that gain.
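The figures in this definition can be checked directly. Betting four sovereigns to one against an ace on a single throw means winning £1 on the five throws in six that show no ace and losing £4 on the sixth; the balance is one sixth of a pound, that is, three shillings and fourpence, per throw. A small exact computation, assuming pre-decimal currency (20 shillings to the pound, 12 pence to the shilling):

```python
from fractions import Fraction

# Stake four sovereigns against one that an ace will NOT turn up on a single throw.
p_ace = Fraction(1, 6)
expectation_pounds = (1 - p_ace) * 1 - p_ace * 4   # win 1 sovereign, or lose 4
print(expectation_pounds)                          # 1/6 of a pound per throw

pence = expectation_pounds * 240                   # 240 pence to the pound
print(pence // 12, "shillings and", pence % 12, "pence")   # 3 shillings and 4 pence
```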
With regard to the exact relation of this moral fortune to the physical various more or less arbitrary assumptions have been made. One writer (Buffon) considers that the moral value of any given sum varies inversely with the total wealth of the person who gains it. Another (D. Bernoulli) starting from a different assumption, which we shall presently have to notice more particularly, makes the moral value of a fortune vary as the logarithm of its actual amount.[9] A third (Cramer) makes it vary with the square root of the amount.
Regarding the exact relationship of moral wealth to physical wealth, various somewhat arbitrary assumptions have been proposed. One writer (Buffon) considers that the moral value of any given sum varies inversely with the total wealth of the person who gains it. Another writer (D. Bernoulli), beginning from a different assumption, which we will discuss in more detail shortly, makes the moral value of a fortune vary as the logarithm of its actual amount.[9] A third writer (Cramer) makes it vary with the square root of the amount.
§ 18. Historically, these proposals have sprung from the wish to reconcile the conclusions of the Petersburg problem with the dictates of practical common sense; for, by substituting the moral for the physical estimate the total value of the expectation could be reduced to a finite sum. On this ground therefore such proposals have no great interest, for, as we have seen, there is no serious difficulty in the problem when rightly understood.
§ 18. Historically, these proposals have come from the desire to align the findings of the Petersburg problem with practical common sense; because by replacing the physical assessment with a moral one, the overall value of the expectation could be simplified to a finite sum. For this reason, such proposals are not particularly intriguing, since, as we've observed, there is no significant challenge in the problem when it is properly understood.
These same proposals however have been employed in 390 order to prove that gambling is necessarily disadvantageous, and this to both parties. Take, for instance, Bernoulli's supposition. It can be readily shown that if two persons each with a sum of £50 to start with choose to risk, say, £10 upon an even wager there will be a loss of happiness as a result; for the pleasure gained by the possessor of £60 will not be equal to that which is lost by the man who leaves off with £40.[10]
These same proposals, however, have been used to prove that gambling is always detrimental for both sides. For instance, consider Bernoulli’s assumption. It’s easy to show that if two people each start with £50 and decide to risk, say, £10 on an even bet, there will be a loss of happiness as a result; the pleasure gained by the person with £60 won’t match the loss experienced by the person who ends up with £40.[10]
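Under Bernoulli's assumption that 'moral fortune' varies as the logarithm of actual fortune, the claim is a two-line calculation: the logarithm of £60 plus that of £40 falls short of twice the logarithm of £50, so the even £10 wager lowers the total. A minimal check:

```python
import math

def moral_value(wealth):
    # Bernoulli's assumption: satisfaction varies as the logarithm of wealth
    return math.log(wealth)

before = 2 * moral_value(50)                  # both players at £50
after = moral_value(60) + moral_value(40)     # one wins £10, the other loses £10
print(before > after)                         # True: total 'moral fortune' has fallen
print(round(before - after, 4))               # about 0.0408 in log units
```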
§ 19. This is the form of argument commonly adopted; but, as it stands, it does not seem conclusive. It may surely be replied that all which is thus proved is that inequality is bad, on the ground that two fortunes of £50 are better than one of £60 and one of £40. Conceive for instance that the original fortunes had been £60 and £40 respectively, the event may result in an increase of happiness; for this will certainly be the case if the richer man loses and the fortunes are thus equalized. This is quite true; and we are therefore obliged to show,—what can be very easily shown,—that if the other alternative had taken place and the two fortunes had been made still more unequal (viz. £65 and £35 respectively) the happiness thus lost would more than balance what would have been gained by the equalization. And since these two suppositions are equally likely there will be a loss in the long run.
§ 19. This is the form of argument commonly adopted; however, as it stands, it doesn’t seem conclusive. It may surely be replied that all that this proves is that inequality is harmful, based on the idea that two fortunes of £50 are better than one of £60 and one of £40. Imagine, for example, that the original fortunes were £60 and £40, respectively; the wager may then result in an increase of happiness, for this will certainly be the case if the richer person loses and the fortunes are thus equalized. That’s certainly true, and we are therefore obliged to demonstrate—something that can be shown quite easily—that if the alternative had happened and the two fortunes had become even more unequal (namely £65 and £35), the happiness lost would more than outweigh what was gained by equalizing them. Since these two scenarios are equally likely, there will be a loss in the long run.
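The balancing step of this section is itself a short calculation on Bernoulli's logarithmic assumption: compare the happiness gained if the £60/£40 fortunes are equalized at £50 each with the happiness lost if they are instead pushed to the £65 and £35 of the passage. A sketch of that comparison:

```python
import math

def moral_total(a, b):
    # Bernoulli's logarithmic measure applied to the two fortunes together
    return math.log(a) + math.log(b)

start     = moral_total(60, 40)   # the original fortunes
equalized = moral_total(50, 50)   # the richer man loses and the fortunes level out
widened   = moral_total(65, 35)   # the other branch considered in the passage

gain = equalized - start          # happiness gained by the equalization
loss = start - widened            # happiness lost by the widening
print(round(gain, 4), round(loss, 4))   # about 0.0408 against 0.0535
print(loss > gain)                      # True: the even-chance balance is a loss
```

Since the two branches are taken as equally likely, the larger loss outweighs the smaller gain, which is the conclusion drawn in the text.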
The consideration just adduced seems however to show 391 that the common way of stating the conclusion is rather misleading; and that, on the assumption in question as to the law of dependence of happiness on wealth, it really is the case that the effective element in rendering gambling disadvantageous is its tendency to the increase of the inequality in the distribution of wealth.
The discussion just presented suggests that the usual way of stating the conclusion can be misleading; and that, based on the assumption regarding how happiness depends on wealth, the main reason gambling is disadvantageous is that it tends to increase inequality in wealth distribution.
§ 20. This raises two questions, one of some speculative interest in connection with our subject, and the other of supreme importance in the conduct of life. The first is this: quite apart from any particular assumption which we make about moral fortunes or laws of variation of happiness, is it the fact that gambling tends to increase the existing inequalities of wealth? Theoretically there is no doubt that this is so. Take the simplest case and suppose two people tossing for a pound. If their fortunes were equal to begin with there must be resultant inequality. If they were unequal there is an even chance of the inequality being increased or diminished; but since the increase is proportionally greater than the decrease, the final result remains of the same kind as when the fortunes were equal.[11] Taking a more general view the same conclusion underlies all our reasoning as to the averages of large numbers, viz. that the resultant divergencies increase absolutely (however they diminish relatively) as the numbers become greater. And of course we refer to these absolute divergencies when we are talking of the distribution of wealth.
§ 20. This brings up two questions, one that’s somewhat speculative in relation to our topic, and the other that’s extremely important in how we live our lives. The first question is this: quite apart from any particular assumption we make about moral fortune or the law by which happiness varies with wealth, does gambling really tend to widen the existing inequalities of wealth? Theoretically, there’s no doubt that it does. Consider the simplest scenario: imagine two people tossing for a pound. If they start with equal wealth, there must be resulting inequality. If they start with unequal wealth, there’s an even chance that the inequality will either increase or decrease; but because the increase is proportionally greater than the decrease, the net result is of the same kind as when the fortunes were equal: on balance, inequality tends to grow.[11] Taking a broader perspective, the same conclusion is at the core of all our reasoning regarding averages of large numbers, namely that the resulting divergences increase absolutely (even if they decrease relatively) as the numbers grow. And of course, we’re talking about these absolute differences when we discuss how wealth is distributed.
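The tendency described here, absolute divergences growing even while relative ones shrink, can be seen in a toy simulation: two players of equal fortune toss for £1 repeatedly, and we track the average absolute gap between their fortunes. The figures below come from my own sketch (unit stakes, ruin ignored for simplicity); the gap grows roughly as the square root of the number of tosses, so it keeps increasing without limit.

```python
import random

def mean_absolute_gap(n_tosses, trials=5000, seed=0):
    """Two players start level and toss for £1 n_tosses times; return the
    average absolute difference between their fortunes afterwards."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        gap = 0
        for _ in range(n_tosses):
            gap += 2 if rng.random() < 0.5 else -2   # £1 changes hands each toss
        total += abs(gap)
    return total / trials

for n in (25, 100, 400):
    print(n, round(mean_absolute_gap(n), 1))
# The average gap roughly doubles each time the number of tosses is quadrupled:
# the absolute divergence keeps growing, just as the section argues.
```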
§ 21. This is the theoretic conclusion. How far the actual practice of gambling introduces counteracting agencies must be left to the determination of those who are competent to pronounce. So far as outsiders are authorised to judge from what they read in the newspapers and other public sources of information, it would appear that these counteracting agencies are very considerable, and that in consequence it is a rather insecure argument to advance against gambling. Many a large fortune has notoriously been squandered on the race-course or in gambling saloons, and most certainly a large portion, if not the major part, has gone to swell the incomes of many who were by comparison poor. But the solution of this question must clearly be left to those who have better opportunities of knowing the facts than is to be expected on the part of writers on Probability.
§ 21. This is the theoretical conclusion. How much the actual practice of gambling introduces opposing factors is something that should be determined by those qualified to make a judgment. To the extent that outsiders can form an opinion based on what they read in newspapers and other public information sources, it seems that these opposing factors are significant, making it a shaky argument to present against gambling. Many large fortunes have famously been wasted on the racetrack or in gambling venues, and certainly a substantial portion, if not the majority, has ended up increasing the incomes of many who were relatively poor. However, the resolution of this issue should clearly be left to those who have a better understanding of the facts than is reasonable to expect from writers on Probability.
§ 22. The general conclusion to be drawn is that those who invoked this principle of moral fortune as an argument against gambling were really raising a much more intricate and far-reaching problem than they were aware of. What they were at work upon was the question, What is the distribution of wealth which tends to secure the maximum of happiness? Is this best secured by equality or inequality? Had they really followed out the doctrine on which their denunciation of gambling was founded they ought to have adopted the Socialist's ideal as being distinctly that which tends to increase happiness. And they ought to have brought under the same disapprobation which they expressed against gambling all those tendencies of modern civilized life which work in the same direction. For instance; keen competition, speculative operations, extended facilities of credit, mechanical inventions, enlargement of business operations into vast firms:—all these, and other 393 similar tendencies too numerous to mention here, have had some influence in the way of adding to existing inequalities. They are, or have been, in consequence denounced by socialists: are we honestly to bring them to this test in order to ascertain whether or not they are to be condemned? The reader who wishes to see what sort of problems this assumption of ‘moral fortune’ ought to introduce may be recommended to read Mr F. Y. Edgeworth's Mathematical Psychics, the only work with which I am acquainted which treats of these questions.
§ 22. The overall conclusion is that those who used the principle of moral fortune to argue against gambling were actually tackling a much more complex and significant issue than they realized. They were really asking the question, What is the distribution of wealth that promotes the greatest happiness? Is this best achieved through equality or inequality? If they truly followed the doctrine that underpinned their condemnation of gambling, they should have embraced the Socialist’s vision, as it is clearly aimed at increasing happiness. They also should have applied the same criticism they had for gambling to all the aspects of modern life that push in the same direction. For example: intense competition, speculative ventures, expanded credit options, technological advancements, and the growth of large corporations—these and many other similar trends, too numerous to list here, have contributed to increasing inequalities. They are, or have been, criticized by socialists in consequence: are we honestly to apply this same test to them in order to decide whether or not they are to be condemned? Readers interested in the type of issues that the idea of ‘moral fortune’ should raise are encouraged to check out Mr. F. Y. Edgeworth’s Mathematical Psychics, the only work I know that discusses these topics.
1 The question of the advisability of inoculation against the small-pox, which gave rise to much discussion amongst the writers on Probability during the last century, is a case in point of the same principles applied to a very different kind of instance. The loss against which the insurance was directed was death by small-pox, the premium paid was the illness and other inconvenience, and the very small risk of death, from the inoculation. The disputes which thence arose amongst writers on the subject involved the same difficulties as to the balance between certain moderate loss and contingent great loss. In the seventeenth century it seems to have been an occasional practice, before a journey into the Mediterranean, to insure against capture by Moorish pirates, with a view to secure having the ransom paid. (See, for an account of some extraordinary developments of the insurance principle, Walford's Insurance Guide and Handbook. It is not written in a very scientific spirit, but it contains much information on all matters connected with insurance.)
1 The issue of whether it was advisable to practice inoculation against smallpox sparked a lot of debate among writers on Probability in the last century, serving as a relevant example of the same principles applied to a very different kind of case. The loss that the insurance was directed against was death from smallpox, while the premium paid was the illness and the other inconveniences that came with the inoculation, which itself posed a very small risk of death. The disagreements that arose among writers on the topic dealt with the same challenges regarding the balance between a certain moderate loss and a possible significant loss. In the seventeenth century, it seems to have been an occasional practice to insure against being captured by Moorish pirates before traveling to the Mediterranean, in order to ensure the ransom would be paid. (For an account of some extraordinary developments of the insurance principle, see Walford's Insurance Guide and Handbook. It may not have a very scientific tone, but it offers a lot of information on all matters related to insurance.)
2 All that is meant by the above comparison is that the ideal aimed at by Communism is similar to that of Insurance. If we look at the processes by which it would be carried out, and the means for enforcing it, the matter would of course assume a very different aspect. Similarly with the action of Trades Unionism referred to in the next paragraph.
2 What the above comparison means is that the goal of Communism is similar to that of Insurance. If we consider how it would be implemented and the ways to enforce it, the situation would clearly look very different. The same applies to the actions of Trade Unionism mentioned in the next paragraph.
3 One of the best discussions that I have recently seen on these subjects, by a writer at once thoroughly competent and well informed, is in Mr Proctor's Chance and Luck. It appears to me however that he runs into an extreme in his denunciation not of the folly but of the dishonesty of all gambling. Surely also it is a strained use of language to speak of all lotteries as ‘unfair’ and even ‘swindling’ on the ground that the sum-total of what they distribute in prizes is less than that of what they receive in payments. The difference, in respect of information deliberately withheld and false reports wilfully spread, between most of the lotteries that have been supported, and the bubble companies which justly deserve the name of swindles, ought to prevent the same name being applied to both.
3 One of the best discussions I've recently come across on these topics, by a writer who is both thoroughly competent and well informed, is in Mr. Proctor's Chance and Luck. However, it seems to me that he goes too far in his condemnation not of the foolishness but of the dishonesty of all gambling. It's also a strained use of language to label all lotteries as 'unfair' and even 'swindling' just because the total amount they pay out in prizes is less than what they take in. The difference, in terms of information deliberately withheld and false reports willfully spread, between most of the lotteries that have been endorsed and the bubble companies that rightly deserve the name of swindles, should prevent the same label from being applied to both.
4 “A fire insurance is a simple bet between the office and the party, and a life insurance is a collection of wagers. There is something of the principle of a wager in every transaction in which the results of a future event are to bring gain or loss.” Penny Cyclopædia, under the head of Wager.
4 “Fire insurance is basically a straightforward bet between the company and the individual, while life insurance is like a series of bets. Every transaction that involves the outcome of a future event leading to either profit or loss has an element of a wager.” Penny Cyclopædia, under the head of Wager.
5 Encyclopédie Methodique, under the head of Tontines.
5 Encyclopédie Méthodique, under the heading of Tontines.
6 Of course, if we introduce considerations of Political Economy, corrections will have to be made. For one thing, every Insurance Office is, as De Morgan repeatedly insists, a Savings Bank as well as an Insurance Office. The Office invests the premiums, and can therefore afford to pay a larger sum than would otherwise be the case. Again, in the case of gambling, a large loss of capital by any one will almost necessarily involve an actual destruction of wealth; to say nothing of the fact that, practically, gambling often causes a constant transfer of wealth from productive to unproductive purposes.
6 Of course, if we bring in considerations of Political Economy, corrections will need to be made. For one thing, every Insurance Office is, as De Morgan repeatedly points out, a Savings Bank as well as an Insurance Office. The Office invests the premiums, which allows it to pay out a larger amount than would otherwise be feasible. Additionally, in the case of gambling, a large loss of capital by any one individual will almost certainly involve an actual destruction of wealth; not to mention that, in practice, gambling often causes a constant transfer of wealth from productive to unproductive uses.
8 It was, I believe, first treated as a serious problem by Mr Galton. (See the Journal Anthrop. Inst. Vol. IV. 1875, where a complete mathematical solution is indicated by Mr H. W. Watson.)
8 I believe Mr. Galton was the first to treat this as a serious problem. (See the Journal Anthrop. Inst. Vol. IV. 1875, where a complete mathematical solution is outlined by Mr. H. W. Watson.)
9 Bernoulli himself does not seem to have based his conclusions upon actual experience. But it is a noteworthy fact that the assumption with which he starts, viz. that the subjective value of any small increment (dx) is inversely proportional to the sum then possessed (x), and which leads at once to the logarithmic law above mentioned, is identical with one which is now familiar enough to every psychologist. It is what is commonly called Fechner's Law, which he has established by aid of an enormous amount of careful experiment in the case of a number of our simple sensations. But I do not believe that he has made any claim that such a law holds good in the far more intricate dependence of happiness upon wealth.
9 Bernoulli himself doesn’t seem to have based his conclusions on actual experience. However, it’s important to note that the assumption he starts with—that the subjective value of any small increment (dx) is inversely proportional to the total amount possessed (x)—leads directly to the logarithmic law mentioned above. This assumption is identical to one that is now well-known to every psychologist. It’s what’s commonly referred to as Fechner's Law, which he established through extensive careful experimentation involving several of our basic sensations. But I don’t think he has claimed that this law also applies to the far more complex relationship between happiness and wealth.
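Since both this footnote and the next lean on the logarithmic law, a minimal sketch of the intermediate step may help; it is an addition for clarity, not part of the original, and it uses the same symbols a and c that appear in footnote 10 below.

```latex
% Assumption (Bernoulli/Fechner): the subjective value dU of a small
% increment dx is inversely proportional to the fortune x already possessed:
%     dU = c\,dx / x .
% Integrating from a (the fortune at which happiness is taken to vanish,
% so that U(a) = 0) up to the present fortune x gives the logarithmic law:
\[
  U(x) = \int_{a}^{x} \frac{c}{t}\, dt = c \log\frac{x}{a}.
\]
```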
10 The formula expressive of this moral happiness is c log x/a; where x stands for the physical fortune possessed at the time, and a for that small value of it at which happiness is supposed to disappear: c being an arbitrary constant. Let two persons, whose fortune is x, risk y on an even bet. Then the balance, as regards happiness, must be drawn between 2c log x/a and c log (x + y)/a + c log (x − y)/a; or log x² and log (x + y)(x − y); or x² and x² − y², the former of which is necessarily the greater.
10 The formula for this moral happiness is c log x/a, where x represents the wealth someone has at the moment, a is the small value of wealth at which happiness is supposed to disappear, and c is an arbitrary constant. If two people, each with wealth x, bet y on a fair game, then the overall happiness balance must be drawn between 2c log x/a and c log (x + y)/a + c log (x − y)/a; that is, between log x² and log (x + y)(x − y), or between x² and x² − y², the first of which is necessarily the greater.
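A short numerical illustration of the comparison in footnote 10 follows. It is a sketch added for clarity: the particular fortunes and stake, and the choice a = c = 1, are illustrative assumptions, since the footnote leaves them arbitrary.

```python
import math

def moral_fortune(x, a=1.0, c=1.0):
    """Bernoulli's 'moral fortune' of a physical fortune x: c * log(x / a)."""
    return c * math.log(x / a)

# Two players, each with physical fortune x, stake y on an even (fair) bet.
x, y = 50.0, 20.0

before = 2 * moral_fortune(x)                        # 2c log(x/a)
after = moral_fortune(x + y) + moral_fortune(x - y)  # c log((x+y)/a) + c log((x-y)/a)

# 'before' always exceeds 'after', since x^2 > x^2 - y^2 whenever y is not zero.
print(round(before, 4), round(after, 4))
```

Whatever values are chosen, the total physical fortune is unchanged while the total 'moral' fortune falls, which is the footnote's point.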
11 This may be seen more clearly as follows. Suppose two pair of gamblers, each pair consisting of men possessing £50 and £30 respectively. Now if we suppose the richer man to win in one case and the poorer in the other these two results will be a fair representation of the average; for there are only two alternatives and these will be equally frequent in the long run. It is obvious that we have had two fortunes of £50 and two of £30 converted into one of £20, two of £40, and one of £60. And this is clearly an increase of inequality.
11 This can be seen more clearly as follows. Imagine two pairs of gamblers, each pair consisting of men who have £50 and £30, respectively. Now, if we suppose that the richer man wins in one case and the poorer man wins in the other, these two outcomes will fairly represent the average, because there are only two alternatives and they will occur equally often in the long run. It's clear that we've had two fortunes of £50 and two of £30 converted into one of £20, two of £40, and one of £60. This is clearly an increase of inequality.
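The arithmetic of footnote 11 can be laid out explicitly. The stake is not stated in the text; a stake of £10 is assumed below because it reproduces the outcomes listed (one of £20, two of £40, one of £60).

```python
from statistics import pstdev

stake = 10            # assumed; chosen to reproduce the outcomes named in the footnote
pair = [50, 30]       # each pair of gamblers holds 50 and 30 pounds

case_richer_wins = [pair[0] + stake, pair[1] - stake]   # [60, 20]
case_poorer_wins = [pair[0] - stake, pair[1] + stake]   # [40, 40]

before = pair + pair                          # [50, 30, 50, 30]
after = case_richer_wins + case_poorer_wins   # [60, 20, 40, 40]

print(sum(before) / 4, sum(after) / 4)   # average fortune unchanged: 40.0 40.0
print(pstdev(before), pstdev(after))     # spread widens: 10.0 -> about 14.14
```

The average fortune is untouched, but the dispersion of fortunes grows, which is the increase of inequality described.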
CHAPTER 16.
THE APPLICATION OF PROBABILITY TO TESTIMONY.
§ 1. On the principles which have been adopted in this work, it becomes questionable whether several classes of problems which may seem to have acquired a prescriptive right to admission, will not have to be excluded from the science of Probability. The most important, perhaps, of these refer to what is commonly called the credibility of testimony, estimated either at first hand and directly, or as influencing a juryman, and so reaching us through his sagacity and trustworthiness. Almost every treatise upon the science contains a discussion of the principles according to which credit is to be attached to combinations of the reports of witnesses of various degrees of trustworthiness, or the verdicts of juries consisting of larger or smaller numbers. A great modern mathematician, Poisson, has written an elaborate treatise expressly upon this subject; whilst a considerable portion of the works of Laplace, De Morgan, and others, is devoted to an examination of similar enquiries. It would be presumptuous to differ from such authorities as these, except upon the strongest grounds; but I confess that the extraordinary ingenuity and mathematical ability which have been devoted to these problems, considered as questions in Probability, fails to convince me that they ought to have been so considered. The following are the principal grounds for this opinion.
§ 1. On the principles used in this work, several types of problems that might seem to have earned a prescriptive right to inclusion may in fact have to be excluded from the science of Probability. The most significant of these relate to what's commonly known as the credibility of testimony, evaluated either directly and at first hand, or as it influences a juryman and so reaches us through his insight and reliability. Almost every treatise on the subject includes a discussion of the principles by which credit should be assigned to combinations of reports from witnesses with varying levels of trustworthiness, or to the verdicts of juries made up of larger or smaller numbers. A prominent modern mathematician, Poisson, has written a detailed work specifically on this issue, while a significant portion of the writings of Laplace, De Morgan, and others explores similar investigations. It would be presumptuous to disagree with such respected authorities unless there are very strong reasons to do so; however, I must admit that the remarkable creativity and mathematical skill applied to these problems, viewed as questions in Probability, fail to persuade me that they should be regarded in that way. Here are the main reasons for this viewpoint.
§ 2. It will be remembered that in the course of the chapter on Induction we entered into a detailed investigation of the process demanded of us when, instead of the appropriate propositions from which the inference was to be made being set before us, the individual presented himself, and the task was imposed upon us of selecting the requisite groups or series to which to refer him. In other words, instead of calculating the chance of an event from determinate conditions of frequency of its occurrence (these being either obtained by direct experience, or deductively inferred) we have to select the conditions of frequency out of a plurality of more or less suitable ones. When the problem is presented to us at such a stage as this, we may of course assume that the preliminary process of obtaining the statistics which are extended into the proportional propositions has been already performed; we may suppose therefore that we are already in possession of a quantity of such propositions, our principal remaining doubt being as to which of them we should then employ. This selection was shown to be to a certain extent arbitrary; for, owing to the fact of the individual possessing a large number of different properties, he became in consequence a member of different series or groups, which might present different averages. We must now examine, somewhat more fully than we did before, the practical conditions under which any difficulty arising from this source ceases to be of importance.
§ 2. It will be remembered that in the chapter on Induction, we thoroughly looked into the process we go through when the specific propositions we need for inference aren’t provided, and instead, an individual appears, requiring us to pick the right groups or series to classify them. In other words, rather than calculating the likelihood of an event based on definite frequency conditions (gathered either through direct experience or deductive reasoning), we have to choose the frequency conditions from a variety of more or less relevant options. When we are faced with this kind of problem, we can assume that the initial process of gathering the statistics that are translated into proportional propositions has already been completed; therefore, we can expect that we already have a number of these propositions, with our main uncertainty being which one to use. This selection was found to be somewhat arbitrary; due to the individual having many different attributes, they end up belonging to various series or groups, each potentially showing different averages. Now we need to explore, in greater detail than before, the practical situations in which any challenges arising from this issue become insignificant.
§ 3. One condition of this kind is very simple and obvious. It is that the different statistics with which we are presented should not in reality offer materially different results, If, for instance, we were enquiring into the probability of a man aged forty dying within the year, we might if we pleased take into account the fact of his having red hair, or his having been born in a certain county or town. 396 Each of these circumstances would serve to specialize the individual, and therefore to restrict the limits of the statistics which were applicable to his case. But the consideration of such qualities as these would either leave the average precisely as it was, or produce such an unimportant alteration in it as no one would think of taking into account. Though we could hardly say with certainty of any conceivable characteristic that it has absolutely no bearing on the result, we may still feel very confident that the bearing of such characteristics as these is utterly insignificant. Of course in the extreme case of the things most perfectly suited to the Calculus of Probability, viz. games of pure chance, these subsidiary characteristics are quite irrelevant. Any further particulars about the characteristics of the cards in a really fair pack, beyond those which are familiar to all the players, would convey no information whatever about the result.
§ 3. One condition of this kind is very simple and obvious. It is that the various statistics presented to us should not actually yield significantly different results. For example, if we were looking into the likelihood of a 40-year-old man dying within the year, we might consider factors like whether he has red hair or where he was born. Each of these details would help narrow down the individual case, thus limiting the statistics that apply to him. However, considering such qualities would either leave the average exactly the same or cause such a minor change that no one would find it worth mentioning. While we can't say for sure that any specific characteristic has no effect on the outcome, we can be pretty confident that the impact of these traits is completely negligible. Of course, in the extreme case of things perfectly suited to the Probability Calculus, like games of pure chance, these additional characteristics are totally irrelevant. Any extra details about the characteristics of the cards in a truly fair deck, beyond what all players already know, wouldn't provide any information about the result.
Or again; although the different sets of statistics may not as above give almost identical results, yet they may do what practically comes to very much the same thing, that is, arrange themselves into a small number of groups, all of the statistics in any one group practically coinciding in their results. If for example a consumptive man desired to insure his life, there would be a marked difference in the statistics according as we took his peculiar state of health into account or not. We should here have two sets of statistics, so clearly marked off from one another that they might almost rank with the distinctions of natural kinds, and which would in consequence offer decidedly different results. If we were to specialize still further, by taking into account insignificant qualities like those mentioned in the last paragraph, we might indeed get more limited sets of statistics applicable to persons still more closely resembling the individual in 397 question, but these would not differ sufficiently in their results to make it worth our while to do so. In other words, the different propositions which are applicable to the case in point arrange themselves into a limited number of groups, which, and which only, need be taken into account; whence the range of choice amongst them is very much diminished in practice.
Or again, while the different sets of statistics may not give nearly identical results as mentioned above, they might end up doing something very similar. They can organize themselves into a small number of groups, where all the statistics in any one group yield very similar results. For example, if a person with tuberculosis wanted to insure his life, the statistics would show a significant difference depending on whether we consider his specific health condition or not. In this case, we would have two sets of statistics clearly separated, almost like different categories, which would consequently yield quite different outcomes. If we were to narrow it down even further by considering minor details like those mentioned earlier, we might get more specific sets of statistics relevant to individuals even more similar to the person in question. However, these would not differ enough in their results to make it worthwhile. In other words, the various propositions relevant to this situation can be grouped into a limited number of categories, which are the only ones that really need to be considered, significantly reducing the range of options among them in practice.
§ 4. The reasons for the conditions above described are not difficult to detect. Where these conditions exist the process of selecting a series or class to which to refer any individual is very simple, and the selection is, for the particular purposes of inference, final. In any case of insurance, for example, the question we have to decide is of the very simple kind; Is A. B. a man of a certain age? If so one in fifty in his circumstances will die in the course of the year. If any further questions have to be decided they would be of the following description. Is A. B. a healthy man? Does he follow a dangerous trade? But here too the classes in question are but few, and the limits by which they are bounded are tolerably precise; so that the reference of an individual to one or other of them is easy. And when we have once chosen our class we remain untroubled by any further considerations; for since no other statistics are supposed to offer a materially different average, we have no occasion to take account of any other properties than those already noticed.
§ 4. The reasons for the conditions described above are not hard to identify. When these conditions are present, choosing a series or class to which to refer any individual is straightforward, and the selection is, for the specific purposes of inference, final. In any instance of insurance, for example, the question we need to answer is quite simple: Is A. B. a man of a certain age? If so, one in fifty in his circumstances will die within the year. If there are any additional questions to consider, they would typically include: Is A. B. a healthy man? Does he have a dangerous job? But here too, the relevant classes are few, and the boundaries that define them are fairly clear; thus, assigning an individual to one or the other is easy. Once we have selected our class, we are not troubled by any further considerations, since no other statistics are believed to provide a significantly different average, and we have no need to account for any other characteristics beyond those already mentioned.
The case of games of chance, already referred to, offers of course an instance of these conditions in an almost ideal state of perfection; the same circumstances which fit them so eminently for the purposes of fair gambling, fitting them equally to become examples in Probability. When a die is to be thrown, all persons alike stand on precisely the same footing of knowledge and of ignorance about the result; the 398 only data to which any one could appeal being that each face turns up on an average once in six times.
The situation with games of chance, as mentioned earlier, provides a nearly perfect example of these conditions. The same elements that make them ideal for fair gambling also make them relevant for Probability. When a die is rolled, everyone has the same level of knowledge and ignorance about the outcome; the only information available is that each face shows up an average of once every six rolls.
§ 5. Let us now examine how far the above conditions are fulfilled in the case of problems which discuss what is called the credibility of testimony. The following would be a fair specimen of one of the elementary enquiries out of which these problems are composed;—Here is a statement made by a witness who lies once in ten times, what am I to conclude about its truth? Objections might fairly be raised against the possibility of thus assigning a man his place upon a graduated scale of mendacity. This however we will pass over, and will assume that the witness goes about the world bearing stamped somehow on his face the appropriate class to which he belongs, and consequently, the degree of credit to which he has a claim on such general grounds. But there are other and stronger reasons against the admissibility of this class of problems.
§ 5. Let’s now look at how well the above conditions are met when discussing the credibility of testimony. A good example of one of the basic inquiries that make up these issues is this: If a witness lies once every ten times, what should I conclude about the truth of their statement? There could be valid objections to the idea of categorizing someone's honesty on a scale of dishonesty. However, we'll set that aside and assume that the witness somehow displays their level of trustworthiness on their face, indicating the degree of credibility they deserve based on general factors. Yet, there are other, more compelling reasons against accepting this type of problem.
§ 6. That which has been described in the previous sections as the ‘individual’ which had to be assigned to an appropriate class or series of statistics is, of course, in this case, a statement. In the particular instance in question this individual statement is already assigned to a class, that namely of statements made by a witness of a given degree of veracity; but it is clearly optional with us whether or not we choose to confine our attention to this class in forming our judgment; at least it would be optional whenever we were practically called on to form an opinion. But in the case of this statement, as in that of the mortality of the man whose insurance we were discussing, there are a multitude of other properties observable, besides the one which is supposed to mark the given class. Just as in the latter there were (besides his age), the place of his birth, the nature of his occupation, and so on; so in the former there are (besides its 399 being a statement by a certain kind of witness), the fact of its being uttered at a certain time and place and under certain circumstances. At the time the statement is made all these qualities or attributes of the statement are present to us, and we clearly have a right to take into account as many of them as we please. Now the question at present before us seems to be simply this;—Are the considerations, which we might thus introduce, as immaterial to the result in the case of the truth of a statement of a witness, as the corresponding considerations are in the case of the insurance of a life? There can surely be no hesitation in the reply to such a question. Under ordinary circumstances we soon know all that we can know about the conditions which determine us in judging of the prospect of a man's death, and we therefore rest content with general statistics of mortality; but no one who heard a witness speak would think of simply appealing to his figure of veracity, even supposing that this had been authoritatively communicated to us. The circumstances under which the statement is made instead of being insignificant, are of overwhelming importance. The appearance of the witness, the tone of his voice, the fact of his having objects to gain, together with a countless multitude of other circumstances which would gradually come to light as we reflect upon the matter, would make any sensible man discard the assigned average from his consideration. He would, in fact, no more think of judging in this way than he would of appealing to the Carlisle or Northampton tables of mortality to determine the probable length of life of a soldier who was already in the midst of a battle.
§ 6. What we've discussed earlier regarding the 'individual' that needs to be categorized into a specific class or set of statistics is, in this instance, a statement. In this case, the individual statement has already been placed in a class, specifically statements given by a witness with a certain level of truthfulness; however, it's clearly up to us whether we choose to limit our focus to this class when forming our judgment. This would at least be optional whenever we are practically required to offer an opinion. But regarding this statement, as with the mortality of the person whose insurance we were discussing, there are numerous other observable qualities in addition to the one that identifies the given class. Just as in the latter case there were, besides his age, his place of birth, the nature of his occupation, and so on; so in the former there are, besides its being a statement from a particular kind of witness, the facts that it was made at a certain time and place and under certain circumstances. At the moment the statement is made, all these qualities or attributes are present to us, and we clearly have the right to take into account as many of them as we wish. Now the question before us seems to be simply this: are the considerations we might introduce in this way as irrelevant to the result, in the case of the truth of a witness's statement, as the corresponding considerations are in the case of the insurance of a life? There should be no hesitation in answering such a question. Under normal circumstances, we quickly learn all we can about the conditions that influence our judgment of a man's likelihood of dying, which is why we rest content with general mortality statistics; but no one who heard a witness speak would think of merely appealing to his figure of veracity, even if this figure had been officially provided to us. The circumstances under which the statement is made are not insignificant; they are of overwhelming importance. The witness's demeanor, the tone of his voice, the fact that he may have objects to gain, along with countless other circumstances that would gradually come to light as we reflect on the matter, would lead any sensible person to discard the assigned average from consideration. He would, in fact, no more think of judging in this way than he would of appealing to the Carlisle or Northampton mortality tables to determine the probable length of life of a soldier who is already in the midst of a battle.
§ 7. It cannot be replied that under these circumstances we still refer the witness to a class, and judge of his veracity by an average of a more limited kind; that we infer, for example, that of men who look and act like him under such 400 circumstances, a much larger proportion, say nine-tenths, are found to lie. There is no appeal to a class in this way at all, there is no immediate reference to statistics of any kind whatever; at least none which we are conscious of using at the time, or to which we should think of resorting for justification afterwards. The decision seems to depend upon the quickness of the observer's senses and of his apprehension generally.
§ 7. It can't be replied that in this situation we still refer the witness to a class and judge his honesty by a more limited kind of average; that we infer, for instance, that among men who look and behave like him in such circumstances a much higher proportion, say nine out of ten, turn out to lie. There's no appeal to a class in that sense at all, and no immediate reference to any statistics whatsoever; at least none that we're aware of using at the moment, or that we'd think of turning to later to justify our conclusion. The decision seems to rely on the quickness of the observer's senses and of his understanding generally.
Statistics about the veracity of witnesses seem in fact to be permanently as inappropriate as all other statistics occasionally may be. We may know accurately the percentage of recoveries after amputation of the leg; but what surgeon would think of forming his judgment solely by such tables when he had a case before him? We need not deny, of course, that the opinion he might form about the patient's prospects of recovery might ultimately rest upon the proportions of deaths and recoveries he might have previously witnessed. But if this were the case, these data are lying, as one may say, obscurely in the background. He does not appeal to them directly and immediately in forming his judgment. There has been a far more important intermediate process of apprehension and estimation of what is essential to the case and what is not. Sharp senses, memory, judgment, and practical sagacity have had to be called into play, and there is not therefore the same direct conscious and sole appeal to statistics that there was before. The surgeon may have in his mind two or three instances in which the operation performed was equally severe, but in which the patient's constitution was different; the latter element therefore has to be properly allowed for. There may be other instances in which the constitution was similar, but the operation more severe; and so on. Hence, although the ultimate appeal may be to the statistics, it is not so directly; 401 their value has to be estimated through the somewhat hazy medium of our judgment and memory, which places them under a very different aspect.
Statistics about the truthfulness of witnesses seem, in fact, to be permanently as inappropriate as other statistics only occasionally are. We might know the exact percentage of recoveries after a leg amputation, but what surgeon would base his judgment solely on those tables when facing a patient? We can't deny that his opinion on the patient's chances of recovery might ultimately rest on the proportions of deaths and recoveries he has witnessed before. However, if that's the case, those data are lying, so to speak, obscurely in the background. He doesn't appeal to them directly and immediately when forming his judgment. There's a far more important intermediate process of understanding and assessing what matters in the case and what doesn't. Sharp senses, memory, judgment, and practical wisdom must all come into play, so there isn't the same direct, conscious, and exclusive reliance on statistics as there was before. The surgeon may recall two or three cases in which the operation was equally severe but the patient's constitution was different; that element therefore has to be properly allowed for. There may be other cases in which the constitution was similar but the operation more severe; and so on. Thus, although the final appeal may be to the statistics, it is not made so directly; their value has to be assessed through the somewhat hazy medium of our judgment and memory, which places them under a very different aspect.
§ 8. Any one who knows anything of the game of whist may supply an apposite example of the distinction here insisted on, by recalling to mind the alteration in the nature of our inferences as the game progresses. At the commencement of the game our sole appeal is rightfully made to the theory of Probability. All the rules upon which each player acts, and therefore upon which he infers that the others will act, rest upon the observed frequency (or rather upon the frequency which calculation assures us will be observed) with which such and such combinations of cards are found to occur. Why are we told, if we have more than four trumps, to lead them out at once? Because we are convinced, on pure grounds of probability, capable of being stated in the strictest statistical form, that in a majority of instances we shall draw our opponent's trumps, and therefore be left with the command. Similarly with every other rule which is recognized in the early part of the play.
§ 8. Anyone who knows a bit about the game of whist can easily give an example of the distinction being emphasized here by thinking about how our reasoning changes as the game moves forward. At the start of the game, we rely solely on the theory of Probability. All the rules each player follows, and the assumptions they make about how others will play, are based on the observed frequency (or, more accurately, the frequency that calculations predict will occur) of certain card combinations. Why are we advised to lead out our trumps immediately if we have more than four? Because we are convinced, based purely on probability—which can be expressed in strict statistical terms—that in most cases, we will draw our opponent's trumps and therefore maintain control. The same applies to every other rule recognized in the early stages of play.
But as the play progresses all this is changed, and towards its conclusion there is but little reliance upon any rules which either we or others could base upon statistical frequency of occurrence, observed or inferred. A multitude of other considerations have come in; we begin to be influenced partly by our knowledge of the character and practice of our partner and opponents; partly by a rapid combination of a multitude of judgments, founded upon our observation of the actual course of play, the grounds of which we could hardly realize or describe at the time and which may have been forgotten since. That is, the particular combination of cards, now before us, does not readily fall into any well-marked class to which alone it can 402 reasonably be referred by every one who has the facts before him.
But as the play goes on, all of this changes, and by the end there's not much trust in any rules that we or others could base on statistical frequency, whether observed or assumed. A lot of other factors come into play; we start to be influenced partly by what we know about our partner and opponents, and partly by quickly combining various judgments based on our observation of how the game is actually unfolding, which we could hardly explain or even recognize at the time and might have forgotten since. In other words, the specific combination of cards in front of us doesn't easily fit into any clear category that everyone with the facts would agree on.
§ 9. A criticism somewhat resembling the above has been given by Mill (Logic, Bk. III. Chap. XVIII. § 3) upon the applicability of the theory of Probability to the credibility of witnesses. But he has added other reasons which do not appear to me to be equally valid; he says “common sense would dictate that it is impossible to strike a general average of the veracity, and other qualifications for true testimony, of mankind or any class of them; and if it were possible, such an average would be no guide, the credibility of almost every witness being either below or above the average,” The latter objection would however apply with equal force to estimating the length of a man's life from tables of mortality; for the credibility of different witnesses can scarcely have a wider range of variation than the length of different lives. If statistics of credibility could be obtained, and could be conveniently appealed to when they were obtained, they might furnish us in the long run with as accurate inferences as any other statistics of the same general description. These statistics would however in practice naturally and rightly be neglected, because there can hardly fail to be circumstances in each individual statement which would more appropriately refer it to some new class depending on different statistics, and affording a far better chance of our being right in that particular case. In most instances of the kind in question, indeed, such a change is thus produced in the mode of formation of our opinion, that, as already pointed out, the mental operation ceases to be in any proper sense founded on appeal to statistics.[1]
§ 9. A criticism somewhat like the one above has been made by Mill (Logic, Bk. III. Chap. XVIII. § 3) regarding the use of the theory of Probability in assessing the credibility of witnesses. However, he has added other points that I find less convincing; he argues, “common sense suggests that it’s impossible to determine a general average of truthfulness and other qualities necessary for reliable testimony among all people or any specific group; and even if that were possible, such an average wouldn’t be useful, as the credibility of nearly every witness falls either below or above that average.” This latter argument, however, applies equally to predicting a person’s lifespan based on mortality tables, since the credibility of different witnesses likely varies just as much as the lengths of different lives. If we could gather statistics on credibility and easily access them, they might eventually provide us with inferences as accurate as any other statistics of this kind. Nevertheless, these statistics would likely be disregarded in practice because there are almost always specific circumstances in each statement that would make it more appropriate to classify it under a different standard, which would give us a much better chance of being right in that particular instance. In most cases like this, in fact, the way we form our opinions changes so that, as already indicated, the mental process stops being properly based on statistical references.
§ 10. The Chance problems which are concerned with testimony are not altogether confined to such instances as those hitherto referred to. Though we must, as it appears to me, reject all attempts to estimate the credibility of any particular witness, or to refer him to any assigned class in respect of his trustworthiness, and consequently abandon as unsuitable any of the numerous problems which start from such data as ‘a witness who is wrong once in ten times,’ yet it does not follow that testimony may not to a slight extent be treated by our science in a somewhat different manner. We may be quite unable to estimate, except in the roughest possible way, the veracity of any particular witness, and yet it may be possible to form some kind of opinion upon the veracity of certain classes of witnesses; to say, for instance, that Europeans are superior in this way to Orientals. So we might attempt to explain why, and to what extent, an opinion in which the judgments of ten persons, say jurors, concur, is superior to one in which five only concur. Something may also be done towards laying down the principles in accordance with which we are to decide whether, and why, extraordinary stories deserve less credence than ordinary ones, even if we cannot arrive at any precise and definite decision upon the point. This last question is further discussed in the course of the next chapter.
§ 10. The Chance problems related to testimony are not limited to the examples we've talked about so far. While I believe we should dismiss all attempts to evaluate the credibility of a specific witness or categorize them based on their reliability, and thus set aside many problems that start with ideas like 'a witness who is wrong one out of ten times,' it doesn't mean we can't examine testimony in a slightly different way in our field. We might not be able to accurately judge the truthfulness of any individual witness except in the most general terms, but it could be possible to form some opinion about the truthfulness of certain groups of witnesses; for example, we might say that Europeans are generally more reliable than Orientals. We could also try to explain why an opinion shared by ten jurors is more credible than one shared by only five. Additionally, we may establish some principles for determining why extraordinary claims are considered less credible than ordinary ones, even if we can't reach a clear and definitive conclusion on this matter. This last topic is explored further in the next chapter.
§ 11. The change of view in accordance with which it follows that questions of the kind just mentioned need not be entirely rejected from scientific consideration, presents itself 404 in other directions also. It has, for instance, been already pointed out that the individual characteristics of any sick man's disease would be quite sufficiently important in most cases to prevent any surgeon from judging about his recovery by a genuine and direct appeal to statistics, however such considerations might indirectly operate upon his judgment. But if an opinion had to be formed about a considerable number of cases, say in a large hospital, statistics might again come prominently into play, and be rightly recognized as the principal source of appeal. We should feel able to compare one hospital, or one method of treatment, with another. The ground of the difference is obvious. It arises from the fact that the characteristics of the individuals, which made us so ready to desert the average when we had to judge of them separately, do not produce the same disturbance when were have to judge about a group of cases. The averages then become the most secure and available ground on which to form an opinion, and therefore Probability again becomes applicable.
§ 11. The shift in perspective suggests that questions like the ones just mentioned shouldn't be completely dismissed from scientific inquiry, and this idea is also relevant in other areas. For example, it's already been noted that the unique aspects of a sick person's condition are usually significant enough that a surgeon wouldn't rely solely on statistics to assess their recovery, even though these factors might influence their judgment in an indirect way. However, if we need to form an opinion about a large number of cases, such as in a big hospital, statistics might once again play a crucial role and be rightly regarded as a key reference point. We would be able to compare one hospital or treatment method against another. The reason for this difference is clear. It comes from the fact that the individual traits that made us hesitant to rely on averages when assessing each case separately don't create the same confusion when evaluating a group of cases. In those situations, averages become the most reliable and accessible basis for forming an opinion, which means Probability becomes relevant again.
But although some resort to Probability may be admitted in such cases as these, it nevertheless does not appear to me that they can ever be regarded as particularly appropriate examples to illustrate the methods and resources of the theory. Indeed it is scarcely possible to resist the conviction that the refinements of mathematical calculation have here been pushed to lengths utterly unjustifiable, when we bear in mind the impossibility of obtaining any corresponding degree of accuracy and precision in the data from which we have to start. To cite but one instance. It would be hard to find a case in which love of consistency has prevailed over common sense to such an extent as in the admission of the conclusion that it is unimportant what are the numbers for and against a particular statement, provided 405 the actual majority is the same. That is, the unanimous judgment of a jury of eight is to count for the same as a majority of ten to two in a jury of twelve. And yet this conclusion is admitted by Poisson. The assumptions under which it follows will be indicated in the course of the next chapter.
But while some use probability in cases like these, I don’t think they can really be considered good examples to illustrate the methods and tools of the theory. In fact, it’s hard to ignore the belief that the intricacies of mathematical calculation have been taken way too far when we remember that there’s no way to get a similar level of accuracy and precision from the data we start with. For instance, it’s difficult to find a case where the desire for consistency has overshadowed common sense as much as in the acceptance of the idea that the actual numbers for and against a certain statement don’t matter, as long as the real majority is the same. That is, the unanimous decision of a jury of eight is treated the same as a majority of ten to two in a jury of twelve. And yet, Poisson accepts this conclusion. The assumptions behind this will be explained in the next chapter.
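The conclusion attributed to Poisson follows from the standard independence model, and a small sketch makes the mechanism visible. It is illustrative rather than a reproduction of Poisson's own analysis: it assumes each juror independently votes correctly with the same probability p (taken here, arbitrarily, as 4/5) and that the prior odds of guilt are even.

```python
from fractions import Fraction

def posterior_odds_of_guilt(votes_guilty, jury_size, p=Fraction(4, 5), prior_odds=1):
    """Posterior odds of guilt, assuming each juror is independently right
    with probability p and the prior odds of guilt are `prior_odds`."""
    margin = votes_guilty - (jury_size - votes_guilty)
    # The binomial coefficients cancel in the likelihood ratio, so the
    # evidence depends only on the majority (the margin), not on the
    # total number of jurors.
    return prior_odds * (p / (1 - p)) ** margin

print(posterior_odds_of_guilt(8, 8))     # unanimous jury of eight
print(posterior_odds_of_guilt(10, 12))   # ten to two in a jury of twelve
```

Both calls print 65536: on these assumptions the model gives no weight to the two dissentients, which is precisely the consequence objected to in the text.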
Again, perfect independence amongst the witnesses or jurors is an almost necessary postulate. But where can this be secured? To say nothing of direct collusion, human beings are in almost all instances greatly under the influence of sympathy in forming their opinions. This influence, under the various names of political bias, class prejudice, local feeling, and so on, always exists to a sufficient degree to induce a cautious person to make many of those individual corrections which we saw to be necessary when we were estimating the trustworthiness, in any given case, of a single witness; that is, they are sufficient to destroy much, if not all, of the confidence with which we resort to statistics and averages in forming our judgment. Since then this Essay is mainly devoted to explaining and establishing the general principles of the science of Probability, we may very fairly be excused from any further treatment of this subject, beyond the brief discussions which are given in the next chapter.
Again, perfect independence among witnesses or jurors is almost a necessary assumption. But where can this be guaranteed? To say nothing of direct collusion, human beings are in almost every case heavily influenced by sympathy when forming their opinions. This influence, known by various terms like political bias, class prejudice, local sentiment, and so on, always exists to a degree that encourages a careful person to make many of those individual adjustments we determined were necessary when assessing the reliability, in any specific case, of a single witness; that is, these factors can undermine much, if not all, of the trust we place in statistics and averages when making our judgments. Since this Essay mainly focuses on explaining and establishing the general principles of the science of Probability, we can reasonably be excused from any further treatment of this subject, beyond the brief discussions given in the next chapter.
1 It may be remarked also that there is another reason which tends to dissuade us from appealing to principles of Probability in the majority of the cases where testimony has to be estimated. It often, perhaps usually happens, that we are not absolutely forced to come to a decision; at least so far as the acquitting of an accused person may be considered as avoiding a decision. It may be of much greater importance to us to attain not merely truth on the average, but truth in each individual instance, so that we had rather not form an opinion at all than form one of which we can only say in its justification that it will tend to lead us right in the long run.
1 It's worth noting that there's another reason that discourages us from relying on probability principles in most cases where we need to evaluate testimony. It often, and maybe usually, happens that we're not completely compelled to make a decision; at least when it comes to acquitting someone, that can be seen as avoiding a choice. It might be much more important for us to seek not just the average truth but the truth in each specific case, so we'd rather not form an opinion at all than create one that we can only justify by saying it'll likely lead us to the right conclusion over time.
CHAPTER 17.
ON THE CREDIBILITY OF EXTRAORDINARY STORIES.
§ 1. It is now time to recur for fuller investigation to an enquiry which has been already briefly touched upon more than once; that is, the validity of testimony to establish, as it is frequently expressed, an otherwise improbable story. It will be remembered that in a previous chapter (the twelfth) we devoted some examination to an assertion by Butler, which seemed to be to some extent countenanced by Mill, that a great improbability before the proof might become but a very small improbability after the proof. In opposition to this it was pointed out that the different estimates which we undoubtedly formed of the credibility of the examples adduced, had nothing to do with the fact of the event being past or future, but arose from a very different cause; that the conception of the event which we entertain at the moment (which is all that is then and there actually present to us, and as to the correctness of which as a representation of facts we have to make up our minds) comes before us in two very different ways. In one instance it was a mere guess of our own which we knew from statistics would be right in a certain proportion of cases; in the other instance it was the assertion of a witness, and therefore the appeal was not now primarily to statistics of the event, but to the trustworthiness of the witness. The conception, 407 or ‘event’ if we will so term it, had in fact passed out of the category of guesses (on statistical grounds), into that of assertions (most likely resting on some specific evidence), and would therefore be naturally regarded in a very different light.
§ 1. It's now time to revisit an inquiry we've previously touched on several times: the validity of testimony to establish, as it's often put, an otherwise unlikely story. You'll remember that in an earlier chapter (the twelfth), we explored an assertion by Butler, which seemed somewhat supported by Mill, that a major improbability before the evidence might become a very small improbability after the evidence. In contrast to this, it was highlighted that the different assessments we undoubtedly made regarding the credibility of the examples presented had nothing to do with whether the event was past or future, but stemmed from a different source altogether; the conception of the event we have in mind at that moment (which is all that is actually present to us, and about which we have to form our opinion regarding its accuracy as a representation of facts) appears to us in two very distinct ways. In one case, it was merely our own guess, which we knew from statistics would be correct in a certain percentage of cases; in the other case, it was the statement of a witness, so the focus shifted from the statistics of the event to the reliability of the witness. The conception, or 'event' if we call it that, had in fact shifted from the category of guesses (based on statistics) to that of assertions (most likely grounded in specific evidence), and would therefore naturally be viewed in a very different light.
§ 2. But it may seem as if this principle would lead us to somewhat startling conclusions. For, by transferring the appeal from the frequency with which the event occurs to the trustworthiness of the witness who makes the assertion, is it not implied that the probability or improbability of an assertion depends solely upon the veracity of the witness? If so, ought not any story whatever to be believed when it is asserted by a truthful person?
§ 2. But it might seem like this principle could lead us to some surprising conclusions. By shifting the focus from how often an event happens to how reliable the witness is who makes the claim, doesn’t it suggest that whether a claim is likely or unlikely depends only on the honesty of the witness? If that’s the case, shouldn’t we believe any story if it’s told by a truthful person?
In order to settle this question we must look a little more closely into the circumstances under which such testimony is commonly presented to us. As it is of course necessary, for clearness of exposition, to take a numerical example, let us suppose that a given statement is made by a witness who, on the whole and in the long run, is right in what he says nine times out of ten.[1] Here then is an average given to us, an average veracity that is, which includes all the particular statements which the witness has made or will make.
To settle this question, we need to take a closer look at the circumstances under which such testimony is typically presented to us. For clarity's sake, let's consider a numerical example: suppose a witness makes a statement that, overall and in the long run, is correct nine times out of ten. Here, we have an average presented to us, an average truthfulness, which encompasses all the specific statements the witness has made or will make.
§ 3. Now it has been abundantly shown in a former chapter (Ch. IX. §§ 14–32) that the mere fact of a particular 408 average having been assigned, is no reason for our being forced invariably to adhere to it, even in those cases in which our most natural and appropriate ground of judgment is found in an appeal to statistics and averages. The general average may constantly have to be corrected in order to meet more accurately the circumstances of particular cases. In statistics of mortality, for instance; instead of resorting to the wider tables furnished by people in general of a given age, we often prefer the narrower tables furnished by men of a particular profession, abode, or mode of life. The reader may however be conveniently reminded here that in so doing we must not suppose that we are able, by any such device, in any special or peculiar way to secure truth. The general average, if persistently adhered to throughout a sufficiently wide and varied experience, would in the long run tend to give us the truth; all the advantage which the more special averages can secure for us is to give us the same tendency to the truth with fewer and slighter aberrations.
§ 3. It has been clearly established in a previous chapter (Ch. IX. §§ 14–32) that just because a specific average has been assigned, it doesn't mean we are forced to stick to it invariably, even in those cases where our most natural and appropriate ground of judgment is an appeal to statistics and averages. The general average often needs adjustments to fit the circumstances of particular cases more accurately. Take mortality statistics, for example; instead of using the broader tables of the general population of a certain age, we often prefer the narrower tables for men of a particular profession, residence, or way of life. However, it's worth reminding the reader here that in doing so, we must not suppose that any such device secures truth for us in any special or peculiar way. The general average, if consistently adhered to through a sufficiently wide and varied experience, would in the long run tend to give us the truth; all the advantage the more specific averages can secure for us is the same tendency toward the truth with fewer and slighter deviations.
§ 4. Returning then to our witness, we know that if we have a very great many statements from him upon all possible subjects, we may feel convinced that in nine out of ten of these he will tell us the truth, and that in the tenth case he will go wrong. This is nothing more than a matter of definition or consistency. But cannot we do better than thus rely upon his general average? Cannot we, in almost any given case, specialize it by attending to various characteristic circumstances in the nature of the statement which he makes; just as we specialize his prospects of mortality by attending to circumstances in his constitution or mode of life?
§ 4. Going back to our witness, we know that if we have a lot of statements from him on all possible subjects, we can be pretty sure that in nine out of ten cases, he'll tell us the truth, and in one case, he'll be wrong. This is simply a matter of definition or consistency. But can’t we do better than just trust his average? Can’t we, in nearly any specific case, refine it by considering different key factors in the nature of his statement, just as we refine his chances of mortality by looking at factors related to his constitution or lifestyle?
Undoubtedly we may do this; and in any of the practical contingencies of life, supposing that we were at all guided 409 by considerations of this nature, we should act very foolishly if we did not adopt some such plan. Two methods of thus correcting the average may be suggested: one of them being that which practical sagacity would be most likely to employ, the other that which is almost universally adopted by writers on Probability. The former attempts to make the correction by the following considerations: instead of relying upon the witness' general average, we assign to it a sort of conjectural correction to meet the case before us, founded on our experience or observation; that is, we appeal to experience to establish that stories of such and such a kind are more or less likely to be true, as the case may be, than stories in general. The other proceeds upon a different and somewhat more methodical plan. It is here endeavoured to show, by an analysis of the nature and number of the sources of error in the cases in question, that such and such kinds of stories must be more or less likely to be correctly reported, and this in certain numerical proportions.
We can definitely do this; and in any real-life situations, if we were guided by such considerations, it would be quite foolish not to adopt some kind of plan. Two ways to adjust the average can be suggested: one of them is the approach that practical wisdom would likely use, while the other is almost universally accepted by writers on Probability. The former tries to make the adjustment by these considerations: instead of just depending on the witness's general average, we give it a sort of educated guess to fit the situation, based on our experience or observation; in other words, we use experience to show that certain types of stories are more or less likely to be true compared to stories in general. The latter follows a different and somewhat more systematic approach. Here, we aim to demonstrate, through an analysis of the nature and number of error sources in the cases at hand, that certain types of stories are more or less likely to be reported accurately, and this is reflected in specific numerical proportions.
§ 5. Before proceeding to a discussion of these methods a distinction must be pointed out to which writers upon the subject have not always attended, or at any rate to which they have not generally sufficiently directed their readers' attention.[2] There are, broadly speaking, two different ways in which we may suppose testimony to be given. It may, in the first place, take the form of a reply to an alternative question, a question, that is, framed to be answered by yes or no. Here, of course, the possible answers are mutually contradictory, so that if one of them is not correct the other must be so:—Has A happened, yes or no? The common mode of illustrating this kind of testimony numerically is by 410 supposing a lottery with a prize and blanks, or a bag of balls of two colours only, the witness knowing that there are only two, or at any rate being confined to naming one or other of them. If they are black and white, and he errs when black is drawn, he must say ‘white,’ The reason for the prominence assigned to examples of this class is, probably, that they correspond to the very important case of verdicts of juries; juries being supposed to have nothing else to do than to say ‘guilty’ or ‘not guilty.’
§ 5. Before we dive into discussing these methods, it’s important to point out a distinction that writers on this topic haven’t always recognized, or at least haven’t consistently drawn attention to. There are, generally speaking, two different ways we can assume testimony is provided. First, it can take the form of an answer to a yes-or-no question. In this case, the possible answers contradict each other, so if one is wrong, the other must be right: Has A happened, yes or no? A common way to illustrate this type of testimony is by imagining a lottery with prizes and blanks, or a bag containing only two colors of balls, with the witness knowing that there are just two, or at least being limited to naming one or the other. If the balls are black and white, and they make a mistake when a black ball is drawn, they have to say ‘white.’ The reason these examples are highlighted is likely because they relate to the very important situation of jury verdicts; juries are assumed to have nothing else to do but say ‘guilty’ or ‘not guilty.’
On the other hand, the testimony may take the form of a more original statement or piece of information. Instead of saying, Did A happen? we may ask, What happened? Here if the witness speaks truth he must be supposed, as before, to have but one way of doing so; for the occurrence of some specific event was of course contemplated. But if he errs he has many ways of going wrong, possibly an infinite number. Ordinarily however his possible false statements are assumed to be limited in number, as must generally be more or less the result in practice. This case is represented numerically by supposing the balls in the bag not to be of two colours only, but to be all distinct from each other; say by their being all numbered successively. It may of course be objected that a large number of the statements that are made in the world are not in any way answers to questions, either of the alternative or of the open kind. For instance, a man simply asserts that he has drawn the seven of spades from a pack of cards; and we do not know perhaps whether he had been asked ‘Has that card been drawn?’ or ‘What card has been drawn?’ or indeed whether he had been asked anything at all. Still more might this be so in the case of any ordinary historical statement.
On the other hand, the testimony might come in the form of a more original statement or piece of information. Instead of asking, Did A happen? we might ask, What happened? Here, if the witness is telling the truth, he must be assumed, as before, to have only one way of conveying that truth; because the occurrence of a specific event is clearly considered. But if he is mistaken, he may have many ways to be wrong, possibly an infinite number. Usually, though, the potential false statements are assumed to be limited in number, which is generally the case in practice. This scenario can be illustrated numerically by imagining that the balls in the bag are not just two colors, but all different from each other; for example, they could all be numbered consecutively. It could be argued, of course, that many statements made in the world aren’t responses to questions of any kind, whether yes-or-no or open-ended. For instance, a person might simply say that he has drawn the seven of spades from a deck of cards; and we don’t know if he was asked, ‘Has that card been drawn?’ or ‘What card has been drawn?’ or even if he was asked anything at all. This could be even more applicable in the case of any typical historical statement.
This objection is quite to the point, and must be recognized as constituting an additional difficulty. All that we 411 can do is to endeavour, as best we may, to ascertain, from the circumstances of the case, what number of alternatives the witness may be supposed to have had before him. When he simply testifies to some matter well known to be in dispute, and does not go much into detail, we may fairly consider that there were practically only the two alternatives before him of saying ‘yes’ or ‘no.’ When, on the other hand, he tells a story of a more original kind, or (what comes to much the same thing) goes into details, we must regard him as having a wide comparative range of alternatives before him.
This objection is quite relevant and must be acknowledged as an additional challenge. All we can do is try our best to figure out, based on the case's circumstances, how many options the witness might have had. When he simply testifies about something well known to be disputed and doesn't provide much detail, we can reasonably assume that he really only had the two options of saying ‘yes’ or ‘no.’ However, when he shares a more unique story or (which is pretty much the same) goes into details, we need to see him as having a much broader range of options available to him.
These two classes of examples, viz. that of the black and white balls, in which only one form of error is possible, and the numbered balls, in which there may be many forms of error, are the only two which we need notice. In practice it would seem that they may gradually merge into each other, according to the varying ways in which we choose to frame our question. Besides asking, Did you see A strike B? and, What did you see? we may introduce any number of intermediate leading questions, as, What did A do? What did he do to B? and so on. In this way we may gradually narrow the possible openings to wrong statement, and so approach to the direct alternative question. But it is clear that all these cases may be represented numerically by a supposed diminution in the number of the balls which are thus distinguished from each other.
These two types of examples, namely the black and white balls, where only one kind of error can occur, and the numbered balls, where multiple forms of error are possible, are the only two we need to consider. In practice, it seems that they can gradually blend into each other, depending on how we choose to frame our questions. Besides asking, "Did you see A hit B?" and "What did you see?" we can add any number of intermediate questions, such as "What did A do?" or "What did he do to B?" and so on. In this way, we can gradually limit the potential for incorrect statements, getting closer to the direct alternative question. However, it’s clear that all these cases can be represented numerically by a hypothetical reduction in the number of balls that are differentiated from one another.
§ 6. Of the two plans mentioned in § 4 we will begin with the latter, as it is the only methodical and scientific one which has been proposed. Suppose that there is a bag with 1000 balls, only one of which is white, the rest being all black. A ball is drawn at random, and our witness whose veracity is 9/10 reports that the white ball was drawn. Take a great many of his statements upon this particular subject, 412 say 10,000; that is, suppose that 10,000 balls having been successively drawn out of this bag, or bags of exactly the same kind, he makes his report in each case. His 10,000 statements being taken as a fair sample of his general average, we shall find, by supposition, that 9 out of every 10 of them are true and the remaining one false. What will be the nature of these false statements? Under the circumstances in question, he having only one way of going wrong, the answer is easy. In the 10,000 drawings the white ball would come out 10 times, and therefore be rightly asserted 9 times, whilst on the one of these occasions on which he goes wrong he has nothing to say but ‘black.’ So with the 9990 occasions on which black is drawn; he is right and says black on 8991 of them, and is wrong and therefore says white on 999 of them. On the whole, therefore, we conclude that out of every 1008 times on which he says that white is drawn he is wrong 999 times and right only 9 times. That is, his special veracity, as we may term it, for cases of this description, has been reduced from 9/10 to 9/1008. As it would commonly be expressed, the latter fraction represents the chance that this particular statement of his is true.[3]
§ 6. From the two plans mentioned in § 4, we will start with the latter, as it is the only methodical and scientific approach proposed. Imagine there's a bag containing 1000 balls, with just one being white and the rest black. A ball is picked at random, and our witness, whose reliability is 9/10, reports that the white ball was drawn. Let's consider many of his statements on this specific subject, say 10,000; that is, we assume that 10,000 balls have been drawn one after the other from this bag, or bags of the exact same type, and he reports on each instance. Taking his 10,000 statements as a representative sample of his overall average, we assume that 9 out of every 10 of them are true and 1 is false. What do these false statements look like? Given the situation, where he has only one way to be incorrect, the answer is straightforward. In the 10,000 draws, the white ball would come up 10 times, so he correctly claims white 9 times, while on the one occasion he errs, he can only say 'black.' The same goes for the 9990 times black is drawn; he is right and claims black on 8991 occasions, but wrong and claims white on 999 occasions. Overall, we therefore conclude that out of every 1008 times he asserts that white was drawn, he is wrong 999 times and right only 9 times. This means his specific accuracy, as we might call it, for cases like this has fallen from 9/10 to 9/1008. Commonly said, the latter fraction indicates the likelihood that this particular statement of his is true.[3]
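A short computational sketch may help in verifying the counting just described. It is nothing more than the arithmetic above restated in modern notation; the variable names are illustrative choices, not anything from the original text.

```python
from fractions import Fraction

# § 6: the yes/no (black-or-white) case. One white ball among 1000;
# the witness's only possible error is to name the wrong colour.
veracity = Fraction(9, 10)   # right 9 times out of 10 in the long run
n_balls = 1000
n_trials = 10_000            # a convenient complete cycle of drawings

white_drawn = n_trials // n_balls            # 10 occasions
black_drawn = n_trials - white_drawn         # 9990 occasions

true_whites = white_drawn * veracity         # 9 correct reports of white
false_whites = black_drawn * (1 - veracity)  # 999 erroneous reports of white

special_veracity = true_whites / (true_whites + false_whites)
print(special_veracity)                       # 1/112
print(special_veracity == Fraction(9, 1008))  # True: the 9/1008 of the text
```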
§ 7. We will now take the case in which the witness has many ways of going wrong, instead of merely one. Suppose that the balls were all numbered, from 1 to 1,000, and the witness knows this fact. A ball is drawn, and he tells me that it was numbered 25, what are the odds that he is right? Proceeding as before, in 10,000 drawings this ball would be obtained 10 times, and correctly named 9 times. But on the 9990 occasions on which it was not drawn there would be a difference, for the witness has now many openings for error before him. It is, however, generally considered reasonable to assume that his errors will all take the form of announcing wrong numbers; and that, there being no apparent reason why he should choose one number rather than another, he will be likely to announce all the wrong ones equally often. Hence his 999 errors, instead of all leading him now back again to one spot, will be uniformly spread over as many distinct ways of going wrong. On one only of these occasions, therefore, will he mention 25 as having been drawn. It follows therefore that out of every 10 times that he names 25 he is right 9 times; so that in this case his average or general truthfulness applies equally well to the special case in point.
§ 7. Let's consider a situation where the witness has multiple ways to be wrong, not just one. Imagine the balls are numbered from 1 to 1,000, and the witness is aware of this. If a ball is drawn, and he tells me it was numbered 25, what are the chances he's correct? Following the same reasoning, in 10,000 draws, this ball would be drawn 10 times, and he would correctly identify it 9 times. However, on the 9,990 times when it wasn't drawn, there would be a difference because the witness now has many opportunities to make mistakes. It's generally accepted that his errors will all take the form of naming incorrect numbers; and since there's no clear reason for him to prefer one number over another, he will likely name each of the wrong numbers equally often. As a result, his 999 mistakes, instead of all pointing back to one number, will be spread evenly over the 999 possible wrong numbers. Thus, in only one of those instances will he say 25 was drawn. Therefore, out of every 10 times he says 25, he's correct 9 times; so in this case, his overall truthfulness applies equally well to this specific situation.
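The same counting carries over to the numbered-ball case, on the stated assumption that the witness's errors are spread evenly over the 999 wrong numbers. A minimal sketch:

```python
from fractions import Fraction

# § 7: 1000 numbered balls; the witness's errors are assumed to be
# spread evenly over the 999 wrong numbers.
veracity = Fraction(9, 10)
n_balls = 1000
n_trials = 10_000

drawn_25 = n_trials // n_balls         # 10 occasions on which 25 is drawn
not_drawn_25 = n_trials - drawn_25     # 9990 occasions

true_reports_of_25 = drawn_25 * veracity   # 9
# On the 999 erroneous occasions, each wrong number is named equally often,
# so "25" is falsely reported only 999 * (1/999) = 1 time.
false_reports_of_25 = not_drawn_25 * (1 - veracity) / (n_balls - 1)

print(true_reports_of_25 / (true_reports_of_25 + false_reports_of_25))  # 9/10
```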
§ 8. With regard to the truth of these conclusions, it must of course be admitted that if we grant the validity of the assumptions about the limits within which the blundering or mendacity of the witness are confined, and the complete impartiality with which his answers are disposed within those limits, the reasoning is perfectly sound. But are not these assumptions extremely arbitrary, that is, are not our lotteries and bags of balls rendered perfectly precise in many respects in which, in ordinary life, the conditions supposed to correspond to them are so vague and uncertain that no such method of reasoning becomes practically available? Suppose 414 that a person whom I have long known, and of whose measure of veracity and judgment I may be supposed therefore to have acquired some knowledge, informs me that there is something to my advantage if I choose to go to certain trouble or expense in order to secure it. As regards the general veracity of the witness, then, there is no difficulty; we suppose that this is determined for us. But as regards his story, difficulty and vagueness emerge at every point. What is the number of balls in the bag here? What in fact are the nature and contents of the bag out of which we suppose the drawing to have been made? It does not seem that the materials for any rational judgment exist here. But if we are to get at any such amended figure of veracity as those attained in the above example, these questions must necessarily be answered with some degree of accuracy; for the main point of the method consists in determining how often the event must be considered not to happen, and thence inferring how often the witness will be led wrongly to assert that it has happened.
§ 8. When it comes to the truth of these conclusions, we have to admit that if we accept the validity of the assumptions about the boundaries within which a witness might blunder or lie, and the complete impartiality with which their answers are distributed within those boundaries, then the reasoning is perfectly sound. But aren't these assumptions pretty arbitrary? Our lotteries and bags of balls are made perfectly precise in many respects where, in ordinary life, the conditions they are supposed to represent are so vague and uncertain that no such method of reasoning is practically available. Imagine that someone I've known for a long time, whose honesty and judgment I've come to understand, tells me there's something beneficial for me if I'm willing to go through some trouble or expense to get it. In terms of the witness's overall honesty, there's no issue; we assume that's established. However, when it comes to their story, confusion and ambiguity arise at every turn. How many balls are in the bag? What exactly are the nature and contents of the bag from which we think the drawing was made? It seems like there's no basis for any rational judgment here. But if we want to arrive at an amended figure of veracity like the one reached in the example above, those questions need to be answered with some accuracy. The crux of the method is figuring out how often the event must be considered not to happen, and from that, inferring how often the witness will be led wrongly to assert that it has happened.
It is not of course denied that considerations of the kind in question have some influence upon our decision, but only that this influence could under any ordinary circumstances be submitted to numerical determination. We are doubtless liable to have information given to us that we have come in for some kind of fortune, for instance, when no such good luck has really befallen us; and this not once only but repeatedly. But who can give the faintest intimation of the nature and number of the occasions on which, a blank being thus really drawn, a prize will nevertheless be falsely announced? It appears to me therefore that numerical results of any practical value can seldom, if ever, be looked for from this method of procedure.
It's certainly true that considerations of this kind have some effect on our decisions; what is denied is only that this effect could, under any ordinary circumstances, be reduced to numerical determination. We're liable to be told, for instance, that we've come into some kind of fortune when no such good luck has really befallen us; and this not just once, but repeatedly. But who can give even the faintest indication of the nature and number of the occasions on which, a blank having really been drawn, a prize will nevertheless be falsely announced? It therefore seems to me that numerical results of any practical value can rarely, if ever, be expected from this method of procedure.
§ 9. Our conclusion in the case of the lottery, or, what 415 comes to the same thing, in the case of the bag with black and white balls, has been questioned or objected to[4] on the ground that it is contrary to all experience to suppose that the testimony of a moderately good witness could be so enormously depreciated under such circumstances. I should prefer to base the objection on the ground that experience scarcely ever presents such circumstances as those supposed; but if we postulate their existence the given conclusion seems correct enough. Assume that a man is merely required to say yes or no; assume also a group or succession of cases in which no should rightly be said very much oftener than yes. Then, assuming almost any general truthfulness of the witness, we may easily suppose the rightful occasions for denial to be so much the more frequent that a majority of his affirmative answers will actually occur as false ‘noes’ rather than as correct ‘ayes.’ This of course lowers the average value of his ‘ayes,’ and renders them comparatively untrustworthy.
§ 9. Our conclusion regarding the lottery, or, what comes to the same thing, the bag with black and white balls, has been challenged on the grounds that it contradicts all experience to suppose that the testimony of a reasonably good witness could be so enormously discounted in such circumstances. I would prefer to base the objection on the fact that experience scarcely ever presents circumstances like those supposed; but if we grant that they exist, the conclusion seems correct enough. Suppose a person is only required to answer yes or no; suppose also a series of cases in which no should rightly be said far more often than yes. Then, assuming almost any general truthfulness on the witness's part, the rightful occasions for denial may easily be so much more frequent that a majority of his affirmative answers are actually false answers to questions whose true answer was no, rather than correct affirmations. This, of course, lowers the average value of his affirmative answers, making them comparatively untrustworthy.
Consider the following example. I have a gardener whom I trust as to all ordinary matters of fact. If he were to tell me some morning that my dog had run away I should fully believe him. He tells me however that the dog has gone mad. Surely I should accept the statement with much hesitation, and on the grounds indicated above. It is not that he is more likely to be wrong when the dog is mad; but that experience shows that there are other complaints (e.g. fits) which are far more common than madness, and that most of the assertions of madness are erroneous assertions referring to these. This seems a somewhat parallel case to that in which we find that most of the assertions that a white ball had been drawn are really false assertions referring to the drawing of a black ball. Practically I do 416 not think that any one would feel a difficulty in thus exorbitantly discounting some particular assertion of a witness whom in most other respects he fully trusted.
Consider this example. I have a gardener whom I trust with all ordinary facts. If he told me one morning that my dog had run away, I would completely believe him. However, if he said that the dog has gone mad, I would surely accept that statement with a lot of hesitation, based on the reasons stated above. It's not that he's more likely to be wrong when the dog really is mad; rather, experience shows that there are other issues (like fits) that are much more common than madness, and that most claims of madness are actually false claims related to these. This seems somewhat similar to the situation where we find that most claims that a white ball was drawn are actually false claims regarding the drawing of a black ball. Practically, I don't think anyone would have any difficulty in thus heavily discounting a specific statement from a witness they generally trust.
§ 10. There is one particular case which has been regarded as a difficulty in the way of this treatment of the problem, but which seems to me to be a decided confirmation of it; always, be it understood, within the very narrow and artificial limits to which we must suppose ourselves to be confined. This is the case of a witness whose veracity is just one-half; that is, one who, when a mere yes or no is demanded of him, is as often wrong as right. In the case of any other assigned degree of veracity it is extremely difficult to get anything approaching to a confirmation from practical judgment and experience. We are not accustomed to estimate the merits of witnesses in this way, and hardly appreciate what is meant by his numerical degree of truthfulness. But as regards the man whose veracity is one-half, we are (as Mr C. J. Monro has very ingeniously suggested) only too well acquainted with such witnesses, though under a somewhat different name; for this is really nothing else than the case of a person confidently answering a question about a subject-matter of which he knows nothing, and can therefore only give a mere guess.
§ 10. There’s one specific situation that has been seen as a challenge to this approach to the problem, but which I think actually supports it; of course, this is only within the very narrow and artificial limits we need to assume for ourselves. This is the case of a witness whose truthfulness is exactly fifty percent; that is, someone who is just as likely to be wrong as to be right when asked a simple yes or no. For any other level of truthfulness, it’s extremely hard to find anything resembling confirmation from practical judgment and experience. We aren’t used to evaluating witnesses this way, and we hardly understand what his numerical degree of truth means. However, regarding the person whose truthfulness is fifty percent, we are (as Mr. C. J. Monro has cleverly pointed out) very familiar with such witnesses, although we know them by a different name; this is really just someone who confidently answers a question about a topic they know nothing about, and can only make a wild guess.
Now in the case of the lottery with one prize, when the witness whose veracity is one-half tells us that we have gained the prize, we find on calculation that his testimony goes for absolutely nothing; the chances that we have got the prize are just the same as they would be if he had never opened his lips, viz. 1/1000. But clearly this is what ought to be the result, for the witness who knows nothing about the matter leaves it exactly as he found it. He is indeed, in strictness, scarcely a witness at all; for the natural function of a witness is to examine the matter, and so to add 417 confirmation, more or less, according to his judgment and probity, but at any rate to offer an improvement upon the mere guesser. If, however, we will give heed to his mere guess we are doing just the same thing as if we were to guess ourselves, in which case of course the odds that we are right are simply measured by the frequency of occurrence of the events.
In the case of a lottery with one prize, when a witness whose reliability is only half tells us that we've won, we calculate that his testimony is essentially worthless; our chances of winning are exactly the same as if he hadn’t said anything, which is 1/1000. This is clearly the expected outcome, as a witness who doesn’t know anything about the situation doesn’t change anything. In fact, he’s barely a witness at all; a witness's role is to investigate the situation and provide some level of confirmation based on his judgment and honesty, improving upon a simple guess. If we accept his mere guess, we're doing nothing more than guessing ourselves, in which case our odds of being correct are determined by how often the events happen.
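As a check on this remark, running the yes/no counting of § 6 with a veracity of one-half does leave the chance of the prize exactly where it started. A small sketch, assuming the same single-prize lottery of 1000 tickets:

```python
from fractions import Fraction

def chance_true_given_asserted(veracity, n_outcomes):
    """Yes/no case of § 6: one 'prize' among n_outcomes equally likely
    results; the witness's only error is to assert the prize falsely."""
    p = Fraction(1, n_outcomes)
    true_claims = p * veracity
    false_claims = (1 - p) * (1 - veracity)
    return true_claims / (true_claims + false_claims)

print(chance_true_given_asserted(Fraction(1, 2), 1000))   # 1/1000: no better than no witness at all
print(chance_true_given_asserted(Fraction(9, 10), 1000))  # 1/112, i.e. the 9/1008 of § 6
```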
We cannot quite so readily apply the same rule to the other case, namely to that of the numbered balls, for there the witness who is right every other time may really be a very fair, or even excellent, witness. If he has many ways of going wrong, and yet is right in half his statements, it is clear that he must have taken some degree of care, and cannot have merely guessed. In a case of yes or no, any one can be right every other time, but it is different where truth is single and error is manifold. To represent the case of a simply worthless witness when there were 1000 balls and the drawing of one assigned ball was in question, we should have to put his figure of veracity at 1/1000. If this were done we should of course get a similar result.
We can't easily apply the same rule to the other scenario, specifically the one involving the numbered balls. In that case, a witness who is correct every other time could still be a pretty fair or even great witness. If he has several ways of being wrong but is still right half the time, it shows he must have put in some effort and isn't just making random guesses. In a simple yes or no situation, anyone could be right every other time, but it’s different when there’s only one truth and many chances for error. To illustrate a completely unreliable witness when there are 1000 balls and the drawing of one specific ball is in question, we’d have to assign his credibility a value of 1/1000. If we did that, we’d obviously reach a similar conclusion.
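A quick check of this last point, on the uniform-error assumption of § 7: putting a figure of veracity of 1/1000 into the numbered-ball counting leaves the chance of the assertion at 1/1000, the chance of a bare guess, while a veracity of one-half is preserved intact.

```python
from fractions import Fraction

def chance_true_numbered(veracity, n_balls):
    """Numbered-ball case of § 7: errors assumed uniform over the
    n_balls - 1 wrong numbers."""
    p = Fraction(1, n_balls)
    true_reports = p * veracity
    false_reports = (1 - p) * (1 - veracity) / (n_balls - 1)
    return true_reports / (true_reports + false_reports)

print(chance_true_numbered(Fraction(1, 1000), 1000))  # 1/1000: the bare guesser
print(chance_true_numbered(Fraction(1, 2), 1000))     # 1/2: the careful witness keeps his credit
```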
§ 11. It deserves notice therefore that the figure of veracity, or fraction representing the general truthfulness of a witness, is in a way relative, not absolute; that is, it depends upon, and varies with, the general character of the answer which he is supposed to give. Two witnesses of equal intrinsic veracity and worth, one of whom confined himself to saying yes and no, whilst the other ventured to make more original assertions, would be represented by different fractions; the former having set himself a much easier task than the latter. The real caution and truthfulness of the witness are only one factor, therefore, in his actual figure of veracity; the other factor consists of the nature of his assertions, as just pointed out. The ordinary 418 plan therefore, in such problems, of assigning an average truthfulness to the witness, and accepting this alike in the case of each of the two kinds of answers, though convenient, seems scarcely sound. This consideration would however be of much more importance were not the discussions upon the subject mainly concerned with only one description of answer, namely that of the ‘yes or no’ kind.
§ 11. It's important to notice that the figure of veracity, or the fraction representing how truthful a witness is, is relative rather than absolute; it depends on and varies with the overall character of the answer they are expected to provide. Two witnesses of equal intrinsic honesty and worth, one of whom only says yes and no while the other ventures more original assertions, would be represented by different fractions; the former has set himself a much easier task than the latter. The real caution and truthfulness of the witness are therefore only one factor in his actual figure of veracity; the other factor is the nature of his assertions, as just pointed out. The usual approach in these problems, of assigning an average truthfulness to the witness and accepting it alike for both kinds of answers, while convenient, seems scarcely sound. This consideration would, however, matter much more if discussions of the subject were not mainly concerned with just one type of answer, namely the ‘yes or no’ kind.
§ 12. So much for the methodical way of treating such a problem. The way in which it would be taken in hand by those who had made no study of Probability is very different. It would, I apprehend, strike them as follows. They would say to themselves, Here is a story related by a witness who tells the truth, say, nine times out of ten. But it is a story of a kind which experience shows to be very generally made untruly, say 99 times out of 100. Having then these opposite inducements to belief, they would attempt in some way to strike a balance between them. Nothing in the nature of a strict rule could be given to enable them to decide how they might escape out of the difficulty. Probably, in so far as they did not judge at haphazard, they would be guided by still further resort to experience, or unconscious recollections of its previous teachings, in order to settle which of the two opposing inductions was better entitled to carry the day in the particular case before them. The reader will readily see that any general solution of the problem, when thus presented, is impossible. It is simply the now familiar case (Chap. IX. §§ 14–32) of an individual which belongs equally to two distinct, or even, in respect of their characteristics, opposing classes. We cannot decide off-hand to which of the two its characteristics most naturally and rightly refer it. A fresh induction is needed in order to settle this point.
§ 12. That’s the systematic way to approach this problem. However, those who don’t have any background in Probability would handle it quite differently. They might think like this: Here’s a story from a witness who tells the truth about nine times out of ten. But this is a type of story that experience shows is often false, about 99 times out of 100. With these conflicting reasons to believe, they would try to find some sort of balance between them. There wouldn’t be a strict rule to help them figure out how to resolve the issue. Most likely, unless they were making random judgments, they would rely on further experiences or subconscious memories of past lessons to decide which of the two opposing beliefs should prevail in the specific case they are considering. The reader will quickly see that a general solution to the problem, when presented this way, is impossible. It’s simply the now-familiar scenario (Chap. IX. §§ 14–32) of an individual that fits into two distinct, or even conflicting, categories. We can’t just decide right away which category its characteristics belong to most naturally and accurately. We need a new induction to resolve this issue.
§ 13. Rules have indeed been suggested by various 419 writers in order to extricate us from the difficulty. The controversy about miracles has probably been the most fertile occasion for suggestions of this kind on one side or the other. It is to this controversy, presumably, that the phrase is due, so often employed in discussions upon similar subjects, ‘a contest of opposite improbabilities.’ What is meant by such an expression is clearly this: that in forming a judgment upon the truth of certain assertions we may find that they are comprised in two very distinct classes, so that, according as we regarded them as belonging to one or the other of these distinct classes, our opinion as to their truth would be very different. Such an assertion belongs to one class, of course, by its being a statement of a particular witness, or kind of witness; it belongs to the other by its being a particular kind of story, one of what is called an improbable nature. Its belonging to the former class is so far favourable to its truth, its belonging to the latter is so far hostile to its truth. It seems to be assumed, in speaking of a contest of opposite improbabilities, that when these different sources of conviction co-exist together, they would each in some way retain their probative force so as to produce a contest, ending generally in a victory to one or other of them. Hume, for instance, speaks of our deducting one probability from the other, and apportioning our belief to the remainder.[5] Thomson, in his Laws of Thought, speaks of one probability as entirely superseding the other.
§ 13. Various writers have suggested rules to help us out of this dilemma. The debate over miracles has likely sparked the most ideas from both sides. This debate is probably where the phrase often used in similar discussions, “a contest of opposite improbabilities,” comes from. This expression means that when we are judging the truth of certain claims, we can find that they fall into two very distinct categories. Depending on whether we see them as belonging to one category or the other, our opinions on their truth can differ significantly. One category is based on the statement of a specific witness or type of witness, while the other is based on the nature of the story itself, particularly if it's considered improbable. Belonging to the first category supports its truth, while belonging to the second category undermines it. In talking about a contest of opposite improbabilities, it's assumed that when these different sources of belief coexist, they both keep their persuasive power, resulting in a contest that usually ends with one winning out over the other. Hume, for example, discusses how we deduct one probability from the other and adjust our belief to what remains. Thomson, in his Laws of Thought, claims that one probability can completely override the other.
§ 14. It does not appear to me that the slightest philosophical value can be attached to any such rules as these. They doubtless may, and indeed will, hold in individual 420 cases, but they cannot lay claim to any generality. Even the notion of a contest, as any necessary ingredient in the case, must be laid aside. For let us refer again to the way in which the perplexity arises, and we shall readily see, as has just been remarked, that it is nothing more than a particular exemplification of a difficulty which has already been recognized as incapable of solution by any general à priori method of treatment. All that we are supposed to have before us is a statement. On this occasion it is made by a witness who lies, say, once in ten times in the long run; that is, who mostly tells the truth. But on the other hand, it is a statement which experience, derived from a variety of witnesses on various occasions, assures us is mostly false; stated numerically it is found, let us suppose, to be false 99 times in a hundred.
§ 14. I don't think any real philosophical value can be attached to rules like these. They might apply in some specific cases, and they probably will, but they can't claim to be generally true. Even the idea of a competition, as a necessary part of the situation, needs to be put aside. If we look again at how the confusion arises, we'll clearly see, as was just mentioned, that it's simply a specific example of a problem that's already been recognized as unsolvable by any general à priori approach. What we have in front of us is just a statement. In this case, it’s given by a witness who lies about once in every ten times, meaning they mostly tell the truth. However, it's also a statement that experience, from multiple witnesses on different occasions, tells us is mostly false; let’s assume that numerically it's found to be false 99 times out of 100.
Now, as was shown in the chapter on Induction, we are thus brought to a complete dead lock. Our science offers no principles by which we can form an opinion, or attempt to decide the matter one way or the other; for, as we found, there are an indefinite number of conclusions which are all equally possible. For instance, all the witness' extraordinary assertions may be true, or they may all be false, or they may be divided into the true and the false in any proportion whatever. Having gone so far in our appeal to statistics as to recognize that the witness is generally right, but that his story is generally false, we cannot stop there. We ought to make still further appeal to experience, and ascertain how it stands with regard to his stories when they are of that particular nature: or rather, for this would be to make a needlessly narrow reference, how it stands with regard to stories of that kind when advanced by witnesses of his general character, position, sympathies, and so on.[6]
Now, as shown in the chapter on Induction, we find ourselves at a complete deadlock. Our science provides no principles for forming an opinion or making a decision either way; as we discovered, there are countless conclusions that are all equally possible. For example, all the witness's extraordinary claims could be true, or they might all be false, or they could be a mix of true and false in any proportion. Having acknowledged that the witness is usually correct but that their story is often inaccurate, we can't leave it at that. We should further examine the evidence and see how it holds up regarding their stories of that specific nature; or rather, how it applies to stories of that type when presented by witnesses with his general character, background, sympathies, and so on.[6]
§ 15. That extraordinary stories are in many cases, probably in a great majority of cases, less trustworthy than others must be fully admitted. That is, if we were to make two distinct classes of such stories respectively, we should find that the same witness, or similar witnesses, were proportionally more often wrong when asserting the former than when asserting the latter. But it does not by any means appear to me that this must always be the case. We may well conceive, for instance, that with some people the mere fact of the story being of a very unusual character may make them more careful in what they state, so as actually to add to their veracity. If this were so we might be ready to accept their extraordinary stories with even more readiness than their ordinary ones.
§ 15. It must be fully admitted that extraordinary stories are in many cases, probably in most cases, less reliable than others. That is, if we were to divide such stories into two distinct classes, the extraordinary and the ordinary, we would find that the same witness, or similar witnesses, are proportionally more often wrong when asserting the former than when asserting the latter. However, it doesn't seem to me that this must always be the case. We can imagine, for instance, that for some people the very fact that a story is of a very unusual character might make them more careful about what they state, which could actually enhance their truthfulness. If that were so, we might be ready to accept their extraordinary stories even more readily than their ordinary ones.
Such a supposition as that just made does not seem to me by any means forced. Put such a case as this: let us suppose that two persons, one of them a man of merely ordinary probity and intelligence, the other a scientific naturalist, make a statement about some common event. We believe 422 them both. Let them now each report some extraordinary lusus naturæ or monstrosity which they profess to have seen. Most persons, we may presume, would receive the statement of the naturalist in this latter case almost as readily as in the former: whereas when the same story came from the unscientific observer it would be received with considerable hesitation. Whence arises the difference? From the conviction that the naturalist will be far more careful, and therefore to the full as accurate, in matters of this kind as in those of the most ordinary description, whereas with the other man we feel by no means the same confidence. Even if any one is not prepared to go this length, he will probably admit that the difference of credit which he would attach to the two kinds of story, respectively, when they came from the naturalist, would be much less than what it would be when they came from the other man.
Such a supposition as the one just made doesn't seem forced to me at all. Consider this situation: let's imagine two people, one a man of average honesty and intelligence, and the other a scientific naturalist, reporting on some common event. We trust both of them. Now, let's say each one describes an extraordinary lusus naturæ or monstrosity they claim to have seen. Most people would likely accept the naturalist's statement in this case almost as readily as in the previous one; however, when the same story comes from the unscientific observer, it would be met with considerable hesitation. Why is there this difference? It stems from the belief that the naturalist will be much more careful, and therefore just as accurate, in matters of this kind as in the most ordinary ones, while we don't feel the same level of confidence in the other man. Even if someone isn't prepared to go that far, they would probably admit that the difference in credit they would attach to the two kinds of story would be much smaller when the stories came from the naturalist than when they came from the other man.
§ 16. Whilst we are on this part of the subject, it must be pointed out that there is considerable ambiguity and consequent confusion about the use of the term ‘an extraordinary story.’ Within the province of pure Probability it ought to mean simply a story which asserts an unusual event. At least this is the view which has been adopted and maintained, it is hoped consistently, throughout this work. So long as we adhere to this sense we know precisely what we mean by the term. It has a purely objective reference; it simply connotes a very low degree of relative statistical frequency, actual or prospective. Out of a great number of events we suppose a selection of some particular kind to be contemplated, which occurs relatively very seldom, and this is termed an unusual or extraordinary event. It follows, as was abundantly shown in a former chapter, that owing to the rarity of the event we are very little disposed to expect its occurrence in any given case. Our guess about it, in case 423 we thus anticipated it, would very seldom be justified, and we are therefore apt to be much surprised when it does occur. This, I take it, is the only legitimate sense of ‘extraordinary’ so far as Probability is concerned.
§ 16. While we're on this topic, it's important to point out that there's a lot of ambiguity and confusion surrounding the term 'an extraordinary story.' In the realm of pure Probability, it should simply refer to a story that claims an unusual event. At least, that's the perspective we've adopted and maintained, hopefully consistently, throughout this work. As long as we stick to this definition, we know exactly what we mean by the term. It has a completely objective reference; it simply indicates a very low degree of relative statistical frequency, whether it's actual or anticipated. Out of a large number of events, we consider a specific type that occurs quite rarely, and this is referred to as an unusual or extraordinary event. It follows, as was clearly demonstrated in a previous chapter, that because the event is rare, we are generally not inclined to expect it to happen in any particular instance. Our prediction about it, if we were to anticipate it, would very rarely be accurate, and so we tend to be quite surprised when it does happen. This, I believe, is the only valid interpretation of 'extraordinary' in terms of Probability.
But there is another and very different use of the word, which belongs to Induction, or rather to the science of evidence in general, more than to that limited portion of it termed Probability. In this sense the ‘extraordinary,’ and still more the ‘improbable,’ event is not merely one of extreme statistical rarity, which we could not expect to guess aright, but which on moderate evidence we may pretty readily accept; it is rather one which possesses, so to say, an actual evidence-resisting power. It may be something which affects the credibility of the witness at the fountain-head, which makes, that is, his statements upon such a subject essentially inferior to those on other subjects. This is the case, for instance, with anything which excites his prejudices or passions or superstitions. In these cases it would seem unreasonable to attempt to estimate the credibility of the witness by calculating (as in § 6) how often his errors would mislead us through his having been wrongly brought to an affirmation instead of adhering correctly to a negation. We should rather be disposed to put our correction on the witness' average veracity at once.
But there's another, very different meaning of the word that relates more to Induction, or the science of evidence in general, than to the narrower concept of Probability. In this sense, an 'extraordinary' or even 'improbable' event isn't just something that's extremely rare statistically, which we wouldn't expect to guess accurately but could accept on moderate evidence; it actually has a resistance to evidence. It might be something that undermines the credibility of the witness at the source, making their statements on that topic significantly less reliable than on others. This happens, for example, when their biases, passions, or superstitions are involved. In these situations, it seems unreasonable to try to assess the witness's credibility by calculating (as in § 6) how often their mistakes could mislead us due to them wrongly affirming something instead of correctly denying it. Instead, we would likely adjust our evaluation of the witness's overall honesty right away.
§ 17. In true Probability, as has just been remarked, every event has its own definitely recognizable degree of frequency of occurrence. It may be excessively rare, rare to any extreme we like to postulate, but still every one who understands and admits the data upon which its occurrence depends will be able to appreciate within what range of experience it may be expected to present itself. We do not expect it in any individual case, nor within any brief range, but we do confidently expect it within an adequately extensive 424 range. How therefore can miraculous stories be similarly taken account of, when the disputants, on one side at least, are not prepared to admit their actual occurrence anywhere or at any time? How can any arrangement of bags and balls, or other mechanical or numerical illustrations of unlikely events, be admitted as fairly illustrative of miraculous occurrences, or indeed of many of those which come under the designation of ‘very extraordinary’ or ‘highly improbable’? Those who contest the occurrence of a particular miracle, as reported by this or that narrator, do not admit that miracles are to be confidently expected sooner or later. It is not a question as to whether what must happen sometimes has happened some particular time, and therefore no illustration of the kind can be regarded as apposite.
§ 17. In true probability, as noted earlier, every event has a specific, recognizable frequency of occurrence. It might be extremely rare, to any extent we choose to imagine, but anyone who understands and accepts the data influencing its occurrence will be able to recognize the range within which it can be expected to happen. We don't anticipate it in any single instance or over a short period, but we do expect it to occur within a sufficiently large timeframe. So, how can miraculous stories be considered in the same way when, at least on one side, the disputants refuse to acknowledge that they have actually happened at any time or place? How can any set of bags and balls or other mechanical or numerical examples of unlikely events be seen as representative of miracles, or many of those labeled as ‘very extraordinary’ or ‘highly improbable’? Those who challenge the validity of a specific miracle, as recounted by a particular narrator, do not agree that miracles can be reliably expected at some point. It's not a matter of whether something that must happen occasionally has occurred at some specific time, so no such illustration can be viewed as relevant.
How unsuitable these merely rare events, however excessive their rarity may be, are as examples of miraculous events, will be evident from a single consideration. No one, I presume, who admitted the occasional occurrence of an exceedingly unusual combination, would be in much doubt if he considered that he had actually seen it himself.[7] On the other hand, few men of any really scientific turn would readily accept a miracle even if it appeared to happen under their very eyes. They might be staggered at the time, but 425 they would probably soon come to discredit it afterwards, or so explain it as to evacuate it of all that is meant by miraculous.
How unsuitable these merely rare events, no matter how extreme their rarity, are as examples of miraculous events will be clear from a single consideration. No one, I assume, who acknowledged that exceedingly unusual combinations occasionally happen would be in much doubt if they believed they had actually seen one themselves.[7] On the other hand, few people with a genuinely scientific mindset would easily accept a miracle, even if it seemed to happen right in front of them. They might be shocked at the moment, but they would likely come to doubt it later, or find a way to explain it that removes everything meant by 'miraculous.'
§ 18. It appears to me therefore, on the whole, that very little can be made of these problems of testimony in the way in which it is generally intended that they should be treated; that is, in obtaining specific rules for the estimation of the testimony under any given circumstances. Assuming that the veracity of the witness can be measured, we encounter the real difficulty in the utter impossibility of determining the limits within which the failures of the event in question are to be considered to lie, and the degree of explicitness with which the witness is supposed to answer the enquiry addressed to him; both of these being characteristics of which it is necessary to have a numerical estimate before we can consider ourselves in possession of the requisite data.
§ 18. Overall, it seems to me that not much can be gained from these issues of testimony in the way they're usually meant to be handled; that is, by deriving specific rules for assessing testimony in any given situation. Assuming we can measure the truthfulness of a witness, we face the real challenge of the complete impossibility of defining the boundaries within which the failures of the event in question should be considered, and the level of clarity with which the witness is expected to respond to the inquiry directed at them; both of these aspects require a numerical estimate before we can consider ourselves to have the necessary information.
Since therefore the practical resource of most persons, viz. that of putting a direct and immediate correction, of course of a somewhat conjectural nature, upon the general trustworthiness of the witness, by a consideration of the nature of the circumstances under which his statement is made, is essentially unscientific and irreducible to rule; it really seems to me that there is something to be said in favour of the simple plan of trusting in all cases alike to the witness' general veracity.[8] That is, whether his story is ordinary or extraordinary, we may resolve to put it on the same footing of credibility, provided of course that the event is fully recognized as one which does or may occasionally 426 happen. It is true that we shall thus go constantly astray, and may do so to a great extent, so that if there were any rational and precise method of specializing his trustworthiness, according to the nature of his story, we should be on much firmer ground. But at least we may thus know what to expect on the average. Provided we have a sufficient number and variety of statements from him, and always take them at the same constant rate or degree of trustworthiness, we may succeed in balancing and correcting our conduct in the long run so as to avoid any ruinous error.
Since the practical approach most people take— that is, applying a direct and immediate correction, which is somewhat speculative, to the general reliability of the witness by considering the context in which their statement is made— is fundamentally unscientific and cannot be reduced to a rule; it seems to me that there's merit in simply trusting the witness's overall honesty in every case. In other words, whether their account is ordinary or extraordinary, we might decide to consider it equally credible, as long as the event is recognized as one that does or could occasionally happen. It's true that this approach will often lead us astray, sometimes significantly, so if there were a rational and exact method for evaluating their trustworthiness based on the nature of their account, we'd be on much safer ground. However, at least we can have a general expectation. If we gather a sufficient number and variety of statements from them and consistently apply the same level of trustworthiness, we might be able to balance and adjust our actions over time to avoid serious mistakes.
§ 19. A few words may now be added about the combination of testimony. No new principles are introduced here, though the consequent complication is naturally greater. Let us suppose two witnesses, the veracity of each being 9/10. Now suppose 100 statements made by the pair; according to the plan of proceeding adopted before, we should have them both right 81 times and both wrong once, in the remaining 18 cases one being right and the other wrong. But since they are both supposed to give the same account, what we have to compare together are the number of occasions on which they agree and are right, and the total number on which they agree whether right or wrong. The ratio of the former to the latter is the fraction which expresses the trustworthiness of their combination of testimony in the case in question.
§ 19. A few words can now be added about how testimony works together. No new principles are introduced here, but the resulting complexity is naturally greater. Let’s consider two witnesses, each with a reliability of 9/10. Now let’s say they make 100 statements together; following the approach used previously, they would both be correct 81 times and both incorrect once, while in the remaining 18 cases, one would be correct and the other incorrect. However, since they are both expected to provide the same account, what we need to compare is how many times they agree and are correct versus the total number of times they agree, whether correct or incorrect. The ratio of the former to the latter shows the trustworthiness of their combined testimony in this situation.
In attempting to decide this point the only difficulty is in determining how often they will be found to agree when they are both wrong, for clearly they must agree when they are both right. This enquiry turns of course upon the number of ways in which they can succeed in going wrong. Suppose first the case of a simple yes or no (as in § 6), and take the same example, of a bag with 1000 balls, in which one only is white. Proceeding as before, we should find that 427 out of 100,000 drawings (the number required in order to obtain a complete cycle of all possible occurrences, as well as of all possible reports about them) the two witnesses agree in a correct report of the appearance of white in 81, and agree in a wrong report of it in 999. The Probability therefore of the story when so attested is 81/1080; the fact therefore of two such witnesses of equal veracity having concurred makes the report nearly 9 times as likely as when it rested upon the authority of only one of them.[9]
In trying to figure this out, the main challenge is figuring out how often they agree when they're both wrong since they'll definitely agree when they're both right. This inquiry depends on how many ways they can end up being wrong. Let’s first consider a simple yes or no situation (as in § 6), using the same example of a bag with 1000 balls, where only one is white. Following the same approach, we would find that out of 100,000 draws (the amount needed to cover all possible outcomes and reports), the two witnesses correctly report seeing white in 81 cases and falsely report it in 999 cases. Therefore, the probability of the report being accurate when confirmed by them is 81/1080; thus, having two witnesses of equal honesty makes the report almost 9 times as likely as if it relied on just one of them.[9]
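A brief sketch, assuming the figures just given (1000 balls, one white, veracity 9/10 for each witness, independent errors), reproduces the 81/1080 result and the comparison with a single witness:

```python
drawings = 100_000
p_white = 1 / 1000
v = 9 / 10

white_draws = drawings * p_white        # 100 drawings show white
other_draws = drawings * (1 - p_white)  # 99,900 do not

agree_right = white_draws * v * v              # 81 correct joint reports of white
agree_wrong = other_draws * (1 - v) * (1 - v)  # 999 false joint reports of white

joint = agree_right / (agree_right + agree_wrong)
single = (white_draws * v) / (white_draws * v + other_draws * (1 - v))

print(joint)           # 0.075 = 81/1080
print(joint / single)  # about 8.4, i.e. "nearly 9 times" the single-witness value
```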
§ 20. When however the witnesses have many ways of going wrong, the fact of their agreeing makes the report far more likely to be true. For instance, in the case of the 1000 numbered balls, it is very unlikely that when they both mistake the number they should (without collusion) happen to make the same misstatement. Whereas, in the last case, every combined misstatement necessarily led them both to the assertion that the event in question had happened, we should now find that only once in 999 × 999 times would they both be led to assert that some given number (say, as before, 25) had been drawn. The odds in favour of the event in fact now become 80919/80920, which are enormously greater than when there was only one witness.
§ 20. However, when witnesses have multiple ways to be wrong, their agreement makes the report much more likely to be true. For example, in the case of the 1,000 numbered balls, it’s very unlikely that if they both make a mistake, they would (without coordinating) happen to make the same mistake. In the previous case, every combined mistake led them to both claim that the event had occurred. Now, we would find that only once in 999 × 999 times would they both claim that some specific number (let's say, as before, 25) had been drawn. The odds in favor of the event actually become 80919/80920, which is significantly greater than when there was only one witness.
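The same sketch adapted to the many-alternatives case (1000 numbered balls, both witnesses independently naming 25, and each wrong answer assumed to be equally likely) gives the stated odds:

```python
v = 9 / 10   # veracity of each witness
n = 1000     # numbered balls

# Ball 25 was drawn and both report it correctly:
true_agree = (1 / n) * v * v
# Some other ball was drawn and both, erring independently, happen to name 25:
false_agree = ((n - 1) / n) * ((1 - v) / (n - 1)) ** 2

print(true_agree / (true_agree + false_agree))  # 80919/80920, about 0.99999
```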
It appears therefore that when two, and of course still more when many, witnesses agree in a statement in a matter about which they might make many and various errors, the combination of their favourable testimony adds enormously to the likelihood of the event; provided always that there is no chance of collusion. And in the extreme case of the opportunities for error being, as they well may be, practically infinite in number, such combination would produce almost perfect certainty. But then this condition, viz. absence of collusion, very seldom can be secured. Practically our main source of error and suspicion is in the possible existence of some kind of collusion. Since we can seldom entirely get rid of this danger, and when it exists it can never be submitted to numerical calculation, it appears to me that combination of testimony, in regard to detailed accounts, is yet more unfitted for consideration in Probability than even that of single testimony.
It seems that when two witnesses, or even more, agree on a statement about something they could easily be mistaken about, their combined positive testimony greatly increases the likelihood of the event being true, as long as there’s no chance of collusion. In cases where the potential for error is virtually endless, their agreement could lead to almost complete certainty. However, this condition—absence of collusion—rarely can be guaranteed. Our main source of error and doubt comes from the possible existence of collusion. Since we can hardly ever eliminate this risk entirely, and when it does exist it can't be calculated numerically, I believe that combining testimonies, especially detailed accounts, is even less reliable in terms of probability than relying on a single testimony.
§ 21. The impossibility of any adequate or even appropriate consideration of the credibility of miraculous stories by the rules of Probability has been already noticed in § 17. But, since the grounds of this impossibility are often very insufficiently appreciated, a few pages may conveniently be added here with a view to enforcing this point. If it be regarded as a digression, the importance of the subject and the persistency with which various writers have at one time or another attempted to treat it by the rules of our science must be the excuse for entering upon it.
§ 21. The difficulty of properly or even appropriately assessing the credibility of miraculous stories using the principles of Probability has already been mentioned in § 17. However, since the reasons for this difficulty are often not fully understood, it is useful to add a few pages here to emphasize this point. If this seems like a digression, the significance of the topic and the consistent efforts by various writers to address it through our scientific rules justify discussing it.
A necessary preliminary will be to decide upon some definition of a miracle. It will, we may suppose, be admitted by most persons that in calling a miracle ‘a suspension of a law of causation,’ we are giving what, though it may not amount to an adequate definition, is at least true as a description. It is true, though it may not be the whole truth. Whatever else the miracle may be, this is its physical aspect: this is the point at which it comes into contact with the subject-matter of science. If it were not considered that any suspension of causation were involved, the event would be regarded merely as an ordinary one to which some special significance was attached, that is, as a type or symbol rather than a miracle. It is this aspect moreover of the miracle which is now exposed to the main brunt of the attack, and in support of which therefore the defence has generally been carried on.
A necessary first step will be to come up with a definition of a miracle. Most people would probably agree that when we describe a miracle as “a suspension of a law of causation,” we are providing a description that, while it might not fully capture the concept, is at least accurate. It’s true, even if it doesn’t cover everything. Whatever else a miracle might be, this is its physical aspect: the point where it intersects with the subject matter of science. If it weren’t seen that any suspension of causation was involved, the event would just be viewed as an ordinary occurrence with some special meaning attached to it, essentially as a type or symbol rather than a miracle. This aspect of the miracle is what is currently facing the main criticism, and this is what the defense has generally aimed to support.
Now it is obvious that this, like most other definitions or descriptions, makes some assumption as to matters of fact, and involves something of a theory. The assumption clearly is, that laws of causation prevail universally, or almost universally, throughout nature, so that infractions of them are marked and exceptional. This assumption is made, but it does not appear that anything more than this is necessarily required; that is, there is nothing which need necessarily make us side with either of the two principal schools which are divided as to the nature of these laws of causation. The definition will serve equally well whether we understand by law nothing more than uniformity of antecedent and consequent, or whether we assert that there is some deeper and more mysterious tie between the events than mere sequence. The use of the term ‘causation’ in this minimum of signification is common to both schools, though the one might consider it inadequate; we may speak, therefore, of ‘suspensions of causation’ without committing ourselves to either.
Now it’s clear that this, like most definitions or descriptions, makes some assumptions about the facts and includes a bit of theory. The assumption is that laws of causation exist universally, or nearly so, throughout nature, meaning breaches of these laws are rare and exceptional. This assumption is made, but it doesn’t seem like anything more than this is necessarily needed; in other words, there’s nothing that forces us to take sides between the two main schools that disagree about the nature of these laws of causation. The definition works just as well whether we see law as just the consistency of cause and effect, or if we believe there’s a deeper, more mysterious connection between events than just sequence. The term ‘causation,’ used in this basic sense, is accepted by both schools, even if one might find it inadequate; so we can talk about ‘suspensions of causation’ without aligning ourselves with either side.
§ 22. It should be observed that the aspect of the question suggested by this definition is one from which we can hardly escape. Attempts indeed have been sometimes made to avoid the necessity of any assumption as to the universal prevalence of law and order in nature, by defining a miracle from a different point of view. A miracle may be called, for instance, ‘an immediate exertion of creative power,’ ‘a sign of a revelation,’ or, still more vaguely, an ‘extraordinary event.’ But nothing would be gained by adopting any such definitions as these. However they might satisfy the theologian, the student of physical science would not rest content with them for a moment. He would at once assert his own belief, and that of other scientific men, in the existence of universal law, and enquire what was the connection of the definition with this doctrine. An answer would imperatively be demanded to the question, Does the miracle, as you have described it, imply an infraction of one of these laws, or does it not? And an answer must be given, unless indeed we reject his assumption by denying our belief in the existence of this universal law, in which case of course we put ourselves out of the pale of argument with him. The necessity of having to recognize this fact is growing upon men day by day, with the increased study of physical science. And since this aspect of the question has to be met some time or other, it is as well to place it in the front. The difficulty, in its scientific form, is of course a modern one, for the doctrine out of which it arises is modern. But it is only one instance, out of many that might be mentioned, in which the growth of some philosophical conception has gradually affected the nature of the dispute, and at last shifted the position of the battle-ground, in some discussion with which it might not at first have appeared to have any connection whatever.
§ 22. It should be noted that the way this question is framed is something we can hardly ignore. People have sometimes tried to dodge the need to assume that there is universal law and order in nature by defining a miracle from a different perspective. For example, a miracle could be described as ‘an immediate exertion of creative power,’ ‘a sign of a revelation,’ or, even more vaguely, an ‘extraordinary event.’ But adopting these kinds of definitions wouldn't really help. While they might satisfy theologians, a student of physical science wouldn't accept them for a second. They would insist on their belief, along with that of other scientists, in the existence of universal law and would ask how this definition connects to that idea. A clear answer would be required to the question: Does the miracle you've described imply a violation of one of these laws, or not? An answer must be provided unless we reject this assumption by denying our belief in universal law, which would then mean we cannot argue with them. The need to acknowledge this reality is becoming more pressing as physical science is studied more deeply. Since we will have to address this aspect of the question eventually, it makes sense to put it front and center now. This scientific difficulty is, of course, a modern issue, arising from a contemporary doctrine. However, it is only one example among many where the development of a philosophical idea has gradually changed the nature of the debate and ultimately shifted the focus of discussion to what initially seemed unrelated.
§ 23. So far our path is plain. Up to this point disciples of very different schools may advance together; for in laying down the above doctrine we have carefully abstained from implying or admitting that it contains the whole truth. But from this point two paths branch out before us, paths as different from each other in their character, origin, and direction, as can well be conceived. As this enquiry is only a digression, we may confine ourselves to stating briefly what seem to be the characteristics of each, without attempting to give the arguments which might be used in their support.
§ 23. So far, our journey is clear. Up to this point, followers of very different schools can move forward together; we’ve made sure that our previous statement doesn’t imply or claim to hold the complete truth. But from here, two paths diverge before us, paths as distinct in their nature, origin, and direction as one can imagine. Since this inquiry is just a side note, we’ll stick to briefly outlining what seem to be the key features of each path, without trying to present the arguments that could support them.
(I.) On the one hand, we may assume that this principle of causation is the ultimate one. By so terming it, we do not mean that it is one from which we consciously start in our investigations, as we do from the axioms of geometry, but rather that it is the final result towards which we find ourselves drawn by a study of nature. Finding that, throughout the scope of our enquiries, event follows event in never-failing uniformity, and finding moreover (some might add) that this experience is supported or even demanded by a tendency or law of our nature (it does not matter here how we describe it), we may come to regard this as the one fundamental principle on which all our enquiries should rest.
(I.) On one hand, we can assume that this principle of causation is the ultimate one. By calling it that, we don't mean we start our investigations from it, like we do with the axioms of geometry. Instead, it's the final conclusion we reach through studying nature. We see that, throughout our inquiries, events follow each other with unchanging consistency, and some might argue that this experience is supported or even required by a tendency or law of our nature (it doesn't matter how we describe it here). We may come to view this as the fundamental principle on which all our inquiries should be based.
(II.) Or, on the other hand, we may admit a class of principles of a very different kind. Allowing that there is this uniformity so far as our experience extends, we may yet admit what can hardly be otherwise described than by calling it a Superintending Providence, that is, a Scheme or Order, in reference to which Design may be predicated without using merely metaphorical language. To adopt an aptly chosen distinction, it is not to be understood as over-ruling events, but rather as underlying them.
(II.) Alternatively, we might accept a different set of principles. While acknowledging that there is consistency in our experiences, we can still recognize what can best be described as a Superintending Providence—essentially, a Scheme or Order that allows for genuine references to Design without resorting to metaphor. Using a well-chosen distinction, it shouldn't be seen as over-ruling events, but more as underlying them.
§ 24. Now it is quite clear that according as we come to the discussion of any particular miracle or extraordinary story under one or other of these prepossessions, the question of its credibility will assume a very different aspect. It is sometimes overlooked that although a difference about facts is one of the conditions of a bonâ fide argument, a difference which reaches to ultimate principles is fatal to all argument. The possibility of present conflict is banished in such a case as absolutely as that of future concord. A large amount of popular literature on the subject of miracles seems to labour under this defect. Arguments are stated and examined for and against the credibility of miraculous stories without the disputants appearing to have any adequate conception of the chasm which separates one side from the other.
§ 24. It's now quite clear that depending on how we approach any specific miracle or extraordinary story with certain biases, the question of its credibility will take on a very different form. It’s sometimes overlooked that while differing views about facts are a fundamental part of a bonâ fide argument, a disagreement that goes down to basic principles can completely undermine any argument. In such cases, the possibility of current conflict is eliminated just like that of future agreement. A lot of popular literature on miracles seems to struggle with this issue. Arguments for and against the credibility of miraculous stories are presented and debated, but the people involved often don't seem to grasp the significant gap that separates the two sides.
§ 25. The following illustration may serve in some degree to show the sort of inconsistency of which we are speaking. A sailor reports that in some remote coral island of the Pacific, on which he had landed by himself, he had found a number of stones on the beach disposed in the exact form of a cross. Now if we conceive a debate to arise about the truth of his story, in which it is attempted to decide the matter simply by considerations about the validity of testimony, without introducing the question of the existence of inhabitants, and the nature of their customs, we shall have some notion of the unsatisfactory nature of many of the current arguments about miracles. All illustrations of this subject are imperfect, but a case like this, in which a supposed trace of human agency is detected interfering with the orderly sequence of other and non-intelligent natural causes, is as much to the point as any illustration can be. The thing omitted here from the discussion is clearly the one important thing. If we suppose that there is no inhabitant, we shall probably disbelieve the story, or consider it to be grossly exaggerated. If we suppose that there are inhabitants, the question is at once resolved into a different and somewhat more intricate one. The credibility of the witness is not the only element, but we should necessarily have to take into consideration the character of the supposed inhabitants, and the object of such an action on their part.
§ 25. The following example may help illustrate the inconsistency we're discussing. A sailor claims that on a remote coral island in the Pacific, where he landed alone, he found several stones on the beach arranged exactly like a cross. Now, if a debate arises over the truth of his story, and it's attempted to resolve the issue solely by examining the validity of his testimony, without considering whether there are inhabitants on the island and what their customs might be, we can grasp the unsatisfying nature of many current arguments about miracles. All examples on this topic are imperfect, but a situation like this, where a supposed sign of human activity disrupts the orderly flow of other non-intelligent natural processes, is as relevant as any illustration can be. The critical aspect missing from this discussion is obviously the most important one. If we assume there are no inhabitants, we would likely doubt the story or see it as seriously exaggerated. If we assume there are inhabitants, the question then shifts to a different and more complex one. The credibility of the witness isn't the only factor; we would also need to consider the nature of the alleged inhabitants and the reasons for their actions.
§ 26. Considerations of this character are doubtless often introduced into the discussion, but it appears to me that they are introduced to a very inadequate extent. It is often urged, after Paley, ‘Once believe in a God, and miracles are not incredible.’ Such an admission surely demands some modification and extension. It should rather be stated thus, Believe in a God whose working may be traced throughout the whole moral and physical world. It amounts, in fact, to this;—Admit that there may be a design which we can trace somehow or other in the course of things; admit that we are not wholly confined to tracing the connection of events, or following out their effects, but that we can form some idea, feeble and imperfect though it be, of a scheme.[10] Paley's advice sounds too much like saying, Admit that there are fairies, and we can account for our cups being cracked. The admission is not to be made in so off-hand a manner. To any one labouring under the difficulty we are speaking of, this belief in a God almost out of any constant relation to nature, whom we then imagine to occasionally manifest himself in a perhaps irregular manner, is altogether impossible. The only form under which belief in the Deity can gain entrance into his mind is as the controlling Spirit of an infinite and orderly system. In fact, it appears to me, paradoxical as the suggestion may appear, that it might even be more easy for a person thoroughly imbued with the spirit of Inductive science, though an atheist, to believe in a miracle which formed a part of a vast system, than for such a person, as a theist, to accept an isolated miracle.
§ 26. Discussions like this often come up, but it seems to me they're not explored thoroughly enough. People frequently say, following Paley, "Once you believe in a God, miracles aren’t hard to accept." This statement definitely needs some tweaking and elaboration. It should be more like this: Believe in a God whose influence can be seen throughout the entire moral and physical world. Essentially, it boils down to this: Accept that there may be a design that we can somewhat trace in the unfolding of events; accept that we aren’t just limited to connecting dots between events or exploring their outcomes, but that we can form some idea, however weak and incomplete it may be, of a scheme.[10] Paley's suggestion feels too much like saying, "Admit that there are fairies, and we can explain why our cups are cracked." That kind of admission shouldn’t be made so casually. For anyone grappling with the challenges we're discussing, the idea of a God who seems unrelated to nature, manifesting occasionally in an unpredictable way, is completely untenable. The only way belief in a Deity can take hold in someone’s mind is as the governing Spirit of a vast and orderly system. In fact, it seems to me, no matter how contradictory it may sound, that it could actually be easier for someone deeply rooted in Inductive science, even if they’re an atheist, to believe in a miracle as part of a larger system than for that same person, as a theist, to accept a standalone miracle.
§ 27. It is therefore with great prudence that Hume, and others after him, have practically insisted on commencing with a discussion of the credibility of the single miracle, treating the question as though the Christian Revelation could be adequately regarded as a succession of such events. As well might one consider the living body to be represented by the aggregate of the limbs which compose it. What is to be complained of in so many popular discussions on the subject is the entire absence of any recognition of the different ground on which the attackers and defenders of miracles are so often really standing. Proofs and illustrations are produced in endless number, which involving, as they almost all do in the mind of the disputants on one side at least, that very principle of causation, the absence of which in the case in question they are intended to establish, they fail in the single essential point. To attempt to induce any one to disbelieve in the existence of physical causation, in a given instance, by means of illustrations which to him seem only additional examples of the principle in question, is like trying to make a dam, in order to stop the flow of a river, by shovelling in snow. Such illustrations are plentiful in times of controversy, but being in reality only modified forms of that which they are applied to counteract, they change their shape at their first contact with the disbeliever's mind, and only help to swell the flood which they were intended to check.
§ 27. It is with great caution that Hume, and others after him, have emphasized starting with a discussion about the credibility of a single miracle, treating the issue as if the Christian Revelation could be seen as a series of such events. It would be just as reasonable to think of a living body as merely a collection of its limbs. What is troubling in many popular discussions on this topic is the complete lack of acknowledgment of the different perspectives on which the critics and defenders of miracles often stand. Proofs and examples are presented in endless quantities, which, since they generally involve the very principle of causation that they aim to challenge, fail in the crucial aspect. Trying to convince someone to doubt the existence of physical causation in a particular case using examples that simply seem like more instances of that principle to them is like trying to build a dam to stop a river's flow by shoveling in snow. These examples are abundant during controversies, but since they are essentially just altered versions of what they seek to oppose, they lose their form upon first encountering the skeptical mind and only contribute to the very issue they were meant to address.
1 Reasons were given in the last chapter against the propriety of applying the rules of Probability with any strictness to such examples as these. But although all approach to numerical accuracy is unattainable, we do undoubtedly recognize in ordinary life a distinction between the credibility of one witness and another; such a rough practical distinction will be quite sufficient for the purposes of this chapter. For convenience, and to illustrate the theory, the examples are best stated in a numerical form, but it is not intended thereby to imply that any such accuracy is really attainable in practice.
1 In the last chapter, we discussed why it's not appropriate to strictly apply the rules of Probability to examples like these. While we can’t achieve precise numerical accuracy, we definitely see a difference in the reliability of one witness compared to another in everyday life. This rough practical distinction will be enough for this chapter. For clarity and to explain the theory better, the examples are presented in numerical form, but this doesn’t mean that such accuracy can actually be achieved in real life.
2 I must plead guilty to this charge myself, in the first edition of this work. The result was to make the treatment of this part of the subject obscure and imperfect, and in some respects erroneous.
2 I have to admit that I’m guilty of this in the first edition of this work. This led to a treatment of this part of the subject being unclear and incomplete, and in some ways, incorrect.
3 The generalized algebraical form of this result is as follows. Let p be the à priori probability of an event, and x be the credibility of the witness. Then, if he asserts that the event happened, the probability that it really did happen is xp/(xp + (1 − x)(1 − p));
3 The general algebraic form of this result is as follows. Let p be the a priori probability of an event, and x be the credibility of the witness. Then, if they claim that the event occurred, the probability that it actually did occur is xp/(xp + (1 − x)(1 − p));
whilst if he asserts that it did not happen the probability that it did happen is (1 − x)p/((1 − x)p + x(1 − p)).
whilst if he claims that it did not happen, the likelihood that it did happen is (1 − x)p/((1 − x)p + x(1 − p)).
In illustration of some remarks to be presently made, the reader will notice that on making either of these expressions = p, we obtain in each case x = 1/2. That is, a witness whose veracity = 1/2 leaves the à priori probability of an event (of this kind) unaffected.
To illustrate some upcoming comments, the reader will notice that when either of these expressions equals p, we get x = 1/2 in both cases. This means that a witness whose truthfulness equals 1/2 does not change the à priori probability of an event of this type.
If, on the other hand, we make these expressions equal to x and 1 − x respectively, we obtain in each case p = 1/2. That is, when an event (of this kind) is as likely to happen as not, the ordinary veracity of the witness in respect of it remains unaffected.
If we set these expressions equal to x and 1 - x respectively, we find that in both cases p = 1/2. This means that when an event of this kind is equally likely to happen or not, a witness's general truthfulness about it stays unchanged.
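The two observations in this footnote are easy to confirm numerically; the following sketch (not part of the original) simply assumes the single-witness formulas quoted above, with illustrative values for p and x:

```python
def given_assertion(p, x):
    """P(event | the witness asserts it), with prior p and veracity x."""
    return x * p / (x * p + (1 - x) * (1 - p))

def given_denial(p, x):
    """P(event | the witness denies it)."""
    return (1 - x) * p / ((1 - x) * p + x * (1 - p))

# A witness of veracity 1/2 leaves the prior untouched:
print(given_assertion(0.3, 0.5), given_denial(0.3, 0.5))  # 0.3 and 0.3

# When the prior is 1/2, the result is simply x (or 1 - x on a denial):
print(given_assertion(0.5, 0.8), given_denial(0.5, 0.8))  # 0.8 and 0.2
```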
4 Todhunter's History, p. 400. Philosophical Magazine, July, 1864.
4 Todhunter's History, p. 400. Philosophical Magazine, July, 1864.
5 “When therefore these two kinds of experience are contrary, we have nothing to do but subtract the one from the other, and embrace an opinion, either on one side or the other, with that assurance which arises from the remainder.” (Essay on Miracles.)
5 “When these two types of experience conflict, we can only take one away from the other and choose a side, holding on to that belief with the confidence that comes from what’s left.” (Essay on Miracles.)
6 Considerations of this kind have indeed been introduced into the mathematical treatment of the subject. The common algebraical solution of the problem in § 5 (to begin with the simplest case) is of course as follows. Let p be the antecedent probability of the event, and t the measure of the truthfulness of the witness; then the chance of his statement being true is pt/(pt + (1 − p)(1 − t)). This supposes him to lie as much when the event does not happen as when it does. But we may meet the cases supposed in the text by assuming that t′ is the measure of his veracity when the event does not happen, so that the above formula becomes pt/(pt + (1 − p)(1 − t′)). Here t′ and t measure respectively his trustworthiness in usual and unusual events. As a formal solution this certainly meets the objections stated above in §§ 14 and 15. The determination however of t′ would demand, as I have remarked, continually renewed appeal to experience. In any case the practical methods which would be adopted, if any plans of the kind indicated above were resorted to, seem to me to differ very much from that adopted by the mathematicians, in their spirit and plan.
6 Considerations like this have actually been included in the mathematical analysis of the topic. The standard algebraic solution to the problem in § 5 (starting with the simplest case) is, of course, as follows. Let p be the initial probability of the event, and t be the measure of the witness's truthfulness; then the likelihood of his statement being true is pt/(pt + (1 − p)(1 − t)). This assumes he lies just as much when the event doesn't occur as when it does. However, we can address the cases mentioned in the text by assuming that t′ is the measure of his honesty when the event doesn't happen, changing the formula to pt/(pt + (1 − p)(1 − t′)). Here, t′ and t measure his reliability in typical and atypical situations, respectively. As a formal solution, this definitely addresses the concerns raised in §§ 14 and 15. However, determining t′ would require, as I've noted, a constant reference to real-world experience. In any case, the practical methods that would be used if any plans like those suggested were implemented seem to me to differ greatly in spirit and approach from those used by mathematicians.
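For concreteness, the footnote's formula, with its optional second veracity t′ for unusual events, might be sketched as follows (an illustrative Python rendering, not part of the original; the sample numbers are merely invented):

```python
def prob_statement_true(p, t, t_prime=None):
    """Chance an asserted event really happened: prior p, veracity t when the
    event happens, and optionally a separate veracity t_prime when it does not."""
    if t_prime is None:
        t_prime = t
    return p * t / (p * t + (1 - p) * (1 - t_prime))

print(prob_statement_true(0.001, 0.9))         # symmetric form: about 0.009
print(prob_statement_true(0.001, 0.9, 0.99))   # a witness who rarely invents the unusual: about 0.083
```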
7 Laplace, for instance (Essai, ed. 1825, p. 149), says that if we saw 100 dies (known of course to be fair ones) all give the same face, we should be bewildered at the time, and need confirmation from others, but that, after due examination, no one would feel obliged to postulate hallucination in the matter. But the chance of this occurrence is represented by a fraction whose numerator is 1, and denominator contains 77 figures, and is therefore utterly inappreciable by the imagination. It must be admitted, though, that there is something hypothetical about such an example, for we could not really know that the dies were fair with a confidence even distantly approaching such prodigious odds. In other words, it is difficult here to keep apart those different aspects of the question discussed in Chap. XIV. §§ 28–33.
7 Laplace, for instance (Essai, ed. 1825, p. 149), says that if we were to roll 100 dice (which we know are fair) and all of them showed the same face, we would be confused at first and would need confirmation from others. However, after careful examination, no one would feel the need to suggest that hallucination was at play. The probability of this happening is represented by a fraction with a numerator of 1 and a denominator that has 77 digits, making it completely unimaginable. It must be acknowledged, though, that there is something hypothetical about this example, as we couldn't really be sure that the dice were fair with any level of confidence that would come close to such extraordinary odds. In other words, it’s challenging to separate the different aspects of this issue discussed in Chap. XIV. §§ 28–33.
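As a rough check on Laplace's illustration (assuming the event is that 100 fair dice all show one named face; the exact digit count depends on how the event is framed, but it lands in the region the text describes):

```python
denominator = 6 ** 100          # the chance of the throw is 1 / 6**100
print(len(str(denominator)))    # 78 digits -- "utterly inappreciable by the imagination"
```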
8 In the first edition this was stated, as it now seems to me, in decidedly too unqualified a manner. It must be remembered, however, that (as was shown in § 7) this plan is really the best theoretical one which can be adopted in certain cases.
8 In the first edition, this was stated, as it now seems to me, in a way that was definitely too absolute. It’s important to remember, though, that (as shown in § 7) this plan is actually the best theoretical option that can be chosen in certain situations.
9 It is on this principle that the remarkable conclusion mentioned on p. 405 is based. Suppose an event whose probability is p; and that, of a number of witnesses of the same veracity (y), m assert that it happened, and n deny this. Generalizing the arithmetical reasoning given above we see that the chance of the event being asserted varies as py^m(1 − y)^n + (1 − p)y^n(1 − y)^m
9 This principle forms the basis of the remarkable conclusion mentioned on p. 405. Let’s say there’s an event with a probability of p; among witnesses of the same truthfulness (y), m claim it occurred, while n deny it. By generalizing the arithmetic reasoning presented above, we see that the likelihood of the event being affirmed changes as py^m(1 − y)^n + (1 − p)y^n(1 − y)^m
(viz. as the chance that the event happens, and that m are right and n are wrong; plus the chance that it does not happen, and that n are right and m are wrong). And the chance of its being rightly asserted varies as py^m(1 − y)^n. Therefore the chance that when we have an assertion before us it is a true one is py^m(1 − y)^n / {py^m(1 − y)^n + (1 − p)y^n(1 − y)^m},
(viz. as the probability that the event occurs, and that m are accurate and n are incorrect; plus the probability that it doesn't occur, and that n are accurate and m are incorrect). And the probability of its being rightly asserted varies as py^m(1 − y)^n. Therefore, the probability that when we have an assertion presented to us, it is true is py^m(1 − y)^n / {py^m(1 − y)^n + (1 − p)y^n(1 − y)^m},
which is equal to py^(m − n) / {py^(m − n) + (1 − p)(1 − y)^(m − n)}.
which equals py^(m − n) / {py^(m − n) + (1 − p)(1 − y)^(m − n)}.
But this last expression represents the probability of an assertion which is unanimously supported by m − n such witnesses.
But this last expression represents the likelihood of a claim that is fully backed by m − n witnesses.
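A small numerical check of this footnote (with illustrative values of p, y, m and n, not taken from the text) shows that m assertions against n denials carry exactly the weight of m − n unanimous witnesses:

```python
def assertions_vs_denials(p, y, m, n):
    yes = p * y**m * (1 - y)**n         # event happened; m right, n wrong
    no = (1 - p) * y**n * (1 - y)**m    # event did not happen; n right, m wrong
    return yes / (yes + no)

def unanimous(p, y, k):
    return p * y**k / (p * y**k + (1 - p) * (1 - y)**k)

p, y = 0.3, 0.9
print(assertions_vs_denials(p, y, m=5, n=2))  # about 0.9968
print(unanimous(p, y, k=3))                   # the same, since 5 - 2 = 3
```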
CHAPTER 18.
THE NATURE AND USE OF AN AVERAGE, AND THE DIFFERENT KINDS OF AVERAGE.[*]
* There is much need of some good account, accessible to the ordinary English reader, of the nature and properties of the principal kinds of Mean. The common text-books of Algebra suggest that there are only three such, viz. the arithmetical, the geometrical and the harmonical:—thus including two with which the statistician has little or nothing to do, and excluding two or more with which he should have a great deal to do. The best three references I can give the reader are the following. (1) The article Moyenne in the Dictionnaire des Sciences Médicales, by Dr Bertillon. This is written somewhat from the Quetelet point of view. (2) A paper by Fechner in the Abhandlungen d. Math. phys. Classe d. Kön. Sächs. Gesellschaft d. Wiss. 1878; pp. 1–76. This contains a very interesting discussion, especially for the statistician, of a number of different kinds of mean. His account of the median is remarkably full and valuable. But little mathematical knowledge is demanded. (3) A paper by Mr F. Y. Edgeworth in the Camb. Phil. Trans. for 1885, entitled Observations and Statistics. This demands some mathematical knowledge. Instead of dealing, as such investigations generally do, with only one Law of Error and with only one kind of mean, it covers a wide field of investigation.
* There is a significant need for a good overview, accessible to the average English reader, of the nature and properties of the main types of mean. Most algebra textbooks suggest that there are only three types: the arithmetic, geometric, and harmonic means. This includes two types that statisticians rarely deal with and excludes two or more that they should engage with regularly. Here are the best three references I can recommend to the reader. (1) The article Moyenne in the Dictionnaire des Sciences Médicales, by Dr. Bertillon. This is written somewhat from the Quetelet perspective. (2) A paper by Fechner in the Abhandlungen d. Math. phys. Classe d. Kön. Sächs. Gesellschaft d. Wiss. 1878; pp. 1–76. This contains a very interesting discussion, especially for statisticians, regarding several different kinds of mean. His explanation of the median is particularly thorough and valuable. It requires minimal mathematical knowledge. (3) A paper by Mr. F. Y. Edgeworth in the Camb. Phil. Trans. for 1885, titled Observations and Statistics. This requires some mathematical understanding. Instead of focusing, as such studies usually do, on just one Law of Error and one type of mean, it explores a broad range of topics.
§ 1. We have had such frequent occasion to refer to averages, and to the kind of uniformity which they are apt to display in contrast with individual objects or events, that it will now be convenient to discuss somewhat more minutely what are the different kinds of available average, and what exactly are the functions they perform.
§ 1. We have referenced averages so often and noted the uniformity they tend to show compared to individual cases or events that it makes sense to take a closer look at the different types of averages available and the specific roles they play.
The first vague notion of an average, as we now understand it, seems to me to involve little more than that of a something intermediate to a number of objects. The objects must of course resemble each other in certain respects, otherwise we should not think of classing them together; and they must also differ in certain respects, otherwise we should not distinguish between them. What the average does for us, under this primitive form, is to enable us conveniently to retain the group together as a whole. That is, it furnishes a sort of representative value of the quantitative aspect of the things in question, which will serve for certain purposes to take the place of any single member of the group.
The first vague idea of an average, as we understand it today, seems to be simply about something intermediate among a group of objects. The objects need to have some similarities; otherwise, we wouldn't think of grouping them. They also need to have some differences, or we wouldn't distinguish between them. What the average does for us, in this basic form, is help us keep the group together as a whole. In other words, it provides a sort of representative value for the quantitative aspect of the items in question, which can be used in certain situations to stand in for any single member of the group.
It would seem then that the first dawn of the conception which science reduces to accuracy under the designation of an average or mean, and then proceeds to subdivide into various distinct species of means, presents itself as performing some of the functions of a general name. For what is the main use of a general name? It is to reduce a plurality of objects to unity; to group a number of things together by reference to some qualities which they possess in common. The ordinary general name rests upon a considerable variety of attributes, mostly of a qualitative character, whereas the average, in so far as it serves the same sort of purpose, rests rather upon a single quantitative attribute. It directs attention to a certain kind and degree of magnitude. When the grazier says of his sheep that ‘one with another they will fetch about 50 shillings,’ or the farmer buys a lot of poles which ‘run to about 10 feet,’ it is true that they are not strictly using the equivalent of either a general or a collective name. But they are coming very near to such use, in picking out a sort of type or specimen of the magnitude to which attention is to be directed, and in classing the whole group by its resemblance to this type. The grazier is thinking of his sheep: not in a merely general sense, as sheep, and therefore under that name or conception, but as sheep of a certain approximate money value. Some will be more, some less, but they are all near enough to the assigned value to be conveniently classed together as if by a name. Many of our rough quantitative designations seem to be of this kind, as when we speak of ‘eight-day clocks’ or ‘twelve-stone men,’ &c.; unless of course we intend (as we sometimes do in these cases) to assign a maximum or minimum value. It is not indeed easy to see how else we could readily convey a merely general notion of the quantitative aspect of things, except by selecting a type as above, or by assigning certain limits within which the things are supposed to lie.
It seems that the first hint of the concept that science defines with precision as an average or mean, which is then broken down into different types of means, acts similarly to a general name. What’s the primary function of a general name? It’s to bring together various objects into one category; to group several items based on some shared qualities. A typical general name is based on a diverse range of attributes, mostly qualitative, while the average, as it serves a similar purpose, relies more on a single quantitative attribute. It focuses on a specific kind and degree of measurement. When a grazier says about his sheep, “on average they’ll sell for about 50 shillings,” or a farmer buys poles that “are about 10 feet long,” it’s true that they aren’t strictly using a general or collective name. But they are quite close to that usage by identifying a type or example of the size to highlight and categorizing the whole group by its similarity to this example. The grazier isn't thinking of his sheep in just a general way, as sheep, and therefore under that label, but as sheep with a specific approximate monetary value. Some will be worth more, some less, but they’re all close enough to the stated value to be conveniently grouped together as if they have a name. Many of our vague quantitative labels seem to fall into this category, like when we refer to “eight-day clocks” or “twelve-stone men,” unless, of course, we mean to specify a maximum or minimum value. It’s not easy to see how else we could quickly communicate a general idea of the quantitative aspect of things other than by selecting a type as mentioned above, or by setting certain limits within which the items are expected to fall.
§ 2. So far there is not necessarily any idea introduced of comparison,—of comparison, that is, of one group with another,—by aid of such an average. As soon as we begin to think of this we have to be more precise in saying what we mean by an average. We can easily see that the number of possible kinds of average, in the sense of intermediate values, is very great; is, in fact, indefinitely great. Out of the general conception of an intermediate value, obtained by some treatment of the original magnitudes, we can elicit as many subdivisions as we please, by various modes of treatment. There are however only three or four which for our purposes need be taken into account.
§ 2. So far, there isn't necessarily any idea of comparison introduced—specifically, comparing one group to another—by using such an average. Once we start thinking about this, we need to be more precise about what we mean by an average. It's easy to see that there are many possible types of averages, in the sense of intermediate values; in fact, there are infinitely many. From the general idea of an intermediate value, derived from some analysis of the original quantities, we can create as many categories as we want through different methods of analysis. However, there are only three or four that we need to focus on for our purposes.
(1) In the first place there is the arithmetical average or mean. The rule for obtaining this is very simple: add all the magnitudes together, and divide the sum by their number. This is the only kind of average with which the unscientific mind is thoroughly familiar. But we must not let this simplicity and familiarity blind us to the fact that there are definite reasons for the employment of this average, and that it is therefore appropriate only in definite circumstances. The reason why it affords a safe and accurate intermediate value for the actual divergent values, is that for many of the ordinary purposes of life, such as purchase and sale, we come to exactly the same result, whether we take account of those existent divergences, or suppose all the objects equated to their average. What the grazier must be understood to mean, if he wishes to be accurate, by saying that the average price of his sheep is 50 shillings, is, that so far as that flock is concerned (and so far as he is concerned), it comes to exactly the same thing, whether they are each sold at different prices, or are all sold at the ‘average’ price. Accordingly, when he compares his sales of one year with those of another; when he says that last year the sheep averaged 48 shillings against the 50 of this year; the employment of this representative or average value is a great simplification, and is perfectly accurate for the purpose in question.
(1) First, let's talk about the arithmetic average or mean. The method for calculating it is very straightforward: add all the values together and divide the total by the number of values. This is the only type of average that most people know well. However, we shouldn't let its simplicity and familiarity make us overlook that there are valid reasons for using this average, and it's only suitable in certain situations. The reason it provides a reliable and precise middle value for the actual varying values is that, for many everyday purposes like buying and selling, we end up with the same result whether we consider those existing differences or assume everything is equal to the average. When a farmer states that the average price of his sheep is 50 shillings, he means that regarding that flock (and for his purposes), it makes no difference if they are sold at different prices or all at the ‘average’ price. So, when he compares his sales from one year to another, saying that last year the sheep averaged 48 shillings compared to this year's 50, using this average value simplifies things a lot and is perfectly accurate for what he needs.
§ 3. (2) Now consider this case. A certain population is found to have doubled itself in 100 years: can we talk of an ‘average’ increase here of 1 per cent. annually? The circumstances are not quite the same as in the former case, but the analogy is sufficiently close for our purpose. The answer is decidedly, No. If 100 articles of any kind are sold for £100, we say that the average price is £1. By this we mean that the total amount is the same whether the entire lot are sold for £100, or whether we split the lot up into individuals and sell each of these for £1. The average price here is a convenient fictitious substitute, which can be applied for each individual without altering the aggregate total. If therefore the question be, Will a supposed increase of 1 p. c. in each of the 100 years be equivalent to a total increase to double the original amount? we are proposing a closely analogous question. And the answer, as just remarked, must be in the negative. An annual increase of 1 p. c. continued for 100 years will more than double the total; it will multiply it by about 2.7. The true annual increment required is measured by the 100th root of 2; that is, the population may be said to have increased ‘on the average’ 0.7 p. c. annually.
§ 3. (2) Now think about this situation. A certain population has doubled in 100 years: can we say there’s an ‘average’ increase of 1 percent annually? The conditions aren’t exactly the same as before, but the comparison is close enough for our needs. The answer is definitely No. If we sell 100 items for £100, we say the average price is £1. This means the total is the same whether we sell everything for £100 or break it down and sell each item for £1. The average price here is a useful fictional substitute that can be applied to each individual without changing the overall total. So if the question is, will a supposed increase of 1 percent each year for 100 years double the original amount? we’re asking a very similar question. And the answer, as noted earlier, must be negative. An annual increase of 1 percent over 100 years will actually more than double the total; it will grow it by about 2.7 times. The actual annual growth needed is given by the 100th root of 2 (about 1.007); meaning the population has increased ‘on average’ by about 0.7 percent each year.
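The contrast between the two rates is easy to verify with a short sketch (again, not part of the original text; it simply recomputes the figures above):

```python
# A flat 1% yearly increase, compounded, overshoots a doubling...
print(1.01 ** 100)            # about 2.7048

# ...while the rate that exactly doubles in 100 years is the 100th root of 2.
rate = 2 ** (1 / 100)
print((rate - 1) * 100)       # about 0.696, i.e. roughly 0.7% a year
print(rate ** 100)            # 2.0 (up to rounding)
```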
We are thus directed to the second kind of average discussed in the ordinary text-books of algebra, viz. the geometrical. When only two quantities are concerned, with a single intermediate value between them, the geometrical mean constituting this last is best described as the mean proportional between the two former. Thus, since 3 : √15 :: √15 : 5, √15 is the geometrical mean between 3 and 5. When a number of geometrical means have to be interposed between two quantities, they are to be so chosen that every term in the entire succession shall bear the same constant ratio to its predecessor. Thus, in the example in the last paragraph, 99 intermediate steps were to be interposed between 1 and 2, with the condition that the 100 ratios thus produced were to be all equal.
We are now led to the second type of average discussed in standard algebra textbooks, which is the geometric mean. When there are only two quantities with a single intermediate value between them, the geometric mean is best described as the mean proportional between the two original quantities. For example, since 3 : √15 :: √15 : 5, √15 is the geometric mean between 3 and 5. When several geometric means need to be inserted between two quantities, they should be chosen so that each term in the entire sequence has the same constant ratio to the one before it. For instance, in the example from the last paragraph, 99 intermediate steps were to be added between 1 and 2, ensuring that all 100 ratios produced were equal.
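A brief sketch makes the construction concrete: the geometric mean of 3 and 5, and the 99 interposed means between 1 and 2 with their 100 equal ratios (illustrative code only):

```python
import math

a, b = 3, 5
g = math.sqrt(a * b)          # the mean proportional, sqrt(15)
print(g, a / g, g / b)        # the two ratios 3:g and g:5 come out equal

r = 2 ** (1 / 100)            # common ratio when 99 means are interposed between 1 and 2
series = [r ** i for i in range(101)]
print(series[0], series[-1])  # 1.0 and 2.0 (to within rounding)
```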
It would seem therefore that wherever accurate quantitative results are concerned, the selection of the appropriate kind of average must depend upon the answer to the question, What particular intermediate value may be safely substituted for the actual variety of values, so far as the precise object in view is concerned? This is an aspect of the subject which will have to be more fully considered in the next chapter. But it may safely be laid down that for purposes of general comparison, where accurate numerical relations are not required, almost any kind of intermediate value will answer our purpose, provided we adhere to the same throughout. Thus, if we want to compare the statures of the inhabitants of different counties or districts in England, 440 or of Englishmen generally with those of Frenchmen, or to ascertain whether the stature of some particular class or district is increasing or diminishing, it really does not seem to matter what sort of average we select provided, of course, that we adhere to the same throughout our investigations. A very large amount of the work performed by averages is of this merely comparative or non-quantitative description; or, at any rate, nothing more than this is really required. This being so, we should naturally resort to the arithmetical average; partly because, having been long in the field, it is universally understood and appealed to, and partly because it happens to be remarkably simple and easy to calculate.
It seems that when it comes to accurate quantitative results, choosing the right type of average depends on answering the question, "What specific intermediate value can be safely used instead of the actual range of values, based on the particular goal?" This is something we will look at more closely in the next chapter. However, it can be said that for general comparisons, where precise numerical relationships aren't essential, almost any type of intermediate value will work, as long as we stay consistent throughout. For example, if we want to compare the heights of people in different counties or regions in England, or compare Englishmen to Frenchmen, or determine whether the height of a specific group or area is increasing or decreasing, it doesn't really matter what type of average we choose, as long as we remain consistent in our analysis. A significant portion of the work done with averages is simply for comparative or non-quantitative purposes; or, in most cases, that's all that's really needed. Given this, we would typically use the arithmetic average, partly because it has been around for a long time and is widely understood and accepted, and partly because it is very straightforward and easy to calculate.
§ 4. The arithmetical mean is for most ordinary purposes the simplest and best. Indeed, when we are dealing with a small number of somewhat artificially selected magnitudes, it is the only mean which any one would think of employing. We should not, for instance, apply any other method to the results of a few dozen measurements of lengths or estimates of prices.
§ 4. The arithmetic mean is, for most everyday purposes, the simplest and best. In fact, when we're working with a small number of somewhat carefully chosen values, it's the only mean anyone would consider using. For example, we wouldn’t use any other method for a few dozen measurements of lengths or price estimates.
When, however, we come to consider the results of a very large number of measurements of the kind which can be grouped together into some sort of ‘probability curve’ we begin to find that there is more than one alternative before us. Begin by recurring to the familiar curve represented on p. 29; or, better still, to the initial form of it represented in the next chapter (p. 476). We see that there are three different ways in which we may describe the vertex of the curve. We may call it the position of the maximum ordinate; or that of the centre of the curve; or (as will be seen hereafter) the point to which the arithmetical average of all the different values of the variable magnitude directs us. These three are all distinct ways of describing a position; but when we are dealing with a symmetrical curve at all resembling the binomial or exponential form they all three coincide in giving the same result: as they obviously do in the case in question.
When we look at the results from a large number of measurements that can be grouped into a type of "probability curve," we start to see that there are multiple options available to us. Let's refer back to the familiar curve shown on p. 29; or even better, to its initial form represented in the next chapter (p. 476). We observe that there are three different ways we can describe the vertex of the curve. We can call it the position of the maximum ordinate; or that of the center of the curve; or (as will be shown later) the point that the arithmetic average of all the different values of the variable leads us to. These three are all distinct ways of describing a position; but when we are dealing with a symmetrical curve similar to the binomial or exponential form, they all coincide in providing the same result, as they clearly do in this situation.
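For a symmetrical case the coincidence of the three descriptions is easily seen by simulation (an illustrative sketch using a binomial sample; the choice of twenty tosses is mine, not the text's):

```python
import random
import statistics
from collections import Counter

random.seed(1)
# 100,000 values of a symmetrical (binomial) quantity: heads in 20 fair tosses
samples = [sum(random.random() < 0.5 for _ in range(20)) for _ in range(100_000)]

mode = Counter(samples).most_common(1)[0][0]   # position of the maximum ordinate
print(mode, statistics.median(samples), statistics.mean(samples))  # all at (or very near) 10
```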
As soon, however, as we come to consider the case of asymmetrical, or lop-sided curves, the indications given by these three methods will be as a rule quite distinct; and therefore the two former of these deserve brief notice as representing different kinds of means from the arithmetical or ordinary one. We shall see that there is something about each of them which recommends it to common sense as being in some way natural and appropriate.
As soon as we look at asymmetrical or lopsided curves, the results from these three methods will usually be pretty different. Therefore, the first two methods deserve a quick mention because they represent different types of means compared to the standard arithmetic mean. We'll see that each of them has qualities that make it seem more natural and fitting in a common-sense way.
§ 5. (3) The first of these selects from amongst the various different magnitudes that particular one which is most frequently represented. It has not acquired any technical designation,[1] except in so far as it is referred to, by its graphical representation, as the “maximum ordinate” method. But I suspect that some appeal to such a mean or standard is really far from uncommon, and that if we could draw out into clearness the conceptions latent in the judgments of the comparatively uncultivated, we should find that there were various classes of cases in which this mean was naturally employed. Suppose, for instance, that there was a fishery in which the fish varied very much in size 442 but in which the commonest size was somewhat near the largest or the smallest. If the men were in the habit of selling their fish by weight, it is probable that they would before long begin to acquire some kind of notion of what is meant by the arithmetical mean or average, and would perceive that this was the most appropriate test. But if the fish were sorted into sizes, and sold by numbers in each of these sizes, I suspect that this appeal to a maximum ordinate would begin to take the place of the other. That is, the most numerous class would come to be selected as a sort of type by which to compare the same fishery at one time and another, or one fishery with others. There is also, as we shall see in the next chapter, some scientific ground for the preference of this kind of mean in peculiar cases; viz. where the quantities with which we deal are true ‘errors,’ in the estimate of some magnitude, and where also it is of much more importance to be exactly right, or very nearly right, than to have merely a low average of error.
§ 5. (3) The first of these selects, from among the various different values, the particular one that occurs most often. It hasn’t gained any particular technical name,[1] except that it’s referred to, through its graphical representation, as the “maximum ordinate” method. However, I think that referring to such an average or standard isn’t unusual at all, and if we could clarify the ideas that are implied in the judgments of those who are less educated, we would find that there are several situations where this average is naturally used. For example, imagine there’s a fishery where the fish come in many different sizes, but the most common size is close to the largest or the smallest. If the fishermen usually sell their fish by weight, it’s likely they would soon develop some understanding of what the arithmetic mean or average means, and would recognize that this is the best measure. But if the fish were sorted by size and sold by the number in each size category, I suspect that this reference to a maximum ordinate would start to replace the other. In other words, the most numerous size class would come to be chosen as a kind of standard to compare the same fishery over time or to compare one fishery to another. As we’ll see in the next chapter, there’s also some scientific reason for preferring this type of average in specific cases; namely, when the quantities we’re dealing with are true ‘errors’ in estimating some magnitude, and where it’s much more important to be exactly correct, or very close to it, than just to have a low average error.
§ 6. (4) The remaining kind of mean is that which is now coming to be called the “median.” It is one with which the writings of Mr Galton have done so much to familiarize statisticians, and is best described as follows. Conceive all the objects in question to be marshalled in the order of their magnitude; or, what comes to the same thing, conceive them sorted into a number of equally numerous classes; then the middle one of the row, or the middle one in the middle class, will be the median. I do not think that this kind of mean is at all generally recognized at present, but if Mr Galton's scheme of natural measurement by what he calls “per-centiles” should come to be generally adopted, such a test would become an important one. There are some conspicuous advantages about this kind of mean. For one thing, in most statistical enquiries, it is 443 far the simplest to calculate; and, what is more, the process of determining it serves also to assign another important element to be presently noticed, viz. the ‘probable error.’ Then again, as Fechner notes, whereas in the arithmetical mean a few exceptional and extreme values will often cause perplexity by their comparative preponderance, in the case of the median (where their number only and not their extreme magnitude is taken into account) the importance of such disturbance is diminished.
§ 6. (4) The other type of average is what we now refer to as the “median.” Mr. Galton's work has really helped statisticians get familiar with it, and it can be described like this: Imagine all the items arranged in order of their size; alternatively, think of them sorted into several classes that have the same number of items. The item in the middle of this sequence, or the one in the middle class, is the median. I don’t believe this kind of average is widely recognized right now, but if Mr. Galton's method of natural measurement using what he calls “percentiles” becomes commonly accepted, then this average will become significant. There are some clear advantages to this type of average. For one, it’s the easiest to calculate in most statistical investigations; additionally, the process of finding it also provides another important statistic that we’ll discuss shortly, namely, the ‘probable error.’ Furthermore, as Fechner points out, while extreme values can often skew the arithmetic mean and create confusion due to their weight, in the case of the median (where only the count matters and not the extreme size), the impact of such extremes is reduced.
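A small numerical illustration of Fechner's point may help. The sketch below (a minimal Python example, with an invented set of values) shows how a single exceptional, extreme value pulls the arithmetic mean away while leaving the median almost untouched.

    # Contrast the arithmetic mean and the median on a small invented data set,
    # before and after one extreme value is appended (Fechner's observation above).
    from statistics import mean, median

    values = [24, 25, 26, 26, 27, 28]        # a tightly grouped batch
    with_outlier = values + [90]             # the same batch plus one extreme value

    print(mean(values), median(values))                        # 26.0  26.0
    print(round(mean(with_outlier), 2), median(with_outlier))  # 35.14  26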
§ 7. A simple illustration will serve to indicate how these three kinds of mean coalesce into one when we are dealing with symmetrical Laws of Error, but become quite distinct as soon as we come to consider those which are unsymmetrical.
§ 7. A simple example will show how these three types of means merge into one when we're dealing with symmetrical Laws of Error, but they become quite different once we consider those that are asymmetrical.
[Figure: law-of-error diagram along the base line OBDC, showing the symmetrical triangle BAC with its vertex A above D, and the lop-sided triangle ADC with the points X and Y marked.]
Suppose that, in measuring a magnitude along OBDC, where the extreme limits are OB and OC, the law of error is represented by the triangle BAC: the length OD will be at once the arithmetical mean, the median, and the most frequent length: its frequency being represented by the maximum ordinate AD. But now suppose, on the other hand, that the extreme lengths are OD and OC, and that the triangle ADC represents the law of error. The most frequent length will be the same as before, OD, marked by the maximum ordinate AD. But the mean value will now be OX, where DX = 1/3DC; and the median will be OY, where DY = (1 − 1/√2)DC.
Suppose that, when measuring a magnitude along OBDC, with the extreme limits being OB and OC, the law of error is represented by the triangle BAC: the length OD will be the arithmetic mean, the median, and the most frequently occurring length, with its frequency shown by the maximum ordinate AD. Now imagine, however, that the extreme lengths are OD and OC, and that the triangle ADC represents the law of error. The most frequently occurring length will still be OD, marked by the maximum ordinate AD. But now the mean value will be OX, where DX = 1/3DC; and the median will be OY, where DY = (1 − 1/√2)DC.
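The figures stated for the lop-sided triangle ADC can be checked numerically. In the sketch below (Python; taking D as the origin and assuming the unit length DC = 1 purely for convenience) the density falls linearly from its maximum at D to zero at C, and simulation recovers a mean near 1/3 and a median near 1 − 1/√2 ≈ 0.293.

    # Numerical check of the lop-sided triangle law of error ADC.
    # With D at 0 and DC = 1, the density is f(x) = 2 * (1 - x) on [0, 1]:
    # highest at D and falling to zero at C.
    import math
    import random

    def sample():
        # Inverse-transform sampling: the CDF is F(x) = 1 - (1 - x)**2,
        # so a uniform u maps to x = 1 - sqrt(1 - u).
        u = random.random()
        return 1.0 - math.sqrt(1.0 - u)

    random.seed(1)
    xs = sorted(sample() for _ in range(200_000))

    simulated_mean = sum(xs) / len(xs)
    simulated_median = xs[len(xs) // 2]

    print(round(simulated_mean, 3), "vs 1/3 =", round(1 / 3, 3))
    print(round(simulated_median, 3), "vs 1 - 1/sqrt(2) =", round(1 - 1 / math.sqrt(2), 3))
    # The most frequent ('maximum ordinate') length is at D itself, i.e. x = 0.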
Another example, taken from natural phenomena, may be found in the heights of the barometer as taken at the same hour on successive days. So far as 4857 of these may be regarded as furnishing a sufficiently stable basis of experience, it certainly seems that the resulting curve of frequency is asymmetrical. The mean height here was found to be 29.98: the median was 30.01: the most frequent height was 30.05. The close approximation amongst these is an indication that the asymmetry is slight.[2]
Another example from natural events can be found in the barometer readings taken at the same time over several days. Considering that 4857 of these readings provide a stable enough basis of experience, it appears that the resulting frequency curve is asymmetrical. The average height was 29.98, the median was 30.01, and the most common height was 30.05. The close similarity among these figures suggests that the asymmetry is minimal.[2]
§ 8. It must be clearly understood that the average, of whatever kind it may be, from the mere fact of its being a single substitute for an actual plurality of observed values, must let slip a considerable amount of information. In fact it is only introduced for economy. It may entail no loss when used for some one assigned purpose, as in our example about the sheep; but for purposes in general it cannot possibly take the place of the original diversity, by yielding all the information which they contained. If all this is to be retained we must resort to some other method. Practically we generally do one of two things: either (1) we put all the figures down in statistical tables, or (2) we appeal to a diagram. This last plan is convenient when the data are very numerous, or when we wish to display or to discover the nature of the law of facility under which they range.
§ 8. It should be clear that an average, regardless of its type, serves as a single replacement for many observed values, and in doing so, it inevitably loses a significant amount of information. Essentially, it is only used for the sake of simplicity. While it may not result in any loss when used for a specific purpose, like in our example with the sheep, it cannot replace the original variety or provide all the information those values contained. To keep all this information, we need to use a different method. Usually, we do one of two things: either (1) we list all the figures in statistical tables, or (2) we use a diagram. The latter method is helpful when there is a lot of data or when we want to illustrate or understand the pattern in which they are organized.
The mere assignment of an average lets drop nearly all of this, confining itself to the indication of an intermediate 445 value. It gives a “middle point” of some kind, but says nothing whatever as to how the original magnitudes were grouped about this point. For instance, whether two magnitudes had been respectively 25 and 27, or 15 and 37, they would yield the same arithmetical average of 26.
The simple act of assigning an average excludes almost all of this, limiting itself to showing just one middle value. It provides a “middle point” of some sort but doesn’t reveal anything about how the original numbers were distributed around this point. For example, whether two numbers were 25 and 27, or 15 and 37, they would both give the same arithmetic average of 26.
§ 9. To break off at this stage would clearly be to leave the problem in a very imperfect condition. We therefore naturally seek for some simple test which shall indicate how closely the separate results were grouped about their average, so as to recover some part of the information which had been let slip.
§ 9. Stopping here would clearly leave the problem in an incomplete state. So, we naturally look for a straightforward way to test how closely the individual results cluster around their average, in order to retrieve some of the information that has been lost.
If any one were approaching this problem entirely anew,—that is, if he had no knowledge of the mathematical exigencies which attend the theory of “Least Squares,”—I apprehend that there is but one way in which he would set about the business. He would say, The average which we have already obtained gave us a rough indication, by assigning an intermediate point amongst the original magnitudes. If we want to supplement this by a rough indication as to how near together these magnitudes lie, the best way will be to treat their departures from the mean (what are technically called the “errors”) in precisely the same way, viz. by assigning their average. Suppose there are 13 men whose heights vary by equal differences from 5 feet to 6 feet, we should say that their average height was 66 inches, and their average departure from this average was 3 3/13 inches.
If someone were tackling this problem from scratch—meaning they had no knowledge of the mathematical requirements related to the “Least Squares” theory—they would probably take one approach. They would note that the average we calculated provides a rough estimate by identifying a midpoint among the original values. To add some insight into how close these values are to each other, the best method would be to analyze how far they deviate from the mean (referred to as “errors”) in the same manner, meaning we would find their average. For example, if there are 13 men whose heights rise in equal steps from 5 feet to 6 feet, we would say their average height is 66 inches, and their average deviation from that average is 3 3/13 inches.
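The arithmetic of this example is easy to verify. Assuming the thirteen heights run in equal one-inch steps from 60 inches (5 feet) to 72 inches (6 feet), a minimal sketch:

    # Thirteen heights at equal one-inch steps from 5 feet (60 in.) to 6 feet (72 in.).
    heights = list(range(60, 73))                     # 60, 61, ..., 72

    mean_height = sum(heights) / len(heights)         # 66.0 inches
    departures = [abs(h - mean_height) for h in heights]
    mean_departure = sum(departures) / len(departures)

    print(mean_height)        # 66.0
    print(mean_departure)     # 3.2307..., i.e. 42/13 = 3 3/13 inches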
Looked at from this point of view we should then proceed to try how each of the above-named averages would answer the purpose. Two of them,—viz. the arithmetical mean and the median,—will answer perfectly; and, as we shall immediately see, are frequently used for the purpose. So too we could, if we pleased, employ the geometrical 446 mean, though such employment would be tedious, owing to the difficulty of calculation. The ‘maximum ordinate’ clearly would not answer, since it would generally (v. the diagram on p. 443) refer us back again to the average already obtained, and therefore give no information.
From this perspective, we should then explore how each of the mentioned averages would serve our needs. Two of them—the arithmetic mean and the median—will work perfectly and, as we will soon see, are often used for this purpose. We could also use the geometric mean if we wanted, but that would be cumbersome due to the complexity of the calculations. The 'maximum ordinate' clearly wouldn't work, as it would usually (see the diagram on p. 443) lead us back to the average we already calculated and therefore provide no new information.
The only point here about which any doubt could arise concerns what is called in algebra the sign of the errors. Two equal and opposite errors, added algebraically, would cancel each other. But when, as here, we are regarding the errors as substantive quantities, to be considered on their own account, we attend only to their real magnitude, and then these equal and opposite errors are to be put upon exactly the same footing.
The only point here that might raise any doubt is what is referred to in algebra as the sign of the errors. Two equal and opposite errors, when added together algebraically, would cancel each other out. However, in this case, as we are viewing the errors as substantial quantities to be considered on their own, we focus only on their actual magnitude, and these equal and opposite errors are treated in the exact same way.
§ 10. Of the various means already discussed, two, as just remarked, are in common use. One of these is familiarly known, in astronomical and other calculations, as the ‘Mean Error,’ and is so absolutely an application of the same principle of the arithmetical mean to the errors, that has been already applied to the original magnitudes, that it needs no further explanation. Thus in the example in the last section the mean of the heights was 66 inches, the mean of the errors was 3 3/13 inches.
§ 10. Among the different methods we've talked about, two are commonly used. One of these is widely recognized in astronomical and other calculations as the ‘Mean Error.’ This concept directly applies the same principle of the arithmetic mean to the errors that was previously applied to the original measurements, so it doesn’t need any more explanation. For instance, in the example from the last section, the average height was 66 inches, while the average of the errors was 3 3/13 inches.
The other is the Median, though here it is always known under another name, i.e. as the ‘Probable Error’;—a technical and decidedly misleading term. It is briefly defined as that error which we are as likely to exceed as to fall short of: otherwise phrased, if we were to arrange all the errors in the order of their magnitude, it corresponds to that one of them which just bisects the row. It is therefore the ‘median’ error: or, if we arrange all the magnitudes in successive order, and divide them into four equally numerous classes,—what Mr Galton calls ‘quartiles,’—the first and third of the consequent divisions will mark the limits of 447 the ‘probable error’ on each side, whilst the middle one will mark the ‘median.’ This median, as was remarked, coincides, in symmetrical curves, with the arithmetical mean.
The second term is the Median, though here it’s always referred to as the ‘Probable Error’—a technical and rather misleading term. It’s simply defined as the error that we are equally likely to exceed or fall short of: put another way, if we were to arrange all the errors by size, it corresponds to the one that splits the list in half. So, it’s the ‘median’ error: or, if we sort all the sizes in order and divide them into four equal groups—what Mr. Galton calls ‘quartiles’—the first and third groups will indicate the limits of the ‘probable error’ on each side, while the middle one will indicate the ‘median.’ This median, as noted, aligns with the arithmetic mean in symmetrical distributions.
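As a rough illustration of this ranking procedure, the sketch below (Python, with invented measurement values) arranges a batch of magnitudes in order and reads off the median and the quartile limits that bound the ‘probable error’ on each side; taking half the distance between the quartiles as the probable error itself is an assumption of convenience, not something stated in the text.

    # Arrange a batch of invented magnitudes in order and read off the median
    # and the rough 'quartile' limits described above.
    measurements = [29.91, 29.95, 29.97, 29.98, 30.00, 30.01,
                    30.02, 30.04, 30.05, 30.07, 30.12]

    ordered = sorted(measurements)
    n = len(ordered)

    median = ordered[n // 2]               # middle of the row (n is odd here)
    first_quartile = ordered[n // 4]       # rough lower quartile
    third_quartile = ordered[3 * n // 4]   # rough upper quartile

    # The 'probable error' is the deviation we are as likely to exceed as to fall
    # short of; half the gap between the quartiles is used here as a rough estimate.
    probable_error = (third_quartile - first_quartile) / 2

    print(median, first_quartile, third_quartile, round(probable_error, 3))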
It is best to stand by accepted nomenclature, but the reader must understand that such an error is not in any strict sense ‘probable.’ It is indeed highly improbable that in any particular instance we should happen to get just this error: in fact, if we chose to be precise and to regard it as one exact magnitude out of an infinite number, it would be infinitely unlikely that we should hit upon it. Nor can it be said to be probable that we shall be within this limit of the truth, for, by definition, we are just as likely to exceed as to fall short. As already remarked (see note on p. 441), the ‘maximum ordinate’ would have the best right to be regarded as indicating the really most probable value.
It’s best to stick with the accepted terminology, but the reader should understand that such an error is not ‘probable’ in any strict sense. It is actually highly improbable that in any particular case we would get exactly this error: in fact, if we wanted to be precise and regard it as one exact magnitude among an infinite number, it would be infinitely unlikely for us to land on it. Nor can we say it is probable that we will fall within this limit of the truth, since, by definition, we have just as much chance of exceeding it as of falling short. As already mentioned (see note on p. 441), the ‘maximum ordinate’ would have the best claim to be regarded as indicating the really most probable value.
§ 11. (5) The error of mean square. As previously suggested, the plan which would naturally be adopted by any one who had no concern with the higher mathematics of the subject, would be to take the ‘mean error’ for the purpose of the indication in view. But a very different kind of average is generally adopted in practice to serve as a test of the amount of divergence or dispersion. Suppose that we have the magnitudes x1, x2, … xn; their ordinary average is 1/n(x1 + x2 + … + xn), and their ‘errors’ are the differences between this and x1, x2, … xn. Call these errors e1, e2, … en, then the arithmetical mean of these errors (irrespective of sign) is 1/n(e1 + e2 + … + en). The Error of Mean Square,[3] on the other hand, is the square root of 1/n(e1² + e2² + … + en²).
§ 11. (5) The error of mean square. As mentioned earlier, the approach that would usually be taken by someone who isn't familiar with the advanced mathematics of the topic would be to use the 'mean error' for the indicated purpose. However, in practice, a different kind of average is commonly used as a measure of divergence or dispersion. Let's say we have the values x1, x2, … xn; their regular average is 1/n(x1 + x2 + … + xn), and their 'errors' are the differences between this average and x1, x2, … xn. We can call these errors e1, e2, … en, and the arithmetic mean of these errors (ignoring the sign) is 1/n(e1 + e2 + … + en). The Error of Mean Square,[3] on the other hand, is the square root of 1/n(e1² + e2² + … + en²).
The reasons for employing this latter kind of average in preference to any of the others will be indicated in the following chapter. At present we are concerned only with the general logical nature of an average, and it is therefore sufficient to point out that any such intermediate value will answer the purpose of giving a rough and summary indication of the degree of closeness of approximation which our various measures display to each other and to their common average. If we were to speak respectively of the ‘first’ and the ‘second average,’ we might say that the former of these assigns a rough single substitute for the plurality of original values, whilst the latter gives a similar rough estimate of the degree of their departure from the former.
The reasons for using this type of average instead of the others will be explained in the next chapter. Right now, we’re only focusing on the general logical nature of an average, so it’s enough to note that any intermediate value will serve to provide a rough summary indication of how close our various measures are to each other and to their common average. If we refer to the ‘first’ and the ‘second average,’ we could say that the first provides a rough single substitute for the multiple original values, while the second offers a similar rough estimate of how much they differ from the first.
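The two ‘second averages’ just described can be put side by side. A minimal sketch (Python, on an invented batch of magnitudes x1 … xn, using the notation of § 11):

    # Compare the 'mean error' with the 'error of mean square' on one batch of values.
    import math

    xs = [12.1, 11.8, 12.4, 12.0, 11.7, 12.3, 12.2]   # invented magnitudes x1 ... xn
    n = len(xs)

    average = sum(xs) / n                             # the 'first average'
    errors = [x - average for x in xs]                # e1 ... en

    mean_error = sum(abs(e) for e in errors) / n
    error_of_mean_square = math.sqrt(sum(e * e for e in errors) / n)

    print(round(average, 4), round(mean_error, 4), round(error_of_mean_square, 4))
    # The error of mean square is never smaller than the mean error; the two agree
    # only when every error has the same size.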
§ 12. So far we have only been considering the general nature of an average, and the principal kinds of average practically in use. We must now enquire more particularly what are the principal purposes for which averages are employed.
§ 12. Up to now, we've only looked at the basic idea of an average and the main types of averages that are commonly used. Now we need to dive deeper into the key reasons why averages are used.
In this respect the first thing we have to do is to raise doubts in the reader's mind on a subject on which he perhaps has not hitherto felt the slightest doubt. Every one is more or less familiar with the practice of appealing to an average in order to secure accuracy. But distinctly what we begin by doing is to sacrifice accuracy; for in place of the plurality of actual results we get a single result which 449 very possibly does not agree with any one of them. If I find the temperature in different parts of a room to be different, but say that the average temperature is 61°, there may perhaps be but few parts of the room where this exact temperature is realized. And if I say that the average stature of a certain small group of men is 68 inches, it is probable that no one of them will present precisely this height.
In this regard, the first thing we need to do is create doubts in the reader's mind about a topic they may not have questioned before. Everyone is somewhat familiar with the idea of using an average to ensure accuracy. However, what we actually do is compromise accuracy; instead of capturing the variety of actual outcomes, we end up with a single result that likely doesn’t align with any of them. If I find that the temperature in different areas of a room varies but state that the average temperature is 61°, there might be very few spots in the room where this exact temperature is reached. Similarly, if I claim that the average height of a certain small group of men is 68 inches, it’s likely that none of them will be exactly this height.
The principal way in which accuracy can be thus secured is when what we are really aiming at is not the magnitudes before us but something else of which they are an indication. If they are themselves ‘inaccurate,’—we shall see presently that this needs some explanation,—then the single average, which in itself agrees perhaps with none of them, may be much more nearly what we are actually in want of. We shall find it convenient to subdivide this view of the subject into two parts; by considering first those cases in which quantitative considerations enter but slightly, and in which no determination of the particular Law of Error involved is demanded, and secondly those in which such determination cannot be avoided. The latter are only noticed in passing here, as a separate chapter is reserved for their fuller consideration.
The main way to ensure accuracy is when what we're actually focusing on isn't the numbers in front of us, but something else that they represent. If those numbers are 'inaccurate'—which we'll explain shortly—then the overall average, which might not match any of them exactly, could actually be closer to what we really need. We'll find it useful to break this topic into two parts: first, discussing cases where quantitative factors play a minor role and where we don't need to specify the particular Law of Error involved, and second, those cases where such a specification is unavoidable. We'll only mention the latter briefly here, as a separate chapter is set aside for a more in-depth discussion.
§ 13. The process, as a practical one, is familiar enough to almost everybody who has to work with measures of any kind. Suppose, for instance, that I am measuring any object with a brass rod which, as we know, expands and contracts according to the temperature. The results will vary slightly, being sometimes a little too great and sometimes a little too small. All these variations are physical facts, and if what we were concerned with was the properties of brass they would be the one important fact for us. But when we are concerned with the length of the object measured, these facts become superfluous and misleading. What we want to do is to escape their influence, and this we are enabled to effect by 450 taking their (arithmetical) average, provided only they are as often in excess as in defect.[4] For this purpose all that is necessary is that equal excesses and defects should be equally prevalent. It is not necessary to know what is the law of variation, or even to be assured that it is of one particular kind. Provided only that it is in the language of the diagram on p. 29, symmetrical, then the arithmetical average of a suitable and suitably varied number of measurements will be free from this source of disturbance. And what holds good of this cause of variation will hold good of all others which obey the same general conditions. In fact the equal prevalence of equal and opposite errors seems to be the sole and sufficient justification of the familiar process of taking the average in order to secure accuracy.
§ 13. The process, being practical, is familiar to almost everyone who works with measurements. For example, let's say I'm measuring an object with a brass rod, which expands and contracts with temperature changes. The results will fluctuate slightly—sometimes they’ll be a bit too high and sometimes a bit too low. These variations are physical realities, and if we were focused solely on the properties of brass, they would be the main consideration. However, when we are interested in the length of the object we're measuring, these variations become unnecessary and misleading. What we want to do is minimize their impact, which we can achieve by taking their (arithmetic) average, as long as there are as many instances of excess as there are of deficiency. For this purpose, all that's needed is for equal excesses and deficiencies to be equally common. It's not necessary to know the specific law of variation or even to be sure that it follows one particular pattern. As long as it is symmetrical, as shown in the diagram on p. 29, the arithmetic average of a suitable, varied number of measurements will be free from this source of error. This principle applies to all other variations that meet the same general conditions. In fact, the equal occurrence of equal and opposite errors seems to be the only valid reason for the common practice of averaging to achieve accuracy.
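The claim that only symmetry of the errors matters, and not their particular law of variation, can be tried numerically. In the sketch below (Python; the ‘true’ length and both error laws are invented for the purpose) two very differently shaped but symmetric disturbances are averaged away equally well.

    # Averages of many measurements shake off any *symmetric* error law,
    # whatever its particular shape.  Two invented symmetric disturbances are tried.
    import random

    random.seed(7)
    TRUE_LENGTH = 100.0   # the 'true' length fixed by the authorized standard

    def measure_uniform():
        # error spread evenly between -0.5 and +0.5
        return TRUE_LENGTH + random.uniform(-0.5, 0.5)

    def measure_two_valued():
        # error is either -0.3 or +0.3, equally often: symmetric but far from bell-shaped
        return TRUE_LENGTH + random.choice([-0.3, 0.3])

    for name, measure in [("uniform errors", measure_uniform),
                          ("two-valued errors", measure_two_valued)]:
        readings = [measure() for _ in range(100_000)]
        print(name, round(sum(readings) / len(readings), 4))
        # Both averages come out very close to 100.0, although the error laws differ.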
§ 14. We must now make the distinction to which attention requires so often to be drawn in these subjects between the cases in which there respectively is, and is not, some objective magnitude aimed at: a distinction which the common use of the same word “errors” is so apt to obscure. When we talked, in the case of the brass rod, of excesses and defects being equal, we meant exactly what we said, viz. that for every case in which the ‘true’ length (i.e. that determined by the authorized standard) is exceeded by a given fraction of an inch, there will be a corresponding case in which there is an equal defect.
§ 14. We now need to clarify a point that often requires attention in these discussions: the difference between situations where there is, and isn't, an objective measurement being targeted. This distinction can easily get confused because the same term “errors” is commonly used. When we talked about the brass rod in terms of having equal excesses and defects, we meant exactly what we said, which is that for every instance where the ‘true’ length (i.e., the length determined by the official standard) exceeds a certain fraction of an inch, there will be a corresponding instance where there is an equal shortfall.
On the other hand, when there is no such fixed objective standard of reference, it would appear that all that we mean by equal excesses and defects is permanent symmetry of arrangement. In the case of the measuring rod we were 451 able to start with something which existed, so to say, before its variations; but in many cases any starting point which we can find is solely determined by the average.
On the other hand, when there isn’t a fixed objective standard to refer to, it seems that what we really mean by equal amounts of excess and defects is just a consistent arrangement. With the measuring rod, we could begin with something that existed, so to speak, before its variations; however, in many cases, any starting point we find is only based on the average.
Suppose, for instance, we take a great number of observations of the height of the barometer at a certain place, at all times and seasons and in all weathers, we should generally consider that the average of all these showed the ‘true’ height for that place. What we really mean is that the height at any moment is determined partly (and principally) by the height of the column of air above it, but partly also by a number of other agencies such as local temperature, moisture, wind, &c. These are sometimes more and sometimes less effective, but their range being tolerably constant, and their distribution through this range being tolerably symmetrical, the average of one large batch of observations will be almost exactly the same as that of any other. This constancy of the average is its truth. I am quite aware that we find it difficult not to suppose that there must be something more than this constancy, but we are probably apt to be misled by the analogy of the other class of cases, viz. those in which we are really aiming at some sort of mark.
Suppose, for example, we take a large number of measurements of the barometric pressure at a specific location, at all times of the day, throughout the seasons, and in various weather conditions. We would generally think that the average of all these measurements represents the 'true' pressure for that location. What we actually mean is that the pressure at any given moment is determined partly (and mainly) by the weight of the air column above it, but also influenced by several other factors like local temperature, humidity, wind, etc. These factors can be more or less effective at different times, but their range remains fairly constant, and their distribution within this range tends to be quite symmetrical. Therefore, the average of one large set of observations will almost exactly match that of any other set. This stability of the average is its truth. I’m fully aware that we often find it hard not to believe there must be something more than this stability, but we might be misled by comparing this to other situations where we are indeed targeting a specific goal.
§ 15. As regards the practical methods available for determining the various kinds of average there is very little to be said; as the arithmetical rules are simple and definite, and involve nothing more than the inevitable drudgery attendant upon dealing with long rows of figures. Perhaps the most important contribution to this part of the subject is furnished by Mr Galton's suggestion to substitute the median for the mean, and thus to elicit the average with sufficient accuracy by the mere act of grouping a number of objects together. Thus he has given an ingenious suggestion for obtaining the average height of a number of men without 452 the trouble and risk of measuring them all. “A barbarian chief might often be induced to marshall his men in the order of their heights, or in that of the popular estimate of their skill in any capacity; but it would require some apparatus and a great deal of time to measure each man separately, even supposing it possible to overcome the usually strong repugnance of uncivilized people to any such proceeding” (Phil. Mag. Jan. 1875). That is, it being known from wide experience that the heights of any tolerably homogeneous set of men are apt to group themselves symmetrically,—the condition for the coincidence of the three principal kinds of mean,—the middle man of a row thus arranged in order will represent the mean or average man, and him we may subject to measurement. Moreover, since the intermediate heights are much more thickly represented than the extreme ones, a moderate error in the selection of the central man of a long row will only entail a very small error in the selection of the corresponding height.
§ 15. When it comes to the practical methods for determining different types of averages, there isn't much to discuss; the arithmetic rules are straightforward and clear, requiring nothing more than the tedious work of handling long columns of numbers. One of the most significant contributions to this topic comes from Mr. Galton's idea to use the median instead of the mean, allowing us to find the average with enough accuracy just by grouping several objects together. For instance, he cleverly proposed how to determine the average height of a group of men without the hassle and risk of measuring each one. “A tribal chief might often arrange his men by height or by popular opinion on their skills; however, measuring each man individually would require special tools and a lot of time, even if it were possible to overcome the natural resistance that uncivilized people usually have to such activities” (Phil. Mag. Jan. 1875). In other words, it is widely understood that the heights of any reasonably uniform group of men tend to arrange themselves symmetrically—this is the condition for the alignment of the three main types of average—so the middle person in a line organized this way will represent the average or mean individual, and we can take measurements of him. Furthermore, since the heights in the middle range are much more common than the extremes, a small mistake in choosing the central person from a long line will only result in a very tiny error in determining the corresponding height.
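Galton's short cut is easy to imitate. The sketch below (Python; the population of heights, symmetric about an assumed 68 inches, is invented) ranks the men, ‘measures’ only the middle one, and compares him with the full arithmetic mean.

    # Galton's short cut: rank the men by height and 'measure' only the middle one.
    import random

    random.seed(3)
    # Invented, roughly homogeneous population: heights symmetric about 68 inches.
    heights = [random.gauss(68.0, 2.5) for _ in range(999)]

    ordered = sorted(heights)
    middle_man = ordered[len(ordered) // 2]        # the median, found by ranking alone
    arithmetic_mean = sum(heights) / len(heights)  # what measuring every man would give

    print(round(middle_man, 2), round(arithmetic_mean, 2))
    # With a symmetrical grouping the two agree closely, so one measurement suffices.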
§ 16. We can now conveniently recur to a subject which has been already noticed in a former chapter, viz. the attempt which is sometimes made to establish a distinction between an average and a mean. It has been proposed to confine the former term to the cases in which we are dealing with a fictitious result of our own construction, that is, with a mere arithmetical deduction from the observed magnitudes, and to apply the latter to cases in which there is supposed to be some objective magnitude peculiarly representative of the average.
§ 16. We can now conveniently return to a topic that was already mentioned in a previous chapter, namely, the attempt to create a distinction between an average and a mean. It's been suggested to use the term "average" for situations where we're dealing with a fictional result of our own making, meaning it's just an arithmetic calculation based on the observed values, and to reserve the term "mean" for cases where there is thought to be some objective value that distinctly represents the average.
Recur to the three principal classes, of things appropriate to Probability, which were sketched out in Ch. II. § 4. The first of these comprised the results of games of chance. Toss a die ten times: the total number of pips on the upper side may vary from ten up to sixty. Suppose it to be 453 thirty. We then say that the average of this batch of ten is three. Take another set of ten throws, and we may get another average, say four. There is clearly nothing objective peculiarly corresponding in any way to these averages. No doubt if we go on long enough we shall find that the averages tend to centre about 3.5: we then call this the average, or the ‘probable’ number of points; and this ultimate average might have been pretty constantly asserted beforehand from our knowledge of the constitution of a die. It has however no other truth or reality about it of the nature of a type: it is simply the limit towards which the averages tend.
Refer to the three main categories of items related to probability, which were outlined in Ch. II. § 4. The first category includes the outcomes of games of chance. Toss a die ten times: the total number of pips on the top face can range from ten to sixty. Suppose it adds up to thirty. We then say that the average for this set of ten is three. If we take another set of ten rolls, we might get a different average, say four. Clearly, there’s nothing objective that specifically corresponds to these averages. If we continue long enough, we will find that the averages tend to center around 3.5: we then call this the average, or the ‘probable’ number of points; this final average could have been reliably predicted beforehand based on our understanding of how a die is structured. However, it lacks any other truth or reality as a type: it’s simply the limit toward which the averages converge.
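The behaviour of these batch averages is simple to reproduce. A minimal sketch (Python) throws a die in batches of ten and shows the batch averages fluctuating while their long-run average settles towards 3.5.

    # Averages of batches of ten die throws fluctuate, but tend towards 3.5 in the long run.
    import random

    random.seed(11)

    def batch_average(size=10):
        return sum(random.randint(1, 6) for _ in range(size)) / size

    few = [batch_average() for _ in range(5)]
    many = [batch_average() for _ in range(100_000)]

    print(few)                                # five scattered batch averages
    print(round(sum(many) / len(many), 3))    # very close to 3.5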
The next class is that occupied by the members of most natural groups of objects, especially as regards the characteristics of natural species. Somewhat similar remarks may be repeated here. There is very frequently a ‘limit’ towards which the averages of increasing numbers of individuals tend to approach; and there is certainly some temptation to regard this limit as being a sort of type which all had been intended to resemble as closely as possible. But when we looked closer, we found that this view could scarcely be justified; all which could be safely asserted was that this type represented, for the time being, the most numerous specimens, or those which under existing conditions could most easily be produced.
The next category includes members of most natural groups of objects, especially concerning the traits of natural species. Similar observations can be made here. There is often a "limit" that the averages of growing numbers of individuals tend to approach; and there’s definitely a temptation to see this limit as a type that everything was meant to resemble as closely as possible. However, upon closer examination, we found that this perspective was hardly justifiable; what could be confidently stated was that this type represented, for the time being, the most common specimens or those that could be produced most easily under current conditions.
The remaining class stands on a somewhat different ground. When we make a succession of more or less successful attempts of any kind, we get a corresponding series of deviations from the mark at which we aimed. These we may treat arithmetically, and obtain their averages, just as in the former cases. These averages are fictions, that is to say, they are artificial deductions of our own which need not necessarily have anything objective corresponding to 454 them. In fact, if they be averages of a few only they most probably will not have anything thus corresponding to them. Anything answering to a type can only be sought in the ‘limit’ towards which they ultimately tend, for this limit coincides with the fixed point or object aimed at.
The remaining class is based on a slightly different premise. When we make a series of attempts, whether they succeed or fail, we end up with a range of deviations from our target. We can analyze these numerically and calculate their averages, just like in the previous cases. However, these averages are just constructs; in other words, they are artificial conclusions we've drawn that don’t necessarily correspond to anything real. In fact, if we’re only dealing with a few attempts, they probably won’t relate to anything meaningful. Anything that corresponds to a type can only be found in the ‘limit’ that they ultimately approach, since this limit aligns with the fixed point or goal we aimed for.
§ 17. Fully admitting the great value and interest of Quetelet's work in this direction,—he was certainly the first to direct public attention to the fact that so many classes of natural objects display the same characteristic property,—it nevertheless does not seem desirable to attempt to mark such a distinction by any special use of these technical terms. The objections are principally the two following.
§ 17. While recognizing the significant value and interest of Quetelet's work in this area—he was indeed the first to draw public attention to the fact that so many types of natural objects show the same characteristic property—it still doesn’t seem appropriate to try to highlight such a distinction by using these technical terms in a specific way. The main objections are primarily the following two.
In the first place, a single antithesis, like this between an average and a mean, appears to suggest a very much simpler state of things than is actually found to exist in nature. A reference to the three classes of things just mentioned, and a consideration of the wide range and diversity included in each of them, will serve to remind us not only of the very gradual and insensible advance from what is thus regarded as ‘fictitious’ to what is claimed as ‘real;’ but also of the important fact that whereas the ‘real type’ may be of a fluctuating and evanescent character, the ‘fiction’ may (as in games of chance) be apparently fixed for ever. Provided only that the conditions of production remain stable, averages of large numbers will always practically present much the same general characteristics. The far more important distinction lies between the average of a few, with its fluctuating values and very imperfect and occasional attainment of its ultimate goal, and the average of many and its gradually close approximation to its ultimate value: i.e. to its objective point of aim if there happen to be such.
In the first place, a simple contrast, like the one between an average and a mean, seems to suggest a much simpler situation than what actually exists in nature. Referring to the three classes of things mentioned earlier and considering the wide range and diversity within each of them will remind us not only of the gradual and subtle shift from what is seen as ‘fictitious’ to what is considered ‘real,’ but also of the important fact that while the ‘real type’ may be unstable and fleeting, the ‘fiction’ may (like in games of chance) appear to remain fixed forever. As long as the conditions for production stay consistent, averages from large groups will consistently show similar general characteristics. The much more significant distinction is between the average of a few, with its fluctuating values and very imperfect and sporadic reach toward its ultimate goal, and the average of many, which gradually gets closer to its ultimate value: i.e. to its objective point of aim, if such a point exists.
Then, again, the considerations adduced in this chapter 455 will show that within the field of the average itself there is far more variety than Quetelet seems to have recognized. He did not indeed quite ignore this variety, but he practically confined himself almost entirely to those symmetrical arrangements in which three of the principal means coalesce into one. We should find it difficult to carry out his distinction in less simple cases. For instance, when there is some degree of asymmetry, it is the ‘maximum ordinate’ which would have to be considered as a ‘mean’ to the exclusion of the others; for no appeal to an arithmetical average would guide us to this point, which however is to be regarded, if any can be so regarded, as marking out the position of the ultimate type.
Then again, the points raised in this chapter will show that within the realm of averages, there is much more variety than Quetelet seemed to acknowledge. He didn’t completely overlook this variety, but he mainly focused on those symmetrical setups where three of the main means come together as one. We would find it challenging to apply his distinction in more complex cases. For example, when there's some level of asymmetry, it's the ‘maximum ordinate’ that should be considered a ‘mean’ excluding the others; because no reference to an arithmetic average would lead us to this point, which, if any can be seen this way, marks the position of the ultimate type.
§ 18. We have several times pointed out that it is a characteristic of the things with which Probability is concerned to present, in the long run, a continually intensifying uniformity. And this has been frequently described as what happens ‘on the average.’ Now an objection may very possibly be raised against regarding an arrangement of things by virtue of which order thus emerges out of disorder as deserving any special notice, on the ground that from the nature of the arithmetical average it could not possibly be otherwise. The process by which an average is obtained, it may be urged, insures this tendency to equalization amongst the magnitudes with which it deals. For instance, let there be a party of ten men, of whom four are tall and four are short, and take the average of any five of them. Since this number cannot be made up of tall men only, or of short men only, it stands to reason that the averages cannot differ so much amongst themselves as the single measures can. Is not then the equalizing process, it may be asked, which is observable on increasing the range of our observations, one which can be shown to follow from necessary laws of 456 arithmetic, and one therefore which might be asserted à priori?
§ 18. We have pointed out several times that things related to Probability tend to show a progressively stronger uniformity over time. This is often described as what happens "on average." However, one might raise an objection to considering the way order emerges from disorder as something noteworthy, arguing that due to the nature of the arithmetic average, it couldn't really be any other way. It could be argued that the method used to calculate an average ensures this tendency towards equalization among the values it addresses. For example, if there’s a group of ten men, with four being tall and four being short, and we take the average of any five of them, it’s clear that this group cannot consist solely of tall men or solely of short men. Therefore, it makes sense that the averages wouldn’t vary among themselves as much as the individual measurements might. One might then question whether the equalizing effect observed when we increase our observations is something that can be explained by fundamental principles of arithmetic and could therefore be considered à priori.
Whatever force there may be in the above objection arises principally from the limitations of the example selected, in which the number chosen was so large a proportion of the total as to exclude the bare possibility of only extreme cases being contained within it. As much confusion is often felt here between what is necessary and what is matter of experience, it will be well to look at an example somewhat more closely, in order to determine exactly what are the really necessary consequences of the averaging process.
Whatever strength there is in the above objection mainly comes from the limitations of the chosen example, where the number selected was such a large part of the total that it ruled out even the slightest chance of only extreme cases being included. As people often confuse what is essential with what is based on experience, it would be helpful to examine an example more closely to identify the actual necessary outcomes of the averaging process.
§ 19. Suppose then that we take ten digits at random from a table (say) of logarithms. Unless in the highly unlikely case of our having happened upon the same digit ten times running, the average of the ten must be intermediate between the possible extremes. Every conception of an average of any sort not merely involves, but actually means, the taking of something intermediate between the extremes. The average therefore of the ten must lie closer to 4.5 (the average of the extremes) than did some of the single digits.
§ 19. Imagine that we randomly pick ten digits from, say, a table of logarithms. Unless we happen, against very long odds, to pick the same digit ten times in a row, the average of those ten has to be somewhere in between the possible extremes. Every idea of an average not only involves, but actually signifies, finding something in between the extremes. Therefore, the average of the ten must lie closer to 4.5 (the average of the extremes) than some of the individual digits did.
Now suppose we take 1000 such digits instead of 10. We can say nothing more about the larger number, with demonstrative certainty, than we could before about the smaller. If they were unequal to begin with (i.e. if they were not all the same) then the average must be intermediate, but more than this cannot be proved arithmetically. By comparison with such purely arithmetical considerations there is what may be called a physical fact underlying our confidence in the growing stability of the average of the larger number. It is that the constituent elements from which the average is deduced will themselves betray a growing uniformity:—that the proportions in which the different digits come out will become more and more nearly equal as we take larger numbers of 457 them. If the proportions in which the 1000 digits were distributed were the same as those of the 10 the averages would be the same. It is obvious therefore that the arithmetical process of obtaining an average goes a very little way towards securing the striking kind of uniformity which we find to be actually presented.
Now, let's say we have 1000 digits instead of 10. We can't make any more definitive statements about the larger number than we could about the smaller one. If they were different from the start (meaning if they weren’t all the same), then the average has to be somewhere in between, but we can't prove anything more mathematically. Compared to these purely mathematical considerations, there’s a physical fact that boosts our confidence in the increasing stability of the average in the larger set. This fact is that the individual elements that make up the average will show a growing uniformity: the different digits will appear in more balanced proportions as we work with larger numbers. If the way the 1000 digits were distributed mirrored that of the 10, the averages would be identical. So, it’s clear that the arithmetic process for calculating an average doesn’t fully capture the remarkable type of uniformity we observe in reality.
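The contrast drawn here, between what arithmetic guarantees and what experience adds, can be seen in a small experiment. In the sketch below (Python; pseudo-random digits stand in for digits copied from a table of logarithms) the average of ten digits is merely intermediate, while the proportions of the several digits only even out as the batch grows.

    # Ten random digits: the average is merely intermediate between the extremes.
    # With 1000 digits the *proportions* of the several digits become far more uniform.
    import random
    from collections import Counter

    random.seed(5)

    def digits(n):
        return [random.randint(0, 9) for _ in range(n)]

    ten = digits(10)
    thousand = digits(1000)

    print(ten, "average:", sum(ten) / 10)

    for batch in (ten, thousand):
        counts = Counter(batch)
        proportions = [counts.get(d, 0) / len(batch) for d in range(10)]
        spread = max(proportions) - min(proportions)
        print(len(batch), "digits: spread of digit proportions =", round(spread, 3))
    # The spread shrinks markedly for the larger batch: this growing uniformity is a
    # fact of experience about the digits themselves, not a consequence of averaging.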
§ 20. There is another way in which the same thing may be put. It is sometimes said that whatever may have been the arrangement of the original elements the process of continual averaging will necessarily produce the peculiar binomial or exponential law of arrangement. This statement is perfectly true (with certain safeguards) but it is not in any way opposed to what has been said above. Let us take for consideration the example above referred to. The arrangement of the individual digits in the long run is the simplest possible. It would be represented, in a diagram, not by a curve but by a finite straight line, for each digit occurs about as often as any other, and this exhausts all the ‘arrangement’ that can be detected. Now, when we consider the results of taking averages of ten such digits, we see at once that there is an opening for a more extensive arrangement. The totals may range from 0 up to 90, and therefore the average will have 91 values from 0 to 9; and what we find is that the frequency of these numbers is determined according to the Binomial[5] or Exponential Law. The most frequent result is the true mean, viz. 4.5, and from this they diminish in each direction towards 0 and 9, which will each occur but once (on the average) in 10¹⁰ occasions.
§ 20. There’s another way to explain this. It’s often said that regardless of how the original elements were arranged, the ongoing process of averaging will inevitably lead to the specific binomial or exponential law of arrangement. This statement is completely accurate (with certain precautions), but it doesn’t contradict what we discussed earlier. Let’s consider the example mentioned above. In the long run, the arrangement of the individual digits is as simple as it gets. In a diagram, it wouldn’t be represented by a curve but by a straight line, since each digit appears about as often as any other, and this covers all the 'arrangement' we can observe. Now, when we look at the results of averaging ten such digits, we immediately see that there’s room for a more complex arrangement. The totals can range from 0 to 90, so the average can take 91 values from 0 to 9; what we find is that the frequency of these numbers follows the Binomial[5] or Exponential Law. The most common result is the true mean, which is 4.5, and from this point, the frequencies decrease in both directions towards 0 and 9, with those extremes each appearing only once (on average) in 10¹⁰ trials.
The explanation here is of the same kind as in the former case. The resultant arrangement, so far as the averages are 458 concerned, is only ‘necessary’ in the sense that it is a necessary result of certain physical assumptions or experiences. If all the digits tend to occur with equal frequency, and if they are ‘independent’ (i.e. if each is associated indifferently with every other), then it is an arithmetical consequence that the averages when arranged in respect of their magnitude and prevalence will display the Law of Facility above indicated. Experience, so far as it can be appealed to, shows that the true randomness of the selection of the digits,—i.e. their equally frequent recurrence, and the impartiality of their combination,—is very fairly secured in practice. Accordingly the theoretic deduction that whatever may have been the original Law of Facility of the individual results we shall always find the familiar Exponential Law asserting itself as the law of the averages, is fairly justified by experience in such a case.
The explanation here is similar to the previous case. The resulting arrangement, in terms of averages, is only ‘necessary’ in the sense that it is a necessary outcome of certain physical assumptions or experiences. If all the digits tend to occur with equal frequency and are ‘independent’ (i.e., each is related without bias to every other), then it follows mathematically that the averages, when organized by their size and frequency, will reflect the Law of Facility mentioned earlier. Experience, as far as it can be relied upon, shows that true randomness in the selection of the digits—i.e., their equal recurrence and the unbiased nature of their combinations—is reasonably secured in practice. Therefore, the theoretical conclusion that no matter the original Law of Facility of the individual results, we will always find the familiar Exponential Law asserting itself as the law of averages, is generally supported by experience in such cases.
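The emergence of the familiar hump-backed law out of a perfectly flat arrangement of single digits can also be exhibited numerically. A minimal sketch (Python) tabulates the averages of many batches of ten random digits and prints a crude histogram.

    # Single digits are arranged 'flat' (each about equally frequent), yet the
    # averages of batches of ten pile up around 4.5 in the familiar hump shape.
    import random
    from collections import Counter

    random.seed(2)

    def average_of_ten():
        return sum(random.randint(0, 9) for _ in range(10)) / 10

    averages = [average_of_ten() for _ in range(200_000)]
    counts = Counter(averages)

    # Crude text histogram over a few values near the centre and towards the tails.
    for value in [0.0, 2.0, 3.5, 4.5, 5.5, 7.0, 9.0]:
        bar = "#" * (counts.get(value, 0) // 500)
        print(f"{value:3.1f} {counts.get(value, 0):6d} {bar}")
    # Counts are greatest at and near 4.5 and dwindle towards 0 and 9, which hardly
    # ever occur (each requires all ten digits to coincide).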
The further discussion of certain corrections and refinements is reserved to the following chapter.
The discussion of some corrections and improvements will be saved for the next chapter.
§ 21. In regard to the three kinds of average employed to test the amount of dispersion,—i.e. the mean error, the probable error, and the error of mean square,—two important considerations must be borne in mind. They will both recur for fuller discussion and justification in the course of the next chapter, when we come to touch upon the Method of Least Squares, but their significance for logical purposes is so great that they ought not to be entirely passed by at present.
§ 21. When it comes to the three types of averages used to measure dispersion—namely, the mean error, probable error, and mean square error—there are two key points to keep in mind. Both will be discussed in more detail in the next chapter when we cover the Method of Least Squares, but it's important to acknowledge their significance for logical purposes right now.
(1) In the first place, then, it must be remarked that in order to know what in any case is the real value of an error we ought in strictness to know what is the position of the limit or ultimate average, for the amount of an error is always theoretically measured from this point. But this is information which we do not always possess. Recurring once more to the three principal classes of events with which we are concerned, we can readily see that in the case of games of chance we mostly do possess this knowledge. Instead of appealing to experience to ascertain the limit, we practically deduce it by simple mechanical or arithmetical considerations, and then the ‘error’ in any individual case or group of cases is obviously found by comparing the results thus obtained with that which theory informs us would ultimately be obtained in the long run. In the case of deliberate efforts at an aim (the third class) we may or may not know accurately the value or position of this aim. In astronomical observations we do not know it, and the method of Least Squares is a method for helping us to ascertain it as well as we can; in such experimental results as firing at a mark we do know it, and may thus test the nature and amount of our failure by direct experience. In the remaining case, namely that of what we have termed natural kinds or groups of things, not only do we not know the ultimate limit, but its existence is always at least doubtful, and in many cases may be confidently denied. Where it does exist, that is, where the type seems for all practical purposes permanently fixed, we can only ascertain it by a laborious resort to statistics. Having done this, we may then test by it the results of observations on a small scale. For instance, if we find that the ultimate proportion of male to female births is about 106 to 100, we may then compare the statistics of some particular district or town and speak of the consequent ‘error,’ viz. the departure, in that particular and special district, from the general average.
(1) First of all, it's important to note that to understand the actual value of an error in any situation, we should ideally know where the limit or ultimate average lies, since the degree of an error is theoretically measured from that point. However, this information isn’t always available to us. Referring again to the three main categories of events we’re analyzing, we can see that in regards to games of chance, we generally do have this knowledge. Instead of relying on experience to determine the limit, we can usually deduce it through simple mechanical or mathematical calculations, allowing us to identify the ‘error’ in any single instance or group of instances by comparing the results we calculate to what theory tells us would ultimately be achieved over time. For deliberate efforts directed at a goal (the third category), we may or may not reliably know the value or position of that goal. In astronomical observations, we don’t have this knowledge, which is why the method of Least Squares exists to help us estimate it as accurately as possible; however, in experimental situations like aiming at a target, we do know it and can therefore measure the nature and extent of our failures based on direct experience. In the last category—what we’ve called natural kinds or groups of things—not only is the ultimate limit often unknown, but its existence can be at least questionable, and in many cases may be outright rejected. Where it does exist, meaning where the type appears to be consistently fixed for all practical intents and purposes, we can only identify it through careful statistical analysis. After doing this, we can then use it to evaluate the outcomes of observations on a smaller scale. For example, if we discover that the ultimate ratio of male to female births is about 106 to 100, we can then compare the statistics from a specific district or town and discuss the resulting ‘error,’ meaning the difference from the general average in that particular area.
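As a small worked illustration of that last remark, one might compute the ‘error’ of a particular district as its departure from the general average of 106 male births per 100 female births. The district figures below are invented purely for illustration:

# Departure of one (hypothetical) district from the general ratio of 106 boys per 100 girls.
general_ratio = 106 / 100

district_boys, district_girls = 2140, 2000    # invented district returns
district_ratio = district_boys / district_girls

print(round(district_ratio, 3))                  # 1.07
print(round(district_ratio - general_ratio, 3))  # the 'error' of this district: 0.01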
What we have therefore to do in the vast majority of practical cases is to take the average of a finite number of measurements or observations,—of all those, in fact, which we have in hand,—and take this as our starting point in order to measure the errors. The errors in fact are not known for certain but only probably calculated. This however is not so much of a theoretic defect as it may seem at first sight; for inasmuch as we seldom have to employ these methods,—for purposes of calculation, that is, as distinguished from mere illustration,—except for the purpose of discovering what the ultimate average is, it would be a sort of petitio principii to assume that we had already secured it. But it is worth while considering whether it is desirable to employ one and the same term for ‘errors’ known to be such, and whose amount can be assigned with certainty, and for ‘errors’ which are only probably such and whose amount can be only probably assigned. In fact it has been proposed[6] to employ the two terms ‘error’ and ‘residual’ respectively to distinguish between the magnitudes thus determined, that is, between the (generally unknown) actual error and the observed error.
What we really need to do in most practical situations is take the average of a limited number of measurements or observations—essentially all the ones we have available—and use this as our baseline to measure the errors. The errors aren’t known for sure, but only estimated. However, this isn’t as much of a theoretical flaw as it might seem at first glance; since we rarely use these methods—for calculation purposes, as opposed to just illustration—except to find out what the ultimate average is, it would be somewhat of a petitio principii to assume we have already determined it. But it’s worth considering whether it makes sense to use the same term for 'errors' that we know are errors, and whose amount can be given with certainty, and for 'errors' that are only likely to be errors, and whose amount can only be estimated. In fact, it has been proposed[6] to use the terms ‘error’ and ‘residual’ to differentiate between these two types of measurements, that is, between the (generally unknown) actual error and the observed error.
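A minimal sketch, with an invented true value and invented measurements, may make the distinction concrete: the ‘residuals’ are measured from the observed average, while the true ‘errors’ could only be computed if the ultimate value were known.

# 'Errors' versus 'residuals': the numbers below are purely illustrative.
true_value = 10.0                                    # usually unknown in practice
measurements = [9.8, 10.3, 10.1, 9.7, 10.4]

sample_mean = sum(measurements) / len(measurements)  # our practical starting point (10.06)

errors    = [m - true_value  for m in measurements]  # knowable only if the true value is known
residuals = [m - sample_mean for m in measurements]  # what we can actually compute

print(sample_mean)
print(errors)
print(residuals)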
§ 22. (2) The other point involves the question to what extent either of the first two tests (pp. 446, 7) of the closeness with which the various results have grouped themselves about their average is trustworthy or complete. The answer is that they are necessarily incomplete. No single estimate or magnitude can possibly give us an adequate account of a number of various magnitudes. The point is a very important one; and is not, I think, sufficiently attended to, the consequence being, as we shall see hereafter, that it is far too summarily assumed that a method which yields the result with the least ‘error of mean square’ must necessarily be the best result for all purposes. It is not however by any means clear that a test which answers best for one purpose must do so for all.
§ 22. (2) The other point raises the question of how reliable or complete the first two tests (pp. 446, 7) are in terms of how closely the results cluster around their average. The answer is that they are necessarily incomplete. No single estimate or measurement can fully represent a range of different measurements. This is a crucial point that I don’t think gets enough attention. As we will see later, it’s too often assumed that a method that produces the smallest ‘mean square error’ is automatically the best for every situation. However, it’s not clear that a test that works best for one purpose will work best for all.
It must be clearly understood that each of these tests is an ‘average,’ and that every average necessarily rejects a mass of varied detail by substituting for it a single result. We had, say, a lot of statures: so many of 60 inches, so many of 61, &c. We replace these by an ‘average’ of 68, and thereby drop a mass of information. A portion of this we then seek to recover by reconsidering the ‘errors’ or departures of these statures from their average. As before, however, instead of giving the full details we substitute an average of the errors. The only difference is that instead of taking the same kind of average (i.e. the arithmetical) we often prefer to adopt the one called the ‘error of mean square.’
It needs to be clear that each of these tests is an 'average,' and every average inevitably ignores a lot of varied details by replacing them with a single result. We had, for example, a range of heights: so many at 60 inches, so many at 61, etc. We replace these with an 'average' of 68, which means we lose a lot of information. We then try to recover some of this by looking again at the 'errors' or deviations of these heights from their average. As before, though, instead of providing all the details, we replace them with an average of the errors. The only difference is that instead of using the same type of average (i.e. the arithmetic), we often choose to use what's called the 'error of mean square.'
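To fix ideas, here is a rough sketch, with invented statures, of the three tests mentioned above; each condenses the whole list of departures into a single figure and therefore discards detail in its own way.

import statistics

heights = [60, 61, 64, 66, 68, 68, 69, 70, 72, 75, 76, 79]   # invented statures, in inches
mean_height = statistics.mean(heights)                        # 69

deviations = [h - mean_height for h in heights]
abs_devs   = [abs(d) for d in deviations]

mean_error           = statistics.mean(abs_devs)                          # arithmetical mean of the errors
probable_error       = statistics.median(abs_devs)                        # half the errors exceed it, half fall short
error_of_mean_square = statistics.mean(d * d for d in deviations) ** 0.5  # Airy's E.M.S.

print(mean_height, mean_error, probable_error, error_of_mean_square)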
§ 23. A question may be raised here which is of sufficient importance to deserve a short consideration. When we have got a set of measurements before us, why is it generally held to be sufficient simply to assign: (1) the mean value; and (2) the mean departure from this mean? The answer is, of course, partly given by the fact that we are only supposed to be in want of a rough approximation: but there is more to be said than this. A further justification is to be found in the fact that we assume that we need only contemplate the possibility of a single Law of Error, or at any rate that the departures from the familiar Law will be but trifling. In other words, if we recur to the figure on p. 29, we assume that there are only two unknown quantities or disposable constants to be assigned; viz. first, the position of the centre, and, secondly, the degree of eccentricity, if one may so term it, of the curve. The determination of the mean value directly and at once assigns the former, and the determination of the mean error (in either of the ways referred to already) indirectly assigns the latter by confining us to one alone of the possible curves indicated in the figure.
§ 23. A question arises here that is important enough to warrant a brief discussion. When we have a set of measurements in front of us, why is it typically considered enough to just provide: (1) the average value; and (2) the average deviation from this average? The answer partly lies in the fact that we're only looking for a rough estimate: but there’s more to consider. Another reason is that we assume we only need to think about the possibility of a single Law of Error, or at least that any deviations from the familiar Law will be minimal. In other words, if we refer back to the figure on p. 29, we assume there are only two unknown quantities or constants to determine; specifically, first, the position of the center, and second, the degree of eccentricity, if that's a fitting term for it, of the curve. Finding the average value immediately gives us the first, and determining the average error (using either of the methods mentioned earlier) indirectly provides the second by limiting us to just one of the possible curves shown in the figure.
Except for the assumption of one such Law of Error the determination of the mean error would give but a slight intimation of the sort of outline of our Curve of Facility. We might then have found it convenient to adopt some plan of successive approximation, by adding a third or fourth ‘mean.’ Just as we assign the mean value of the magnitude, and its mean departure from this mean; so we might take this mean error (however determined) as a fresh starting point, and assign the mean departure from it. If the point were worth further discussion we might easily illustrate by means of a diagram the sort of successive approximations which such indications would yield as to the ultimate form of the Curve of Facility or Law of Error.
Except for the assumption of one such Law of Error, the determination of the mean error would only provide a small indication of the general shape of our Curve of Facility. We might then find it useful to adopt a plan of successive approximation, by adding a third or fourth 'mean.' Just as we assign the average value of the magnitude and its average deviation from this average, we could take this mean error (however determined) as a new starting point and assign the average deviation from it. If the point were worth discussing further, we could easily illustrate with a diagram the kind of successive approximations that such indications would provide regarding the ultimate shape of the Curve of Facility or Law of Error.
As this volume is written mainly for those who take an interest in the logical questions involved, rather than as an introduction to the actual processes of calculation, mathematical details have been throughout avoided as much as possible. For this reason comparatively few references have been made to the exponential equation of the Law of Error, or to the corresponding ‘Probability integral,’ tables of which are given in several handbooks on the subject. There are two points however in connection with these particular topics as to which difficulties are, or should be, felt by so many students that some notice may be taken of them here.
This book is primarily for those interested in the logical questions involved, rather than as a guide to the actual calculation processes, so we've mostly avoided detailed math throughout. Because of this, there are relatively few references to the exponential equation of the Law of Error or the related 'Probability integral,' which can be found in various handbooks on the topic. However, there are two specific points related to these subjects that many students struggle with and should be addressed here.
(1) In regard to the ordinary algebraical expression for the law of error, viz. y = (h/√π) e^(−h²x²), it will have been observed that I have always spoken of y as being proportional to the number of errors of the particular magnitude x. It would hardly be correct to say, absolutely, that y represents that number, because of course the actual number of errors of any precise magnitude, where continuity of possibility is assumed, must be indefinitely small. If therefore we want to pass from the continuous to the discrete, by ascertaining the actual number of errors between two consecutive divisions of our scale, when, as usual in measurements, all within certain limits are referred to some one precise point, we must modify our formula. In accordance with the usual differential notation, we must say that the number of errors falling into one subdivision (dx) of our scale is (h/√π) e^(−h²x²) dx, where dx is a (small) unit of length, in which both h⁻¹ and x must be measured.
(1) Regarding the ordinary algebraic expression for the law of error, namely y = (h/√π) e^(−h²x²), I've always referred to y as being proportional to the number of errors of the specific magnitude x. It wouldn’t be accurate to say outright that y represents that number, because the actual number of errors of any precise magnitude, where a continuum of possible values is assumed, must be indefinitely small. Therefore, if we want to move from the continuous to the discrete by determining the actual number of errors between two consecutive divisions of our scale—where, as is typical in measurements, everything within certain limits is referred to a single precise point—we need to adjust our formula. In line with standard differential notation, we should say that the number of errors falling into one subdivision (dx) of our scale is (h/√π) e^(−h²x²) dx, where dx is a (small) unit of length, in terms of which both h⁻¹ and x must be measured.
The difficulty felt by most students is in applying the formula to actual statistics, in other words in putting in the correct units. To take an actual numerical example, suppose that 1460 men have been measured in regard to their height “true to the nearest inch,” and let it be known that the modulus here is 3.6 inches. Then dx = 1 (inch); h⁻¹ = 3.6 inches. Now ∑ (h/√π) e^(−h²x²) dx = 1; that is, the sum of all the consecutive possible values is equal to unity. When therefore we want the sum, as here, to be 1460, we must express the formula thus;— y = (1460/(√π × 3.6)) e^(−(x/3.6)²), or y = 228 e^(−(x/3.6)²).
The challenge most students face is applying the formula to actual statistics, meaning they struggle with using the correct units. For example, let's say 1,460 men have been measured for their height, accurate to the nearest inch, and the modulus is 3.6 inches. Then dx = 1 (inch); h⁻¹ = 3.6 inches. Now ∑ (h/√π) e^(−h²x²) dx = 1; that is, the sum of all the possible values equals unity. Therefore, when we want the total to be 1,460, we need to express the formula this way: y = (1460/(√π × 3.6)) e^(−(x/3.6)²), or y = 228 e^(−(x/3.6)²).
Here x stands for the number of inches measured from the central or mean height, and y stands for the number of men referred to that height in our statistical table. (The values of e^(−t²) for successive values of t are given in the handbooks.)
Here x represents the number of inches measured from the average height, and y indicates the number of men corresponding to that height in our statistical table. (The values of e^(−t²) for different values of t are provided in the handbooks.)
For illustration I give the calculated numbers by this formula for values of x from 0 to 8 inches, with the actual numbers observed in the Cambridge measurements recently set on foot by Mr Galton.
For illustration, I provide the calculated numbers using this formula for values of x from 0 to 8 inches, along with the actual numbers observed in the recent Cambridge measurements conducted by Mr. Galton.
x (inches) | calculated y | observed y
0 | 228 | 231
1 | 212 | 218
2 | 166 | 170
3 | 111 | 110
4 | 82 | 66
5 | 32 | 31
6 | 11 | 10
7 | 4 | 6
8 | 1 | 3
Here the average height was 69 inches: dx, as stated, = 1 inch. By saying, ‘put x = 0,’ we mean, calculate the number of men who are assigned to 69 inches; i.e. who fall between 68.5 and 69.5. By saying, ‘put x = 4,’ we mean, calculate the number who are assigned to 65 or to 73; i.e. who lie between 64.5 and 65.5, or between 72.5 and 73.5. The observed results, it will be seen, keep pretty close to the calculated: in the case of the former the means of equal and opposite divergences from the mean have been taken, the actual results not being always the same in opposite directions.
Here, the average height was 69 inches: dx, as mentioned, = 1 inch. When we say, ‘set x = 0,’ we mean, calculate the number of men who are at 69 inches; that is, who fall between 68.5 and 69.5. When we say, ‘set x = 4,’ we mean, calculate the number who are at 65 or at 73; that is, who lie between 64.5 and 65.5, or between 72.5 and 73.5. The observed results, as you will see, are quite close to the calculated ones: in the former case, the means of equal and opposite differences from the average have been taken, with the actual results not always being the same in opposite directions.
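The calculated column can be approximated by evaluating the fitted curve y = 228 e^(−(x/3.6)²) at each whole inch from the mean; a minimal sketch follows. The output will not reproduce every printed figure exactly, since the figures in the text were worked from tables of e^(−t²) and rounded.

import math

N, modulus = 1460, 3.6
scale = N / (math.sqrt(math.pi) * modulus)     # about 228.8
for x in range(0, 9):
    y = scale * math.exp(-(x / modulus) ** 2)  # men assigned to the height x inches from the mean
    print(x, round(y))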
(2) The other point concerns the interpretation of the familiar probability integral, (2/√π) ∫₀ᵗ e^(−t²) dt. Every one who has calculated the chance of an event, by the help of the tables of this integral given in so many handbooks, knows that if we assign any numerical value to t, the corresponding value of the above expression assigns the chance that an error taken at random shall lie within that same limit, viz. t. Thus put t = 1.5, and we have the result 0.96; that is, only 4 per cent. of the errors will exceed ‘one and a half.’ But when we ask, ‘one and a half’ what? the answer would not always be very ready. As usual, the main difficulty of the beginner is not to manipulate the formulæ, but to be quite clear about his units.
(2) The other point is about understanding the well-known probability integral, (2/√π) ∫₀ᵗ e^(−t²) dt. Anyone who's calculated the probability of an event using the tables of this integral found in so many manuals knows that when we assign a numerical value to t, the corresponding value of the expression gives the probability that a randomly taken error will fall within that same limit, which is t. So if we set t = 1.5, we get the result of 0.96; meaning only 4 percent of the errors will be greater than ‘one and a half.’ But when we ask, ‘one and a half’ what? the answer isn’t always straightforward. As usual, the main struggle for beginners isn’t manipulating the equations, but being clear about their units.
It will be seen at once that this case differs from the preceding in that we cannot now choose our unit as we please. Where, as here, there is only one variable (t), if we were allowed to select our own unit, the inch, foot, or whatever it might be, we might get quite different results. Accordingly some comparatively natural unit must have been chosen for us in which we are bound to reckon, just as in the circular measurement of an angle as distinguished from that by degrees.
It’s clear right away that this case is different from the previous one because we can’t pick our unit freely anymore. In this situation, where there’s only one variable (t), if we could choose our own unit—like inches, feet, or something else—we could end up with very different results. So, a more natural unit must have been chosen for us, which we have to use, similar to how we measure angles in a circle compared to measuring them in degrees.
The answer is that the unit here is the modulus, and that to put ‘t = 1.5’ is to say, ‘suppose the error half as great again as the modulus’; the modulus itself being an error of a certain assignable magnitude depending upon the nature of the measurements or observations in question. We shall see this better if we put the integral in the form (2/√π) ∫₀^(hx) e^(−h²x²) d(hx); which is precisely equivalent, since the value of a definite integral is independent of the particular variable employed. Here hx is the same as x : 1/h; i.e. it is the ratio of x to 1/h, or x measured in terms of 1/h. But 1/h is the modulus in the equation (y = (h/√π) e^(−h²x²)) for the law of error. In other words the numerical value of an error in this formula, is the number of times, whole or fractional, which it contains the modulus.
The answer is that the unit here is the modulus, and saying ‘t = 1.5’ means, ‘assume the error is half as great again as the modulus’; the modulus itself represents an error of a specific measurable size based on the type of measurements or observations involved. We’ll understand this better if we express the integral as (2/√π) ∫₀^(hx) e^(−h²x²) d(hx); which is exactly equivalent, since the value of a definite integral doesn’t depend on the specific variable used. Here hx is the same as x : 1/h; that is, it’s the ratio of x to 1/h, or x measured in terms of 1/h. But 1/h is the modulus in the equation (y = (h/√π) e^(−h²x²)) for the law of error. In other words, the numerical value of an error in this formula is how many times, either whole or fractional, it contains the modulus.
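In modern notation the probability integral of the text is simply the error function erf(t). A small sketch using Python's math.erf both checks the ‘4 per cent’ figure and shows an error given in inches being expressed in units of the modulus (the 3-inch error below is an arbitrary example):

import math

print(math.erf(1.5))          # about 0.966: only some 4 per cent of errors exceed 1.5 moduli

# With the stature example above (modulus about 3.6 inches), the chance that a man
# drawn at random deviates from the mean height by less than 3 inches:
modulus = 3.6
print(math.erf(3 / modulus))  # about 0.76; the 3-inch error is about 0.83 of a modulus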
1 This kind of mean is called by Fechner and others the “dichteste Werth.” The most appropriate appeal to it that I have seen is by Prof. Lexis (Massenerscheinungen, p. 42) where he shows that it indicates clearly a sort of normal length of human life, of about 70 years; a result which is almost entirely masked when we appeal to the arithmetical average.
1 This type of mean is referred to by Fechner and others as the “dichteste Werth.” The best explanation I’ve come across is by Prof. Lexis (Massenerscheinungen, p. 42), where he argues that it clearly represents a sort of normal lifespan for humans, which is around 70 years; a finding that gets largely obscured when we look at the arithmetic average.
This mean ought to be called the ‘probable’ value (a name however in possession of another) on the ground that it indicates the point of likeliest occurrence; i.e. if we compare all the indefinitely small and equal units of variation, the one corresponding to this will tend to be most frequently represented.
This should be referred to as the ‘probable’ value (though that term is already taken) because it shows the point of highest likelihood; in other words, if we look at all the infinitely small and equal units of variation, the one that matches this will be the one that's most commonly seen.
2 A diagram illustrative of this number of results was given in Nature (Sept. 1, 1887). In calculating, as above, the different means, I may remark that the original results were given to three decimal places; but, in classing them, only one place was noted. That is, 29.9 includes all values between 29.900 and 29.999. Thus the value most frequently entered in my tables was 30.0, but on the usual principles of interpolation this is reckoned as 30.05.
2 A diagram illustrating this number of results was published in Nature (Sept. 1, 1887). In calculating the different averages as mentioned earlier, I should point out that the original results were recorded to three decimal places; however, when categorizing them, only one decimal place was noted. In other words, 29.9 represents all values from 29.900 to 29.999. Therefore, the value that appeared most often in my tables was 30.0, but according to the usual interpolation methods, this is considered as 30.05.
3 There is some ambiguity in the phraseology in use here. Thus Airy commonly uses the expression ‘Error of Mean Square’ to represent, as here, √(∑e²/n). Galloway commonly speaks of the ‘Mean Square of the Errors’ to represent ∑e²/n. I shall adhere to the former usage and represent it briefly by E.M.S. Still more unfortunate (to my thinking) is the employment, by Mr Merriman and others, of the expression ‘Mean Error,’ (widely in use in its more natural signification,) as the equivalent of this E.M.S.
3 There is some ambiguity in the language used here. Airy often uses the term ‘Error of Mean Square’ to indicate, as here, √(∑e²/n). Galloway typically refers to the ‘Mean Square of the Errors’ to mean ∑e²/n. I will stick with the former term and denote it simply as E.M.S. Even more unfortunate, in my opinion, is the way Mr. Merriman and others use the phrase ‘Mean Error’ (commonly used in its more straightforward definition) as a synonym for this E.M.S.
The technical term ‘Fluctuation’ is applied by Mr F. Y. Edgeworth to the expression 2∑e²/n.
The technical term ‘Fluctuation’ is used by Mr. F. Y. Edgeworth to refer to the expression 2∑e²/n.
4 Practically, of course, we should allow for the expansion or contraction. But for purposes of logical explanation we may conveniently take this variation as a specimen of one of those disturbances which may be neutralised by resort to an average.
4 In practice, of course, we would allow for the expansion or contraction. But for purposes of logical explanation we can conveniently treat this variation as an example of one of those disturbances that may be neutralized by resorting to an average.
5 More strictly multinomial: the relative frequency of the different numbers being indicated by the coefficients of the powers of x in the development of (1 + x + x² + … + x⁹)¹⁰.
5 More strictly multinomial: the relative frequency of the different numbers being indicated by the coefficients of the powers of x in the development of (1 + x + x² + … + x⁹)¹⁰.
CHAPTER 19.
THE THEORY OF THE AVERAGE AS A MEANS OF APPROXIMATION TO THE TRUTH.
§ 1. In the last chapter we were occupied with the Average mainly under its qualitative rather than its quantitative aspect. That is, we discussed its general nature, its principal varieties, and the main uses to which it could be put in ordinary life or in reasoning processes which did not claim to be very exact. It is now time to enter more minutely into the specific question of the employment of the average in the way peculiarly appropriate to Probability. That is, we must be supposed to have a certain number of measurements,—in the widest sense of that term,—placed before us, and to be prepared to answer such questions as; Why do we take their average? With what degree of confidence? Must we in all cases take the average, and, if so, one always of the same kind?
§ 1. In the last chapter, we focused on the Average primarily from a qualitative perspective rather than a quantitative one. We discussed its general characteristics, its main types, and the key ways it can be used in everyday life or in reasoning that doesn’t need to be very precise. Now, it’s time to dive deeper into the specific issue of how the average is used in a way that’s especially relevant to Probability. This means we need to consider a set of measurements—broadly defined—and be ready to answer questions like: Why do we calculate their average? How confident can we be in that average? Should we always take the average, and if so, should it always be of the same type?
The subject upon which we are thus entering is one which, under its most general theoretic treatment, has perhaps given rise to more profound investigation, to a greater variety of opinion, and in consequence to a more extensive history and literature, than any other single problem within the range of mathematics.[1] But, in spite of this, the main logical principles underlying the methods and processes in question are not, I apprehend, particularly difficult to grasp: though, owing to the extremely technical style of treatment adopted even in comparatively elementary discussions of the subject, it is far from easy for those who have but a moderate command of mathematical resources to disentangle these principles from the symbols in which they are clothed. The present chapter contains an attempt to remove these difficulties, so far as a general comprehension of the subject is concerned. As the treatment thus adopted involves a considerable number of subdivisions, the reader will probably find it convenient to refer back occasionally to the table of contents at the commencement of this volume.
The topic we’re about to discuss has likely led to deeper investigations, a wider range of opinions, and more extensive history and literature than any other single issue in mathematics.[1] However, despite this, the main logical principles behind the methods and processes in question aren't particularly hard to understand. Still, due to the very technical nature of even the more basic discussions of the subject, it's not easy for those with only a basic understanding of math to separate these principles from the symbols they’re presented with. This chapter aims to address these challenges, at least for a general understanding of the topic. Since this approach involves several subdivisions, readers may find it helpful to occasionally refer back to the table of contents at the beginning of this volume.
§ 2. The subject, in the form in which we shall discuss it, will be narrowed to the consideration of the average, on account of the comparative simplicity and very wide prevalence of this aspect of the problem. The problem is however very commonly referred to, even in non-mathematical treatises, as the Rule or Method of Least Squares; the fact being that, in such cases as we shall be concerned with, the Rule of Least Squares resolves itself into the simpler and more familiar process of taking the arithmetical average. A very simple example,—one given by Herschel,—will explain the general nature of the task under a slightly wider treatment, and will serve to justify the familiar designation.
§ 2. The topic we'll be discussing will focus on the concept of the average, due to its relative simplicity and widespread relevance. However, this issue is often referred to, even in non-mathematical contexts, as the Rule or Method of Least Squares. This is because, in the situations we’ll examine, the Rule of Least Squares essentially simplifies to the more straightforward and common process of calculating the arithmetic average. A very straightforward example—one provided by Herschel—will illustrate the general nature of the task under a slightly broader approach and will help justify this commonly used term.
Suppose that a man had been firing for some time with a pistol at a small mark, say a wafer on a wall. We may take it for granted that the shot-marks would tend to group themselves about the wafer as a centre, with a density varying in some way inversely with the distance from the centre. But now suppose that the wafer which marked the centre was removed, so that we could see nothing but the surface of the wall spotted with the shot-marks; and that we were asked to guess the position of the wafer. Had there been only one shot, common sense would suggest our assuming (of course very precariously) that this marked the real centre. Had there been two, common sense would suggest our taking the mid-point between them. But if three or more were involved, common sense would be at a loss. It would feel that some intermediate point ought to be selected, but would not see its way to a more precise determination, because its familiar reliance,—the arithmetical average,—does not seem at hand here. The rule in question tells us how to proceed. It directs us to select that point which will render the sum of the squares of all the distances of the various shot-marks from it the least possible.[2]
Imagine someone has been shooting at a small target, like a sticker on a wall, for a while. We can assume that the bullet holes will cluster around the sticker as the center, with the density decreasing the farther you get from it. Now, let's say the sticker is taken away, and all we see are the bullet holes on the wall. If we were asked to guess where the sticker was, common sense would suggest that if there was only one bullet hole, we might think it indicates the center. If there were two holes, we could assume the center is halfway between them. But if there are three or more, common sense would struggle. It would feel like there should be some average point chosen, but wouldn’t be sure how to pinpoint it exactly, since the usual method—taking the arithmetic average—doesn't really apply here. The rule we need tells us what to do. It says to find the point that makes the sum of the squares of all the distances from that point to the bullet holes as small as possible.[2]
This is merely by way of illustration, and to justify the familiar designation of the rule. The sort of cases with which we shall be exclusively occupied are those comparatively simple ones in which only linear magnitude, or some quality which can be adequately represented by linear magnitude, is the object under consideration. In respect of these the Rule of Least Squares reduces itself to the process of taking the average, in the most familiar sense of that term, viz. the arithmetical mean; and a single Law of Error, or its graphical equivalent, a Curve of Facility, will suffice accurately to indicate the comparative frequency of the different amounts of the one variable magnitude involved.
This is just an example to clarify the common name of the rule. The cases we'll mainly focus on are the simpler ones where only linear size, or some quality that can be effectively shown through linear size, is the subject we're looking at. For these cases, the Rule of Least Squares boils down to calculating the average, in the most basic sense, which is the arithmetic mean; and one Law of Error, or its graphical counterpart, a Curve of Facility, will be enough to accurately show how often different values of the single variable are present.
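A quick numerical check, with invented shot coordinates, shows why the least-squares point is just the average in such cases: the centroid of the marks gives a smaller sum of squared distances than any displaced point.

# The arithmetical average (centroid) minimises the sum of the squared distances.
shots = [(1.2, 0.8), (-0.5, 1.9), (0.3, -1.1), (2.0, 0.2), (-1.0, -0.4)]   # invented shot-marks

def sum_of_squares(px, py):
    return sum((x - px) ** 2 + (y - py) ** 2 for x, y in shots)

cx = sum(x for x, _ in shots) / len(shots)
cy = sum(y for _, y in shots) / len(shots)

print(sum_of_squares(cx, cy))               # smallest attainable value
print(sum_of_squares(cx + 0.3, cy - 0.2))   # any point away from the centroid does worse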
§ 3. We may conveniently here again call attention to a misconception or confusion which has been already noticed in a former chapter. It is that of confounding the Law of Error with the Method of Least Squares. These are things of an entirely distinct kind. The former is of the nature of a physical fact, and its production is one which in many cases is entirely beyond our control. The latter,—or any simplified application of it, such as the arithmetical average,—is no law whatever in the physical sense. It is rather a precept or rule for our guidance. The Law states, in any given case, how the errors tend to occur in respect of their magnitude and frequency. The Method directs us how to treat these errors when any number of them are presented to us. No doubt there is a relation between the two, as will be pointed out in the course of the following pages; but there is nothing really to prevent us from using the same method for different laws of error, or different methods for the same law. In so doing, the question of distinct right and wrong would seldom be involved, but rather one of more or less propriety.
§ 3. We can conveniently point out again a misunderstanding that was mentioned in a previous chapter. This is the mix-up between the Law of Error and the Method of Least Squares. These are completely different concepts. The former is a physical fact, and its occurrence is often beyond our control. The latter—along with any simplified application of it, like the arithmetic average—is not a law in the physical sense. Instead, it serves as a guideline. The Law describes how errors typically manifest in terms of their size and frequency in a given situation. The Method tells us how to handle these errors when we encounter several of them. There is certainly a relationship between the two, which will be explained in the following pages; however, there’s nothing that stops us from applying the same method to different laws of error, or using different methods for the same law. In doing so, the question of right and wrong is rarely at stake; it's more about what is appropriate or not.
§ 4. The reader must understand,—as was implied in the illustration about the pistol shots,—that the ultimate problem before us is an inverse one. That is, we are supposed to have a moderate number of ‘errors’ before us and we are to undertake to say whereabouts is the centre from which they diverge. This resembles the determination of a cause from the observation of an effect. But, as mostly happens in inverse problems, we must commence with the consideration of the direct problem. In other words, so far as concerns the case before us, we shall have to begin by supposing that the ultimate object of our aim,—that is, the true centre of our curve of frequency,—is already known to us: in which case all that remains to be done is to study the consequences of taking averages of the magnitudes which constitute the errors.
§ 4. The reader must understand—similar to the example with the pistol shots—that the main issue we face is an inverse one. Essentially, we have a certain number of ‘errors’ in front of us, and we need to determine the center from which they spread out. This is like figuring out a cause based on the effects we observe. However, as is often the case with inverse problems, we need to start by looking at the direct problem first. In simpler terms, regarding our current situation, we will need to assume that the ultimate goal we’re aiming for—that is, the true center of our frequency curve—is already known to us. In that case, all that’s left to do is analyze the consequences of calculating averages of the values that make up the errors.
§ 5. We shall, for the present, confine our remarks to what must be regarded as the typical case where considerations of Probability are concerned; viz. that in which the law of arrangement or development is of the Binomial kind. The nature of this law was explained in Chap. II., where it was shown that the frequency of the respective numbers of occurrences was regulated in accordance with the magnitude of the successive terms of the expansion of the binomial (1 + 1)ⁿ. It was also pointed out that when n becomes very great, that is, when the number of influencing circumstances is very large, and their relative individual influence correspondingly small, the form assumed by a curve drawn through the summits of ordinates representing these successive terms of the binomial tends towards that assigned by the equation y = (h/√π) e^(−h²x²).
§ 5. For now, we will focus on what we should consider the typical case when it comes to Probability; specifically, the situation where the law of arrangement or development is of the Binomial type. This law was detailed in Chapter II, where it was explained that the frequency of the different numbers of occurrences is determined by the size of the successive terms in the expansion of the binomial (1 + 1)ⁿ. It was also noted that when n becomes very large, meaning the number of influencing factors is very high and their individual effects are relatively minor, the shape of a curve drawn through the peaks of vertical lines representing these successive binomial terms approaches the form given by the equation y = (h/√π) e^(−h²x²).
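The tendency described here can be seen numerically even for a quite moderate n. The sketch below (n = 20 is an arbitrary choice) sets the exact binomial terms beside the ordinates of the limiting exponential curve having the same mean and spread:

import math

n = 20
for k in range(n + 1):
    exact  = math.comb(n, k) / 2 ** n   # term of (1 + 1)^n, scaled so the terms sum to 1
    # Exponential (normal) curve with mean n/2 and variance n/4:
    approx = math.exp(-((k - n / 2) ** 2) / (n / 2)) / math.sqrt(math.pi * n / 2)
    print(k, round(exact, 4), round(approx, 4))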
For all practical purposes therefore we may talk indifferently of the Binomial or Exponential law; if only on the ground that the arrangement of the actual phenomena on one or other of these two schemes would soon become indistinguishable when the numbers involved are large. But there is another ground than this. Even when the phenomena themselves represent a continuous magnitude, our measurements of them,—which are all with which we can deal,—are discontinuous. Suppose we had before us the accurate heights of a million adult men. For all practical purposes these would represent the variations of a continuous magnitude, for the differences between two successive magnitudes, especially near the mean, would be inappreciably small. But our tables will probably represent them only to the nearest inch. We have so many assigned as 69 inches; so many as 70; and so on. The tabular statement in fact is of much the same character as if we were assigning the number of ‘heads’ in a toss of a handful of pence; that is, as if we were dealing with discontinuous numbers on the binomial, rather than with a continuous magnitude on the exponential arrangement.
For all practical purposes, we can refer to the Binomial or Exponential law interchangeably, mainly because the distribution of real phenomena under either model would quickly look the same when dealing with large numbers. However, there’s another reason for this. Even when the phenomena represent a continuous quantity, our measurements of them—which are the only ones we can work with—are discontinuous. Imagine we had the exact heights of a million adult men. For most practical purposes, these would show variations of a continuous quantity since the differences between two successive measurements, especially around the average height, would be so small they’re hardly noticeable. But our data likely only report them to the nearest inch. We might have a certain number listed as 69 inches and another group as 70, and so on. This tabular representation is quite similar to counting the number of 'heads' when flipping a handful of coins; in other words, we’re working with discrete numbers under the binomial model rather than a continuous quantity under the exponential model.
§ 6. Confining ourselves then, for the present, to this general head, of the binomial or exponential law, we must distinguish two separate cases in respect of the knowledge we may possess as to the generating circumstances of the variable magnitudes.
§ 6. For now, let's focus on this broader topic of the binomial or exponential law. We need to differentiate between two distinct cases regarding what we know about the factors that generate the varying amounts.
(1) There is, first, the case in which the conditions of the problem are determinable à priori: that is, where we are able to say, prior to specific experience, how frequently each combination will occur in the long run. In this case the main or ultimate object for which we are supposing that the average is employed,—i.e. that of discovering the true mean value,—is superseded. We are able to say what the mean or central value in the long run will be; and therefore there is no occasion to set about determining it, with some trouble and uncertainty, from a small number of observations. Still it is necessary to discuss this case carefully, because its assumption is a necessary link in the reasoning in other cases.
(1) First, there’s the situation where the conditions of the problem can be determined à priori: that is, we can predict, before any specific experience, how often each combination will happen over time. In this case, the main goal for which we assume the average is used—namely, finding the true mean value—is not needed. We can state what the average or central value will be in the long run, so there’s no need to try to determine it through a small number of observations, which could be difficult and uncertain. However, it’s important to discuss this situation carefully because its assumption is a necessary connection in the reasoning for other cases.
This comparatively à priori knowledge may present itself in two different degrees as respects its completeness. In the first place it may, so far as the circumstances in question are concerned, be absolutely complete. Consider the results when a handful of ten pence is repeatedly tossed up. We know precisely what the mean value is here, viz. equal division of heads and tails: we know also the chance of six heads and four tails, and so on. That is, if we had to plot out a diagram showing the relative frequency of each combination, we could do so without appealing to experience. We could draw the appropriate binomial curve from the generating conditions given in the statement of the problem.
This relatively à priori knowledge can come in two different levels of completeness. First, it can be completely comprehensive regarding the relevant circumstances. For example, think about the results of repeatedly tossing a handful of ten pence coins. We know exactly what the average outcome is: an equal split of heads and tails. We also understand the probability of getting six heads and four tails, and so on. In other words, if we needed to create a diagram showing the relative frequency of each combination, we could do that without needing real-world experience. We could draw the correct binomial curve based on the conditions presented in the problem statement.
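For instance, the chance of six heads and four tails, and indeed the whole binomial curve of frequency for a handful of ten pence, can be written down without any appeal to experience; a minimal sketch:

import math

print(math.comb(10, 6) / 2 ** 10)   # six heads and four tails: 210/1024, about 0.205

for heads in range(11):             # the complete a priori curve of frequency
    print(heads, math.comb(10, heads) / 2 ** 10)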
But now consider the results of firing at a target consisting of a long and narrow strip, of which one point is marked as the centre of aim.[3] Here (assuming that there are no causes at work to produce permanent bias) we know that this centre will correspond to the mean value. And we know also, in a general way, that the dispersion on each side of this will follow a binomial law. But if we attempted to plot out the proportions, as in the preceding case, by erecting ordinates which should represent each degree of frequency as we receded further from the mean, we should find that we could not do so. Fresh data must be given or inferred. A good marksman and a bad marksman will both distribute their shot according to the same general law; but the rapidity with which the shots thin off as we recede from the centre will be different in the two cases. Another ‘constant’ is demanded before the curve of frequency could be correctly traced out.
But now think about the results of shooting at a target that's a long and narrow strip, with one point marked as the center of aim.[3] Here (assuming there aren't any factors causing a permanent bias) we know that this center will correspond to the average value. We also generally understand that the spread of shots on either side of this will follow a binomial law. However, if we tried to plot the proportions, like in the previous example, by creating graphs representing each frequency as we move away from the average, we would find that we couldn't do it. New data would need to be provided or inferred. A skilled marksman and an unskilled one will both distribute their shots according to the same general law, but the rate at which the shots taper off as we move away from the center will differ between the two. Another ‘constant’ is needed before the frequency curve can be accurately drawn.
§ 7. (2) The second division, to be next considered, corresponds for all logical purposes to the first. It comprises the cases in which though we have no à priori knowledge as to the situation about which the values will tend to cluster in the long run, yet we have sufficient experience at hand to assign it with practical certainty. Consider for instance the tables of human stature. These are often very extensive, including tens or hundreds of thousands. In such cases the mean or central value is determinable with just as great certainty as by any à priori rule. That is, if we took another hundred thousand measurements from the same class of population, we should feel secure that the average would not be altered by any magnitude which our measuring instruments could practically appreciate.
§ 7. (2) The second category we'll look at next is similar to the first for all logical purposes. It includes situations where, even though we don’t have prior knowledge about where the values will tend to group over time, we have enough experience to assign it with practical certainty. For example, take the data on human height. These datasets are often very large, including tens or hundreds of thousands of entries. In such instances, the mean or central value can be determined with as much certainty as any prior rule. In other words, if we gathered another hundred thousand measurements from the same population, we would be confident that the average wouldn’t change by any amount that our measuring tools could realistically detect.
§ 8. But the mere assignment of the mean or central value does not here, any more than in the preceding case, give us all that we want to know. It might so happen that the mean height of two populations was the same, but that the law of dispersion about that mean was very different: so that a man who in one series was an exceptional giant or dwarf should, in the other, be in no wise remarkable.
§ 8. However, just stating the average or central value doesn’t provide everything we need to understand. It could happen that the average height of two groups is the same, but the way the heights spread around that average could be very different: meaning a person who is an outstanding giant or dwarf in one group might be completely ordinary in the other.
To explain the process of thus determining the actual magnitude of the dispersion would demand too much mathematical detail; but some indication may be given. What we have to do is to determine the constant h in the equation[4] y = (h/√π) e^(−h²x²). In technical language, what we have to do is to determine the modulus of this equation. The quantity 1/h in the above expression is called the modulus. It measures the degree of contraction or dispersion about the mean indicated by this equation. When it is large the dispersion is considerable; that is the magnitudes are not closely crowded up towards the centre, when it is small they are thus crowded up. The smaller the modulus in the curve representing the thickness with which the shot-marks clustered about the centre of the target, the better the marksman.
To explain the process of determining the actual level of dispersion would require too much mathematical detail, but I can provide some insight. What we need to do is find the constant h in the equation[4] y = (h/√π) e^(−h²x²). In technical terms, we need to determine the modulus of this equation. The quantity 1/h in the expression above is called the modulus. It indicates the degree of contraction or dispersion around the average represented by this equation. When it is large, the dispersion is significant; that is, the values are not tightly grouped near the center. When it is small, they are closer together. The smaller the modulus in the curve showing how the shot marks cluster around the center of the target, the better the marksman.
§ 9. There are several ways of determining the modulus. In the first of the cases discussed above, where our theoretical knowledge is complete, we are able to calculate it à priori from our knowledge of the chances. We should naturally adopt this plan if we were tossing up a large handful of pence.
§ 9. There are several ways to determine the modulus. In the first scenario mentioned earlier, where we have complete theoretical knowledge, we can calculate it à priori based on our understanding of the probabilities. We would naturally choose this method if we were flipping a large handful of coins.
The usual à posteriori plan, when we have the measurements of the magnitudes or observations before us, is this:—Take the mean square of the errors, and double this; the result gives the square of the modulus. Suppose, for instance, that we had the five magnitudes, 4, 5, 6, 7, 8. The mean of these is 6: the ‘errors’ are respectively 2, 1, 0, 1, 2. Therefore the ‘modulus squared’ is equal to twice 10/5, that is 4; i.e. the modulus is 2. Had the magnitudes been 2, 4, 6, 8, 10; representing the same mean (6) as before, but displaying a greater dispersion about it, the modulus would have been larger, viz. 4 instead of 2.
The typical à posteriori plan, when we have the measurements of the values or observations in front of us, is this:—Calculate the mean square of the errors, then double that; the result gives the square of the modulus. For example, take the five values 4, 5, 6, 7, 8. Their mean is 6, so the ‘errors’ are 2, 1, 0, 1, 2, respectively. The mean square of the errors is 10/5 = 2, and doubling this gives 4 as the ‘modulus squared’; the modulus is therefore 2. If the values were 2, 4, 6, 8, 10, which represent the same mean (6) as before but show greater spread around it, the modulus would have been larger, specifically 4 instead of 2.
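The rule just stated is easy to mechanise; the following sketch applies it to the two sets of magnitudes used in the example:

def modulus(values):
    mean = sum(values) / len(values)
    mean_square_error = sum((v - mean) ** 2 for v in values) / len(values)
    return (2 * mean_square_error) ** 0.5   # modulus squared = twice the mean square of the errors

print(modulus([4, 5, 6, 7, 8]))     # 2.0
print(modulus([2, 4, 6, 8, 10]))    # 4.0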
Mr Galton's method is more of a graphical nature. It is described in a paper on Statistics by Intercomparison (Phil. Mag. 1875), and elsewhere. It may be indicated as follows. Suppose that we were dealing with a large number of measurements of human stature, and conceive that all the persons in question were marshalled in the order of their height. Select the average height, as marked by the central man of the row. Suppose him to be 69 inches. Then raise (or depress) the scale from this point until it stands at such a height as just to include one half of the men above (or below) the mean. (In practice this would be found to require about 1.71 inches: that is, one quarter of any large group of such men will fall between 69 and 70.71 inches.) Divide this number by 0.4769 and we have the modulus. In the case in question it would be equal to about 3.6 inches.
Mr. Galton's method is more graphical. It's explained in a paper on Statistics by Intercomparison (Phil. Mag. 1875) and in other places. Here’s how it works. Imagine we have a large set of measurements of human height and think of all the individuals lined up in order of their height. Pick the average height, represented by the person in the middle of the line. Let's say he is 69 inches tall. Now, adjust the scale from this point upward or downward until it reaches a height that includes half of the men either above or below the average. (In practice, this would be about 1.71 inches: meaning one quarter of a large group of these men would measure between 69 and 70.71 inches.) Divide this number by 0.4769, and we get the modulus. In this case, it would be around 3.6 inches.
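In miniature, the arithmetic of Galton's method is simply the quoted quartile distance divided by the constant 0.4769; with the figures given above:

quartile_deviation = 1.71           # inches above (or below) the median that take in one quarter of the men
print(quartile_deviation / 0.4769)  # about 3.59 inches: the modulus of roughly 3.6 quoted in the text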
Under the assumption with which we start, viz. that the law of error displays itself in the familiar binomial form, or in some form approximating to this, the three methods indicated above will coincide in their result. Where there is any doubt on this head, or where we do not feel able to calculate beforehand what will be the rate of dispersion, we must adopt the second plan of determining the modulus. This is the only universally applicable mode of calculation: in fact that it should yield the modulus is a truth of definition; for in determining the error of mean square we are really doing nothing else than determining the modulus, as was pointed out in the last chapter.
Assuming that the law of error appears in the familiar binomial form, or something similar, the three methods mentioned above will yield the same result. If there’s any uncertainty about this, or if we can’t predict the rate of dispersion in advance, we should go with the second method of calculating the modulus. This is the only calculation method that works universally: that it yields the modulus is true by definition, because when we calculate the mean square error we are really doing nothing but determining the modulus, as explained in the last chapter.
§ 10. The position then which we have now reached is this. Taking it for granted that the Law of Error will fall into the symbolic form expressed by the equation y = (h/√π) e^(−h²x²), we have rules at hand by which h may be determined. We therefore, for the purposes in question, know all about the curve of frequency: we can trace it out on paper: given one value,—say the central one,—we can determine any other value at any distance from this. That is, knowing how many men in a million, say, are 69 inches high, we can determine without direct observation how many will be 67, 68, 70, 71, and so on.
§ 10. The point we have now reached is this. Assuming that the Law of Error can be represented by the equation y = (h/√π) e^(−h²x²), we have methods available to determine h. Therefore, for our purposes, we understand everything about the frequency curve: we can sketch it out on paper. Given one value—let's say the central one—we can find out any other value at any distance from it. For example, knowing how many men in a million are 69 inches tall, we can figure out without direct measurement how many will be 67, 68, 70, 71, and so forth.
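To make this concrete, the sketch below evaluates the law y = (h/√π) e^(−h²x²) at a few distances from the mean. The modulus of 3.6 inches (so h = 1/3.6) is the illustrative figure from Mr Galton's example above, so the per-million counts are illustrative as well.

from math import exp, pi, sqrt

h = 1 / 3.6                        # reciprocal of the assumed modulus, per inch
per_million = 1_000_000

def frequency(x):
    # Height of the curve of facility at a distance x from the mean.
    return (h / sqrt(pi)) * exp(-(h * x) ** 2)

for inches_from_mean in range(0, 4):
    men = per_million * frequency(inches_from_mean)   # men per million, per inch of stature
    print(69 + inches_from_mean, round(men))
# The counts for 68, 67, and so on follow by symmetry about the mean.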
We can now adequately discuss the principal question of logical interest before us; viz. why do we take averages or means? What is the exact nature and amount of the advantage 475 gained by so doing? The advanced student would of course prefer to work out the answers to these questions by appealing at once to the Law of Error in its ultimate or exponential form. But I feel convinced that the best method for those who wish to gain a clear conception of the logical nature of the process involved, is to begin by treating it as a question of combinations such as we are familiar with in elementary algebra; in other words to take a finite number of errors and to see what comes of averaging these. We can then proceed to work out arithmetically the results of combining two or more of the errors together so as to get a new series, not contenting ourselves with the general character merely of the new law of error, but actually calculating what it is in the given case. For the sake of simplicity we will not take a series with a very large number of terms in it, but it will be well to have enough of them to secure that our law of error shall roughly approximate in its form to the standard or exponential law.
We can now properly discuss the main question of logical interest in front of us; namely, why do we use averages or means? What is the exact nature and extent of the benefit we gain from this? The advanced student would likely prefer to tackle these questions by referring directly to the Law of Error in its ultimate or exponential form. However, I believe that the best way for those who want to clearly understand the logical nature of the process involved is to start by treating it as a question of combinations, similar to what we already know from basic algebra. In other words, we should consider a finite number of errors and see what happens when we average them. We can then calculate arithmetically the results of combining two or more of these errors to create a new series, not just settling for the general characteristics of the new law of error but actually determining what it is in this specific case. To keep things simple, we won't use a series with a very large number of terms, but it will be beneficial to have enough of them so that our law of error roughly matches the standard or exponential law.
For this purpose the law of error or divergence given by supposing our effort to be affected by ten causes, each of which produces an equal error, but which error is equally likely to be positive and negative (or, as it might perhaps be expressed, ‘ten equal and indifferently additive and subtractive causes’) will suffice. This is the lowest number formed according to the Binomial law, which will furnish to the eye a fair indication of the limiting or Exponential law.[5] The whole number of possible cases here is 2¹⁰ or 1024; that is, this is the number required to exhibit not only all the cases which can occur (for there are but eleven really distinct cases), but also the relative frequency with which each of these cases occurs in the long run. Of this total, 252 will be situated at the mean, representing the ‘true’ result, or that given when five of the causes of disturbance just neutralize the other five. Again, 210 will be at what we will call one unit's distance from the mean, or that given by six causes combining against four; and so on; until at the extreme distance of five places from the mean we get but one result, since in only one case out of the 1024 will all the causes combine together in the same direction. The set of 1024 efforts is therefore a fair representation of the distribution of an infinite number of such efforts. A graphical representation of the arrangement is given here.
For this purpose, the law of error or divergence can be understood by considering our efforts as influenced by ten factors, each contributing an equal amount of error, which can be either positive or negative (or, as we might say, ‘ten equal factors that can add or subtract equally’). This is the smallest number that fits the Binomial law, which provides a clear indication of the limiting or Exponential law. The total number of possible scenarios here is 2¹⁰ or 1024; that is, this is the number needed to represent not only all the scenarios that can occur (there are actually only eleven distinct scenarios), but also the relative frequency of each scenario in the long run. Out of this total, 252 will be at the mean, representing the ‘true’ outcome, where five of the disturbance factors exactly cancel out the other five. Additionally, 210 will be at what we’ll call one unit away from the mean, which occurs when six factors prevail over four; and so on, until we reach the extreme of five units away from the mean, where we’ll find only one outcome, as there is just one situation out of the 1024 where all factors align in the same direction. Therefore, the set of 1024 efforts is a fair representation of the distribution of an infinite number of such efforts. A graphical representation of this arrangement is provided here.
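The same arrangement can also be generated mechanically; a minimal sketch, in which the eleven distinct cases and their frequencies are the binomial coefficients of order ten:

from math import comb

total = 2 ** 10                                  # 1024 equally likely combinations of the ten causes
counts = [comb(10, k) for k in range(11)]        # 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1
print(total, sum(counts), counts)
# 252 cases sit at the mean (five causes each way), 210 at one unit's distance,
# and a single case lies at the extreme five units from the mean.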

§ 11. This representing a complete set of single observations or efforts, what will be the number and arrangement in the corresponding set of combined or reduced observations, say of two together? With regard to the number we must bear in mind that this is not a case of the combinations of things which cannot be repeated; for any given error, say the extreme one at F, can obviously be repeated twice running. Such a repetition would be a piece of very bad luck no doubt, but being possible it must have its place in the set. Now the possible number of ways of combining 1024 things two together, where the same thing may be repeated twice running, is 1024 × 1024 or 1048576. 477 This then is the number in a complete cycle of the results taken two and two together.
§ 11. This represents a complete set of individual observations or efforts. What will the number and arrangement be in the corresponding set of combined or reduced observations, say of two together? Regarding the number, we must remember that this isn't a case of combinations of things that cannot be repeated; for any given error, like the extreme one at F, can obviously be repeated consecutively. Such a repetition would indeed be quite unlucky, but since it's possible, it must be included in the set. Now, the possible number of ways to combine 1024 items in pairs, where the same item can be repeated consecutively, is 1024 × 1024 or 1048576. This, then, is the number in a complete cycle of the results taken two at a time.
§ 12. So much for their number; now for their arrangement or distribution. What we have to ascertain is, firstly, how many times each possible pair of observations will present itself; and, secondly, where the new results, obtained from the combination of each pair, are to be placed. With regard to the first of these enquiries;—it will be readily seen that on one occasion we shall have F repeated twice; on 20 occasions we shall have F combined with E (for F coming first we may have it followed by any one of the 10 at E, or any one of these may be followed by F); E can be repeated in 10 × 10, or 100 ways, and so on.
§ 12. That covers their number; now let’s talk about their arrangement or distribution. What we need to find out is, first, how many times each possible pair of observations will show up, and second, where to place the new results obtained from combining each pair. Regarding the first question, it’s easy to see that we will have F appear twice on one occasion; on 20 occasions, we will see F combined with E (with F first, it can be followed by any of the 10 at E, or any of these can be followed by F); E can repeat in 10 × 10, or 100 different ways, and so on.
Now for the position of each of these reduced observations, the relative frequency of whose component elements has thus been pointed out. This is easy to determine, for when we take two errors there is (as was seen) scarcely any other mode of treatment than that of selecting the mid-point between them; this mid-point of course becoming identical with each of them when the two happen to coincide. It will be seen therefore that F will recur once on the new arrangement, viz. by its being repeated twice on the old one. G midway between E and F, will be given 20 times. E, on our new arrangement, can be got at in two ways, viz. by its being repeated twice (which will happen 100 times), and by its being obtained as the mid-point between D and F (which will happen 90 times). Hence E will occur 190 times altogether.
Now regarding the position of each of these reduced observations, whose component elements’ relative frequency has been noted. This is easy to determine, because when we look at two errors, there is really no other way to handle them than by choosing the mid-point between them; this mid-point, of course, becomes the same as each error when they coincide. Thus, F will appear once in the new arrangement, since it is repeated twice in the old one. G, positioned between E and F, will be counted 20 times. In our new arrangement, E can be arrived at in two ways: by being repeated twice (which will happen 100 times), and by being taken as the mid-point between D and F (which will happen 90 times). Therefore, E will occur a total of 190 times.
The reader who chooses to take the trouble may work out the frequency of all possible occurrences in this way, and if the object were simply to illustrate the principle in accordance with which they occur, this might be the best way of proceeding. But as he may soon be able to observe, and as the mathematician would at once be able to prove, the new ‘law of facility of error’ can be got at more quickly deductively, viz. by taking the successive terms of the expansion of (1 + 1)²⁰. They are given, below the line, in the figure on p. 476.
The reader who is willing to put in the effort can figure out the frequency of all possible occurrences this way, and if the goal was just to demonstrate the principle behind them, this might be the best approach. However, as they will soon notice, and as any mathematician could quickly prove, the new ‘law of facility of error’ can be reached more efficiently through deduction, specifically by examining the successive terms of the expansion of (1 + 1)²⁰. They are shown below the line in the figure on p. 476.
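The deductive shortcut can also be checked by brute force: averaging all 1024 × 1024 ordered pairs of single results and counting how often each mid-point occurs reproduces the successive terms of (1 + 1)²⁰, including the figures 1, 20 and 190 worked out above. A small sketch of that check:

from math import comb
from collections import Counter

single = {k: comb(10, k) for k in range(11)}     # frequencies of the 1024 single results

pairs = Counter()
for k1, f1 in single.items():
    for k2, f2 in single.items():
        pairs[k1 + k2] += f1 * f2                # k1 + k2 fixes the pair's mid-point

print(sum(pairs.values()))                       # 1048576, i.e. 1024 * 1024
print([pairs[j] for j in range(21)])             # 1, 20, 190, ... the terms of (1 + 1)^20
print(all(pairs[j] == comb(20, j) for j in range(21)))   # True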
§ 13. There are two apparent obstacles to any direct comparison between the distribution of the old set of simple observations, and the new set of combined or reduced ones. In the first place, the number of the latter is much greater. This, however, is readily met by reducing them both to the same scale, that is by making the same total number of each. In the second place, half of the new positions have no representatives amongst the old, viz. those which occur midway between F and E, E and D, and so on. This can be met by the usual plan of interpolation, viz. by filling in such gaps by estimating what would have been the number at the missing points, on the same scale, had they been occupied. Draw a curve through the vertices of the ordinates at A, B, C, &c., and the lengths of the ordinates at the intermediate points will very fairly represent the corresponding frequency of the errors of those magnitudes respectively. When the gaps are thus filled up, and the numbers thus reduced to the same scale, we have a perfectly fair basis of comparison. (See figure on next page.)
§ 13. There are two obvious obstacles to directly comparing the distribution of the old set of simple observations and the new set of combined or reduced ones. First, the number of the latter is much larger. However, this can easily be addressed by reducing both sets to the same scale, meaning they have the same total number. Second, half of the new positions do not have counterparts among the old, specifically those that fall between F and E, E and D, and so on. This can be handled using the usual method of interpolation, which involves estimating the values that would have existed at the missing points on the same scale if they had been present. Draw a curve through the vertices of the ordinates at A, B, C, and so on, and the lengths of the ordinates at the intermediate points will represent the corresponding frequency of the errors of those magnitudes quite accurately. Once the gaps are filled in this way and the numbers reduced to the same scale, we have a completely fair basis for comparison. (See figure on next page.)
Similarly we might proceed to group or ‘reduce’ three observations, or any greater number. The number of possible groupings naturally becomes very much larger, being (1024)³ when they are taken three together. As soon as we get to three or more observations, we have (as already pointed out) a variety of possible modes of treatment or reduction, of which that of taking the arithmetical mean is but one.
Similarly, we could group or "reduce" three observations, or even more. The number of possible groupings gets much larger, reaching (1024)³ when three are grouped together. Once we have three or more observations, there are (as mentioned before) various ways to handle or reduce them, with calculating the average being just one option.
§ 14. The following figure is intended to illustrate the nature of the advantage secured by thus taking the arithmetical mean of several observations.
§ 14. The following figure is meant to show the benefit gained by taking the average of multiple observations.
The curve ABCD represents the arrangement of a given number of ‘errors’ supposed to be disposed according to the binomial law already mentioned, when the angles have been smoothed off by drawing a curve through them. A′CD′ represents the similar arrangement of the same number when given not as simple errors, but as averages of pairs of errors. A″BD″, again, represents the similar arrangement obtained as averages of errors taken three together. They are drawn as carefully to scale as the small size of the figure permits.
The curve ABCD shows how a specific number of 'errors' are arranged according to the binomial law mentioned earlier, with the angles smoothed out by connecting them with a curve. A′CD′ illustrates the same arrangement, but this time it's presented as averages of pairs of errors instead of just simple errors. A″BD″, on the other hand, indicates the arrangement derived from averages of three errors taken together. They are drawn as accurately to scale as the small size of the figure allows.

§ 15. A glance at the above figure will explain to the reader, better than any verbal description, the full significance of the statement that the result of combining two or more measurements or observations together and taking the average of them, instead of stopping short at the single elements, is to make large errors comparatively more scarce. The advantage is of the same general description as that of fishing in a lake where, of the same number of fish, there are more big and fewer little ones than in another water: of 480 dipping in a bag where of the same number of coins there are more sovereigns and fewer shillings; and so on. The extreme importance, however, of obtaining a perfectly clear conception of the subject may render it desirable to work this out a little more fully in detail.
§ 15. Taking a look at the figure above illustrates, better than any words could, the significance of the idea that combining two or more measurements or observations and calculating their average, rather than relying on a single measurement, reduces the likelihood of significant errors. This benefit is similar to fishing in a lake where, with the same number of fish, there are more large fish and fewer small ones compared to another body of water; or reaching into a bag where, among the same number of coins, there are more gold coins and fewer pennies. However, it's crucial to have a clear understanding of this topic, so it may be useful to explore it in a bit more detail.
For one thing, then, it must be clearly understood that the result of a set of ‘averages’ of errors is nothing else than another set of ‘errors,’ No device can make the attainment of the true result certain,—to suppose the contrary would be to misconceive the very foundations of Probability,—no device even can obviate the possibility of being actually worse off as the result of our labour. The average of two, three, or any larger number of single results, may give a worse result, i.e. one further from the ultimate average, than was given by the first observation we made. We must simply fall back upon the justification that big deviations are rendered scarcer in the long run.
For one thing, it's important to understand that the result of a set of "averages" of errors is just another set of "errors." No tool can guarantee that we will achieve the true result—believing otherwise would be a misunderstanding of the basics of Probability. No tool can prevent the chance of ending up worse off as a result of our efforts. The average of two, three, or any larger number of individual results can yield a worse outcome, meaning it could be further from the ultimate average than the first observation we made. We can only rely on the idea that large deviations become less common over time.
Again; it may be pointed out that though, in the above investigation, we have spoken only of the arithmetical average as commonly understood and employed, the same general results would be obtained by resorting to almost any symmetrical and regular mode of combining our observations or errors. The two main features of the regularity displayed by the Binomial Law of facility were (1) ultimate symmetry about the central or true result, and (2) increasing relative frequency as this centre was approached. A very little consideration will show that it is no peculiar prerogative of the arithmetical mean to retain the former of these and to increase the latter. In saying this, however, a distinction must be attended to for which it will be convenient to refer to a figure.
Again, it's worth noting that although we've talked only about the arithmetic average in the investigation above, you would get similar results using almost any symmetrical and regular way of combining our observations or errors. The two main characteristics of the regularity shown by the Binomial Law of facility were (1) symmetry, in the long run, around the central or true result, and (2) increasing relative frequency as we got closer to this center. A little thought will reveal that it's not just the arithmetic mean that maintains the first characteristic and enhances the second. However, in making this point, we need to pay attention to a distinction that it will be helpful to illustrate with a figure.
§ 16. Suppose that O, in the line D′OD, was the point aimed at by any series of measurements; or, what comes to 481 the same thing for our present purpose, was the ultimate average of all the measurements made. What we mean by a symmetrical arrangement of the values in regard to O, is that for every error OB, there shall be in the long run a precisely corresponding opposite one OB′; so that when we erect the ordinate BQ, indicating the frequency with which B is yielded, we must erect an equal one B′Q′. Accordingly the two halves of the curve on each side of P, viz. PQ and PQ′ are precisely alike.
§ 16. Let's say that O, in the line D′OD, was the point targeted by any series of measurements; or, for our current purposes, was the final average of all the measurements taken. What we mean by a symmetrical arrangement of the values regarding O is that for every error OB, there will, over time, be a perfectly corresponding opposite one OB′; so that when we create the ordinate BQ, which shows how often B occurs, we must create an equal one B′Q′. As a result, the two halves of the curve on each side of P, namely PQ and PQ′, are exactly the same.

It then readily follows that the secondary curve, viz. that marking the law of frequency of the averages of two or more simple errors, will also be symmetrical. Consider any three points B, C, D: to these correspond another three B′, C′, D′. It is obvious therefore that any regular and symmetrical mode of dealing with all the groups, of which BCD is a sample, will result in symmetrical arrangement about the centre O. The ordinary familiar arithmetical average is but one out of many such modes. One way of describing it is by saying that the average of B, C, D, is assigned by choosing a point such that the sum of the squares of its distances from 482 B, C, D, is a minimum. But we might have selected a point such that the cubes, or the fourth powers, or any higher powers should be a minimum. These would all yield curves resembling in a general way the dotted line in our figure. Of course there would be insuperable practical objections to any such courses as these; for the labour of calculation would be enormous, and the results so far from being better would be worse than those afforded by the employment of the ordinary average. But so far as concerns the general principle of dealing with discordant and erroneous results, it must be remembered that the familiar average is but one out of innumerable possible resources, all of which would yield the same sort of help.
It naturally follows that the secondary curve, which shows the frequency of the averages of two or more simple errors, will also be symmetrical. Consider any three points B, C, D: they correspond to another three B′, C′, D′. It's clear that any consistent and symmetrical method of working with all the groups, of which BCD is an example, will lead to a symmetrical arrangement around the center O. The standard arithmetic average is just one of many such methods. One way to express it is by saying that the average of B, C, D is determined by selecting a point where the sum of the squares of its distances from B, C, D is minimized. However, we could also choose a point where the cubes, or the fourth powers, or any higher powers are minimized. These would all create curves that generally resemble the dotted line in our figure. Of course, there would be significant practical drawbacks to these approaches; the calculation would be incredibly labor-intensive, and the results would likely be worse than those obtained using the standard average. But regarding the general principle of handling inconsistent and erroneous results, it’s important to note that the familiar average is just one option among countless possible methods, all of which would provide similar assistance.
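The point about alternative 'averages' can be illustrated numerically. The sketch below locates, by a crude grid search, the point minimising the sum of squared distances (which is the ordinary mean) and the point minimising the sum of fourth powers; the three observation values are arbitrary, chosen only for illustration.

def minimiser(values, power, step=0.001):
    # Crude grid search for the point minimising the sum of |v - p|**power.
    lo, hi = min(values), max(values)
    candidates = [lo + step * i for i in range(int((hi - lo) / step) + 1)]
    return min(candidates, key=lambda p: sum(abs(v - p) ** power for v in values))

observations = [2.0, 3.0, 7.0]
print(round(minimiser(observations, 2), 3))   # about 4.0, the ordinary arithmetic mean
print(round(minimiser(observations, 4), 3))   # about 4.4, a different but equally symmetric 'average'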
§ 17. Once more. We saw that a resort to the average had the effect of ‘humping up’ our curve more towards the centre, expressive of the fact that the errors of averages are of a better, i.e. smaller kind. But it must be noticed that exactly the same characteristics will follow, as a general rule, from any other such mode of dealing with the individual errors. No strict proof of this fact can be given here, but a reference to one of the familiar results of taking combinations of things will show whence this tendency arises. Extreme results, as yielded by an average of any kind, can only be got in one way, viz. by repetitions of extremes in the individuals from which the averages were obtained. But intermediate results can be got at in two ways, viz. either by intermediate individuals, or by combinations of individuals in opposite directions. In the case of the Binomial Law of Error this tendency to thicken towards the centre was already strongly predominant in the individual values before we took them in hand for our average; but owing to this characteristic of combinations we may lay it down (broadly speaking) that any sort of average applied to any sort of law 483 of distribution will give a result which bears the same general relation to the individual values that the dotted lines above bear to the black line.[6]
§ 17. Once again. We observed that using the average made our curve rise more towards the center, showing that the errors in averages tend to be better, or smaller. However, it's important to note that the same characteristics typically apply to any other method of handling individual errors. While a strict proof of this cannot be provided here, referring to one of the well-known results from combinations will illustrate where this tendency comes from. Extreme results, regardless of the type of average used, can only occur through repeated extremes in the individuals that provided the averages. On the other hand, intermediate results can come from two sources: either from intermediate individuals or from combining individuals in opposite directions. For the Binomial Law of Error, this tendency to concentrate around the center was already evident in the individual values before we averaged them. Due to this characteristic of combinations, we can generally say that any average applied to any distribution law will yield results that relate to the individual values in the same way that the dotted lines above relate to the black line.[6]
§ 18. This being so, the speculative advantages of one method of combining, or averaging, or reducing, our observations, over another method,—irrespective, that is, of the practical conveniences in carrying them out,—will consist solely in the degree of rapidity with which it tends thus to cluster the result about the centre. We shall have to subject this merit to a somewhat further analysis, but for the present purpose it will suffice to say that if one kind of average gave the higher dotted line in the figure on p. 479 and another gave the lower dotted line, we should say that the former was the better one. The advantage is of the same general kind as that which is furnished in algebraical calculation, by a series which converges rapidly towards the true value as compared with one which converges slowly. We can do the work sooner or later by the aid of either; but we get nearer the truth by the same amount of labour, or get as near by a less amount of labour, on one plan than on the other.
§ 18. Given this, the speculative benefits of one way of combining, averaging, or simplifying our observations over another—regardless of the practical ease of implementing them—will depend solely on how quickly it brings the results closer to the center. We will need to analyze this benefit further, but for now, it’s enough to say that if one type of average produced the higher dotted line in the figure on p. 479 and another produced the lower dotted line, we would consider the former to be superior. This advantage is similar to what is provided in algebraic calculations by a series that approaches the true value quickly compared to one that does so slowly. We can complete the work sooner or later using either method, but one approach gets us closer to the truth with the same amount of effort, or reaches the same level of accuracy with less effort, than the other.
As we are here considering the case in which the individual observations are supposed to be grouped in accordance 484 with the Binomial Law, it will suffice to say that in this case there is no doubt that the arithmetical average is not only the simplest and easiest to deal with, but is also the best in the above sense of the term. And since this Binomial Law, or something approximating to it, is of very wide prevalence, a strong primâ facie case is made out for the general employment of the familiar average.
As we examine the scenario where individual observations are grouped according to the Binomial Law, it’s clear that the arithmetic average is not just the simplest and easiest to handle, but also the most effective in this context. Given that this Binomial Law, or something close to it, is widely seen, there is a strong initial argument for using the familiar average in general.
§ 19. The analysis of a few pages back carried the results of the averaging process as far as could be conveniently done by the help of mere arithmetic. To go further we must appeal to higher mathematics, but the following indication of the sort of results obtained will suffice for our present purpose. After all, the successive steps, though demanding intricate reasoning for their proof, are nothing more than generalizations of processes which could be established by simple arithmetic.[7] Briefly, what we do is this:—
§ 19. The analysis from a few pages ago showed the results of the averaging process as far as we could comfortably go using just basic arithmetic. To explore further, we need to turn to more advanced mathematics, but the following examples of the kinds of results we've obtained will be enough for what we need right now. Ultimately, the steps we take, while requiring complex reasoning to prove, are really just generalizations of processes that could be established with simple arithmetic.[7] In short, what we do is this:—
(1) We first extend the proof from the binomial form, with its finite number of elements, to the limiting or exponential form. Instead of confining ourselves to a small number of discrete errors, we then recognize the possibility of any number of errors of any magnitude whatever.
(1) First, we expand the proof from the binomial form, which has a limited number of elements, to the limiting or exponential form. Rather than limiting ourselves to a few specific errors, we acknowledge the potential for an unlimited number of errors of any size.
(2) In the next place, instead of confining ourselves to the consideration of an average of two or three only,—already, as we have seen, a tedious piece of arithmetic,—we calculate the result of an average of any number, n. The actual result is extremely simple. If the modulus of the single errors is c, that of the average of n of these will be c ÷ √n.
(2) Next, instead of limiting ourselves to looking at an average of just two or three—already, as we have seen, a rather tedious piece of arithmetic—we calculate the result of an average of any number, n. The actual result is quite straightforward. If the modulus of the individual errors is c, then the modulus of the average of n of these will be c ÷ √n. (A short numerical check of this rule is sketched just after point (3) below.)
(3) Finally we draw similar conclusions in reference to the sum or difference of two averages of any numbers. Suppose, 485 for instance, that m errors were first taken and averaged, and then n similarly taken and averaged. These averages will be nearly, but not quite, equal. Their sum or difference,—these, of course, are indistinguishable in the end, since positive and negative errors are supposed to be equal and opposite,—will itself be an ‘error’, every magnitude of which will have a certain assignable probability or facility of occurrence. What we do is to assign the modulus of these errors. The actual result again is simple. If c had been the modulus of the single errors, that of the sum or difference of the averages of m and n of them will be
(3) Finally, we reach similar conclusions regarding the sum or difference of two averages of any numbers. For example, let's say that m errors were first collected and averaged, and then n were similarly collected and averaged. These averages will be close to equal, but not exactly the same. Their sum or difference—these are essentially the same in the end, since positive and negative errors are assumed to balance out—will itself be an ‘error,’ each with a specific assignable probability of occurrence. Our job is to determine the modulus of these errors. The actual result is straightforward. If c was the modulus of the individual errors, then the modulus of the sum or difference of the averages of m and n will be c√(1/m + 1/n).
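Here is the Monte Carlo check promised under point (2): the spread of averages of n errors shrinks as the square root of n. Gaussian errors are simulated purely for illustration, with the modulus c of the single errors set arbitrarily to 2.

import random
from math import sqrt

random.seed(1)
c = 2.0                            # modulus of the single errors (arbitrary)
sigma = c / sqrt(2)                # a modulus c corresponds to a standard deviation of c/sqrt(2)

def modulus_of_averages(n, trials=100_000):
    total_sq = 0.0
    for _ in range(trials):
        avg = sum(random.gauss(0, sigma) for _ in range(n)) / n
        total_sq += avg * avg
    return sqrt(2 * total_sq / trials)      # doubled mean square, then the square root

for n in (1, 4, 16):
    # Observed modulus of the averages, alongside the predicted c / sqrt(n).
    print(n, round(modulus_of_averages(n), 3), round(c / sqrt(n), 3))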
§ 20. So far, the problem under investigation has been of a direct kind. We have supposed that the ultimate mean value or central position has been given to us; either à priori (as in many games of chance), or from more immediate physical considerations (as in aiming at a mark), or from extensive statistics (as in tables of human stature). In all such cases therefore the main desideratum is already taken for granted, and it may reasonably be asked what remains to be done. The answers are various. For one thing we may want to estimate the value of an average of many when compared with an average of a few. Suppose that one man has collected statistics including 1000 instances, and another has collected 4000 similar instances. Common sense can recognize that the latter are better than the former; but it has no idea how much better they are. Here, as elsewhere, quantitative precision is the privilege of science. The answer we receive from this quarter is that, in the long run, the modulus,—and with this the probable error, the mean error, and the error of mean square, which all vary in proportion,—diminishes 486 inversely as the square root of the number of measurements or observations. (This follows from the second of the above formulæ.) Accordingly the probable error of the more extensive statistics here is one half that of the less extensive. Take another instance. Observation shows that “the mean height of 2,315 criminals differs from the mean height of 8,585 members of the general adult population by about two inches” (v. Edgeworth, Methods of Statistics: Stat. Soc. Journ. 1885). As before, common sense would feel little doubt that such a difference was significant, but it could give no numerical estimate of the significance. Appealing to science, we see that this is an illustration of the third of the above formulæ. What we really want to know is the odds against the averages of two large batches differing by an assigned amount: in this case by an amount equalling twenty-five times the modulus of the variable quantity. The odds against this are many billions to one.
§ 20. So far, the issue we’re looking at has been straightforward. We’ve assumed that the ultimate average value or central point has been provided to us, either à priori (like in many games of chance), from immediate physical factors (like aiming at a target), or from extensive statistics (like tables of human height). In all these cases, the main requirement is already assumed, and one might reasonably ask what else needs to be done. The answers can vary. For one, we may want to evaluate the average of many compared to the average of a few. Imagine one person collects statistics with 1,000 cases, and another gathers 4,000 similar cases. Common sense recognizes that the latter is better than the former; however, it doesn’t know how much better. Here, as in other areas, precise quantitative analysis is the domain of science. The answer from this perspective is that, over time, the modulus—and with it the probable error, the mean error, and the mean square error, which all change in relation—decreases inversely with the square root of the number of measurements or observations. (This comes from the second of the above formulas.) Accordingly, the probable error of the more extensive statistics is half that of the less extensive. Consider another example. Observation indicates that “the mean height of 2,315 criminals differs from the mean height of 8,585 members of the general adult population by about two inches” (v. Edgeworth, Methods of Statistics: Stat. Soc. Journ. 1885). As before, common sense would likely conclude that such a difference is significant, but it couldn’t provide a numerical estimate of that significance. Turning to science, we see this as an example of the third of the above formulas. What we're really trying to find out is the odds against the averages of two large groups differing by a specified amount: in this case, by an amount equal to twenty-five times the modulus of the variable quantity. The odds against this are many billions to one.
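Both illustrations in this section can be put into figures. In the sketch below the modulus of a single height measurement is taken, purely for illustration, to be about 3.6 inches; the batch sizes and the two-inch gap are the ones quoted from Edgeworth, and erfc gives the chance of an error exceeding a stated multiple of its modulus.

from math import sqrt, erfc

# The 1000-instance batch has twice the probable error of the 4000-instance batch.
print(sqrt(4000 / 1000))                        # 2.0

# Edgeworth's comparison: modulus of the difference of the two batch averages.
m, n = 2315, 8585                               # sizes of the two batches
c = 3.6                                         # assumed modulus (inches) of single measurements
modulus_of_difference = c * sqrt(1 / m + 1 / n)
t = 2.0 / modulus_of_difference                 # the two-inch gap, in units of that modulus
print(round(modulus_of_difference, 3), round(t, 1))   # roughly 0.084 inches; about 24 moduli here
print(erfc(t))                                  # the chance of so large a gap arising by accident,
                                                # indeed many billions (and far more) to one against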
§ 21. The number of direct problems which will thus admit of solution is very great, but we must confine ourselves here to the main inverse problem to which the foregoing discussion is a preliminary. It is this. Given a few only of one of these groups of measurements or observations; what can we do with these, in the way of determining that mean about which they would ultimately be found to cluster? Given a large number of them, they would betray the position of their ultimate centre with constantly increasing certainty: but we are now supposing that there are only a few of them at hand, say half a dozen, and that we have no power at present to add to the number.
§ 21. There are many direct problems that can be solved, but we will focus here on the main inverse problem that the previous discussion introduces. This problem is: if we have only a few measurements or observations from one of these groups, what can we do to determine the average value around which they would ultimately cluster? With a large number of measurements, their ultimate center becomes increasingly clear, but we are assuming we only have a few, say half a dozen, and that we cannot add to this number for now.
In other words,—expressing ourselves by the aid of graphical illustration, which is perhaps the best method for the novice and for the logical student,—in the direct problem we merely have to draw the curve of frequency from 487 a knowledge of its determining elements; viz. the position of the centre, and the numerical value of the modulus. In the inverse problem, on the other hand, we have three elements at least, to determine. For not only must we, (1), as before, determine whereabouts the centre may be assumed to lie; and (2), as before, determine the value of the modulus or degree of dispersion about this centre. This does not complete our knowledge. Since neither of these two elements is assigned with certainty, we want what is always required in the Theory of Chances, viz. some estimate of their probable truth. That is, after making the best assignment we can as to the value of these elements, we want also to assign numerically the ‘probable error’ committed in such assignment. Nothing more than this can be attained in Probability, but nothing less than this should be set before us.
In other words—using graphical illustrations, which is probably the best way for beginners and logical learners—in the direct problem, we simply need to plot the frequency curve based on its determining elements: the position of the center and the numerical value of the modulus. In the inverse problem, however, we have at least three elements to figure out. Not only do we need to (1) determine where the center is likely to be positioned, and (2) determine the value of the modulus or the degree of dispersion around this center, but that’s not all. Since neither of these two elements is known with certainty, we also need what is always necessary in Probability Theory: an estimate of their likely accuracy. That is, after making the best estimate we can about these values, we also need to numerically assess the ‘probable error’ associated with that estimate. Nothing more than this can be achieved in Probability, but nothing less than this should be our goal.
§ 22. (1) As regards the first of these questions, the answer is very simple. Whether the number of measurements or observations be few or many, we must make the assumption that their average is the point we want; that is, that the average of the few will coincide with the ultimate average. This is the best, in fact the only assumption we can make. We should adopt this plan, of course, in the extreme case of there being only one value before us, by just taking that one; and our confidence increases slowly with the number of values before us. The only difference therefore here between knowledge resting upon such data, and knowledge resting upon complete data, lies not in the result obtained but in the confidence with which we entertain it.
§ 22. (1) Regarding the first question, the answer is quite straightforward. Whether we have few measurements or many, we must assume that their average is the point we need; that is, the average of the few will match the ultimate average. This is the best, and really the only assumption we can make. We should follow this approach, especially in the extreme case where we only have one value available, by simply using that one; and our confidence grows gradually as we have more values to consider. Therefore, the only difference between knowledge based on such data and knowledge based on complete data lies not in the result but in the confidence we have in it.
§ 23. (2) As regards the second question, viz. the determination of the modulus or degree of dispersion about the mean, much the same may be said. That is, we adopt the same rule for the determination of the E.M.S. (error of mean square) by which the modulus is assigned, as we should 488 adopt if we possessed full Information. Or rather we are confined to one of the rules given on p. 473, viz. the second, for by supposition we have neither the à priori knowledge which would be able to supply the first, nor a sufficient number of observations to justify the third. That is, we reckon the errors, measured from the average, and calculate their mean square: twice this is equal to the square of the modulus of the probable curve of facility.[8]
§ 23. (2) Regarding the second question, which is about determining the modulus or degree of dispersion around the mean, we can say much the same. In other words, we follow the same process for calculating the E.M.S. (error of mean square), which defines the modulus, as we would if we had complete information. Actually, we are limited to one of the rules mentioned on p. 473, specifically the second one, because we do not have the à priori knowledge needed for the first rule, nor do we have enough observations to support the third. In essence, we assess the errors from the average and compute their mean square: twice this value equals the square of the modulus of the probable curve of facility.[8]
§ 24. (3) The third question demands for its solution somewhat advanced mathematics; but the results can be indicated without much difficulty. A popular way of stating our requirement would be to say that we want to know how likely it is that the mean of the few, which we have thus accepted, shall coincide with the true mean. But this would be to speak loosely, for the chances are of course indefinitely great against such precise coincidence. What we really do is to assign the ‘probable error’; that is, to assign a limit which it is as likely as not that the discrepancy between the inferred mean and the true mean should exceed.[9] To take a numerical example: suppose we had made several 489 measurements of a wall with a tape, and that the average of these was 150 feet. The scrupulous surveyor would give us this result, with some such correction as this added,—‘probable error 3 inches’. All that this means is that we may assume that the true value is 150 feet, with a confidence that in half the cases (of this description) in which we did so, we should really be within three inches of the truth.
§ 24. (3) The third question requires some advanced mathematics for its solution, but we can express the results without much hassle. A popular way to state our issue would be to say that we want to know how likely it is that the average of the few measurements we've accepted matches the true average. However, that phrasing is a bit loose, because the chances of such a precise match are, of course, extremely low. What we actually do is assign the ‘probable error’; that is, fix a limit which the difference between the inferred average and the true average is as likely as not to exceed.[9] For example: suppose we took several measurements of a wall with a tape and found the average to be 150 feet. A scrupulous surveyor would present this result along with a note saying, ‘probable error 3 inches.’ This means we can assume the true value is 150 feet, with the confidence that in half of similar cases we would actually be within three inches of the true value.
The expression for this probable error is a simple multiple of the modulus: it is the modulus multiplied by 0.4769…. That it should be some function of the modulus, or E.M.S., seems plausible enough; for the greater the errors,—in other words the wider the observed discrepancy amongst our measurements,—the less must be the confidence we can feel in the accuracy of our determination of the mean. But, of course, without mathematics we should be quite unable to attempt any numerical assignment.
The formula for this probable error is just a straightforward multiple of the modulus: it’s the modulus times 0.4769…. It makes sense that it would be some function of the modulus, or E.M.S., because the larger the errors—in other words, the greater the differences in our measurements—the less confidence we can have in the accuracy of our average. But, of course, without math, we wouldn’t be able to make any numerical calculations.
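A short check of the 0.4769 figure, and of the surveyor's example above; erf gives the chance that an error falls within a stated multiple of the modulus.

from math import erf

probable_error_factor = 0.4769
# Half of all errors fall within 0.4769 of a modulus of the truth.
print(round(erf(probable_error_factor), 4))       # 0.5

# The surveyor's 'probable error 3 inches' therefore implies a modulus of about
# 3 / 0.4769 = 6.3 inches for the determination of the wall's length.
print(round(3 / probable_error_factor, 1))        # 6.3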
§ 25. The general conclusion therefore is that the determination of the curve of facility,—and therefore ultimately of every conclusion which rests upon a knowledge of this curve,—where only a few observations are available, is of just the same kind as where an infinity are available. The rules for obtaining it are the same, but the confidence with which it can be accepted is less.
§ 25. The overall conclusion is that determining the curve of facility—and therefore any conclusion that depends on knowledge of this curve—when only a few observations are available is of just the same kind as when countless observations are available. The methods for determining it are the same, but the confidence with which it can be accepted is lower.
The knowledge, therefore, obtainable by an average of a small number of measurements of any kind, hardly differs except in degree from that which would be attainable by an indefinitely extensive series of them. We know the same sort of facts, only we are less certain about them. But, on the other hand, the knowledge yielded by an average even of a small number differs in kind from that which is yielded by a single measurement. Revert to our marksman, whose bullseye is supposed to have been afterwards removed. If he had fired only a single shot, not only should we be less 490 certain of the point he had aimed at, but we should have no means whatever of guessing at the quality of his shooting, or of inferring in consequence anything about the probable remoteness of the next shot from that which had gone before. But directly we have a plurality of shots before us, we not merely feel more confident as to whereabouts the centre of aim was, but we also gain some knowledge as to how the future shots will cluster about the spot thus indicated. The quality of his shooting begins at once to be betrayed by the results.
The knowledge we get from an average of a small number of measurements isn't really different in type, just in degree, from what we could gather from a large number of them. We understand the same types of facts, but we’re less certain about them. However, the knowledge gained from averaging even a small number is different in nature from that which comes from just a single measurement. Think about our marksman, whose bullseye is presumed to have been removed afterward. If he only took one shot, not only would we be less confident about where he aimed, but we wouldn’t have any way to guess how well he shot or deduce anything about how far the next shot might be from the first one. But as soon as we have multiple shots to look at, we not only feel more certain about where the center of aim was, but we also gain insight into how the future shots are likely to cluster around that indicated spot. The quality of his shooting immediately starts to show in the results.
§ 26. Thus far we have been supposing the Law of Facility to be of the Binomial type. There are several reasons for discussing this at such comparative length. For one thing it is the only type which,—or something approximately resembling which,—is actually prevalent over a wide range of phenomena. Then again, in spite of its apparent intricacy, it is really one of the simplest to deal with; owing to the fact that every curve of facility derived from it by taking averages simply repeats the same type again. The curve of the average only differs from that of the single elements in having a smaller modulus; and its modulus is smaller in a ratio which is exceedingly easy to give. If that of the one is c, that of the other (derived by averaging n single elements) is c/√n.
§ 26. So far, we've been assuming that the Law of Facility is of the Binomial type. There are several reasons to discuss this in such detail. For one, it’s the only type that is, or something very similar to it, actually common across a wide range of phenomena. Additionally, despite its apparent complexity, it’s one of the simplest to work with because every curve of facility generated from it by averaging just repeats the same type again. The average curve only differs from that of the individual elements by having a smaller modulus, and its modulus is smaller by a ratio that is very easy to express. If the modulus of one is c, then that of the other (derived by averaging n individual elements) is c/√n.
But for understanding the theory of averages we must consider other cases as well. Take then one which is intrinsically as simple as it possibly can be, viz. that in which all values within certain assigned limits are equally probable. This is a case familiar enough in abstract Probability, though, as just remarked, not so common in natural phenomena. It is the state of things when we act at random directly upon 491 the objects of choice;[10] as when, for instance, we choose digits at random out of a table of logarithms.
But to understand the theory of averages, we must also consider other scenarios. Let's take one that is as simple as it can be, namely that in which all values within certain assigned limits are equally likely. This situation is quite familiar in abstract Probability, although, as mentioned earlier, it's not so common in natural events. It's what happens when we randomly act directly on the objects of choice; for example, when we randomly select digits from a table of logarithms.
The reader who likes to do so can without much labour work out the result of taking an average of two or three results by proceeding in exactly the same way which we adopted on p. 476. The ‘curve of facility’ with which we have to start in this case has become of course simply a finite straight line. Treating the question as one of simple combinations, we may divide the line into a number of equal parts, by equidistant points; and then proceed to take these two and two together in every possible way, as we did in the case discussed some pages back.
The reader who wants to can easily figure out the average of two or three results by following the same method we used on p. 476. The ‘curve of facility’ we begin with in this case has now simply turned into a finite straight line. By looking at the question as a matter of simple combinations, we can split the line into several equal parts using equidistant points; then we can pair these parts together in every possible way, just like we did in the previous discussion a few pages back.
If we did so, what we should find would be this. When an average of two is taken, the ‘curve of facility’ of the average becomes a triangle with the initial straight line for base; so that the ultimate mean or central point becomes the likeliest result even with this commencement of the averaging process. If we were to take averages of three, four, and so on, what we should find would be that the Binomial law begins to display itself here. The familiar bell shape of the exponential curve would be more and more closely approximated to, until we obtained something quite indistinguishable from it.
If we did this, here’s what we would find. When we take an average of two, the ‘curve of facility’ of the average forms a triangle with the initial straight line as its base; thus, the ultimate mean or central point becomes the most likely outcome even at this first step of the averaging process. If we were to take averages of three, four, and so on, we would see the Binomial law starting to show itself. The results would come closer and closer to the familiar bell shape of the exponential curve, until we obtained something quite indistinguishable from it.
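A simulation sketch of this progression, with random values chosen uniformly in place of digits from a table of logarithms; the bin counts are only meant to show the changing shape.

import random
from collections import Counter

random.seed(7)

def shape(n_per_average, trials=50_000, bins=10):
    # Histogram of 'trials' averages, each taken over n_per_average uniform values.
    counts = Counter()
    for _ in range(trials):
        avg = sum(random.random() for _ in range(n_per_average)) / n_per_average
        counts[min(int(avg * bins), bins - 1)] += 1
    return [counts[b] for b in range(bins)]

print(shape(1))   # roughly flat: the original 'curve of facility' is a straight line
print(shape(2))   # a triangle peaked at the centre
print(shape(6))   # already very close to the bell-shaped exponential curve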
§ 27. The conclusion therefore is that when we are dealing with averages involving a considerable number it is not necessary, in general, to presuppose the binomial law of distribution in our original data. The law of arrangement of what we may call the derived curve, viz. that corresponding to the averages, will not be appreciably affected thereby. Accordingly we seem to be justified in bringing to bear all 492 the same apparatus of calculation as in the former case. We take the initial average as the probable position of the true centre or ultimate average: we estimate the probability that we are within an assignable distance of the truth in so doing by calculating the ‘error of mean square’; and we appeal to this same element to determine the modulus, i.e. the amount of contraction or dispersion, of our derived curve of facility.
§ 27. The conclusion is that when we’re working with averages that involve a large number of items, it’s generally not necessary to assume that the original data follows a binomial distribution. The way we arrange what we can call the derived curve, which corresponds to the averages, won’t be significantly impacted by this. Therefore, we can use all the same calculation methods as we did before. We take the initial average as the likely position of the true center or ultimate average; we assess the probability that we are within a specified distance of the truth by calculating the ‘error of mean square’; and we use this same factor to determine the modulus, or the level of contraction or dispersion, of our derived curve of facility.
The same general considerations will apply to most other kinds of Law of Facility. Broadly speaking,—we shall come to the examination of certain exceptions immediately,—whatever may have been the primitive arrangement (i.e. that of the single results) the arrangement of the derived results (i.e. that of the averages) will be more crowded up towards the centre. This follows from the characteristic of combinations already noticed, viz. that extreme values can only be got at by a repetition of several extremes, whereas intermediate values can be got at either by repetition of intermediates or through the counteraction of opposite extremes. Provided the original distribution be symmetrical about the centre, and provided the limits of possible error be finite, or if infinite, that the falling off of frequency as we recede from the mean be very rapid, then the results of taking averages will be better than those of trusting to single results.
The same general considerations will apply to most other types of Law of Facility. Broadly speaking—we'll look at some exceptions soon—whatever the original arrangement (i.e. that of the single results), the arrangement of the derived results (i.e. that of the averages) will be more clustered towards the center. This is due to the characteristic of combinations we’ve already noted, which is that extreme values can only be obtained by repeating several extremes, while intermediate values can be obtained either by repeating intermediates or by balancing out opposing extremes. As long as the original distribution is symmetrical around the center, and as long as the limits of possible error are finite, or if they’re infinite, that the drop in frequency as we move away from the mean is very rapid, then the results of taking averages will be better than relying on single results.
§ 28. We will now take notice of an exceptional case. We shall do so, not because it is one which can often actually occur, but because the consideration of it will force us to ask ourselves with some minuteness what we mean in the above instances by calling the results of the averages ‘better’ than those of the individual values. A diagram will bring home to us the point of the difficulty better than any verbal or symbolic description.
§ 28. Let's now look at an exceptional case. We’re doing this not because it happens frequently, but because examining it will make us think carefully about what we mean when we say that the averages’ results are ‘better’ than those of the individual values. A diagram will illustrate the complexity of this issue more effectively than any written or symbolic explanation.
[Diagram: the black-line Law of Error and the dotted curve of the averages described below.]
The black line represents a Law of Error easily stated in words, and one which, as we shall subsequently see, can be conceived as occurring in practice. It represents a state of things under which up to a certain distance from O, on each side, viz. to A and B, the probability of an error diminishes uniformly with the distance from O; whilst beyond these points, up to E and F, the probability of error remains constant. The dotted line represents the resultant Law of Error obtained by taking the average of the former two and two together. Now is the latter ‘better’ than the former? Under it, certainly, great errors are less frequent and intermediate ones more frequent; but then on the other hand the small errors are less frequent: is this state of things on the whole an improvement or not? This requires us to reconsider the whole question.
The black line shows a Law of Error that can be easily stated in words and that, as we will see later, can be conceived as actually occurring in practice. It illustrates a situation where, up to a certain distance from O, on each side, specifically to A and B, the likelihood of an error decreases consistently as you move away from O. However, beyond these points, up to E and F, the probability of error stays the same. The dotted line represents the overall Law of Error created by averaging the former results two at a time. So, is the latter ‘better’ than the former? Under this law, larger errors are less common, and intermediate errors are more common; but on the flip side, small errors happen less often. Is this situation overall an improvement or not? This requires us to rethink the entire question.
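To make the comparison concrete, here is a small Python simulation (my own construction, with invented parameters) of a law of this general shape: a sharp peak of very small errors near O and a flat band of equally likely errors beyond it. Averaging the results two at a time thins out the large errors and multiplies the intermediate ones, but it also makes the very small errors somewhat rarer.

```python
# Rough simulation of the special law of error described above: errors are very
# often tiny (the peak at O), but beyond a short distance every error up to a
# fixed limit is equally likely.
import random

random.seed(2)

def single_error():
    """One error under the assumed two-part law: a sharp peak near O plus a flat band."""
    if random.random() < 0.3:
        return random.triangular(-0.5, 0.5, 0)   # the peaked part, O out to A/B
    return random.uniform(-4, 4)                 # the flat part, out to E/F

def share(errors, low, high):
    return sum(low <= abs(e) < high for e in errors) / len(errors)

n = 200_000
singles = [single_error() for _ in range(n)]
pairs = [(single_error() + single_error()) / 2 for _ in range(n)]

for label, errs in (("single results ", singles), ("averages of two", pairs)):
    print(label,
          f"small (<0.1): {share(errs, 0, 0.1):.3f}",
          f"middling: {share(errs, 0.1, 2):.3f}",
          f"large (>2): {share(errs, 2, 99):.3f}")
```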
§ 29. In all the cases discussed in the previous sections the superiority of the curve of averages over that of the single results showed itself at every point. The big errors were scarcer and the small errors were commoner; it was only just at one intermediate point that the two were on terms of equality, and this point was not supposed to possess any particular significance or importance. Accordingly we had no occasion to analyse the various cases included under the general relation. It was enough to say that one was better than the other, and it was sufficient for all purposes to take the ‘modulus’ as the measure of this superiority. In fact we are quite safe in simply saying that the average of those average results is better than that of the individual ones.
§ 29. In all the cases mentioned in the previous sections, the advantage of the average curve over individual results was clear at every level. Major errors were less frequent, while minor errors occurred more often; there was only one specific point where both were equal, and that point wasn't assumed to have any special significance or importance. Therefore, we didn't need to analyze the various cases under the general relationship. It was enough to state that one was superior to the other, and it sufficed for our purposes to take the ‘modulus’ as a measure of this superiority. In fact, we can confidently say that the average of those average results is better than that of the individual results.
When however we proceed in what Hume calls “the sifting humour,” and enquire why it is sufficient thus to trust to the average; we find, in addition to the considerations hitherto advanced, that some postulate was required as to the consequences of the errors we incur. It involved an estimate of what is sometimes called the ‘detriment’ of an error. It seemed to take for granted that large and small errors all stand upon the same general footing of being mischievous in their consequences, but that their evil effects increase in a greater ratio than that of their own magnitude.
When we then adopt what Hume refers to as “the sifting humor” and ask why it's enough to rely on the average, we discover, in addition to the points made earlier, that there’s a need for some basic assumption regarding the consequences of the mistakes we make. It required us to assess what’s sometimes called the ‘detriment’ of an error. It seemed to assume that both large and small errors are generally equally problematic in their effects, but that the negative impacts grow at a faster rate than the size of the errors themselves.
§ 30. Suppose, for comparison, a case in which the importance of an error is directly proportional to its magnitude (of course we suppose positive and negative errors to balance each other in the long run): it does not appear that any advantage would be gained by taking averages. Something of this sort may be considered to prevail in cases of mere purchase and sale. Suppose that any one had to buy a very large number of yards of cloth at a constant price per yard: that he had to do this, say, five times a day for many days in succession. And conceive that the measurement of the cloth was roughly estimated on each separate occasion, with resultant errors which are as likely to be in excess as in defect. Would it make the slightest difference to him whether he paid separately for each piece; or whether the five estimated lengths were added together, their average taken, and he were charged with this average price for each piece? In the latter case the errors which will be made in the estimation of each piece will of course be less in the long run than they would be in the former: will this be of any consequence? The answer surely is that it will not make the slightest difference to either party in the bargain. In the long run, since the same parties are concerned, it will not matter whether the intermediate errors have been small or large.
§ 30. Imagine, for comparison, a situation where the significance of a mistake is directly related to its size (we assume positive and negative mistakes cancel each other out over time): it seems that no benefit would come from averaging. Something like this can be seen in simple buying and selling scenarios. Picture someone who needs to buy a very large number of yards of cloth at a fixed price per yard: let’s say they need to do this five times a day for many consecutive days. Now, imagine that the measurements of the cloth are roughly estimated each time, leading to errors that could be either above or below the actual amount. Would it make any difference to them whether they paid for each piece separately, or whether they added the five estimated lengths together, calculated an average, and were charged that average price for each piece? In the second case, the errors made in estimating each piece would, over time, be smaller than in the first case: but will this even matter? The answer is clearly no—it won’t make any difference to either party in the agreement. Over time, since the same parties are involved, it won’t be important whether the intermediate errors were large or small.
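A toy calculation (with invented lengths and prices) makes the point at once: when the cost of an error is simply proportional to its size, the bill comes to the same total whether each piece is charged on its own rough estimate or every piece is charged at the average estimate.

```python
# Toy check of the cloth example: when the cost of an error is simply
# proportional to its size, charging each piece on its own rough estimate and
# charging every piece at the average estimate come to exactly the same total.
import random

random.seed(3)
price_per_yard = 2.0
true_lengths = [50, 52, 48, 51, 49]                      # yards, hypothetical
estimates = [length + random.uniform(-1, 1) for length in true_lengths]

separately = sum(price_per_yard * est for est in estimates)
avg = sum(estimates) / len(estimates)
by_average = price_per_yard * avg * len(estimates)

print(f"charged piece by piece : {separately:.2f}")
print(f"charged at the average : {by_average:.2f}")   # identical, as argued above
```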
Of course nothing of this sort can be regarded as the general rule. In almost every case in which we have to make measurements we shall find that large errors are much more mischievous than small ones, that is, mischievous in a greater ratio than that of their mere magnitude. Even in purchase and sale, where different purchasers are concerned, this must be so, for the pleasure of him who is overserved will hardly equal the pain of him who is underserved. And in many cases of scientific measurement large errors may be simply fatal, in the sense that if there were no reasonable prospect of avoiding them we should not care to undertake the measurement at all.
Of course, nothing like this can be seen as the general rule. In nearly every situation where we need to take measurements, we’ll find that large errors are much more harmful than small ones, meaning they cause more trouble relative to their size. Even in buying and selling, where different purchasers are involved, this holds true, since the satisfaction of someone who gets too much will barely compare to the frustration of someone who doesn’t get enough. In many cases of scientific measurement, large errors can be downright disastrous, to the point that if there’s no reasonable way to avoid them, we wouldn’t even want to take the measurements at all.
§ 31. If we were only concerned with practical considerations we might stop at this point; but if we want to realize the full logical import of average-taking as a means to this particular end, viz. of estimating some assigned magnitude, we must look more closely into such an exceptional case as that which was indicated in the figure on p. 493. What we there assumed was a state of things in reference to which extremely small errors were very frequent, but that when once we got beyond a certain small range all other errors, within considerable limits, were equally likely.
§ 31. If we were only focused on practical matters, we could stop here; however, if we want to understand the full logical significance of taking averages to achieve a specific goal, namely, estimating a certain value, we need to examine an exceptional situation like the one shown in the figure on p. 493. What we assumed there was a scenario where very small errors occurred frequently, but once we moved beyond a certain small range, all other errors within a significant interval were equally probable.
It is not difficult to imagine an example which will aptly illustrate the case in point: at worst it may seem a little far-fetched. Conceive then that some firm in England received a hurried order to supply a portion of a machine, say a steam-engine, to customers at a distant place; and that it was absolutely essential that the work should be true to the tenth of an inch for it to be of any use. But conceive also that two specifications had been sent, resting on different measurements, in one of which the length of the requisite piece was described as sixty and in the other sixty-one inches. On the assumption of any ordinary law of error, whether of the binomial type or not, there can be no doubt that the firm would make the best of a very bad job by constructing a piece of 60 inches and a half: i.e. they would have a better chance of being within the requisite tenth of an inch by so doing, than by taking either of the two specifications at random and constructing it accurately to this. But if the law were of the kind indicated in our diagram,[11] then it seems equally certain that they would be less likely to be within the requisite narrow margin by so doing. As a mere question of probability,—that is, if such estimates were acted upon again and again,—there would be fewer failures encountered by simply choosing one of the conflicting measurements at random and working exactly to this, than by trusting to the average of the two.
It's not hard to come up with an example that illustrates the point, though it might seem a bit far-fetched. Imagine a company in England gets a rushed order to provide part of a machine, like a steam engine, for customers in a faraway location; and it's crucial that the work is accurate to within a tenth of an inch for it to be functional. However, suppose two specifications were sent over with different measurements: one states the piece needs to be sixty inches long, while the other says it needs to be sixty-one inches. Based on any typical law of error, whether binomial or otherwise, it's clear that the company would try to make the best of a tough situation by producing a piece that measures 60½ inches. In doing so, they would likely have a better chance of hitting the required tenth of an inch than if they randomly chose either of the two specifications and made it precisely to that measurement. But, if the law follows the type shown in our diagram, it seems just as clear that they would be less likely to fall within that narrow margin by building to 60½ inches. Simply put, if they acted based on such estimates repeatedly, they would encounter fewer failures by picking one of the conflicting measurements at random and working precisely to it instead of averaging the two.
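Note 11 below imagines how such a law could arise: one measurement made by a careful mechanic, the other little better than a guess, with the firm unable to tell which is which. A short simulation under those invented assumptions shows the effect plainly: working exactly to one specification chosen at random succeeds far more often than working to the average.

```python
# Sketch of the specification example under the hypothetical law of error in
# note 11: one measurement is careful, the other is little better than a guess
# within a couple of inches, and the firm cannot tell which is which.
import random

random.seed(4)
TRIALS = 100_000
TOLERANCE = 0.1          # the work must be true to a tenth of an inch
true_length = 60.4       # hypothetical true requirement, in inches

hits_pick_one, hits_average = 0, 0
for _ in range(TRIALS):
    careful = true_length + random.uniform(-0.05, 0.05)
    careless = true_length + random.uniform(-2, 2)
    specs = [careful, careless]
    built_pick_one = random.choice(specs)           # work exactly to one spec
    built_average = sum(specs) / 2                  # work to the mean of the two
    hits_pick_one += abs(built_pick_one - true_length) <= TOLERANCE
    hits_average += abs(built_average - true_length) <= TOLERANCE

print(f"within tolerance, picking one spec at random: {hits_pick_one / TRIALS:.2%}")
print(f"within tolerance, working to the average    : {hits_average / TRIALS:.2%}")
```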
This suggests some further reflections as to the taking of averages. We will turn now to another exceptional case, but one involving somewhat different considerations than those which have been just discussed. As before, it may be most conveniently introduced by commencing with an example.
This leads to some additional thoughts on averaging. Now let's look at another unique case, but this one involves somewhat different factors than what we've just talked about. As before, it can be easiest to start with an example.
§ 32. Suppose then that two scouts were sent to take the calibre of a gun in a hostile fort,—we may conceive that the fort was to be occupied next day, and used against the enemy, and that it was important to have a supply of shot or shell,—and that the result is that one of them reports the calibre to be 8 inches and the other 9. Would it be wise to assume that the mean of these two, viz. 8½ inches, was a likelier value than either separately?
§ 32. Imagine that two scouts were sent to measure the caliber of a gun in an enemy fort—we can assume that the fort was going to be taken over the next day and used against the enemy, and it was crucial to have enough ammunition or shells—and let's say one of the scouts reports the caliber as 8 inches while the other says 9 inches. Would it make sense to assume that the average of these two, which is 8½ inches, is a more accurate value than either of their individual reports?
The answer seems to be this. If we have reason to suppose that the possible calibres partake of the nature of a continuous magnitude,—i.e. that all values, with certain limits, are to be considered as admissible, (an assumption which we always make in our ordinary inverse step from an observation or magnitude to the thing observed or measured)—then we should be justified in selecting the average as the likelier value. But if, on the other hand, we had reason to suppose that whole inches are always or generally preferred, as is in fact the case now with heavy guns, we should do better to take, even at hazard, one of the two estimates set before us, and trust this alone instead of taking an average of the two.
The answer seems to be this. If we have reason to think that the possible calibers act like a continuous magnitude,—meaning that all values, within certain limits, are considered acceptable, (which is an assumption we usually make when moving from an observation or magnitude to what is being observed or measured)—then we would be justified in choosing the average as the more likely value. But if, on the other hand, we have reason to believe that whole inches are usually or always preferred, as is currently the case with heavy guns, we would be better off simply picking one of the two estimates presented to us, and relying on that alone instead of averaging the two.
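The contrast can be put to a rough numerical test. In the sketch below (invented numbers throughout), the true calibre is first allowed to vary continuously and then restricted to whole inches; the average of the two reports is the better guess in the first case and the worse one in the second.

```python
# Simulation sketch of the calibre question: when calibres vary continuously,
# the mean of the two reports is the better guess; when they are known to run
# in whole inches, picking one report at random does better.
import random

random.seed(5)
TRIALS = 100_000
TOL = 0.1  # the guess must be within a tenth of an inch for the shot to fit

def trial(continuous):
    true = random.uniform(7.5, 9.5) if continuous else random.choice([8, 9])
    reports = [true + random.uniform(-0.7, 0.7) for _ in range(2)]
    if not continuous:
        reports = [round(r) for r in reports]   # the scouts report whole inches
    guess_avg = sum(reports) / 2
    guess_one = random.choice(reports)
    return abs(guess_avg - true) <= TOL, abs(guess_one - true) <= TOL

for continuous in (True, False):
    outcomes = [trial(continuous) for _ in range(TRIALS)]
    avg_ok = sum(a for a, _ in outcomes) / TRIALS
    one_ok = sum(o for _, o in outcomes) / TRIALS
    kind = "continuously varying calibres" if continuous else "whole-inch calibres          "
    print(f"{kind}: average of the two reports {avg_ok:.2%}, one report at random {one_ok:.2%}")
```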
§ 33. The principle upon which we act here may be stated thus. Just as in the direct process of calculating or displaying the ‘errors’, whether in an algebraic formula or in a diagram, we generally assume that their possibility is continuous, i.e. that all intermediate values are possible; so, in the inverse process of determining the probable position of the original from the known value of two or more errors, we assume that that position is capable of falling at any point whatever between certain limits. In such an example as the above, where we know or suspect a discontinuity of that possibility of position, the value of the average may be entirely destroyed.
§ 33. The principle we're using here can be summarized like this: Just as when we calculate or show the 'errors'—whether in an algebraic formula or a diagram—we generally assume that these errors can take on any value continuously, meaning that all in-between values are possible; similarly, when we work backwards to figure out the likely position of the original based on the known values of two or more errors, we assume that this position can fall anywhere within certain limits. In cases like the one mentioned above, where we know or suspect there's a break in the possibility of that position, the value of taking an average may be destroyed entirely.
In the above example we were supposed to know that the calibre of the guns was likely to run in English inches or in some other recognized units. But if the battery were in China or Japan, and we knew nothing of the standards of length in use there, we could no longer appeal to this principle. It is doubtless highly probable that those calibres are not of the nature of continuously varying magnitudes; but in an entire ignorance of the standards actually adopted, we are to all intents and purposes in the same position as if they were of that continuous nature. When this is so the objections to trusting to the average would no longer hold good, and if we had only one opportunity, or a very few opportunities, we should do best to adhere to the customary practice.
In the example above, we were expected to understand that the size of the guns was likely measured in English inches or some other standard units. However, if the battery were in China or Japan, and we knew nothing about the length standards used there, we couldn’t rely on this principle anymore. While it’s very likely that those sizes are not continuously varying, our complete ignorance of the actual standards being used puts us in the same situation as if they were continuously varying. In such cases, the objections to relying on the average wouldn’t apply, and if we had only one chance or very few chances, it would be best to stick to the usual practice.
§ 34. When however we are able to collect and compare a large number of measurements of various objects, this consideration of the probable discontinuity of the objects we thus measure,—that is, their tendency to assume some one or other of a finite number of distinct magnitudes, instead of showing an equal readiness to adapt themselves to all intermediate values,—again assumes importance. In fact, given a sufficient number of measurable objects, we can actually deduce with much probability the standard according to which the things in question were made.
§ 34. However, when we can gather and compare a large number of measurements of different objects, the idea of the likely discontinuity of the objects we measure becomes significant. This means that they tend to take on one of a limited number of specific sizes, rather than easily fitting into all the values in between. In fact, with enough measurable objects, we can even deduce with a high degree of likelihood the standard by which the items in question were created.
This is the problem which Mr Flinders Petrie has attacked with so much acuteness and industry in his work on Inductive Metrology, a work which, merely on the ground of its speculative interest, may well be commended to the student of Probability. The main principles on which the reasoning is based are these two:—(1) that all artificers are prone to construct their works according to round numbers, or simple fractions, of their units of measurement; and (2) that, aiming to secure this, they will stray from it in tolerable accordance with the law of error. The result of these two assumptions is that if we collect a very large number of measurements of the different parts and proportions of some ancient building,—say an Egyptian temple,—whilst no assignable length is likely to be permanently unrepresented, yet we find a marked tendency for the measurements to cluster about certain determinate points in our own, or any other standard scale of measurement. These points mark the length of the standard, or of some multiple or submultiple of the standard, employed by the old builders. It need hardly be said that there are a multitude of practical considerations to be taken into account before this method can be expected to give trustworthy results, but the leading principles upon which it rests are comparatively simple.
This is the issue that Mr. Flinders Petrie has addressed with such insight and dedication in his work on Inductive Metrology, a book that, simply for its theoretical interest, could be recommended to anyone studying Probability. The main principles behind his reasoning are twofold: (1) that all craftsmen tend to build their works using round numbers or simple fractions of their measurement units; and (2) that, in trying to achieve this, they will deviate from it in a way that aligns with the law of error. The outcome of these two assumptions is that if we gather a very large number of measurements from various parts and proportions of an ancient building—let's say an Egyptian temple—while no specific length is likely to be completely absent, we do observe a clear tendency for the measurements to cluster around certain specific points in our own or any other standard measurement scale. These points indicate the length of the standard or some multiple or fraction of the standard used by the ancient builders. It's worth noting that there are many practical factors to consider before this method can be expected to yield reliable results, but the fundamental principles on which it is based are quite straightforward.
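The two principles lend themselves to a very simple computational sketch (not Petrie's actual procedure, and with an invented ‘ancient unit’): generate measurements that are round multiples of an unknown unit, blurred by workmanship, and then ask which candidate length leaves them closest to exact multiples of itself.

```python
# Toy reconstruction of the idea behind inductive metrology: builders aim at
# round multiples of an unknown unit and miss slightly, so the unit can be
# recovered by scoring candidate lengths against the measurements.
import random

random.seed(6)
TRUE_UNIT = 20.62   # hypothetical ancient unit, in inches

# Synthetic survey: round-number multiples of the unit, blurred by workmanship.
measurements = [random.randint(2, 30) * TRUE_UNIT + random.gauss(0, 0.4)
                for _ in range(300)]

def misfit(unit):
    """Mean squared distance (in fractions of the unit) to the nearest multiple."""
    total = 0.0
    for m in measurements:
        frac = (m / unit) % 1.0
        total += min(frac, 1.0 - frac) ** 2
    return total / len(measurements)

# Scan plausible candidate units; the true unit should stand out as the best fit.
candidates = [15 + 0.01 * i for i in range(1501)]      # 15 to 30 inches
best = min(candidates, key=misfit)
print(f"recovered unit ≈ {best:.2f} inches (true value {TRUE_UNIT})")
```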
§ 35. The case just considered is really nothing else than the recurrence, under a different application, of one which occupied our attention at a very early stage. We noticed (Chap. II.) the possibility of a curve of facility which instead of having a single vertex like that corresponding to the common law of error, should display two humps or vertices. It can readily be shown that this problem of the measurements of ancient buildings, is nothing more than the reopening of the same question, in a slightly more complex form, in reference to the question of the functions of an average.
§ 35. The case we just discussed is really just a different version of one we looked at earlier. We noted (Chap. II.) the possibility of a curve of facility that features two peaks or vertices instead of just one, like the one related to the standard error law. It's easy to demonstrate that the issue of measuring ancient buildings is just a reexamination of the same question, but in a slightly more complicated form, concerning the functions of an average.
Take a simple example. Suppose an instance in which great errors, of a certain approximate magnitude, are distinctly more likely to be committed than small ones, so that the curve of facility, instead of rising into one peak towards the centre, as in that of the familiar law of error, shows a depression or valley there. Imagine, in fact, two binomial curves, with a short interval between their centres. Now if we were to calculate the result of taking averages here we should find that this at once tends to fill up the valley; and if we went on long enough, that is, if we kept on taking averages of sufficiently large numbers, a peak would begin to arise in the centre. In fact the familiar single binomial curve would begin to make its appearance.
Take a simple example. Suppose there’s a situation where major mistakes, of a certain approximate size, are clearly more likely to happen than minor ones, so the curve of facility, instead of rising into one peak toward the center like in the usual law of error, shows a dip or valley there. Imagine, in fact, two binomial curves, with a small gap between their centers. Now, if we were to calculate the result of taking averages here we would find that this quickly helps fill in the valley; and if we continued long enough, that is, if we kept taking averages of sufficiently large numbers, a peak would start to form in the center. In fact, the familiar single binomial curve would begin to show up.
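A brief simulation (with made-up numbers) shows this filling of the valley: single results drawn from two equal clusters leave a trough between them, while averages of eight of them heap up in the middle.

```python
# Quick simulation of the two-humped case: single results come from two clusters
# with a valley between them, but averages of enough of them pile up in the middle.
import random

random.seed(7)

def single():
    """One result from a double-peaked law: two equal clusters at -2 and +2."""
    centre = random.choice([-2.0, 2.0])
    return centre + random.gauss(0, 0.7)

def histogram(values, label):
    bins = [0] * 9                                   # nine bins of width 1 across [-4.5, 4.5)
    for v in values:
        if -4.5 <= v < 4.5:
            bins[int(v + 4.5)] += 1
    print(label, " ".join(f"{b:5d}" for b in bins))

n = 30_000
histogram([single() for _ in range(n)], "single results:")
histogram([sum(single() for _ in range(8)) / 8 for _ in range(n)], "averages of 8 :")
```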
§ 36. The question then at once suggests itself, ought we to do this? Shall we give the average free play to perform its allotted function of thus crowding things up towards the centre? To answer this question we must introduce a distinction. If that peculiar double-peaked curve had been, as it conceivably might, a true error-curve,—that is, if it had represented the divergences actually made in aiming at the real centre,—the result would be just what we should want. It would furnish an instance of the advantages to be gained by taking averages even in circumstances which were originally unfavourable. It is not difficult to suggest an appropriate illustration. Suppose a man firing at a mark from some sheltered spot, but such that the range crossed a broad exposed valley up or down which a strong wind was generally blowing. If the shot-marks were observed we should find them clustering about two centres to the right and left of the bullseye. And if the results were plotted out in a curve they would yield such a double-peaked curve as we have described. But if the winds were equally strong and prevalent in opposite directions, we should find that the averaging process redressed the consequent disturbance.
§ 36. The question immediately arises: should we do this? Should we allow the average to freely fulfill its role of pushing everything toward the center? To answer this question, we need to make a distinction. If that unique double-peaked curve had truly been an error curve—meaning it accurately represented the deviations from aiming at the actual center—the outcome would be exactly what we would want. It would provide an example of the benefits of using averages even in initially unfavorable conditions. A fitting illustration comes to mind. Imagine a person shooting at a target from a sheltered spot, but the shot travels across a wide open valley where a strong wind usually blows. If we looked at where the shots landed, we would see them clustering around two centers to the right and left of the bullseye. If we plotted those results on a graph, they would form the double-peaked curve we've described. However, if the winds were equally strong and blowing in opposite directions, we would find that the averaging process corrected the resulting disruption.
If however the curve represented, as it is decidedly more likely to do, some outcome of natural phenomena in which there was, so to say, a real double aim on the part of nature, it would be otherwise. Take, for instance, the results of measuring a large number of people who belonged to two very heterogeneous races. The curve of facility would here be of the kind indicated on p. 45, and if the numbers of the two commingled races were equal it would display a pair of twin peaks. Again the question arises, ‘ought’ we to involve the whole range within the scope of a single average? The answer is that the obligation depends upon the purpose we have in view. If we want to compare that heterogeneous race, as a whole, with some other, or with itself at some other time, we shall do well to average without analysis. All statistics of population, as we have already seen (v. p. 47), are forced to neglect a multitude of discriminating characteristics of the kind in question. But if our object were to interpret the causes of this abnormal error-curve we should do well to break up the statistics into corresponding parts, and subject these to analysis separately.
If the curve represents, as is much more likely, some outcome of natural events where nature had a real double purpose, the situation would be different. For example, consider the results of measuring a large number of people from two very different races. The curve of facility would look like the one shown on p. 45, and if the numbers from both mixed races were equal, it would show a pair of twin peaks. This raises the question, ‘should’ we include the entire range when calculating a single average? The answer depends on our purpose. If we want to compare that mixed race as a whole with another or with itself at a different time, it’s best to average without breaking it down. As we’ve seen in population statistics (v. p. 47), many distinguishing characteristics have to be overlooked. However, if our goal is to analyze the causes of this unusual error curve, we should divide the statistics into relevant parts and analyze them separately.
Similarly with the measurements of the ancient buildings. In this case if all our various ‘errors’ were thrown together into one group of statistics we should find that the resultant curve of facility displayed, not two peaks only, but a succession of them; and these of various magnitudes, corresponding to the frequency of occurrence of each particular measurement. We might take an average of the whole, but hardly any rational purpose could be subserved in so doing; whereas each separate point of maximum frequency of occurrence has something significant to teach us.
Similarly with the measurements of the ancient buildings. In this case, if we combined all our different 'errors' into one set of statistics, we would see that the resulting curve of facility showed not just two peaks, but a series of them; and these peaks would vary in size, reflecting how often each specific measurement occurred. We might calculate an average for everything, but that wouldn't serve any real purpose; instead, each individual point of maximum frequency has something important to teach us.
§ 37. One other peculiar case may be noticed in conclusion. Suppose a distinctly asymmetrical, or lop-sided curve of facility, such as this:—
§ 37. One more unique case can be mentioned in conclusion. Imagine a clearly asymmetrical or lop-sided curve of facility, like this:—
[Diagram: an asymmetrical, lop-sided curve of facility.]
Laws of error, of which this is a graphical representation, are, I apprehend, far from uncommon. The curve in question, is, in fact, but a slight exaggeration of that of barometrical heights as referred to in the last chapter; when it was explained that in such cases the mean, the median, and the maximum ordinate would show a mutual divergence. The doubt here is not, as in the preceding instances, whether or not a single average should be taken, but rather what kind of average should be selected. As before, the answer must depend upon the special purpose we have in view. For all ordinary purposes of comparison between one time or place and another, any average will answer, and we should therefore naturally take the arithmetical, as the most familiar, or the median, as the simplest.
Laws of error of the kind shown in this graph are, I suspect, far from uncommon. The curve in question is actually just a slight exaggeration of the curve of barometric heights mentioned in the last chapter, where it was explained that in such cases the mean, the median, and the maximum ordinate all diverge from one another. The uncertainty here is not whether to take a single average at all, as in the previous cases, but rather which type of average to choose. Once again, the answer depends on the specific purpose we have in mind. For typical comparisons between different times or places, any average will work, so we would naturally use the arithmetic average, as it’s the most familiar, or the median, as it’s the simplest.
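The divergence of these three ‘averages’ is easy to exhibit numerically. The sketch below uses an invented right-skewed sample as a stand-in for something like barometric heights and prints the arithmetic mean, the median, and a crude estimate of the maximum ordinate (the most heavily populated narrow bin).

```python
# For a lop-sided law of facility the three familiar 'averages' part company:
# the arithmetic mean, the median, and the value of greatest frequency differ.
import random
import statistics
from collections import Counter

random.seed(8)
# A right-skewed sample, standing in for something like barometric heights.
sample = [random.gammavariate(2.0, 1.0) for _ in range(50_000)]

mean = statistics.mean(sample)
median = statistics.median(sample)
# Crude maximum ordinate: the best-populated narrow bin.
counts = Counter(round(x, 1) for x in sample)
mode_bin = counts.most_common(1)[0][0]

print(f"mean            : {mean:.2f}")
print(f"median          : {median:.2f}")
print(f"maximum ordinate: about {mode_bin:.1f}")
```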
§ 38. Cases might however arise under which other kinds of average could justify themselves, with a momentary notice of which we may now conclude. Suppose, for instance, that the question involved here were one of desirability of climate. The ordinary mean, depending as it does so largely upon the number and magnitude of extreme values, might very reasonably be considered a less appropriate test than that of judging simply by the relatively most frequent value: in other words, by the maximum ordinate. And various other points of view can be suggested in respect of which this particular value would be the most suitable and significant.
§ 38. However, there could be situations where other types of averages might be justifiable, and we can briefly acknowledge them now. For instance, if the issue at hand were about the desirability of a climate, the usual average—which relies heavily on the number and size of extreme values—might be considered a less suitable measure than simply looking at the most common value: in other words, the maximum ordinate. Additionally, there are various other perspectives where this specific value would be the most appropriate and meaningful.
In the foregoing case, viz. that of the weather curve, there was no objective or ‘true’ value aimed at. But a curve closely resembling this would be representative of that particular class of estimates indicated by Mr Galton, and for which, as he has pointed out, the geometrical mean becomes the only appropriate one. In this case the curve of facility ends abruptly at O: it resembles a much foreshortened modification of the common exponential form. Its characteristics have been discussed in the paper by Dr Macalister already referred to, but any attempt to examine its properties here would lead us into far too intricate details.
In the previous example, specifically that of the weather curve, there wasn't an objective or ‘true’ value to aim for. However, a curve that closely matches this one would represent the specific category of estimates noted by Mr. Galton, for which, as he mentioned, the geometric mean is the only suitable option. In this scenario, the curve of facility ends suddenly at O: it looks like a much-shortened version of the usual exponential form. Its features have been discussed in the paper by Dr. Macalister that was mentioned earlier, but trying to analyze its properties here would take us into far too complicated details.
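Galton's case can be sketched numerically (with invented data): when a quantity varies by ratios rather than by differences, so that its logarithm follows the ordinary law of error, the arithmetic mean is pulled upward by the long tail, while the geometric mean sits at the natural centre alongside the median.

```python
# Galton's case, sketched numerically: when the logarithm of the quantity
# follows the common law of error, the geometric mean marks the natural centre.
import math
import random
import statistics

random.seed(9)
# Multiplicative variation: log(x) is normal, so x itself is skewed.
sample = [math.exp(random.gauss(0.0, 0.5)) for _ in range(50_000)]

arith = statistics.mean(sample)
geom = math.exp(statistics.mean(math.log(x) for x in sample))
median = statistics.median(sample)

print(f"arithmetic mean : {arith:.3f}")   # pulled upward by the long tail
print(f"geometric mean  : {geom:.3f}")    # close to the median, about 1.0
print(f"median          : {median:.3f}")
```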
§ 39. The general conclusion from all this seems quite in accordance with the nature and functions of an average as pointed out in the last chapter. Every average, it was urged, is but a single representative intermediate value substituted for a plurality of actual values. It must accordingly let slip the bulk of the information involved in these latter. Occasionally, as in most ordinary measurements, the one thing which it represents is obviously the thing we are in want of; and then the only question can be, which mean will most accord with the ‘true’ value we are seeking. But when, as may happen in most of the common applications of statistics, there is really no ‘true value’ of an objective kind behind the phenomena, the problem may branch out in various directions. We may have a variety of purposes to work out, and these may demand some discrimination as regards the average most appropriate for them. Whenever therefore we have any doubt whether the familiar arithmetical average is suitable for the purpose in hand we must first decide precisely what that purpose is.
§ 39. The overall conclusion from all this aligns well with the nature and functions of an average, as discussed in the last chapter. Every average is essentially a single representative value standing in for a range of actual values. This means it inevitably overlooks a lot of the detailed information contained in those values. Sometimes, as with most everyday measurements, the one thing it represents is exactly what we are looking for; in that case, the only question is which mean will best reflect the 'true' value we want. However, when it comes to many common uses of statistics, there often isn’t a definitive 'true value' behind the observed phenomena, which can lead to different avenues to explore. We might have various objectives to fulfill, and these could require careful selection of the most appropriate average. Thus, whenever we're unsure if the typical arithmetic average is right for the task, we must first clarify what that task is.
1 Mr Mansfield Merriman published in 1877 (Trans. of the Connecticut Acad.) a list of 408 writings on the subject of Least Squares.
1 Mr. Mansfield Merriman published in 1877 (Trans. of the Connecticut Acad.) a list of 408 works on the topic of Least Squares.
2 In other words, we are to take the “centre of gravity” of the shot-marks, regarding them as all of equal weight. This is, in reality, the ‘average’ of all the marks, as the elementary geometrical construction for obtaining the centre of gravity of a system of points will show; but it is not familiarly so regarded. Of course, when we are dealing with such cases as occur in Mensuration, where we have to combine or reconcile three or more inconsistent equations, some such rule as that of Least Squares becomes imperative. No taking of an average will get us out of the difficulty.
2 In other words, we need to find the “center of gravity” of the shot marks, treating them all as equal. Essentially, this is the ‘average’ of all the marks, as basic geometry for finding the center of gravity of a set of points demonstrates; however, it’s not commonly thought of that way. Naturally, when we encounter situations like those in Measurement, where we have to combine or resolve three or more conflicting equations, applying a method like Least Squares becomes necessary. Simply taking an average won’t solve the problem.
3 The only reason for supposing this exceptional shape is to secure simplicity. The ordinary target, allowing errors in two dimensions, would yield slightly more complicated results.
3 The only reason for considering this unique shape is to maintain simplicity. The usual target, which permits errors in two dimensions, would lead to somewhat more complicated outcomes.
4 When first referred to, the general form of this equation was given (v. p. 29). The special form here assigned, in which h/√π is substituted for A, is commonly employed in Probability, because the integral of y dx, between +∞ and −∞, becomes equal to unity. That is, the sum of all the mutually exclusive possibilities is represented, as usual, by unity. In this form of expression h is a quantity of the order x⁻¹; for hx is to be a numerical quantity, standing as it does as an index. The modulus, being the reciprocal of this, is of the same order of quantities as the errors themselves. In fact, if we multiply it by 0.4769… we have the so-called ‘probable error.’
4 When first mentioned, the general form of this equation was provided (see p. 29). The specific form given here, where h/√π is used instead of A, is commonly used in Probability, because the integral of y dx, from +∞ to −∞, equals one. In other words, the total of all the mutually exclusive possibilities is represented, as usual, by one. In this expression, h is a quantity of the order x⁻¹; since hx is meant to be a numerical quantity, functioning as it does as an index. The modulus, being the reciprocal of this, is of the same order as the errors themselves. In fact, if we multiply it by 0.4769… we get what is known as the ‘probable error.’
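Both statements in this note are easy to check numerically. The sketch below (an editorial illustration, taking the curve in the form y = (h/√π)·exp(−h²x²)) sums the area under the curve and confirms that the chance of an error smaller than 0.4769 times the modulus is about one half.

```python
# Numerical check of the note above: the area under y = (h/√π)·exp(−h²x²) is
# unity, and an error smaller than 0.4769 × modulus occurs about half the time.
import math

h = 2.0
modulus = 1 / h

# Crude Riemann sum over a range wide enough to capture essentially all the area.
dx = 0.0005
area = sum((h / math.sqrt(math.pi)) * math.exp(-(h * i * dx) ** 2)
           for i in range(-20_000, 20_001)) * dx
print(f"total area ≈ {area:.4f}")                         # ≈ 1, as the note says

# The chance of an error smaller than a is erf(h·a); at a = 0.4769 × modulus
# this chance is one half, which is what 'probable error' means.
a = 0.4769 * modulus
print(f"P(|error| < 0.4769 × modulus) ≈ {math.erf(h * a):.4f}")   # ≈ 0.5
```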
6 Broadly speaking, we may say that the above remarks hold good of any law of frequency of error in which there are actual limits, however wide, to the possible magnitude of an error. If there are no limits to the possible errors, this characteristic of an average to heap its results up towards the centre will depend upon circumstances. When, as in the exponential curve, the approximation to the base, as asymptote, is exceedingly rapid,—that is, when the extreme errors are relatively very few,—it still holds good. But if we were to take as our law of facility such an equation as y = 1/(π(1 + x²)), (as hinted by De Morgan and noted by Mr Edgeworth: Camb. Phil. Trans. vol. X. p. 184, and vol. XIV. p. 160) it does not hold good. The result of averaging is to diminish the tendency to cluster towards the centre.
6 In general terms, we can say that the comments above apply to any law of frequency of error in which there are actual limits, no matter how broad, on the possible size of an error. If there are no limits on possible errors, this tendency for an average to group its results around the center will depend on the circumstances. When, as in the exponential curve, the approach to the base as an asymptote is very rapid—that is, when extreme errors are relatively rare—it still applies. However, if we were to use an equation like y = 1/(π(1 + x²)) (as suggested by De Morgan and noted by Mr. Edgeworth: Camb. Phil. Trans. vol. X, p. 184, and vol. XIV, p. 160), it does not apply. The result of averaging is to diminish the tendency to concentrate around the center.
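A short simulation (my own construction) makes the caveat vivid: for the law y = 1/(π(1 + x²)) the average of ten results is spread just as widely as a single result, so the usual gain from averaging disappears altogether.

```python
# Quick check of the caveat in this note: for the law y = 1/(π(1 + x²)) the
# average of many results is spread as widely as a single result, so averaging
# brings no crowding toward the centre.
import math
import random

random.seed(10)

def draw():
    """One draw from the law y = 1/(π(1 + x²))."""
    return math.tan(math.pi * (random.random() - 0.5))

def share_near_centre(values, width=1.0):
    return sum(abs(v) < width for v in values) / len(values)

n = 100_000
singles = [draw() for _ in range(n)]
averages = [sum(draw() for _ in range(10)) / 10 for _ in range(n)]

print(f"single results within ±1 : {share_near_centre(singles):.3f}")
print(f"averages of 10 within ±1 : {share_near_centre(averages):.3f}")  # no better
```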
7 The reader will find the proofs of these and other similar formulæ in Galloway on Probability, and in Airy on Errors.
7 You can find the proofs of these and other similar formulas in Galloway on Probability, and in Airy on Errors.
8 The formula commonly used for the E.M.S. in this case is ∑e²/(n − 1) and not ∑e²/n. The difference is trifling, unless n be small; the justification has been offered for it that since the sum of the squares measured from the true centre is a minimum (that centre being the ultimate arithmetical mean) the sum of the squares measured from the somewhat incorrectly assigned centre will be somewhat larger.
8 The formula typically used for the E.M.S. in this situation is ∑e²/(n − 1) and not ∑e²/n. The difference is minimal, unless n is small; the reasoning behind this is that since the sum of the squares calculated from the true center is a minimum (with that center being the ultimate arithmetic mean), the sum of the squares calculated from the somewhat inaccurately assigned center will be a bit larger.
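For a concrete comparison (with a handful of invented observations), the sketch below evaluates both divisors side by side; the n − 1 form comes out a little larger, compensating for the fact that the errors are measured from the sample's own mean rather than from the true centre.

```python
# The two forms of the formula in this note, side by side for made-up data:
# dividing by n - 1 gives a slightly larger estimate than dividing by n.
import math

observations = [4.1, 3.8, 4.3, 4.0, 3.9, 4.2]     # hypothetical
n = len(observations)
mean = sum(observations) / n
errors_sq = [(x - mean) ** 2 for x in observations]

ems_n = math.sqrt(sum(errors_sq) / n)
ems_n_minus_1 = math.sqrt(sum(errors_sq) / (n - 1))

print(f"E.M.S. with divisor n     : {ems_n:.4f}")
print(f"E.M.S. with divisor n - 1 : {ems_n_minus_1:.4f}")   # a little larger
```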
9 It appears to me that in strict logical propriety we should like to know the probable error committed in both the assignments of the preceding two sections. But the profound mathematicians who have discussed this question, and who alone are competent to treat it, have mostly written with the practical wants of Astronomy in view; and for this purpose it is sufficient to take account of the one great desideratum, viz. the true values sought. Accordingly the only rules commonly given refer to the probable error of the mean.
9 It seems to me that, in strict logical terms, we would like to know the likely error made in both the assignments from the previous two sections. However, the expert mathematicians who have discussed this issue—those who are truly qualified to address it—have mostly written with the practical needs of Astronomy in mind; and for that, it's enough to consider the one key requirement, namely, the accurate values sought. Therefore, the only rules generally provided relate to the probable error of the mean.
10 i.e. as distinguished from acting upon them indirectly. This latter proceeding, as explained in the chapter on Randomness, may result in giving a non-uniform distribution.
10 i.e. as distinguished from acting on them indirectly. This latter process, as explained in the chapter on Randomness, may result in a non-uniform distribution.
11 There is no difficulty in conceiving circumstances under which a law very closely resembling this would prevail. Suppose, e.g., that one of the two measurements had been made by a careful and skilled mechanic, and the other by a man who to save himself trouble had put in the estimate at random (within certain limits),—the firm having a knowledge of this fact but being of course unable to assign the two to their authors,—we should get very much such a Law of Error as is supposed above.
11 It's easy to imagine situations where a law similar to this one would be in effect. For example, if one of the two measurements was taken by a skilled mechanic, while the other was estimated randomly by someone just trying to take a shortcut—with the firm aware of this fact but unable to tell which measurement came from which man—we would likely see a Law of Error similar to what was described above.
INDEX.
- Accidents 342
- Airy, G. B. 447, 484
- Anticipations, tacit 287
- Arbuthnott 258
- Aristotle 205, 307
- Average
- Babbage 343
- Bags and balls 180, 411
- Belief
- Bentham 319, 323
- Bernoulli 91, 117, 389
- Bertillon 435
- Births, male and female 90, 258, 263
- Boat race, Oxford and Cambridge 339
- Boole 183
- Buckle 237
- Buffon 153, 205, 352, 389
- Burgersdyck 311
- Butler 209, 281, 333, 366
- Carlisle Tables 169
- Casual, meaning of 245
- Causation
- Centre of gravity 467
- Certainty, in Law 324
- Chance
- Chauvenet 352
- Classification, numerical scheme of 48
- Coincidences 245
- Combinations and Permutations 87
- Communism 375, 392
- Conceptualism 275
- Conflict of chances 418
- Consumptives, insurance of 227
- Cournot 245, 255, 338
- Crackanthorpe 312, 320
- Craig, J. 192
- Crofton, M. W. 61, 101, 104
- Dante 285
- Deflection
- De Morgan 83, 106, 119, 122, 135, 177, 179, 197, 236, 247, 296, 308, 350, 379, 382, 483
- De Ros trial 255
- Digits, random 111, 114
- Discontinuity 116
- Distribution, random 106
- Diagrams 29, 45, 118, 443, 476, 481, 493, 501
- Dialectic 302, 320
- Donkin 123, 188, 283
- Duration of life 15, 441
- Düsing 259
- Ebbinghaus 199
- Edgeworth, F. Y. 34, 119, 256, 339, 393, 435, 483
- Ellis, L. 9
- Epidemics 62
- Error, law of 29
- Error
- Escapes, narrow 341
- Expectation, moral 388
- Experience and probability 74
- Exponential curve 29
- Extraordinary
- Fallacies in Logic and Probability 367
- Fatalism 243
- Fechner 34, 389, 435, 441
- Fluctuation 448
- unlimited 73
- Forbes, J. D. 188, 262
- Formal Logic 123
- Formal and Material treatment 86
- Free will 240
- Galloway 248, 448, 484
- Galton, F. 33, 50, 70, 318, 442, 451, 473, 502
- Gambling
- Godfray, H. 99
- Grote, G. 307
- Guy 6
- Hamilton, W. 266, 297
- Happiness, human 382
- Heads and Tails 77
- Heredity 50, 357
- Herschel 30, 466
- Houdin 361
- Hume 236, 419, 433
- Hypotheses 268
- Lambert 309
- Language of Chance 159
- Laplace 89, 120, 197, 237, 424
- Law
- Least squares 41, 467
- Leibnitz 309, 320
- Letters
- Lexis, W. 263, 441
- Likely, equally 77, 183
- Limit
- Lines, random 113
- Lister's method 187
- Lotteries 128
- Lunn, J. R. 248
- McAlister, D. 34, 187, 502
- Mansel, H. L. 299, 301, 320
- Martingale 343
- Material and Formal Logic 265
- Maximum ordinate 441, 455
- Measurement of
- Mental qualities, measurement of 49
- Merriman, M. 352, 448, 460, 465
- Mill, J. S. 131, 207, 266, 282, 402
- Milton, chance production of 353
- Miracles 428
- Michell, J. 260
- Modality 295
- Modulus 464, 472, 484
- Monro, C. J. 325, 416
- Names, reference of 270
- Nations, comparison of 51
- Natural Kinds 55, 63, 71
- Necessary and impossible matter 310
- Paley 433
- Penny, tosses of 144
- Petrie, F. 498
- Petersburg Problem 19, 154
- Poisson 405
- Prantl 311
- Presumption, legal 329
- Prevost 348
- Probability
- Probable
- Problem, Three point 104
- Proctor, R. A. 262, 378
- Prophecies, suicidal 226
- Providence 89, 431
- Propositions, proportional 2
- Psychical research 256
- Pyramid, the great 251
- π, digits in 111, 247
- Series
- Shanks 248
- Skeat, W. W. 96
- Smiglecius 306, 316
- Smyth, P. 251
- Socialism 392
- Spiritualism 365
- Stars, random arrangement of 108, 260
- Statistics
- Statistical Journal 6
- Stature
- Stephen, J. F. 282, 323, 326
- Stewart, D. 209, 237
- Subjective and objective terms 160
- Succession
- Suffield, G. 248
- Suicides 67, 237
- Surnames, extinction of 387
- Surprise, emotion of 157
- Syllogisms, pure and modal 316
Transcriber's Note
Minor typographical corrections and presentational changes have been made without comment.
Minor typographical corrections and formatting changes have been made without comment.
Download ePUB