16. SELLING STATISTICS TO SIXTH FORMERS

Alan Jones
University of Wales at Aberystwyth

16.1 Introduction

It is by no means uncommon for university lecturers to visit Schools and Colleges. It is increasingly common for such visits to be requested by schools to supplement their teaching in the field of General Studies for sixth form students. This is an altogether sensible development as it simultaneously broadens the coverage of courses and, in science in particular, provides a vehicle by which university personnel can inform the outside world of what they are doing and why they are doing it.

Statistics as a subject is at a watershed in its development. Research in the by-now varied branches of the subject is progressing at the devastating rate of a river in full flood, but there remain many still pools, particularly at the interface with the outside world. The general understanding of science is far from good, and the layman often does not understand or, even worse, misunderstands what statisticians are trying to do.

The University of Wales at Aberystwyth has a very well established and smoothly running Schools Liaison Service. At peak demand Aberystwyth's lecturers make some 25-30 visits per week to schools up to 120 miles away, and even further in some cases. The statistics staff have co-operated fully in this venture and, in so doing, are attempting to 'sell' the subject of Statistics to both specialist and non-specialist school audiences far and wide.

This paper concentrates on the inclusion of statistics in the General Studies curriculum, where the audience consists of intelligent sixth form pupils drawn from most academic disciplines.

16.2 Statistics in General Studies

There are many ways in which Statistics can be relevant to a general audience. The sociological survey is an obvious subject, and there is a great deal to be learned by a very basic study (in mathematical or statistical terms) of opinion polls, for example. Such a study can be geared towards the 'Social Statistics' side of the methods of sampling and data collection, etc., or can be more mathematically meaty by involving probability up to quite a sophisticated level (it is an ideal way of introducing the idea of covariance, for example).

Experimental design is another field that lends itself to a general treatment in this way. Modern developments in industry and the business world in general have re-kindled interest in quality management. Indeed, in many cases, modern research in these areas has almost been a case of rediscovering the wheel. This is because pioneering statistical methods used in research and production almost 40 years ago have been rejuvenated. It is an area where simple, common sense ideas advance things a great way towards the final goal.

Probability remains a difficult subject for many pupils. There are endless problems and paradoxes that could fill many a paper but we shall not explore this avenue for the simple reason that it has already been the subject of other papers in this book.

Operational Research is, by definition, a subject whose problems have 'real-world' origins. It follows that case studies abound that could form very useful and interesting talks, detailing problems in the fields of queueing, forecasting, linear programming, etc. The technique of dynamic programming can become mathematically very complex and impressive. The results of its applications are equally impressive. However, in its simplest form, it is logical and understandable without requiring a great deal of specialist mathematical knowledge. One of its celebrated applications is the so-called marriage problem, and this can provoke reactions of all degrees, from the favourably impressed, through the comically entertained, to those insulted by its sexism! But it does involve some real mathematics, and a sixth form audience can realise that mathematics and the statistical method have achieved something. Note that none of the above refers to the 'recipe' approach to Statistics - the 'this is what you do' idea is one we try to eradicate at Aberystwyth.

All of the above can be approached and even linked together in one way by history. The history of the development of statistics is a subject of great interest in itself. Statistics undoubtedly involves and includes a slightly different way of thinking, and it can be fascinating to see how the web has been woven over time, combining the great skills and techniques of mathematics with the equally important notions of formulating a problem in mathematical terms, and interpreting the results. The subject is so vast that the story of the development of even a very small part of statistics forms a lengthy tale. The obvious first suggestion is the history of probability (see the next section). Where did 't' come from, how did simple averaging begin, and what led to the methods of combining observations from different experiments, so important in general science? Least squares is a method in common use today - probably the most common - but it took a long time to get going, and some of the early developers certainly had great insight. The interested reader might enjoy reading the excellent book by Stephen Stigler (1988), now in paperback!

For the remainder of this paper, however, we shall concentrate on one particular story, by way of an example of the content of just one talk that has been enjoyed by many sixth-form audiences. It involves the story of the development of describing randomness in mathematical terms, and the allied notion of a 'random event'. First, we add a few words about part of the pre-history to this, namely probability.

16.3 Probability

The subject of probability and the assessment of chance dates from very early times. Much of it is involved with gambling and games of chance, but, if anything, it is a mystery as to why its development took so long. Modern day education emphasizes the notion of 'equally likely outcomes', and few today would take the experiment of tossing two fair coins to have just three outcomes (two heads, two tails or one of each), but rather correctly consider two outcomes for each coin, giving four outcomes for the joint result. Yet this mistake, and taking the three to be equally likely, is reputed to have been spotted only when a French nobleman lost money at gambling on it!
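The two-coin argument can be checked by simple enumeration. The following is a minimal sketch, assuming fair coins; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each coin has two equally likely faces, so the joint experiment
# has 2 x 2 = 4 equally likely ordered outcomes.
outcomes = list(product("HT", repeat=2))

# Group the four ordered outcomes by the number of heads they contain.
heads_count = Counter(pair.count("H") for pair in outcomes)

# Probability of each number of heads, as an exact fraction.
probabilities = {k: Fraction(v, len(outcomes)) for k, v in heads_count.items()}

for heads in sorted(probabilities):
    print(heads, "head(s):", probabilities[heads])
```

The three "visible" results (two heads, two tails, one of each) are therefore not equally likely: 'one of each' arises from two of the four ordered outcomes.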

The proper enumeration of outcomes in trials that could be decomposed into 'Yes-No' experiments took the development a long way. This led to the now familiar binomial distribution of probabilities - the attachment of the correct probability ${}^{n}C_{r}\,p^{r}(1-p)^{n-r}$ to the event of obtaining r heads in n tosses of a coin where p is the probability of a head on each toss. There followed the obvious mathematical modelling of applying this result to any experiment of independent trials, each with the same probability of success, even if the notion of independence was sometimes a little nebulous. The calculation of these probabilities sometimes caused problems, and many great mathematicians sought to find ways of making the arithmetic easier. Knowledge of the relative sizes of each term led to knowledge of the shape of a bar diagram displaying the probabilities. Modern day warnings of good statistical diagrams were not around, however, and the bar diagrams often appeared as curves, which in turn led to the obvious problem of describing the curve mathematically. By the 18th century, the representation of the binomial distribution diagram by a relatively simple curve was known, and the result was a link to what is now called the normal distribution. Today, mathematics students will know that the curve to which we are referring is a probability density curve, but suffice it to say that the curves themselves long predated the interpretations we try to teach today!
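As a rough illustration of the binomial probabilities and the bar diagram they produce, here is a short sketch. The values n = 10 and p = 0.5 are assumptions chosen only for the example, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.5  # illustrative values, not taken from the text
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]

# A crude text bar diagram: one '#' per percentage point of probability.
for r, pr in enumerate(probs):
    print(f"{r:2d} {pr:.4f} " + "#" * round(pr * 100))
```

Doing this arithmetic by hand for large n is exactly the labour that early mathematicians were trying to avoid.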

It is worth mentioning at this point a piece of apparatus that is a tremendous visual aid. The quincunx is a flat, enclosed, glass or perspex faced box, about the size of an A4 sheet of paper. Inside the glass, at the 'top', we have a chute, and some means of plugging a hole in its downward facing vertex. When plugged, it holds back a large number of pieces of lead shot. Below the chute, we have a triangular array of pins, one on the top row, two on the next, etc.; and below these the width of the box is divided into a number of channels. When the hole is unplugged, the balls of shot bounce on the various rows of pins, and eventually end up in one or other of the channels. A piece of shot will only end up in the left-most channel if it has been deflected to the left at each row of pins. Thus we have the mathematical model - deflection to the left at a pin represents throwing a head, or a success. With the number of rows (of pins) representing the number of trials, we have the classic binomial set-up. The build-up of the binomial pattern in the channels is a joy to watch.
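For readers without the apparatus, the quincunx can be imitated on a computer. This is a sketch under stated assumptions: each pin deflects the shot left with probability 0.5, independently of every other pin, and the numbers of rows and balls are arbitrary illustrative choices.

```python
import random
from collections import Counter

def quincunx(rows, balls, p_left=0.5, seed=0):
    """Drop `balls` pieces of shot through `rows` rows of pins.

    Entry k of the returned list is the number of balls that were
    deflected to the left exactly k times (one entry per channel).
    """
    rng = random.Random(seed)
    counts = Counter(
        sum(rng.random() < p_left for _ in range(rows)) for _ in range(balls)
    )
    return [counts[k] for k in range(rows + 1)]

channels = quincunx(rows=10, balls=2000)

# One '#' for every 20 balls in a channel.
for k, c in enumerate(channels):
    print(f"{k:2d} {'#' * (c // 20)}")
```

Lowering `p_left` (for instance to 0.05) tilts the pins, so to speak, and gives a feel for what happens when successes are rare.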

The pattern that forms is symmetric and resembles the bell shape of the normal curve, illustrating the result that the normal curve is a good approximation to the shape of the binomial distribution. Mathematics shows that this is so if the number of trials becomes large, in which case the shape centres around the value of np, the "common-sense" value of how many successes would be "expected" in n trials. But what if this value is relatively small, perhaps because p is exceedingly small? Surely we cannot have a symmetric pattern always - the number of successes is a whole number that cannot fall below 0, so if the 'peak' is low, symmetry can't result!

16.4 Randomness and rare events

Enter Siméon Denis Poisson. Born in 1781 in Pithiviers, Loiret, this young man quickly established himself as a mathematician. He topped the list of graduates from the Paris Polytechnic University at the age of 17, and was a professor there ten years later. A move to the Bureau des Longitudes directed his interests towards Astronomy and Physics, and he was responsible for great advances in both subjects. His work in the areas of mechanics and heat was fundamental, and is basic to this day. In semi-retirement, however, Poisson turned his attention in his leisure time to 'lighter' areas of mathematics. He noticed the apparent problem about the normal curve approximation to the binomial probability calculations, and set about investigating it.

He tried out some calculations on data where the chance of success was small. In his data, a trial was a case before a French court, and a success was an acquittal - a very small chance indeed when Madame Guillotine ruled! His mathematical argument was clever, and it remains in current text-books as the derivation of what became known as the Poisson distribution. He showed that if n is large and p is small, but in such a way that m = np is not very big (say by having p = m/n), the binomial probability becomes approximately equal to $c\,m^{r}/r!$. The value of the constant, c, has since become known to mathematicians as $e^{-m}$, but note that its only purpose is to ensure that the probabilities of all possible outcomes add to 1. Arithmetically, c could therefore be calculated to any degree of accuracy required.
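The limiting argument can be written out as follows. This is the route usually given in modern text-books, offered here as a sketch rather than as a reproduction of Poisson's own working; it assumes p = m/n with m and r held fixed while n grows.

```latex
\begin{align*}
\binom{n}{r} p^{r} (1-p)^{n-r}
  &= \frac{n(n-1)\cdots(n-r+1)}{r!}
     \left(\frac{m}{n}\right)^{r}\left(1-\frac{m}{n}\right)^{n-r} \\
  &= \frac{m^{r}}{r!}\cdot\frac{n(n-1)\cdots(n-r+1)}{n^{r}}
     \cdot\left(1-\frac{m}{n}\right)^{n}\left(1-\frac{m}{n}\right)^{-r} \\
  &\longrightarrow \frac{m^{r}}{r!}\cdot 1\cdot e^{-m}\cdot 1
     \qquad (n\to\infty,\ r \text{ and } m \text{ fixed}).
\end{align*}
The constant is fixed by requiring the probabilities to add to 1:
\[
\sum_{r=0}^{\infty} c\,\frac{m^{r}}{r!} = c\,e^{m} = 1
\quad\Longrightarrow\quad c = e^{-m}.
\]
```

The second display shows why c need not be recognised as an exponential in advance: summing the terms m^r/r! to the accuracy required and taking the reciprocal gives the same number.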

Poisson's breakthrough, like many great breakthroughs, went unnoticed for a time. He published his work in 1837, three years before his death, but it took a full 60 years before his distribution was used. In the years from 1890 to 1915, or so, many examples of experiments with low probability of success were published, exhibiting close agreement with Poisson's formula. The most celebrated are:

  • Von Bortkiewicz's (1898) analysis of the number of deaths by horsekick (success) in each unit (trial) of the Prussian cavalry over a 20 year period; the data agreed admirably with a Poisson distribution with m=0.61;

  • Lucy Whitaker's (1914) analysis of the number of deaths of women over 85 reported in The Times over the three year period 1910-1912; here a trial is a woman over 85 alive on a particular day, and a success is the exceedingly rare event of both dying and having that death reported in The Times!
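To see what a Poisson distribution with m = 0.61 predicts, and how closely a binomial with many trials matches it, one might compute the following. The mean 0.61 is the value quoted above; the number of trials n is a hypothetical figure chosen only to make p = m/n small, and the function names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(r, m):
    """Poisson probability of exactly r successes when the mean is m."""
    return exp(-m) * m**r / factorial(r)

def binomial_pmf(r, n, p):
    """Binomial probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

m = 0.61      # mean quoted in the text for the horse-kick data
n = 10_000    # hypothetical number of trials, for illustration only
p = m / n     # correspondingly small chance of success per trial

rows = [(r, binomial_pmf(r, n, p), poisson_pmf(r, m)) for r in range(5)]
for r, b, q in rows:
    print(f"{r}  binomial {b:.5f}  Poisson {q:.5f}")
```

Multiplying the Poisson column by the number of observed units gives the 'expected' frequencies against which published counts can be set.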

The extraordinarily good agreement between the collected data and the Poisson 'prediction' is in itself an indicator of the history of the acceptance of statistical thought. Nowadays, any data set showing such agreement with a theory would surely be rejected by any reputable journal editor as being 'suspect'. Natural variation has by now been accepted!

From that time on, no real progress would come from publishing a succession of examples of similar types of data. Development would require some further piece of modelling skill, and this came, again in physics, in the second decade of the twentieth century. Rutherford and Geiger are known to everyone as eminent physicists; they were regularly collecting vast quantities of data on radioactive substances and emissions therefrom. In particular, they acquired data on the emission patterns of alpha particles; the pattern seemed 'random'.

Pause awhile and think what 'random' means. In layman's terms it means somewhere between totally regular and totally 'clustered'. On a dartboard, somewhere between 20 darts ending up all as double 20's, and the situation of one dart in each of the 1,2,...,20 beds. Rutherford and Geiger had counts of the numbers of alpha particles emitted in periods of 7.5 seconds; there seemed to be no pattern. The brilliant advance was in the way randomness was modelled in mathematical terms. There were two parts to the modelling 'definition':

  • emissions in non-overlapping time intervals did not influence each other; they were independent, just like different tosses of a coin;

  • if we consider a time speck so small that it can contain either one emission or none, then the chance of an emission in that interval is always the same (analogous to the same coin always having the same probability of 'head').

A time interval of 7.5 seconds therefore consists of a very large number of independent time specks (trials), each with the same probability of emission (success). The model was therefore complete, and the mathematical consequence was that the number of emissions in 7.5 seconds would follow the pattern of Poisson probabilities; n was unknown (but large), as was p (small). But they could estimate m=np as the 'average' number of emissions in 7.5 seconds. The data agreed well with the Poisson prediction.
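The 'time speck' model lends itself to simulation. In the sketch below every number is a hypothetical choice (500 specks per interval, a true mean of 3 emissions, 2000 intervals); none of them comes from Rutherford and Geiger's data. Each speck is an independent trial with the same small chance of an emission, and m is then estimated from the simulated counts as the average.

```python
import random
from collections import Counter
from math import exp, factorial

def simulate_counts(intervals, specks, p, seed=1):
    """Number of emissions in each interval, where an interval is made up
    of `specks` independent specks, each emitting with probability p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(specks)) for _ in range(intervals)]

specks = 500    # hypothetical number of specks per interval
m_true = 3.0    # hypothetical mean number of emissions per interval
counts = simulate_counts(intervals=2000, specks=specks, p=m_true / specks)

# Estimate m as the average count, as described in the text.
m_hat = sum(counts) / len(counts)
observed = Counter(counts)

print(f"estimated m = {m_hat:.3f}")
for r in range(max(counts) + 1):
    expected = len(counts) * exp(-m_hat) * m_hat**r / factorial(r)
    print(f"{r:2d} emissions: observed {observed[r]:4d}, Poisson expects {expected:7.1f}")
```

Neither n nor p is needed to form the Poisson column; only their product, estimated by the average, enters the formula.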

Rutherford and Geiger could never have foretold the importance of this small step. They had modelled (some might say conjured up) a situation in terms of hypothetical coin tosses. Once this was done, the modelling of random patterns in time was complete. The move to other dimensions was quite natural. The American biologist-cum-mathematician-cum-statistician Jerzy Neyman looked upon the growth of bacteria on a petri plate in a similar way, but now as representing randomness in area. A petri plate could be divided into little squares, and the number of bacteria growing in each square could be counted - just like counting the number of emissions in 7.5 seconds. Then each square could be thought of as consisting of a very large number of specks, each speck having a small probability of bearing a bacterium. The Poisson model worked again, in work published around 1930-35.

The example published by Clarke (1946) obviously relates to work carried out before then. South London had been divided into 576 squares, just like a giant petri plate. The number of flying bombs landing in each one was seen to follow a Poisson pattern. Thus the flying bomb hits over South London in World War II were randomly scattered, and such a discovery was a major breakthrough for military intelligence.
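Randomness in area can be imitated in the same spirit. The sketch below scatters points over a grid of 576 squares, the number quoted above; the number of hits (500) is a hypothetical figure and is not taken from Clarke's paper. Each hit is assumed to land in a square chosen uniformly at random, independently of all the others.

```python
import random
from collections import Counter
from math import exp, factorial

SQUARES = 576   # number of squares quoted in the text
HITS = 500      # hypothetical number of hits, for illustration only

rng = random.Random(2)

# Each hit lands in a square chosen uniformly at random, independently.
hits_per_square = Counter(rng.randrange(SQUARES) for _ in range(HITS))

# Tally how many squares received 0, 1, 2, ... hits.
tally = Counter(hits_per_square[s] for s in range(SQUARES))

m = HITS / SQUARES  # average number of hits per square
for r in range(max(tally) + 1):
    expected = SQUARES * exp(-m) * m**r / factorial(r)
    print(f"{r} hits: {tally[r]:3d} squares observed, {expected:6.1f} expected")
```

If the hits had been aimed at particular districts, squares with many hits and squares with none would both be more numerous than the Poisson column suggests.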

Extensions to randomness were soon to follow in great volumes. Some theories of the solar system are now based on a random distribution of stars; perhaps we should advocate that any fruit cake baked in a school domestic science class should be subjected to a statistical investigation of the randomness of the currants as displayed by the counted number of currants in each cube sampled?

Two breakthroughs, 70-80 years apart, contributed to the mathematical development of the Poisson distribution and its use as a model for randomness. By now, the basic idea has been at the heart of many breakthroughs in modern life - the design of telephone exchanges, traffic light systems, even the relatively recent changes in queue disciplines in banks and post offices all came about because of refinements in the theory.

16.5 Conclusions

The development of the modelling of randomness is a simple story, but one with great repercussions that affect all of us. The Poisson distribution tells us the pattern of 'successes' we should get in collections of individual trials, and it has been applied in a great many situations. It has even cropped up in a court of law - but that's another story!

The data sets referred to in this article can be found in many books, but, in case of difficulty, I would be pleased to supply anyone interested with copies of the data and analyses I performed.

16.6 References

Bortkiewicz, L. von (1898) Das Gesetz der kleinen Zahlen. Teubner, Leipzig.

Clarke, R. D. (1946) An application of the Poisson distribution. J. Inst. Actuar., 72, 48.

Poisson, S. D. (1837) Recherches sur la probabilité des jugements. Bachelier, Paris.

Rutherford, E., Chadwick, J. and Ellis, C. D. (1930) Radiations from Radioactive Substances. Cambridge University Press, London.

Stigler, S. (1988) History of Statistics before 1900. Harvard University Press, Cambridge, Mass. and London.

Whitaker, L. (1914) On deaths of women over 85. Biometrika, 10, 36.
