Welcome to NRICH.

 
How accurately should you quote the mean?


By Larry Currie on March 4, 1998:

How many significant digits are there in the mean of a set of integer values? Or, to phrase it another way, if you take the mean of a data set in which all of the values are integers, what decimal place should you round to in order to be confident of your result (what decimal place is considered statistically significant)?

Larry Currie


By Chris on March 5, 1998:

Hi Larry,

That's quite a tricky question as it stands, since there are two things you might mean by "a set of integers" and I don't know which you've got. Do you know that the numbers you are looking at are integers (for example, do they represent the number of ice creams bought in a day by children at a shop), or are they just rounded to the nearest integer (for example, how many minutes did children spend choosing their ice creams, to the nearest minute)?

The important thing is that your answer for the mean can never really be more accurate than the accuracy of the data you have in the first place - taking the mean of a set of values can sometimes give you more confidence that an answer is about right, but it can never make your answer more accurate.

In the first case (number of ice creams bought), suppose your data are 15, 18, 17 and 15 on four days, and you want to work out the mean. You know that these numbers really are 15, 18, 17 and 15, and not 14.6, 18.4, 17.3 and 15.2, so work out the mean, (15+18+17+15) / 4 = 16.25. Now it's just a case of working out how many significant figures will be useful to you, since there's no inaccuracy in your data.
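To check the arithmetic for the exact-count case, here is a minimal Python sketch. The variable names (`counts`, `exact_mean`) are my own, and using `Fraction` is just one way of keeping the result exact rather than anything the question requires.

```python
from fractions import Fraction

# Ice creams bought on each of four days: true integer counts, no rounding.
counts = [15, 18, 17, 15]

# Keep the mean as an exact fraction so no rounding sneaks in.
exact_mean = Fraction(sum(counts), len(counts))

print(exact_mean)         # 65/4
print(float(exact_mean))  # 16.25
```

Since the inputs are exact, every digit of 16.25 is meaningful; how many of them you quote is a matter of usefulness, not of accuracy.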

The second case (how long children spent choosing) is more complicated, because you could have errors in your original data. You get the number 15 whether the child spends 14.5 minutes looking or almost 15.5. Now the real mean could lie anywhere from (14.5+17.5+16.5+14.5) / 4=15.75 to as close as you like to (15.5+18.5+17.5+15.5) / 4=16.75.
Obviously it's silly to go quoting the figure 16.25 as an exact value as you did before, so you might say that the mean is 16.25 ± 0.5. That way you still quote the mean you worked out, but you started with errors of ±0.5 in all of your data, so your mean cannot be more accurate than ±0.5 either.
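The range for the rounded case can be checked the same way. This is only a sketch: `mean_bounds` and `half_width` are names I have made up, and it assumes every value was rounded to the nearest integer, so each true value lies within 0.5 of the recorded one.

```python
def mean_bounds(rounded, half_width=0.5):
    """Return (lowest, quoted, highest) possible mean for rounded data.

    Assumes each true value lies within half_width of its recorded value.
    """
    n = len(rounded)
    quoted = sum(rounded) / n
    lowest = sum(x - half_width for x in rounded) / n
    highest = sum(x + half_width for x in rounded) / n
    return lowest, quoted, highest


# Minutes spent choosing, each recorded to the nearest minute.
minutes = [15, 18, 17, 15]
lowest, quoted, highest = mean_bounds(minutes)
print(lowest, quoted, highest)  # 15.75 16.25 16.75
```

The upper end is a limit rather than a value you can actually reach, since a time of exactly 15.5 minutes would presumably have been recorded as 16.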

One last thing... This is what I'd probably do. If you need this for an exam, make sure to check with your teacher what answer you're expected to give; don't assume it's necessarily what I've said above.

Best wishes,

Chris