Thanks Gimp and Magnetite.
Shooters might be interested in a simple explanation of why small string sizes (n) give inaccurate estimates of sd and ES. The same goes for group sizes.
It's just that you have to use the observed sample to estimate the mean value.
The definition of sd is the root mean square of differences from the mean.
We estimate this by adding up the squared differences from the sample mean, but because that mean is itself estimated from the same data and follows the observed values, the result is always an underestimate.
A correction for this effect is to divide the sum of squared differences by n-1 instead of n (Bessel's correction), which makes the variance estimate unbiased (the sd estimate is still very slightly low on average) and inflates the estimated sd most when the sample size is small. Excel has formulae for both the population sd (STDEV.P) and this adjusted, sample-derived estimate of the sd (STDEV.S), so use STDEV.S on your observed values.
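If you would rather check this outside Excel, here is a minimal Python sketch showing the same two formulas (the velocities are made-up numbers, purely for illustration):

```python
import statistics

# Hypothetical 5-shot string of muzzle velocities (fps) - illustrative only
velocities = [2798, 2805, 2811, 2795, 2802]

pop_sd = statistics.pstdev(velocities)    # divides by n      (Excel STDEV.P)
sample_sd = statistics.stdev(velocities)  # divides by n - 1  (Excel STDEV.S)

print(f"population-formula sd: {pop_sd:.2f} fps")
print(f"sample (n-1) sd:       {sample_sd:.2f} fps")  # always a little larger
```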
The n-1 number is called the number of "degrees of freedom".
When the sample mean is standardised using this sample sd, with n-1 degrees of freedom, the resulting statistic follows a Student's t distribution rather than a Normal one, which is why estimates from short strings carry extra uncertainty.
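To put a number on that extra uncertainty, here is a small sketch (assuming SciPy is available) that prints the 95% multiplier on the uncertainty of the mean for various string sizes; it only approaches the familiar Normal value of 1.96 once n gets large:

```python
from scipy import stats

# Two-sided 95% multiplier for the mean, using n - 1 degrees of freedom.
for n in (3, 5, 10, 20, 30, 100):
    t_mult = stats.t.ppf(0.975, df=n - 1)
    print(f"n = {n:3d}: t multiplier = {t_mult:.2f}  (Normal: 1.96)")
```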
The underlying distribution of the values, which Gimp showed in his first graph, is the "Normal Distribution" (also called "Gaussian" or the "Bell Curve"). It is the limiting case as the number of observations tends to infinity. There is an exact formula for its shape, which you can look up on Wikipedia.
In practice, estimates of the mean and standard deviation are very close to those of the underlying Normal distribution when n is greater than 20 or 30.
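A quick simulation makes the same point. Assuming a true velocity sd of 10 fps (an arbitrary figure, purely for illustration), the sd estimated from short strings scatters widely and only settles down as n approaches 20-30:

```python
import numpy as np

rng = np.random.default_rng(0)
true_sd = 10.0  # assumed "true" velocity sd in fps, illustrative only

# Draw many strings of each size and see how much the estimated sd scatters.
for n in (3, 5, 10, 30):
    estimates = [np.std(rng.normal(0.0, true_sd, n), ddof=1) for _ in range(10_000)]
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    print(f"n = {n:2d}: 95% of sd estimates fall between {lo:.1f} and {hi:.1f} fps")
```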
The Normal distribution is quite amenable to mathematical work, such as calculating the exact standard deviation of a variable that is the sum of two Normal random variables: for independent errors the variances add, so the combined sd is the square root of the sum of the squared component sds. Interestingly, this applies even when the errors act at right angles, like wind and velocity errors. However, the calculus involved is beyond me now (like climbing 1000m in an hour), and using an example like Gimp has done is the only way I can demonstrate the threshold of how many shots it takes to validly compare velocity sd or group size.
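Here is what that combination looks like in practice, as a minimal sketch with hypothetical component errors (the MOA figures are assumptions, not measurements); the sds add in quadrature:

```python
import math

# Hypothetical independent error components, in MOA
sd_velocity_component = 0.3  # vertical spread from velocity variation
sd_wind_component = 0.4      # horizontal spread from wind calls

# For independent Normal errors the variances add,
# so the combined sd is the square root of the sum of squares.
combined_sd = math.sqrt(sd_velocity_component**2 + sd_wind_component**2)
print(f"combined sd = {combined_sd:.2f} MOA")  # 0.50 MOA with these numbers
```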
Brian Litz is another master of science writing here, simply adding together errors in a mathematically valid way (often based on 95% confidence intervals) in his "error budget" calculations. My own very limited experience is consistent with his scepticism about meaningfully comparing group sizes from small observed samples (n < 20). See his recent "Believe the Target" 2 hr interview with Eric Cortina.
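To see why small groups are hard to compare, here is a toy simulation (the 0.3 MOA per-axis dispersion is an assumption, not data) showing how much 5-shot extreme spreads vary even when nothing about the rifle or load has changed:

```python
import numpy as np

rng = np.random.default_rng(1)

def group_extreme_spread(n_shots, sd=0.3):
    """Extreme spread (largest centre-to-centre distance) of one simulated group.
    Shots land with independent Normal errors of the given sd (MOA, assumed)
    in both the horizontal and vertical directions."""
    pts = rng.normal(0.0, sd, size=(n_shots, 2))
    diffs = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1)).max()

# Same simulated rifle and load: ten 5-shot groups still vary widely in size.
spreads = [group_extreme_spread(5) for _ in range(10)]
print(", ".join(f"{s:.2f} MOA" for s in spreads))
```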