Standard deviation (s.d.) is the square root of the average of the squared differences between each score and the mean of all scores. It is also the square root of the variance.
An issue that sometimes confuses novices studying standard deviation and variance is recognizing the difference between data drawn from a population and data drawn from a sample. At this point I'm not going to explore this issue, but I do want you to accept that using the appropriate formula to calculate a s.d. statistic is important. Most scientific calculators, Excel, and other statistical software have both formulas available, but you need to select the appropriate one.
For a population the formula is:

    SD = √( ∑(xi − ma)² / n )

For a sample the formula is:

    SD = √( ∑(xi − ma)² / (n − 1) )

Where SD is the standard deviation,
xi is each of the observations,
ma is the mean average, and
n is the number of scores.
This may sound intimidating, but luckily statistical calculators, Excel and other software automate the calculation of standard deviations.
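As an illustration of what those tools do behind the scenes, Python's standard library exposes both versions of the formula: statistics.pstdev uses the population divisor n, and statistics.stdev uses the sample divisor n − 1. A minimal sketch with made-up scores:

```python
import statistics

# Hypothetical data set of five scores (illustrative values only)
scores = [2, 4, 4, 4, 6]

mean = statistics.mean(scores)       # the mean average (ma)

# Population formula: summed squared differences divided by n
pop_sd = statistics.pstdev(scores)

# Sample formula: summed squared differences divided by n - 1
samp_sd = statistics.stdev(scores)

print(mean, pop_sd, samp_sd)
```

Note that the sample result comes out slightly larger than the population result for the same data, which is exactly the n versus n − 1 effect discussed below.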
There are other formulas that can be used to make the calculation in fewer steps. The following formula is an easier way to compute the standard deviation of a sample:

    s = √( ( ∑X² − (∑X)² / n ) / (n − 1) )

This formula requires the total of all the squared observations (each value multiplied by itself), ∑X², and the total of all values, ∑X, which is then squared, (∑X)².
From our point of view two issues need to be reviewed: some level of understanding of what standard deviation represents, and the difference between the standard deviation of a population and an estimate of that population parameter calculated from a sample drawn from the population.
STANDARD DEVIATION for a population versus a sample:
When a set of data contains all the possible data points for a population the calculation of standard deviation provides a value that characterizes that population. Such a specific value for a statistic is called a parameter.
However, it is much more likely that the data one is working with is not the complete population, but only a sample. When a sample is used we need to estimate the population standard deviation from the sample. In this situation 1 is subtracted from n, the number of cases (see the formulas above). For small samples dividing by n − 1 tends to increase the value of the standard deviation estimate slightly compared to the result using n. As the sample size increases the effect of n − 1 declines, and the results of the two formulas converge toward the population parameter for large numbers of data points. These patterns are illustrated in a linked table.
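The shrinking gap between the two formulas can be demonstrated directly. The sketch below replicates a small hypothetical data set to mimic ever-larger samples with the same spread, then reports how much larger the n − 1 estimate is than the n result at each size:

```python
import statistics

# Hypothetical scores, replicated to mimic ever-larger samples
base = [2, 4, 4, 4, 6]
gaps = []
for reps in (1, 10, 100):
    data = base * reps
    pop = statistics.pstdev(data)   # divides by n
    samp = statistics.stdev(data)   # divides by n - 1
    gaps.append(samp - pop)         # how much larger the sample estimate is
    print(len(data), samp - pop)
```

The gap is largest for the smallest sample and shrinks steadily as n grows, matching the convergence described above.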
WHAT DOES STANDARD DEVIATION MEAN?
The standard deviation statistic is a number that identifies a distance on the measurement scale. In very general terms it is the average difference between each score and the mean average. The standard deviation is central to many of the statistics used for making inferences and testing hypotheses.
HOW IS STANDARD DEVIATION INTERPRETED?
A calculated standard deviation is an estimate of how scores are distributed away from the mean average. If this distribution is approximately normal (a bell-shaped curve), then about 34% of the cases will occur between the mean and one standard deviation above it (and another 34% between the mean and one standard deviation below it).
Also, if one adds one standard deviation to the mean and subtracts one standard deviation from the mean, the proportion of cases between these two numbers is around 68%, or roughly two-thirds of the cases. This predictability of the distribution is the foundation for calculating both a confidence interval and a margin of error.
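The 68% figure can be checked empirically by drawing a large simulated sample from a normal distribution and counting how many values land within one standard deviation of the mean. A minimal sketch, using arbitrary illustrative parameters (a mean of 100 and a standard deviation of 15):

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Draw a large hypothetical sample from a normal distribution
data = [random.gauss(100, 15) for _ in range(100_000)]

m = statistics.mean(data)
sd = statistics.stdev(data)

# Proportion of cases within one standard deviation of the mean
within_one = sum(1 for x in data if m - sd <= x <= m + sd) / len(data)
print(within_one)  # should land near 0.68 for a normal distribution
```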
In more advanced interpretations a researcher may use a fraction of the standard deviation and a normal distribution table to estimate a different proportion of the cases.