Statistical Control
To understand statistical control, let's return to the earlier common-sense example. The patient tried to follow the Doctor's directions. They increased their exercise, and after three months they returned to the doctor's scale to find no change in weight.
"What's up, Doctor?" asked the patient.
"Well, did you keep eating at the same level?" asked the Doctor.
"No, I was too hungry," answered the patient.
And here is the dilemma. A simple correlation assumes that "all other things are equal." In the real world, outside of a controlled experiment, that is not a realistic assumption. In the prior research example we cannot assume that all students taking math courses are equal. We know that some come into the college with better skills than others. Thus, we must control for the skills a student brings into the setting. If we do this statistically, then introducing additional variables into the model imposes statistical control for the variables we have identified and measured.
In the figure below, the "z spretest Algebra" variable is a pre-test of math skills given to students before they enter the college.

The algebra pretest score serves as a statistical control on the relationship between math course hours and math skills. In this model notice the following indicators.
- The coefficient for math course hours (no longer a correlation, but interpreted in generally the same manner) is .24. This coefficient is substantially less than the .48 in the simple correlation.
- However, the explained variance has increased from .23 to .54. This means that this model with two independent variables explains 54% of the variance. Thus, adding our knowledge of the pre-test score adds 31 percentage points of explained variance (.54 minus .23).
- The path (arrow) between the pre-test variable and the skills variable is .61. This means that knowledge of the pre-test score is a stronger predictor of math skills than the number of courses taken.
- However, "math course hours" are still a significant predictor of math skill.
- The curved line between the pre-test and math courses variables indicates that these two variables are significantly related, but in this model both are exogenous (outside, or external) to the model. The curved arrow indicates that this prior relationship is important in defining the coefficients in the model.
This covariance is the source of the statistical calculations that provide statistical control. This path shows that those students with higher pre-knowledge of math tend to take more math courses. (Math courses is a measure of college level math courses, with remedial and developmental math courses not included in this count, but in a variable used later.)
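To make the role of that covariance concrete, here is a minimal sketch of the textbook formulas for standardized coefficients in a model with two predictors. Only the .48 correlation comes from the text above. The other two correlations (pre-test with skills, and pre-test with course hours) are not reported in this section, so the values used here are hypothetical, chosen only so that the output lands in the neighborhood of the figure's coefficients. The function name is also just an illustration.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients and R-squared for a two-predictor model.

    r_y1: correlation of the outcome with predictor 1
    r_y2: correlation of the outcome with predictor 2
    r_12: correlation between the two predictors (the curved line)
    """
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# .48 is the simple correlation reported in the text.
# 0.70 and 0.39 are ASSUMED values, not taken from the study.
beta_courses, beta_pretest, r2 = standardized_betas(r_y1=0.48, r_y2=0.70, r_12=0.39)
print(round(beta_courses, 2), round(beta_pretest, 2), round(r2, 2))

# With no covariance between the predictors, nothing is "controlled away":
# the coefficient for course hours stays equal to its simple correlation.
beta_uncorrelated, _, _ = standardized_betas(r_y1=0.48, r_y2=0.70, r_12=0.0)
print(round(beta_uncorrelated, 2))
```

Notice that the coefficient for course hours shrinks below .48 only because the two predictors are correlated. If the curved line were zero, adding the pre-test would raise the explained variance but leave the course-hours coefficient unchanged.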
Given this model, it is possible to say that math courses increase math skill, controlling for the entry level of the student. However, we know that introducing this one variable (prior knowledge) is not enough to account for all prior conditions.
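One way to see what "controlling for" does mechanically is a small simulation. The sketch below uses made-up data, not the study's data: the effect sizes (0.25, 0.6, 0.4) and noise levels are assumptions chosen for illustration. It first fits course hours alone, then removes from course hours the part predicted by the pre-test and fits the outcome on what is left. That second slope is the same as the course-hours coefficient in a two-predictor regression.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x (intercept included implicitly)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """The part of y that a straight-line fit on x does not explain."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Simulated students: better-prepared students take more math courses,
# and both preparation and courses raise the final skill score.
pretest = [rng.gauss(0, 1) for _ in range(n)]
courses = [0.4 * p + rng.gauss(0, 0.9) for p in pretest]
skill = [0.25 * c + 0.6 * p + rng.gauss(0, 0.6)
         for c, p in zip(courses, pretest)]

# Without control: the slope absorbs part of the pre-test's effect.
naive = slope(courses, skill)

# With control: keep only the variation in courses unrelated to the pre-test.
courses_net = residuals(pretest, courses)
controlled = slope(courses_net, skill)

print(f"slope without control: {naive:.2f}")
print(f"slope with control:    {controlled:.2f}")
```

In this simulated setting the uncontrolled slope comes out noticeably larger than the 0.25 built into the data, while the controlled slope recovers something close to it. The same logic applies to any further prior condition that is measured and added to the model.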
The next step is to extend this statistical control to an expanded list.

