Reducing the variables in a model to achieve elegance
In the modeling used here as an example, a standardized test was administered to community college students across their enrollments, yielding five skill scores: math, reading, scientific thinking, critical thinking, and composition. The focus of the modeling was to examine and compare models for the acquisition of these five skills. To support this comparison, a common theoretical model was designed. This shared theoretical model was then reduced to skill-specific models that best fit the data.
Sometimes modeling uses a different approach in which variables are added to a model one at a time, and only when the increase in explained variance is statistically significant. Other times a specific model is hypothesized and then tested for fit with the data. Each of these approaches is appropriate to specific research settings and doctrine. It is not our purpose to examine the philosophical arguments around these different uses. For the purposes of this discussion, and because it worked well in the assessment process, I proceed with an explanation of model reduction procedures. This is not to say or imply that the other approaches are less valid, or that the reduction process is free of reservations and problems.
The first step is to develop a full model. The full model in the following diagram contains 22 variables. The diagram also includes many, but not all, of the covariance paths between exogenous variables.

The Model in General
Since the prior page included only some of these variables, take a moment to acquaint yourself with the full model. As a general guide I offer the following:
- In the upper left are three pre-test scores.
- In the lower left are nine variables, each the sum of the course credits (hours) in one subject area. Note that each of these has both a direct path to the skill score in the center and a path to the total number of college credits in the lower center of the diagram.
- In the upper center of the diagram is a count of the remedial and developmental credits "< 100."
- Above that is a variable for the student's GPA at the time of testing.
- In the upper right are three self-reports of external time commitments including study time, employment, and housework.
- In the lower right are the exogenous variables, including sex, age, motivation, non-native speaker of English, and prior college credits.
The order of the model flows from left to right, with paths (arrows) indicating the order assigned to the variables.
Double-headed arrows on curved lines indicate covariance relationships that are specified. Since our purpose here is to provide for general interpretation, rather than instruction in constructing structural equation specifications, I will not try to explain these relationships. The most interesting covariance relationships occur between the variable for being a non-native English speaker and the pre-test scores. These will be discussed when they remain in a final model.
The Procedure of Reduction:
First, the full theoretical model was calculated for each of the five skill scores. As one might expect, not all of the paths among variables were substantial or statistically significant. Eliminating these non-significant paths is the procedure used to reduce a model. In some uses a non-significant path is retained because the lack of significance provides evidence that is conceptually interesting, but such hypothesis-testing paths are not included in any of the models that follow.
This brings up the issue of what counts as a statistically significant coefficient. Here too, debates take up volumes of journal articles and books. The standard applied to these models requires that a path coefficient be at least 2 times its standard error. I am not going to explain standard error here; suffice it to say that the pages of tables output by structural equation software report both the standard errors and the path coefficients.
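Written as a formula, the standard above might look like the following. The symbols are my own notation rather than anything taken from the software output, and I am assuming the comparison is made on the absolute value of the coefficient, so that negative paths are treated the same way as positive ones.

```latex
\left| \frac{\hat{\beta}}{SE(\hat{\beta})} \right| \ge 2
```

As a purely illustrative example with made-up numbers, a coefficient of 0.21 with a standard error of 0.06 gives a ratio of 3.5 and would be retained, while a coefficient of 0.09 with a standard error of 0.05 gives a ratio of 1.8 and would not. A cutoff of 2 is close to the 1.96 critical value of a two-tailed test at the .05 level under a normal approximation.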
Reduction of the model proceeded in several steps. Using the results calculated for the full model, the variables with very small coefficients (less than 1.5 times the standard error) were eliminated as a block. The model was then recalculated and additional variables were eliminated one at a time, in an iterative process starting with the smallest coefficient that was less than 2 times its standard error.
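For readers who find a procedure easier to follow as code, here is a minimal sketch of the reduction steps just described. It is not the software used for these models. The `fit` argument is a hypothetical stand-in for whatever structural equation program recalculates the model and reports a coefficient and standard error for each variable, and the variable names and numbers in the toy table are invented for illustration only. I also assume that "smallest coefficient" means smallest in absolute value.

```python
from typing import Callable, Dict, List, Tuple

# variable name -> (path coefficient, standard error)
Estimates = Dict[str, Tuple[float, float]]


def ratio(coef: float, se: float) -> float:
    """Size of a coefficient relative to its standard error."""
    return abs(coef) / se


def reduce_model(
    variables: List[str],
    fit: Callable[[List[str]], Estimates],  # hypothetical refitting routine
    block_cutoff: float = 1.5,
    final_cutoff: float = 2.0,
) -> List[str]:
    # Step 1: fit the full model and drop, as a block, every variable
    # whose coefficient is below block_cutoff standard errors.
    estimates = fit(variables)
    kept = [v for v in variables if ratio(*estimates[v]) >= block_cutoff]

    # Step 2: refit, then drop one variable at a time until every
    # remaining coefficient is at least final_cutoff standard errors.
    while kept:
        estimates = fit(kept)
        weak = [v for v in kept if ratio(*estimates[v]) < final_cutoff]
        if not weak:
            break
        # Assumption: "smallest coefficient" means smallest absolute value.
        smallest = min(weak, key=lambda v: abs(estimates[v][0]))
        kept.remove(smallest)
    return kept


# Invented numbers, for illustration only. A real refit would change
# the estimates each time; this toy version simply looks them up.
TOY_TABLE: Estimates = {
    "pretest_math": (0.52, 0.05),
    "math_credits": (0.21, 0.06),
    "study_time": (0.09, 0.05),
    "age": (0.11, 0.06),
    "employment": (0.03, 0.05),
    "housework": (0.02, 0.04),
}


def toy_fit(variables: List[str]) -> Estimates:
    return {v: TOY_TABLE[v] for v in variables}


print(reduce_model(list(TOY_TABLE), toy_fit))
# ['pretest_math', 'math_credits']
```

In this toy run, employment and housework are dropped together in the first step, then study time and age are dropped one at a time. With real software the estimates shift after every recalculation, which is why the model is refit before each single elimination rather than dropping all weak variables at once.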
The reduced final models for each of the five skills follow, along with comments about the specifics of each model.
- Math Model
- College Reading Model
- Critical Thinking Model
- Scientific Reasoning Model
- Writing Model

