Computation and Analysis of Effect Sizes
The effect size calculated is g, the difference between the means of the intervention group and the control group, or the difference between the pretest and posttest
group means, divided by the pooled standard deviation. The sign of the difference
was positive when a treatment had a positive effect (thus, treatments that reduced
learning pathologies such as anxiety, surface approaches, and negative attitudes
were coded as having positive effects). The gs were converted to ds by correcting them
for bias (as the gs overestimate the population effect size, particularly in small
samples; see Hedges & Olkin, 1985). To determine whether each set of ds shared
a common effect size (i.e., was consistent across the studies), we calculated a
homogeneity statistic Qw, which has an approximate chi-square distribution with
k - 1 degrees of freedom, where k is the number of effect sizes (Hedges & Olkin,
1985). Given the large number of effect sizes that are combined into the various
categories, and the sensitivity of the chi-square statistic to this number, it is not
surprising that nearly all homogeneity statistics are significant. Because the most
critical comparisons are presented in interaction tables involving at least two
variables, we are more confident that the effect sizes within these cells are
sufficiently homogeneous for their means to serve as reasonable estimates of the
typical value.
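The computation described above can be sketched in Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the correction factor, variance formula, and inverse-variance weights are assumed to be the standard approximations associated with Hedges and Olkin (1985), which the text cites but does not spell out.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Difference between group means divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def correct_bias(g, n_t, n_c):
    """Convert g to d with the approximate small-sample correction (assumed form)."""
    m = n_t + n_c - 2
    return (1 - 3 / (4 * m - 1)) * g


def variance_of_d(d, n_t, n_c):
    """Approximate large-sample variance of d (assumed form)."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))


def homogeneity_q(ds, variances):
    """Return (Q, degrees of freedom) for a set of k effect sizes."""
    weights = [1 / v for v in variances]
    mean_d = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    q = sum(w * (d - mean_d) ** 2 for w, d in zip(weights, ds))
    return q, len(ds) - 1


# Hypothetical study: treatment mean 10, control mean 8, both SDs 2, n = 10 per group.
g = hedges_g(10, 8, 2, 2, 10, 10)
d = correct_bias(g, 10, 10)
print(g, d, variance_of_d(d, 10, 10))
```

The returned Q would be compared against a chi-square distribution with k - 1 degrees of freedom. Before these functions are applied, outcomes for which a reduction is desirable (for example, anxiety) would need their sign reversed so that a beneficial treatment yields a positive effect.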
We then used categorical models to determine the relation between the study
Hattie, Biggs, and Purdie
characteristics and the magnitude of the effect sizes, using the procedures outlined
by Hedges and Olkin (1985). These models provide a between-classes effect
(analogous to a main effect in an ANOVA design) and a test of homogeneity of
the effect sizes within each class. The between-classes effect is estimated by QB,
which has an approximate chi-square distribution with p - 1 degrees of freedom,
where p is the number of classes. The statistical significance of this
between-classes effect can be used to determine whether the average effect size differs over
classes. The tables reporting tests of categorical models also include the mean
weighted effect size for each class, calculated with each effect size weighted by
the reciprocal of its variance, and the 95% confidence interval of this mean. If this
confidence interval does not include zero, then the mean weighted effect size can
be considered significantly different from zero.
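A categorical model of this kind can be sketched as follows. The sketch assumes inverse-variance weights, a normal-theory 95% interval (mean plus or minus 1.96 standard errors), and a between-classes statistic computed as the weighted squared deviation of class means from the grand mean; the function names and the example data are hypothetical and are not taken from the study.

```python
import math


def weighted_summary(effects):
    """Summarize one class; effects is a list of (d, variance) pairs."""
    weights = [1 / v for _, v in effects]
    total_w = sum(weights)
    mean_d = sum(w * d for w, (d, _) in zip(weights, effects)) / total_w
    # Within-class homogeneity statistic, with (number of effects - 1) df.
    q_within = sum(w * (d - mean_d) ** 2 for w, (d, _) in zip(weights, effects))
    half_width = 1.96 * math.sqrt(1 / total_w)
    return {
        "mean": mean_d,
        "ci": (mean_d - half_width, mean_d + half_width),
        "weight": total_w,
        "q_within": q_within,
        "df_within": len(effects) - 1,
    }


def categorical_model(classes):
    """Return per-class summaries, the between-classes Q, and its df (p - 1)."""
    summaries = {name: weighted_summary(effects) for name, effects in classes.items()}
    total_w = sum(s["weight"] for s in summaries.values())
    grand_mean = sum(s["weight"] * s["mean"] for s in summaries.values()) / total_w
    q_between = sum(
        s["weight"] * (s["mean"] - grand_mean) ** 2 for s in summaries.values()
    )
    return summaries, q_between, len(classes) - 1


# Hypothetical data: two classes, each with two (d, variance) pairs.
example = {
    "class_a": [(0.2, 0.1), (0.4, 0.1)],
    "class_b": [(0.8, 0.1), (1.0, 0.1)],
}
summaries, q_between, df_between = categorical_model(example)
for name, s in summaries.items():
    excludes_zero = s["ci"][0] > 0 or s["ci"][1] < 0
    print(name, s["mean"], s["ci"], excludes_zero)
print(q_between, df_between)
```

Under these assumptions, a class whose interval excludes zero has a mean weighted effect size that differs significantly from zero, and the between-classes Q is referred to a chi-square distribution with p - 1 degrees of freedom.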