Acta Orthopaedica - Sat 04/09

Think twice about using stepwise regression analysis

Background

Some statistical software packages include programmes for stepwise multiple regression analysis. In short, this is a technique for building statistical models automatically, by selecting variables from a pre-defined set of candidate variables using a test-related criterion, e.g. an F- or p-value. Two main selection procedures exist: forward and backward. The former selects explanatory variables by consecutive inclusion, the latter by consecutive exclusion. The two procedures often produce different outcomes.

Statistical tests and parameter estimates

It is often argued that statistical testing is an important part of a scientific manuscript because p-values represent an objective method for assessing scientifically important differences in data. This is a nice idea, but it is false.

Statistical tests cannot be used to “assess” important differences. Statistical significance is used for checking whether an observed difference or effect can be explained by chance alone. When this is the case, the observation should of course be interpreted with caution. However, a statistically non-significant result never indicates that a difference or effect “does not exist”, because absence of evidence is not evidence of absence.

Furthermore, scientific importance is related to two different issues, which should not be confused: clinical and statistical significance. For example, whether a body temperature difference of 0.5 degrees Celsius is clinically significant or not depends on biology. The difference may be significant when predicting ovulation but insignificant when predicting recovery after hip fracture. In contrast, statistical significance depends entirely on statistical issues such as sample size; the 0.5 degree difference in body temperature may be statistically significant in a study of recovery after hip fracture with 400 subjects but not in one with 12.
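The dependence on sample size can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the between-subject standard deviation of 0.7 degrees Celsius, the split into two equally sized groups, and the use of a normal approximation instead of a t-distribution are all hypothetical choices that do not come from any study.

```python
import math

def two_sided_p(diff, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # P(|Z| > |z|) for a standard normal Z

ASSUMED_SD = 0.7  # hypothetical between-subject SD in degrees Celsius

p_large = two_sided_p(0.5, ASSUMED_SD, 200)  # 400 subjects, two groups of 200
p_small = two_sided_p(0.5, ASSUMED_SD, 6)    # 12 subjects, two groups of 6

print(f"400 subjects: p = {p_large:.2g}")
print(f" 12 subjects: p = {p_small:.2g}")
```

Under these assumptions the same 0.5 degree difference falls far below the conventional 0.05 threshold with 400 subjects and stays above it with 12. An exact t-test would give an even larger p-value in the small study.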

In addition, statistical test results are not objective. The outcome of statistical tests depends on the characteristics of the sample. A true difference in revision risk between two types of prostheses may show up in one sample but not in another, because a risk difference can easily be confounded by association with other factors affecting revision risk.

It is therefore, at least in observational studies, always necessary to take possible confounding factors into consideration. Sex and age are two common confounders. If not adjusted for, differences in the distributions of sex and age can bias a risk estimate and either produce an artifactual risk where none exists or mask an existing one.
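A worked example may make the mechanism concrete. The counts below are entirely hypothetical and chosen only for illustration: two prostheses, A and B, have identical revision risks within each age group, but A is assumed to be used mostly in younger patients, who are assumed to have the higher revision risk.

```python
# Hypothetical counts: (patients, revisions) by prosthesis and age group.
counts = {
    "A": {"young": (800, 80), "old": (200, 4)},
    "B": {"young": (200, 20), "old": (800, 16)},
}

def crude_risk(prosthesis):
    """Revision risk ignoring age."""
    patients = sum(n for n, _ in counts[prosthesis].values())
    revisions = sum(r for _, r in counts[prosthesis].values())
    return revisions / patients

def stratum_risk(prosthesis, age_group):
    """Revision risk within one age group."""
    patients, revisions = counts[prosthesis][age_group]
    return revisions / patients

crude_diff = crude_risk("A") - crude_risk("B")
stratum_diffs = {
    group: stratum_risk("A", group) - stratum_risk("B", group)
    for group in ("young", "old")
}

print(f"Crude risk difference (A - B): {crude_diff:.3f}")
for group, diff in stratum_diffs.items():
    print(f"Risk difference among {group} patients: {diff:.3f}")
```

In this constructed dataset the crude comparison suggests an excess risk for prosthesis A, although the age-specific risks are the same for both prostheses. The apparent difference is produced entirely by the unequal age distributions.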

Building statistical models

Adjustment for confounding can be performed using statistical (regression) models. Programmes for fitting statistical models are generally available in commercial software packages.

The testing and parameter estimation performed using a statistical model clearly depend on the variables included in the model. It is therefore crucial for confounding adjustment that known clinically significant variables are included in the regression model. The statistical significance of an adjustment variable is, however, irrelevant. A clinically significant variable may well be an important confounder even when it is statistically insignificant.

Stepwise regression analysis

The common practice (1) of screening a dataset using simple hypothesis tests and then including the statistically significant variables in a multiple statistical model to find out if they are “really significant” is therefore inappropriate. This technique should be used neither for confounding adjustment nor for prediction purposes.
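One way to see the problem with p-value screening is to apply it to data in which no candidate variable has any relation to the outcome. The simulation below is a minimal sketch under assumed settings (50 subjects, 20 candidate variables, a 0.05 threshold, normally distributed noise, and an approximate p-value based on the Fisher z-transform); none of these figures come from the cited studies.

```python
import math
import random

def correlation_p_value(x, y):
    """Approximate two-sided p-value for a Pearson correlation (Fisher z-transform)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

def screen_noise(n_subjects, n_candidates, rng, alpha=0.05):
    """Count pure-noise candidate variables that pass univariate screening."""
    outcome = [rng.gauss(0, 1) for _ in range(n_subjects)]
    passed = 0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_subjects)]
        if correlation_p_value(candidate, outcome) < alpha:
            passed += 1
    return passed

rng = random.Random(2008)  # fixed seed so the run is repeatable
n_datasets = 200
with_false_positive = sum(screen_noise(50, 20, rng) > 0 for _ in range(n_datasets))
fraction = with_false_positive / n_datasets

print(f"Datasets with at least one 'significant' noise variable: {fraction:.0%}")
```

With 20 unrelated candidates tested at the 0.05 level, the chance that at least one passes the screen is roughly 1 - 0.95^20, or about 64%, so most such datasets yield a "finding" that reflects chance alone.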

Stepwise regression analysis also uses p-value related criteria for building a statistical model. It is thus also an inappropriate method (2-3). The technique can perhaps be used for generating hypotheses about completely unknown phenomena, but a sound strategy for selecting variables in clinical and epidemiological studies, where some knowledge does exist, is to use clinical judgment.

In addition, stepwise regression has for many years been criticised by statisticians (4-6) for overestimating precision and producing biased regression coefficients. It seems, however, that little of this criticism has reached the medical community. Inappropriate use of stepwise regression analysis appears to be increasingly common in medical publications (7-8).
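The bias in regression coefficients can be illustrated with a simplified simulation. It does not implement a full stepwise procedure; it only mimics the selection step by keeping an estimate when it is statistically significant. The true slope of 0.2, the sample size of 50 and the cut-off of 1.96 are hypothetical values chosen for illustration.

```python
import math
import random

def slope_and_z(x, y):
    """Least-squares slope of y on x and its Wald statistic (slope / standard error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se

TRUE_SLOPE = 0.2  # hypothetical weak true effect
N_SUBJECTS = 50   # hypothetical sample size
rng = random.Random(1996)  # fixed seed so the run is repeatable

all_slopes, selected_slopes = [], []
for _ in range(2000):
    x = [rng.gauss(0, 1) for _ in range(N_SUBJECTS)]
    y = [TRUE_SLOPE * a + rng.gauss(0, 1) for a in x]
    slope, z = slope_and_z(x, y)
    all_slopes.append(slope)
    if abs(z) > 1.96:  # the variable would be "selected" at roughly p < 0.05
        selected_slopes.append(slope)

mean_all = sum(all_slopes) / len(all_slopes)
mean_selected = sum(selected_slopes) / len(selected_slopes)

print(f"True slope: {TRUE_SLOPE}")
print(f"Mean estimate over all samples: {mean_all:.2f}")
print(f"Mean estimate when selected:    {mean_selected:.2f}")
```

Averaged over all samples the estimate is close to the true slope, but among the samples in which the variable would have been selected the average is clearly larger. Reporting coefficients only for variables that survive a significance filter therefore tends to exaggerate effects.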

Other scientific journals, such as Annals of Internal Medicine, have also recently included statistical guidelines to “avoid stepwise methods of model building” in their Information for Authors (http://www.annals.org/shared/author_info.html#multivariable-analysis, July 9, 2007).

In conclusion, there are good reasons for thinking twice about using this method in medical research. Our recommendation is always to avoid stepwise regression.

References

1. Sun SW et al. Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis. J Clin Epidemiol. 1996;49:907-16.

2. Harrell FE, Jr, et al. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statist Med. 1996;15:361-87.

3. Mickey RM, Greenland S. The impact of confounder selection criteria on effect estimation. Am J Epidemiol. 1989;129:125-37.

4. Mantel N. Why stepdown procedures in variable selection. Technometrics. 1970;12:621-5.

5. Copas JB. Regression, prediction and shrinkage (with discussion). J R Stat Soc B. 1983;45:311-54.

6. Derksen S, Keselman HJ. Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol. 1992;45:265-82.

7. Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP. Why do we still use stepwise modelling in ecology and behaviour? J Anim Ecol. 2006;75:1182-9.

8. Malek MH, Berger DE, Coburn JW. On the inappropriateness of stepwise regression analysis for model building and testing. Eur J Appl Physiol. 2007 May 23; Epub ahead of print.

Acta Orthopaedica 2008 - Last modified: 2010-08-08