Jonas Ranstam PhD
Tips
This is a brief description of common statistical misunderstandings that often appear in manuscripts.
1. The greatest problem in medical research is insufficient statistical testing.
No, evaluations of inferential uncertainty may be necessary, but hypothesis testing is not. The greatest problems in medical research are related to inadequate research questions, flawed study designs, and confused interpretation of findings.
2. Why are p-values controversial?
P-values are often misunderstood and incorrectly interpreted as descriptive measures: findings in a sample are considered practically important when p < 0.05, and p > 0.05 is taken as an indication of equivalence. P-values are, however, measures of uncertainty, and a statistically significant finding is not necessarily scientifically relevant. Scientific relevance has to be shown by other means than p-values. Furthermore, statistical non-significance cannot be used to claim equivalence, as p > 0.05 merely reflects uncertainty. This incorrect use of p-values has evolved into an unfortunate standard and become a substitute for scientific reasoning.
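The point that a p-value measures uncertainty rather than effect size can be illustrated with a small sketch. The summary statistics below are hypothetical: the same observed mean difference gives a non-significant p-value in a small sample and a highly significant one in a large sample, which is why p > 0.05 cannot demonstrate equivalence.

```python
# Hypothetical example: identical effect (mean difference 0.2, SD 1.0),
# different sample sizes -- the p-value changes, the effect does not.
from scipy.stats import ttest_ind_from_stats

small = ttest_ind_from_stats(mean1=0.2, std1=1.0, nobs1=20,
                             mean2=0.0, std2=1.0, nobs2=20)
large = ttest_ind_from_stats(mean1=0.2, std1=1.0, nobs1=2000,
                             mean2=0.0, std2=1.0, nobs2=2000)

print(f"n=20 per group:   p = {small.pvalue:.3f}")   # not significant
print(f"n=2000 per group: p = {large.pvalue:.2e}")   # highly significant
```

The effect is the same in both calculations; only the amount of data, and hence the uncertainty, differs.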
3. What measure can be used to show the uncertainty of an estimated treatment effect?
Estimation uncertainty needs to be considered when the clinical relevance of an estimated effect is evaluated. The p-value cannot be used for this, as it measures the uncertainty of the relation between the null hypothesis and the data, not of the estimated effect size. The correct uncertainty measure of an estimated effect is its confidence interval.
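As a minimal sketch, a 95% confidence interval for a mean treatment difference can be computed from the estimate, its standard error, and the degrees of freedom. The numbers below are hypothetical, chosen only to show the calculation.

```python
# Hypothetical summary statistics for an estimated treatment difference.
from scipy import stats

diff = 2.5   # estimated mean difference (hypothetical)
se = 0.9     # standard error of the difference (hypothetical)
df = 58      # degrees of freedom, e.g. n1 + n2 - 2 with 30 per group

t_crit = stats.t.ppf(0.975, df)           # two-sided 95% critical value
ci = (diff - t_crit * se, diff + t_crit * se)
print(f"Estimated effect: {diff}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval conveys both the magnitude of the estimate and its precision, which is what an evaluation of clinical relevance requires.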
4. Why are odds ratios controversial?
The odds ratio is in some cases (e.g. in case-control studies) a relevant measure in itself, but in other cases (e.g. cohort studies) it is used as an approximation of the relative risk of an exposure. The approximation is good when the baseline risk is low, but otherwise two similar odds ratios can have different clinical interpretations (and two different odds ratios the same) because RR = OR/(1 - R + OR*R), where R = baseline risk, RR = relative risk, and OR = odds ratio. The clinical significance of a treatment effect cannot always be evaluated if the studied effect is presented as an odds ratio. The problem can be avoided by using a statistical method that provides direct estimates of the relative risk.
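The formula can be checked with a short calculation. The numbers are hypothetical: the same odds ratio of 3 corresponds to very different relative risks at a low and a high baseline risk.

```python
# Illustration of RR = OR / (1 - R + OR * R) with hypothetical numbers.
def rr_from_or(odds_ratio, baseline_risk):
    """Convert an odds ratio to a relative risk, given the baseline risk."""
    return odds_ratio / (1 - baseline_risk + odds_ratio * baseline_risk)

OR = 3.0
low = rr_from_or(OR, 0.01)    # low baseline risk: RR close to OR
high = rr_from_or(OR, 0.40)   # high baseline risk: RR far from OR
print(f"R = 0.01: RR = {low:.2f}")   # about 2.94
print(f"R = 0.40: RR = {high:.2f}")  # about 1.67
```

With a baseline risk of 1% the odds ratio of 3 approximates the relative risk well; with a baseline risk of 40% the same odds ratio corresponds to a relative risk of only about 1.7.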
5. When analysing data, it is important to check that all continuous variables have Gaussian distributions.
No. Some statistical methods, such as Student's t-test, are based on an underlying assumption of a Gaussian distribution, but why should all continuous variables in a research project have a Gaussian distribution? Furthermore, the p-value from a distributional test is, like all other p-values, a measure of uncertainty; it cannot directly show whether or not a variable has a Gaussian distribution. Moreover, in some cases it is not the observed variables but a derived quantity, such as the residual of a linear model, that is assumed to have a Gaussian distribution, and this can be Gaussian even when the original variables are not.
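The last point can be demonstrated with simulated (hypothetical) data: a heavily skewed predictor and a Gaussian error term produce a skewed outcome variable, yet the residuals of the fitted linear model are approximately Gaussian.

```python
# Simulated example: skewed variables, Gaussian residuals.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5000)      # heavily skewed predictor
y = 1.0 + 0.5 * x + rng.normal(0, 1, 5000)     # Gaussian error term

b, a = np.polyfit(x, y, 1)                     # fit y = a + b*x
residuals = y - (a + b * x)

print(f"skewness of x:         {skew(x):.2f}")          # clearly skewed
print(f"skewness of residuals: {skew(residuals):.2f}")  # near zero
```

Testing the observed variables for normality would reject here, even though the model's distributional assumption, which concerns the residuals, is perfectly satisfied.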
6. What about non-parametric data?
First, a null hypothesis may or may not include assumptions about a parameter, and a non-parametric null hypothesis can often be tested using a distribution-free test, but the term non-parametric has no specific implications for data. Second, distribution-free tests provide p-values but not necessarily effect size estimates, and p-values are controversial (see 2), which means that such tests are not useful for evaluating clinical significance.
7. Why shouldn't I use Bonferroni corrections?
Multiplicity issues (related to the testing of multiple null hypotheses) are important to address in confirmatory studies. One way is to use a Bonferroni correction, i.e. to divide the significance level by m, the number of tested null hypotheses. However, to avoid subjectivity the adjustment should be prespecified, and as it reduces the statistical power of the comparisons, it should also be accounted for in the sample size calculation, which increases patient numbers and costs. Multiplicity problems can often be avoided in the study design by careful endpoint definitions, or solved by using closed test procedures or more efficient adjustment methods such as Holm's or Hochberg's. In addition, while multiplicity is a problem in confirmatory studies, it is not relevant in exploratory or hypothesis-generating studies. Furthermore, the statistical analysis of observational studies needs to include validity considerations, as selection and confounding bias cannot be prevented in the study design, which implies that detailed prespecification is not practically possible. Moreover, the strategy, common in laboratory studies, of Bonferroni correcting for the number of exposure groups while ignoring that multiple endpoints are tested does not solve the multiplicity problem.
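The difference between Bonferroni and Holm can be sketched with a few hypothetical p-values. Holm's step-down procedure controls the family-wise error rate at the same level but is uniformly more powerful: here it rejects all four null hypotheses while Bonferroni rejects only two.

```python
# Bonferroni vs. Holm adjustment for m = 4 hypothetical p-values.
pvals = [0.001, 0.012, 0.021, 0.049]
m = len(pvals)
alpha = 0.05

# Bonferroni: compare every p-value with alpha / m.
bonf_reject = [p <= alpha / m for p in pvals]

# Holm: step down through the sorted p-values, comparing the k-th
# smallest with alpha / (m - k); stop at the first non-rejection.
holm_reject = [False] * m
order = sorted(range(m), key=lambda i: pvals[i])
for k, i in enumerate(order):
    if pvals[i] <= alpha / (m - k):
        holm_reject[i] = True
    else:
        break

print("Bonferroni:", bonf_reject)  # [True, True, False, False]
print("Holm:      ", holm_reject)  # [True, True, True, True]
```

Since Holm never rejects fewer hypotheses than Bonferroni at the same family-wise error rate, there is little reason to prefer the plain Bonferroni correction when an adjustment is needed.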
8. I have always performed my lab experiments in triplicates, and now the statistical reviewer complains about n=3.
Is the sample size of 3 really based on a sample size calculation with acceptable risks of false positive and false negative outcomes? Or are these risks unknown? If the uncertainty of the test result is too great, the result will not be reliable, so it is important to know the statistical precision. It seems to me that a statistical test based on a sample size of 3 is unlikely to provide reliable empirical evidence, and publishing scientific findings based on clairvoyance instead of empirical evidence is not easy, at least not in journals claiming to present scientific work.
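How little power n = 3 per group buys can be estimated by simulation. The effect size below is hypothetical, and deliberately generous: even for a large standardized effect (d = 1), a two-sample t-test with three observations per group rejects the null hypothesis only a small fraction of the time.

```python
# Simulated power of a two-sample t-test with n = 3 per group,
# assuming a (hypothetical) standardized effect size of d = 1.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
n, d, alpha, sims = 3, 1.0, 0.05, 20000

hits = 0
for _ in range(sims):
    a = rng.normal(0.0, 1.0, n)   # control group
    b = rng.normal(d, 1.0, n)     # treated group, shifted by d
    if ttest_ind(a, b).pvalue < alpha:
        hits += 1

power = hits / sims
print(f"Estimated power with n={n} per group: {power:.2f}")
```

With power this low, a "significant" result from triplicates is nearly as likely to be a false positive as a true discovery, and a non-significant result says almost nothing.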
9. Predictors, covariates, regressors, independent variables, and risk factors.
To be continued...
© Copyright 2019 Jonas Ranstam. All rights reserved.

