T-, Welch- and U-test in psychotherapy science

recommendations for application and interpretation

Authors

  • SFU Institut für Statistik

DOI:

https://doi.org/10.15135/2020.8.1.87-105

Abstract

In this first contribution to the Statistics series in psychotherapy science, the use of the t-test, Welch test and U-test for independent samples is presented as a best-practice approach.

The article gives recommendations on (1) the optimal choice of procedure, (2) the use of effect sizes, (3) the designation of results as relevant and (4) reporting conventions for the presentation of results. It also (5) addresses the problem of reliable statistical decision-making in the research context of psychotherapy science and (6) offers suggestions for dealing with this potential problem.
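As a rough illustration of the quantities named in the abstract, the sketch below computes Welch's t statistic with its Welch-Satterthwaite degrees of freedom, Cohen's d as an effect size, and the Mann-Whitney U statistic for two independent samples. It is not taken from the article: the function names and the two score lists are hypothetical, and the choice of Cohen's d with a pooled standard deviation is an assumption about which effect size is meant. Only the Python standard library is used, so no p-values are computed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Return Cohen's d using the pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def mann_whitney_u(a, b):
    """Return U for sample a: number of pairs with a_i > b_j, ties counted as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical questionnaire scores, invented purely for illustration.
treatment = [12, 15, 11, 18, 14, 16]
control = [10, 9, 13, 8, 11, 12]

t_stat, df = welch_t(treatment, control)
d = cohens_d(treatment, control)
u = mann_whitney_u(treatment, control)
print(f"Welch t = {t_stat:.2f}, df = {df:.1f}, Cohen's d = {d:.2f}, U = {u}")
```

In practice the p-values would come from a statistics library, for example `scipy.stats.ttest_ind(a, b, equal_var=False)` for the Welch test and `scipy.stats.mannwhitneyu(a, b)` for the U-test.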

Published

2025-10-30 — Updated on 2020-06-30

Section

statistics