Browse a curated collection of publications reflecting our lab’s research areas

Factor: Variation in observed effect sizes

Brief explanation of the argument: The “replication crisis” presupposes effect sizes that are fixed across time and contexts and can be divided between true and false. But when a given effect is measured in practice, “features unique to that context may mediate the average effect by adding additional mediator variance” […] “This can occur for numerous reasons, ranging from a poorly chosen statistical model to imperfect randomization, differing sample populations, environmental conditions, or flexibility in experimental design.” Their model shows that low replication rates may occur regardless of QRPs, unless the heterogeneity (between-study variance) is much smaller than the within-study variance of effect sizes and the sample size is sufficiently large.

Discipline: NA

Reference & doi: Bak-Coleman et al. (2022). Replication and reliability of science. SocArXiv. 10.31235/osf.io/rkyf7
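The heterogeneity argument above can be sketched as a small simulation. This is my own illustration, not the authors' model: every study here is run honestly (no QRPs), but each draws its own true effect from a distribution with between-study standard deviation `tau`; all parameter values are illustrative.

```python
import random

random.seed(1)

def z_statistic(mu, tau, n):
    """Run one study of n observations (within-study sd = 1, assumed known)."""
    d = random.gauss(mu, tau)                      # this study's true effect
    xbar = sum(random.gauss(d, 1) for _ in range(n)) / n
    return xbar * n ** 0.5                         # z-test statistic

def replication_rate(tau, mu=0.4, n=100, pairs=4000):
    """P(replication significant in the same direction | original significant)."""
    originals = replicated = 0
    for _ in range(pairs):
        if z_statistic(mu, tau, n) > 1.96:         # original "discovery"
            originals += 1
            if z_statistic(mu, tau, n) > 1.96:     # independent replication
                replicated += 1
    return replicated / originals

print(round(replication_rate(tau=0.0), 2))  # homogeneous: near the study's power
print(round(replication_rate(tau=0.6), 2))  # heterogeneous: far lower, still no QRPs
```

With `tau = 0` the replication rate tracks statistical power; once between-study variance is comparable to the within-study variance, the same honest pipeline yields a much lower rate.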

Factor: Multiple trials

Brief explanation of the argument: To increase power and detect small and medium effects, psychologists may run multiple trials or use multiple items and then aggregate the data. But this has been shown to inflate the estimated effect size.

Discipline: Psychology

Reference & doi: Brand et al. (2010). Exaggerated effect sizes from multiple trials. J. Gen. Psychol. 10.1080/00221309.2010.520360
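One mechanism behind this inflation can be shown in a few lines. This is my own sketch, not Brand et al.'s analysis, and all parameters are illustrative: averaging many trials per participant shrinks the error component of score variance, so a standardized effect size computed on aggregated scores comes out larger than the single-trial one, even though the raw group difference is unchanged.

```python
import random
import statistics

random.seed(2)

def cohens_d(k_trials, n=2000, true_diff=0.5, person_sd=1.0, trial_sd=1.0):
    """Group mean difference / pooled sd, when each score averages k trials."""
    def score():
        stable = random.gauss(0, person_sd)        # stable trait component
        noise = sum(random.gauss(0, trial_sd) for _ in range(k_trials))
        return stable + noise / k_trials           # trial noise averaged away
    group_a = [score() + true_diff for _ in range(n)]
    group_b = [score() for _ in range(n)]
    pooled_sd = statistics.stdev(group_a + group_b)  # rough pooled sd
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

print(round(cohens_d(k_trials=1), 2))   # d measured against full trial-level noise
print(round(cohens_d(k_trials=50), 2))  # larger d from the same true difference
```

The raw difference (`true_diff`) is identical in both calls; only the denominator of the standardized effect changes with aggregation.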

Factor: Unforeseen confounds

Brief explanation of the argument: Replication studies may suffer from unforeseen confounding factors, which are unknown and therefore undocumented in the original study; in the replication they may be noticed but left uncorrected because of reluctance to change procedures post hoc. The quality control that is typically applied to the original study is not applied to replication studies, leading to overestimation of irreproducibility.

Discipline: Psychology

Reference & doi: Bressan (2019). Confounds in “failed” replications. Front. Psychol. 10.3389/fpsyg.2019.01884

Factor: Centralized scientific community

Brief explanation of the argument: Evidence from comparing published drug–gene interaction claims with high-throughput experiments from the LINCS L1000 program suggests that centralized scientific communities, whose authors use similar methods and contribute to many articles, produce less replicable claims.

Discipline: Toxicogenomics

Reference & doi: Danchev et al. (2019). Centralized communities and replicability. eLife. 10.7554/eLife.43094

Factor: Between-site variation

Brief explanation of the argument: We should understand replications as instances of resampling (of populations, effects, etc.). Differences between sites/laboratories do not produce small random effects that cancel each other out; they often produce substantial, non-randomly distributed effects. The result is an inflation of false positives, especially in between-species comparisons and with small samples.

Discipline: Animal behaviour

Reference & doi: Farrar et al. (2021). Representativeness in animal cognition research. Anim. Behav. Cogn. 10.26451/abc.08.02.14.2021
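The false-positive inflation from site effects can be sketched as a resampling simulation. This is my own illustration, not the authors' code, with illustrative parameters: each group is tested at a single site, sites differ by a random amount (`site_sd`), and a naive two-sample t-test reads that site variation as a group effect even though the true group difference is zero.

```python
import random
import statistics

random.seed(3)

def false_positive_rate(site_sd, n=20, sims=2000):
    """Share of naive t-tests with |t| > 2.02 despite a true difference of 0."""
    hits = 0
    for _ in range(sims):
        site_a = random.gauss(0, site_sd)          # site A's idiosyncrasy
        site_b = random.gauss(0, site_sd)          # site B's idiosyncrasy
        a = [site_a + random.gauss(0, 1) for _ in range(n)]
        b = [site_b + random.gauss(0, 1) for _ in range(n)]
        se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
        if abs((statistics.mean(a) - statistics.mean(b)) / se) > 2.02:
            hits += 1                              # ~.05 cutoff for df ≈ 38
    return hits / sims

print(false_positive_rate(site_sd=0.0))  # no site variation: near the nominal 5%
print(false_positive_rate(site_sd=0.5))  # site variation: far above 5%
```

Because the site effect is shared by everyone in a group, it does not shrink with n; larger per-site samples make the spurious "group" difference look more, not less, significant.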

Factor: Small and non-representative samples

Brief explanation of the argument: We should understand replications as instances of resampling (of populations, effects, etc.). With small and non-representative samples (of experimental units, settings, treatments, and measurements), sampling variation alone can make other laboratories fail to replicate.

Discipline: Animal behaviour

Reference & doi: Farrar et al. (2021). Representativeness in animal cognition research. Anim. Behav. Cogn. 10.26451/abc.08.02.14.2021

Factor: Vaguely specified hypotheses

Brief explanation of the argument: We should understand replications as instances of resampling (of populations, effects, etc.). When hypotheses are vaguely specified, they can be operationalized in many different ways, so an original study and its replication may effectively sample different effects; a “failure” to replicate may then reflect that divergence rather than an unreliable finding.

Discipline: Animal behaviour

Reference & doi: Farrar et al. (2021). Representativeness in animal cognition research. Anim. Behav. Cogn. 10.26451/abc.08.02.14.2021