Paul Meehl on replication and significance testing
Still very relevant today.
A scientific study amounts essentially to a “recipe,” telling how to prepare the same kind of cake the recipe writer did. If other competent cooks can’t bake the same kind of cake following the recipe, then there is something wrong with the recipe as described by the first cook. If they can, then, the recipe is all right, and has probative value for the theory. It is hard to avoid the thrust of the claim: If I describe my study so that you can replicate my results, and enough of you do so, it doesn’t matter whether any of us did a significance test; whereas if I describe my study in such a way that the rest of you cannot duplicate my results, others will not believe me, or use my findings to corroborate or refute a theory, even if I did reach statistical significance. So if my work is replicable, the significance test is unnecessary; if my work is not replicable, the significance test is useless. I have never heard a satisfactory reply to that powerful argument.
Meehl, P. E. (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant using it. Psychological Inquiry, 1, 108-141, 173-180. [PDF]