Pre-registration (2 hours)


The best way to avoid p-hacking is through pre-registration. A pre-registration is an analysis plan that you create before you do a study (or, at least, before you look at the data from a study). You can later refer back to it to show which parts of your analyses were planned beforehand and which parts were only done after seeing the data. Technically, doing new weird analyses after you see the data is fine, as long as you make it clear that they weren't planned beforehand; p-hacking happens when you explore the data after seeing it and then later pretend that those analyses were what you had planned all along. Exploring the data is fine if you admit that it is exploration (and, therefore, that it's less trustworthy than the pre-planned analyses).
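To see why unplanned analyses are less trustworthy, here is a minimal simulation sketch. Everything in it is an illustrative assumption of mine, not something from the readings: two groups that truly do not differ, five unrelated outcome measures per study, 20 participants per group, and a simple z-test that assumes a known standard deviation of 1. The function names (two_sided_p, simulate) are made up for this example. The "planned" researcher tests only the first outcome, as if it had been pre-registered; the "p-hacked" researcher reports whichever outcome happened to give the smallest p-value.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes equal group sizes and a known standard deviation of 1
    (a simplification chosen for this illustration).
    """
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / math.sqrt(2.0 / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(n_studies=2000, n_per_group=20, n_outcomes=5, alpha=0.05, seed=1):
    """Return (planned_rate, hacked_rate): false-positive rates under no true effect."""
    rng = random.Random(seed)
    planned_hits = 0
    hacked_hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: any "effect" is noise.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(two_sided_p(a, b))
        # Planned analysis: only the outcome chosen in advance counts.
        if p_values[0] < alpha:
            planned_hits += 1
        # P-hacked analysis: report whichever outcome looks best.
        if min(p_values) < alpha:
            hacked_hits += 1
    return planned_hits / n_studies, hacked_hits / n_studies


planned_rate, hacked_rate = simulate()
print(f"False-positive rate, planned analysis:  {planned_rate:.3f}")
print(f"False-positive rate, best-of-5 outcome: {hacked_rate:.3f}")
```

Under these assumptions, the planned analysis should produce false positives at roughly the nominal 5% rate, while picking the best of five independent outcomes should do so far more often (in theory about 1 - 0.95^5, or roughly 23%). Real studies have correlated measures and more subtle forms of flexibility, so treat this as an illustration of the logic and not as an estimate for any actual study.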

This essay is a great description of what pre-registration is and how it works. Skim it (it's not necessary to read every word) to familiarize yourself with the concept of pre-registration and what kind of information is usually included in a pre-registration. Some additional useful readings are listed below. Good places to pre-register are aspredicted.org and osf.io.

If you've already done the module on "Publishing models and types of journals", you have also learned about registered reports, which may sound similar to pre-registration. These concepts are related, but not quite the same. A pre-registration is a record you make for yourself, which is not necessarily peer-reviewed by anyone, and later you can include a link to this record in any paper (even a regular paper) to prove that your analysis plan was pre-registered. A registered report, on the other hand, is a special kind of paper, in which your analysis plan itself gets peer-reviewed (often before you even carry out the study, but definitely before the reviewers see your results) and the journal guarantees that your paper will be accepted no matter what your results are. Therefore, a registered report necessarily includes a pre-registration of your research plan, but it also involves additional steps.

After you've skimmed the essay recommended above, proceed to the reflection questions below.

The benefits of pre-registration should be pretty obvious after reading the essay, so I won't ask you to repeat them. But what about drawbacks? Can you think of any potential drawbacks to pre-registration?

Some people are concerned about pre-registration because they worry it would prevent them from exploring and discovering new things. A lot of discoveries in science happen by accident, i.e., when you're intending to study one thing but then you unexpectedly observe some other interesting thing in your results. (This has happened to me as well. For example, I tried to do an EEG experiment about a particular aspect of Mandarin tones, and the experiment kept not working out because Tone 2 and Tone 3 were behaving differently in a way I had not anticipated. I ended up temporarily abandoning the original experiment and doing some new experiments to further investigate the Tone 2 - Tone 3 difference, which led to a whole other paper and ended up being more interesting than the original experiment. Many famous findings in psychology have also been discovered by accident.)

Another concern I have heard people state is that they think pre-registration works well for hypotheses about specific differences (e.g., expecting that one kind of word will be responded to faster than another kind of word in a reaction time task), but does not work well for modern psycholinguistic analyses in which people look at continuous measures and include lots of different features (e.g., word length, neighbourhood density, number of morphemes, semantic properties of the word, how common the word is, etc.) together to predict how fast people respond to any given word.

In fact, both of these criticisms are wrong. Pre-registration is completely compatible with exploring (i.e., pre-registration does not prevent learning stuff from serendipitous and unexpected findings), and is also completely compatible with research that uses lots of variables.

Why do I say that? Why is pre-registration not a problem for those kinds of research? If someone tells you "I won't do pre-registration because I want to be able to explore my data" or "I won't do pre-registration because my hypothesis involves a lot of variables", what can you tell them?

When you finish this activity, you are done with the module (assuming all your work on this and the previous tasks has been satisfactory). However, you may still continue on to the advanced-level task for this module if you wish to complete this module at the advanced level (if you're aiming for a higher grade or if you are just particularly interested in this topic). Otherwise, you can return to the module homepage to review this module, or return to the class homepage to select a different module or assignment to do now.


by Stephen Politzer-Ahles. Last modified on 2021-05-17. CC-BY-4.0.