Advanced level (4 hours)

Find a dataset that would be appropriate for testing with ANOVA. This may be a dataset from a research paper (many are available at this repository, although not all will be appropriate for ANOVA*) or it may be any random thing (there are lots of random "big data" datasets online nowadays, as well as things like country GDP, COVID-19 cases, and all kinds of other data that you can download); it doesn't need to be related to linguistics. You can use the data as-is, or you can rename the variables to be something more meaningful (e.g., you might download some data about the height of people from different countries, but call it "language proficiency" instead of "height" to make it easier to link to your own research). If a dataset has lots of variables, you don't need to analyze all of them; you could just choose a subset of the data that would be appropriate for ANOVA. You can do one-way ANOVA or factorial ANOVA.

*In particular, be aware that to use ANOVA, you have to have just one data point per participant (for regular ANOVA) or one data point per condition per participant (for repeated-measures ANOVA). Many of these sample datasets may have lots of data for each person; in that case, you would first need to average the data within each person (and condition) before you can do ANOVA. Or you can search for data that are already in the proper format.
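The averaging step described above can be sketched in a few lines. This is only an illustration in plain Python (the tutorial for this activity uses R), and the participant labels, condition names, and reaction-time values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format data: (participant, condition, reaction time in ms).
# Each participant has several trials per condition.
trials = [
    ("p1", "A", 510), ("p1", "A", 530), ("p1", "B", 600), ("p1", "B", 620),
    ("p2", "A", 480), ("p2", "A", 500), ("p2", "B", 560), ("p2", "B", 580),
]

def average_per_cell(rows):
    """Collapse repeated trials to one mean per (participant, condition)."""
    cells = defaultdict(list)
    for participant, condition, value in rows:
        cells[(participant, condition)].append(value)
    return {key: mean(values) for key, values in cells.items()}

cell_means = average_per_cell(trials)
for (participant, condition), value in sorted(cell_means.items()):
    print(participant, condition, value)
```

After this step there is exactly one value per participant per condition, which is the format a repeated-measures ANOVA needs. For a regular (between-participants) ANOVA, you would average over everything within each participant instead.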

See ANOVA in R.pdf for a simple tutorial on how to run ANOVA in R with little knowledge of R coding (although it will still be useful to go through the "Programming" module to understand the code there). You can also do your ANOVA in other statistical software, such as SPSS, the Excel Analysis ToolPak, or something like JASP or jamovi (these are both free programs that provide a point-and-click interface similar to SPSS); tutorials for how to do ANOVA in these are available online.
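Whichever program you use, a one-way ANOVA boils down to comparing the variance between groups with the variance within groups. The following is a minimal sketch of that calculation in plain Python, assuming independent groups with one data point per participant; the function name and the numbers are invented for illustration, and this is not the method used in the R tutorial. It stops at the F statistic and degrees of freedom; getting the p-value requires the F distribution, which your statistical software will provide.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: how far each value is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up scores for three groups of three participants each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A larger F means the group means differ more than you would expect from the spread of scores within each group.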

Describe the hypothesis that you want to test on these data, describe the results of the ANOVA you carried out, and describe what conclusion you can draw based on the p-value of that test. (Remember activity #2 of the "Introduction to inferential statistics and t-tests" module for caveats about what conclusions are or are not valid from p-values.)

by Stephen Politzer-Ahles. Last modified on 2021-05-15. CC-BY-4.0.