
# ANOVA (Analysis of Variance)

The one-way ANOVA (see One-way ANOVA) compares the averages of different groups and assesses how likely the observed differences between groups are, given the variability within groups. For example, if we were comparing the average height of British, Canadian, Australian, and New Zealand adults, we would be more confident of a real difference the larger the height differences between nations and the narrower the spread of heights within each nation. The more variable the height values are, the greater the chance that the observed differences could arise by chance even when sampling from the same population.
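The between-versus-within comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the code this software runs: the function name `one_way_f` and the height values are hypothetical, chosen so that both data sets share the same group means (170, 172 and 174 cm) and differ only in their within-group spread.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    # Between-group sum of squares: how far group means sit from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical heights in cm (illustration only).
narrow = [[169, 170, 171], [171, 172, 173], [173, 174, 175]]
wide = [[160, 170, 180], [162, 172, 182], [164, 174, 184]]

f_narrow = one_way_f(narrow)
f_wide = one_way_f(wide)
print(f_narrow, f_wide)
```

With identical group means, the narrowly spread data gives a much larger F statistic than the widely spread data, which is the sense in which a narrower spread within each group makes a real difference more credible.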

The ANOVA only works for numerical data, and the variable being averaged must be adequately normal in distribution, although the ANOVA is reasonably robust to violations of the normality assumption. The units in each group should also have been sampled randomly and independently. If the data comes from a more complex design, a more sophisticated model is needed. Split-plot designs, for example, can give misleading results if analysed using a one-way ANOVA.

The p value tells you how likely it would be to observe a difference at least as large as the one seen if all groups were drawn from the same population (i.e. if the grouping variable had no real relationship with the value being averaged). A small p value means that such a difference would be rare under that assumption. On that basis we might reject the null hypothesis in favour of the alternative hypothesis, namely that there is a difference according to the grouping variable.
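One way to make the idea of "drawing from the same population" concrete is a permutation sketch: shuffle the group labels many times and count how often the shuffled data produces an F statistic at least as large as the observed one. Note that this is a swapped-in approximation for illustration; a standard ANOVA obtains its p value from the F distribution instead. The function names, the ages and the number of shuffles below are all assumptions.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Approximate p value obtained by reshuffling values across groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling mimics the null hypothesis: group membership is arbitrary.
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical ages for three groups (illustration only).
ages = [
    [30, 32, 31, 29, 33],
    [40, 42, 41, 39, 43],
    [50, 52, 51, 49, 53],
]
p_value = permutation_p_value(ages)
print(p_value)
```

Because the three hypothetical groups barely overlap, almost no reshuffle reproduces a difference as large as the observed one, so the estimated p value is small.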

In the example below (based on made-up data for illustration only), we shouldn't be surprised that the p value is very low. There is a large visible difference in average age between the groups, and N is reasonably large for each group. In this case we could reject the hypothesis that nation has no relationship with age.

Be aware that terminology in this area is a sensitive matter. According to a widespread convention, we shouldn't conclude that there is a relationship, only that we reject the null hypothesis (see Hypothesis testing). At most, we might reject the null hypothesis in favour of the alternative hypothesis. See Statistical hypothesis testing.

If your data is not truly numeric but only ordinal, or if the variable being averaged is not adequately normal, the Kruskal-Wallis H Test is a good alternative.