EME 210
Data Analytics for Energy Systems

ANOVA


One-way ANOVA

ANOVA (short for "Analysis of Variance"; more on that later) compares means across multiple groups. Here, we focus on one-way ANOVA, which compares means across more than two categories of a single categorical variable. Recall that in Lesson 5 we did a comparison of means (difference of means hypothesis testing). ANOVA extends that comparison to more than two groups or categories, so a single test can show whether there is a significant difference in means across many groups. In that sense it is similar to a chi-square test, except that ANOVA looks at means whereas chi-square testing looks at proportions.

One-Way Chi-square vs. One-Way ANOVA

Hypotheses for a one-way chi-square test:
Ho: Defines proportions, p_i, for all k categories
Ha: At least one p_i is not as specified

Hypotheses for a one-way ANOVA test:
Ho: All population means, μ_i, are equal for all k categories
Ha: At least one μ_i is not equal to the others

Another way of stating the null and alternative hypotheses for one-way ANOVA is:

\begin{align} H_O:&\ \mu_1 = \mu_2 = \cdots = \mu_k \\ H_A:&\ \text{at least one mean is not equal to the others}\end{align}

Since chi-square tests look at proportions, they are suited for a categorical variable, which would then be summarized by a frequency table.

Figure: A generalized depiction of the typical data format for chi-square tests.

Data for a One-Way Chi-square Test (raw observations)
Categorical Variable
A
B
B
A
C

Data for a One-Way Chi-square Test (frequency table)
Categorical Variable    Frequency
A                       2
B                       2
C                       1
Credit: © Penn State is licensed under CC BY-NC-SA 4.0
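
As a minimal sketch of how the raw categorical observations in the text description above collapse into a frequency table, the snippet below counts each category with Python's standard library. The variable names (`observations`, `frequency`, `proportions`) are illustrative choices, not part of the course materials.

```python
from collections import Counter

# Raw observations of the categorical variable, copied from the
# text description above.
observations = ["A", "B", "B", "A", "C"]

# Count how often each category occurs; this is the frequency table.
frequency = Counter(observations)

# Observed proportions, the quantities a chi-square test compares
# against the proportions specified in the null hypothesis.
n = len(observations)
proportions = {category: count / n for category, count in frequency.items()}

for category in sorted(frequency):
    print(category, frequency[category], proportions[category])
```

The counts reproduce the frequency table (A: 2, B: 2, C: 1), and dividing by the sample size gives the observed proportions.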

On the other hand, ANOVA examines means and so is meant for a quantitative variable, which can be summarized into grouped sample means.

Figure: A generalized depiction of typical data for a one-way ANOVA test.

Data for a One-Way ANOVA Test (raw observations)
Categorical Variable    Quantitative Variable
A                       5
B                       4
B                       3
A                       3
C                       6

Data for a One-Way ANOVA Test (grouped sample means)
Categorical Variable    Mean
A                       4
B                       3.5
C                       6
Credit: © Penn State is licensed under CC BY-NC-SA 4.0
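
The grouped sample means in the text description above can be reproduced with a short sketch like the following, which uses only Python's standard library. The names `rows`, `groups`, and `group_means` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# (category, quantitative value) pairs copied from the text
# description above.
rows = [("A", 5), ("B", 4), ("B", 3), ("A", 3), ("C", 6)]

# Collect the quantitative values that belong to each category.
groups = defaultdict(list)
for category, value in rows:
    groups[category].append(value)

# One sample mean per category: the summary that one-way ANOVA compares.
group_means = {category: mean(values) for category, values in groups.items()}

for category in sorted(group_means):
    print(category, group_means[category])
```

Each category's values are averaged separately, giving means of 4 for A, 3.5 for B, and 6 for C, matching the summary table.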

Analysis of Variance

The key question that ANOVA answers is: “Are the differences among the group means statistically significant?” Most likely, the sample means from each group do not agree exactly. So, how much disagreement in the sample means is needed to say that there is a difference in the population means?

This is exemplified in the figure below, where one may be able to tell that the sample means (represented by asterisks) are different, but it’s hard to tell whether these differences matter much because the samples themselves (the boxplots) overlap so much.

Comparative boxplots of three different samples, X, Y, and Z, of a quantitative variable ("Variable"). The asterisks denote the sample mean of each group.
Credit: © Penn State is licensed under CC BY-NC-SA 4.0

To answer this, we need to consider how much the sample means could vary by random chance alone (that is, from randomly drawing the sample from the population). Therefore, we need to analyze the variance in the sampling distribution of the mean. Continuing with the example above, the figure below shows the corresponding bootstrap distributions for each sample mean (recall from Lesson 4 that the bootstrap distribution approximates the sampling distribution). We can now see that the sample means are distinct from one another: the spread of each bootstrap distribution (each boxplot) is small enough that the distributions do not overlap.

Comparative boxplots of the bootstrap distributions of sample means from the samples in the figure above
Credit: © Penn State is licensed under CC BY-NC-SA 4.0
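
To make the idea of a bootstrap distribution of the sample mean concrete, here is a rough sketch using only Python's standard library. The three samples are hypothetical values invented for illustration (they are not the data behind the figures), and the helper `bootstrap_means`, its parameters, and the seed are likewise assumptions.

```python
import random
from statistics import mean, stdev

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Return the means of n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical samples X, Y, Z of a quantitative variable.
samples = {
    "X": [4.1, 5.3, 6.0, 4.8, 5.5, 5.1, 4.6, 5.9],
    "Y": [6.2, 7.1, 6.8, 7.5, 6.4, 7.0, 6.9, 7.3],
    "Z": [8.4, 9.0, 8.1, 8.8, 9.3, 8.6, 8.9, 8.2],
}

# Bootstrap distribution of the sample mean for each group.
boot = {name: bootstrap_means(values) for name, values in samples.items()}

for name, values in samples.items():
    # Compare the spread of the raw sample with the spread of its
    # bootstrap means.
    print(name, round(mean(values), 2), round(stdev(values), 2),
          round(stdev(boot[name]), 2))
```

In this sketch the standard deviation of the bootstrap means comes out smaller than the standard deviation of the raw observations in each group. That is why groups whose samples overlap can still have clearly separated sample means.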
