Comparing means with ANOVA

# Comparing means with ANOVA
### Prof. Maria Tackett

---

## [Click here for PDF of slides](09-anova.pdf)

---

## Topics

- Compare group means using analysis of variance

---

## Aldrin in the Wolf River

- The Wolf River in Tennessee flows past an abandoned site once used by the pesticide industry for dumping wastes, including chlordane (pesticide), aldrin, and dieldrin (both insecticides).

- These highly toxic organic compounds can cause various cancers and birth defects.

---

## Aldrin in the Wolf River

.pull-left[

```
## # A tibble: 30 × 2
##    aldrin depth 
##     <dbl> <chr> 
##  1    3.8 bottom
##  2    4.8 bottom
##  3    4.9 bottom
##  4    5.3 bottom
##  5    5.4 bottom
##  6    5.7 bottom
##  7    6.3 bottom
##  8    7.3 bottom
##  9    8.1 bottom
## 10    8.8 bottom
## # … with 20 more rows
```
]

.pull-right[
<img src="09-anova_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />
]

---

## Aldrin in the Wolf River

- The standard method to test whether these substances are present in a river is to take samples at six-tenths depth.

<br>

- Because these compounds are denser than water and their molecules tend to stick to particles of sediment, they are more likely to be found in higher concentrations near the bottom than near mid-depth.

---

## Is there a difference between the mean aldrin concentrations among the three depth levels?

---

## Aldrin by depth

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> depth </th>
   <th style="text-align:right;"> n </th>
   <th style="text-align:right;"> mean </th>
   <th style="text-align:right;"> sd </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> bottom </td>
   <td style="text-align:right;"> 10 </td>
   <td style="text-align:right;"> 6.04 </td>
   <td style="text-align:right;"> 1.579 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> middepth </td>
   <td style="text-align:right;"> 10 </td>
   <td style="text-align:right;"> 5.05 </td>
   <td style="text-align:right;"> 1.104 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> surface </td>
   <td style="text-align:right;"> 10 </td>
   <td style="text-align:right;"> 4.20 </td>
   <td style="text-align:right;"> 0.660 </td>
  </tr>
</tbody>
</table>
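
As a quick check, the `bottom` row of this table can be reproduced from the ten bottom-depth observations printed in the data preview. The sketch below uses Python's standard `statistics` module; the variable names are illustrative, and the other two depths are left out because their raw values are not shown in these slides.

```python
from statistics import mean, stdev

# The ten bottom-depth aldrin values shown in the data preview
bottom = [3.8, 4.8, 4.9, 5.3, 5.4, 5.7, 6.3, 7.3, 8.1, 8.8]

n_bottom = len(bottom)
mean_bottom = mean(bottom)   # sample mean
sd_bottom = stdev(bottom)    # sample standard deviation (divides by n - 1)

print(n_bottom, round(mean_bottom, 2), round(sd_bottom, 3))
```

The printed values should match the `n`, `mean`, and `sd` columns for `bottom` up to rounding.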

---

So far, we have used a .vocab[quantitative] predictor variable to understand the variation in a quantitative response variable.
<br>

Now, we will use a .vocab[categorical (qualitative)] predictor variable to understand the variation in a quantitative response variable.

---

## Notation

- `$K$` is the number of mutually exclusive groups. We index the groups as `$i = 1,\dots, K$`.
<br>

- `$n_i$` is the number of observations in group `$i$`
<br>

- `$n = n_1 + n_2 + \dots + n_K$` is the total number of observations in the data
<br>

--
  
- `$y_{ij}$` is the `$j^{th}$` observation in group `$i$`, for all `$i,j$`
<br>

- `$\mu_i$` is the population mean for group `$i$`, for `$i = 1,\dots, K$`

---

## Using ANOVA to compare means

- .vocab[Question of interest:] Is the mean value of the response `$y$` the same for all groups, or is there at least one group with a significantly different mean value?

- To answer this question, we will test the following hypotheses:

.alert[
$$
`\begin{aligned}
&H_0: \mu_1 = \mu_2 = \dots =  \mu_K\\
&H_a: \text{At least one }\mu_i \text{ is not equal to the others}
\end{aligned}`
$$
]

---

## What's happening...

.alert[
$$
`\begin{aligned}
&H_0: \mu_1 = \mu_2 = \dots =  \mu_K\\
&H_a: \text{At least one }\mu_i \text{ is not equal to the others}
\end{aligned}`
$$
]

- If the sample means are "far apart", there is evidence against `$H_0$`

- We will calculate a test statistic to quantify "far apart" in the context of the data

---

## Analysis of Variance (ANOVA)

**Main idea:** Decompose the <font color="green">total variation</font> in the data into the variation <font color="blue">between groups (model)</font> and the variation <font color="red">within each group (residuals)</font>

`$$\color{green}{\sum_{i=1}^{K}\sum_{j=1}^{n_i}(y_{ij}- \bar{y})^2}=\color{blue}{\sum_{i=1}^{K}n_i(\bar{y}_i-\bar{y})^2} + \color{red}{\sum_{i=1}^{K}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2}$$`
<br>

- If the variation <font color="blue">between groups</font> is significantly greater than the variation <font color="red">within each group</font>, then there is evidence against the null hypothesis.
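
The decomposition can be sketched numerically from the summary statistics in the "Aldrin by depth" table alone. The snippet below is a minimal illustration, not the code behind these slides. It relies on the identity `$\sum_j (y_{ij}-\bar{y}_i)^2 = (n_i - 1)s_i^2$` for the within-group term, and because the table values are rounded, the results agree with exact sums of squares only up to rounding. All variable names are illustrative.

```python
# Summary statistics by depth, taken from the "Aldrin by depth" table
groups = {
    "bottom":   {"n": 10, "mean": 6.04, "sd": 1.579},
    "middepth": {"n": 10, "mean": 5.05, "sd": 1.104},
    "surface":  {"n": 10, "mean": 4.20, "sd": 0.660},
}

n_total = sum(g["n"] for g in groups.values())
grand_mean = sum(g["n"] * g["mean"] for g in groups.values()) / n_total

# Between-group variation: sum over groups of n_i * (ybar_i - ybar)^2
ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups.values())

# Within-group variation: sum over groups of (n_i - 1) * s_i^2
ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in groups.values())

# Total variation is the sum of the two pieces
ss_total = ss_between + ss_within

print(round(ss_between, 3), round(ss_within, 3), round(ss_total, 3))
```

A between-group sum of squares that is large relative to the within-group one is what counts as evidence against `$H_0$`; the F statistic makes that comparison precise after dividing each by its degrees of freedom.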

---

## ANOVA table

---

## Total variation

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> sumsq </th>
   <th style="text-align:right;"> meansq </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> depth </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 2 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 16.961 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 8.480 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 6.134 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.006 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> Residuals </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 27 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 37.329 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 1.383 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
  </tr>
</tbody>
</table>

`$SS_{Total}= 16.961 + 37.329 = 54.290$`

`$DF_{Total} = 2 + 27 = 29$`

`$s^2_y = \frac{SS_{Total}}{DF_{Total}} = \frac{54.290}{29} = 1.872$`

---

## Between variation

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> sumsq </th>
   <th style="text-align:right;"> meansq </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> depth </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 2 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 16.961 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 8.480 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 6.134 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.006 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 27 </td>
   <td style="text-align:right;"> 37.329 </td>
   <td style="text-align:right;"> 1.383 </td>
   <td style="text-align:right;">  </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

`$SS_{Between}= 16.961$`

`$DF_{Between} = 2$`

`$MS_{Between} = \frac{SS_{Between}}{DF_{Between}} = \frac{16.961}{2} = 8.480$`

---

## Within variation

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> sumsq </th>
   <th style="text-align:right;"> meansq </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> depth </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 16.961 </td>
   <td style="text-align:right;"> 8.480 </td>
   <td style="text-align:right;"> 6.134 </td>
   <td style="text-align:right;"> 0.006 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> Residuals </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 27 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 37.329 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 1.383 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
  </tr>
</tbody>
</table>

`$SS_{Within}= 37.329$`

`$DF_{Within} = 27$`

`$MS_{Within} = \frac{SS_{Within}}{DF_{Within}} = \frac{37.329}{27} = 1.383$`

---

## Using ANOVA table to test difference in means

<br>

.eq[
$$
`\begin{aligned}
&H_0: \mu_1 = \mu_2 = \mu_3\\
&H_a: \text{At least one depth level has }\mu_i \text{ that is not equal to the others}
\end{aligned}`
$$
]

---

## Using ANOVA table to test difference in means

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> sumsq </th>
   <th style="text-align:right;"> meansq </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> depth </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 16.961 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 8.480 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 6.134 </td>
   <td style="text-align:right;"> 0.006 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 27 </td>
   <td style="text-align:right;"> 37.329 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 1.383 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

`$$F = \frac{MS_{Between}}{MS_{Within}} = \frac{8.480}{1.383} = 6.134$$`
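
The same calculation can be sketched in a few lines of Python, starting from the `sumsq` and `df` columns of the table (variable names are illustrative). Dividing the three-decimal mean squares, 8.480 / 1.383, gives roughly 6.132; carrying the unrounded mean squares through, as below, recovers the tabled 6.134.

```python
# Sums of squares and degrees of freedom from the ANOVA table
ss_between, df_between = 16.961, 2
ss_within, df_within = 37.329, 27

# Mean squares: each sum of squares divided by its degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F statistic: ratio of between-group to within-group mean square
f_stat = ms_between / ms_within

print(round(f_stat, 3))
```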

---

## Calculate p-value

Calculate the p-value using an F distribution with `$K-1$` and `$n-K$` degrees of freedom
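
For the aldrin data, `$K - 1 = 2$` and `$n - K = 27$`. When the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form `$P(F > f) = (1 + 2f/d_2)^{-d_2/2}$`, so the p-value can be sketched with the Python standard library alone. The helper name `f_upper_tail_df1_2` is hypothetical and the shortcut is valid only for this special case; in general a library routine such as `scipy.stats.f.sf(f_stat, K - 1, n - K)` would be used instead.

```python
def f_upper_tail_df1_2(f_stat: float, df2: int) -> float:
    """P(F > f_stat) for an F distribution with 2 and df2 degrees of freedom.

    Closed form that holds only when the numerator df is exactly 2.
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)


K, n = 3, 30        # three depth levels, 30 observations in total
f_stat = 6.134      # F statistic from the ANOVA table

p_value = f_upper_tail_df1_2(f_stat, n - K)
print(round(p_value, 3))
```

The result should agree with the `p.value` column of the ANOVA table to three decimal places.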

---

## Using ANOVA table to test difference in means

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> sumsq </th>
   <th style="text-align:right;"> meansq </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> depth </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 16.961 </td>
   <td style="text-align:right;"> 8.480 </td>
   <td style="text-align:right;"> 6.134 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.006 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 27 </td>
   <td style="text-align:right;"> 37.329 </td>
   <td style="text-align:right;"> 1.383 </td>
   <td style="text-align:right;">  </td>
   <td style="text-align:right;background-color: #dce5b2 !important;">  </td>
  </tr>
</tbody>
</table>

.vocab[P-value]: Probability of observing a test statistic at least as large as the observed *F* statistic, given that the population group means are equal

The p-value is small `$(0.006)$`, so we reject `$H_0$`. The data provide sufficient evidence that at least one depth level has a mean aldrin concentration that differs from the others.

---

## Assumptions for ANOVA


1️⃣ .vocab[Normality:] `$y_{ij} \sim N(\mu_i, \sigma^2)$`

2️⃣ .vocab[Constant variance:] The population distribution for each group has a common variance, `$\sigma^2$`

3️⃣ .vocab[Independence:] The observations are independent of each other
- This applies to observations within and between groups

---

## Checking Normality

✅ No major skewness or outliers.

---

## Checking Normality

✅ Points fall relatively along the diagonal line.

---

## Checking constant variance

.pull-left[

```
## # A tibble: 3 × 4
##   depth        n  mean    sd
##   <chr>    <int> <dbl> <dbl>
## 1 bottom      10  6.04 1.58 
## 2 middepth    10  5.05 1.10 
## 3 surface     10  4.2  0.660
```
]

.pull-right[
✅ The largest standard deviation is about 2.4 times the smallest. This is OK given the small sample size.
]
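
The ratio quoted above can be checked directly from the group standard deviations in the summary table. This is a minimal sketch with illustrative variable names; the cutoff of 2 is the general rule of thumb used in these slides, not a formal test.

```python
# Group standard deviations from the summary table
sds = {"bottom": 1.579, "middepth": 1.104, "surface": 0.660}

# Ratio of the largest to the smallest group standard deviation
sd_ratio = max(sds.values()) / min(sds.values())

# Rule of thumb: a ratio of at most 2 satisfies the condition outright
within_rule = sd_ratio <= 2

print(round(sd_ratio, 2), within_rule)
```

Here the ratio is somewhat above 2, which is treated as acceptable only because each group has just 10 observations.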

---

## Checking independence

✅ Based on what we know about the study, we have no reason to believe that the aldrin concentrations are not independent of each other.

---

## Robustness to Assumptions

- .vocab[Normality:] `$y_{ij} \sim N(\mu_i, \sigma^2)$`
  + ANOVA is relatively robust to departures from Normality.
  + Concern when there are strongly skewed distributions with different sample sizes (especially if sample sizes are small, < 10 in each group)
  
--

- .vocab[Independence: ]There is independence within and across groups
  + If this doesn't hold, use methods that account for correlated errors

---

## Robustness to Assumptions

- .vocab[Constant variance: ]The population distribution for each group has a common variance, `$\sigma^2$`
  + Critical assumption, since the pooled (combined) variance is important for ANOVA
  + **General rule:** Satisfied if `$SD_{max}/SD_{min} \leq 2$`. OK if this is somewhat `$> 2$` when sample sizes are small.

---

## Recap

- Used ANOVA to compare means across groups

---

## Acknowledgements

- Analysis example and map image from [OpenIntro Statistics](https://www.openintro.org/)