# Simple Linear Regression
## Inference
### Prof. Maria Tackett

---

## [Click for PDF of slides](04-slr-coef-inf.pdf)

---

## Topics

- Conduct a hypothesis test for `$\beta_1$`

<br>

- Calculate a confidence interval for `$\beta_1$`

---

## Movie ratings data

The data set contains the "Tomatometer" score (.term[critics]) and audience score (.term[audience]) for 146 movies rated on rottentomatoes.com.

---

## The model

```r
model <- lm(audience ~ critics, data = movie_scores)
```

```r
model %>%
  tidy() %>%
  kable(format = "html", digits = 3)
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 32.316 </td>
   <td style="text-align:right;"> 2.343 </td>
   <td style="text-align:right;"> 13.795 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> critics </td>
   <td style="text-align:right;"> 0.519 </td>
   <td style="text-align:right;"> 0.035 </td>
   <td style="text-align:right;"> 15.028 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
</tbody>
</table>

---

## The model

`$$\color{blue}{\hat{\text{audience}} = 32.316 + 0.519 \times \text{critics}}$$`
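As a quick illustration of how to read the fitted equation, the sketch below plugs one critics score into it. The coefficients are the rounded estimates from the model output; `critics_score = 50` is a hypothetical input chosen for illustration, not a movie from the data.

```r
# Rounded coefficient estimates from the model output
b0 <- 32.316  # intercept
b1 <- 0.519   # slope for critics

# Hypothetical critics score (an assumption for illustration)
critics_score <- 50

# Predicted audience score from the fitted line
audience_hat <- b0 + b1 * critics_score
audience_hat
```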

---

### Do the data provide sufficient evidence that `$\beta_1$` is different from 0?

---

## Outline of a hypothesis test

1️⃣ State the hypotheses.

2️⃣ Calculate the test statistic.

3️⃣ Calculate the p-value.

4️⃣ State the conclusion.

---

## 1️⃣ State the hypotheses

<br>

.pull-left[
.small-box[
`$$\large{\begin{aligned}& H_0: \beta_1 = 0\\& H_a: \beta_1 \neq 0\end{aligned}}$$`
]
]

---

## 2️⃣ Calculate the test statistic

<br>

.eq[
`$$\text{test statistic} = \frac{\text{Estimate} - \text{Hypothesized}}{\text{Standard error}}$$`
]
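For the slope in this model, the formula can be sketched in R as follows. The estimate and standard error are the less-rounded values for `critics` (0.5187 and 0.0345), and the hypothesized value is 0 from `$H_0$`. The variable names are illustrative.

```r
# Slope estimate and standard error for critics (less-rounded values)
estimate     <- 0.5187
std_error    <- 0.0345

# Value of beta_1 under the null hypothesis
hypothesized <- 0

# Test statistic: (estimate - hypothesized) / standard error
t_stat <- (estimate - hypothesized) / std_error
round(t_stat, 2)
```

The result agrees with the `statistic` column of the model output up to rounding of the inputs.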

---

## 2️⃣ Calculate the test statistic

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 32.316 </td>
   <td style="text-align:right;"> 2.343 </td>
   <td style="text-align:right;"> 13.795 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td>
  </tr>
</tbody>
</table>

<br>

.small-box-work[
`$$\begin{aligned}t &= \frac{0.5187 - 0}{0.0345}\\
&= \mathbf{15.03}\end{aligned}$$`
]

---

## 3️⃣ Calculate the p-value

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 32.316 </td>
   <td style="text-align:right;"> 2.343 </td>
   <td style="text-align:right;"> 13.795 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td>
  </tr>
</tbody>
</table>

<br>

The p-value is calculated from a `$t$` distribution with `$n-2$` degrees of freedom.
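A minimal sketch of this calculation in R, using the base function `pt()`. It assumes all 146 movies were used to fit the model, so that `$n - 2 = 144$`, and it takes the test statistic for `critics` from the table.

```r
n      <- 146     # number of movies (assumes none were dropped from the fit)
t_stat <- 15.028  # test statistic for critics from the model output

# Degrees of freedom for simple linear regression
df <- n - 2

# Two-sided p-value: area in both tails beyond |t|
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
p_value
```

The p-value is far smaller than 0.001, which is why it is displayed as 0 when rounded to three digits.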

---

## Understanding the p-value

*These are general guidelines. The strength of evidence depends on the context of the problem.*

---

## 4️⃣ State the conclusion

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 32.316 </td>
   <td style="text-align:right;"> 2.343 </td>
   <td style="text-align:right;"> 13.795 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td>
  </tr>
</tbody>
</table>

<br>

The data provide sufficient evidence that the population slope `$\beta_1$` is different from 0.

.vocab[There is a linear relationship between the critics score and audience score for movies on rottentomatoes.com.]

---

### What is a plausible range of values for the population slope `$\beta_1$`?

---

## Confidence interval for `$\beta_1$`

<br>

`$t^*$` is calculated from a `$t$` distribution with `$n-2$` degrees of freedom
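The interval takes the standard form `$\hat{\beta}_1 \pm t^* \times SE_{\hat{\beta}_1}$`. As a sketch, the critical value for a 95% interval can be found with the base R function `qt()`, assuming all 146 movies were used in the fit.

```r
n          <- 146   # number of movies (assumes none were dropped from the fit)
conf_level <- 0.95  # confidence level

# Degrees of freedom for simple linear regression
df <- n - 2

# Critical value: the quantile that leaves (1 - conf_level) / 2 in the upper tail
t_star <- qt(1 - (1 - conf_level) / 2, df = df)
round(t_star, 3)
```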
  
---

## Calculating the 95% CI for `$\beta_1$`

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 32.316 </td>
   <td style="text-align:right;"> 2.343 </td>
   <td style="text-align:right;"> 13.795 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #dce5b2 !important;"> critics </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.519 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0.035 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 15.028 </td>
   <td style="text-align:right;background-color: #dce5b2 !important;"> 0 </td>
  </tr>
</tbody>
</table>

`$$\hat{\beta}_1 = 0.519 \hspace{15mm} t^* = 1.977 \hspace{15mm} SE_{\hat{\beta}_1} = 0.035$$`
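Putting the three quantities together, the interval can be sketched in R as estimate plus or minus `$t^*$` times the standard error. The variable names are illustrative, and the inputs are the rounded values shown above.

```r
beta1_hat <- 0.519  # slope estimate
t_star    <- 1.977  # critical value from the t distribution
se_beta1  <- 0.035  # standard error of the slope

# Margin of error and the resulting interval
margin <- t_star * se_beta1
ci <- c(lower = beta1_hat - margin, upper = beta1_hat + margin)
round(ci, 3)
```

This gives an interval of roughly 0.450 to 0.588.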

---

## Interpretation

<br>

.vocab[We are 95% confident that for every one-point increase in the critics score, the audience score is predicted to increase, on average, by between 0.450 and 0.588 points.]

---

## Recap

- Conducted a hypothesis test for `$\beta_1$`

<br>

- Calculated a confidence interval for `$\beta_1$`