I am running linear regressions in R and trying to understand how certain p-values are calculated and what they represent. My understanding so far:

The p-values from summary() come from t-tests of the marginal effect of each variable, *given that all the other variables are already in the model*. This corresponds to Type III sums of squares.

The anova() function instead reports F-tests, which are sequential tests based on Type I sums of squares. For example, suppose we have the following output:

```
Analysis of Variance Table

Response: soma
          Df Sum Sq Mean Sq F value  Pr(>F)
ht2        1  0.071  0.0710  0.1289 0.72073
wt2        1  4.635  4.6349  8.4196 0.00504 **
ht9        1  3.779  3.7792  6.8651 0.01090 *
Residuals 66 36.333  0.5505
---
```

The p-values are testing the significance of ht2 *in the presence of the intercept only*, of wt2 in the presence of *only the intercept and ht2*, and of ht9 in the presence of *the intercept, ht2, and wt2*.

Is this understanding correct? And if it is, then why do the p-values change when we add additional variables? For example:

```
Analysis of Variance Table

Response: soma
          Df  Sum Sq Mean Sq F value    Pr(>F)
ht2        1  0.0710  0.0710  0.2072 0.6504835
wt2        1  4.6349  4.6349 13.5353 0.0004772 ***
ht9        1  3.7792  3.7792 11.0363 0.0014695 **
wt9        1 14.0746 14.0746 41.1018 1.878e-08 ***
Residuals 65 22.2581  0.3424
---
```

Adding the wt9 variable decreased the p-value for ht2 from 0.72073 to 0.6504835, even though its Sum Sq is unchanged. But if this is just testing the significance of ht2 in the presence of nothing but the intercept, shouldn’t the p-value be identical?
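
One thing I can check by hand from the two tables (a rough calculation from the rounded values shown, so the last digit may differ): each F value appears to be the row's Mean Sq divided by the Residuals Mean Sq of the same table. For ht2 that gives

$$
F_{\text{ht2}} = \frac{0.0710}{0.5505} \approx 0.129 \quad \text{(three predictors)}, \qquad
F_{\text{ht2}} = \frac{0.0710}{0.3424} \approx 0.207 \quad \text{(four predictors)}.
$$

So the numerator stays the same, but the denominator seems to come from the residuals of the largest model, and the residual degrees of freedom drop from 66 to 65. If that is right, the test for ht2 is not based on the intercept-only comparison alone.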

Thanks in advance for any clarifications!

