library(ggplot2)
library(dplyr)

theme_set(theme_bw())

Introduction

This lab is intended to introduce you to ANOVA with the aov() function in R, which, like the t.test() function, allows us to conduct a hypothesis test with our observed data. Also like t.test(), aov() uses a syntax that requires a formula of the form outcome ~ group. For example, to compare city miles per gallon across vehicle classes in the mpg dataset, we would simply do

aov(cty ~ class, data = mpg) %>% summary()
##              Df Sum Sq Mean Sq F value              Pr(>F)    
## class         6   2295     383    45.1 <0.0000000000000002 ***
## Residuals   227   1925       8                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From this, we can find our degrees of freedom, the value of our F statistic, and a p-value.

Don’t forget to include %>% summary(); without it, the printed output does not include the F statistic or p-value.

Jittering

This lab asks you to construct several boxplots. These are fine, but as an alternative you might also consider using a jitter plot to help visualize the individual observations. Using a height or width argument in geom_jitter() helps prevent the points from overlapping across categories.

ggplot(mpg, aes(cty, class)) + geom_jitter(height = 0.15, size = 2)

Plane Data

Here is a curated version of the data collected in class:

planes <- read.csv("https://collinn.github.io/data/sta209_s25_planes.csv")

Note that some minor modifications have been made; colors have been grouped and designs with too few observations have been dropped.
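
To get oriented before answering the questions below, it can help to print the dimensions and structure of the data. A minimal sketch using base R functions:

## Number of rows (observations) and columns (variables)
dim(planes)

## Variable names, types, and a preview of their values
str(planes)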

Introductory Questions

Using the dataset provided, answer the following questions:

Question 1: How many observations are in our dataset? What variables are in this dataset? What values do they have?

Question 2: What is our outcome variable in this data? Which variables do you think will be the most helpful in explaining observed variance? Which do you think will be the least helpful?

Data Exploration

These next questions will help us begin exploring our dataset.

Question 3: Create boxplots comparing distance with each of the categorical variables (Design, Color, Hand, and Section). Offer a brief description of what you see in each of these plots.
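
As a starting point, a template for one of these plots is sketched below; the variable names distance and Design are assumptions based on the question wording, so adjust them to match the names you find in the dataset.

## Boxplot of distance by design -- swap in Color, Hand, or Section as needed
ggplot(planes, aes(distance, Design)) + geom_boxplot()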

Question 4: Suppose you were to recreate this study again from the beginning. What other factors in the design of our experiment would you consider changing or controlling for? In other words, can you identify any additional sources of variability that may influence flight distance that are missing?

One-way ANOVA

Here, we will practice analyzing data using the aov() function

Question 5: When performing ANOVA, what form does our null hypothesis take? Give an example by stating the null hypothesis relating flight distance to class section.

Question 6: Perform an ANOVA analyzing the relationship between flight distance and paper color:

  • What are the two degrees of freedom for this test?
  • What value does the F-statistic take?
  • If you were testing our null hypothesis at \(\alpha = 0.05\), what decision would you make?

Question 7: Perform an ANOVA analyzing the relationship between flight distance and handedness:

  • What are the two degrees of freedom for this test?
  • What value does the F-statistic take?
  • If you were testing our null hypothesis at \(\alpha = 0.05\), what decision would you make?

Question 8: Perform an ANOVA analyzing the relationship between flight distance and design:

  • What are the two degrees of freedom for this test?
  • What value does the F-statistic take?
  • If you were testing our null hypothesis at \(\alpha = 0.05\), what decision would you make?

Question 9: Based on Questions 6-8, which variable would you use in a model to try to determine the flight distance of a particular plane? Does this match your expectations from the beginning of the lab?

Post-Hoc Testing

As noted previously, ANOVA is specifically concerned with testing the null hypothesis of equality between means for multiple groups,

\[ H_0: \mu_1 = \mu_2 = \dots = \mu_k \]

Should we perform an ANOVA and reject our null hypothesis, we only know that at least two of our group means are different. Post-hoc pairwise testing (“post hoc” is Latin for “after this” or “after the fact”) can be done to determine which of the pairwise differences are likely responsible.

Consider again our dog dataset in which we wish to test for equality in average speed between different colored dogs. This is done simply with the aov() function

## Read in dogs
dogs <- read.csv("https://collinn.github.io/data/dogs.csv")

## This will assign the results to a variable called model
model <- aov(speed ~ color, dogs)
summary(model)
##              Df Sum Sq Mean Sq F value Pr(>F)   
## color         3   1652     551     4.3 0.0053 **
## Residuals   396  50746     128                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here, we see the sums of squares for the grouping variable and for the residuals, along with an F-statistic and a p-value. If we were testing at the \(\alpha = 0.05\) level, we would reject the null hypothesis since \(p = 0.0053 < 0.05\).

To determine which pairs of colors differ, we can use the TukeyHSD() function (Tukey’s honest significant difference) on the model object we created above:

## Pass in output from aov() function
comp <- TukeyHSD(model)
comp
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = speed ~ color, data = dogs)
## 
## $color
##                 diff       lwr      upr   p adj
## brown-black   1.9612  -2.65664  6.57906 0.69237
## white-black   4.2360  -0.38182  8.85388 0.08529
## yellow-black -2.3968  -5.97373  1.18021 0.31012
## white-brown   2.2748  -3.56635  8.11599 0.74672
## yellow-brown -4.3580  -9.41657  0.70063 0.11889
## yellow-white -6.6328 -11.69139 -1.57418 0.00437

There are a few things to note here:

  1. First, the point estimate of each difference in means appears as the first column of the output.
  2. Next, we get lower and upper bounds of a confidence interval for each difference. By default, these are 95% confidence intervals, but we can change this by passing a conf.level argument to TukeyHSD() (see the sketch after this list).
  3. Finally, the last column gives us an adjusted p-value. That is, rather than shrinking the significance threshold for each comparison (e.g., a Bonferroni-style \(\alpha^* = \alpha/6\) for the six pairwise comparisons here) and comparing against the original p-values, it adjusts the p-values so that we can compare them with our usual \(\alpha\). In either case, the conclusions we come to will be the same.
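
For example, to request 99% family-wise confidence intervals instead of the default 95%, we could pass conf.level directly (a small sketch; conf.level is the only change from the call above):

## Pairwise comparisons with 99% family-wise confidence intervals
TukeyHSD(model, conf.level = 0.99)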

From this output, we see that the only statistically significant difference is between yellow and white.

Finally, we can plot the output from the TukeyHSD() function with a call to the base R function plot()

## Pass in output from TukeyHSD() function
plot(comp)

Note here again that the only confidence interval that does not contain 0 (the value of the difference under our pairwise null hypotheses) is the one between yellow and white, consistent with the output we observed above.

Question 10: Consider the ANOVA models you created in Questions 6-8. For the ones in which there was evidence to reject the null hypothesis, perform a post-hoc test to determine between which individual groups there was a statistically significant difference.