STA-290 HW 8

library(ggplot2)
library(dplyr)

# Prettier graphs
theme_set(theme_bw())

Question 1

In professional basketball games during the 2009-2010 season, when Kobe Bryant of the Los Angeles Lakers shot a pair of free throws, 8 times he missed both, 152 times he made both, 33 times he made only the first shot, and 37 times he made only the second. Is it possible that the successive free throws are independent, or is there evidence to suggest a “hot streak” effect? The data are tabulated in the freethrow matrix below:

# Create freethrow data (copy and paste this into your own R session)
freethrow <- matrix(c(152,33,37,8), nrow = 2, byrow = TRUE)
rownames(freethrow) <- c("Make 1st", "Miss 1st")
colnames(freethrow) <- c("Make 2nd", "Miss 2nd")
print(freethrow)

##          Make 2nd Miss 2nd
## Make 1st      152       33
## Miss 1st       37        8

What is the null hypothesis of this experiment?
Using the table provided, find a table of expected values for each cell.
Using your table of observed and expected values, find the \(\chi^2\) statistic associated with this table, along with the degrees of freedom.
Using your critical value sheet, if we were to test this hypothesis at level \(\alpha = 0.05\), what conclusion would we come to regarding the independence of the first and second free throw?
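
As a reminder of the mechanics, here is a minimal sketch of the expected-count and \(\chi^2\) calculations. It uses a made-up 2x2 table called toy (an assumption for illustration, not the free throw data), and qchisq stands in for the critical value sheet.

```r
# Hypothetical 2x2 table of counts (NOT the free throw data)
toy <- matrix(c(20, 30,
                30, 20), nrow = 2, byrow = TRUE)

# Expected count for each cell: (row total * column total) / grand total
expected <- outer(rowSums(toy), colSums(toy)) / sum(toy)

# Chi-squared statistic: sum over cells of (observed - expected)^2 / expected
chi_sq <- sum((toy - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(toy) - 1) * (ncol(toy) - 1)

# Critical value at alpha = 0.05, in place of the critical value sheet
crit <- qchisq(0.95, df = df)

# TRUE means we would reject independence for this toy table
chi_sq > crit
```

To check hand calculations, chisq.test(toy, correct = FALSE) should report the same statistic; for 2x2 tables the default setting applies a continuity correction, so the value differs unless correct = FALSE is given.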

Question 2

Reconsider the anorexia data that we investigated in Homework 7:

anorexia <- read.csv("https://collinn.github.io/data/anorexia.txt")

Part A: Use the mutate function to again create a variable called Diff that records the difference in pre and post weights.
Part B: State the null hypothesis for testing the difference in pre and post weights for each of the groups considered in the dataset.
Part C: Perform an ANOVA for the hypothesis stated in Part B. What do you conclude?
Part D: Use post-hoc testing to determine if there are any pairwise differences between these groups. How do your findings here compare with the conclusions you had in Homework 7?
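
The general workflow for Parts C and D might be sketched as follows. This uses the built-in PlantGrowth dataset as a stand-in (an assumption for illustration, not the anorexia data), and Tukey's HSD is only one possible choice of post-hoc method; use whichever procedure was covered in class.

```r
# Built-in dataset used as a stand-in (NOT the anorexia data):
# plant weight measured under three conditions (ctrl, trt1, trt2)
data(PlantGrowth)

# One-way ANOVA: the null hypothesis is that the mean weight
# is the same in every group
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Tukey's HSD: all pairwise comparisons between groups,
# with adjustment for multiple testing
tukey <- TukeyHSD(fit)
tukey
```

An alternative post-hoc approach is pairwise.t.test(PlantGrowth$weight, PlantGrowth$group, p.adjust.method = "bonferroni"), which runs pairwise t-tests with a Bonferroni adjustment.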

Question 3

This question will again consider the mtcars dataset built into R:

data(mtcars)

We will be investigating the relationship between the weight of a car (independent variable) and its miles per gallon (dependent variable). In addition to this, we will also be using the number of carburetors as a second independent variable.

Part A: Create a linear model predicting mpg with the covariates wt and carb. Based on the results, does it appear that the number of carburetors has a relationship with fuel economy (mpg)?
Part B: By default, carb is stored in the dataset as an integer value. Use the mutate function to create a new variable in the mtcars dataset called carb_factor, defined as carb_factor = factor(carb). This will turn the new variable into a categorical one instead of an integer.
Part C: Create a new linear model, this time predicting mpg with wt and carb_factor. What has changed this time? Specifically, what do the covariates in the new model represent, and how is this different from what we saw in Part A? (Hint: how do the estimates for carb_factor change as the number of carburetors increases?)
Part D: Based on your assessment in Part C, which of these two models do you think is more appropriate for predicting miles per gallon? In other words, does the number of carburetors appear to make more sense as a continuous variable or a categorical one?
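
To illustrate the difference between the two codings, here is a minimal sketch that uses cyl (number of cylinders) as a stand-in for carb. The names cars, cyl_factor, fit_numeric, and fit_factor are made up for this example.

```r
library(dplyr)

data(mtcars)

# Stand-in variable: cyl rather than carb.
# factor() turns the integer column into a categorical one.
cars <- mtcars %>%
  mutate(cyl_factor = factor(cyl))

# Numeric coding: a single slope, so each additional cylinder
# is assumed to change mpg by the same amount
fit_numeric <- lm(mpg ~ wt + cyl, data = cars)

# Categorical coding: one coefficient per non-reference level,
# each measured relative to the reference level (here, 4 cylinders)
fit_factor <- lm(mpg ~ wt + cyl_factor, data = cars)

coef(fit_numeric)
coef(fit_factor)
```

In the numeric model there is one coefficient for cyl, while the factor model reports a separate coefficient for each level other than the reference. Comparing how those level-specific estimates progress is one way to judge whether a single straight-line slope is a reasonable simplification.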