Question 1

This question is Question 4.2 from the textbook and has been reproduced here. The dataset below contains the results of a poll based on a random sample, with two variables: response, giving each respondent's answer to the poll question, and political, giving the respondent's self-reported political ideology.

Nine hundred and ten (910) randomly sampled registered voters from Tampa, FL, were asked if they thought workers who have illegally entered the US should be (i) allowed to keep their jobs and apply for US citizenship, (ii) allowed to keep their jobs as temporary guest workers but not allowed to apply for US citizenship, or (iii) lose their jobs and have to leave the country.

## Copy and run this code to create table
immigration <- read.csv("https://collinn.github.io/data/immigrationpoll.csv")

Use the appropriate tables to answer the following questions:

  1. What percent of these Tampa, FL voters identify themselves as conservatives?

  2. What percent of these Tampa, FL voters are in favor of the citizenship option?

  3. What percent of these Tampa, FL voters identify themselves as conservatives and are in favor of the citizenship option?

  4. What percent of these Tampa, FL voters who identify themselves as conservatives are in favor of the citizenship option? What percent of moderates share this view? What percent of liberals share this view?

  5. Do political ideology and views on immigration appear to be associated? Explain your reasoning.
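
The questions above distinguish marginal, joint, and conditional percentages. As a rough sketch of how each can be computed with table() and prop.table(), the example below uses a small made-up data frame: the column names mirror the poll data, but the rows and the response labels are hypothetical and are not taken from the actual dataset.

```r
# Hypothetical toy data: column names mirror the poll dataset,
# but the values and response labels below are made up for illustration.
toy <- data.frame(
  political = c("conservative", "conservative", "liberal", "liberal",
                "moderate", "moderate", "moderate", "conservative"),
  response  = c("Leave the country", "Apply for citizenship",
                "Apply for citizenship", "Apply for citizenship",
                "Guest worker", "Apply for citizenship",
                "Leave the country", "Guest worker")
)

# Two-way table of counts: rows are response, columns are political
counts <- table(toy$response, toy$political)

# Marginal proportions: share of all respondents in each ideology
marginal_political <- prop.table(table(toy$political))

# Joint proportions: each cell divided by the grand total
joint <- prop.table(counts)

# Conditional proportions: each cell divided by its column total,
# i.e. the distribution of response within each ideology
conditional <- prop.table(counts, margin = 2)

print(marginal_political)
print(joint)
print(conditional)
```

Multiplying any of these by 100 gives percentages. The same calls should work on the immigration data frame once it has been loaded.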

Question 2

This question uses police shooting data aggregated by the Washington Post, documenting all fatal shootings by police officers between 2015 and 2022.

A dictionary of terms used can be found here

# Copy and paste this block to load and clean data
police <- read.csv("https://collinn.github.io/data/police22.csv")

# Clean data for problem by selecting relevant 
# columns and removing missing values
police <- police[, c(3,4,5,6,9,10,11,12,13,17)]

police <- police[complete.cases(police) & 
                   police$flee != "" &
                   police$gender != "", ]

  1. Which five states had the greatest number of shootings between 2015 and 2022?

  2. Below is a plot showing the relationship between whether a body camera was in use during the shooting (variable: body_camera) and whether the assailant was shot or both tasered and shot (variable: manner_of_death). Note that the assailant was shot in every case: the value "shot" indicates that the assailant was simply shot, while "tasered" indicates that the assailant was tasered before being shot. You should:

  1. Using the dataset, create the table associated with this plot
  2. Find the odds of being shot when a body camera is worn
  3. Find the odds of being tasered when a body camera is worn
  4. Find the odds ratio of being shot when a body camera is worn compared to when one is not worn. Based on this, does there appear to be any association between wearing a body camera and shooting a suspect without tasering them?

You do not need to number your answers, but you should include all four pieces of information. Take extra care in specifying which outcome is your “Event” and which is your “Non-event” when finding the odds and odds ratios.
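
As a reminder of the arithmetic behind odds and odds ratios, here is a minimal sketch on a generic 2x2 table. The counts, the group labels ("exposed", "unexposed"), and the outcome labels ("event", "non_event") are all hypothetical and are not taken from the police data.

```r
# Hypothetical 2x2 table of counts (NOT from the police data).
# Rows are the groups being compared; columns are the outcome.
counts <- matrix(c(30, 10,
                   60, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(group   = c("exposed", "unexposed"),
                                 outcome = c("event", "non_event")))

# Odds within a group = (number of events) / (number of non-events)
odds_exposed   <- counts["exposed", "event"]   / counts["exposed", "non_event"]
odds_unexposed <- counts["unexposed", "event"] / counts["unexposed", "non_event"]

# Odds ratio = odds in one group divided by odds in the other
odds_ratio <- odds_exposed / odds_unexposed

print(odds_exposed)    # 30 / 10 = 3
print(odds_unexposed)  # 60 / 40 = 1.5
print(odds_ratio)      # 3 / 1.5 = 2
```

Swapping which outcome counts as the event inverts each of the odds, which is why stating the event explicitly matters. An odds ratio near 1 suggests little association between the grouping variable and the outcome.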

  3. Next, we present a table showing the threat level of the assailant (“attack” is considered the highest level of threat, followed by “other” and then “undetermined”) by manner of death. What variable is being conditioned on, and what conclusions could we draw from this table? Create the plot associated with this table.

##               manner_of_death
## threat_level       shot  tasered
##   attack       0.653656 0.491909
##   other        0.321650 0.495146
##   undetermined 0.024693 0.012945
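
One way to tell which variable a table of proportions is conditioned on is to check whether its rows or its columns sum to 1. The sketch below illustrates this on made-up counts (the names row_var, col_var, r1, r2, c1, and c2 are placeholders, not variables from the police data) and then draws a grouped bar plot of the conditional proportions with base R's barplot().

```r
# Hypothetical counts with placeholder names (NOT from the police data)
counts <- matrix(c(20,  5,
                   30, 45),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(row_var = c("r1", "r2"),
                                 col_var = c("c1", "c2")))

# Conditioning on the row variable: each row sums to 1
by_row <- prop.table(counts, margin = 1)

# Conditioning on the column variable: each column sums to 1
by_col <- prop.table(counts, margin = 2)

print(rowSums(by_row))
print(colSums(by_col))

# Grouped bar plot: one cluster of bars per column of the table,
# one bar per row within each cluster
barplot(by_col, beside = TRUE, legend.text = TRUE,
        xlab = "col_var", ylab = "Proportion within column")
```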