First, read in your data and remove the columns and rows you don’t need. You can download a copy of the data here.
library(dplyr)
library(ggplot2)
## Read in qualtrics data as csv
dat <- read.csv("qualtrics_data.csv")
## Keep only those variables I created
dat <- select(dat, sex, degree, grocery_1, boba)
## Remove first two rows with metadata
dat <- dat[3:nrow(dat), ]
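If you would rather deal with the metadata rows at read time, one possible alternative is to capture the column names first and then skip the first three lines of the file. This is only a sketch: the inline CSV text and its values are made up for illustration, and it assumes no header cell contains a line break, since skip counts lines rather than records.

```r
## Hypothetical stand-in for a Qualtrics export: header, two metadata rows, data
csv_text <- c(
  "degree,grocery_1",
  "What is your degree?,Grocery spending",
  "QID1,QID2_1",
  "STEM,25",
  "Humanities,18"
)
## Read the header plus one row just to capture the column names
col_names <- names(read.csv(text = csv_text, nrows = 1))
## Skip the header and both metadata lines, then reapply the names
toy <- read.csv(text = csv_text, skip = 3, header = FALSE, col.names = col_names)
## grocery_1 is now parsed as a number
is.numeric(toy$grocery_1)
```

Because the text rows are never read into the data frame, read.csv() can detect the numeric column on its own.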
Also, don’t forget that numeric variables will be read in as character strings, because the first two rows contained text. You may run into errors if you don’t convert them:
## Error
t.test(grocery_1 ~ degree, dat)
## Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant"): missing value where TRUE/FALSE needed
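A quick way to confirm the cause is to check the class of the column. Here is a minimal sketch using a made-up data frame (the names and values are hypothetical, not taken from the downloadable data):

```r
## Hypothetical toy data: two metadata rows followed by two responses
raw <- data.frame(
  degree    = c("What is your degree?", "QID1", "STEM", "Humanities"),
  grocery_1 = c("Grocery spending", "QID2_1", "25", "18"),
  stringsAsFactors = FALSE
)
## Drop the two metadata rows, as above
toy <- raw[3:nrow(raw), ]
## The remaining values look like numbers but are still stored as text
class(toy$grocery_1)
## [1] "character"
```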
The error will go away if you convert the column with as.numeric().
## Convert numeric columns stored as character to numeric
dat <- mutate(dat, grocery_1 = as.numeric(grocery_1))
## Do your statistics!
t.test(grocery_1 ~ degree, dat)
##
## Welch Two Sample t-test
##
## data: grocery_1 by degree
## t = -2.61, df = 39.8, p-value = 0.013
## alternative hypothesis: true difference in means between group Humanities and group STEM is not equal to 0
## 95 percent confidence interval:
## -18.5721 -2.3612
## sample estimates:
## mean in group Humanities       mean in group STEM
##                   17.933                   28.400
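If your survey has several numeric questions, converting them one at a time gets tedious. One way to convert them together is dplyr's across(). The sketch below uses made-up data, and the column grocery_2 is hypothetical and not part of the example dataset:

```r
library(dplyr)

## Hypothetical data: both numeric columns arrived as character strings
toy <- data.frame(
  degree    = c("STEM", "Humanities", "STEM"),
  grocery_1 = c("25", "18", "30"),
  grocery_2 = c("3", "1", "4"),
  stringsAsFactors = FALSE
)
## Columns assumed to hold numbers
num_cols <- c("grocery_1", "grocery_2")
## Convert all of them in one step
toy <- mutate(toy, across(all_of(num_cols), as.numeric))
## Count missing values in each converted column
colSums(is.na(toy[num_cols]))
```

Any entry that cannot be parsed as a number becomes NA with a warning, so it is worth checking for unexpected NAs after converting.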