library(ggplot2)
library(dplyr)
# Prettier graphs
theme_set(theme_bw())
Diarrhea is a major public health concern in many underdeveloped countries, in particular for babies, of whom millions die each year from dehydration. The following data come from a controlled double-blind study of the use of bismuth salicylate (the active ingredient in Pepto Bismol) as therapy for Peruvian infants with diarrhea, with 85 babies receiving bismuth salicylate and 84 receiving placebo. To control for body size, the outcome variable is the volume of stool output per kilogram of body weight (ml/kg).
diarrhea <- read.csv("https://github.com/IowaBiostat/data-sets/raw/main/diarrhea/diarrhea.txt", sep = "\t")
Using ggplot, create a box plot showing the distribution of outcomes for
each of our two groups.
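One way to approach the box plot is sketched below. The helper name plot_output_by_group is hypothetical, and the column names Group and Stool are assumptions about the downloaded file, not something stated in this lab; check names(diarrhea) and adjust the arguments if they differ.

```r
library(ggplot2)

# Hypothetical helper (not part of the lab): box plot of an outcome by group.
# Assumption: the data frame has a grouping column and a numeric outcome
# column; the defaults "Group" and "Stool" are guesses at the file's names.
plot_output_by_group <- function(data, group = "Group", outcome = "Stool") {
  # Fail early with a clear message if the assumed columns are missing
  stopifnot(all(c(group, outcome) %in% names(data)))

  ggplot(data, aes(x = .data[[group]], y = .data[[outcome]])) +
    geom_boxplot() +
    labs(x = "Group", y = "Stool output (ml/kg)")
}

# Usage, once the diarrhea data frame has been read in:
# plot_output_by_group(diarrhea)
```

Passing the column names as arguments keeps the sketch usable even if the file labels its columns differently.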
Conduct a t-test against the null hypothesis that there is no difference in outcome between treatment and placebo groups.
Determine a 95% confidence interval for the true difference in output between babies in the control and treatment groups. Based on this, what conclusions would you draw regarding the use of bismuth salicylate as treatment for infant diarrhea? Explain.
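The t-test and the confidence interval can both be read from a single call to t.test, as in the sketch below. The helper name compare_groups is hypothetical, the column names Group and Stool are assumptions about the file, and the test shown is R's default Welch (unequal variance) two-sample t-test, which may or may not be the variant intended here.

```r
# Hypothetical helper (not part of the lab): two-sample t-test of an outcome
# between two groups, returning the full htest object.
# Assumption: columns named "Group" and "Stool"; adjust if names() differs.
compare_groups <- function(data, group = "Group", outcome = "Stool",
                           conf_level = 0.95) {
  stopifnot(all(c(group, outcome) %in% names(data)))

  # Build the formula outcome ~ group, then run Welch's t-test (R's default)
  t.test(reformulate(group, response = outcome),
         data = data, conf.level = conf_level)
}

# Usage, once the diarrhea data frame has been read in:
# fit <- compare_groups(diarrhea)
# fit$p.value    # p-value for the null hypothesis of no difference in means
# fit$conf.int   # 95% confidence interval for the difference in means
```

The interval is for the mean of the first group level minus the mean of the second, so its sign depends on how the groups are ordered.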
The following data include the results of two interventions and a
control for young female anorexia patients. Included in these data are
pre and post weights for 29 individuals in Cognitive Behavioral Therapy
("CBT"), Family Treatment ("FT"), and Control ("Cont"). Although these
data are paired, rather than considering the efficacy within each group,
we will be interested in assessing the difference in differences between
them.
anorexia <- read.csv("https://collinn.github.io/data/anorexia.txt")
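The steps for this data set (adding Diff, subsetting to one pair of groups, and comparing Diff between them) can be sketched for a single pair as follows. To stay runnable without knowing the file's column names, the sketch uses the copy of the anorexia data that ships with the MASS package, whose columns are Treat, Prewt, and Postwt; treating it as equivalent to the file read above is an assumption, as is the choice of a Welch t-test for the comparison.

```r
library(dplyr)

# Assumption: MASS::anorexia stands in for the downloaded file. Its column
# names (Treat, Prewt, Postwt) may not match those in the file read above.
anorexia_mass <- MASS::anorexia %>%
  mutate(Diff = Postwt - Prewt)        # Step 1: post weight minus pre weight

# Step 2: keep one pair of groups, here CBT and Control, by excluding "FT"
cbt_cont <- anorexia_mass %>%
  filter(Treat != "FT") %>%
  droplevels()                         # drop the now-empty "FT" factor level

# Step 3: compare Diff between the two remaining groups.
# Assumption: a Welch two-sample t-test, judged at alpha = 0.05.
cbt_cont_test <- t.test(Diff ~ Treat, data = cbt_cont)
print(cbt_cont_test)
```

The other two pairs follow the same pattern with a different group excluded in the filter step.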
1. Use mutate on the data set to include a new variable called Diff
   that is the difference between the post weight and pre weight
   observations.
2. Use filter to create a subset of the original data, excluding the
   study type that is not in the pair (i.e., for “CBT and Control”, you
   will exclude “FT”).
3. Conduct a hypothesis test using the Diff value you created in (1),
   comparing it between Treatment types at the \(\alpha = 0.05\) level.

In professional basketball games during the 2009-2010 season, when
Kobe Bryant of the Los Angeles Lakers shot a pair of free throws, 8
times he missed both, 152 times he made both, 33 times he made only the
first shot, and 37 times he made only the second. Is it possible that
the successive free throws are independent, or is there evidence to
suggest a “hot streak” effect? The data are tabulated in the freethrow
matrix below:
# Create freethrow data (copy and paste this into your own R session)
freethrow <- matrix(c(152,33,37,8), nrow = 2, byrow = TRUE)
rownames(freethrow) <- c("Make 1st", "Miss 1st")
colnames(freethrow) <- c("Make 2nd", "Miss 2nd")
print(freethrow)
##          Make 2nd Miss 2nd
## Make 1st      152       33
## Miss 1st       37        8
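One possible way to examine independence in this table is sketched below: compare the proportion of second shots made after a made versus a missed first shot, then run a chi-squared test of independence. The choice of chisq.test (rather than, say, fisher.test) is an assumption, and the object names second_given_first and indep_test are illustrative.

```r
# Re-create the table so this sketch is self-contained
freethrow <- matrix(c(152, 33, 37, 8), nrow = 2, byrow = TRUE,
                    dimnames = list(c("Make 1st", "Miss 1st"),
                                    c("Make 2nd", "Miss 2nd")))

# Row proportions: outcome of the second shot given the first shot
second_given_first <- prop.table(freethrow, margin = 1)
print(round(second_given_first, 3))

# Chi-squared test of independence between first and second shots
# (for a 2x2 table, R applies Yates' continuity correction by default)
indep_test <- chisq.test(freethrow)
print(indep_test)

# Expected counts under independence, to check they are not too small
print(indep_test$expected)
```

If the two rows of second_given_first are similar and the test does not reject, the data give little evidence of a streak effect; a large gap between the rows would point the other way.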