Provided here is a rough outline of potential exam questions. Wording is of course subject to change, but this should give a sense of the types of questions I would ask. My tentative plan is to pull 2-3 questions each for Exam 3 and for the combined Exam 1-2.
For the oral exam, this also covers the range of questions and topics that I would plan to ask.
Identify which types of plots are associated with what types of variables (e.g., what kind of plot could I make with two quantitative variables)
How does conditional probability relate to conditional bar plots? (In context of identifying correct plots)
What are joint, marginal, and conditional probabilities?
What is the relationship between probability and odds?
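For intuition, a quick Python sketch of the conversion (numbers are made up): odds are \(p/(1-p)\), and going back is \(o/(1+o)\).

```python
def prob_to_odds(p):
    """Convert a probability to odds: o = p / (1 - p)."""
    return p / (1 - p)

def odds_to_prob(o):
    """Convert odds back to a probability: p = o / (1 + o)."""
    return o / (1 + o)

p = 0.75                  # made-up probability
o = prob_to_odds(p)       # 0.75 / 0.25 = 3.0, i.e., odds of 3 to 1
print(o)                  # 3.0
print(odds_to_prob(o))    # 0.75
```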
Why does changing the order of rows or columns in a table give the reciprocal of the odds ratio?
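A small sketch with a made-up \(2 \times 2\) table: the odds ratio is \((a \cdot d)/(b \cdot c)\), and swapping the two rows (or columns) exchanges which cells play the roles of \(a, b, c, d\), producing exactly the reciprocal.

```python
# Made-up 2x2 table of counts.
table = [[30, 10],   # row 1: a, b
         [15, 45]]   # row 2: c, d

def odds_ratio(t):
    """Odds ratio of a 2x2 table: (a*d) / (b*c)."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)

or_original = odds_ratio(table)
or_swapped = odds_ratio(table[::-1])   # rows swapped
print(or_original, or_swapped)         # the two values multiply to 1
```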
How does giving the reciprocal of an odds ratio change our interpretation of it?
What is the law of large numbers, and how does it relate to the central limit theorem?
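For intuition, a quick simulation sketch: the law of large numbers says the running sample mean of fair-coin flips settles toward the true probability 0.5 as \(n\) grows; the central limit theorem goes further and describes the approximately normal *shape* of the sample mean's distribution around 0.5.

```python
import random

random.seed(1)  # seeded so the simulation is reproducible
flips = [random.random() < 0.5 for _ in range(100_000)]  # fair-coin flips

# The sample mean drifts toward 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, sum(flips[:n]) / n)
```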
How is bootstrapping used to estimate a sampling distribution?
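A minimal bootstrap sketch (made-up data): resample the observed data *with replacement* many times, compute the statistic on each resample, and use the spread of those bootstrap statistics as an estimate of the sampling distribution.

```python
import random
import statistics

random.seed(2)
sample = [4.1, 5.3, 6.0, 4.8, 5.9, 5.2, 6.4, 4.5, 5.7, 5.1]  # made-up data

boot_means = []
for _ in range(2_000):
    # Resample with replacement, same size as the original sample.
    resample = random.choices(sample, k=len(sample))
    boot_means.append(statistics.mean(resample))

# The standard deviation of the bootstrap means estimates the
# standard error of the sample mean.
boot_se = statistics.stdev(boot_means)
print(round(statistics.mean(boot_means), 2), round(boot_se, 2))
```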
What is the relationship between standard deviation and standard error?
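A one-line sketch of the relationship (made-up data): the standard deviation describes spread in the *data*, while the standard error describes spread of a *statistic* (here, the sample mean) across repeated samples, via \(SE = SD/\sqrt{n}\).

```python
import math
import statistics

data = [12, 15, 11, 14, 13, 16, 12, 15]  # made-up data
n = len(data)

sd = statistics.stdev(data)      # spread of the data
se = sd / math.sqrt(n)           # spread of the sample mean
print(round(sd, 3), round(se, 3))
```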
What is a confidence interval? How is one constructed and how is it used?
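A sketch of the construction (made-up data): estimate \(\pm\) critical value \(\times\) standard error. For simplicity this uses the normal critical value 1.96 for 95% confidence; with a small sample you would use a \(t\) critical value instead.

```python
import math
import statistics

data = [12, 15, 11, 14, 13, 16, 12, 15]  # made-up data
n = len(data)

mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)

# 95% CI: estimate +/- critical value * SE (normal approximation).
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(round(lower, 2), round(upper, 2))
```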
What is a null distribution, and how is it used in the context of hypothesis testing?
Why does \(|t| < C\) indicate that we should fail to reject the null hypothesis?
Why does \(|t| < C\) imply that \(p > \alpha\)?
What is the relationship between critical values and quantiles?
Why does increasing my confidence make a Type I error less likely?
What are the components of a \(t\) statistic and how does each lend support for or against the null hypothesis?
What is a test statistic, and how is this idea used for hypothesis testing?
If I increase my Type I error rate, what kind of change can I expect to my power? Why?
Changing the Type I error rate from \(\alpha = 0.1\) to \(\alpha = 0.05\) will always decrease the Type I error rate by 5 percentage points. The amount that changing my Type I error rate impacts my power, though, depends on a number of factors. Draw a picture to illustrate why this is.
There are three elements that impact statistical power. What are they, and what impact do they have on power?
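For intuition about those elements, a quick normal-approximation sketch of power for a one-sided \(z\) test (all numbers made up): power rises with effect size, sample size, and \(\alpha\), and falls with the standard deviation.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_one_sided_z(effect, sigma, n, z_alpha):
    """Approximate power of a one-sided z test.

    z_alpha is the critical value: 1.645 for alpha = 0.05,
    1.282 for alpha = 0.10 (one-sided).
    """
    return 1 - norm_cdf(z_alpha - effect * math.sqrt(n) / sigma)

p05 = power_one_sided_z(effect=0.5, sigma=1.0, n=30, z_alpha=1.645)
p10 = power_one_sided_z(effect=0.5, sigma=1.0, n=30, z_alpha=1.282)
print(round(p05, 3), round(p10, 3))  # larger alpha -> more power
```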
Why does a \(\chi^2\) goodness-of-fit test with \(k\) groups have \(k-1\) degrees of freedom?
If I test for goodness of fit looking at jury ethnicity, why does rejecting my null not tell me which groups are over- or under-represented? In other words, what is my null hypothesis, and what do I learn by rejecting it?
Explain how the assumption of independence is used to construct expected values for a \(\chi^2\) test of independence.
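A small sketch of that construction (made-up counts): under independence, \(P(\text{row } i \text{ and col } j) = P(\text{row } i) \cdot P(\text{col } j)\), so the expected count is (row total \(\times\) column total) / grand total.

```python
# Made-up 2x2 table of observed counts.
observed = [[20, 30],
            [40, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count under independence: (row total * col total) / grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]
print(expected)
```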
What are the components of an \(F\) statistic? How does a change in each component shift the evidence in support of or against the null?
All things held fixed, why does increasing the number of groups in ANOVA make it more difficult to reject the null? What does this have to do with statistical power?
Why does adding a correlated variable to a regression model change the estimated effect of a predictor already in the model?
Compare and contrast a model in which cylinders is treated as categorical versus continuous. What assumptions does each version make, and what do we learn from each?
Can we have a significant slope but a low \(R^2\)? Why?
What problem is the adjusted \(R^2\) trying to solve? What information does it consider that Multiple \(R^2\) does not?
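A sketch of the adjustment (made-up values): adjusted \(R^2 = 1 - (1 - R^2)\frac{n-1}{n-p-1}\) brings in the sample size \(n\) and the number of predictors \(p\), so adding predictors that explain little can make it go *down* even though multiple \(R^2\) never decreases.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2: penalizes R^2 for the number of predictors p."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2 = 0.80  # made-up multiple R^2
print(round(adjusted_r2(r2, n=30, p=3), 4))
print(round(adjusted_r2(r2, n=30, p=10), 4))  # same R^2, more predictors
```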
What is the null hypothesis associated with the \(F\) statistic in a linear regression model?