Introduction

This lab explores the motivation for, and the calculations behind, Fisher’s Exact Test. We’ll begin with a quick overview of some functions that will be useful in working through the questions that follow. For this lab, refrain from using any external R packages (e.g., epitools).

Generating matrices

Matrices are, by default, filled (and stored) in column-major order: the entries start in the top-left corner and run down the first column, moving on to the next column once the current one is full:

matrix(1:9, nrow = 3)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

The byrow argument lets us fill the matrix row by row instead:

matrix(1:9, nrow = 3, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9

Choose Function

For binomial coefficients:

## 5 choose 2
choose(5, 2)
## [1] 10
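As a small illustration of how choose can be combined into a probability, the sketch below uses a hypothetical urn (4 red and 6 blue balls, 3 drawn without replacement) that is not part of the lab's data. It also checks the hand-computed value against dhyper, the hypergeometric PMF that ships with base R.

```r
# choose() is vectorized over k: one call returns a whole row of Pascal's triangle
row5 <- choose(5, 0:5)
row5
## [1]  1  5 10 10  5  1

# Hypothetical urn (illustration only): 4 red and 6 blue balls, draw 3 without replacement.
# Probability of exactly 2 red, written as a ratio of binomial coefficients
p_manual <- choose(4, 2) * choose(6, 1) / choose(10, 3)
p_manual
## [1] 0.3

# dhyper(x, m, n, k): x red drawn, m red in the urn, n blue in the urn, k draws
p_dhyper <- dhyper(2, m = 4, n = 6, k = 3)
all.equal(p_manual, p_dhyper)
## [1] TRUE
```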

Fisher’s Exact Test

See ?fisher.test, in particular the alternative argument, which sets the direction of the alternative hypothesis.
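A minimal sketch of the three settings of alternative, run on a hypothetical 2x2 table that is made up for illustration and is not the lab's data. The object names (tab, res_two, and so on) are likewise only placeholders.

```r
# Hypothetical 2x2 table for illustration only (not the lab's data)
tab <- matrix(c(8, 2,
                1, 5), nrow = 2, byrow = TRUE)

# alternative refers to the odds ratio: "two.sided" (the default), "greater", or "less"
res_two     <- fisher.test(tab)
res_greater <- fisher.test(tab, alternative = "greater")
res_less    <- fisher.test(tab, alternative = "less")

# fisher.test returns a list; the p-value can be extracted by name
p_values <- c(two.sided = res_two$p.value,
              greater   = res_greater$p.value,
              less      = res_less$p.value)
p_values
```

Storing the result and extracting p.value, rather than reading it off the printed summary, makes it easier to compare against a value computed by hand.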

Lab

Below is a 2x2 contingency table summarizing the results of a small study in which participants were given either Vitamin C tablets or a placebo, and whether or not they got sick during the month of February was recorded.

                Sick
Vitamin C     Yes    No
Placebo         3     2
Drug            1     3
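One way to enter this table in R, using the matrix function with byrow = TRUE so the counts can be typed as they appear above. The object name vitc is an arbitrary choice, and the dimnames are simply the labels from the table.

```r
# The table above, entered row by row; vitc is an arbitrary object name
vitc <- matrix(c(3, 2,
                 1, 3),
               nrow = 2, byrow = TRUE,
               dimnames = list("Vitamin C" = c("Placebo", "Drug"),
                               "Sick"      = c("Yes", "No")))
vitc

rowSums(vitc)   # row totals: Placebo 5, Drug 4
colSums(vitc)   # column totals: Yes 4, No 5
sum(vitc)       # total number of participants: 9
```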

Use the textbook to answer the following questions:

Question 0: List the names of the people with whom you are working on this lab.

Question 1: For a 2x2 table, if only \(n\), the total number of participants, is fixed, what distribution do the cell counts \(n_{ij}\) follow? What if both the row and column totals are fixed? In that case, how many degrees of freedom do we have?

Question 2: Assuming that the row and column totals are fixed, what are all of the possible values that \(n_{11}\) can take for the table provided? On a separate piece of paper (or in your R Markdown document), write down what each of these tables looks like.

Question 3: For 2x2 tables, independence corresponds to an odds ratio of \(\theta = 1\), so the null hypothesis is \(H_0: \theta = 1\). Write out the formula for \(\theta\) in terms of \(n_{ij}\).

Question 4: Under \(H_0\), find the probability \(P(n_{11} = 3)\). (It may be useful to write a function for the PMF.)

Question 5: The \(p\)-value is defined as the probability, under the null hypothesis, of observing data as extreme as or more extreme than the data actually observed. Find directly (that is, without using fisher.test) the \(p\)-value associated with the table above for \(H_A: \theta > 1\). Once you have found it, confirm that it is correct by comparing it against the result from fisher.test with the appropriate alternative hypothesis.

Question 6: Find the \(p\)-value directly again, this time for \(H_A: \theta \neq 1\). Explain how you found it and how this differed from Question 5. Confirm that it is correct with fisher.test.