From the IMS Textbook, do the following exercises:
- Ch 4.8: #4, #5, #6
- Ch 5.10: #1, #2
Question 1: For this question, we will use data obtained from the US Department of Education’s College Scorecard, which includes information from 2019 on all primarily undergraduate colleges with at least 400 enrolled students.
The data is available at this URL: https://remiller1450.github.io/data/Colleges2019_Complete.csv
- Part A: Write code to load this data into R as a data.frame named colleges.
- Part B: Find the number of observations and variables in this dataset. In one sentence, briefly describe what constitutes an observation in this data.
- Part C: Below we see a plot showing the relationship between Cost, a school’s total annual cost of attendance without considering financial aid, and Salary10yr_median, the median salary of graduates from the college 10 years after receiving their degree. Use the plot to answer the following questions:
  - What type of plot is displayed here?
  - Comment on the form, strength, and direction of the relationship. What do we see? (Approximate answers are OK.)

- Part D: Consider this plot again, now with color added to the plot. What kind of variable is represented by the color (categorical/quantitative), and how does adding color change our interpretation of the plot?
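
For Parts A and B, the general pattern of reading a CSV into a data.frame and checking its size can be sketched as follows. This is only a minimal illustration on a small made-up table: the names toy_csv and toy, the columns School and Enrollment, and all values are hypothetical and are not taken from the College Scorecard data. For the assignment, a file path or URL passed to read.csv() would take the place of the text argument.

```r
# Toy CSV text standing in for a real file or URL (hypothetical values).
toy_csv <- "School,Enrollment
College A,1200
College B,850
College C,4300"

# read.csv() returns a data.frame; `text =` reads from a string instead of a file.
toy <- read.csv(text = toy_csv)

# Each row is one observation; each column is one variable.
n_obs  <- nrow(toy)   # number of observations (rows)
n_vars <- ncol(toy)   # number of variables (columns)
dim(toy)              # both at once: rows first, then columns
```

The same three functions (nrow(), ncol(), dim()) work on any data.frame, so they apply unchanged once the real data is loaded.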
