Description
For this project you will be playing the role of data scientist(s)
extraordinaire, tasked with finding a dataset, establishing a research
question, and using your skills to create an interactive web application
in R Shiny to assist your less technically inclined collaborators in
exploring the data visually.
Even though this particular project is unpaid, you will still be held
to the highest of professional standards. Consequently, it is not enough
that your app works; it should also look really cool.
Finding Data
You are expected to use a data source of your choosing. I encourage
you to find something that aligns with your interests, and you are
welcome to use data from other courses/internships/etc. provided it is
sufficiently complex.
If you are having trouble finding data, check out the list on the Shiny
Resources page. Additionally, you might check out Kaggle, which keeps an
interesting repository of datasets used in machine learning
competitions.
App Expectations
Your final Shiny app is expected to include:
- Data visualization: at least one visualization using ggplot2
- User options: your app should allow a user to manipulate and explore
your data in at least three different ways. Some possibilities include:
- Choosing variables for an aesthetic (x, y, color, fill, etc.)
- Choosing a scale transformation for a visual aesthetic (palette, axis,
etc.)
- Choosing faceting variables
- Choosing which rows of the data frame to present in the plot
- Different tabs or panels to display different content
- Choosing a different dataset or a subset of the data to display
To be clear, the expectations given here represent the minimum.
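As a rough illustration of these minimums, here is a small sketch of an app with one ggplot2 visualization and three user options (x variable, color variable, faceting variable). It assumes the built-in mtcars data stands in for your own dataset; the input IDs, the helper make_plot(), and the app title are made up for this example and are not a required structure.

```r
library(shiny)
library(ggplot2)

# Stand-in data (assumption): mtcars with two columns converted to factors
# so they can be used for color and faceting.
cars <- transform(mtcars, cyl = factor(cyl), gear = factor(gear))

# Build the plot from the three user choices. Kept outside the server
# function so it can be checked without launching the app.
make_plot <- function(data, x_var, color_var, facet_var = "None") {
  p <- ggplot(data, aes(x = .data[[x_var]], y = .data[["mpg"]],
                        color = .data[[color_var]])) +
    geom_point(size = 2) +
    labs(x = x_var, y = "mpg", color = color_var) +
    theme_minimal()
  if (facet_var != "None") {
    p <- p + facet_wrap(vars(.data[[facet_var]]))
  }
  p
}

ui <- fluidPage(
  titlePanel("Fuel Economy Explorer"),
  sidebarLayout(
    sidebarPanel(
      selectInput("x_var", "X variable", choices = c("wt", "hp", "disp")),
      selectInput("color_var", "Color by", choices = c("cyl", "gear")),
      selectInput("facet_var", "Facet by", choices = c("None", "cyl", "gear"))
    ),
    mainPanel(plotOutput("scatter"))
  )
)

server <- function(input, output, session) {
  output$scatter <- renderPlot({
    make_plot(cars, input$x_var, input$color_var, input$facet_var)
  })
}

# Create the app object; launch it with runApp(app).
app <- shinyApp(ui, server)
```

The axis labels here are still raw column names, which would not meet the aesthetic expectations on their own.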
Presentation Expectations
In conjunction with your app, you (and your partner) will be asked to
give a short 5-7 minute presentation of your app. Your presentation
should include:
- An introduction to your data along with the research motivation.
This may include where you sourced your data.
- Explanation of app features, including a demonstration of different
options, highlighting any particularly useful tools for your
analysis.
- Demonstration of one interesting relationship or trend that can be
found using your app.
Your target audience should be our class, so you may assume some
working knowledge of R Shiny and various types of graphics/statistics,
but you should not assume any familiarity with your data source or
research question.
Groups
You may work either individually or with one classmate (of your
choosing) on this project. If you work individually, you may still work
through the labs with a partner while being responsible for your own
project.
Assessment
What I have below is a rough idea of what kinds of things I will be
looking for, though in practice, my rubric will be more subjective.
Specifically, what I am looking for is effort and an earnest attempt at
putting together things you have practiced thus far. So long as this is
the case, you needn’t worry about the rest.
I am happy to meet to further discuss any questions or concerns you
might have.
App Code
- Code should be neatly formatted and easy to read with comments as
necessary
- Code should run on my computer without issue assuming I
have the necessary packages and data
- Data cleaning is done in a separate R script, not in the app
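One way to keep cleaning out of the app is to have a separate script write a cleaned file that the app only reads. The sketch below assumes a made-up raw data frame and a temporary file path so that it runs as written; in a real project the two halves would live in separate files (say, clean_data.R and app.R, both hypothetical names) and the path would point into your project folder.

```r
# ---- clean_data.R (hypothetical file name; run once, outside the app) ----

# Made-up raw data standing in for something like read.csv("raw.csv").
raw <- data.frame(
  State = c(" ia", "mn ", "IA"),
  Avg_Faculty_Sal = c(7200, NA, 8100)
)

# All cleaning steps live in one function so they are easy to read and rerun.
clean_colleges <- function(raw) {
  cleaned <- raw[!is.na(raw$Avg_Faculty_Sal), ]    # drop rows missing salary
  cleaned$State <- toupper(trimws(cleaned$State))  # standardize state codes
  cleaned
}

# Assumed location for the cleaned file; tempdir() keeps the sketch runnable.
clean_path <- file.path(tempdir(), "colleges_clean.rds")
saveRDS(clean_colleges(raw), clean_path)

# ---- app.R (hypothetical file name) ----

# The app only reads the cleaned file; no cleaning code appears here.
colleges <- readRDS(clean_path)
```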
Aesthetics
- The app should look professional in appearance – this includes the
use of themes, correct labels, capitalization, etc.
- Figures and output should be formatted for professional quality. The
default variable names from the dataset should be changed where
necessary, e.g., "Avg_Faculty_Sal" should be "Average Faculty Salary"
- The layout of the app should be sensible
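For the variable-name point, one possible approach is a single lookup from display labels to column names that is reused for both the input choices and the plot labels. In this sketch the data frame, the second column ("Adm_Rate"), and the helper pretty_label() are assumptions made for illustration.

```r
library(ggplot2)

# Lookup (assumed columns): display label -> column name in the data.
metric_choices <- c(
  "Average Faculty Salary" = "Avg_Faculty_Sal",
  "Admission Rate"         = "Adm_Rate"
)

# Return the display label for a given column name.
pretty_label <- function(column) {
  names(metric_choices)[match(column, metric_choices)]
}

# Made-up data for the example.
colleges <- data.frame(
  Avg_Faculty_Sal = c(7200, 8100, 6500),
  Adm_Rate        = c(0.65, 0.40, 0.82)
)

x_col <- "Adm_Rate"
y_col <- "Avg_Faculty_Sal"

# Axis labels come from the lookup rather than the raw column names.
p <- ggplot(colleges, aes(x = .data[[x_col]], y = .data[[y_col]])) +
  geom_point() +
  labs(x = pretty_label(x_col), y = pretty_label(y_col)) +
  theme_bw()
```

Passing the same named vector to selectInput(), as in selectInput("metric", "Metric", choices = metric_choices), shows the display labels to the user while input$metric still returns the column name.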
Function
- Your app should include at least three features that
the user can manipulate
- Combinations of features should not break your app or result in
an error
- Aspects of your app should be logically consistent – e.g., if you
allow users to select colors for a gradient scale, you should not be
able to choose the same color for “high” and “low” values
- Aspects of the app should be correct, e.g., if the title of your
plot is reactive, it should match what the plot shows
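The two consistency points above can be sketched as follows. This is one possible pattern, not the only one: each color input drops the color currently chosen in the other, and the title is built from a snapshot of the inputs taken when a button is clicked, so the title and anything else drawn from that snapshot stay in step. The input IDs, the color list, and the helper remaining_colors() are assumptions for this example.

```r
library(shiny)

palette_choices <- c("Red", "Green", "Blue", "Yellow")

# Choices for one color input: every color except the one already taken.
remaining_colors <- function(taken) setdiff(palette_choices, taken)

ui <- fluidPage(
  selectInput("state", "State", choices = c("Iowa", "Minnesota")),
  selectInput("low", "Low color",
              choices = remaining_colors("Blue"), selected = "Red"),
  selectInput("high", "High color",
              choices = remaining_colors("Red"), selected = "Blue"),
  actionButton("go", "Create Map"),
  h3(textOutput("title"))
)

server <- function(input, output, session) {
  # When one color changes, remove it from the other input's choices while
  # keeping the other input's current selection.
  observeEvent(input$low, {
    updateSelectInput(session, "high",
                      choices = remaining_colors(input$low),
                      selected = input$high)
  })
  observeEvent(input$high, {
    updateSelectInput(session, "low",
                      choices = remaining_colors(input$high),
                      selected = input$low)
  })

  # Snapshot of the inputs, refreshed only when the button is clicked.
  # The title (and a plot, if added) should read from settings(), not
  # from input$state directly.
  settings <- eventReactive(input$go, {
    list(state = input$state, low = input$low, high = input$high)
  })

  output$title <- renderText(paste("Map of", settings()$state))
}

# Create the app object; launch it with runApp(app).
app <- shinyApp(ui, server)
```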
Presentation
- Should be between 5 and 7 minutes
- You should include all three components listed in the Presentation
Expectations section above, with most emphasis on the research question
you wish to investigate
Examples
In addition to the assessment details here, consider the example
projects for grades A, B, and C. Each one is accompanied by a few
comments, in no particular order, that I would consider when grading
the app.
Unfortunately, these are all based on the R package maptools, which was
removed from CRAN in 2023. Of course, there is no expectation that your
app will contain map data.
R code for each version below is included here:
Grade C
Example C
- Merging data from different sources to create plot
- Default theme on ggplot
- Labels for plot are the default xy values from dataset
- After “Create Map” has been used once, the button no longer does
anything; changing the State from the drop-down immediately updates the
map
- You can select “Green” for both high and low values of gradient
- Metric drop down uses default values from dataset
- Creates error when Metric == “percasian”
- The choropleth looks “stretched” and not proportional
- No other tabs or information
Grade B
Example B
- Merging data from different sources to create plot with additional
data source for table
- Metric values look better (though High School is written as
“Highschool”)
- The “Create Map” button works to update the state; however, changing
the value of State changes the title of the plot without changing the
plot itself (most likely an error in isolate())
- The colors for gradient are better – if “Red” is selected for Color
One, it cannot be selected for Color Two. However, the value for Color
One changes without prompting if Color Two is changed
- Still an issue with changing state and the title of the plot
changing
- Adds a table that subsets the college dataset to include only
colleges located in the state
- Has an input that updates based on another input, e.g., choosing
Gradient or Viridis determines whether the user selects high/low colors
or a scale
Grade A
Example A
- Many separate data sources combined in interesting ways for
comprehensive analysis.
- The colors associated with Gradient work as they should – cannot
select two of the same, they do not update without being explicitly
changed. This likely took a bit of work and trial/error
- Added several tabs including a constant row at the top for updating
state in analysis
- Title is correct for both choropleth and analysis tab
- Plot isn’t “stretched” like in the previous examples
- Has conditional UI inputs depending on gradient/viridis
- Added functionality to restore default values for all of the
inputs
- The scale values for Viridis aren’t A-H but rather the names of the
actual scales
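As a hedged sketch of the last two ideas (conditional inputs and named Viridis scales), the fragment below shows one way they might be written. The letter codes are the option values accepted by ggplot2's viridis scales; the input IDs and the gradient color lists are made up for this example, and the example apps may well have done this differently.

```r
library(shiny)

# Display names mapped to the option codes accepted by
# ggplot2::scale_fill_viridis_c(option = ...).
viridis_choices <- c(
  "Magma" = "A", "Inferno" = "B", "Plasma" = "C", "Viridis" = "D",
  "Cividis" = "E", "Rocket" = "F", "Mako" = "G", "Turbo" = "H"
)

# UI fragment (hypothetical input IDs): only the inputs relevant to the
# chosen scale type are shown.
scale_inputs <- tagList(
  radioButtons("scale_type", "Color scale",
               choices = c("Gradient", "Viridis")),
  conditionalPanel(
    condition = "input.scale_type == 'Gradient'",
    selectInput("low", "Low color", choices = c("White", "Yellow")),
    selectInput("high", "High color", choices = c("Red", "Blue"))
  ),
  conditionalPanel(
    condition = "input.scale_type == 'Viridis'",
    selectInput("viridis_option", "Viridis palette",
                choices = viridis_choices)
  )
)
```

In the server, input$viridis_option would then hold a letter code that can be passed directly as the option argument, while the user only ever sees the scale names.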