For this project you will be playing the role of data scientist(s) extraordinaire, tasked with finding a dataset, establishing a research question, and using your skills to create an interactive web application in R Shiny to assist your less technically inclined collaborators in exploring the data visually.
Even though this particular project is unpaid, you will still be held to the highest of professional standards. Consequently, it is not enough that your app works; it should also look really cool. In the document that follows, I provide a timeline, an outline of expectations, and a rubric that will be used to assess your project.
More details on the individual components are given elsewhere in this description.
Working schedule
Important intermediate dates
Your final Shiny app is expected to include
ggplot2, plotly, or leaflet
To be clear, the expectations given here represent the minimum. Anything less than this will result in a grade lower than a C. A rubric is presented in the section on assessment.
In conjunction with your app, you (and your partner) will be asked to give a short 5-7 minute presentation of your app. Your presentation should include:
Your target audience should be our class, so you may assume some working knowledge of R Shiny and various types of graphics/statistics; but you should not assume any familiarity with your data source or research question.
You may work either individually or with one classmate (of your choosing) on this project. This does not preclude you from working through the labs with a partner and being responsible for your own project.
Note: You may not work with the same person on both this project and the final project. So, if you really want to work with a certain classmate on the final you might choose to work with someone else (or independently) on this project.
Additionally, by choosing to work with someone you are consenting to receiving the same score on the project. If you are not comfortable receiving the same score as your partner, you might opt to work alone.
By the end of the day Friday, October 27, one member from your group should email me a brief proposal that addresses the following:
In consideration of (3.), a good proposal may be something like “I want users of the app to be able to explore whether there are spatial patterns in the incidences of different types of crimes that were reported in the city of Chicago”. A less good proposal might be something like “I want to display all crimes in Chicago on a map”. The first example is good because it involves something that is best achieved using Shiny (i.e., a user option to change or filter by type of crime), while the second is less good because Shiny isn’t necessary to make a map.
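To make the distinction concrete, here is a minimal sketch of the kind of user-driven filtering the first proposal implies. Everything in it is an assumption made for illustration: the `crimes` data frame contains made-up rows, and the names `filter_by_type`, `crime_type`, and `crime_map` are hypothetical. A real project would use its own cleaned data and most likely a proper map (e.g., with leaflet) rather than a base R scatterplot.

```r
library(shiny)

# Hypothetical stand-in for a cleaned crime dataset (made-up rows).
crimes <- data.frame(
  type = c("Theft", "Theft", "Assault", "Burglary"),
  lon  = c(-87.63, -87.65, -87.62, -87.70),
  lat  = c(41.88, 41.90, 41.85, 41.95)
)

# Plain helper so the filtering logic can be checked outside of Shiny.
filter_by_type <- function(data, selected_type) {
  data[data$type == selected_type, , drop = FALSE]
}

ui <- fluidPage(
  # The user option that makes Shiny worthwhile: choose a type of crime.
  selectInput("crime_type", "Type of crime",
              choices = sort(unique(crimes$type))),
  plotOutput("crime_map")
)

server <- function(input, output) {
  # Re-drawn automatically whenever input$crime_type changes.
  output$crime_map <- renderPlot({
    shown <- filter_by_type(crimes, input$crime_type)
    plot(shown$lon, shown$lat,
         xlim = range(crimes$lon), ylim = range(crimes$lat),
         xlab = "Longitude", ylab = "Latitude",
         main = paste("Reported incidents:", input$crime_type))
  })
}

# Creating the app object does not launch it; use runApp(app) to start it.
app <- shinyApp(ui, server)
```

Without the `selectInput()`, the app would reduce to a single static plot, which is the situation described in the second, less good proposal.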
By the end of the day Wednesday, November 1, you should have most of your data processing done and a rough idea of the kinds of graphics you intend to make. To make sure that you are progressing, I will ask that you email me an R script with your data cleaning code along with an R Markdown document with a rough sketch of the kind of plot(s) you intend to make. These do not have to be cleaned up (i.e., don’t worry about labels, themes, etc.). I will be quick in giving you feedback so that you are able to respond to any changes if needed.
App Code – 10 pts
Aesthetics – 20 pts
"Avg_Faculty_Sal" should be "Average Faculty Salary"
Function – 30 pts
Presentation – 15 pts
Difficulty – 20 pts
Misc – 5 pts
In addition to the assessment details here, consider also the example projects for grades A, B, and C. Each one is accompanied by a few comments that I would consider when grading, in no particular order.
isolate() most likely)
One goal of this project is to give you the opportunity to find and work on a topic that you find interesting. For better or worse, real world projects have considerable variability in terms of the amount of work required to do anything interesting: for some projects, over 90% of your time may be spent on cleaning and organizing your data in a way that is useful to produce relatively simple yet insightful visualizations. Other times, the data may come relatively clean with the majority of time spent on preparing highly detailed visualizations.
The same can be said about Shiny apps themselves – while many aspects of an app may be straightforward, some aspects may involve crafting intricate logic to get the reactive aspects of an app to work as they should. Having the high/low color relationship from the “A” project above would be an example of this.
To address the levels of variability and difficulty in your projects, you will be asked to submit a written argument of less than one page detailing the level of difficulty in your project or pointing out aspects that you are particularly proud of having accomplished.
The hallmarks of an A-level project include things such as:
dplyr, tidyr, stringr, or lubridate.
DT and shinyjs packages), plots that update based on user selection of rows from data tables, or using different geoms for ggplot, including radar charts, treemaps, stacked area charts, etc. A nice illustration of some of these is here
In short, an A-level project will be one that involves some level of self-study to present things that go beyond what has been covered in labs.
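As one illustration of the hallmarks above, a plot that updates based on the rows a user selects in a data table might be sketched as follows. This is only a rough sketch under stated assumptions: it uses the built-in `mtcars` data in place of a project dataset, and the names `selected_or_all`, `tbl`, and `scatter` are hypothetical. It relies on the DT package exposing the selected row indices as `input$tbl_rows_selected`, where `tbl` is the output ID of the table.

```r
library(shiny)
library(DT)

# Plain helper: fall back to the full data when no rows are selected.
selected_or_all <- function(data, rows) {
  if (length(rows) == 0) data else data[rows, , drop = FALSE]
}

ui <- fluidPage(
  DTOutput("tbl"),
  plotOutput("scatter")
)

server <- function(input, output) {
  output$tbl <- renderDT(datatable(mtcars))

  # input$tbl_rows_selected holds the indices of the rows clicked in the
  # table (NULL when nothing is selected), so this plot re-draws on each click.
  output$scatter <- renderPlot({
    shown <- selected_or_all(mtcars, input$tbl_rows_selected)
    plot(shown$wt, shown$mpg,
         xlim = range(mtcars$wt), ylim = range(mtcars$mpg),
         xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
  })
}

# Creating the app object does not launch it; use runApp(app) to start it.
app <- shinyApp(ui, server)
```

Keeping the row-selection logic in a small helper makes it easy to check on its own before wiring it into the reactive code.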
You are expected to use a challenging data source of your choosing. I encourage you to find something that aligns with your interests, and you are welcome to use data from other courses/internships/etc. provided it is sufficiently complex.
If you are having trouble finding data, check out the list on the Shiny Resources page. Additionally, you might check out Kaggle, which keeps an interesting repository of datasets used in machine learning competitions.
Comments
During the next two weeks, I am happy to meet with anyone who would like to discuss their options in more detail, plan a good research question, or work through parts of their Shiny code. If you would like to meet outside of typical office hours, please email me to schedule a time so that I can be sure to allot enough time to work through whatever we need to.
I will also plan on hosting an informal session in our standard classroom on Sunday, November 5 from 5:00 to 6:30 so that you can get last-minute feedback on your projects.