Python Introduction
Python Pandas
Python Matplotlib
R Introduction
ggplot2
COVID-19 has affected people's lives in many ways all over the world.
In the first part of the workshop, we will use Python libraries such as Pandas and Matplotlib to download, clean, analyze, and visualize the open coronavirus dataset published on Johns Hopkins University's GitHub account. The R portion of the course will look at the Centers for Disease Control and Prevention's (CDC) COVID-related dataset, “Weekly counts of death by jurisdiction and cause of death”. This dataset is interesting because it is highly structured, yet messy in ways that reflect real-world data problems. We will learn how to break this data down by cause and by state, how to use some advanced R functionality to deal with the data's messiness, and how to use the Farrington algorithm from the R “surveillance” package to reproduce the CDC’s outbreak-detection dashboard.
This workshop brings together the skills we learned in the previous Python/R workshops and applies them to real data through the whole data wrangling workflow.
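As a rough illustration of the clean-and-analyze steps in the Python portion, here is a minimal sketch using Pandas. It assumes the Johns Hopkins time series files use a wide layout with `Province/State`, `Country/Region`, `Lat`, and `Long` columns followed by one column per date. The inline sample values and the helper names `tidy_cases` and `plot_cases` are hypothetical and are not taken from the workshop materials.

```python
import io

import pandas as pd

# Hypothetical sample in the assumed wide layout: one row per region,
# one column per date holding cumulative confirmed cases.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Italy,41.9,12.6,0,2,5
Hubei,China,30.9,112.3,444,444,549
Beijing,China,40.2,116.4,14,22,36
"""


def tidy_cases(raw):
    """Reshape the wide table into one row per country and date."""
    long = raw.drop(columns=["Province/State", "Lat", "Long"]).melt(
        id_vars="Country/Region", var_name="date", value_name="cumulative"
    )
    long["date"] = pd.to_datetime(long["date"], format="%m/%d/%y")
    # Sum provinces so that each country has a single cumulative series.
    totals = long.groupby(["Country/Region", "date"], as_index=False)[
        "cumulative"
    ].sum()
    # Daily new cases are the day-over-day difference within each country.
    totals["new"] = (
        totals.groupby("Country/Region")["cumulative"].diff().fillna(0).astype(int)
    )
    return totals


def plot_cases(cases):
    """Draw one cumulative-case line per country (requires Matplotlib)."""
    import matplotlib.pyplot as plt

    wide = cases.pivot(index="date", columns="Country/Region", values="cumulative")
    ax = wide.plot(title="Cumulative confirmed cases")
    ax.set_ylabel("cases")
    plt.show()


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
cases = tidy_cases(raw)
print(cases)
```

To work with the real data, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with the URL or local path of the downloaded CSV file, and `plot_cases(cases)` would then draw the country-level curves.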
Course Materials