Research

Parallel Python

So you’ve gotten really good at Python. Your algorithm is perfect, your data is clean … and it takes hours to run. In fact, you just realized that you could be running it a dozen different ways by tweaking some parameters, but that would take days … and only if you get really lucky and every run completes with no errors. It’s time to parallelize your code. In this class, you will learn a couple of basic methods for parallelizing your code and completing your jobs in a fraction of the time. Prior experience with Python is strongly recommended.
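To give a flavor of the parameter-tweaking scenario above, here is a minimal sketch that uses Python's standard concurrent.futures module. It is not necessarily the method taught in class. The simulate function and the list of rates are hypothetical stand-ins for your own long-running job and its parameters.

```python
from concurrent.futures import ProcessPoolExecutor


def simulate(rate, steps=1000):
    """Hypothetical stand-in for one long-running job: compound growth."""
    value = 1.0
    for _ in range(steps):
        value *= 1.0 + rate
    return value


def run_sweep(rates, workers=4):
    """Run simulate() once per rate, spread across worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(simulate, rates))


if __name__ == "__main__":
    # A dozen parameter settings (assumed values, for illustration only).
    rates = [0.001 * i for i in range(12)]
    for rate, result in zip(rates, run_sweep(rates)):
        print(f"rate={rate:.3f} -> {result:.4f}")
```

Each run is independent of the others, which is what makes a sweep like this easy to split across processes. The speedup you actually see depends on how many cores are available and how long each run takes.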

Introduction to OpenRefine

OpenRefine is a handy desktop application that can help clean inconsistent data. This session will introduce the program and how it can be used to get an overview of a dataset, resolve inconsistencies, split data into granular parts, and match local data points to external datasets.

Excel Pivot Tables

Excel pivot tables help you quickly summarize, report on, and find patterns in your datasets. This short course will cover how to set up a pivot table from scratch, along with special techniques for making the most of this feature. Pivot tables will help you organize, visualize, and gain insights from your data.

Note: Class offered via Canvas. Link to be provided via registration.

Introduction to SQL

So you’ve hand-built a spreadsheet with thousands of entries. You’ve got pivot tables, record lookups, and formulas from months back that you don’t even remember writing. Excel is freaking out and crashing all the time. It’s time to use a database! In this class, you will learn how to store, retrieve, and link your data in a structured way using SQL. We will use a database hosted on Rice servers and log in together for our exercises.
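As a rough illustration of what storing, retrieving, and linking data looks like in SQL, here is a small sketch. It assumes an in-memory SQLite database driven from Python's built-in sqlite3 module, not the Rice-hosted database used in class, and the samples and readings tables are made up for the example.

```python
import sqlite3

# In-memory SQLite database (an assumption for this sketch, not the class server).
conn = sqlite3.connect(":memory:")

# Store: two hypothetical tables, with readings pointing back at samples.
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, site TEXT)")
conn.execute(
    "CREATE TABLE readings (sample_id INTEGER REFERENCES samples(id), value REAL)"
)
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1, "north"), (2, "south")])
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)", [(1, 2.5), (1, 3.5), (2, 4.0)]
)

# Retrieve and link: join the two tables and average the readings per site.
rows = conn.execute(
    """
    SELECT s.site, AVG(r.value)
    FROM samples AS s
    JOIN readings AS r ON r.sample_id = s.id
    GROUP BY s.site
    ORDER BY s.site
    """
).fetchall()

print(rows)  # [('north', 3.0), ('south', 4.0)]
conn.close()
```

The JOIN replaces the record lookups you would otherwise maintain by hand in a spreadsheet: each reading is stored once and linked to its sample by id.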

R visualization with ggplot2

In this class, I will give an introductory lesson on the functionality and application of the ggplot2 package in R. The purpose of this class is to add a data visualization component to the data manipulation tools developed in the previous class, giving you a more comprehensive understanding of data analysis.