There are four workshops, two each for Python and R: a beginner workshop for each language, plus an intermediate workshop for each (Network Analysis for Python and DataViz for R). Please note that instructor availability dictated the order below; otherwise we would have scheduled each beginner workshop before its intermediate counterpart.
9:00 - 11:30
|Python Data Science Intermediate: Network Analysis|
|In this tutorial, I will show you how you can use data to construct networks for data analysis. The goal is to demystify graph analytics and mining, and make it accessible to the general programmer. We will go through:|
- graph basics (nodes + edges, list and matrix representations),
- modelling problems as graphs,
- preprocessing data using Pandas,
- importing data using NetworkX,
- computing basic statistics of the network,
- generating visualizations using matplotlib,
- finding hubs, paths and clusters in the data.
IPython notebooks and data files will be distributed beforehand on GitHub.
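To give a flavour of the topics listed above, here is a minimal sketch in plain Python (standard library only). It is not taken from the workshop materials: the edge list and the helper names `build_adjacency`, `adjacency_matrix` and `shortest_path` are made up for illustration. It builds the list and matrix representations of a small undirected graph, picks the highest-degree node as a "hub", and finds a shortest path with breadth-first search.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list (undirected); not workshop data.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"),
         ("bob", "cat"), ("dan", "eve")]


def build_adjacency(edges):
    """Adjacency-list representation: node -> set of neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def adjacency_matrix(adj):
    """Matrix representation; rows and columns follow sorted node order."""
    nodes = sorted(adj)
    matrix = [[1 if v in adj[u] else 0 for v in nodes] for u in nodes]
    return nodes, matrix


def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start node
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(adj[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


adj = build_adjacency(edges)
nodes, matrix = adjacency_matrix(adj)

# Basic statistic: the degree of a node is its number of neighbours.
degrees = {node: len(adj[node]) for node in nodes}
hub = max(degrees, key=degrees.get)

print("hub:", hub, "degree:", degrees[hub])
print("path:", shortest_path(adj, "bob", "eve"))
```

NetworkX offers ready-made equivalents of these steps, so in practice the hand-written helpers would not be needed; they are spelled out here only to show what the representations look like.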
1:00 - 3:30
|The workshop has three main parts: RStudio, Data Management, and Basic Graphics.|
We will walk through how to use RStudio for writing scripts (multiple lines of code), entering code, installing packages and libraries, running a script, looking at data, and visualizing output. For Data Management, we will input data and manipulate it using vectors and data frames. We will cover some statistics along the way, too. We will finish with Graphics, including histograms, boxplots, and scatterplots.
9:00 - 11:30
|Python Data Science Beginner|
|Learn how to make inferences and predictions using analytics data. We'll start by exploring a dataset that contains anonymized information about learners on dataquest.io. We'll learn how to read in the data and make some basic inferences and visualizations. Then, we'll move on to making predictions on the data using logistic regression. At the end of this workshop, you'll have the skills to do a small data analytics project from start to finish. Some familiarity with Python will be helpful for this workshop; try the learning Python lessons on www.dataquest.io if you want to prepare. Overall, we will use Pandas, NumPy, matplotlib, and scikit-learn.|
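For readers who have not met logistic regression before, the sketch below shows the idea in plain Python. It is an illustration only, not the workshop's code: the workshop uses scikit-learn, whereas this version swaps in a hand-written gradient-descent fit so that it runs without extra packages. The data (lessons completed versus whether a learner finished) and the names `fit_logistic` and `predict` are invented for the example.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of the average log-loss with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up toy data: lessons completed -> finished the course (1) or not (0).
lessons = [1, 2, 3, 4, 6, 7, 8, 9]
finished = [0, 0, 0, 0, 1, 1, 1, 1]

# Centre the feature so gradient descent converges quickly.
mean_lessons = sum(lessons) / len(lessons)
centred = [x - mean_lessons for x in lessons]
w, b = fit_logistic(centred, finished)


def predict(n_lessons):
    """Return 1 if the fitted model gives a probability of at least 0.5."""
    return 1 if sigmoid(w * (n_lessons - mean_lessons) + b) >= 0.5 else 0


print("prediction for 2 lessons:", predict(2))
print("prediction for 8 lessons:", predict(8))
```

With scikit-learn the fitting step is a single call on a model object, and the library also handles multiple features and regularization; the loop above is spelled out only to show what is being optimized.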
1:00 - 3:30
|R Intermediate: |
|The workshop has three main parts: basic plots, ggplot2, and Shiny, which allows for interactive web visualizations.|
We will first walk through how to create basic plots. Next, we will step through the intricacies of ggplot2 to learn how to create custom graphs. Lastly, we will create an interactive visualization using RStudio's Shiny package.