The schedule is organized primarily by programming language and theme.  Traditionally, we’ve had separate R, Python, or Hadoop events; this year, however, we are integrating all of them.  Furthermore, some topics, such as Data Visualization, span all languages, so we believe it will be beneficial for folks with different programming language backgrounds to see what is possible on different platforms.

Non-Tech: Overview of Data Science-Based Initiatives, Program, Company, etc.
Big Data
Multiple Languages

Please note that a few talks are primarily design-oriented: they involve no code and instead focus on thinking through design options, whether for a data science project or a data visualization. Here is the condensed schedule, followed by an expanded one with talks:

PM | R Beginner Workshop
AM | Data Science & Engineering | Text Analytics and Big Data
PM | Data Visualization | Data Science & Engineering
AM | Beginner Python | Big Data, R, and Parallel Computing
PM | PyData + New Python Libraries! | Python, Machine Learning, & Big Data
PM2 | R Beginner Workshop | More PyData


1st Floor
10th / 11th Floor
FRIDAY | R Beginner Workshop
6:00 - 8:45 | R Beginner Bootcamp - Joe Kambourakis and John Verostek

SATURDAY | Data Science & Engineering | Twitter, Text Analytics and Big Data
9:00 - 10:20 | Introduction to the Data Science Method - David Weisman | Using Twitter to Analyze Switching Across Cellphone Carriers - Tanya Cashorali

Topic Modelling Using R - Herb Susmann
10:20 - 10:30 | Break | Break
10:30 - 11:00

11:00 - 12:00
Combining R Libraries into Automated Workflows - Dag Holmboe

General Linearized Mixed Models (GLMMs) in R - Julia Pilowsky
Optimizing Multilingual Search Using Solr - David Troiano

Mining of Massive Datasets Using Locality Sensitive Hashing (LSH) - J Singh and Teresa Brooks

12:00 - 1:00



SATURDAY | Data Visualization | Data Science & Engineering
1:00 - 1:50 | DataViz Design Principles - Angela Bassa | Massive Feature Selection Using Supercomputing in R - Jean-Loup Loyer
1:50 - 2:40 | Python DataViz Tour - Ian Stokes-Rees | Robots, Small Molecules & R - Ingredients for Exploring and Predicting Biological Effects - Rajarshi Guha
2:40 - 3:00 | Break | Break
3:00 - 3:50 | Interactive DataViz with R: ggvis, rCharts, Shiny - Abhinav Sarapure | High Dimensionality in Large Datasets - Sri Krishnamurthy
3:50 - 4:40 | A Case Study Visualizing Boston's Subway System Using D3 and other Open Source tools - Mike Barry and Brian Card | Principles of Data Engineering - Edmund Jorgenson and Matt Papi

Baseball and Data Engineering using Statistics, R & Python - Dan Milstein

SUNDAY | Beginner Python | Big Data Tools and Parallel Computing
9:00 - 9:50 | iPython Tutorial - Imran Malek
Creating Custom Big Data Tools including Models, Hadoop Clusters, and DataViz - DigitasLBI

Introduction to Massively Parallel Databases - Wes Reing
10:00 - 10:50 | Regression Analysis with Python, Pandas, and StatsModels - Allen Downey | R for Analyzing Big Expression Data Parallel Computing - Yuefeng Lu

Scaling R with ScaleR Packages - Steve Belcher
11:00 - 11:50 | More Pandas! - Mali Akmanalp

Orange Canvas: Python Data Mining - Justin Sun

Open-Source Data-Analysis for Bio-tech - Will Sutton

Introduction to Hive with Case Study on Storing and Querying Protobuf Logs in Hive - Muralikumar Venkat

Gamification and Big Data - Nick Lim

12:00 - 1:00

Lunchtime Talk: Visualizations for Exploring Data - Patrik Lundblad


SUNDAY | More PyData! + New Python Libraries | Python, Machine Learning, & Big Data
1:00 - 1:50 | Glue: a hackable user interface for multidimensional data exploration - Chris Beaumont

Data Science, YouTube, & Media Disruption - Pete Martin of Pixability

Building Predictive Models in Cloud using Microsoft Azure Machine Learning - Roope Astala
1:50 - 2:40 | Statistical inference in Python the NIFTY way - Mike Bell | Python-based Site for Data Science Competitions - Greg Lipstein & Peter Bull
Using Python's Machine Learning and Dynamic Control Libraries for Online Advertisement Analysis - Michael Els

2:40 - 3:00 | Break | Break
SUNDAY | R Bootcamp | More PyData!
3:00 - 3:50 | R Beginner Bootcamp - Joe Kambourakis and John Verostek | IP-Reputation Scoring System in Python and Hadoop - Stuart Layton
3:50 - 4:40 | R Bootcamp | Web Scraping Using Python's Beautiful Soup and Selenium - Laurie Skelly
4:40 - 5:30 | R Bootcamp