STAT 385: Statistics Programming Methods

Statisticians must be savvy in programming methods useful to the wide variety of analyses that they will be expected to perform. This course provides the foundation for writing and packaging statistical algorithms through the creation of functions and object-oriented programming. Fundamental programming techniques and considerations will be emphasized. Students will also create dynamic reports that encapsulate their implemented algorithms. Students must have access to a computer on which they can install software. Prerequisite: STAT 200 or STAT 212.


Course Overview

Lecture Location and Schedule

Location: 132 Bevier Hall
Times: M, W, F 3:00 PM - 3:50 PM

Instructional Staff

Instructor: James Balamuta, Wed & Fri 4:30 PM - 5:30 PM, IH 104
Course Assistants:
  • John Lee, Tue & Thurs 5:00 PM - 6:00 PM, IH 104
  • Binxiang "Brian" Ni, Mon & Wed 11:00 AM - 12:00 PM, IH 104
  • Xueqing "Wendy" Wang, Tue & Thurs 4:00 PM - 5:00 PM, IH 104

Description

The world increasingly relies on data-driven decisions. These decisions come from the work a statistician or data scientist presents in his or her reports. To present an analysis, the analyst needs to leverage computing resources through programming to unearth patterns and models that exist within data sets. As a result, statisticians and data scientists must be savvy in programming methods that are useful to the wide variety of analyses that they will be expected to perform.

The course focuses on fundamental programming techniques and considerations related to working with data. Students will learn one of the leading statistical programming languages in depth and will be well positioned to learn others based on general principles. With this in mind, the course provides a framework for performing reproducible analyses, writing statistical algorithms, and analyzing code.

Course Objectives

After this course, students should be able to ...

  • analyze and discuss the meaning behind lines of code;
  • manipulate and visualize different forms of data;
  • create a reproducible analytics data pipeline;
  • implement statistical algorithms within a statistical computing environment;
  • explain fundamental computing theory as it applies within the domain of Statistics.

Textbooks

The textbooks listed below are available online for free. You are welcome to purchase physical copies, but note that I will reference sections in the online versions.

Frequently Referenced

Advanced R Programming [2nd Edition], by Hadley Wickham
R for Data Science, by Garrett Grolemund and Hadley Wickham
The R Inferno, by Patrick Burns

Situationally Useful

An Introduction to R, by William N. Venables, David M. Smith and the R Core Team
The Art of R Programming, by Norman Matloff
R Packages, by Hadley Wickham
ggplot2: Elegant Graphics for Data Analysis [2nd Edition], by Hadley Wickham