Statistical Computing (STAT 302)

Course Description

In the era of big data, computational data analysis has become a cornerstone of modern statistics. Competitive statisticians must not only provide theoretical guarantees for their proposed methodologies but also conduct robust simulation studies and meaningful data applications to support them. Both demand strong statistical computing and programming skills: the ability to read, modify, and write computer programs tailored to the requirements of a given data analysis task.

STAT 302 is an introductory statistical computing course designed to equip undergraduate students with essential programming skills, preparing them for more advanced data analysis and machine learning courses. Without assuming an extensive programming background, the course teaches the core ideas of programming – data structures, functions, iteration, debugging, logical design, and abstraction – by having students write code to tackle various statistical and numerical analysis problems. The course will be taught in the R programming language.

Course Syllabus

Syllabus (Autumn 2023), Syllabus (Winter 2024).

Lecture Slides

Lecture 1 – Introduction and R Basics: Slides.html, Source_Code.Rmd.
Lecture 2 – Data Structures in R: Slides.html, Source_Code.Rmd.
Lecture 3 – Programming Fundamentals: Slides.html, Source_Code.Rmd.
Lecture 4 – Data Manipulation and Visualization: Slides.html, Source_Code.Rmd.
Lecture 5 – Writing Functions and Debugging: Slides.html, Source_Code.Rmd.
Lecture 6 – Simulations: Slides.html, Source_Code.Rmd.
Lecture 7 – Midterm Review.
Lecture 8 – Numerical Analysis: Slides.html, Source_Code.Rmd.
Lecture 9 – Statistical Prediction: Slides.html, Source_Code.Rmd.
Lecture A1 – Introduction to Git and GitHub with R: Slides.html, Source_Code.Rmd.

Lab Assignments

Lab 1 – R Basics: .pdf, .Rmd, .html.
Lab 2 – Data Types: .pdf, .Rmd, .html.
Lab 3 – Programming Fundamentals: .pdf, .Rmd, .html.
Lab 4 – Data Manipulation and Visualization: .pdf, .Rmd, .html.
Lab 5 – Writing Functions and Debugging: .pdf, .Rmd, .html.
Lab 6 – Simulations: .pdf, .Rmd, .html.
Lab 7 – Numerical Analysis: .pdf, .Rmd, .html.

Acknowledgement: Some of my lab problems are modified from the lab questions of Statistical Computing at CMU by Professor Ryan Tibshirani.

Final Project

Final Project Description (This final project is modified from the midterm project of STAT 133 taught by Professor Deborah Nolan in Fall 2016 at UC Berkeley.)