Goals: Students learn how to use a variety of supervised statistical learning methods and gain an understanding of their relative advantages and limitations. In addition to learning concepts and heuristics for selecting appropriate methods, students will gain the programming skills needed to implement such methods and will study the core mathematical constructs and optimization techniques behind them. A primary emphasis will be on understanding the methodologies through numerical simulations and analysis of real-world data. A high-level programming language such as R or Python will be used for the computation, and students will become familiar with existing packages for implementing specific methods.
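To indicate the intended computational flavor, the sketch below uses R (one of the languages mentioned above) with the MASS and class packages; the simulated data, variable names, and choice of k are purely illustrative and not part of the official course materials. It fits a linear discriminant classifier and a k-nearest neighbor classifier through existing package functions and compares their test-set misclassification rates.

    set.seed(1)
    n  <- 200
    df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    df$y  <- factor(ifelse(df$x1 + df$x2 + rnorm(n) > 0, "A", "B"))
    train <- sample(n, n / 2)              # random train/test split

    library(MASS)   # provides lda()
    library(class)  # provides knn()

    # Linear discriminant analysis, fit on the training half only
    fit_lda  <- lda(y ~ x1 + x2, data = df, subset = train)
    pred_lda <- predict(fit_lda, newdata = df[-train, ])$class

    # k-nearest neighbor classifier (k = 5) using an existing package
    pred_knn <- knn(train = df[train, c("x1", "x2")],
                    test  = df[-train, c("x1", "x2")],
                    cl    = df$y[train], k = 5)

    # Test-set misclassification rates for the two methods
    mean(pred_lda != df$y[-train])
    mean(pred_knn != df$y[-train])

An analogous workflow is available in Python (for example through scikit-learn), so either language supports the course's emphasis on numerical experimentation.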
Summary of course contents:
- Concepts of statistical learning
  - Prediction versus estimation
  - Accuracy versus interpretability
  - Supervised methods versus unsupervised methods
- Model assessment
  - Bias-variance tradeoff
  - Model complexity
  - Optimization concepts
- Classification
  - Linear and quadratic discriminant analysis
  - Logistic regression
  - K-nearest neighbor classifier
  - Decision tree classifiers
  - Maximum margin classifiers
- Resampling methods
  - Cross-validation (see the sketch following this list)
  - Leave-one-out validation and the jackknife
  - Bootstrap procedures
- Linear regression
  - Transformation of variables
  - Overfitting and regularization
  - Variable selection: AIC and BIC criteria
  - Stepwise regression
  - L1-penalized regression
- Nonparametric smoothing techniques
  - Kernel smoothing and splines
  - Local polynomial regression
  - Density estimation
  - Bandwidth selection
  - Generalized additive models
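As a small illustration of how several of the topics above (resampling, regularization, and variable selection) interact in practice, the sketch below assumes the glmnet package in R and simulated data; the ten folds and the lambda.min rule are illustrative choices, not requirements of the course.

    # Lasso regression with the penalty level chosen by 10-fold cross-validation
    library(glmnet)

    set.seed(2)
    n <- 100; p <- 20
    X <- matrix(rnorm(n * p), n, p)
    beta <- c(2, -1.5, 1, rep(0, p - 3))    # only the first three predictors matter
    y <- drop(X %*% beta + rnorm(n))

    cv_fit <- cv.glmnet(X, y, alpha = 1, nfolds = 10)   # alpha = 1 gives the L1 (lasso) penalty
    cv_fit$lambda.min                   # penalty value minimizing estimated prediction error
    coef(cv_fit, s = "lambda.min")      # sparse coefficient estimates at that penalty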
Illustrative reading:
- An Introduction to Statistical Learning, with Applications in R -- James, Witten, Hastie, Tibshirani
- Modern Multivariate Statistical Techniques, 2nd Ed. -- A. J. Izenman
Potential Overlap:
Some of the broad topics, such as classification and regression, overlap with STA 135. However, the emphasis in STA 135 is on understanding methods within the context of a statistical model, their mathematical derivations, and their broad application domains. In contrast, STA 142A focuses more on the statistical principles and algorithms inherent in the formulation of the methods, their advantages and limitations, and their actual performance as evidenced by numerical simulations and data analysis. The computational component has some overlap with STA 141B, where the emphasis is instead on data visualization and data preprocessing.
The overlap with ECS 171 is more substantial. Both courses cover the fundamentals of the various methods and techniques, their implementation, and their applications. However, the focus in ECS 171 is more on optimization aspects and on neural networks, while the focus in STA 142A is more on statistical aspects such as smoothing and model selection techniques. In addition, ECS 171 covers both unsupervised and supervised learning methods in one course, whereas STA 142A is dedicated to supervised learning methods only.
History:
First offered Winter 2020.