STAT6050 Statistical Learning Theory
“There is Nothing More Practical Than A Good Theory.” – Kurt Lewin
- Administrative information
- Schedule (tentative)
- Course Content
- Grading (tentative)
- Textbooks
- Reference courses
Administrative information
- Lectures: Fri. 10:30AM - 12:15PM
- Room: Lady Shaw Bldg C5, CUHK
- Instructor: Ben Dai
- Office hours: Fri. 2:00PM - 3:00PM
Schedule (tentative)
| Week | Content |
|---|---|
| Week01 | Introduction |
| Week02 | Approximation error and estimation error |
| Week03 | Uniform concentration inequality |
| Week04 | Rademacher complexity I |
| Week05 | Rademacher complexity II |
| Week06 | Method of regularization |
| Week07 | Non-parametric regression on RKHS |
| Week08 | Classification: Fisher consistency and calibrated surrogate losses |
| Week09 | [Revisiting excess risk bounds: chaining argument] |
| Week10 | [Revisiting excess risk bounds: local complexity and random entropy] |
| Week11 | [Case study: recommender systems] |
| Week12 | [Case study: ranking] |
| Week13 | [Case study: neural networks] |
Course Content
Description:
This course will provide tools for the theoretical analysis of statistical machine learning methods. It will cover approaches such as parametric models, neural networks, kernel methods, and SVMs, applied to tasks such as regression, classification, recommender systems, and ranking, and it will focus on developing a theoretical understanding of, and insight into, the statistical properties of learning methods.
Key words:
- Empirical risk minimization
- Estimation error, approximation error
- Regret or excess risk bounds, convergence rate, consistency
- Fisher consistency, calibrated surrogate losses
- Uniform concentration inequality
- Rademacher complexity, covering number, entropy
- Penalization, method of sieves
- Local/random complexity
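
The keywords above revolve around a few recurring objects. As a quick orientation, the LaTeX sketch below records the standard textbook definitions behind the first several keywords (empirical risk minimization, the estimation/approximation error decomposition, and Rademacher complexity); the notation is illustrative rather than the course's official notation.

```latex
% A minimal, compilable sketch of standard definitions behind the keywords
% above; the notation is illustrative, not course-specific.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

Given i.i.d.\ data $(X_1,Y_1),\dots,(X_n,Y_n)\sim P$, a loss $\ell$, and a
hypothesis class $\mathcal{F}$, the risk and the empirical risk are
\[
  R(f)=\mathbb{E}\,\ell\bigl(f(X),Y\bigr), \qquad
  \widehat{R}_n(f)=\frac{1}{n}\sum_{i=1}^{n}\ell\bigl(f(X_i),Y_i\bigr).
\]
Empirical risk minimization selects
$\widehat{f}_n\in\arg\min_{f\in\mathcal{F}}\widehat{R}_n(f)$, and its excess
risk over the Bayes risk $R^{*}=\inf_{f}R(f)$ decomposes as
\[
  R(\widehat{f}_n)-R^{*}
  =\underbrace{R(\widehat{f}_n)-\inf_{f\in\mathcal{F}}R(f)}_{\text{estimation error}}
  +\underbrace{\inf_{f\in\mathcal{F}}R(f)-R^{*}}_{\text{approximation error}}.
\]
The estimation error is typically bounded via uniform concentration over
$\mathcal{F}$, for instance through the Rademacher complexity
\[
  \mathfrak{R}_n(\mathcal{F})
  =\mathbb{E}\sup_{f\in\mathcal{F}}\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i f(X_i),
  \qquad \varepsilon_1,\dots,\varepsilon_n\ \text{i.i.d.\ signs in }\{\pm 1\}.
\]

\end{document}
```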
Prerequisites:
- Probability at the level of STAT5005 or equivalent (plus mathematical maturity).
Grading (tentative)
Coursework:
- Homework (50%)
There will be three homework assignments. You are welcome to discuss the problems with other students, but the final solutions must be entirely your own. You will receive one bonus point for a typed submission prepared in LaTeX or Markdown; scanned handwritten submissions will be accepted, but without the bonus point. Late submissions will not be accepted.
- Paper review / project (50%)
You will write a review of 2-3 papers on the same topic, which can be in any area related to the course. (1) You should summarize and critique the assumptions and theoretical results in the papers, and discuss their overall contributions. (2) You might extend a theoretical result, develop a new method and investigate its performance, or run experiments to assess the applicability of the methods. You may work on the project in a group of up to three; see the Collaboration policy.
Collaboration policy: You may form a group to complete the final project. A group should have at most three members, and the contribution of each member should be clearly stated in the final report. You will receive one (1) bonus point if you work on the project solo.
Textbooks
- Koltchinskii, V. (2011). Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: École d'Été de Probabilités de Saint-Flour XXXVIII-2008 (Vol. 2033). Springer Science & Business Media.
- van der Vaart, A. W., & Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Science & Business Media.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
- Anthony, M., & Bartlett, P. L. (1999). Neural Network Learning: Theoretical Foundations (Vol. 9). Cambridge University Press.
Reference courses
- Peter Bartlett, CS 281B / Stat 241B: Statistical Learning Theory
- Tengyu Ma, STATS214 / CS229M: Machine Learning Theory
- Larry Wasserman, 36-708: Statistical Methods for Machine Learning
- Clayton Scott, EECS 598: Statistical Learning Theory
- Yoonkyung Lee, STAT 881: Advanced Statistical Learning