STAT6050 Statistical Learning Theory

“There is Nothing More Practical Than A Good Theory.” - Kurt Lewin

📝 Administrative information

  • ⏲️ Lectures: Fri. 10:30AM - 12:15PM
  • 🏫 Room: Lady Shaw Bldg C5, CUHK
  • 👨‍🏫 Instructor: Ben Dai
  • ⏳ Office hours: Fri. 2:00PM - 3:00PM

🗓️ Schedule (tentative)

Week 02: Approximation error and estimation error
Week 03: Uniform concentration inequality
Week 04: Rademacher complexity I
Week 05: Rademacher complexity II
Week 06: Method of regularization
Week 07: Non-parametric regression on RKHS
Week 08: Classification: Fisher consistency and calibrated surrogate losses
Week 09: [Revisiting Excess Risk Bounds: chain argument]
Week 10: [Revisiting Excess Risk Bounds: local complexity and random entropy]
Week 11: [Case study: recommender systems]
Week 12: [Case study: ranking]
Week 13: [Case study: neural networks]

🧾 Course Content

🖥️ Description:

This course provides tools for the theoretical analysis of statistical machine learning methods. It covers approaches such as parametric models, neural networks, kernel methods, and SVMs, applied to tasks such as regression, classification, recommender systems, and ranking. The focus is on developing a theoretical understanding of, and insight into, the statistical properties of learning methods.

🔑 Key words:

  • Empirical risk minimization
  • Estimation error, approximation error
  • Regret or excess risk bounds, convergence rate, consistency
  • Fisher consistency, calibrated surrogate losses
  • Uniform concentration inequality
  • Rademacher complexity, covering number, entropy
  • Penalization, method of sieve
  • Local/random complexity

πŸ—οΈ Prerequisites:

  • Probability at the level of STAT5005 or equivalent (plus mathematical maturity).

💯 Grading (tentative)

👨‍💻 Coursework:

  • Homework (50%)

There will be three homework assignments. You are welcome to discuss problems with other students, but your final solutions must be entirely your own work. You will receive one bonus point for an assignment typeset in LaTeX or Markdown. Scanned handwritten submissions are accepted, but without the bonus point. Late submissions will not be accepted.

  • Paper review / project (50%)

You will write a review of 2-3 papers on the same topic, which can be in any area related to the course. (1) You should summarize and critique the assumptions and theoretical results of the papers and discuss their overall contributions. (2) You may also extend a theoretical result, develop a new method and investigate its performance, or run experiments to assess the applicability of the methods. You may work on the project in groups of up to three; see the Collaboration policy.

πŸ‘¨πŸ»β€πŸ€β€πŸ‘¨πŸΎ Collaboration policy: we admit you to form a group to finish your final project. The number of group members should be smaller or equal than 3. The contribution of each member should be clearly stated in the final report. You will receive one (1) bonus point if you work solo to projects.

📋 Textbooks

  1. Koltchinskii, V. (2011). Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: Ecole d’Eté de Probabilités de Saint-Flour XXXVIII-2008 (Vol. 2033). Springer Science & Business Media.

  2. Van der Vaart, A. W., & Wellner, J. (1996). Weak Convergence and Empirical Processes: with Applications to Statistics. Springer Science & Business Media.

  3. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.

  4. Anthony, M., & Bartlett, P. L. (1999). Neural Network Learning: Theoretical Foundations (Vol. 9). Cambridge: Cambridge University Press.

🧾 Reference courses

  1. Peter Bartlett, CS 281B / Stat 241B: Statistical Learning Theory

  2. Tengyu Ma, STATS214 / CS229M: Machine Learning Theory

  3. Larry Wasserman, 36-708: Statistical Methods for Machine Learning

  4. Clayton Scott, EECS 598: Statistical Learning Theory

  5. Yoonkyung Lee, STAT 881: Advanced Statistical Learning