“There is Nothing More Practical Than A Good Theory.” — Kurt Lewin
- 📝 Administrative information
- 🗓️ Schedule (tentative)
- 🧾 Course Content
- 💯 Grading (tentative)
- 📋 Textbooks
- 🧾 Reference course
📝 Administrative information
- ⏲️ Lectures: Fri. 10:30AM - 12:15PM
- 🏫 Room: Lady Shaw Bldg C5, CUHK
- 👨🏫 Instructor: Ben Dai
- 👨💼 TA: Hao Shi
- ⏳ Office hours: Fri. 2:00PM - 3:00PM
🗓️ Schedule (tentative)
|Week02||Approximation error and estimation error|
|Week03||Uniform concentration inequality|
|Week04||Rademacher complexity I|
|Week05||Rademacher complexity II|
|Week06||Method of regularization|
|Week07||Nonparametric regression on RKHS|
|Week08||Classification: Fisher consistency and calibrated surrogate losses|
|Week09||[Revisiting Excess Risk Bounds: chain argument]|
|Week10||[Revisiting Excess Risk Bounds: local complexity and random entropy]|
|Week11||[Case study: recommender systems]|
|Week12||[Case study: ranking]|
|Week13||[Case study: neural networks]|
🧾 Course Content
This course provides tools for the theoretical analysis of statistical machine learning methods. It covers approaches such as parametric models, neural networks, kernel methods, and SVMs, applied to tasks such as regression, classification, recommender systems, and ranking. The focus is on developing a theoretical understanding of, and insight into, the statistical properties of learning methods.
🔑 Key words:
- Empirical risk minimization
- Estimation error, approximation error
- Regret or excess risk bounds, convergence rate, consistency
- Fisher consistency, calibrated surrogate losses
- Uniform concentration inequality
- Rademacher complexity, covering number, entropy
- Penalization, method of sieve
- Local/random complexity
- Prerequisites: probability at the level of STAT5005 or equivalent (plus mathematical maturity). This is an advanced theory course; a strong mathematical, statistical, and probabilistic background is necessary.
💯 Grading (tentative)
- Homework (50%)
There will be three homework assignments. You are welcome to discuss problems with other students, but the final solutions must be entirely your own. You will receive one bonus point for an assignment typeset in LaTeX or Markdown. Scanned handwritten submissions are accepted, but without the bonus point. Late submissions will not be accepted.
- Paper review / project (50%)
You will write a review of 2-3 papers on the same topic, which can be in any area related to the course. (1) You should summarize and critique the assumptions and theoretical results in the papers, and discuss their overall contributions. (2) You might extend a theoretical result, develop a new method and investigate its performance, or run experiments to assess the applicability of the methods. You may work on the project in a group of up to three; see the Collaboration policy.
👨🏻🤝👨🏾 Collaboration policy: you may form a group to complete the final project. A group may have at most 3 members. The contribution of each member should be clearly stated in the final report. You will receive one (1) bonus point if you work on the project solo.
📋 Textbooks
Koltchinskii, V. (2011). Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: Ecole d’Eté de Probabilités de Saint-Flour XXXVIII-2008 (Vol. 2033). Springer Science & Business Media.
van der Vaart, A. W., & Wellner, J. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Science & Business Media.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
Anthony, M., & Bartlett, P. L. (1999). Neural Network Learning: Theoretical Foundations (Vol. 9). Cambridge University Press.