"All models are wrong, but some are useful." – George E. P. Box
Administrative information
- Lectures: Thur. 12:30 - 15:15, Mong Man Wai Building 710
- Instructor: Ben Dai
- Colab: notebook, or click "Open in Colab"
- GitHub: CUHK-STAT3009
- Office hour: Thur., by appointment only
Course Content
Description:
Commercial sites such as search engines, advertisers and media (e.g., Netflix, Amazon), and financial institutions employ recommender systems for content recommendation, predicting customer behavior, compliance, or risk. This course provides an overview of predictive models for recommender systems, including correlation-based collaborative filtering, matrix factorization, and deep learning models. The course also demonstrates Python implementations of existing recommender systems.
What you'll learn:
- Understand the principles behind recommender system approaches such as correlation-based collaborative filtering, latent factor models, and neural recommender systems
- Implement and apply recommender systems to real applications using Python, sklearn, and TensorFlow
- Choose and design suitable models for different applications
Prerequisites:
- Calculus & linear algebra: inner products, matrix-vector products, linear regression (OLS).
- Basic statistics: distributions, probabilities, mean, standard deviation, etc.
- Python: basic syntax; the NumPy, pandas, and TensorFlow libraries (see the short sketch after this list)
- (Recommended) Completed a Machine Learning Crash Course in person, online, or via self-study, or equivalent knowledge.
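For reference, the hypothetical snippet below illustrates the expected level of the math and Python prerequisites: an inner product, a matrix-vector product, and an OLS fit in NumPy. It is a minimal sketch for self-assessment, not course material.

```python
import numpy as np

# Inner product and matrix-vector product
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print(np.dot(x, y))            # inner product: 32.0

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
print(A @ x)                   # matrix-vector product: [7. 5.]

# Ordinary least squares: minimize ||X beta - y||^2
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([0.5, -1.0, 2.0])
y_obs = X @ beta_true + 0.1 * rng.normal(size=100)
beta_hat, *_ = np.linalg.lstsq(X, y_obs, rcond=None)
print(beta_hat)                # estimates close to beta_true
```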
Reference Textbooks
The following textbooks are useful, but none of them matches our course exactly.
Charu C. Aggarwal (2016). Recommender Systems. Springer Nature Switzerland.
Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor (2011). Recommender Systems Handbook. Springer New York, NY.
Grading (tentative)
Coursework:
- Homework (15%): There will be three homework assignments. Please submit each as a well-documented Jupyter notebook.
- HW 1 (5%): Implementation of k-fold cross-validation (see the sketch after this list)
- HW 2 (5%): Practice with ALS-related algorithms
- HW 3 (5%): Prototyping neural networks for recommender systems via TensorFlow
- In-class quizzes (coding and exercises) (30%): Open-book; problems will be similar to the homework and the examples in the lectures.
- Quiz 1 (5%): Implement baseline methods and correlation-based collaborative filtering
- Quiz 2 (25%): STAT & Python exercise
- Real application project (55%): A full analysis delivered in the form of a report and a Jupyter notebook: (1) an executable notebook containing the analysis performed on the data; (2) a technical report that includes (i) the mathematical form and an intuitive interpretation of your predictive models and (ii) details of the data processing and hyperparameter tuning.
- Proj 1 (27%): Real-time Kaggle competition based on matrix factorization
- Proj 2 (28%): Real-time Kaggle competition based on a real dataset
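As a rough illustration of what HW 1 involves, here is a minimal sketch of k-fold cross-validation in NumPy; the function name, interface, and the global-mean baseline are illustrative assumptions, not the assignment's required format.

```python
import numpy as np

def kfold_indices(n_samples, n_folds=5, seed=0):
    """Randomly partition sample indices into (train, valid) splits."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        valid_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train_idx, valid_idx

# Example: cross-validated RMSE of a global-mean rating predictor
ratings = np.random.default_rng(1).uniform(1, 5, size=100)
rmses = []
for train_idx, valid_idx in kfold_indices(len(ratings), n_folds=5):
    pred = ratings[train_idx].mean()                        # fit on training folds
    rmses.append(np.sqrt(np.mean((ratings[valid_idx] - pred) ** 2)))
print(np.mean(rmses))                                       # average validation RMSE
```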
Collaboration policy: you may form a group to complete the real application projects. Each group should have at most two members, and the contribution of each member must be clearly stated in the final report. You will receive an extra 5% (of the project grade) if you work on the projects solo.
Honesty: Our course places very high importance on honesty in coursework submitted by students, and adopts a policy of zero tolerance on academic dishonesty.
(Late) submission: Homework and projects are submitted via Blackboard; the competitions are submitted via Kaggle. Late submissions are penalized 10% of the credit per 6 hours.
All students welcome: we are happy to have auditors in our lectures.
Schedule (tentative)
The slides will be released just before each lecture, and the code will be published on Colab just after the lecture. The well-structured code is also publicly available on GitHub: CUHK-STAT3009.
| Date | Description | Course Materials | Events | Deadlines |
| --- | --- | --- | --- | --- |
| Prepare | Course information [slides]; Python Tutorial [YouTube]; Numpy, Pandas, Matplotlib [notes] [YouTube] | Suggested Readings: | | |
| Sep 07 | Background and baseline methods [slides] [colab] [github] | Suggested Readings: | | |
| Sep 14 | Correlation-based RS [slides] [colab] [github] | Suggested Readings: | | |
| Sep 21 | Quiz 1: implement baseline methods and correlation-based RS [instruct] [report] | | In-class quiz via Kaggle (link on Blackboard) | |
| Sep 28 | ML overview [slides] [colab] [github] | Suggested Readings: | HW 1 release [colab] | |
| Oct 05 | Matrix factorization I: ALS/BCD [slides] [colab] [github] | Suggested Readings: | HW 2 release [colab] | HW 1 due [sol] |
| Oct 12 | Matrix factorization II: SGD [slides] [colab] [github] (see the sketch after this table) | Suggested Readings: | Proj 1 release [instruct] | HW 2 due [sol] |
| Oct 19 | Factorization Meets the Neighborhood [slides] [github] | Suggested Readings: | | |
| Oct 26 | Case Study: MovieLens [slides] [github] | Suggested Readings: | | Proj 1 due |
| Nov 02 | Neural Networks [slides] [github] | Suggested Readings: | Proj 2 release [instruct] | |
| Nov 09 | Neural collaborative filtering [slides] [github] | Suggested Readings: | HW 3 release [colab] | |
| Nov 16 | Side information [slides] [github] | Suggested Readings: | | HW 3 due |
| Nov 30 | Quiz 2: Math & Python | | In-class quiz | |
| - | | | | Proj 2 due |
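For a flavor of the matrix factorization lectures (Oct 05 and Oct 12), the sketch below trains a tiny latent factor model with SGD, predicting each rating as the inner product of a user factor and an item factor. The toy data, hyperparameters, and variable names are illustrative assumptions, not the course's reference implementation.

```python
import numpy as np

# Toy (user, item, rating) triplets
data = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
        (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, n_factors = 3, 3, 2
lr, reg, n_epochs = 0.05, 0.1, 200                     # illustrative hyperparameters

rng = np.random.default_rng(0)
P = 0.1 * rng.standard_normal((n_users, n_factors))    # user latent factors
Q = 0.1 * rng.standard_normal((n_items, n_factors))    # item latent factors

for _ in range(n_epochs):
    for u, i, r in data:
        p_u, q_i = P[u].copy(), Q[i].copy()
        err = r - p_u @ q_i                             # prediction error for this rating
        P[u] += lr * (err * q_i - reg * p_u)            # gradient step on the user factor
        Q[i] += lr * (err * p_u - reg * q_i)            # gradient step on the item factor

print(np.round(P @ Q.T, 2))                             # predicted rating matrix
```

The regularization term shrinks the factors so the model does not overfit the handful of observed ratings; the ALS/BCD approach covered on Oct 05 optimizes the same objective by alternately solving least squares problems for P and Q.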