"All models are wrong, but some are useful." - George E. P. Box

## Administrative information

- **Lectures**: **Thur.** 12:30 - 15:15, Mong Man Wai Building 710
- **Instructor**: Ben Dai
- **Colab**: notebook or click `Open in Colab`
- **GitHub**: CUHK-STAT3009
- **Office hour**: **Thur.** *Appointment only*

## Course Content

**Description:**

Commercial sites such as search engines, advertisers and media (e.g., Netflix, Amazon), and financial institutions employ recommender systems for content recommendation, predicting customer behavior, compliance, or risk. This course provides an overview of predictive models for recommender systems, including content-based and collaborative algorithms, matrix factorization, and deep learning models. The course also demonstrates Python implementations of existing recommender systems.

**What you'll learn:**

- Understand the principles behind recommender system approaches such as correlation-based collaborative filtering, latent factor models, and neural recommender systems
- Implement and analyze recommender systems for real applications using Python, sklearn, and TensorFlow
- Choose and design suitable models for different applications

**Prerequisites:**

- **Calculus & Linear algebra**: inner product, matrix-vector product, linear regression (OLS).
- **Basic Statistics**: basics of distributions, probabilities, mean, standard deviation, etc.
- **Python**: basic grammar; NumPy, pandas, TensorFlow libraries
- (*Recommended*) Completed Machine Learning Crash Course either in-person, online, or self-study, or you have equivalent knowledge.

## Reference Textbooks

The following textbooks are useful, but none matches our course exactly.

Charu C. Aggarwal (2016). Recommender Systems. Springer Nature Switzerland.

Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor (2011). Recommender Systems Handbook. Springer New York, NY.

## Grading (tentative)

**Coursework:**

**Homeworks** (15%): There will be three homework assignments. Please submit each one as a well-documented *Jupyter Notebook*.

- HW 1 (5%): Implementation of k-fold cross-validation
- HW 2 (5%): Practice of ALS-related algorithms
- HW 3 (5%): Prototyping neural networks in recommender systems via TensorFlow

**In-class quizzes** (coding and exercises) (30%): Open-book exams; the problems will be similar to the homework and the examples in the lectures.

- Quiz 1 (5%): Implement baseline methods and correlation-based collaborative filtering
- Quiz 2 (25%): STAT & Python exercise

**Real application project** (55%): A full analysis provided in the form of a report and a Jupyter notebook: (1) an executable notebook containing the analysis performed on the data; (2) a technical report covering (i) the mathematical form and intuitive interpretation of your predictive models and (ii) details of the data processing and hyperparameter tuning.

- Proj 1 (27%): Real-time Kaggle competition based on matrix factorization
- Proj 2 (28%): Real-time Kaggle competition based on a real dataset

**Collaboration policy**: you may form a group to complete your real application projects. A group may have at most 2 members. The contribution of each member should be clearly stated in the final report. You will receive an extra 5% (of the project score) if you work solo on a project.

**Honesty**: Our course places very high importance on honesty in coursework submitted by students, and adopts a policy of *zero tolerance* on academic dishonesty.

**(Late) submission**: Homework and projects are submitted via BlackBoard; competition entries are submitted via Kaggle. Late submissions are *penalized* **10%** of the credit per 6 hours.

**All students welcome**: we are happy to have auditors in our lectures.

## Schedule (tentative)

The slides will be released just before each lecture, and the code will be published on Colab just after the lecture. The well-structured code is also publicly available on GitHub: CUHK-STAT3009.

| Date | Description | Course Materials | Events | Deadlines |
|---|---|---|---|---|
| Prepare | Course information [slides]; Python Tutorial [YouTube]; Numpy, Pandas, Matplotlib [notes] [YouTube] | Suggested Readings: learnpython.org; The Python Tutorial (official Python documentation) | | |
| Sep 07 | Background and baseline methods [slides] [colab] [github] | | | |
| Sep 14 | Correlation-based RS [slides] [colab] [github] | | | |
| Sep 21 | Quiz 1: implement baseline methods and correlation-based RS [instruct] [report] | In-class quiz via Kaggle (link on BlackBoard) | | |
| Sep 28 | ML overview [slides] [colab] [github] | Suggested Readings: Chapters 2-3 in The Elements of Statistical Learning; Linear regression in sklearn | HW 1 release [colab] | |
| Oct 05 | Matrix factorization I: ALS/BCD [slides] [colab] [github] | Suggested Readings: Netflix Update: Try This at Home (first one applied MF in RS); Matrix factorization techniques for recommender systems; Finding Similar Music using Matrix Factorization; Matrix completion and low-Rank SVD via fast alternating least squares; Coordinate Descent (Slides by Ryan Tibshirani) | HW 2 release [colab] | HW 1 due [sol] |
| Oct 12 | Matrix factorization II: SGD [slides] [colab] [github] | Suggested Readings: Stochastic Gradient Descent (sklearn documentation); Stochastic Gradient Descent Algorithm With Python and NumPy | Proj 1 release [instruct] | HW 2 due [sol] |
| Oct 19 | Factorization Meets the Neighborhood [slides] [github] | | | |
| Oct 26 | Case Study: MovieLens [slides] [github] | Suggested Readings: Home Depot Product Search Relevance (Kaggle competition) | | Proj 1 due |
| Nov 02 | Neural Networks [slides] [github] | Suggested Readings: Chapter 11 in The Elements of Statistical Learning; Neural Networks and Deep Learning (free online book) | Proj 2 release [instruct] | |
| Nov 09 | Neural collaborative filtering [slides] [github] | | HW 3 release [colab] | |
| Nov 16 | Side information [slides] [github] | | | HW 3 due |
| Nov 30 | Quiz 2: Math & Python | In-class quiz | | |
| - | | | | Proj 2 due |