Data Science Toolbox
CUHK • Department of Statistics • STAT1013
This course gives a conceptual introduction to the data scientist’s toolbox, together with its implementation and interpretation in practice. The course has three components. The first is a practical introduction to the tools used in the project, such as Python, Colab, Jupyter Notebook, and Markdown. The second is a conceptual introduction to A/B testing. The third consists of case studies of A/B tests based on the toolbox.
👌 What you’ll learn:
- Understand the principles behind statistical inference and A/B testing;
- Become familiar with the Data Science Toolbox: Python (numpy, pandas, seaborn, sklearn), Colab, Jupyter Notebook, and Markdown;
- Analyze continuous and categorical data using statistics, Python programming in Colab, and other software as appropriate;
- Use advanced Python tools to describe, summarize, and visualize datasets;
- Understand and implement good coding practices for statistical inference on A/B tests and for statistical learning/prediction on tabular data.
🏗️ Prerequisites:
- Calculus & Linear algebra: inner product, matrix-vector product.
- Basic Statistics: STAT1011-level statistics, including the basics of distributions, probability, conditional probability, mean, standard deviation, etc.
- Python: basic syntax; numpy and pandas.
- ⏲️ Lectures: Wed 11:30AM - 2:15PM
- 🎒 Lecture/Recitation Location: Y.C. Liang Hall 103
- 💻 HW submission: BlackBoard
- ⌨️ Colab: notebook, or click Open in Colab
All students are welcome: we are happy to have you sit in on our lectures.
📋 Reference Textbooks
The following textbooks are useful, but none of them matches our course exactly.
- VanderPlas, J. (2016). Python data science handbook: Essential tools for working with data. O’Reilly Media, Inc.
- Bruce, P., & Bruce, A. (2017). Practical statistics for data scientists. O’Reilly Media, Inc.
- Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy online controlled experiments: A practical guide to A/B testing. Cambridge University Press.