Course overview

Topic	Math prerequisites	Textbook references	StatQuest videos
Linear regression	Matrix notation, matrix–vector multiplication (IALA Sec. II.5); Inner product/dot product (IALA Sec. I.1); Norm of a vector (IALA Sec. I.3); Matrix inverse (IALA Sec. II.11); Derivatives and optimization (IALA App. C)	ISL: Ch. 3 (intuition); PRML: Sec. 3.1.1; IALA: Sec. 12.2 (OLS derivation)	Linear regression, clearly explained!!!; Multiple regression, clearly explained!!!; The Chain Rule
Gradient descent	Derivatives and optimization (IALA App. C); Complexity of algorithms, especially vector/matrix operations (IALA App. B, Sec. I.1, Sec. II.6)	PRML: Sec. 3.1.3 (Sequential learning)	Gradient descent, step-by-step
Error decomposition and bias–variance tradeoff	Expectation of a random variable (linearity, constants); Variance of a random variable; Independence of random variables	ESL: Sec. 2.9, Ch. 7; PRML: Sec. 3.2	Machine Learning Fundamentals: Bias and Variance
Model selection	None	ISL: Sec. 5.1 (Cross-Validation); ESL: Ch. 7; PRML: Sec. 1.3	Cross validation
Regularization	Derivatives and optimization (IALA App. C); Norm of a vector (IALA Sec. I.3)	ISL: Sec. 6.2; PRML: Sec. 3.1.4	Ridge regression (L2 regularization); Lasso regression (L1 regularization); Elasticnet regression (L1 and L2)
Logistic regression	Probability mass functions; Bernoulli random variable; Independence of samples; Conditional probability; Joint probability	ISL: Sec. 4.3; PRML: Sec. 4.3.2, Sec. 4.3.4	Logistic Regression; Logistic Regression Details Pt1: Coefficients; Logistic Regression Details Pt 2: Maximum Likelihood; Odds and Log(Odds), Clearly Explained!!!; ROC and AUC, Clearly Explained!; Confusion matrix; Sensitivity and Specificity
K-nearest neighbor	Expectation of a random variable (linearity, constants); Variance of a random variable; Independence of random variables	ESL: Sec. 13.3–13.5	K-nearest neighbors, Clearly Explained
Decision trees and ensembles	Variance of a random variable; Independence of random variables; Variance of a sum of random variables	ISL: Ch. 8; ESL: Sec. 9.2, Ch. 10, Ch. 15	Decision Trees, Clearly Explained!!!; Decision Trees, Part 2 - Feature Selection and Missing Data; Regression Trees, Clearly Explained!!!; How to Prune Regression Trees, Clearly Explained!!!; Random Forests Part 1: Building, Using and Evaluating; AdaBoost, Clearly Explained
Support vector machines, kernels, other kernel-based models	Expectation of a random variable (linearity, constants); Variance of a random variable; Independence of random variables; Conditional distributions; Gaussian distribution	ISL: Ch. 9; ESL: Ch. 12	Support Vector Machines Part 1 (of 3): Main Ideas!!!; SVM with Polynomial kernel; SVM with RBF kernel
Neural networks	Derivatives and optimization (IALA App. C); Chain rule for multivariable functions	PRML: Ch. 5; ESL: Ch. 11; ISL: Sec. 10.1, Sec. 10.7.1	Neural Networks Pt. 1: Inside the Black Box; Neural Networks Pt. 2: Backpropagation Main Ideas; Backpropagation Details Part 1; Backpropagation Details Part 2; Neural Networks Pt. 3: ReLU In Action!!!; Neural Networks Pt. 4: Multiple Inputs and Outputs; Neural Networks Part 5: ArgMax and SoftMax; Neural Networks Part 6: Cross Entropy; Neural Networks Part 7: Cross Entropy Derivatives and Backpropagation; Introduction to PyTorch
Deep neural networks, convolutional neural networks	None	ISL: Sec. 10.2, Sec. 10.3, Sec. 10.8	Image Classification with Convolutional Neural Networks (CNNs)
Unsupervised learning	Expectation and variance of random variables; Covariance and covariance matrices; Eigenvalues and eigenvectors; Probability distributions; Joint and conditional probability	ESL: Ch. 14	PCA main ideas in only 5 minutes!!!; Principal Component Analysis (PCA), Step-by-Step; K-means clustering; Word Embedding and Word2Vec, Clearly Explained!!!
Reinforcement learning	Derivatives and optimization (IALA App. C); Derivatives of exponential and log functions; Expectation of random variables; Conditional probability	RL: Sec. 2.1, 2.2, 2.4, 2.5, 2.8, 6.1, 6.5, 13.3	Reinforcement Learning: Essential Concepts

Legend:

ISL – An Introduction to Statistical Learning (James et al.)
ESL – The Elements of Statistical Learning (Hastie, Tibshirani, Friedman)
PRML – Pattern Recognition and Machine Learning (Bishop)
IALA – Introduction to Applied Linear Algebra (Boyd & Vandenberghe)
RL – Reinforcement Learning: An Introduction (Sutton & Barto)