Overview
This table will be updated weekly with links to course materials (lecture handouts, recordings, lab manuals) as we progress through the course.
| Week | Lecture topic | Lab | Links |
|---|---|---|---|
| 1 (1/26) | Basic principles of machine learning systems engineering and operations | Hello, Chameleon; Hello, Linux | |
| 2 (2/2) | Cloud computing | Cloud computing on Chameleon | |
| 3 (2/9) | DevOps and continuous X for ML systems | Build an MLOps pipeline on Chameleon | |
| 4 (2/17) | Large scale data systems | Persistent storage on Chameleon | |
| 5 (2/23) | Model training at scale | Large-scale model training on Chameleon | |
| 6 (3/2) | Model training infrastructure and platforms | Train ML models with MLflow and Ray | |
| 7 (3/9) | Model serving | Model optimizations for serving; Serving on edge devices; System optimizations for model serving | |
| 8 (3/30) | Monitoring and evaluating ML systems | Offline evaluation of ML systems; Online evaluation of ML systems; Closing the feedback loop | |
| 9 (4/6) | Safeguarding ML systems | | |
| 10 (4/13) | Using commercial clouds | | |
| 11 (4/20) | Additional topic: MLOps for GenAI | | |
| 12 (4/27) | Additional topic: RAG | | |
| 13 (5/4) | Additional topic: Agents and MCP | | |