Overview

This table will be updated weekly with links to course materials (lecture handouts, recordings, lab manuals) as we progress through the course.

Week | Lecture topic | Lab | Links
1 (1/26) | Basic principles of machine learning systems engineering and operations | Hello, Chameleon; Hello, Linux |
2 (2/2) | Cloud computing | Cloud computing on Chameleon |
3 (2/9) | DevOps and continuous X for ML systems | Build an MLOps pipeline on Chameleon |
4 (2/17) | Large scale data systems | Persistent storage on Chameleon |
5 (2/23) | Model training at scale | Large-scale model training on Chameleon |
6 (3/2) | Model training infrastructure and platforms | Train ML models with MLFlow and Ray |
7 (3/9) | Model serving | Model optimizations for serving; Serving on edge devices; System optimizations for model serving |
8 (3/30) | Monitoring and evaluating ML systems | Offline evaluation of ML systems; Online evaluation of ML systems; Closing the feedback loop |
9 (4/6) | Safeguarding ML systems | |
10 (4/13) | Using commercial clouds | |
11 (4/20) | Additional topic: MLOps for GenAI | |
12 (4/27) | Additional topic: RAG | |
13 (5/4) | Additional topic: Agents and MCP | |