Course Info
Welcome to the Winter 2025 edition of the Deep Learning Dynamics course!
This course explores the theoretical and practical aspects of how neural networks learn and generalize. Its goal is not to introduce deep learning architectures or algorithms; instead, it focuses on understanding the training dynamics of neural networks and how they generalize. We will discuss in detail the optimization challenges involved in training overparameterized models.
Tentative course content: Deep Learning (DL) Review - Optimization for DL - Neural Networks Loss Landscape - Initialization Theory - Normalization - Residual Connections - Implicit Regularization of SGD - Second-order Optimization for Neural Nets - Overparameterization - Weight Decay - Double Descent - Grokking - Edge of Stability - Sharpness-Aware Minimization (SAM) - Phase Transitions in Neural Net Training - Lottery Ticket Hypothesis - Pruning - Loss of Plasticity - Catastrophic Forgetting - Mode Connectivity - Model Merging - Alignment.
This course will be offered in English. However, students may submit graded written work in either English or French.
Quebec university students from outside Polytechnique Montreal can register for the course via Inter-University Transfer Authorization.
General Information
When?
Mondays 9:30 am to 12:30 pm (starting from 13 Jan 2025).
Where?
Auditorium-2 at Mila, the Quebec AI Institute.
6650 Rue Saint-Urbain, Montréal, QC H2S 3G9.
Auditorium-2 is on the second floor of 6650.
About Labs
The lab slot for this course is every Thursday from 1:45 pm to 4:45 pm. It will mainly be used for office hours; you can use the rest of the lab time to work on the course project on your own. Any in-person lab activities will happen in B-411.
People
Instructor
- Sarath Chandar
TAs
- Pranshu Malviya
- Darshan Patil
Prerequisites
This course is not an introduction to deep learning; it can be considered a second course in deep learning. Before taking this course, you should have taken a machine learning course (like INF8245AE at Poly) and a deep learning course (like IFT6135 at UdeM).
The course is intended for hard-working, technically skilled, highly motivated students. Participants will be expected to display initiative, creativity, scientific rigour, critical thinking, and good communication skills.
If you do not have the necessary prerequisites, expect to spend considerably more time on this course than a 4-credit course normally requires.
Evaluation Criteria
The class grade will be based on the following components:
- 1 Programming Assignment (individual) - 15%
- Paper Reviews - 12%
- Paper Presentation in Class - 25%
- Course Project (team) - 40%
- Class Participation - 8%
More details and timelines for evaluations can be found here.
Schedule
You can find the schedule for the class and lecture materials here.