Tentative Class Schedule

All the lecture scribbles can be found here.

Lec | Date | Topic | Mandatory Readings | Optional Readings
1 | 28 Aug 2023 | Introduction, Sequential Decision Problems | Slides, SB Chapter 1, Faulty Reward Functions | Computing Machinery and Intelligence by Alan Turing, Probability Review, Linear Algebra Review
2 | 11 Sep 2023 | Supervised Learning, Online Learning, Function Approximation | Relevant parts of chapters 1, 2, 7, 11, 12 from my ML lecture notes. |
3 | 18 Sep 2023 | Immediate Reinforcement Learning: Multi-armed Bandits and Contextual Bandits | SB Chapter 2 | Algorithms for MAB problems and clinical trial application, A survey on practical applications of bandits
4 | 25 Sep 2023 | Gradient Bandits, Contextual Bandits, Imitation Learning, Markov Decision Process | SB Chapter 2, 3. DAGGER |
5 | 02 Oct 2023 | Markov Decision Process and Dynamic Programming | SB Chapter 3 and 4. | Reward is Enough, Reward is not Enough
6 | 16 Oct 2023 | Monte-Carlo Methods, Monte-Carlo Prediction, Monte-Carlo Control, Off-policy Learning, Temporal Difference Learning | SB Chapter 5 and 6. |
7 | 23 Oct 2023 | Temporal Difference Learning, Sarsa, Q-Learning, n-step methods, Divergence of off-policy TD with function approximation | SB Chapter 6, 7, and 11. |
 | 27 Oct 2023 | Mid-Term Exam (12:45 pm to 2:15 pm) | Sample Questions |
8 | 30 Oct 2023 | Deep Q-Learning, Rainbow, Average Reward RL | DQN, DQN Nature, Double DQN, Rainbow, SB Chapter 10. |
9 | 06 Nov 2023 | Policy Gradient Methods - I (Reinforce, Actor-Critic, A3C, A2C) | SB Chapter 13, A3C paper | REINFORCE paper
10 | 13 Nov 2023 | Policy Gradient Methods - II (Deterministic Policy Gradient, DDPG, TD3, Natural Policy Gradient, Trust Region Methods) | DPG, DDPG, TD3, NPG Explained, TRPO, PPO |
11 | 20 Nov 2023 | Model-Based RL: Background Planning, Decision-Time Planning, Dyna-Q, MPC, CEM, MCTS, AlphaGo Family, Dreamer Family | SB Chapter 8, AlphaGo, Dreamer | Dreamer v2, Dreamer v3, MuZero
12 | 27 Nov 2023 | Multi-agent RL: Independent Learning, Self-Play, CTDE. Partial Observability: POMDP, Belief-state MDPs, RNNs for POMDPs | MARL, POMDP |
13 | 04 Dec 2023 | Hierarchical RL: Options, SMDP Q-Learning, Intra-Option Q-Learning. Frontiers in RL | HRL, Continual RL |
 | 19 Dec 2023 | Final Exam (1:30 pm to 4:00 pm) | Mid-term Paper |

Tutorials

Lec | Date | Time | Topic | Lecture Videos | Lecture Materials
1 | 15 Sep 2023 | 1 pm - 2:30 pm | PyTorch | Recording | Notebook