Schedule

Tentative Class Schedule

All the lecture scribbles can be found here.

Lec	Date	Topic	Mandatory Readings	Optional Readings
1	28 Aug 2023	Introduction, Sequential Decision Problems	Slides, SB Chapter 1, Faulty Reward Functions	Computing Machinery and Intelligence by Alan Turing, Probability Review, Linear Algebra Review
2	11 Sep 2023	Supervised Learning, Online Learning, Function Approximation	Relevant parts of chapters 1, 2, 7, 11, 12 from my ML lecture notes.
3	18 Sep 2023	Immediate Reinforcement Learning: Multi-armed Bandits and Contextual Bandits	SB Chapter 2	Algorithms for MAB problems and clinical trial application, A survey on practical applications of bandits
4	25 Sep 2023	Gradient Bandits, Contextual Bandits, Imitation Learning, Markov Decision Process	SB Chapters 2 and 3, DAgger
5	02 Oct 2023	Markov Decision Process and Dynamic Programming	SB Chapters 3 and 4	Reward is Enough, Reward is not Enough
6	16 Oct 2023	Monte-Carlo Methods, Monte-Carlo Prediction, Monte-Carlo Control, Off-policy Learning, Temporal Difference Learning	SB Chapters 5 and 6
7	23 Oct 2023	Temporal Difference Learning, Sarsa, Q-Learning, n-step methods, Divergence of off-policy TD with function approximation	SB Chapters 6, 7, and 11
	27 Oct 2023	Mid-Term Exam (12:45 pm to 2:15 pm)	Sample Questions
8	30 Oct 2023	Deep Q-Learning, Rainbow, Average Reward RL	DQN, DQN Nature, Double DQN, Rainbow, SB Chapter 10
9	06 Nov 2023	Policy Gradient Methods - I (REINFORCE, Actor-Critic, A3C, A2C)	SB Chapter 13, A3C paper	REINFORCE paper
10	13 Nov 2023	Policy Gradient Methods - II (Deterministic Policy Gradient, DDPG, TD3, Natural Policy Gradient, Trust Region Methods)	DPG, DDPG, TD3, NPG Explained, TRPO, PPO
11	20 Nov 2023	Model-Based RL: Background Planning, Decision-Time Planning, Dyna-Q, MPC, CEM, MCTS, AlphaGo Family, Dreamer Family	SB Chapter 8, AlphaGo, Dreamer	Dreamer v2, Dreamer v3, MuZero
12	27 Nov 2023	Multi-agent RL: Independent Learning, Self-Play, CTDE. Partial Observability: POMDP, Belief-state MDPs, RNNs for POMDPs	MARL, POMDP
13	04 Dec 2023	Hierarchical RL: Options, SMDP Q-Learning, Intra-Option Q-Learning. Frontiers in RL	HRL, Continual RL
	19 Dec 2023	Final Exam (1:30 pm to 4:00 pm)	Mid-term Paper

Tutorials

Lec	Date	Time	Topic	Lecture Videos	Lecture Materials
1	15 Sep 2023	1 pm - 2:30 pm	PyTorch	Recording	Notebook