Project Instructions

  • The course project will count for 25% of your overall grade.
  • The project should be done in teams of three people. All team members will receive the same grade, so it is important that every member takes part in executing an integral part of the project.

  • For this year, the goal of the course project is to create a Jupyter/Colab notebook on a topic in RL.
  • You can pick your own topic or look at the list of suggestions below for inspiration.
  • Your notebooks will be evaluated based on the following criteria: creativity in explaining complex algorithms, the pedagogical nature of the notebook, novel insights based on the experiments, and novel visualizations of complex concepts through toy experiments.
  • We are looking for tutorials which are very different from the existing tutorials online.
  • You can take a look at the following two blog posts as reference for what we expect as an outcome of the project: post-1, post-2
  • Include plenty of comments and descriptions in the notebook so that someone can use it to learn the concept on their own. When evaluating your notebook, I will grade it based on how easily I can understand what you are doing.
  • You can talk to me during office hours to get quick feedback on your project idea.

  • The project schedule is given below:

    Milestone            Info                                            Deadline
    Register your team   Google Form                                     Nov 10
    Project Proposal     1 page (excluding references) - In GradeScope   Nov 10
    Final Colab          Submit your final Colab here                    Dec 20

Project Topics

  1. Different techniques for sampling from the replay buffer.
  2. Maximization bias and double Q-learning.
  3. Extending algorithms from single-agent setting to multi-agent setting.
  4. Importance of different components of a Rainbow Agent.
  5. Visualization of loss surface for RL algorithms.
  6. Effect of different optimizers in RL.
  7. World models for model-based RL.
  8. Hierarchical RL.
  9. Visualization of Eligibility Traces.
  10. New environments and applications for RL.
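
As an illustration of the scale of toy experiment a notebook might start from, here is a minimal sketch for topic 2 (maximization bias). It is not a required template. The function name, the parameters, and the setup (ten actions whose true mean reward is zero, with Gaussian noise) are assumptions chosen for the example. It compares a single estimator, which selects and evaluates the best action with the same noisy estimates, against a double estimator, which selects with one set of estimates and evaluates with an independent set.

```python
import random


def compare_estimators(num_actions=10, samples_per_action=5,
                       num_trials=2000, seed=0):
    """Return the average single and double estimates of max_a E[reward(a)].

    Hypothetical toy setup: every action has true mean reward 0,
    so the true maximum is 0 and any nonzero average is bias.
    """
    rng = random.Random(seed)

    def sample_means():
        # One noisy sample-mean estimate per action.
        return [
            sum(rng.gauss(0.0, 1.0) for _ in range(samples_per_action))
            / samples_per_action
            for _ in range(num_actions)
        ]

    single_total = 0.0
    double_total = 0.0
    for _ in range(num_trials):
        est_a = sample_means()
        est_b = sample_means()  # independent of est_a

        # Single estimator: select and evaluate with the same estimates.
        single_total += max(est_a)

        # Double estimator: select with A, evaluate with B.
        best_action = max(range(num_actions), key=lambda a: est_a[a])
        double_total += est_b[best_action]

    return single_total / num_trials, double_total / num_trials


if __name__ == "__main__":
    single, double = compare_estimators()
    print(f"single estimator average: {single:.3f}")
    print(f"double estimator average: {double:.3f}")
```

Under these assumptions the single estimator's average should come out clearly above zero, while the double estimator's average should stay close to zero. A notebook could build on this by varying the number of actions or the noise level and plotting how the gap changes.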