Project Instructions

  • The course project counts for 20% of your overall grade.
  • The project should be done in teams of three. All team members will receive the same grade, so it is important that every member contributes to an integral part of the project.
  • This year, you can participate in one of two tracks: the Notebook Track or the Applications Track.
  • You can talk to me during office hours to get quick feedback on your project idea.
  • No extensions or late days apply to the project submission deadlines.

Notebook Track

  • In the notebook track, your goal will be to create a Jupyter/Colab notebook on a topic in RL.
  • You can pick your own topic or look at the list of suggestions below for inspiration.
  • Your notebooks will be evaluated based on the following criteria: creativity in explaining complex algorithms, the pedagogical nature of the notebook, novel insights based on the experiments, and novel visualizations of complex concepts through toy experiments.
  • We are looking for tutorials that are clearly different from the tutorials already available online.
  • You can take a look at the following two blog posts as reference for what we expect as an outcome of the project: post-1, post-2
  • Include plenty of comments and descriptions in the notebook so that someone can use it to learn the concept on their own. When evaluating your notebook, I will grade based on how easily I can understand what you are doing.

Applications Track

  • For the applications track, I want you to work on a real application for RL that is not a game.
  • For the final submission, you are expected to submit the code for the RL setup for the application, a notebook to explore the project, and a 5-page report.
  • The report should explain in detail the application, how you frame it as a reinforcement learning problem, your baselines, and the benchmark performance of the baselines on this application.
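As a rough illustration of what framing an application as a reinforcement learning problem can look like, here is a minimal sketch. Everything in it is hypothetical and not a required template: the inventory restocking scenario, the names InventoryEnv, run_episode and order_up_to, and all cost and demand parameters are assumptions made for the example. The reset/step interface is a simplified version of a convention common in RL code, not the API of any particular library.

```python
import random
from dataclasses import dataclass


@dataclass
class InventoryEnv:
    """Hypothetical toy application: daily restocking of a single product."""
    capacity: int = 20         # maximum units that fit on the shelf
    max_demand: int = 5        # daily demand is uniform on 0..max_demand
    price: float = 2.0         # revenue per unit sold
    unit_cost: float = 1.0     # cost per unit ordered
    holding_cost: float = 0.1  # cost per unit left in stock overnight
    horizon: int = 30          # episode length in days
    seed: int = 0

    def reset(self):
        self.rng = random.Random(self.seed)
        self.stock = 0
        self.day = 0
        return self.stock  # state: units currently in stock

    def step(self, order):
        # Action: number of units to order, clipped to the free shelf space.
        order = max(0, min(order, self.capacity - self.stock))
        self.stock += order
        demand = self.rng.randint(0, self.max_demand)
        sold = min(demand, self.stock)
        self.stock -= sold
        # Reward: revenue minus ordering cost minus holding cost.
        reward = (self.price * sold
                  - self.unit_cost * order
                  - self.holding_cost * self.stock)
        self.day += 1
        done = self.day >= self.horizon
        return self.stock, reward, done


def run_episode(env, policy):
    """Run one episode and return the total (undiscounted) reward."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total


def order_up_to(level):
    """Simple hand-written baseline: refill the stock up to a fixed level."""
    return lambda stock: max(0, level - stock)


if __name__ == "__main__":
    env = InventoryEnv()
    print("never order:  ", run_episode(env, lambda stock: 0))
    print("order up to 5:", run_episode(env, order_up_to(5)))
```

In this sketch the state, action, and reward are each a single number, which keeps the framing easy to describe in a report. Hand-written policies such as order_up_to are one possible kind of baseline to benchmark a learned agent against.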

Project Schedule

  • The project schedule is given below:

    Milestone            Info                                            Deadline
    Register your team   Google Form                                     Nov 15
    Project Proposal     1 page (excluding references) - In GradeScope   Nov 18, 10 pm
    Final Submission                                                     Dec 20

Project Topics for Notebook Track

  1. Different techniques for sampling from the replay buffer.
  2. Maximization bias and double Q-learning.
  3. Extending algorithms from the single-agent setting to the multi-agent setting.
  4. Importance of the different components of a Rainbow agent.
  5. Visualization of loss surface for RL algorithms.
  6. Effect of different optimizers in RL.
  7. World models for model-based RL.
  8. Hierarchical RL.
  9. Visualization of eligibility traces.
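
To give a sense of the scale of a "toy experiment", here is a minimal sketch related to topic 2. It is not a reference solution. The function name estimate_max_bias and all parameter values are assumptions chosen for illustration. Every action has a true value of 0, so the true maximum is 0. The sketch compares a single estimator (take the max of noisy sample means) with a double estimator (select the action with one set of samples, evaluate it with an independent set), which is the idea behind double Q-learning.

```python
import random


def estimate_max_bias(num_actions=10, samples_per_action=5, trials=2000, seed=0):
    """Return the average single-estimator and double-estimator values.

    All rewards are drawn from N(0, 1), so every action's true value is 0
    and an unbiased estimate of the maximum value would average to 0.
    """
    rng = random.Random(seed)

    def sample_means():
        # One noisy value estimate per action, from a fresh batch of rewards.
        return [
            sum(rng.gauss(0.0, 1.0) for _ in range(samples_per_action))
            / samples_per_action
            for _ in range(num_actions)
        ]

    single_total, double_total = 0.0, 0.0
    for _ in range(trials):
        q_a = sample_means()  # estimates used to select the action
        q_b = sample_means()  # independent estimates used to evaluate it
        best = max(range(num_actions), key=q_a.__getitem__)
        single_total += q_a[best]  # select and evaluate with the same samples
        double_total += q_b[best]  # select with q_a, evaluate with q_b
    return single_total / trials, double_total / trials


if __name__ == "__main__":
    single, double = estimate_max_bias()
    print(f"single estimator: {single:.3f}")
    print(f"double estimator: {double:.3f}")
```

Under these assumptions the single estimator comes out clearly positive, while the double estimator stays close to 0. A notebook could build on this by plotting how the bias changes with the number of actions or samples.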