Sarath’s note to prospective students:

My group works in several areas of Machine Learning (ML). The lab's publications on the publications page and the research interests of lab members on the people page give a fuller summary of the research problems the lab is currently interested in.

I will maintain a list of my current research interests here. When you email me, mention the research topics and sub-topics you are interested in.

  • Lifelong/Continual Learning - loss of plasticity, catastrophic forgetting, capacity expansion, forward transfer, dynamic neural architectures, modular neural networks, memory for lifelong learning, and knowledge acquisition.
  • Large Language Models (LLMs) - efficient training, RL for training LLMs, knowledge consolidation, continual pre-training, updatable LLMs, bias/fairness, interpretability.
  • Deep Learning (DL) - better Transformer architectures, state space models and other RNN models, hybrid architectures, long-term dependency modeling, and memory-augmented architectures.
  • Reinforcement Learning (RL) - lifelong RL, model-based RL, multi-agent RL, generalization in RL, hierarchical reinforcement learning, sample efficient RL, RL for drug discovery, language-based RL agents.
  • Optimization - understanding learning dynamics of over-parametrized models, better optimizers that can explore the loss landscape, accelerated training, learning to optimize, optimization for lifelong learning, and optimization for RL.
  • AI for Science - foundation models for biology (models for proteins, small molecules, DNA, RNA sequences), AI/ML/RL for scientific discovery (drug discovery, material discovery), foundation models for physics/chemistry.

For Fall 2025, I plan to recruit several MSc/PhD students.

For Fall 2025, I will also recruit students for co-supervision with Ross Goroshin, Caglar Gulcehre (through the ELLIS PhD Fellowship Program), and Amal Zouaq.

General Expectations:

  • I expect students to have a thorough understanding of the basics of ML and DL before applying. If you want to do research in RL, you are also expected to have a thorough understanding of RL. In the interviews, I will test your math and ML knowledge. Ideally, you should be familiar with the topics covered in:
  • You should also have strong Python programming and engineering skills. My team uses PyTorch and JAX for research, so you should be proficient in at least one of them. Some of these skill requirements can be waived if you are an exceptional candidate with a different academic background.

Women and underrepresented minorities are especially encouraged to apply.


Postdocs

  • I am currently looking for a postdoc in Large Language Models. Please contact me with your CV.

Ph.D./Masters Students

  • The next batch of Ph.D. and Masters admissions will be for Fall 2025.
  • If you are interested in working with me, you should submit your application to Mila before December 1, 2024. I will only look at applications that name me as a potential adviser.
  • If you are shortlisted for an interview, you will need to submit an application to Polytechnique Montréal (with my name). Please note that submitting this application does not guarantee admission. Admission is based only on your performance in the interview. Applying to Polytechnique Montréal before the interview helps avoid later admission or visa delays.

Current Students at Polytechnique Montréal and UdeM

  • If you are already a student (undergrad, Masters, Ph.D.) at Poly or UdeM and want me to be your adviser, email me with your detailed CV, up-to-date copies of all your transcripts, and a summary of your research interests.
  • If you want me to read your email, include “[RESEARCH_APPLICATION_INTERNAL]” in the subject line.

Interns and Visiting Students

  • Please do not email me directly about visiting positions and internships.
  • Interested students should complete the following form.
  • Successful applicants will be contacted directly by me or one of my students.