Chandar Research Lab (CRL) Annual Symposium 2026

We welcome you to the sixth annual CRL symposium!

The CRL symposium is an annual event that showcases highlights of the research carried out in the Chandar Lab over the past year. This year's keynote talk will be given by Prof. Mengye Ren (New York University).

Date: July 23-24, 2026

Time: 9 am to 5 pm EDT

Mode: Hybrid (both remote and in-person)

Address: 6650 Saint-Urbain, Montréal, QC H2S 3H1

Room: Mila Agora

How to register? Please register on Eventbrite (it takes one minute) if you plan to attend, whether in person or remotely.

Contact: ekaterina.lobacheva@mila.quebec

Day 1 (July 23)

Time	Speaker	Topic	Abstract
9:00am - 9:30am	Sarath Chandar	Opening remarks	A welcome message with an overview of various research activities at CRL.
9:30am - 10:00am	Darshan Patil	Loss Smoothing for Stable Adaptation Under Distribution Shift	Neural networks are often adapted under distribution shift. Standard adaptation methods typically optimize the target objective directly, inducing an abrupt change from the source training objective. We propose loss smoothing, a simple approach that interpolates between the source and target training objectives at the start of adaptation. Across adaptation regimes such as (offline and online) RL and LLM finetuning, we find that loss smoothing consistently improves performance, suggesting that smoother objective transitions are a broadly useful tool for model adaptation.
10:00am - 10:30am	Artem Zholus	TAPNext++: What’s Next for Tracking Any Point (TAP)?	Tracking-Any-Point (TAP) models aim to track any point through a video — a crucial task in AR/XR and robotics applications. The recently introduced TAPNext approach proposes an end-to-end, recurrent transformer architecture to track points frame-by-frame in a purely online fashion, demonstrating competitive performance at minimal latency. However, we show that TAPNext struggles with longer video sequences and also frequently fails to re-detect query points that reappear after being occluded or leaving the frame. In this work, we present TAPNext++, a model that tracks points in sequences that are orders of magnitude longer while preserving the low memory and compute footprint of the architecture.
10:30am - 11:00am		Coffee break
11:00am - 11:30am	Behnoush Khavari	The Expressive Limits of Diagonal SSMs for State-Tracking	We study the expressivity of input-Dependent Complex-valued Diagonal (DCD) SSMs, such as Mamba, Mamba-2 and Mamba-3, on sequential state-tracking tasks. We show that single-layer DCD SSMs cannot express state-tracking of any non-Abelian group at finite precision, and k-layer DCD SSMs can express state-tracking of a group iff that group has a subnormal series of length k, with Abelian factors. Empirically, we find that multi-layer models often fail to learn state-tracking for non-Abelian groups, highlighting a gap between expressivity and learnability.
11:30am - 12:00pm	Istabrak Abbes	What Does Layer-Importance Reveal About Transformers and State-Space Models?	We study how far layer-importance analysis developed for transformers transfers to state-space models. We decompose layer importance into Necessity, how much the pretrained model depends on a layer as measured by the loss from bypassing it, and Plasticity, where fine-tuning concentrates task-specific weight updates. The two families diverge: in residual transformers, Necessity and Plasticity anti-align across depth, whereas in Mamba-style SSMs they overlap. The sign of this alignment predicts adaptation behavior—transformers update their most plastic layers, increasing catastrophic forgetting, while this tier-dependent effect disappears in SSMs.
12:00pm - 1:30pm		Lunch break
1:30pm - 2:30pm	Mengye Ren (New York University)	Keynote: The Always-Learning Machine	Today’s AI models acquire most of their knowledge through offline, i.i.d. learning. In-context learning offers some capacity for online adaptation, but a crucial question remains: can models keep learning at deployment, or even learn from scratch, through continuous streams of experience? In this talk, I will present several recent efforts toward building always-learning machines for perception and planning. Starting with experiential video streams, I show that event segmentation—clustering event concepts in lifelong video—enables effective visual representation learning and event recognition from scratch. In JEPA world models, always-learning can yield rapid test-time learning and generalization for planning. Finally, I will discuss my recent work on creative exploration, and on linking always learning and world modeling to the self.
2:30pm - 3:00pm	Maryam Hashemzadeh	Dialectics of Alignment: Harnessing Unsafe Knowledge for Dynamic Safety Routing	Current alignment paradigms, reliant on data erasure and blanket refusals, often sacrifice epistemological depth and utility. To overcome this limitation, we introduce SafeMoE. This Mixture-of-Experts framework treats “unsafe” data as a valuable knowledge source rather than noise. By isolating harmful corpora into domain-specific LoRA experts and using a router trained on safe-informative primitives, SafeMoE synthesizes deep domain insights while strictly enforcing safety constraints. Our results show a >20% relative improvement in safe response rates and superior informativeness, demonstrating that robust safety is best achieved through the controlled integration, rather than the erasure, of unsafe knowledge.
3:00pm - 3:30pm		Coffee break
3:30pm - 4:00pm	Davide Baldelli	What Hangman Reveals About Language Agents: Private State and Probabilistic Calibration	To host a game of Hangman, a language model must privately commit to a secret word and choose it with genuine randomness, two things current systems quietly fail at. In the first part of this talk, I formalize the former as Private State Interactive Tasks, prove that agents conditioned only on public history cannot guarantee both secrecy and consistency, and show that a private working memory restores both. In the second part, I turn to the question of randomness, showing that probabilistic calibration is a trainable capability whose gains transfer to open-ended stochastic generation.
4:00pm - 4:30pm	Diego Cerda Mardini	Consistent but Miscalibrated: Evaluating LLM Limitations for Risk Communication in Natural Language	Whether LLMs are reliable explainers of probabilistic information in natural language remains unclear. Reliability requires consistent descriptors for identical inputs, and descriptors that reflect underlying magnitudes. We evaluated nine LLMs on selecting verbal descriptors for simulated probabilistic predictions across six domains and multiple inference settings. Models were consistent but miscalibrated, performing worse for uncertainty than likelihood. Precomputed summary statistics did not improve calibration, localizing the performance bottleneck to the verbalization layer. To conclude, current LLMs are not yet reliable zero-shot explainers of probabilistic outputs.
4:30pm - 5:00pm	Jerry Huang	On the Uncertainty Calibration of Large Language Models	Trusting deep neural networks requires consistency in how models quantify their uncertainty. However, large language models complicate this through multiple stages of abstract training as well as the gaps between training objectives and downstream evaluation. One major stage where issues arise is instruction tuning. We first demonstrate how label smoothing can be used to ensure smaller calibration error but at the cost of limited trainability at later stages. To address this, we propose an instance-specific smoothing adjustment to the loss regularization, allowing for trainable models that remain well calibrated throughout.

Day 2 (July 24)

Time	Speaker	Topic	Abstract
9:00am - 9:30am	Alex Aselstyne	A systematic analysis of machine learning pipelines for robust antimicrobial resistance prediction	Antimicrobial resistance (AMR), the ability of bacteria to survive antibiotic exposure, poses an increasing public health risk. Predicting resistance from whole-genome sequencing using machine learning models has emerged as a promising direction, yet the influence of representation and model design on predictive performance remains understudied. Our systematic evaluation of the ML AMR prediction pipeline shows that tuned XGBoost models with k-mer representations are a robust and interpretable option, supporting the utility and biological validity of ML for AMR prediction.
9:30am - 10:00am	Lola Le Breton	pLM representations unlock metagenomic space beyond homology	TBD.
10:00am - 10:30am	Darshan Patil	CoPeP: Benchmarking Continual Pretraining for Protein Language Models	Protein language models (pLMs) are trained on large protein databases that are continuously updated by the biology community, motivating continual learning both to keep up with ever-growing data and to take advantage of the temporal meta-information created during this process. We introduce the Continual Pretraining of Protein Language Models (CoPeP) benchmark for evaluating continual learning approaches on pLMs at scale in an impactful real-world application.
10:30am - 11:00am		Coffee break
11:00am - 11:30am	Anabel Tan	Data-Driven Transformer Framework for Radio Astronomy Imaging	We present TransformerRIM, a transformer-based approach for radio interferometric imaging that integrates learnt image priors with the underlying measurement physics. Reconstruction updates are modelled using a Swin Transformer, enabling effective multi-scale feature extraction and long-range spatial reasoning. A custom CUDA-based differentiable measurement operator provides efficient forward and adjoint mappings between sky images and sparse visibility measurements, supporting scalable imaging for next-generation radio telescopes like the Square Kilometre Array (SKA).
11:30am - 12:00pm	Aidan Li	Sparse Koopman Autoencoders Identify Local Dynamical Regimes in Multibasin Systems	Koopman autoencoders (KAEs) forecast nonlinear dynamics by learning higher-dimensional latent representations with linear evolution. However, multibasin dynamical systems generally lack a single finite-dimensional global Koopman embedding. In this talk, we discuss how sparsity-inducing encoders can make latent supports serve as inspectable, label-free basin variables in Sparse Koopman Autoencoders (SKAEs), without basin labels or regime annotations being provided during training.
12:00pm - 1:30pm		Lunch break
1:30pm - 2:00pm	Ekaterina Lobacheva	Shared Gradients Capture the Development and Structure of Generalizing Mechanisms in LLMs	We propose analysing Shared Gradients, i.e. alignment in per-example gradients, as a measurable abstraction for understanding LLM learning. Intuitively, shared gradients capture shared solution learning, or how models generalize across examples by learning shared mechanisms. We show that shared gradient discovery co-occurs with finding a generalizable solution across a variety of tasks, and reveals a rich modular structure that reflects the organization of models’ shared mechanisms. Studying LLM circuit emergence through this lens sheds light on when and how practical models learn different shared mechanisms.
2:00pm - 2:30pm	Kamran Chitsaz	The Markovian Thinker	RL for reasoning LLMs has a trivial underlying RL environment (MDP) that treats the state as the whole prompt plus all past thinking tokens. That state keeps growing, which makes the compute cost quadratic. We propose the Markovian Thinking paradigm, where the state size remains bounded/fixed. By design, and regardless of the policy architecture, this makes the compute cost linear in the number of thinking tokens, while memory stays flat.
2:30pm - 3:00pm	Nilaksh	Squeezing More from the Stream: Learning Representation Online for Streaming Reinforcement Learning	In streaming Reinforcement Learning (RL), agents learn from data once and immediately discard it. This saves memory for on-device applications but makes learning highly inefficient, as it is difficult to extract deep patterns from fleeting data. To get the most out of every observation, we adapt Self-Predictive Representations (SPR) for streaming RL. Because streaming data arrives in a highly correlated sequence, simply adding SPR causes training instability. We solve this by adjusting the network’s learning updates to prevent conflicting signals. Tested across the Atari, MinAtar, and Octax benchmarks, our approach consistently outperforms existing streaming methods.
3:00pm - 3:30pm		Coffee break
3:30pm - 4:00pm	Saurav Jha	Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models	World model-based policy evaluation is a practical proxy for testing real-world robot control. As these models increasingly adopt latent diffusion modeling (LDM), choosing the right latent space becomes critical. While the status quo uses autoencoding latent spaces like VAEs that are primarily trained for pixel reconstruction, recent work suggests benefits from pretrained encoders with representation-aligned semantic latent spaces. We systematically evaluate these latent spaces for action-conditioned LDM by comparing six reconstruction and semantic encoders to train world model variants under a fixed protocol on the BridgeV2 dataset, and show effective world model training in high-dimensional representation spaces.
4:00pm - 4:30pm	Hadi Nekoei	Shielded Controller Units for RL with Operational Constraints Applied to Remote Microgrids	Remote microgrids require coordinating renewable generation, batteries, and fuel generators under uncertainty while respecting strict operational constraints. We introduce Shielded Controller Units (SCUs), an interpretable framework that uses system knowledge to enforce safety and regulatory constraints during RL control. SCUs decompose the environment into a hierarchy, with each unit responsible for a specific subset of constraints. On a real-world microgrid task, SCUs enable RL to reduce fuel consumption by 24% without increasing battery degradation, outperforming industry heuristics and constrained RL baselines while maintaining full constraint satisfaction.
4:30pm - 5:00pm	Kanishk Jain	Discovering Failure Modes in Vision-Language Models using RL	We propose a Reinforcement Learning (RL)-based framework to automatically discover the failure modes or blind spots of any VLM on a given data distribution without human intervention. Our framework trains a questioner agent that adaptively generates queries based on the candidate VLM’s responses to elicit incorrect answers. Our approach increases question complexity by focusing on fine-grained visual details and distinct skill compositions as training progresses, consequently identifying novel failure modes in which VLMs struggle.