Preprints

  • Hierarchical Planning with Latent World Models
    Wancong Zhang, Basile Terver, , Soham Chitnis, Harsh Sutaria, Mido Assran, Randall Balestriero, Amir Bar, Adrien Bardes, Yann LeCun, and Nicolas Ballas
    In ArXiv, 2026.
    #DL, #RL
    [arXiv], [website], [code]

  • Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling
    , M. Jehanzeb Mirza, Wei Lin, Shiqi Yang, and
    In ArXiv, 2025.
    #DL
    [arXiv]

  • Neural Coherence: Find higher performance to out-of-distribution tasks from few samples
    , Mats Richter, , and Christopher Pal
    In ArXiv, 2025.
    #DL
    [arXiv]

  • Optimizers Qualitatively Alter Solutions And We Should Leverage This
    Razvan Pascanu, Clare Lyle, Ionut-Vlad Modoranu, Naima Elosegui Borras, Dan Alistarh, Petar Veličković, , Soham De, and James Martens
    In ArXiv, 2025.
    #DL
    [arXiv]

  • V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
    Mido Assran*, Adrien Bardes*, David Fan*, Quentin Garrido*, Russell Howes*, Mojtaba Komeili*, Matthew Muckley*, Ammar Rizvi*, Claire Roberts*, Koustuv Sinha*, , Sergio Arnaud*, Abha Gejji*, Ada Martin*, Francois Robert Hogan*, Daniel Dugas*, Piotr Bojanowski, Vasil Khalidov, Patrick Labatut, Francisco Massa, Marc Szafraniec, Kapil Krishnakumar, Yong Li, Xiaodong Ma, , Franziska Meier*, Yann LeCun*, Michael Rabbat*, and Nicolas Ballas*
    Technical Report, 2025.
    #DL
    [website], [arXiv], [code], [huggingface], [blogpost]

  • Torque-Aware Momentum
    , , Aristide Baratin, Reza Babanezhad Harikandeh, Gintare Karolina Dziugaite, Razvan Pascanu, and
    In ArXiv, 2024.
    #DL
    [arXiv]

  • Interpretability Needs a New Paradigm
    , Himabindu Lakkaraju, Siva Reddy, and
    In ArXiv, 2024.
    #NLP, #DL
    [arXiv]

  • Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
    Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, , , Sonia Joseph, Blake Richards, Irina Rish, and Özgür Şimşek
    Workshop on Mechanistic Interpretability @ ICML, 2024.
    #DL
    [arXiv]

  • Segmentation of Multiple Sclerosis Lesions across Hospitals: Learn Continually or Train from Scratch?
    , Anne Kerbrat, Pierre Labauge, Tobias Granberg, Jason Talbott, Daniel S. Reich, Massimo Filippi, Rohit Bakshi, Virginie Callot, , and Julien Cohen-Adad
    In ArXiv, 2022.
    [Medical Imaging meets NeurIPS, 2022]
    #DL, #Other
    [arXiv], [code]

  • Feature diversity in self-supervised learning
    and
    Conference on Lifelong Learning Agents (CoLLAs) Workshop Track, 2022.
    #DL
    [arXiv]

  • An Introduction to Lifelong Supervised Learning
    Shagun Sodhani, , Sanket Vaibhav Mehta, , , , and
    In ArXiv, 2022.
    #DL
    [arXiv]

Conference and Journal Papers

2026

  1. Squeezing More from the Stream: Learning Representation Online for Streaming Reinforcement Learning
    , , , François Rivest, and
    International Conference on Machine Learning (ICML), 2026.
    #RL, #DL
    [arXiv], [code]

  2. Position: Modular Memory is the Key to Continual Learning Agents
    Vaggelis Dorovatas, Malte Schwerin, Andrew D. Bagdanov, Lucas Caccia, Antonio Carta, Laurent Charlin, Barbara Hammer, Tyler L. Hayes, Timm Hess, Christopher Kanan, Dhireesha Kudithipudi, Xialei Liu, Vincenzo Lomonaco, Jorge Mendez-Mendez, , Ameya Prabhu, Elisa Ricci, Tinne Tuytelaars, Gido M van de Ven, Liyuan Wang, Joost van de Weijer, Jonghyun Choi, Martin Mundt, and Rahaf Aljundi
    International Conference on Machine Learning (ICML), 2026.
    #DL
    [arXiv]

  3. TAPNext++: What's Next for Tracking Any Point (TAP)?
    Sebastian Jung*, , Martin Sundermeyer, Carl Doersch, Ross Goroshin, David Joseph Tan, , Rudolph Triebel, and Federico Tombari
    Findings of the IEEE CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
    #DL
    [arXiv], [website], [code]

  4. The Expressive Limits of Diagonal SSMs for State-Tracking
    Mehran Shakerinava, , Siamak Ravanbakhsh, and
    International Conference on Learning Representations (ICLR), 2026.
    #DL
    [openreview], [arXiv]

  5. Investigating the Multilingual Calibration Effects of Language Model Instruction-Tuning
    , Peng Lu, Qiuhao Zeng, Yusuke Iwasawa, Yutaka Matsuo, , Edison Marrese-Taylor, and Irene Li
    Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2026.
    #NLP, #DL
    [arXiv], [code]

  6. Monitoring morphometric drift in lifelong learning segmentation of the spinal cord
    Enamundram Naga Karthik, Sandrine Bédard, Jan Valošek, Christoph S. Aigner, Elise Bannier, Josef Bednařík, Virginie Callot, Anna Combes, Armin Curt, Gergely David, Falk Eippert, Lynn Farner, Michael G Fehlings, Patrick Freund, Tobias Granberg, Cristina Granziera, RHSCIR Network Imaging Group, Ulrike Horn, Tomáš Horák, Suzanne Humphreys, Markus Hupp, Anne Kerbrat, Nawal Kinany, Shannon Kolind, Petr Kudlička, Anna Lebret, Lisa Eunyoung Lee, Caterina Mainero, Allan R. Martin, Megan McGrath, Govind Nair, Kristin P. O'Grady, Jiwon Oh, Russell Ouellette, Nikolai Pfender, Dario Pfyffer, Pierre-François Pradat, Alexandre Prat, Emanuele Pravatà, Daniel S. Reich, Ilaria Ricchi, Naama Rotem-Kohavi, Simon Schading-Sassenhausen, Maryam Seif, Andrew Smith, Seth A Smith, Grace Sweeney, Roger Tam, Anthony Traboulsee, Constantina Andrada Treaba, Charidimos Tsagkas, Zachary Vavasour, Dimitri Van De Ville, Kenneth Arnold Weber II, , and Julien Cohen-Adad
    Imaging Neuroscience, 2026.
    #DL, #Other
    [mit], [arXiv]

2025

  1. TRecViT: A Recurrent Video Transformer
    Viorica Pătrăucean, Xu Owen He, Joseph Heyward, Chuhan Zhang, Mehdi S. M. Sajjadi, George-Cristian Muraru, , Mahdi Karami, Ross Goroshin, Yutian Chen, Simon Osindero, João Carreira, and Razvan Pascanu
    Transactions on Machine Learning Research (TMLR), 2025.
    #DL
    [openreview], [arXiv], [code]

  2. TAPNext: Tracking Any Point (TAP) as Next Token Prediction
    , Carl Doersch, Yi Yang, Skanda Koppula, Viorica Pătrăucean, Xu Owen He, Ignacio Rocco, Mehdi S. M. Sajjadi, , and Ross Goroshin
    International Conference on Computer Vision (ICCV), 2025.
    #DL, #Other
    [website], [arXiv], [code], [huggingface], [YouTube]

  3. Steering Large Language Model Activations in Sparse Spaces
    Reza Bayat*, , Mohammad Pezeshki, , and Pascal Vincent
    Conference on Language Modeling (COLM), 2025.
    #NLP, #DL
    [openreview], [arXiv]

  4. Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
    , , Aristide Baratin, , and
    Conference on Lifelong Learning Agents (CoLLAs), 2025.
    #DL
    [arXiv]

  5. Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
    , Gopeshh Subbaraj, Matthew Riemer, Nizar Islah, Tsuguchika Tabaru, Hiroaki Kingetsu, , and Irina Rish
    Conference on Lifelong Learning Agents (CoLLAs), 2025.
    #NLP, #DL
    [arXiv], [code]

  6. Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
    , Anian Ruoss, Joel Veness, and Tim Genewein
    International Conference on Machine Learning (ICML), 2025.
    #DL
    [pmlr], [openreview], [arXiv]

  7. BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning
    , Maksim Kuznetsov, Roman Schutski, Rim Shayakhmetov, Daniil Polykovskiy, , and Alex Zhavoronkov
    AAAI Conference on Artificial Intelligence (AAAI), 2025. [Best poster award]
    #DL, #RL
    [website], [arXiv], [code], [YouTube]

2024

  1. Exploring Quantization for Efficient Pre-Training of Transformer Language Models
    , , , and
    Findings of the Association for Computational Linguistics (EMNLP), 2024.
    #NLP, #DL
    [acl], [arXiv]

  2. Sharpness-Aware Minimization Scaled by Outlier Normalization for Robust DNNs on In-Memory Computing Accelerators
    Sébastien Henwood, , Yvon Savaria, , and François Leduc-Primeau
    Asilomar Conference on Signals, Systems, and Computers, 2024.
    [Conference on Lifelong Learning Agents (CoLLAs) Workshop Track, 2022]
    [Edge Intelligence Workshop (EIW), 2022]
    #DL
    [paper], [arXiv]

  3. Lookbehind-SAM: k steps back, 1 step forward
    , , Aristide Baratin, and
    International Conference on Machine Learning (ICML), 2024.
    #DL
    [pmlr], [arXiv], [code], [YouTube]

  4. Contrast-agnostic Spinal Cord Segmentation: A Comparative Study of ConvNets and Vision Transformers
    , Sandrine Bédard, Jan Valošek, , and Julien Cohen-Adad
    Medical Imaging with Deep Learning (MIDL), 2024.
    #DL, #Other
    [openreview]

  5. Promoting Exploration in Memory-Augmented Adam using Critical Momenta
    , , Aristide Baratin, Reza Babanezhad Harikandeh, , Simon Lacoste-Julien, Razvan Pascanu, and
    Transactions on Machine Learning Research (TMLR), 2024.
    #DL
    [openreview], [arXiv]

  6. A Responsible Framework for Applying Artificial Intelligence on Medical Images and Signals at the Point-of-care: the PACS-AI Platform
    Pascal Theriault-Lauzier, Denis Cobin, Olivier Tastet, Elodie Labrecque Langlais, Bahareh Taji, Guson Kang, Aun-Yeong Chong, Derek So, An Tang, Judy Wawira Gichoya, , Pierre-Luc Déziel, Julie G Hussin, Samuel Kadoury, and Robert Avram
    Canadian Journal of Cardiology, 2024.
    #DL, #Other
    [paper]

  7. Mastering Memory Tasks with World Models
    , , , and
    International Conference on Learning Representations (ICLR), 2024. [Oral presentation]
    #RL, #DL
    [openreview], [arXiv], [code]

  8. On the Costs and Benefits of Adopting Lifelong Learning for Software Analytics - Empirical Study on Brown Build and Risk Prediction
    Doriane Olewicki, Sarra Habchi, Mathieu Nayrolles, , , and Bram Adams
    International Conference on Software Engineering (ICSE) - Software Engineering in Practice Track, 2024. [ICSE24 SEIP Distinguished Paper Award]
    #DL
    [arXiv]

  9. Fast and Accurate Output Error Estimation for Memristor-Based Deep Neural Networks
    Jonathan Kern, Sébastien Henwood, , Elsa Dupraz, Abdeldjalil Aïssa-El-Bey, Yvon Savaria, and François Leduc-Primeau
    IEEE Transactions on Signal Processing, 2024.
    #DL
    [paper]

2023

  1. Training DNNs Resilient to Adversarial and Random Bit-Flips by Learning Quantization Ranges
    , , Jean Pierre David, and François Leduc-Primeau
    Transactions on Machine Learning Research (TMLR), 2023.
    #DL
    [openreview], [code]

  2. An Empirical Investigation of the Role of Pre-training in Lifelong Learning
    Sanket Vaibhav Mehta, , , and Emma Strubell
    Journal of Machine Learning Research, 2023.
    #DL
    [jmlr], [arXiv]

  3. DEUP: Direct Epistemic Uncertainty Prediction
    Moksh Jain, Salem Lahlou, , Victor Butoi, Paul Bertin, , Maksym Korablyov, and Yoshua Bengio
    Transactions on Machine Learning Research (TMLR), 2023.
    #DL
    [openreview], [arXiv], [code]

  4. Label fusion and training methods for reliable representation of inter-rater uncertainty
    Andreanne Lemay, Charley Gros, , and Julien Cohen-Adad
    The Journal of Machine Learning for Biomedical Imaging (MELBA), 2023.
    #DL, #Other
    [paper]

2022

  1. TAG: Task-based Accumulated Gradients for Lifelong Learning
    , Balaraman Ravindran, and
    Conference on Lifelong Learning Agents (CoLLAs), 2022.
    [Theory and Foundation of Continual Learning @ ICML, 2021]
    #DL
    [pmlr], [arXiv], [code]

  2. Improving Meta-Learning Generalization with Activation-Based Early-Stopping
    , Christopher Pal, , and
    Conference on Lifelong Learning Agents (CoLLAs), 2022.
    #DL
    [pmlr], [arXiv], [code], [YouTube]

  3. Biological Sequence Design with GFlowNets
    Moksh Jain, Emmanuel Bengio, Alex Hernandez-Garcia, , Bonaventure F. P. Dossou, Chanakya Ekbote, Jie Fu, Tianyu Zhang, Michael Kilgour, Dinghuai Zhang, Lena Simine, Payel Das, and Yoshua Bengio
    International Conference on Machine Learning (ICML), 2022.
    #DL
    [pmlr], [arXiv], [code]

  4. Memory Augmented Optimizers for Deep Learning
    , Prasanna Parthasarathi, Mido Assran, and
    International Conference on Learning Representations (ICLR), 2022.
    #DL
    [openreview], [arXiv], [code]

  5. PatchUp: A Feature-Space Block-Level Regularization Technique for Convolutional Neural Networks
    , Mohammad Amini, , Vikas Verma, and
    AAAI Conference on Artificial Intelligence (AAAI), 2022.
    #DL
    [aaai], [arXiv], [code]

2021

  1. IIRC: Incremental Implicitly-Refined Classification
    , , Shagun Sodhani, and
    IEEE CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
    #DL
    [website], [paper], [arXiv], [code], [PyPI], [docs]

2020

  1. Fully Quantized Transformer for Machine Translation
    , Ella Charlaix, and Mehdi Rezagholizadeh
    Findings of the Association for Computational Linguistics (EMNLP), 2020.
    #NLP, #DL
    [acl], [arXiv]