Publications by Andreas Madsen

Alumni member

Activity

New Faithfulness-Centric Interpretability Paradigms for Natural Language Processing
by Andreas Madsen, with Siva Reddy and Sarath Chandar as supervisors.
Polytechnique Montreal ⸺ December 2024.
[thesis]

Interpretability Needs a New Paradigm
Andreas Madsen, Himabindu Lakkaraju, Siva Reddy, and Sarath Chandar
arXiv, 2024.
#NLP, #DL
[arXiv]

Are self-explanations from Large Language Models faithful?
Andreas Madsen, Sarath Chandar, and Siva Reddy
Findings of the Association for Computational Linguistics (ACL), 2024.
#NLP
[acl], [arXiv], [code], [YouTube]

Faithfulness Measurable Masked Language Models
Andreas Madsen, Siva Reddy, and Sarath Chandar
International Conference on Machine Learning (ICML), 2024. [Spotlight award - top 3.5%]
#NLP
[pmlr], [arXiv], [code], [YouTube], [blogpost]

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, and Siva Reddy
Findings of the Association for Computational Linguistics (EMNLP), 2022.
[BlackboxNLP, 2022]
#NLP
[acl], [arXiv], [code]

Post-hoc Interpretability for Neural NLP: A Survey
Andreas Madsen, Siva Reddy, and Sarath Chandar
ACM Computing Surveys, 2022.
#NLP
[acm], [arXiv]