Andreas Madsen

    Co-supervisor: Siva Reddy
    Research Areas: Interpretability in NLP

Activity

  • PhD Student: Aug 2020 - now

Conference and Journal Papers

2024

  1. Are self-explanations from Large Language Models faithful?
    Andreas Madsen, Sarath Chandar, and Siva Reddy
    Findings of the Association for Computational Linguistics (ACL), 2024.
    #NLP
    [arXiv], [code]

  2. Faithfulness Measurable Masked Language Models
    Andreas Madsen, Siva Reddy, and Sarath Chandar
    International Conference on Machine Learning (ICML), 2024. [Spotlight award - top 3.5%]
    #NLP
    [arXiv], [code], [YouTube], [blogpost]

2022

  1. Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
    Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, and Siva Reddy
    Findings of the Association for Computational Linguistics (EMNLP), 2022.
    [BlackboxNLP Workshop, 2022]
    #NLP
    [arXiv], [code]

  2. Post-hoc Interpretability for Neural NLP: A Survey
    Andreas Madsen, Siva Reddy, and Sarath Chandar
    ACM Computing Surveys, 2022.
    #NLP
    [arXiv]