About
I'm a Constellation Astra fellow under the mentorship of Charlie Griffin from UK AISI. I'm also a PhD student at Durham University supervised by Noura Al Moubayed. Before that, I did MATS 6.0 with Neel Nanda and worked as a software engineer in industry for four years.
Publications
- Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models.
P. Leask, N. Nanda, N. Al Moubayed. ICML 2025.
- Sparse Autoencoders Do Not Find Canonical Units of Analysis.
P. Leask, B. Bussmann, M. Pearce, J. Bloom, C. Tigges, N. Al Moubayed, L. Sharkey, N. Nanda. ICLR 2025.
- BatchTopK Sparse Autoencoders.
B. Bussmann, P. Leask, N. Nanda. NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning, 2024.