Interpretability & Mechanistic Interpretability

People who work on interpretability for LLMs / VLMs! Message @alessiodevoto.bsky.social to be added :)

Created by

Alessio Devoto

@alessiodevoto.bsky.social


Neel Rajani

@neelrajani.bsky.social

PhD student in Responsible NLP at the University of Edinburgh, passionate about MechInterp

Simone Scardapane

@sscardapane.bsky.social

I fall in love with a new #machinelearning topic every month 🙄 Ass. Prof. Sapienza (Rome) | Author: Alice in a differentiable wonderland (https://www.sscardapane.it/alice-book/)

David Dobre

@busycalibrating.bsky.social

PhD in ML @Mila/UdeM LLM robustness, safety, interpretability

Christina (Chrisy)

@variint.bsky.social

Lost in translation | Interpretability of modular convnets applied to 👁️ and 🛰️🐝 | she/her 🦒💕 variint.github.io

Sweta Karlekar

Machine learning PhD student @ Blei Lab in Columbia University. Working in mechanistic interpretability, nlp, causal inference, and probabilistic modeling! Previously at Meta for ~3 years on the Bayesian Modeling & Generative AI teams. 🔗 www.sweta.dev

Isabelle Lee

@wordscompute.bsky.social

nlp/ml phding @ usc, interpretability & reasoning & pretraining & emergence 한american, she, iglee.me, likes ??= bookmarks

Francesco Ortu

@francescortu.bsky.social

NLP & Interpretability | PhD Student @ University of Trieste & Laboratory of Data Engineering of Area Science Park | Prev MPI-IS

Aaron Mueller @ NAACL 🇺🇸

@amuuueller.bsky.social

Postdoc at Northeastern and incoming Asst. Prof. at Boston U. Working on NLP, interpretability, causality. Previously: JHU, Meta, AWS

Zachary Lipton

@zacharylipton.bsky.social

CTO & Chief Scientific Officer @ Abridge, CMU ML prof, occasional writer, relapsing 🎷, creator of d2l.ai & approximatelycorrect.com

Sara Hooker

@sarahooker.bsky.social

I lead Cohere For AI. Formerly Research Google Brain. ML Efficiency, LLMs, @trustworthy_ml.

Yoav Goldberg

@yoavgo.bsky.social

Yung-Sung Chuang

@yungsung.bsky.social

PhD student #MIT_CSAIL | Intern #MetaAI #Microsoft #MITIBMLab | BS #NTU in #Taiwan

Yoav Artzi

@yoavartzi.com

LM/NLP/ML researcher ¯\_(ツ)_/¯ yoavartzi.com / associate professor @ Cornell CS + Cornell Tech campus @ NYC / nlp.cornell.edu / associate faculty director @ arXiv.org / researcher @ ASAPP / starting @colmweb.org / building RecNet.io

David Bau

@davidbau.bsky.social

Interpretable Deep Networks. http://baulab.info/ @davidbau

Sarah Wiegreffe

@sarah-nlp.bsky.social

Research in LM explainability & interpretability since 2017. sarahwie.github.io Postdoc @ai2.bsky.social & @uwnlp.bsky.social PhD from Georgia Tech Views my own, not my employer's.

Naomi Saphra | hiring PhD students

@nsaphra.bsky.social

Waiting on a robot body. All opinions are universal and held by both employers and family. Current fellow at Harvard Kempner, incoming faculty at Boston University, recruiting students! ML/NLP/they/she.

Gabriele Sarti

@gsarti.com

PhD Student at @gronlp.bsky.social 🐮, core dev @inseq.org. Interpretability ∩ HCI ∩ #NLProc. gsarti.com

Chris Olah

@colah.bsky.social

Reverse engineering neural networks at Anthropic. Previously Distill, OpenAI, Google Brain. Personal account.

Mor Geva

@megamor2.bsky.social

https://mega002.github.io

sonia joseph

@soniajoseph.bsky.social

AI researcher at Mila, visiting researcher at Meta. Also on X: @soniajoseph_

Yu Zhao

@yuzhaouoe.bsky.social

https://yuzhaouoe.github.io/ | PhD Student @ University of Edinburgh | Opening the Black Box for Efficient Training/Inference

Pasquale Minervini

@neuralnoise.com

Researcher in ML/NLP at the University of Edinburgh (faculty at Informatics and EdinburghNLP), Co-Founder/CTO at www.miniml.ai, ELLIS (@ELLIS.eu) Scholar, Generative AI Lab (GAIL, https://gail.ed.ac.uk/) Fellow -- www.neuralnoise.com, he/they

Alessio Devoto

@alessiodevoto.bsky.social

PhD in ML/AI | Researching Efficient ML/AI (vision & language) 🍀 & Interpretability | @SapienzaRoma @EdinburghNLP | https://alessiodevoto.github.io/
