AI interpretability/explainability
Theoretical and empirical approaches to interpretability and/or explainability in AI.
Created by
@fedeadolfi.bsky.social
@yuzhaouoe.bsky.social
https://yuzhaouoe.github.io/ | PhD Student @ University of Edinburgh | Opening the Black Box for Efficient Training/Inference
@gsarti.com
PhD Student at @gronlp.bsky.social 🐮, core dev @inseq.org. Interpretability ∩ HCI ∩ #NLProc. gsarti.com
@gattanasio.cc
Postdoc at @sardine-lab-it.bsky.social working on fair and safe language technologies. | gattanasio.cc | he/him | http://questovirgolettatoesiste.com
@neuralnoise.com
Researcher in ML/NLP at the University of Edinburgh (faculty at Informatics and EdinburghNLP), Co-Founder/CTO at www.miniml.ai, ELLIS (@ELLIS.eu) Scholar, Generative AI Lab (GAIL, https://gail.ed.ac.uk/) Fellow -- www.neuralnoise.com, he/they
@timvanerven.nl
Associate professor in machine learning at the University of Amsterdam. Topics: (online) learning theory and the mathematics of interpretable AI. www.timvanerven.nl Theory of Interpretable AI seminar: https://tverven.github.io/tiai-seminar
@sukrutrao.bsky.social
PhD Student at the Max Planck Institute for Informatics @cvml.mpi-inf.mpg.de @maxplanck.de | Explainable AI, Computer Vision, Neuroexplicit Models Web: sukrutrao.github.io
@juliagrabinski.bsky.social
PhD Student in Computer Vision 💫 seeking Postdoc Opportunities 💫
@katharinaprasse.bsky.social
PhD in Computer Vision, supervised and inspired by Prof. Dr.-Ing. Margret Keuper. Member of the Data & Web Science Group @ University of Mannheim.
@claudiashi.bsky.social
machine learning, causal inference, science of llm, ai safety, phd student @bleilab, keen bean https://www.claudiashi.com/
@norabelrose.bsky.social
AI, philosophy, spirituality. Head of interpretability research at EleutherAI, but posts are my own views, not Eleuther’s.
@csinva.bsky.social
Senior researcher at Microsoft Research. Seeking good explanations with machine learning https://csinva.io/
@marielza.bsky.social
Retired UNESCO Dir for Digital Inclusion, Policies & Transformation. Chair, UN University, eGov Institute. UNESCO Women in STEM Committee. Some pottery and cyanotyping. Profile picture is of my face and torso. Banner is a picture I took of a light garden.
@ankareuel.bsky.social
Computer Science PhD Student @ Stanford | Geopolitics & Technology Fellow @ Harvard Kennedy School/Belfer | Vice Chair EU AI Code of Practice | Views are my own
@mlamparth.bsky.social
Postdoc at @Stanford, @StanfordCISAC, Stanford Center for AI Safety, and the SERI program | Focusing on interpretable, safe, and ethical AI decision-making.
@jaom7.bsky.social
Associate Professor @UAntwerp, sqIRL/IDLab, imec. #RepresentationLearning, #Model #Interpretability & #Explainability. A guy who plays with toy bricks, enjoys research and gaming. Opinions are my own. idlab.uantwerpen.be/~joramasmogrovejo
@andreasmadsen.bsky.social
Ph.D. in NLP Interpretability from Mila. Previously: independent researcher, freelancer in ML, and Node.js core developer.
@neelrajani.bsky.social
PhD student in Responsible NLP at the University of Edinburgh, passionate about MechInterp
@mariaeckstein.bsky.social
Research scientist at Google DeepMind. Intersection of cognitive science and AI. Reinforcement learning, decision making, structure learning, abstraction, cognitive modeling, interpretability.
@kayoyin.bsky.social
PhD student at UC Berkeley. NLP for signed languages and LLM interpretability. kayoyin.github.io 🏂🎹🚵♀️🥋
@angieboggust.bsky.social
MIT PhD candidate in the VIS group working on interpretability and human-AI alignment
@jasmijn.uk
Senior Research Scientist at Google DeepMind. Interested in (equitable) language technology, gender, interpretability, NLP. Views my own. She/her. 🌐 https://jasmijn.uk
@simonschrodi.bsky.social
🎓 PhD student @cvisionfreiburg.bsky.social @UniFreiburg 💡 interested in mechanistic interpretability, robustness, AutoML & ML for climate science https://simonschrodi.github.io/
@fionaewald.bsky.social
PhD Student @ LMU Munich, Munich Center for Machine Learning (MCML). Research in Interpretable ML / Explainable AI.
@jskirzynski.bsky.social
PhD student in Computer Science @UCSD. Studying interpretable AI and RL to improve people's decision-making.
@peyrardmax.bsky.social
Junior Professor CNRS (previously EPFL, TU Darmstadt) -- AI Interpretability, causal machine learning, and NLP. Currently visiting @NYU https://peyrardm.github.io
@rachel-law.bsky.social
Organic machine turning tea into theorems ☕️ AI @ Microsoft Research ➡️ Goal: Teach models (and humans) to reason better. Let’s connect re: AI for social good, graphs & network dynamics, discrete math, logic 🧩, 🥾, 🎨. Organizing for democracy. 🗽 www.rlaw.me
@dilya.bsky.social
PhD Candidate in Interpretability @FraunhoferHHI | 📍Berlin, Germany dilyabareeva.github.io
@elianapastor.bsky.social
Assistant Professor at PoliTo 🇮🇹 | Currently visiting scholar at UCSC 🇺🇸 | she/her | TrustworthyAI, XAI, Fairness in AI https://elianap.github.io/
@ovdw.bsky.social
Technology specialist at the EU AI Office / AI Safety / Prev: University of Amsterdam, EleutherAI, BigScience Thoughts & opinions are my own and do not necessarily represent my employer.
@jkminder.bsky.social
CS Student at ETH Zürich, currently doing my master's thesis at the DLAB at EPFL. Mainly interested in Language Model Interpretability. Most recent work: https://openreview.net/forum?id=Igm9bbkzHC | MATS 7.0 Winter 2025 Scholar w/ Neel Nanda | jkminder.ch
@aliciacurth.bsky.social
Machine Learner by day, 🦮 Statistician at ❤️. In search of statistical intuition for modern ML & simple explanations for complex things 👀. Interested in the mysteries of modern ML, causality & all of stats. Opinions my own. https://aliciacurth.github.io
@anneo.bsky.social
Comm tech & social media research professor by day, symphony violinist by night, outside as much as possible otherwise. German/American Pacific Northwestern New Englander, #firstgen academic, she/her, 🏳️🌈 https://anne-oeldorf-hirsch.uconn.edu
@lasha.bsky.social
✨On the faculty job market✨ Postdoc at UW, working on Natural Language Processing 🌐 https://lasharavichander.github.io/
@mdhk.net
Linguist in AI & CogSci 🧠👩💻🤖 PhD student @ ILLC, University of Amsterdam 🌐 https://mdhk.net/ 🐘 https://scholar.social/@mdhk 🐦 https://twitter.com/mariannedhk
@velezbeltran.bsky.social
Machine Learning PhD Student @ Blei Lab & Columbia University. Working on probabilistic ML | uncertainty quantification | LLM interpretability. Excited about everything ML, AI and engineering!
@wordscompute.bsky.social
nlp/ml phding @ usc, interpretability & reasoning & pretraining & emergence 한american, she, iglee.me, likes ??= bookmarks
@swetakar.bsky.social
Machine learning PhD student @ Blei Lab in Columbia University Working in mechanistic interpretability, nlp, causal inference, and probabilistic modeling! Previously at Meta for ~3 years on the Bayesian Modeling & Generative AI teams. 🔗 www.sweta.dev
@sarah-nlp.bsky.social
Research in LM explainability & interpretability since 2017. sarahwie.github.io Postdoc @ai2.bsky.social & @uwnlp.bsky.social PhD from Georgia Tech Views my own, not my employer's.
@apepa.bsky.social
Assistant Professor, University of Copenhagen; interpretability, xAI, factuality, accountability, xAI diagnostics https://apepa.github.io/
@panisson.bsky.social
Principal Researcher @ CENTAI.eu | Leading the Responsible AI Team. Building Responsible AI through Explainable AI, Fairness, and Transparency. Researching Graph Machine Learning, Data Science, and Complex Systems to understand collective human behavior.
@thomasfel.bsky.social
Explainability, Computer Vision, Neuro-AI.🪴 Kempner Fellow @Harvard. Prev. PhD @Brown, @Google, @GoPro. Crêpe lover. 📍 Boston | 🔗 thomasfel.me
@sejdino.bsky.social
Professor of Statistical Machine Learning at the University of Adelaide. https://sejdino.github.io/
@markriedl.bsky.social
AI for storytelling, games, explainability, safety, ethics. Professor at Georgia Tech. Associate Director of ML Center at GT. Time travel expert. Geek. Dad. he/him
@jeku.bsky.social
Postdoc at Linköping University 🇸🇪. Doing NLP, particularly explainability, language adaptation, modular LLMs. I'm also into 🌋🏕️🚴.
@annarogers.bsky.social
Associate professor at IT University of Copenhagen: NLP, language models, interpretability, AI & society. Co-editor-in-chief of ACL Rolling Review. #NLProc #NLP
@bharathr98.com
Theoretical physicist by day, ML researcher by night. Currently split between CERN and UniGE. Ex - IISER-M, @caltech.edu https://scholar.google.com/citations?user=8BDAnVAAAAAJ
@kylem.bsky.social
Full of childlike wonder. Building friendly robots. UT Austin PhD student, MIT ‘20.
@jbarbosa.org
Junior PI @ INM (Paris) in computational neuroscience, interested in how computations enabling cognition are distributed across brain areas. Expect neuroscience and ML content. jbarbosa.org
@martinagvilas.bsky.social
Computer Science PhD student | AI interpretability | Vision + Language | Cognitive Science. 🇦🇷 living in 🇩🇪, she/her https://martinagvilas.github.io/
@stephaniebrandl.bsky.social
Assistant Professor in NLP (Fairness, Interpretability and lately interested in Political Science) at the University of Copenhagen ✨ Before: Postdoc in NLP at Uni of CPH, PhD student in ML at TU Berlin
@mdlhx.bsky.social
NLP assistant prof at KU Leuven, PI @lagom-nlp.bsky.social. I like syntax more than most people. Also multilingual NLP, interpretability, mountains and beer. (She/her)
@variint.bsky.social
Lost in translation | Interpretability of modular convnets applied to 👁️ and 🛰️🐝 | she/her 🦒💕 variint.github.io
@mimansaj.bsky.social
Robustness, Data & Annotations, Evaluation & Interpretability in LLMs http://mimansajaiswal.github.io/
@christophmolnar.bsky.social
Author of Interpretable Machine Learning and other books Newsletter: https://mindfulmodeler.substack.com/ Website: https://christophmolnar.com/
@stellaathena.bsky.social
I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.
@romapatel.bsky.social
research scientist @deepmind. language & multi-agent rl & interpretability. phd @BrownUniversity '22 under ellie pavlick (she/her) https://roma-patel.github.io
@fedeadolfi.bsky.social
Computation & Complexity | AI Interpretability | Meta-theory | Computational Cognitive Science https://fedeadolfi.github.io