AI interpretability/explainability
Theoretical and empirical approaches to interpretability and/or explainability in AI.
Created by
@fedeadolfi.bsky.social
@yuzhaouoe.bsky.social
https://yuzhaouoe.github.io/ | PhD Student @ University of Edinburgh | Opening the Black Box for Efficient Training/Inference
@gsarti.com
PhD Student at @gronlp.bsky.social 🐮, core dev @inseq.org. Interpretability ∩ HCI ∩ #NLProc. gsarti.com
@gattanasio.cc
Postdoc at @sardine-lab-it.bsky.social working on fair and safe language technologies. | gattanasio.cc | he/him | http://questovirgolettatoesiste.com
@neuralnoise.com
Researcher in ML/NLP at the University of Edinburgh (faculty at Informatics and EdinburghNLP), Co-Founder/CTO at www.miniml.ai, ELLIS (@ELLIS.eu) Scholar, Generative AI Lab (GAIL, https://gail.ed.ac.uk/) Fellow -- www.neuralnoise.com, he/they
@timvanerven.nl
Associate professor in machine learning at the University of Amsterdam. Topics: (online) learning theory and the mathematics of interpretable AI. www.timvanerven.nl Theory of Interpretable AI seminar: https://tverven.github.io/tiai-seminar
@sukrutrao.bsky.social
PhD Student at the Max Planck Institute for Informatics @cvml.mpi-inf.mpg.de @maxplanck.de | Explainable AI, Computer Vision, Neuroexplicit Models Web: sukrutrao.github.io
@juliagrabinski.bsky.social
PhD Student in Computer Vision 💫 seeking Postdoc Opportunities 💫
@katharinaprasse.bsky.social
PhD in Computer Vision, supervised and inspired by Prof. Dr.-Ing. Margret Keuper. Member of the Data & Web Science Group @ University of Mannheim.
@claudiashi.bsky.social
machine learning, causal inference, science of llm, ai safety, phd student @bleilab, keen bean https://www.claudiashi.com/
@norabelrose.bsky.social
AI, philosophy, spirituality. Head of interpretability research at EleutherAI, but posts are my own views, not Eleuther’s.
@csinva.bsky.social
Senior researcher at Microsoft Research. Seeking good explanations with machine learning https://csinva.io/
@marielza.bsky.social
Retired UNESCO Dir for Digital Inclusion, Policies & Transformation. Chair, UN University, eGov Institute. UNESCO Women in STEM Committee. Some pottery and cyanotyping. Profile picture is of my face and torso. Banner is a picture I took of a light garden.
@ankareuel.bsky.social
Computer Science PhD Student @ Stanford | Geopolitics & Technology Fellow @ Harvard Kennedy School/Belfer | Vice Chair EU AI Code of Practice | Views are my own
@mlamparth.bsky.social
Postdoc at @Stanford, @StanfordCISAC, Stanford Center for AI Safety, and the SERI program | Focusing on interpretable, safe, and ethical AI decision-making.
@jaom7.bsky.social
Associate Professor @UAntwerp, sqIRL/IDLab, imec. #RepresentationLearning, #Model #Interpretability & #Explainability. A guy who plays with toy bricks, enjoys research and gaming. Opinions are my own. idlab.uantwerpen.be/~joramasmogrovejo
@andreasmadsen.bsky.social
Ph.D. in NLP Interpretability from Mila. Previously: independent researcher, freelancer in ML, and Node.js core developer.
@neelrajani.bsky.social
PhD student in Responsible NLP at the University of Edinburgh, passionate about MechInterp
@mariaeckstein.bsky.social
Research scientist at Google DeepMind. Intersection of cognitive science and AI. Reinforcement learning, decision making, structure learning, abstraction, cognitive modeling, interpretability.
@kayoyin.bsky.social
PhD student at UC Berkeley. NLP for signed languages and LLM interpretability. kayoyin.github.io 🏂🎹🚵♀️🥋
@angieboggust.bsky.social
MIT PhD candidate in the VIS group working on interpretability and human-AI alignment
@jasmijn.uk
Senior Research Scientist at Google DeepMind. Interested in (equitable) language technology, gender, interpretability, NLP. Views my own. She/her. 🌐 https://jasmijn.uk
@simonschrodi.bsky.social
🎓 PhD student @cvisionfreiburg.bsky.social @UniFreiburg 💡 interested in mechanistic interpretability, robustness, AutoML & ML for climate science https://simonschrodi.github.io/
@fionaewald.bsky.social
PhD Student @ LMU Munich, Munich Center for Machine Learning (MCML). Research in Interpretable ML / Explainable AI.
@jskirzynski.bsky.social
PhD student in Computer Science @UCSD. Studying interpretable AI and RL to improve people's decision-making.
@peyrardmax.bsky.social
Junior Professor CNRS (previously EPFL, TU Darmstadt) -- AI Interpretability, causal machine learning, and NLP. Currently visiting @NYU https://peyrardm.github.io
@rachel-law.bsky.social
Organic machine turning tea into theorems ☕️ AI @ Microsoft Research ➡️ Goal: Teach models (and humans) to reason better. Let’s connect re: AI for social good, graphs & network dynamics, discrete math, logic 🧩, 🥾, 🎨. Organizing for democracy. 🗽 www.rlaw.me
@dilya.bsky.social
PhD Candidate in Interpretability @FraunhoferHHI | 📍Berlin, Germany dilyabareeva.github.io
@elianapastor.bsky.social
Assistant Professor at PoliTo 🇮🇹 | Currently visiting scholar at UCSC 🇺🇸 | she/her | TrustworthyAI, XAI, Fairness in AI https://elianap.github.io/
@ovdw.bsky.social
Technology specialist at the EU AI Office / AI Safety / Prev: University of Amsterdam, EleutherAI, BigScience Thoughts & opinions are my own and do not necessarily represent my employer.
@jkminder.bsky.social
CS Student at ETH Zürich, currently doing my master's thesis at the DLAB at EPFL. Mainly interested in Language Model Interpretability. Most recent work: https://openreview.net/forum?id=Igm9bbkzHC | MATS 7.0 Winter 2025 Scholar w/ Neel Nanda | jkminder.ch
@aliciacurth.bsky.social
Machine Learner by day, 🦮 Statistician at ❤️. In search of statistical intuition for modern ML & simple explanations for complex things 👀. Interested in the mysteries of modern ML, causality & all of stats. Opinions my own. https://aliciacurth.github.io
@anneo.bsky.social
Comm tech & social media research professor by day, symphony violinist by night, outside as much as possible otherwise. German/American Pacific Northwestern New Englander, #firstgen academic, she/her, 🏳️🌈 https://anne-oeldorf-hirsch.uconn.edu
@lasha.bsky.social
✨On the faculty job market✨ Postdoc at UW, working on Natural Language Processing 🌐 https://lasharavichander.github.io/
@mdhk.net
Linguist in AI & CogSci 🧠👩💻🤖 PhD student @ ILLC, University of Amsterdam 🌐 https://mdhk.net/ 🐘 https://scholar.social/@mdhk 🐦 https://twitter.com/mariannedhk
@velezbeltran.bsky.social
Machine Learning PhD Student @ Blei Lab & Columbia University. Working on probabilistic ML | uncertainty quantification | LLM interpretability. Excited about everything ML, AI and engineering!
@wordscompute.bsky.social
nlp/ml phding @ usc, interpretability & reasoning & pretraining & emergence 한american, she, iglee.me, likes ??= bookmarks
@swetakar.bsky.social
Machine learning PhD student @ Blei Lab in Columbia University Working in mechanistic interpretability, nlp, causal inference, and probabilistic modeling! Previously at Meta for ~3 years on the Bayesian Modeling & Generative AI teams. 🔗 www.sweta.dev
@sarah-nlp.bsky.social
Research in LM explainability & interpretability since 2017. sarahwie.github.io Postdoc @ai2.bsky.social & @uwnlp.bsky.social PhD from Georgia Tech Views my own, not my employer's.
@apepa.bsky.social
Assistant Professor, University of Copenhagen; interpretability, xAI, factuality, accountability, xAI diagnostics https://apepa.github.io/
@panisson.bsky.social
Principal Researcher @ CENTAI.eu | Leading the Responsible AI Team. Building Responsible AI through Explainable AI, Fairness, and Transparency. Researching Graph Machine Learning, Data Science, and Complex Systems to understand collective human behavior.
@thomasfel.bsky.social
Explainability, Computer Vision, Neuro-AI.🪴 Kempner Fellow @Harvard. Prev. PhD @Brown, @Google, @GoPro. Crêpe lover. 📍 Boston | 🔗 thomasfel.me
@sejdino.bsky.social
Professor of Statistical Machine Learning at the University of Adelaide. https://sejdino.github.io/
@markriedl.bsky.social
AI for storytelling, games, explainability, safety, ethics. Professor at Georgia Tech. Associate Director of ML Center at GT. Time travel expert. Geek. Dad. he/him
@jeku.bsky.social
Postdoc at Linköping University 🇸🇪. Doing NLP, particularly explainability, language adaptation, modular LLMs. I'm also into 🌋🏕️🚴.
@annarogers.bsky.social
Associate professor at IT University of Copenhagen: NLP, language models, interpretability, AI & society. Co-editor-in-chief of ACL Rolling Review. #NLProc #NLP
@bharathr98.com
Theoretical physicist by day, ML researcher by night. Currently split between CERN and UniGE. Ex - IISER-M, @caltech.edu https://scholar.google.com/citations?user=8BDAnVAAAAAJ
@kylem.bsky.social
Full of childlike wonder. Building friendly robots. UT Austin PhD student, MIT ‘20.
@jbarbosa.org
Junior PI @ INM (Paris) in computational neuroscience, interested in how computations enabling cognition are distributed across brain areas. Expect neuroscience and ML content. jbarbosa.org
@martinagvilas.bsky.social
Computer Science PhD student | AI interpretability | Vision + Language | Cognitive Science. 🇦🇷 living in 🇩🇪, she/her https://martinagvilas.github.io/
@stephaniebrandl.bsky.social
Assistant Professor in NLP (Fairness, Interpretability and lately interested in Political Science) at the University of Copenhagen ✨ Before: Postdoc in NLP at Uni of CPH, PhD student in ML at TU Berlin
@mdlhx.bsky.social
NLP assistant prof at KU Leuven, PI @lagom-nlp.bsky.social. I like syntax more than most people. Also multilingual NLP, interpretability, mountains and beer. (She/her)
@variint.bsky.social
Lost in translation | Interpretability of modular convnets applied to 👁️ and 🛰️🐝 | she/her 🦒💕 variint.github.io
@mimansaj.bsky.social
Robustness, Data & Annotations, Evaluation & Interpretability in LLMs http://mimansajaiswal.github.io/
@christophmolnar.bsky.social
Author of Interpretable Machine Learning and other books Newsletter: https://mindfulmodeler.substack.com/ Website: https://christophmolnar.com/
@stellaathena.bsky.social
I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.
@romapatel.bsky.social
research scientist @deepmind. language & multi-agent rl & interpretability. phd @BrownUniversity '22 under ellie pavlick (she/her) https://roma-patel.github.io
@fedeadolfi.bsky.social
Computation & Complexity | AI Interpretability | Meta-theory | Computational Cognitive Science https://fedeadolfi.github.io