Clément Dumas
Master's student at ENS Paris-Saclay / aspiring AI safety researcher / improviser
Prev research intern @ EPFL w/ wendlerc.bsky.social and Robert West
MATS Winter 7.0 Scholar w/ neelnanda.bsky.social
https://butanium.github.io
Starter Packs
Created by Clément Dumas (1)
Mechanistic interpretability
Starter pack with mechanistic interpretability researchers mostly posting about their research
@blackboxnlp.bsky.social
The largest workshop on analysing and interpreting neural networks for NLP. BlackboxNLP will be held at EMNLP 2025 in Suzhou, China blackboxnlp.github.io
@maximemeloux.bsky.social
PhD student @LIG | Causal abstraction, interpretability & LLMs
@davidduvenaud.bsky.social
Machine learning prof at U Toronto. Working on evals and AGI governance.
@antoninpoche.bsky.social
PhD Student doing XAI for NLP at @ANITI_Toulouse, IRIT, and IRT Saint Exupery. 🛠️ Xplique library development team member.
@actinterp.bsky.social
🛠️ Actionable Interpretability 🔎 @icmlconf.bsky.social 2025 | Bridging the gap between insights and actions ✨ https://actionable-interpretability.github.io
@stevebyrnes.bsky.social
Researching Artificial General Intelligence Safety, via thinking about neuroscience and algorithms, at Astera Institute. https://sjbyrnes.com/agi.html
@wattenberg.bsky.social
Human/AI interaction. ML interpretability. Visualization as design, science, art. Professor at Harvard, and part-time at Google DeepMind.
@canrager.bsky.social
@scasper.bsky.social
AI technical governance & risk management research. PhD Candidate at MIT CSAIL. Also at https://x.com/StephenLCasper. https://stephencasper.com/
@sarafish.bsky.social
PhD student at Harvard interested in EconCS and ML / previously Caltech undergrad in math
@hidenori8tanaka.bsky.social
Group Leader, CBS-NTT "Physics of Intelligence" Program at Harvard. Website: https://sites.google.com/view/htanaka/home
@taovacano.bsky.social
Maths addict: Apmep, Maths pour tous, Ires Aix-Marseille, Marseille maths club. Teacher at Lycée Monte-Cristo in Allauch. Diving supervisor in Marseille.
@timhua.bsky.social
Helping people is good, I guess. Trying to do AI interp and control. Used to do economics. timhua.me
@angieboggust.bsky.social
MIT PhD candidate in the VIS group working on interpretability and human-AI alignment
@simonschrodi.bsky.social
🎓 PhD student @cvisionfreiburg.bsky.social @UniFreiburg 💡 interested in mechanistic interpretability, robustness, AutoML & ML for climate science https://simonschrodi.github.io/
@sarahwiegreffe.bsky.social
Research in NLP (mostly LM interpretability & explainability). Incoming assistant prof at UMD CS + CLIP. Current postdoc @ai2.bsky.social & @uwnlp.bsky.social Views my own. sarahwie.github.io
@pkhdipraja.bsky.social
PhD student @ Fraunhofer HHI. Interpretability, incremental NLP, and NLU. https://pkhdipraja.github.io/
@notaphonologist.bsky.social
Faculty fellow (independent postdoc) in Data Science at New York University. NLP, computational linguistics, interpretability, gender. she/her. Please hire me! https://www.notaphonologist.com/
@naturecomputes.bsky.social
Searching for principles of neural representation | Neuro + AI @ enigmaproject.ai | Stanford | sophiasanborn.com
@mlamparth.bsky.social
Postdoc at @Stanford, @StanfordCISAC, Stanford Center for AI Safety, and the SERI program | Focusing on interpretable, safe, and ethical AI decision-making.
@ndif-team.bsky.social
The National Deep Inference Fabric, an NSF-funded computational infrastructure to enable research on large-scale Artificial Intelligence. 🔗 NDIF: https://ndif.us 🧰 NNsight API: https://nnsight.net 😸 GitHub: https://github.com/ndif-team/nnsight
@abosselut.bsky.social
Helping machines make sense of the world. Asst Prof @icepfl.bsky.social; Before: @stanfordnlp.bsky.social @uwnlp.bsky.social AI2 #NLProc #AI Website: https://atcbosselut.github.io/
@yulislavutsky.bsky.social
Stats Postdoc at Columbia, @bleilab.bsky.social Statistical ML, Generalization, Uncertainty, Empirical Bayes https://yulisl.github.io/
@claudiashi.bsky.social
machine learning, causal inference, science of llm, ai safety, phd student @bleilab, keen bean https://www.claudiashi.com/
@sqirllab.bsky.social
We are "squIRreL", the Interpretable Representation Learning Lab based at IDLab - University of Antwerp & imec. Research Areas: #RepresentationLearning, Model #Interpretability, #explainability, #DeepLearning #ML #AI #XAI #mechinterp
@asantilli.bsky.social
PhD student in NLP at Sapienza | Prev: Apple MLR, @colt-upf.bsky.social , HF Bigscience, PiSchool, HumanCentricArt #NLProc www.santilli.xyz