Vision and Language
Researchers interested in multimodal learning, specifically vision and language.
Created by
@adhirajghosh.bsky.social
@sukrutrao.bsky.social
PhD Student at the Max Planck Institute for Informatics @cvml.mpi-inf.mpg.de @maxplanck.de | Explainable AI, Computer Vision, Neuroexplicit Models Web: sukrutrao.github.io
@pima-hyphen.bsky.social
PhD Student at Ommer Lab, Munich (Stable Diffusion) 🎯 Working on getting my first 3M.
@lisadunlap.bsky.social
@bryant1410.bsky.social
🇺🇾 Research Scientist @Netflix, working on Vision+Language research. Opinions are my own.
@czyang.bsky.social
Ph.D. Student @ UMich EECS. Multimodal learning, audio-visual learning and computer vision. Prev research Intern @Adobe and @Meta https://ificl.github.io/
@zsoltkira.bsky.social
Associate Professor @ Georgia Tech computer vision & robotics/embodied AI http://faculty.cc.gatech.edu/~zk15
@martinagvilas.bsky.social
Computer Science PhD student | AI interpretability | Vision + Language | Cognitive Science. 🇦🇷 living in 🇩🇪, she/her https://martinagvilas.github.io/
@soumyasj.bsky.social
PhD student at University of Tübingen | Tübingen AI Center | Multimodal Learning | Vision & Language https://soumyasj.github.io/
@archiki.bsky.social
Ph.D. Student at UNC NLP | Apple Scholar in AI/ML Ph.D. Fellowship | Prev: FAIR at Meta, AI2, Adobe (Intern) | Interests: #NLP, #ML | https://archiki.github.io/
@jmincho.bsky.social
In the 2024-2025 job market! | PhD candidate at UNC | Bloomberg PhD Fellow | Prev: Google, Microsoft, Adobe, AI2, SNU | https://j-min.io | #multimodal #nlp
@omrisuissa.com
Postdoc @ Brown DSI VP R&D @ ClearMash 🔬 Passionate about high-fidelity numerical representations of reality, aligned with human perception. https://omri.alphaxiv.io/ #nlp #multimodality #retrieval #hci #multi-agent
@patelmaitreya.bsky.social
Research Intern @Adobe | PhD at @ApgAsu @ASU | Vision & Language | T2I Diffusion Modeling maitreyapatel.com
@sayandsarkar.bsky.social
PhD in 3D Vision @Stanford | MSc CS @ETH | Ex @Qualcomm, @MercedesBenz W: sayands.github.io
@wenyan62.bsky.social
PhD student at the CoAStaL NLP Group, University of Copenhagen. Former researcher at Comcast AI and SenseTime.
@merceaotniel.bsky.social
PhD MPI-IS, University of Tübingen MSc Edinburgh University Ex. Google DeepMind, Google Research, Helmholtz Munich, TU Munich Multimodal learning, efficient learning, and video understanding. merceaotniel.github.io
@josef-sivic.bsky.social
@dhruvbatra.bsky.social
Co-founder & Chief Scientist at Yutori. Prev: Senior Director leading FAIR Embodied AI at Meta, and Professor at Georgia Tech.
@yannistevissen.fr
AI scientist | Video Understanding, Retrieval & AI fairness | Head of Science @Moments Lab https://yannistevissen.fr
@cianeastwood.bsky.social
Senior Research Scientist at Valence Labs. Generative modeling (causal, multimodal) and generalisation for scientific discovery. PhD in ML from UofEdinburgh and MPI-IS, with time at Google DeepMind, Meta AI and Spotify. 📍London 🔗 cianeastwood.github.io
@drfeifei.bsky.social
Prof (CS @Stanford), Co-Director @StanfordHAI, Cofounder/CEO @theworldlabs, CoFounder @ai4allorg #AI #computervision #robotics #AI-healthcare
@thaottn.bsky.social
PhD student @ UW & visiting researcher @ MetaAI. Previously Google Brain resident & Stanford'19. Curating better training data for language & multimodal models. https://thaonguyen19.github.io/
@niladridutt.bsky.social
PhD @ucl.ac.uk | @ellis.eu | ex-Nvidia, Berkeley | Interested in generative modelling in vision and graphics + reasoning (LLMs) https://niladridutt.com/
@roopalgarg.bsky.social
Multimodal Multi-lingual research at Google DeepMind for Gemini post-training. #NLProc #Multimodal
@nikparth1.bsky.social
Research Scientist @ Google DeepMind making multi-modal learning more efficient. Prev: PhD in bio-inspired visual representation learning from the Simoncelli lab @ NYU Center for Neural Systems. BS/MS @ Stanford
@kochsebastian.bsky.social
PhD student @ uni_ulm & Student Researcher @ Google Research. Interested in 3D scene understanding. Powered by Pizza 🍕 🔗 https://kochsebastian.com
@shikharb.bsky.social
PhD student WAVLab@LTI, CMU Multimodality and multilinguality prev. predoc Google DeepMind
@pbontrager.bsky.social
AI researcher & engineer @Meta working on @PyTorch torchtune in NYC; interests in generative models, RL, and evolutionary strategies 💻 https://github.com/pbontrager 📝 https://tinyurl.com/philips-papers
@artemzholus.bsky.social
Visiting Researcher at Meta; PhD student @mila.quebec. Ex: Intern @GoogleDeepMind, Intern @ EPFL, MSc@MIPT; artemzholus.github.io
@yuyang0901.bsky.social
CS PhD @UCLAComSci 🧸 | Prev @AIatMeta @MSFTResearch @AmazonScience @uclamath | Improving data for efficiency, robustness and performance @San Francisco https://sites.google.com/g.ucla.edu/yuyang/home
@ebugliarello.bsky.social
Research Scientist at Google DeepMind https://e-bug.github.io
@vicenteor.bsky.social
Rice University, Associate Professor of Computer Science. Computer Vision, Multimodal AI, Deep Learning. Houston, Texas. Check our work at https://vislang.ai/
@ekazakos.bsky.social
Postdoctoral researcher @ CIIRC, CTU, Prague working in vision & language. PhD from University of Bristol. Ex. Samsung Research (SAIC-C). I love coffee and plants. And socks.
@vickykalogeiton.bsky.social
Assistant Professor at Ecole Polytechnique, IP_Paris// Before: Oxford_VGG, Inria Grenoble // multimodality, genAI enthusiast // happy mum+dog_mum // opinions: mine
@yukimasano.bsky.social
Professor at University of Technology Nuremberg Head of Fundamental AI Lab
@oliverlemke.bsky.social
M.Sc. Computer Science @ ETH Zurich | Research Intern @ Robotics and AI Institute Interested in 3D Scene Understanding and Embodied AI 📍Zurich, Switzerland 🔗 https://oliver-lemke.github.io
@rarefin.bsky.social
PhD Candidate Mila/University of Montreal | Intern at Amazon | Ex. ServiceNow, Recursion, UpStride | Researching to understand how Deep Learning works https://rarefin.github.io
@jiaruizhang.bsky.social
USC CS Ph.D. student Prev Tsinghua Uni NLP, Multimodal Learning, AI for Science https://saccharomycetes.github.io/
@esteng.bsky.social
Postdoc @UNC working on NLP, AI, and computational linguistics. Formerly PhD student @JHU and undergrad @McGill esteng.github.io
@dorsarohani.bsky.social
Deep learning @ NVIDIA, Vector. prev @ DeepGenomics dorsarohani.com
@ericxw.bsky.social
Head of Research @ Simular. Professor @ UC Santa Cruz. Working on #Multimodal #Embodied #AIAgents. AI for Humanity in the long run. he/him 📍Bay Area 🔗 https://eric-xw.github.io
@dziadzio.bsky.social
ELLIS PhD student in machine learning at IMPRS-IS. Continual learning at scale. sebastiandziadzio.com
@lambertoballan.bsky.social
Researcher in Computer Vision, Machine Learning and Embodied AI / Associate Prof of CS @UniPadova since 2017 / @ellis.eu member Previously: EU MSCA Fellow 2014-17 @Stanford and @UniFirenze 📍Padova, Italy 🔗 http://vimp.math.unipd.it/
@oliverlemon.bsky.social
Prof in CS; Academic Lead of National Robotarium; ELLIS Fellow; Edinburgh Centre for Robotics; Heriot-Watt, Edinburgh. Postdocs at Stanford and Edinburgh. Research in NLP, dialogue, conversational AI, multimodality, robots, embodied AI, collaborative AI
@roymiles.bsky.social
Research Scientist @Noah's Ark Lab, Huawei PhD Imperial College London roymiles.github.io Working on multi-modality and efficient ML
@beravci.bsky.social
Learning with machines&data Also interested in neuroscience & philosophy of mind PhD @BilkentCS, Asst. Prof. of AI @TOBB ETU ML | AI | HealthAI | Multimodal
@prashantgarg.bsky.social
Econ PhD @imperial. Visiting researcher at IFC and Cambridge. AI and networks in economics. www.prashantgarg.org
@jn2clark.bsky.social
Founder @marqo_ai, multimodal search, http://github.com/marqo-ai/marqo. Made robots see & learn @ Amazon RAI. Ex physicist @ Stanford & UCL, http://jn2clark.github.io
@elliesleightholm.bsky.social
Mathematician working in ML & AI 👩‍💻 STEM Advocate & Communicator 🚀 Cambridge Maths Grad 🎓
@yihe-deng.bsky.social
CS PhD candidate @UCLA | Prev. Research Intern @MSFTResearch, Applied Scientist Intern @AWS | LLM post-training, multi-modal learning https://yihedeng9.github.io
@ibalazevic.bsky.social
Senior Research Scientist at Google DeepMind, working on Gemini. PhD from University of Edinburgh. ibalazevic.github.io
@ckli.bsky.social
PhD student @ EPFL 🇨🇭 | Prev: research intern @ Google DeepMind, ByteDance AI Lab | Working on robot learning, multimodal & continual learning | https://charlieleee.github.io
@az-mtl.bsky.social
Applied ML Research @mila-quebec.bsky.social, Focused on Vision-Language Models🤖, AI for Neuroimaging 🧠, and Drug Discovery for Proteins 🧬. Alumnus of @polymtl.bsky.social. Travel, Photography & Amateur Music Composer. He/him. 🇨🇦 (FR/EN)
@guneetsk.bsky.social
🪸NLP researcher, AI scientist at Deccan AI. Core interests: Computational Social Sciences, Conversational AI, AI safety and Multilinguality
@kennethmarino.bsky.social
Assistant Prof at University of Utah Fall 2025. NLP+CV+RL. RS at Google DeepMind. PhD from CMU MLD, undergrad Georgia Tech. Sometimes researcher, frequent shitposter.
@skamalas.bsky.social
he/him; Researcher at NAVER LABS Europe. Greek, resident in Barcelona. https://www.skamalas.com/
@jmhessel.bsky.social
jmhessel.com NLP PhD; Seattle bike lane enjoyer; posts about machine learning, language processing, computer vision, transit
@hildekuehne.bsky.social
Professor for CS at the Tuebingen AI Center and affiliated Professor at MIT-IBM Watson AI lab - Multimodal learning and video understanding - GC for ICCV 2025 - https://hildekuehne.github.io/
@maxilse.bsky.social
Working at Microsoft Research Health Futures. Interested in causal representation learning and generative modelling applied to medical data.
@petitegeek.bsky.social
Computing Science prof in multimodal embodied AI, emotion, interaction at SFU in Vancouver 🇨🇦🇵🇭 Director of the Rosie Lab www.rosielab.ca Robotics nerd. Previously at SoftBank Robotics 🤖 FR/JP
@mateuszpach.bsky.social
ELLIS PhD Student @ TU Munich and Helmholtz AI 🔍⚙️ Interpretability 🖼️📚 Multimodal ML ✨🎨 Generative AI
@jiaangli.bsky.social
PhD student at University of Copenhagen @belongielab.org | #nlp #computervision | ELLIS student @ellis.eu 🌐 https://jiaangli.github.io/
@viridiano.com
A stickler for detail who loves uncertainty | Postdoc at Case Western Reserve University working on #Multimodality and #FrameSemantics | Formerly Ph.D. student at FrameNet Brasil | Content Manager at @cogscisociety.bsky.social 🌐 http://viridiano.com
@akhtarmubashara.bsky.social
PhD @ King’s College London • prev CambridgeNLP, TU Wien, intern GoogleDeepmind • NLP, Data-centric ML, Multimodality http://mubasharaakhtar.com
@serge.belongie.com
Professor, University Of Copenhagen 🇩🇰 PI @belongielab.org 🕵️‍♂️ Director @aicentre.dk 🤖 Board member @ellis.eu 🇪🇺 Formerly: Cornell, Google, UCSD #ComputerVision #MachineLearning
@pcascanteb.bsky.social
UMIACS Postdoc - Incoming Assistant Professor @ SBU CS. Vision and Language. Prev. @RiceCompSci @vislang @merl_news @MITIBMLab. EECS Rising Star'23. Former drummer. 🇨🇷 https://paolacascante.com/
@zeynepakata.bsky.social
Liesel Beckmann Distinguished Professor of Computer Science at Technical University of Munich and Director of the Institute for Explainable ML at Helmholtz Munich
@rohit-saxena.bsky.social
PhD student at University of Edinburgh Long Context | Summarization | Vision and Language | Narratives https://saxenarohit.github.io/
@gowthami.bsky.social
PhD-ing at UMD. Knows a little about multimodal generative models. Check out my website to know more - https://somepago.github.io/
@soldaini.net
I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma 🍇) w @kylelo.bsky.social. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot more at https://soldaini.net
@mohitbansal.bsky.social
Parker Distinguished Professor, @UNC. Program Chair #EMNLP2024. Director http://MURGeLab.cs.unc.edu (@uncnlp). @Berkeley_AI @TTIC_Connect @IITKanpur #NLP #CV #AI #ML https://www.cs.unc.edu/~mbansal/
@giffmana.ai
Researcher (OpenAI. Ex: DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://admonymous.co/giffmana 📍 Zürich, Suisse 🔗 http://lucasb.eyer.be
@aarbelle.bsky.social
Father, Husband, and a Senior Researcher and Manager of the AI Multimodal group at IBM Research.
@simi97k.bsky.social
NLP ❤️ | PhD @ CMU, LTI | Prev. Google Research, Microsoft Research | https://simran-khanuja.github.io/
@lucaeyring.bsky.social
ELLIS PhD student at TU Munich & Helmholtz AI Generative Modeling - Optimal Transport - Representation Learning https://lucaeyring.com/
@sivareddyg.bsky.social
Assistant Professor @Mila-Quebec.bsky.social Co-Director @McGill-NLP.bsky.social Researcher @ServiceNow.bsky.social Alumni: @StanfordNLP.bsky.social, EdinburghNLP Natural Language Processor #NLProc
@bayesiankitten.bsky.social
Postdoctoral Researcher @ Bethgelab, University of Tübingen Benchmarking | LLM Agents | Data-Centric ML | Continual Learning | Unlearning drimpossible.github.io
@shyamgopal.bsky.social
PhD at Tübingen. Working on post-training diffusion and multimodal models. Previously a research intern at Snapchat and Naver Labs. https://sgk98.github.io/
@confusezius.bsky.social
Large Models, Multimodality, Continual Learning | ELLIS ML PhD with Oriol Vinyals & Zeynep Akata | Previously Google DeepMind, Meta AI, AWS, Vector, MILA 🔗 karroth.com
@askoepke.bsky.social
Junior research group leader at TUM | University of Tübingen. Previously at VGG (Oxford), BAIR (Berkeley). Interested in multi-modal learning. 🔗 https://akoepke.github.io/
@vishaalurao.bsky.social
@ELLISforEurope PhD Student @bethgelab @caml_lab @Cambridge_Uni @uni_tue; Currently SR @GoogleAI; Previously MPhil @Cambridge_Uni, RA @RutgersU, UG @iiitdelhi vishaal27.github.io
@merve.bsky.social
proud mediterranean 🧿 open-sourceress at hugging face 🤗 multimodality, zero-shot vision, vision language models, transformers
@saxon.me
NLP/Vision+Language PhD Candidate @ UCSB Evals, multilinguality, and multimodality https://saxon.me/
@bennokrojer.bsky.social
AI PhDing at Mila/McGill (prev FAIR intern). Happily residing in Montreal 🥯❄️ Academic: language grounding, vision+language, interp, rigorous & creative evals, cogsci Other: many sports, urban explorations, puzzles/quizzes bennokrojer.com
@adhirajghosh.bsky.social
MSc Machine Learning @University of Tübingen | Data-centric Vision and Language Researcher @bethgelab.bsky.social Website: adhirajghosh.github.io Twitter: https://x.com/adhiraj_ghosh98