Skrub
skrub is a Python library to ease preprocessing and feature engineering for tabular machine learning.
Our long-term goal is to directly connect database tables to machine learning estimators.
https://skrub-data.org
https://discord.gg/ABaPnm7fDC
Starter Packs
Created by Skrub (1)
pydata projects
Libraries and projects related to data processing and machine learning in Python
@jupyter.org
Multi-language interactive computing environments. Jupyter Notebook, JupyterLab and related projects – @mentions not monitored. Open issues on GitHub.
@commitcanary.bsky.social
I'm CommitCanary, your daily(-ish) source for GitHub updates. I use AI to turn commit messages into concise summaries—though I might occasionally hallucinate!
@pydata.bsky.social
The PyData Global Conference is where users, contributors, and newcomers can share experiences to learn from one another and grow together. Want to meet the community? Join our Discord! https://discord.gg/CjspHbE9xe
@dataumbrella.org
◈ community ◈ data science & open source ◈ videos: https://youtube.com/@dataumbrella ◈ events: https://meetup.com/data-umbrella ◈ news: dataumbrella.substack.com/
@mystmd.org
Markdown with superpowers for writing scientific 👨‍🔬 and technical 📈 papers. #OpenSource & maintained by Project #Jupyter.
@networkx.bsky.social
Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
@realpython.com
Online #Python Training & Expert Community: Tutorials, Video Courses, Books, Quizzes...and More! Join 1M+ Pythonistas at http://realpython.com
@pydataparis.bsky.social
🇫🇷 Paris hub of the PyData global community! Join our community meetup and the upcoming conference at Cité des Sciences. September 30 - October 1, 2025 • Cité des Sciences • PyData Paris CFP deadline: April 27, 2025
@2i2c.org
A non-profit dedicated to helping communities create and share knowledge with open infrastructure for interactive computing.
@arviz.bsky.social
Official account for the ArviZ project. We provide #FOSS tools for exploratory analysis of #Bayesian models in #Python and #JuliaLang www.arviz.org
@galaxyproject.bsky.social
Reproducible data science for everyone! #UseGalaxy https://galaxyproject.org/
@prefect.io
Modern workflow orchestration for data and ML engineers. Website: https://prefect.io Code: https://github.com/prefecthq/prefect Slack: https://prefect.io/slack
@story645.bsky.social
graduate student studying visualization (CS) @thegraduatecenter.bsky.social & @matplotlib.org community manager
@conda.org
conda is a multi-platform and language-agnostic open source package management and environment management ecosystem. ⇒ @conda@fosstodon.org ⇒ https://conda.org/
@conda-forge.org
A community led collection of recipes, build infrastructure and distributions for the conda package manager. https://conda-forge.org
@vscode.dev
Visual Studio Code with GitHub Copilot supercharges your code with AI-powered suggestions, right in your editor
@plotly.com
Put data & AI into action by creating scalable, interactive data apps for your business with Dash. https://plotly.com/
@dslc.io
Our mission: To provide tools and resources to foster a diverse, friendly, and inclusive community of data science learners and practitioners. Join us at https://dslc.io
@machsci.bsky.social
🌲🌲 Applied ML/AI, data science, MLOps | Wife of 1, mom of 2 | Co-Founder and CTO of http://storytellers.ai python 🐍 AI 🤖 cloud ☁️ data 📊 I also talk about Jesus here: @itskirstenlum.bsky.social
@wang.social
Anaconda Founder&Head of AI; created the PyData movement, PyScript, Bokeh, Datashader; Fellow @ Python Software Foundation; Center for Humane Tech Game~B; Physics, Cybernetics, Memetics. A student of the human condition. Memento mori
@allendowney.bsky.social
Former professor at Olin College, principal data scientist at PyMC Labs, author of Think Python, and Probably Overthinking It -- blog and book -- and stark raving Bayesian.
@seanlaw.bsky.social
👋 Principal Data Scientist, R&D at Schwab. 🐍 PyData Ann Arbor co-organizer. 🔥Creator of the STUMPY Python package for modern time series analysis: 👉 https://stumpy.readthedocs.io/en/latest/
@inesmontani.bsky.social
💥 Founder & CEO @explosion.ai 👩‍💻 Developing spacy.io & prodi.gy 🐍 Python Software Foundation Fellow 🧠 AI, Machine Learning & NLP 💼 linkedin.com/in/inesmontani 🐘 sigmoid.social/@ines 💥 explosion.ai 🌎 ines.io
@posit.co
We make free, open-source software for data scientists like the RStudio IDE. We're formerly known as RStudio. You can always download our open-source IDE here. https://posit.co/download/rstudio-desktop/
@pymc-labs.bsky.social
The Bayesian Consultancy • Using PyMC to solve your most challenging data science problems • http://pymc-labs.com
@python4data.science
Teaching materials for the cusy training courses on a Python-based data science workflow: https://cusy.io/en/seminars
@numpy.bsky.social
@duckdb.org
DuckDB is an analytical in-process SQL database management system. "DuckDB" and the DuckDB logo are registered trademarks of the DuckDB Foundation.
@pydatasoton.bsky.social
https://www.meetup.com/pydata-southampton Southampton, UK chapter of PyData
@pydatamadrid.masto.ai.ap.brid.gy
Meetup of the PyData community in Madrid. Join us! Toots by @astrojuanlu [bridged from https://masto.ai/@pydatamadrid on the fediverse by https://fed.brid.gy/ ]
@pydatalondon.bsky.social
https://london.pydata.org We run a monthly meetup and host the annual PyData London conference. PyData is an educational program of NumFOCUS, helping our community share ideas and learn from each other. https://www.meetup.com/PyData-London-Meetup/
@matplotlib.org
Python library for creating static, animated, & interactive visualizations. Chat w/ us @ https://discourse.matplotlib.org/ Sponsored by NumFocus
@koaning.bsky.social
Prefer common sense over hype. Employed at @marimo.io, building calmcode.io and dearme.email. Also blogs over at https://koaning.io.
@riccardocappuzzo.com
Research engineer at Inria Saclay, working on the Skrub library. Python, data preparation, ML, tabular learning. ORCID: 0000-0002-4448-2959 ☄️ https://www.riccardocappuzzo.com https://github.com/rcap107
@ogrisel.bsky.social
Software engineer at probabl, scikit-learn contributor. Also at: https://sigmoid.social/@ogrisel https://github.com/ogrisel
@gaelvaroquaux.bsky.social
Research & code: Research director @inria ►Data, Health, & Computer science ►Python coder, (co)founder of scikit-learn, joblib, & @probabl.bsky.social ►Sometimes does art photography ►Physics PhD
@bsky.app
official Bluesky account (check username👆) Bugs, feature requests, feedback: support@bsky.app