Datalab is building the core infrastructure for how enterprises process and understand documents at scale. Our models — Chandra, Surya, and Marker — have become the backbone of document intelligence, with 60,000+ GitHub stars and adoption across tier-1 AI research labs, Fortune 500 enterprises, and government agencies.
Our team of 6 is about to hit 8 figures in revenue and has 1,000+ customers across our API and on-prem deployments.
We're backed by founding members of OpenAI, FAIR, and Hugging Face. We move fast, ship often, and we're hiring builders who do the same.
View and apply to our open roles here!
Interested, but don’t see your role? Email us at [email protected]
We believe in the simplest possible technology that gets the job done. Our stack is FastAPI (Python) for the backend and frontend, with some light HTMX and JS sprinkled in. We use Postgres and Redis, and deploy to Render. For our API, training, and inference, we mostly use Python and PyTorch.
We prioritize clarity and maintainability over trends, and we ship code that thousands of developers and enterprises rely on every day.