About us

The future of AI hinges on access to high-quality, diverse data, much of which is locked away in hard-to-parse formats like PDFs. We built state-of-the-art document intelligence models to solve this problem, along with open source OCR and PDF parsing repos. Our tools Surya and Marker have collectively earned over 41K stars. We do meaningful research, ship product, and contribute to open source.

What started as an open source passion project has now been adopted by hundreds of your favorite teams and researchers at leading organizations like Ai2, OpenAI, Harvard, Stanford, and MIT’s research labs. We also recently raised a seed round from founding members of OpenAI, FAIR, and Hugging Face to continue building our API and enterprise products (announcing soon!).


Contact

We aren’t actively hiring, but we’re always open to connecting with great talent. If you’re interested in joining our team, either now or in the future, please send a note to [email protected]. We love to nerd out on side projects, so please include a link to something you’ve built!