Salary range: $275k-$375k | Equity: up to 0.5% | In-person, NYC
Apply here or email us at [email protected] with your resume and references to past work!
Role Overview
We're looking for a Research Engineer to own problems end to end across our models, inference service, and product. You won't just train a model and hand it off. You'll take it from training through benchmarking and into our inference stack, then work with the team to integrate it into our products.
We're a small team that has shipped the current state-of-the-art OCR model, Chandra. Our models collectively have 50k+ GitHub stars. Our tools are used internally at frontier AI labs like Anthropic and Fortune 500 enterprises like Siemens.
Our team focuses on training small, efficient models that outperform much larger LLMs on domain-specific tasks (like OCR, structured extraction, and tables). We move fast, prioritize practical results, and ship tools that are open, reproducible, and built to last. You'll test hypotheses quickly, iterate on results, and balance experimental rigor with shipping to customers.
Day to Day
A typical project might look like this: identify a gap in extraction quality on long documents, train and benchmark a new model, optimize it for inference, and work with the team to ship it to users. Concretely, you will:
- Train and evaluate models: Build task-specific models (OCR, layout, text recognition, extraction) and explore architectures and training strategies to optimize task performance. This includes our open-source models, like Marker, Surya, and Chandra.
- Optimize inference: Profile and accelerate model inference across different hardware setups (H100s, B200s, L40s, CPUs).
- Ship to product: Work with the team to integrate models into our API and product, helping define how new capabilities surface for end users. You will be involved from model training through integration, though your work will be weighted heavily toward the model side rather than the product side.
- Create and maintain datasets: Source, design, and clean datasets for supervised and synthetic training; create reproducible pipelines for data versioning and evaluation.
- Experiment and benchmark: Run ablations, track metrics, and publish findings that inform model design and internal research direction.
- Engage with users and partners: Occasionally join calls or Slack threads to better understand customer needs and inform your work.
Ideal Candidate
You've shipped models that made it into production. You understand how to balance exploration with delivery, and how to turn research insights into products people actually use.
- 3+ years of experience training, fine-tuning, and evaluating deep learning models
- Trained at least one production-grade model or system used in real-world applications
- Deep expertise in PyTorch and Python, with strong fundamentals in deep learning (optimization, evaluation, architecture design)
- Comfortable with data engineering, benchmarking, and performance profiling across hardware setups