AI Benchmark Engineer (Reasoning/Math) (Jaipur)
Jaipur, India
Posted: yesterday
- Build multi-agent benchmark tasks that require multi-step mathematical reasoning, proof construction, or algorithmic problem-solving
- Design problems that are genuinely hard for a single agent but decomposable — competition math, numerical analysis, combinatorial optimization, statistical inference
- Create verification scripts that check mathematical correctness — numerical answers with appropriate tolerance, proof step validity, algorithm output correctness
- Write clear, unambiguous problem statements with precise notation, definitions, and output format
- Create decomposition guides that split problems into independent sub-computations or parallel solution strategies
Required Qualifications:
- 5+ years of experience in mathematics, quantitative research, or computational science (e.g., competition math or university-level mathematics)
- Strong Python programming skills, including NumPy, SciPy, or symbolic computation (SymPy)
- Experience writing mathematical proofs or formal derivations
- Ability to create problems with precise, verifiable answers (not subjective or open-ended)
- Familiarity with AI coding benchmarks such as SWE-bench and Terminal-Bench
- Comfortable with Docker (writing Dockerfiles, building images, debugging containers)
- Understanding of numerical methods, including floating-point tolerance, convergence criteria, and error bounds
Nice to have:
- Experience creating math competition problems (e.g., AMC, AIME, Putnam, IMO, or similar)
- Research experience in mathematics, theoretical computer science, or quantitative fields
- Experience with automated theorem proving or formal verification
- Knowledge of AI reasoning benchmarks (e.g., GSM8K, MATH, AIME, GPQA, ARC-AGI)
- Experience with large-scale numerical computation or scientific computing
Perks of Freelancing With Turing:
- Work in a fully remote environment.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.
- Potential for contract extension based on performance and project needs.
Offer Details:
- Commitments required: 40 hours/week with 4 hours of PST overlap
- Engagement type: Contractor assignment/freelancer (no medical/paid leave)
- Duration of contract: 1 month; expected start date is next week
Apply on Kit Job: kitjob.in/job/4nb5nl
Company name: Turing
Job position: AI Benchmark Engineer (Reasoning/Math) (Jaipur)
AI Benchmark Engineer (Reasoning/Math) (Jaipur) has been posted in the Jaipur Engineering category on Locanto.