AI Benchmark Engineer (Reasoning/Math) (Kolkata)
Kolkata, India
Posted: yesterday
- Build multi-agent benchmark tasks that require multi-step mathematical reasoning, proof construction, or algorithmic problem-solving
- Design problems that are genuinely hard for a single agent but decomposable — competition math, numerical analysis, combinatorial optimization, statistical inference
- Create verification scripts that check mathematical correctness — numerical answers with appropriate tolerance, proof step validity, algorithm output correctness
- Write clear problem statements with precise notation, definitions, and output format
- Create decomposition guides that split problems into independent sub-computations or parallel solution strategies
Required Qualifications:
- 5+ years of experience in mathematics, quantitative research, or computational science (e.g., competition math, university-level mathematics, or quantitative research)
- Strong Python programming skills, including NumPy, SciPy, or symbolic computation (SymPy)
- Experience writing mathematical proofs or formal derivations
- Ability to create problems with precise, verifiable answers (not subjective or open-ended)
- Familiarity with AI coding benchmarks such as SWE-bench and Terminal-bench
- Comfortable with Docker (writing Dockerfiles, building images, debugging containers)
- Understanding of numerical methods, including floating-point tolerance, convergence criteria, and error bounds
Nice to have:
- Experience creating math competition problems (e.g., AMC, AIME, Putnam, IMO, or similar)
- Research experience in mathematics, theoretical computer science, or quantitative fields
- Experience with automated theorem proving or formal verification
- Knowledge of AI reasoning benchmarks (e.g., GSM8K, MATH, AIME, GPQA, ARC-AGI)
- Experience with large-scale numerical computation or scientific computing
Perks of Freelancing With Turing:
- Work in a fully remote environment.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.
- Potential for contract extension based on performance and project needs.
Offer Details:
- Commitments Required: 40 hours/week with 4 hours of PST overlap
- Engagement type: Contractor assignment/freelancer (no medical/paid leave)
- Duration of contract: 1 month; [expected start date is next week]
Apply on Kit Job: kitjob.in/job/4nbp2x
Company name: Turing
Job position: AI Benchmark Engineer (Reasoning/Math) (Kolkata)