AI Benchmark Engineer (Planning/Operations) (Dombivli)
Dombivli, India
Posted: less than a week ago
Description
About Turing: Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, plus top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
Role Overview: We are looking for AI Benchmark Engineers specializing in planning and operations to design and build complex, multi-agent benchmark tasks that simulate real-world planning, scheduling, and operational decision-making scenarios. This role focuses on creating constraint-rich problems that evaluate multi-agent reasoning, decomposition, and optimization capabilities in realistic environments.
What does day-to-day life look like?
- Design and develop multi-agent benchmark tasks involving:
  - Planning, scheduling, and resource allocation
  - Operational decision-making (project management, logistics, incident response, capacity planning)
- Create constraint-rich problem statements with multiple interacting variables
- Develop verification scripts to evaluate:
  - Feasibility (all constraints satisfied)
  - Completeness (all requirements addressed)
  - Optimality (efficient solutions)
- Build decomposition strategies:
  - Split tasks across specialized sub-agents (resource-based, constraint-based, conflict resolution, optimization)
- Model real-world operational scenarios with dependencies, timelines, and resource constraints
- Collaborate on improving task quality, coverage, and evaluation rigor
Requirements:
- 5+ years of experience in operations, project management, logistics, supply chain, or AI research, or a strong computer science research background
- Strong ability to formalize constraints, dependencies, and scheduling logic
- Proficiency in Python for building verification and validation scripts
- Strong structured problem-solving and decomposition skills
- Explicit and precise technical writing skills
- Experience with AI coding benchmarks (e.g., SWE-bench, Terminal-bench)
- Hands-on experience with Docker (Dockerfiles, image builds, debugging)
Nice to have:
- Experience with optimization techniques (linear programming, constraint satisfaction, scheduling algorithms)
- Background in operations research
- Experience with simulation or modeling tools
- Knowledge of AI planning systems or automated reasoning
- Project management experience or certifications (PMP, Agile, etc.)
Perks of Freelancing With Turing:
- Work in a fully remote environment.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.
Offer Details:
- Commitment Required: 40 hours per week, with 4 hours of overlap with PST.
- Engagement Type: Contractor assignment (no medical/paid leave)
- Duration of Contract: 4 weeks (adjustable based on engagement)
Apply on Kit Job: kitjob.in/job/4mvmk1
Highlights
- Company name: Turing
- Job position: AI Benchmark Engineer (Planning/Operations) (Dombivli)
More info about this ad
AI Benchmark Engineer (Planning/Operations) (Dombivli) has been posted in the Dombivali Engineering category on Locanto.