AI Benchmark Engineer Vapi
Vapi, India
Posted: less than a week ago
Description
About Turing:
Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
Role Overview:
We are seeking a highly analytical, computationally proficient individual with a strong research background to join our team. In this role, you will contribute either by crafting challenging and insightful problems in your research domain or by devising elegant computational solutions.
Responsibilities:
Build multi-agent benchmark tasks that require reading, analyzing, and synthesizing large document collections
Curate real-world research corpora — academic papers, case studies, technical reports — and design questions that require comprehensive analysis
Write structured ground-truth oracles (JSON) with specific, verifiable answers that prove the agent actually read the source material
Design LLM judge prompts that evaluate agent output field-by-field against the oracle
Create decomposition guides that split research across multiple parallel sub-agents (one per document, one per domain, then synthesis)
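To illustrate the oracle and field-by-field evaluation described above, here is a minimal sketch in Python. It is an assumption-laden example, not Turing's actual tooling: the field names, the sample values, and the helper functions `normalize` and `score_fields` are all hypothetical, and a deterministic exact-match check stands in for the LLM judge.

```python
import json

# Hypothetical ground-truth oracle. The field names and values are
# illustrative only; the posting does not specify a schema.
ORACLE_JSON = """
{
  "sample_size": 128,
  "primary_method": "randomized controlled trial",
  "reported_limitations": ["small cohort", "single site"]
}
"""


def normalize(value):
    """Lowercase and trim strings; sort lists so item order does not matter."""
    if isinstance(value, str):
        return value.strip().lower()
    if isinstance(value, list):
        return sorted(normalize(v) for v in value)
    return value


def score_fields(oracle: dict, answer: dict) -> dict:
    """Return a per-field verdict: 'match', 'mismatch', or 'missing'."""
    verdicts = {}
    for field, expected in oracle.items():
        if field not in answer:
            verdicts[field] = "missing"
        elif normalize(answer[field]) == normalize(expected):
            verdicts[field] = "match"
        else:
            verdicts[field] = "mismatch"
    return verdicts


oracle = json.loads(ORACLE_JSON)

# Hypothetical agent output: one field differs only in formatting,
# another omits an item the oracle expects.
agent_answer = {
    "sample_size": 128,
    "primary_method": "Randomized Controlled Trial ",
    "reported_limitations": ["single site"],
}

verdicts = score_fields(oracle, agent_answer)
accuracy = sum(v == "match" for v in verdicts.values()) / len(verdicts)
print(verdicts)
print(f"field accuracy: {accuracy:.2f}")
```

In practice an LLM judge would likely replace the exact-match comparison for free-text fields, but a per-field verdict structure like this keeps the evaluation auditable against the oracle.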
Required Qualifications:
5+ years of research experience (academic or industry) in any scientific domain
Solid reading comprehension, with the ability to extract structured data from unstructured text
Experience with JSON and data structures, including schema design and output validation
Proficiency in Python scripting for data processing and evaluation
Apply on Kit Job: kitjob.in/job/4n4smt
Highlights
Company name: Turing
Job position: AI Benchmark Engineer Vapi