AI Benchmark Engineer – Mathematical Reasoning (Meerut)

Meerut, India
Posted: a week ago

Description

Role Overview

We are seeking a highly analytical, computationally proficient individual with a strong research background to join our team. In this role, you will either craft challenging, insightful problems in your research domain or devise elegant computational solutions to them.

Responsibilities:
- Build multi-agent benchmark tasks that require multi-step mathematical reasoning, proof construction, or algorithmic problem-solving
- Design problems that are genuinely hard for a single agent but decomposable — competition math, numerical analysis, combinatorial optimisation, statistical inference
- Create verification scripts that check mathematical correctness — numerical answers with appropriate tolerance, proof step validity, and algorithm output correctness
- Write clear problem statements with exact notation, definitions, and output format
- Create decomposition guides that split problems into independent sub-computations or parallel solution strategies
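
As a rough illustration of the verification scripts mentioned above, the sketch below checks a submitted numerical answer against a reference value within a tolerance. The function name `check_numeric_answer` and the default tolerances are assumptions chosen for illustration; they do not describe any actual benchmark harness used for this role.

```python
import math


def check_numeric_answer(submitted: str, expected: float,
                         rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Return True if `submitted` parses to a finite float close to `expected`.

    Hypothetical helper: the tolerances are illustrative defaults and should
    be set per problem.
    """
    try:
        value = float(submitted.strip())
    except ValueError:
        # Non-numeric output counts as an incorrect answer.
        return False
    if not math.isfinite(value):
        # Reject nan and inf explicitly.
        return False
    # abs_tol matters when the expected value is at or near zero,
    # where a purely relative comparison would always fail.
    return math.isclose(value, expected, rel_tol=rel_tol, abs_tol=abs_tol)
```

A verifier built this way would typically set the tolerances per problem, since an appropriate tolerance depends on how well conditioned the computation is.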

Offer Details
- Pay: INR 1.75 to 2 Lakhs per month
- Mode of work: Fully Remote
- Duration: 12 months (likely to be extended)
- Number of positions: 15

Required Qualifications:
- 5+ years in mathematics, quantitative research, or computational science, with a background in competition math, university-level mathematics, or quantitative research.
- Python programming with NumPy, SciPy, or symbolic computation (SymPy).
- Experience writing mathematical proofs or formal derivations.
- Ability to create problems with precise, verifiable answers — not subjective or open-ended.
- Experience with AI coding benchmarks (SWE-bench, Terminal-bench).
- Comfortable with Docker: writing Dockerfiles, building images, and debugging container issues.
- Understanding of numerical methods — floating-point tolerance, convergence criteria, and error bounds.

Strong plus:
- Experience creating math competition problems (AMC, AIME, Putnam, IMO, or similar).
- Research in mathematics, theoretical CS, or q

Apply on Kit Job: kitjob.in/job/4lakqz

Highlights

Company name: Millionlogics
Job position: AI Benchmark Engineer – Mathematical Reasoning (Meerut)
Ad ID: 8764520417