Full Time
Anywhere
Posted 1 month ago

Gramian Consulting Group

We are looking for an AI Evaluation Engineer specializing in data analysis to design benchmark tasks that simulate real-world analytical workflows. The ideal candidate will have 5+ years of experience in data analysis or analytics-heavy roles, strong proficiency in Python and SQL, and experience working with messy, real-world datasets.

Requirements

Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
Create or curate realistic datasets
Implement evaluation pipelines using Python and SQL
Create reproducible environments using Docker
Analyze task performance and refine for clarity, difficulty, and scoring accuracy

Benefits

Competitive salary

Originally posted on Himalayas

To apply for this job, please visit himalayas.app.

AI Evaluation Engineer (Data Analysis & Multi-Agent Systems)