Gramian Consulting Group
We are looking for an AI Evaluation Engineer specializing in data analysis to design benchmark tasks that simulate real-world analytical workflows. The ideal candidate has 5+ years of experience in data analysis or analytics-heavy roles, strong proficiency in Python and SQL, and experience working with real-world, messy datasets.
Requirements
- Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
- Create or curate realistic datasets
- Implement evaluation pipelines using Python and SQL
- Create reproducible environments using Docker
- Analyze task performance and refine tasks for clarity, difficulty, and scoring accuracy
Benefits
- Competitive salary
Originally posted on Himalayas
To apply for this job, please visit himalayas.app.