Gramian Consulting Group
We are looking for an AI Evaluation Engineer to design benchmark tasks that simulate real-world analytical workflows, with a focus on data analysis, multi-agent systems, and verification logic.
Requirements
- Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
- Create or curate realistic datasets (CSV, JSON, logs, reports, financial or operational data)
- Build tasks that require cross-referencing across multiple data sources, anomaly detection and contradiction identification, and statistical analysis and interpretation
- Define task decomposition strategies across specialized sub-agents (e.g., financial, technical, operational analysis)
- Develop verification logic to validate precise analytical outputs (not generic summaries)
- Implement evaluation pipelines using Python and SQL
- Create reproducible environments using Docker
- Analyze task performance and refine for clarity, difficulty, and scoring accuracy
Originally posted on Himalayas
To apply for this job, please visit himalayas.app.
Working in Egypt