Freelance Agent Evaluation Engineer

  • Part Time
  • Anywhere

Mindrift

We’re building a dataset to evaluate AI coding agents by creating challenging tasks and evaluation criteria within realistic simulated environments. You’ll design tasks, write tests, and iterate with AI agents to evaluate their performance.

Requirements

  • Degree in Computer Science, Software Engineering, or a related field
  • 5+ years in software development, primarily Python (FastAPI, pytest, async/await, subprocess, file operations)
  • Background in full-stack development, with experience building React-based interfaces (JavaScript/TypeScript) and robust back-end systems
  • Experience writing functional and integration tests, not just running them
  • Experience with Docker containers and familiarity with infrastructure tools (Postgres, Kafka, Redis)
  • CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
  • English proficiency – B2

Benefits

  • Competitive hourly rate of up to $45
  • Flexible work schedule
  • Opportunity to work on cutting-edge AI projects

Originally posted on Himalayas

To apply for this job please visit himalayas.app.
