Loading...

Network Reliability Engineer

  • Full Time
  • Anywhere

Margo

Margo — Warszawa, mazowieckie

HPC AI GPU CLUSTERS YOUR DAILY ROUTINE – Build a large AI infrastructure with monitoring, diagnosis, and remediation of production incidents- Troubleshoot high-impact production issues in collaboration with other engineering teams – Participate in an on-call rotation to handle incidents and ensure service continuity – Implement and maintain observability solutions to monitor AI infrastructure and …

View full listing & apply (via Adzuna)

To apply for this job please visit www.adzuna.com.

Keep exploring on Get A Job.ai

Not quite the right fit? Your next opportunity is a click away.

Hiring instead? Post a job and reach candidates searching right now.