
Tech Ops – Production Support & Reliability (AWS)

  • Full Time
  • Anywhere

CommIT

Description

We are looking for a Tech Ops – Production Support & Reliability Lead.

Front-line production support for Braviant’s AWS multi-account stack: monitor systems, triage alerts, execute runbooks, and escalate cleanly to developers. This is a defensive ownership role, not a developer role, despite “Lead” in the title.

Stack:

  • AWS – VPC, ECS, Lambda (SAM/CloudFormation), IAM, NAT, security groups
  • PostgreSQL on Amazon RDS (~15 instances)
  • Datadog + CloudWatch (APM, logs, alerting)
  • Java microservices / API-heavy app stacks
  • Jira (ITSM) + Slack (ops channels)
  • Nice-to-have: AWS data services (Glue, S3, Athena, EventBridge), Metaplane

Requirements

Must-have:

  • 3+ years production support / SRE / NOC / ops engineering
  • Hands-on AWS – EC2/ECS, VPC networking, IAM
  • Operational PostgreSQL / RDS – slow query reading, basic tuning, vacuum awareness
  • Incident triage across infra + app layers
  • Structured incident response (ITIL, NIST, or equivalent)
  • SLA management in a ticketed environment (Jira or similar)
  • Strong written English for escalation + post-incident write-ups

Nice-to-have:

  • Datadog / CloudWatch fluency
  • AWS data services (Glue, S3, Athena, EventBridge)
  • Basic IaC (CloudFormation, SAM, Terraform)
  • Financial services or other regulated-environment background
  • AWS SysOps Administrator or Solutions Architect cert
  • Scripting / automation

Originally posted on Himalayas

To apply for this job, please visit himalayas.app.
