Remote Senior DevOps Engineer

  • Full Time
  • Anywhere

HireLATAM

We are seeking a Senior DevOps Engineer to join our team and contribute to the development of cutting-edge payment solutions. In this remote, full-time position, you will be responsible for designing, implementing, and managing a robust Kafka-based messaging infrastructure, and you will collaborate closely with the company’s founders to ensure the delivery of high-quality, scalable software.

Requirements

  • Automating deployment, management, and operations of complex distributed systems with Apache Kafka
  • Implementing tracing and performance observability in high-scale distributed microservice architectures
  • Designing and managing scalable, high-throughput, and low-latency Kafka clusters for real-time data streaming between services
  • Building and maintaining infrastructure as code (IaC) for Kafka and related services using Terraform, Ansible, or similar tools
  • Monitoring and optimizing Kafka performance, ensuring message reliability and minimal downtime in a high-availability payment environment
  • Setting up and maintaining centralized observability systems for logs, metrics, and traces across all services using Prometheus, Grafana, or Datadog
  • Designing and maintaining CI/CD pipelines for infrastructure and microservices using tools such as GitHub Actions and Jenkins
  • Managing containerized workloads using Docker and Kubernetes, ensuring scalability and automated rollouts/rollbacks in production
  • Collaborating with backend engineers, SREs, and platform teams to implement Kafka producers/consumers that integrate cleanly with payment processing flows
  • Establishing security, access control, and encryption protocols for Kafka to meet regulatory and compliance standards (e.g., PCI DSS)
  • Leading Kafka upgrades, partition strategy design, and rebalancing without disrupting critical microservices
  • Implementing observability tooling for Kafka (e.g., Confluent Control Center, Prometheus/Grafana, or Datadog integrations)
  • Developing disaster recovery and failover strategies for Kafka-related components in production
  • Participating in incident response processes for Kafka-related outages
  • Strong communication skills in both English and Spanish

Benefits

  • Generous Paid Time Off
  • 401k Matching
  • Retirement Plan
  • Four Day Work Week

Originally posted on Himalayas

To apply for this job, please visit himalayas.app.
