RemoteOtter

Site Reliability Engineer (SRE) - Remote

Posted 5 weeks ago

Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for designing, implementing, and maintaining our infrastructure and CI/CD pipelines, with a focus on automation, scalability, and performance.

In Short

  • Design, build, and maintain highly scalable infrastructure using Terraform and Terragrunt.
  • Manage cloud environments, particularly in AWS, ensuring cost optimization, security, and high availability.
  • Work with Confluent Cloud and Kafka to manage and scale our data streaming platforms.
  • Deploy and manage Redis instances for caching and real-time data processing.
  • Implement and maintain monitoring and alerting solutions using Prometheus, Grafana, Alertmanager, and Opsgenie.
  • Enable feature flag management and controlled rollouts using LaunchDarkly.
  • Manage Kubernetes clusters using Helm, ArgoCD, Istio, and Kustomize.
  • Troubleshoot and resolve complex system issues, ensuring high performance and uptime.
  • Continuously improve automation tools, processes, and methodologies.
  • Stay up to date with emerging SRE trends and technologies.

Requirements

  • 4+ years of proven experience as an SRE or in a similar role.
  • Expertise in Infrastructure as Code (IaC) using Terraform and Terragrunt.
  • Deep knowledge of AWS cloud services.
  • Hands-on experience with Confluent Cloud and Kafka.
  • Strong experience with Redis.
  • Proficiency in monitoring and alerting using Prometheus and Grafana.
  • Experience with LaunchDarkly for feature flag management.
  • Extensive experience managing Kubernetes clusters.
  • Excellent problem-solving skills.
  • Strong communication and collaboration skills.

Benefits

  • Competitive Health, Vision, Dental, and Life Insurance plans.
  • Robust 401(k) plan.
  • Discretionary Time Off.
  • Other minor perks.

Similar Jobs:

Site Reliability Engineer (SRE) - Remote

Point Wild

7 days ago

Join Point Wild as a Site Reliability Engineer to maintain system reliability and performance in a dynamic engineering team.

Site Reliability Engineering
DevOps
AWS
Azure
Worldwide
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Ensono

1 week ago

Ensono is looking for an experienced Site Reliability Engineer (SRE) to enhance their infrastructure and service management.

Site Reliability Engineering
Infrastructure as Code
Terraform
Azure DevOps
USA
Full-time
DevOps / Sysadmin
$93,000 - $135,000/year

Site Reliability Engineer (SRE) - Remote

Element Solutions

1 week ago

Element is seeking a motivated Site Reliability Engineer (SRE) to enhance cloud migration and collaborate on Infrastructure as Code and CI/CD efforts.

Site Reliability Engineering
Cloud Migration
Infrastructure as Code
CI/CD
USA
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Capital Markets Gateway

2 weeks ago

CMG is seeking a Site Reliability Engineer to enhance the reliability and performance of their infrastructure and applications.

Monitoring
Observability
Alerting
Infrastructure
Brazil
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Ververica

2 weeks ago

Join Ververica as a Site Reliability Engineer to design and maintain infrastructure for a Unified Streaming Data Platform.

AWS
GCP
Azure
Infrastructure as Code
Germany
Full-time
DevOps / Sysadmin