RemoteOtter

Site Reliability Engineer (SRE) - Remote

Posted 3 days ago
DevOps / Sysadmin
Full Time
Worldwide

Overview

We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of our production systems.

In Short

  • Design, implement, and maintain scalable, resilient AWS infrastructure.
  • Develop and manage CI/CD pipelines and infrastructure-as-code (Terraform or similar).
  • Set up and optimize monitoring, alerting, and incident response processes.
  • Proactively identify and resolve performance, reliability, and security issues.
  • Collaborate with development teams to integrate SRE best practices into their workflows.
  • Conduct post-mortems and root cause analyses on incidents.
  • Participate in on-call rotations to support 24/7 system reliability.

Requirements

  • 5+ years of experience as an SRE or in a similar role.
  • Deep knowledge of AWS services (EC2, ECS, RDS, Lambda, S3, etc.).
  • Proficient in infrastructure-as-code tools (Terraform, CloudFormation, etc.).
  • Solid experience with Linux systems administration and networking concepts.
  • Strong programming/scripting skills (Python, Bash, Go, etc.).
  • Experience with CI/CD tools (GitLab CI, Jenkins, etc.).
  • Familiarity with observability tools (Prometheus, Grafana, Datadog, etc.).
  • Experience with container orchestration (ECS, EKS, or Kubernetes).
  • Understanding of security best practices in cloud environments.
  • Exposure to incident management frameworks (SRE handbook, etc.).

Benefits

  • 100% remote work with flexible hours.
  • High-impact role with autonomy and ownership.
  • Collaborative and international engineering team.
  • Cutting-edge tech stack with a strong focus on reliability and automation.

Blackfluo.ai

Blackfluo.ai is a fully remote company with a global team dedicated to creating innovative SaaS solutions for businesses and consulting firms. Their flagship product is an AI assistant designed to enhance daily operations by automating repetitive tasks, enabling clients to concentrate on their core activities. With a focus on backend development and machine learning engineering, Blackfluo.ai is committed to leveraging advanced technologies to optimize workflows and improve efficiency.
