RemoteOtter

Staff Site Reliability Engineer - Remote

Posted 7 weeks ago

Overview

As a Staff Site Reliability Engineer (SRE), you will develop and implement highly reliable, scalable systems, working closely with cross-functional teams to create a stable and efficient environment.

In Short

  • Define and enforce SRE best practices and standards.
  • Architect and implement highly reliable and scalable systems.
  • Lead complex post-incident reviews and implement systemic improvements.
  • Collaborate with product and engineering teams to set reliability targets.
  • Manage high-impact incidents and coordinate incident response.
  • Contribute to budget planning and resource allocation.
  • Lead efforts to establish disaster recovery strategies.
  • Provide technical leadership and mentorship to the SRE team.
  • Continuously track and improve metrics to optimize software delivery and operational performance.
  • Participate in on-call rotation.

Requirements

  • 8-10 years of experience in a similar or related role.
  • Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience).
  • In-depth knowledge of Cloud Ops technologies including AWS and Terraform.
  • Advanced knowledge of Linux operating systems and troubleshooting OS issues.
  • Expertise in setting up and managing monitoring tools.
  • Strong understanding of incident management, capacity planning, and disaster recovery.
  • Advanced experience with security measures and practices.
  • Strong analytical and problem-solving skills.
  • Knowledge of Linux systems and common system administration tasks.
  • Strong understanding of programming/scripting languages including Python.

Benefits

  • Support for healthy work/life balance.
  • Floating holidays and wellness days.
  • Employee Resource Groups (ERGs).

Similar Jobs:

Staff Site Reliability Engineer - Remote

Wellhub

3 days ago

Join Wellhub as a Staff Site Reliability Engineer to build a secure and scalable cloud infrastructure.

AWS
Kubernetes
DevSecOps
Golang
Brazil
Full-time
DevOps / Sysadmin

Staff Site Reliability Engineer - Remote

Gemini

2 weeks ago

Join Gemini as a Staff Site Reliability Engineer to lead engineering teams in adopting modern DevOps practices and enhancing system reliability.

Site Reliability Engineering
DevOps
Automation
Cloud Technologies
USA
Full-time
DevOps / Sysadmin
$172,000 - $241,000/year

Staff Site Reliability Engineer - Remote

Varo Bank

3 weeks ago

Join Varo's SRE team as a Staff Site Reliability Engineer, focusing on cloud infrastructure reliability and performance.

AWS
Kubernetes
Terraform
CI/CD
USA
Full-time
DevOps / Sysadmin

Staff Site Reliability Engineer - Remote

Syngenta Group

3 weeks ago

Join our team as a Staff Site Reliability Engineer to design and optimize large-scale distributed systems.

SRE
DevOps
Infrastructure Engineering
CI/CD
Brazil
Full-time
DevOps / Sysadmin

Staff Site Reliability Engineer - Remote

Earnest

4 weeks ago

Join Earnest as a Staff Site Reliability Engineer to ensure the reliability and performance of systems while optimizing infrastructure.

Site Reliability Engineering
Infrastructure Management
CI/CD
Terraform
USA
Full-time
DevOps / Sysadmin
$194,000 - $220,000 USD/year