RemoteOtter

Manager of Site Reliability Engineering - Remote

Posted 7 weeks ago
DevOps / Sysadmin
Full Time
Worldwide

Overview

We are looking for a dynamic and experienced Manager of Site Reliability Engineering with a strong background in cloud infrastructure (GCP/Azure), monitoring and observability stacks (such as Datadog or Dynatrace), and team leadership. This individual will play a key role in ensuring the reliability, scalability, and performance of our systems while managing a high-performing team of SREs.

In Short

  • Lead, mentor, and grow a team of Site Reliability Engineers (SREs).
  • Own the architecture and operational health of systems running on Google Cloud Platform (GCP) and/or Azure.
  • Implement and manage infrastructure-as-code (IaC) practices.
  • Lead the implementation of monitoring and observability tools.
  • Drive a culture of blameless post-mortems to learn from incidents.
  • Work closely with engineering, product, and operations teams.
  • Champion automation across all aspects of infrastructure.
  • Manage and prioritize the team’s work while aligning efforts with business goals.
  • Ensure systems meet Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Develop and maintain proactive monitoring and alerting strategies.

Requirements

  • 12+ years of experience in Site Reliability Engineering, DevOps, or related roles.
  • 4+ years of experience managing and leading teams.
  • Extensive hands-on experience with Google Cloud Platform (GCP) or Microsoft Azure.
  • Expertise in monitoring and observability stacks (e.g., Datadog, Dynatrace).
  • Strong experience in infrastructure automation tools (e.g., Terraform, Ansible).
  • Deep understanding of SRE concepts, including SLOs, SLIs, and SLAs.
  • Proficiency in one or more programming/scripting languages (e.g., Python, Go).
  • Experience with CI/CD pipelines and infrastructure as code (IaC).
  • Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Strong understanding of system architecture, networking, and security in cloud environments.

Benefits

  • Comprehensive benefits package including Remote/Hybrid workplace options.
  • Group Medical Coverage.
  • Flexible Time Off.
  • Career Pathing.
  • Summer Fridays.

Sagent India

Sagent India is a transformative force in the mortgage servicing industry, dedicated to simplifying and enhancing the homeownership experience for consumers in the U.S. By leveraging cutting-edge technology, Sagent empowers servicers and consumers alike, enabling them to manage their home-owning lives seamlessly. The company is a joint venture that combines the fintech expertise of Fiserv Inc. with the growth acumen of Warburg Pincus, fostering an environment that encourages innovation and disruption in the lending and housing sectors. Sagent is committed to creating a diverse and inclusive workplace, offering a range of benefits and opportunities for professional growth.

Similar Jobs:

Site Reliability Engineering Manager - Remote

Klaviyo

7 weeks ago

The Site Reliability Engineering Manager will lead a team to enhance system reliability and productivity at Klaviyo.

USA
Full-time
DevOps / Sysadmin
$188,000 - $282,000 USD

Site Reliability Engineering Manager - Remote

TextNow

10 weeks ago

Join TextNow as a Site Reliability Engineering Manager to lead a critical team and enhance system reliability and performance.

USA, CA
Full-time
DevOps / Sysadmin

Engineering Manager - Site Reliability Engineering - Remote

Customer.io

10 weeks ago

Join Customer.io as an Engineering Manager to lead the SRE squad and ensure the reliability of their products.

Worldwide
Full-time
DevOps / Sysadmin
$140,000 - $190,000/year

Manager, Site Reliability Engineering - Remote

Axon

11 weeks ago

Axon is seeking a Manager for Site Reliability Engineering to lead a team in managing large-scale cloud platforms.

Canada
Full-time
DevOps / Sysadmin

Site Reliability Engineering Manager - Remote

Axon

12 weeks ago

Axon is seeking a Site Reliability Engineering Manager to lead a team in building and operating observability systems.

Worldwide
Full-time
Software Development
$135,000 - $216,000 USD/year