RemoteOtter

Senior Site Reliability Engineer (Observability & Resilience) - Remote

Posted 2 days ago
DevOps / Sysadmin
Full Time
USA

Overview

As a Senior Site Reliability Engineer (Observability & Resilience), you will lead observability across our platform and help design the resilient infrastructure our customers and educators rely on every day.

In Short

  • Design and implement observability patterns for actionable visibility.
  • Build internal tooling and dashboards for real-time insights.
  • Define and maintain SLIs and SLOs with product and engineering teams.
  • Architect high availability and disaster recovery infrastructure.
  • Collaborate with engineers to embed resilient design and observability.

Requirements

  • 5+ years in an SRE, DevOps, or observability-focused role.
  • Experience designing systems for high availability and disaster recovery.
  • Deep experience with observability tools like Grafana and Prometheus.
  • Strong proficiency with Terraform and multi-cloud deployments.
  • Passion for enabling product engineers through training.
  • Excellent communication skills for technical and non-technical audiences.

Benefits

  • Work on cutting-edge AI technology impacting education.
  • Flexibility of working from home.
  • Unlimited time off for work-life balance.
  • Employer-paid health insurance plans.
  • Generous stock options vesting over 4 years.
  • 401(k) match and monthly wellness stipend.

MagicSchool AI

MagicSchool AI is an innovative technology company dedicated to transforming the educational landscape through artificial intelligence. By leveraging advanced AI solutions, MagicSchool AI aims to enhance learning experiences, making education more accessible and personalized for students of all ages. The company is committed to fostering creativity and critical thinking in learners, while also providing educators with powerful tools to support their teaching methods. With a focus on collaboration and continuous improvement, MagicSchool AI strives to be at the forefront of educational technology.
