RemoteOtter

Principal Site Reliability Engineer - Remote

Posted 2 days ago
DevOps / Sysadmin
Full Time
USA
$167,300 - $242,600/year

Overview

Your passion for uptime was forged by experience in production and refined through incident response. You’re an Expel Principal Site Reliability Engineer: a protector, champion, and leader of Expel's reputation for service reliability.

In Short

  • Lead project work to build and maintain platform features that cut across the Expel product’s reliability, networking, and cloud infrastructure.
  • Contribute by pushing IaC commits daily, with occasional opportunities to write and test application code in Python, Golang, and JavaScript.
  • Mentor and motivate service owners on how to use the platform in order to deploy, measure, monitor, and operate their own services at scale.
  • Participate in a weekly support rotation that includes carrying the on-call pager and providing near on-demand support to platform users during working hours.
  • Lead incident response, triage, and root cause analysis.
  • Collaborate with architects and product stakeholders to outline the next quarter’s reliability initiatives.
  • Pair-program with junior SREs to mentor them in debugging tricky deployments.
  • Work in a shared backlog and participate in weekly blame-free retros.
  • Take pride in achieving high uptime standards.
  • Engage in a culture of continuous improvement and learning.

Requirements

  • A passion for learning and improving your work product.
  • Significant experience operating Kubernetes within highly distributed environments.
  • Experience running systems in GCP or AWS.
  • Exposure to monitoring and observability infrastructure and standard methodologies.
  • Understanding of infrastructure-as-code practices, tools, and patterns.
  • Experience developing software in Linux environments, preferably with Python and/or Golang.
  • A customer-minded approach that enables the success of platform users.
  • A collaborative disposition that allows you to work effectively within and across teams.
  • Six years of systems experience in either operations or development.
  • Missing some items on the list? That's OK! We still want to talk to you!

Benefits

  • Opportunity to grow and maintain reliability-focused platform features.
  • Mission-driven work to stop evil hackers.
  • Space for employees to learn and grow.
  • Contribute to a best-in-class product.
  • Leadership team embracing modern Site Reliability principles.

Expel

Expel is a rapidly growing cybersecurity company valued at over $1 billion, dedicated to bridging the cybersecurity talent gap through its Managed Detection and Response (MDR) platform. The company focuses on enabling rapid detection and response to security threats while fostering innovation and collaboration among software engineers, data scientists, and product teams. Expel emphasizes the importance of data as a strategic asset and is committed to optimizing its data architecture to enhance performance, scalability, and data quality. With a strong emphasis on leadership, personal growth, and a collaborative work environment, Expel offers its employees the opportunity to shape the future of its data platform and contribute to the company's success.

Similar Jobs:

Principal Site Reliability Engineer - Remote

Jobgether

1 week ago

Seeking a Principal Site Reliability Engineer to architect and maintain hybrid infrastructures in a collaborative environment.

USA
Full-time
DevOps / Sysadmin

Principal Site Reliability Engineer - Remote

Jobgether

9 weeks ago

We are looking for a Principal Site Reliability Engineer to enhance the reliability and efficiency of large-scale distributed systems in a hybrid remote setup.

USA
Full-time
DevOps / Sysadmin

Principal Site Reliability Engineer - Remote

Upwork

15 weeks ago

Join Upwork as a Principal Site Reliability Engineer to lead and innovate in SRE practices for a global team.

Worldwide
Full-time
DevOps / Sysadmin

Principal Site Reliability Engineer - Remote

Cribl

19 weeks ago

Join Cribl as a Principal Site Reliability Engineer to enhance observability and reliability in software systems.

USA
Full-time
DevOps / Sysadmin
$240,000 - $400,000/year

Principal Site Reliability Engineer - Remote

Groupon

33 weeks ago

Join Groupon as a Principal Site Reliability Engineer to enhance the reliability and scalability of mission-critical systems.

Worldwide
Full-time
DevOps / Sysadmin