RemoteOtter

Senior Site Reliability Engineer, Observability - Remote

Posted Yesterday
DevOps / Sysadmin
Full Time
USA

Overview

A Senior Site Reliability Engineer role focused on observability and monitoring for a SaaS product, with responsibility for designing, deploying, and maintaining cloud infrastructure.

In Short

  • Key contributor on an Agile development team.
  • Build and execute monitoring strategies for SaaS infrastructure.
  • Define and maintain system and service monitors.
  • Utilize monitoring technologies like Prometheus and AWS CloudWatch.
  • Automate detection and remediation of platform issues.
  • Triage incidents, support incident response, and document standard operating procedures (SOPs).
  • Participate in on-call management rotation.
  • Provide analytics solutions to internal teams.
  • Mentor team members and improve systems.
  • Collaborate effectively with various teams.

Requirements

  • 8+ years in software development or site reliability engineering.
  • Strong problem-solving skills and experience with algorithms.
  • Experience with cloud automation tools like Terraform.
  • Proficiency in scripting languages such as Python and Bash.
  • Familiarity with configuration automation tools like Ansible.
  • Exposure to Windows and Linux administration.
  • Experience with project management tools like Jira.
  • Knowledge of datastore technologies like Postgres and MySQL.
  • Effective communication and collaboration skills.
  • Ability to mentor and lead by example.

Benefits

  • Comprehensive medical, dental, and vision plans.
  • 401(k) plan with employer match.
  • Flexible Paid Time Off (FTO).
  • Volunteer Time Off (VTO).
  • 5-year Service Milestone Sabbatical.
  • Paid parental leave.
  • Employee referral bonus program.
  • Pet insurance.
  • Regular virtual company-wide events.
  • Professional development opportunities.

ScienceLogic

ScienceLogic is a pioneering company at the forefront of transforming IT operations through automation and generative AI. With a focus on creating truly autonomous enterprises, ScienceLogic's cutting-edge AIOps platform enhances the management and optimization of IT operations, enabling organizations to achieve superior customer experiences and drive revenue growth. The company is dedicated to building a future of Autonomic IT, where operations are self-healing and self-optimizing, and is reshaping the $18+ billion IT operations market with innovative solutions. Trusted by thousands of organizations globally, ScienceLogic empowers businesses with actionable insights and workflow automation, eliminating manual tasks and enhancing operational efficiency.
