RemoteOtter

Staff Site Reliability Engineer - Remote

Posted 2 days ago
DevOps / Sysadmin
Full Time
USA

Overview

As a Staff Site Reliability Engineer (Staff SRE) at SailPoint, you will be a key member of our Reliability Engineering team, driving reliability practices for the Identity Security Cloud platform. You are immensely passionate about reliability practices and operational excellence.

In Short

  • Make it easy for everyone to create, consume, manage, and scale reliable cloud production services to achieve more
  • Keep up with industry trends to improve end-to-end reliability and maintainability for all services
  • Coach engineering teams on observability best practices such as setting up well-defined Service Level Objectives (SLOs)
  • Analyze performance of services and recommend infrastructure/code changes that will improve capacity and performance
  • Enable our engineering teams to scale our enterprise operations by providing guidance, best practices, and support as part of an SRE Center of Excellence
  • Manage cross-functional requirements working with Engineering, Product, Services, and other departments
  • Be a mentor of quality for design reviews, code, test cases, automation, observability, root cause analysis, and self-healing
  • Influence architectural design, implementation, consolidation, and simplification for global scale
  • Drive operational excellence to deliver frictionless operations, a happy on-call experience, and an optimal customer experience

Requirements

  • 8+ years of experience in SRE or DevOps production operations supporting a highly available environment for a SaaS software or cloud service provider
  • Strong proficiency with one or more programming languages (Java, Python, Go, etc.)
  • A Bachelor's degree in Computer Science or another technical discipline, or equivalent experience, is preferred but not required
  • Due to FedRAMP requirements, US Citizenship is required to be considered for this role
  • Experience with cloud infrastructure environments, preferably AWS, and infrastructure as code, preferably Terraform
  • Strong proficiency with containerization technology and/or Kubernetes
  • In-depth experience with metrics, tracing, and logging observability tools such as Prometheus, Grafana, Honeycomb, and Kibana
  • Experience with incident management, including conducting incident reviews
  • Strong understanding of Linux, software development, systems, networking, and cloud concepts
  • A positive and collaborative demeanor, combined with the ability to coach, mentor, and delegate
  • Excellent communication skills
  • Life-long learner – you stay up to date with technology trends, spend time learning new technologies, and share your learnings with your team

Benefits

  • Health and wellness coverage: Medical, dental, and vision insurance
  • Disability coverage: Short-term and long-term disability
  • Life protection: Life insurance and Accidental Death & Dismemberment (AD&D)
  • Flexible spending accounts for health care and dependent care; limited-purpose flexible spending account
  • Financial security: 401(k) Savings and Investment Plan with company matching
  • Time off benefits: Flexible vacation policy
  • Holidays: 8 paid holidays annually
  • Sick leave
  • Parental support: Paid parental leave
  • Employee Assistance Program (EAP) and Care Counselors
  • Voluntary benefits: Legal Assistance, Critical Illness, Accident, Hospital Indemnity and Pet Insurance options
  • Health Savings Account (HSA) with employer contribution

SailPoint Technologies

SailPoint Technologies is a leading provider of identity security solutions, dedicated to helping organizations manage and secure their digital identities. With a focus on operational excellence and reliability, SailPoint's Identity Security Cloud platform empowers businesses to create, consume, manage, and scale reliable cloud production services. The company emphasizes the importance of observability, performance analysis, and cross-functional collaboration, ensuring that engineering teams are equipped with the best practices and support needed to drive innovation and maintain high availability in a rapidly evolving technological landscape.

Similar Jobs:

Staff Site Reliability Engineer - Remote

Fivetran

1 week ago

Fivetran is seeking an experienced Staff Site Reliability Engineer to enhance the reliability of its data platform infrastructure.

USA
Full-time
DevOps / Sysadmin
$168,000 - $210,000 USD/year

Staff Site Reliability Engineer - Remote

primer.ai

5 weeks ago

Join Primer as a Staff Site Reliability Engineer to design and maintain fault-tolerant systems.

USA
Full-time
DevOps / Sysadmin
$180,000 - $230,000 USD/year

Staff Site Reliability Engineer - Remote

Wikimedia

8 weeks ago

The Wikimedia Foundation seeks a Staff Site Reliability Engineer to enhance its Machine Learning infrastructure.

Worldwide
Full-time
DevOps / Sysadmin
$129,347 - $200,824 USD/year

Staff Site Reliability Engineer - Remote

Stryker Corporation

8 weeks ago

Join Stryker as a Staff Site Reliability Engineer, focusing on cloud infrastructure and reliability engineering in a remote work environment.

USA
Full-time
DevOps / Sysadmin
$100,000 - $215,000/year

Staff Site Reliability Engineer - Remote

Wellhub

9 weeks ago

Join Wellhub as a Staff Site Reliability Engineer to build a secure and scalable cloud infrastructure.

Brazil
Full-time
DevOps / Sysadmin