RemoteOtter

Principal Site Reliability Engineer - Remote

Posted 6 days ago
DevOps / Sysadmin
Full Time
USA

Overview

This role is an opportunity to improve the reliability, scalability, and efficiency of large-scale distributed systems in a dynamic automotive software environment.

In Short

  • Develop automation tools and software to streamline operational processes and improve system reliability.
  • Lead and enhance observability frameworks to detect and resolve issues proactively.
  • Participate in on-call rotations to troubleshoot production incidents, minimizing downtime.
  • Collaborate closely with software developers to ensure service scalability, reliability, and quality.
  • Manage service level indicators (SLIs), objectives (SLOs), and agreements (SLAs) to meet reliability goals.
  • Conduct post-incident reviews and failure analyses to foster continuous improvement.
  • Identify and implement cost-saving optimizations while maintaining high service standards.

Requirements

  • Strong experience in Site Reliability Engineering or related fields.
  • Proficiency in automation tools and scripting languages.
  • Experience with observability and monitoring frameworks.
  • Ability to troubleshoot complex production issues.
  • Strong collaboration skills with software development teams.
  • Experience managing SLIs, SLOs, and SLAs.
  • Analytical mindset for post-incident reviews and failure analysis.

Benefits

  • Opportunity to work in a dynamic automotive software environment.
  • Hybrid remote work setup with occasional on-site collaboration.
  • Professional development opportunities.
  • Competitive salary and benefits package.

Jobgether

Jobgether is a global platform dedicated to connecting job seekers with fully remote job opportunities. The company focuses on matching candidates to roles where they are most likely to succeed, providing valuable feedback on applications to enhance the job search experience. Jobgether aims to eliminate common frustrations in the job market, such as application black holes and recruiter ghosting, by offering a supportive and transparent approach to remote employment.

Similar Jobs:

Principal Site Reliability Engineer - Remote

Upwork

6 weeks ago

Join Upwork as a Principal Site Reliability Engineer to lead and innovate in SRE practices for a global team.

Worldwide
Full-time
DevOps / Sysadmin

Principal Site Reliability Engineer - Remote

Cribl

11 weeks ago

Join Cribl as a Principal Site Reliability Engineer to enhance observability and reliability in software systems.

USA
Full-time
DevOps / Sysadmin
$240,000 - $400,000/year

Principal Site Reliability Engineer - Remote

Groupon

24 weeks ago

Join Groupon as a Principal Site Reliability Engineer to enhance the reliability and scalability of mission-critical systems.

Worldwide
Full-time
DevOps / Sysadmin

Principal Site Reliability Engineer - Remote

Groupon

24 weeks ago

Join Groupon as a Principal Site Reliability Engineer to enhance the reliability and scalability of mission-critical systems.

Colombia
Full-time
DevOps / Sysadmin