Site Reliability Engineer - Remote

Posted 2 weeks ago

Netlify

DevOps / Sysadmin
Full Time
Worldwide
CAD$125,000 - CAD$175,000/year

Overview

Netlify’s SRE team is on a mission to scale Netlify’s infrastructure to support our next million users. We focus on ensuring application resiliency and delivering a robust compute and network platform at scale. As a Site Reliability Engineer within the Infrastructure SRE team, you’ll play a key role in designing, developing, and delivering solutions that enhance the scalability, availability, and efficiency of our platform.

In Short

  • Manage the full infrastructure lifecycle, from design to decommission, ensuring systems are reliable and efficient.
  • Participate in an on-call rotation for the compute platform and related systems.
  • Automate routine tasks and develop tools to improve system efficiency and reduce the time spent on manual intervention.
  • Conduct system performance tuning and troubleshooting, as well as capacity planning, to ensure system reliability and efficiency.
  • Participate in the creation and testing of disaster recovery plans.
  • Monitor and maintain observability systems to ensure issues are identified and resolved proactively.
  • Educate team members on security best practices and emerging threats.

Requirements

  • Several years of experience in SRE, DevOps, or related roles.
  • Proven experience working in hyperscale cloud environments.
  • Demonstrated ability to lead infrastructure projects.
  • Strong understanding of network protocols and configurations.
  • Experience with automation tools (e.g., Ansible, Terraform) and scripting languages (e.g., Python, Bash, Golang).
  • Experience automating component deployment across multiple environments using tools like Jenkins, CircleCI, or GitHub Actions.
  • Proficiency in observability and log analysis techniques to detect and resolve system issues.
  • Effective communication skills for both technical and non-technical stakeholders.
  • Familiarity with compliance requirements and frameworks: PCI, ISO 27001, HIPAA, SOC.

Benefits

  • Remote-first work culture.
  • Focus on work-life balance.
  • Commitment to diversity and inclusion.
  • Opportunities for professional growth.
  • Participation in equity plan.
