RemoteOtter

Senior Site Reliability Engineer (SRE) - Remote

Posted 4 weeks ago

Overview

The Senior Site Reliability Engineer (SRE) is responsible for the availability, performance, monitoring, release engineering, and incident response of the platforms and services the company runs and owns, among other duties. SREs ensure that enterprise services meet the reliability and uptime targets set by their defined service levels. SREs focus on optimizing existing systems, building cloud infrastructure, and eliminating manual work through automation.

In Short

  • Analyze and troubleshoot large-scale distributed systems in the public cloud.
  • Scale systems sustainably through mechanisms like automation, and evolve them by pushing for changes that improve reliability and velocity.
  • Improve and maintain monitoring and logging solutions that measure the availability, latency, and overall health of production systems.
  • Provision and manage cloud infrastructure through automation and infrastructure as code.
  • Restore healthy operation of applications and services through sustainable incident response and blameless postmortems.
  • Follow and monitor security and compliance best practices.
  • Take a proactive approach to spotting problems, areas for improvement, and performance bottlenecks.

Requirements

  • Ability to program in one or more high-level languages, e.g., TypeScript or Python.
  • Experience with configuration management and infrastructure as code, e.g., CloudFormation or Ansible.
  • Experience with monitoring and alerting tools, e.g., AWS CloudWatch or New Relic.
  • Experience with incident management and on-call tooling, e.g., PagerDuty.
  • Ability to gather and analyze metrics to assist in performance tuning and fault finding.
  • Bachelor's degree from a four-year college or university, three to four years of related experience and/or training, or an equivalent combination of education and experience.
  • 3+ years of software engineering and/or IT operations and infrastructure experience preferred.

Benefits

  • Compensation competitive with the market and geographical location.
  • Meal allowance for each day worked, provided through a meal card.
  • Home/office allowance reimbursed each calendar month, pro-rated based on employment start date.
  • Health insurance: Tillster pays the premium for employee private health insurance.
  • Up to 14 federal and local/municipal holidays, in accordance with applicable Portuguese labour laws.
  • Up to 22 days of vacation every holiday year, pro-rated based on employment start date.
  • Education, Learning & Development: We offer Udemy courses and ongoing learning and development opportunities.
