RemoteOtter

Site Reliability Engineer (SRE) - GCP - Remote

Posted Yesterday
DevOps / Sysadmin
Full Time
Dominican Republic

Overview

We are seeking a Site Reliability Engineer (SRE) with deep expertise in monitoring, observability, and reliability engineering to support systems running across on-premises infrastructure and Google Cloud Platform (GCP).

In Short

  • Own and operate the monitoring and observability stack across on-prem and GCP environments
  • Design, build, and maintain Grafana dashboards for infrastructure, Kubernetes, and applications
  • Define, tune, and maintain alerts to ensure a high signal-to-noise ratio
  • Establish observability standards and best practices across teams
  • Improve visibility into system health, performance, and reliability
  • Apply SRE principles to improve availability, performance, and resilience
  • Participate in on-call rotations and SEV incident response
  • Support and monitor Kubernetes environments (GKE and on-prem clusters)
  • Provide L2/L3 application support coverage during high-severity incidents
  • Document all actions, findings, and resolutions in ServiceNow (SNOW)

Requirements

  • Deep expertise in monitoring and observability
  • Experience with Grafana and Kubernetes
  • Knowledge of SRE principles
  • Ability to define and track SLIs, SLOs, and error budgets
  • Experience in incident response and root cause analysis
  • Strong troubleshooting skills for application issues
  • Ability to collaborate with engineering teams
  • Experience with ServiceNow (SNOW)

Benefits

  • Opportunity to work with cutting-edge technologies
  • Collaborative team environment
  • Professional growth and development opportunities

Devsu

Devsu is a dynamic software development company that specializes in delivering world-class software products, particularly in the financial and banking sectors of Latin America. With a focus on agile methodologies, Devsu fosters a collaborative environment where talented engineers from Latin America and the United States work together on challenging projects. The company offers a stable long-term contract, continuous training, private health insurance, and flexible working hours, making it an attractive workplace for professionals seeking growth and innovation in software engineering.

Similar Jobs:

Site Reliability Engineer (GCP) - Remote

Stacktics

49 weeks ago

Join Stacktics Inc. as a Site Reliability Engineer (GCP) to design and maintain cloud infrastructure and enhance system performance.

Canada
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Sensedia

5 weeks ago

Join Sensedia as a Site Reliability Engineer to work remotely and ensure the integration of high-tech solutions in cloud environments.

Worldwide
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Invoca

16 weeks ago

Invoca is seeking a Site Reliability Engineer to enhance their cloud infrastructure and ensure system reliability for their AI-powered conversation intelligence platform.

USA, Canada
Full-time
DevOps / Sysadmin
$127,000.00 - $191,000.00/year

Site Reliability Engineer (SRE) - Remote

Mercor

21 weeks ago

Join Mercor as a Site Reliability Engineer to enhance the reliability and scalability of our platform.

USA
Full-time
DevOps / Sysadmin

Site Reliability Engineer (SRE) - Remote

Monks

21 weeks ago

We are seeking a Site Reliability Engineer (SRE) to enhance the operability and reliability of our Disaster Recovery as a Service (DRaaS) solution.

India
Contract
DevOps / Sysadmin