RemoteOtter

Observability/Monitoring Engineer - Remote

Posted 12 weeks ago

Overview

The Observability Platform team is building a state-of-the-art system for logging, monitoring, and tracing across cloud and on-prem data centers. We’re looking for an experienced Senior DevOps engineer to lead our logging and monitoring efforts, ensuring robust, scalable solutions within our Google Cloud Platform environment. In this role, you will help bring to life systems that give superpowers to an entire organization of software developers.

In Short

  • Lead the planning, execution, and management of our observability infrastructure, which processes trillions of observability events (logs, traces, metrics) daily.
  • Create and manage monitoring, logging, and alerting systems using technologies such as GrafanaLab, CaptainHook, Zabbix, Fluentd, Filebeat, ELK, Kafka, Prometheus, OpenTelemetry, and related tools.
  • Design and develop components of a highly scalable observability platform that handles trillions of observability events (logs, traces, metrics) per day.
  • Develop and maintain Kubernetes Helm charts that deploy hundreds of pods across nodes every day.
  • Collaborate closely with DevOps teams to deliver cloud solutions aligned with our observability platform.
  • Ensure high availability and performance of observability platforms and tools.
  • Design and develop end-to-end synthetic test monitoring solutions on GCP with self-service capabilities for engineering teams.
  • Participate in on-call rotations.

Requirements

  • Bachelor's degree in Computer Science or Engineering, or equivalent work experience.
  • 3+ years as a DevOps Engineer (or equivalent role), with a passion for technology and a strong sense of responsibility for high reliability and service levels.
  • Proficient in Kubernetes and containerization technologies (Docker, etc.).
  • Extensive experience with observability tools such as GrafanaLab, CaptainHook, Zabbix, Fluentd, ELK, Kafka, and Prometheus.
  • Familiarity with infrastructure as code (IaC) tools like Terraform, Ansible, or CloudFormation.
  • Experience with cloud platforms (AWS, Azure, GCP) and their compute, storage, and networking services; GCP preferred.
  • Strong programming skills in one or more languages (Bash, Python, Go, etc.).
  • The ideal candidate will have experience with OpenTelemetry Collector and Grafana Agent.

Benefits

  • Health: Medical, Dental and Vision
  • Time away: Vacation and Holidays
  • Development: Generous tuition reimbursement and access to internal professional development resources.
  • Equal opportunity employer
