RemoteOtter

Site Reliability Engineer - Remote

Posted 1 week ago
DevOps / Sysadmin
Full Time
India

Overview

Dremio is the unified lakehouse platform for self-service analytics and AI, serving hundreds of global enterprises, including Maersk, Amazon, Regeneron, NetApp, and S&P Global. Customers rely on Dremio for cloud, hybrid, and on-prem lakehouses to power their data mesh, data warehouse migration, data virtualization, and unified data access use cases. Based on open source technologies, including Apache Iceberg and Apache Arrow, Dremio provides an open lakehouse architecture enabling the fastest time to insight and platform flexibility at a fraction of the cost.

In Short

  • Drive continuous improvements to our usage of Kubernetes, our Operators, and the GitOps deployment paradigm.
  • Extend our networking, service mesh and Kubernetes systems to support connectivity between GCP, AWS and Azure.
  • Collaborate with Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, monitoring/alerting, capacity planning, production readiness and service reviews.
  • Help define and instrument service level indicators and objectives (SLIs/SLOs) with service owners in the Engineering teams.
  • Collaborate within our virtual Observability team to develop and improve observability of the Dremio Cloud product.
  • Debug and optimize code written by others and automate routine tasks.
  • Evangelize and advocate for resilience engineering and reliability practices.
  • Scale and evolve systems sustainably through automation.
  • Join an on-call rotation for systems and services.
  • Practice sustainable incident response and post-incident investigation and analysis.

Requirements

  • 3+ years of relevant experience in SRE, DevOps, Distributed Systems, Cloud Operations, or Software Engineering.
  • Familiarity with Kubernetes, Istio, Terraform, and ArgoCD/Flux.
  • Excellent command of cloud services on GCP/AWS/Azure and of CI/CD pipelines.
  • Moderate to advanced experience in Python/Go, and at least reading knowledge of Java.
  • Systematic problem-solving approach with strong communication skills.
  • Ability to debug and optimize code and automate routine tasks.

Benefits

  • Workplace Wednesdays to improve cross-team communication.
  • Lunch catering/meal credits provided in the office.
  • Hybrid work environment.

Dremio

Dremio is a leading unified lakehouse platform designed for self-service analytics and AI, catering to a diverse range of global enterprises such as Maersk, Amazon, and Regeneron. The company specializes in providing cloud, hybrid, and on-prem lakehouses that facilitate data mesh, data warehouse migration, and data virtualization. Leveraging open-source technologies like Apache Iceberg and Apache Arrow, Dremio offers an open lakehouse architecture that ensures rapid insights and platform flexibility at a competitive cost. Dremio is committed to high standards of communication, accountability, and respect among its employees, fostering a dynamic and innovative work environment.
