RemoteOtter

Senior Site Reliability Engineer - Remote

Posted 6 weeks ago
DevOps / Sysadmin
Full Time
India

Overview

Dremio is the unified lakehouse platform for self-service analytics and AI, serving hundreds of global enterprises, including Maersk, Amazon, Regeneron, NetApp, and S&P Global. Customers rely on Dremio for cloud, hybrid, and on-prem lakehouses to power their data mesh, data warehouse migration, data virtualization, and unified data access use cases. Based on open source technologies, including Apache Iceberg and Apache Arrow, Dremio provides an open lakehouse architecture enabling the fastest time to insight and platform flexibility at a fraction of the cost.

In Short

  • Drive continuous improvements to our usage of Kubernetes, our Operators, and the GitOps deployment paradigm.
  • Extend our networking, service mesh and Kubernetes systems to support connectivity between GCP, AWS and Azure.
  • Collaborate with Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, monitoring/alerting, capacity planning, production readiness and service reviews.
  • Help define and instrument Service Level Indicators and Objectives (SLIs/SLOs) with service owners in the Engineering teams.
  • Collaborate within our virtual Observability team: develop and improve observability of the Dremio Cloud product.
  • Debug and optimize code written by others, and automate routine tasks.
  • Evangelize and advocate for resilience engineering and reliability practices across our organization.
  • Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
  • Join an on-call rotation for systems and services that the SRE team owns.
  • Practice sustainable incident response and post-incident investigation and analysis.

Requirements

  • 10+ years of relevant experience in SRE, DevOps, Distributed Systems, Cloud Operations, or Software Engineering.
  • Expertise in Kubernetes, Istio, Terraform, Terragrunt, ArgoCD/Flux.
  • Expertise with software-defined networking infrastructure.
  • Excellent command of cloud services on GCP/AWS/Azure and of CI/CD pipelines.
  • Moderate to advanced experience in Python/Go, and at least reading knowledge of Java.
  • Systematic problem-solving approach with strong communication skills.
  • Ability to debug and optimize code and automate routine tasks.
  • Solid background in software development and architecting resilient applications.

Benefits

  • Workplace Wednesdays to improve cross-team communication.
  • Hybrid work environment.
  • Lunch catering and meal credits provided in the office.
  • Local socials aligned with Workplace Wednesdays.

Dremio

Dremio is a leading unified lakehouse platform designed for self-service analytics and AI, catering to a diverse range of global enterprises such as Maersk, Amazon, and Regeneron. The company specializes in providing cloud, hybrid, and on-prem lakehouses that facilitate data mesh, data warehouse migration, and data virtualization. Leveraging open-source technologies like Apache Iceberg and Apache Arrow, Dremio offers an open lakehouse architecture that ensures rapid insights and platform flexibility at a competitive cost. Dremio is committed to high standards of communication, accountability, and respect among its employees, fostering a dynamic and innovative work environment.

Similar Jobs:

Senior Site Reliability Engineer - Remote

Airalo

7 weeks ago

Join Airalo as a Senior Site Reliability Engineer to develop and maintain reliable systems in a remote-first environment.

Worldwide
Full-time
DevOps / Sysadmin

Senior Site Reliability Engineer - Remote

Joinpaxos

7 weeks ago

Join Paxos as a Senior Site Reliability Engineer to enhance cloud infrastructure reliability and performance.

USA
Full-time
DevOps / Sysadmin
$157,254 - $185,005 USD/year

Senior Site Reliability Engineer - Remote

Point Wild

7 weeks ago

Join Point Wild as a Senior Site Reliability Engineer to maintain and enhance the reliability and performance of our systems.

Worldwide
Full-time
DevOps / Sysadmin

Senior Site Reliability Engineer - Remote

Modernizing Medicine

7 weeks ago

Join ModMed as a Senior Site Reliability Engineer to enhance cloud infrastructure and empower developers.

USA
Full-time
DevOps / Sysadmin

Senior Site Reliability Engineer - Remote

Modernizing Medicine

7 weeks ago

Join Modernizing Medicine as a Senior Site Reliability Engineer to enhance cloud infrastructure and mentor junior engineers.

India
Full-time
DevOps / Sysadmin