RemoteOtter

GPU Engineer - Remote

Posted 45 weeks ago
Software Development
Full Time
Worldwide

Overview

We are seeking an experienced GPU Engineer with a strong background in Python and large-scale model training. In this role you will design and implement improvements to our large-scale training infrastructure and help make the technical decisions that optimize the training performance and efficiency of our models.

Requirements

  • Strong engineering skills with fluency in Python and PyTorch or other acceleration libraries.
  • Experience writing and debugging low-level GPU code (CUDA, Triton) and debugging hardware errors.
  • Experience scaling up GPU jobs on large-scale compute clusters using Slurm or Kubernetes.
  • Preferred: knowledge of advanced filesystems, particularly Ceph, and their integration and optimization in large systems.
  • Preferred: proficiency in implementing robust monitoring systems for performance tracking and anomaly detection.

Benefits

  • An Elite Team: Collaborate with top-tier engineers, researchers, and operators from renowned organizations such as Google DeepMind and Facebook AI Research (FAIR), as well as successful startups, driving innovation in cutting-edge AI technology.
  • Massive Market Opportunity: Be part of a rapidly growing industry poised to transform multiple sectors globally, offering the chance to make a significant impact.
  • Mission-Driven Environment: Work alongside a collaborative, mission-focused team dedicated to advancing AI for meaningful applications.
  • Inclusive and Open Culture: Thrive in an open and inclusive work environment that values diverse perspectives and fosters creativity.
  • Generous Benefits: Enjoy 5 weeks of paid leave to recharge, comprehensive healthcare benefits including vision and dental, and additional perks that support your well-being.
  • Visa Support: We provide visa assistance, including H1B and OPT transfers, for US employees to ensure a smooth transition and support your career with us.

Reka

Reka is a globally distributed foundation model startup headquartered in the San Francisco Bay Area, California, dedicated to building useful multimodal artificial intelligence to empower organizations and businesses. With a remote-first approach, Reka brings together top talent from around the world, including contributors to significant AI breakthroughs over the past decade. The company fosters a collaborative, mission-driven environment focused on advancing AI for meaningful applications, while promoting an inclusive culture that values diverse perspectives. Reka offers generous benefits and is positioned in a rapidly growing industry with massive market opportunities.
