RemoteOtter

Senior AI Infrastructure Engineer - Remote

Posted 1 week ago
Software Development
Full Time
USA
$160,000 - $230,000/year

Overview

Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure.

As a Senior AI Infrastructure Engineer, you will play a key role in building the next-generation AI cloud platform: a highly available, global, blazing-fast cloud infrastructure that virtualizes cutting-edge ML hardware (GB200s/GB300s, BlueField DPUs) and provides state-of-the-art ML practitioners with self-serve AI cloud services, such as on-demand and managed Kubernetes and Slurm clusters. This platform serves both our internal SaaS products (inference, fine-tuning) and our external cloud customers, spanning dozens of data centers across the world.

In Short

  • Design, build, and maintain performant, secure, and highly available backend services/operators.
  • Design and build out the IaaS software layer for a new GB200 data center.
  • Work on a global multi-exabyte high-performance object store.
  • Build advanced observability stacks for customers.
  • Collaborate with technical and non-technical team members.
  • Strong fundamental software development skills required.
  • Experience with Kubernetes internals is a plus.
  • Experience with VMs/hypervisors is a plus.
  • Experience with high-performance compute, networking, and/or storage is a plus.
  • Experience with infrastructure automation tools is beneficial.

Requirements

  • 5+ years of professional software development experience.
  • Proficiency in at least one backend programming language (Golang desired).
  • Experience writing high-performance, well-tested, production-quality code.
  • Experience building and operating high-performance microservice architectures.
  • Excellent communication skills.
  • Deep experience with Kubernetes internals is a plus.
  • Strong systems knowledge across compute, networking, and storage.
  • Experience with infrastructure automation tools (Terraform, Ansible).
  • Experience with monitoring/observability stacks (Prometheus, Grafana).
  • Experience building IaaS or PaaS systems at scale is a plus.

Benefits

  • Competitive compensation.
  • Startup equity.
  • Health insurance.
  • Flexible remote work.
  • Opportunity to work on cutting-edge AI technologies.

Together AI

Together AI is a research-driven artificial intelligence company dedicated to fostering innovation through open and transparent AI systems. The company aims to significantly reduce the costs associated with modern AI by co-designing software, hardware, algorithms, and models. With a strong commitment to advancing the field, Together AI has contributed to notable open-source research, models, and datasets, and is behind significant technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. The team is composed of passionate researchers and engineers focused on building the next generation of AI infrastructure.
