RemoteOtter

Member of Technical Staff - AI Infrastructure - Remote

Posted 59 weeks ago
Software Development
Full Time
USA

Overview

As a Member of Technical Staff at Fluidstack, you will design, develop, and maintain software solutions that power our AI infrastructure and enable our customers to run complex ML workloads efficiently at scale.

In Short

  • Developing and optimizing job scheduling systems to maximize GPU utilization and throughput for ML workloads
  • Building and improving software interfaces for cluster management that support PyTorch, JAX, and other ML frameworks
  • Creating monitoring and observability tools for tracking training progress, resource usage, and system performance
  • Implementing data pipeline optimizations to accelerate training and inference workflows
  • Designing and developing APIs and services to integrate with MLflow, Kubeflow, Weights & Biases, and other ML tooling
  • Writing libraries and utilities to simplify the deployment and management of distributed training jobs

Requirements

  • You have developed software for training or serving large-scale ML models (1000+ GPU scale)
  • You have optimized distributed training performance across multiple nodes and accelerators
  • You have implemented APIs and interfaces for ML platforms that prioritize developer experience
  • You have experience with orchestration systems like Kubernetes or SLURM in the context of large-scale ML workloads
  • You have built or contributed to ML infrastructure tools (e.g., Ray, Horovod, DeepSpeed), and have experience with ML experiment tracking and workflow systems (MLflow, Kubeflow, W&B)

Benefits

  • Competitive total compensation package (cash + equity).
  • Retirement or pension plan, in line with local norms.
  • Health, dental, and vision insurance.
  • Generous PTO policy, in line with local norms.
  • Fluidstack is remote-first but has offices in key locations. For all other locations, we provide access to WeWork.

FluidStack

Fluidstack is at the forefront of building infrastructure for advanced artificial intelligence, collaborating with leading AI labs, governments, and enterprises to provide high-speed computing solutions. The company is dedicated to making artificial general intelligence (AGI) a reality, driven by a team that values excellence and customer outcomes. Fluidstack's People team is committed to creating an exceptional work environment, focusing on systems and support that empower employees to tackle meaningful challenges. The organization emphasizes thoughtful leadership support, ensuring executives can concentrate on critical decisions while fostering a culture of collaboration and continuous improvement.
