RemoteOtter

Senior AI Systems Engineer (LLM Inference & Infra Optimization) - Remote

Posted 1 week ago
Software Development
Full-time
Worldwide

Overview

At Sully.ai, we’re building cutting-edge AI-native infrastructure to power real-time, intelligent healthcare applications. Our team operates at the intersection of high-performance computing, ML systems, and cloud infrastructure — optimizing inference pipelines to support next-generation multimodal AI agents. We're looking for a deeply technical engineer who thrives at the systems level and loves building performant, scalable infrastructure.

In Short

  • Lead efforts in deploying and optimizing large language models on high-end GPU hardware.
  • Work across the stack from C++ and CUDA kernels to Python APIs.
  • Shape DevOps practices for scalable, multi-cloud deployments.
  • Develop and optimize inference pipelines using quantization and attention caching.
  • Build and maintain low-level modules in C++/CUDA/NCCL.
  • Manage multi-cloud environments using IaC frameworks like Pulumi or Terraform.
  • Design low-latency streaming and decision-support systems.
  • Build robust tooling and interfaces for other engineers.

Requirements

  • Proficiency in C++, CUDA, and Python.
  • Deep understanding of GPU architectures and large model serving techniques.
  • Hands-on experience with multi-cloud environments (GCP, AWS).
  • Familiarity with ML deployment frameworks (TensorRT, vLLM).
  • Comfort with DevOps workflows and distributed-systems debugging.
  • (Bonus) Experience with streaming embeddings or hybrid retrieval architectures.
  • (Bonus) Interest in building tools for broader engineering teams.

Benefits

  • Collaborate with a highly technical team solving hard problems in AI and healthcare.
  • Work with cutting-edge GPU infrastructure.
  • Be a foundational part of shaping AI-native infrastructure.
  • Help accelerate a meaningful product that improves clinician workflows.

Sully.ai

Sully.ai is a pioneering healthcare technology company dedicated to transforming the medical landscape by addressing the global shortage of physicians. With a mission to make 'one human, one doctor' a reality, Sully.ai is developing an AI-driven doctor that aims to provide high-quality healthcare accessible to everyone, anywhere, and anytime. By automating administrative tasks and minimizing misdiagnoses through advanced AI solutions, the company is committed to building the future of medicine. Sully.ai fosters an inclusive and innovative environment, welcoming diverse perspectives to drive creativity and enhance healthcare delivery.

Similar Jobs:

Senior AI Inference Engineer - Remote

Tether Operations Limited

5 weeks ago

Join Tether as a Senior AI Inference Engineer and work on pioneering AI solutions in a fully remote environment.

Worldwide
Full-time
Software Development

Senior AI Engineer – LLM & Agentic Systems - Remote

TraceGains

4 days ago

Join TraceGains as a Senior AI Engineer to lead the development of innovative AI solutions leveraging large language models.

Worldwide
Full-time
Software Development

Senior Query Optimization Engineer - Remote

MongoDB

22 weeks ago

Join MongoDB as a Senior Query Optimization Engineer to innovate and enhance the query optimization system for a leading distributed database.

USA
Full-time
Software Development
$168,000 - $330,000 USD/year

Senior / Staff ML Optimization Engineer - Remote

Waabi

12 weeks ago

Join Waabi as a Senior / Staff ML Optimization Engineer to advance self-driving technology through innovative AI solutions.

Worldwide
Full-time
Software Development