Research Scientist / Engineer – Performance Optimization - Remote

Posted 71 weeks ago

Software Development

Full Time

CA, USA

$180,000 - $250,000/year

Performance Monitoring

Transformer Architectures

Profiling Tools

Compiler Optimization

Overview

The Performance Optimization team at Luma is dedicated to maximizing the efficiency and performance of our AI models. Working closely with both research and engineering teams, this group ensures that our cutting-edge multimodal models can be trained efficiently and deployed at scale while maintaining the highest quality standards.

In Short

Profile and optimize GPU/CPU/Accelerator code for maximum utilization and minimal latency
Write high-performance PyTorch, Triton, and CUDA code, resorting to custom PyTorch operations where necessary
Develop fused kernels and leverage tensor cores and modern hardware features for optimal hardware utilization on different hardware platforms
Optimize model architectures and implementations for distributed multi-node production deployment
Build performance monitoring and analysis tools and automation
Research and implement cutting-edge optimization techniques for transformer models

Requirements

Expert-level proficiency in Triton/CUDA programming and GPU optimization
Strong PyTorch skills
Experience with PyTorch kernel development and custom operations
Proficiency with profiling tools (NVIDIA Nsight, torch profiler, custom tooling)
Deep understanding of transformer architectures and attention mechanisms
(Preferred) Experience with compilers/exporters such as torch.compile, TensorRT, ONNX, XLA
(Preferred) Experience optimizing inference workloads for latency and throughput
(Preferred) Experience with Triton compiler and kernel fusion techniques
(Preferred) Knowledge of warp-level intrinsics and advanced CUDA optimization
(Preferred) Background in compiler optimization or hardware-software co-design

Benefits

Competitive equity packages in the form of stock options
Comprehensive benefits plan

Luma AI

Luma AI is dedicated to advancing the field of artificial intelligence through the development of multimodal systems that enhance human creativity and capabilities. The company believes that integrating various forms of data, particularly visual information, is essential for creating more intelligent and interactive AI systems. Luma AI focuses on training and scaling multimodal foundation models that can perceive, understand, and engage with the world, aiming to deliver high-performance AI solutions across diverse hardware platforms.

Similar Jobs:

Data Scientist - AI Performance Optimization - Remote

Blend360

47 weeks ago

Join Blend as a Data Scientist to optimize AI performance and tackle computational challenges.

USA

Full-time

Data Analysis

Performance Research Engineer - Remote

Eclipse

80 weeks ago

Research Engineer

Compilers

Runtimes

Language Implementation

Join Eclipse as a Research Engineer to enhance the performance of the fastest Ethereum Layer 2 execution environment.

Worldwide

Full-time

Software Development

$300,000 - $550,000/year

Operations Research / Optimization Engineer - Remote

SFR3

67 weeks ago

Operations Research

Optimization

Data Analysis

Join a boutique real estate investment fund as an Operations Research / Optimization Engineer, focusing on resource management and operational efficiency.

Full-time

Software Development
