
Senior Machine Learning Engineer, Ads Training Platform - Remote

Posted Yesterday
Software Development
Full Time
Netherlands

Overview

Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and 101M+ daily active unique visitors, Reddit is one of the internet’s largest sources of information.

In Short

  • Design, build, and maintain large-scale distributed training infrastructure for Ads ML models.
  • Develop tools and frameworks on top of the Ray platform.
  • Build tools to debug, profile, and tune distributed training jobs for performance and reliability.
  • Integrate with object storage systems and improve data access patterns.
  • Collaborate with ML engineers to improve model training time and efficiency and to reduce GPU training costs.
  • Drive improvements in scheduling, state management, and fault tolerance within the training platform.

Requirements

  • 5+ years in infrastructure/platform engineering or large-scale distributed systems.
  • 2+ years of hands-on experience with the Ray platform.
  • Strong understanding of distributed computing principles.
  • Experience with distributed storage systems and large-scale data processing.
  • Proven ability to debug and profile distributed jobs.
  • Experience with deep learning frameworks (PyTorch, TensorFlow) is a big plus.
  • Bonus: model optimization for distributed training, Ads ML experience.

Benefits

  • Private pension plan with employer matching
  • 100% employer-sponsored group medical plan
  • Income Replacement Programs
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Reddit Global Days Off

Reddit

Reddit is a dynamic online platform that fosters community engagement and discussion across a wide range of topics. As a leading social news aggregation and discussion website, Reddit connects millions of users who share content and participate in conversations. The company is committed to understanding user needs and enhancing the user experience through innovative research and collaboration among cross-functional teams. Reddit values growth and is focused on driving product strategy through actionable insights, making it an exciting place for professionals passionate about user experience and research.

Similar Jobs:


Machine Learning Engineer, Ads Training Platform - Remote

Reddit

5 days ago

Join Reddit as a Machine Learning Engineer to enhance the Ads Training Platform by designing and maintaining large-scale distributed training infrastructure.

Worldwide
Full-time
Software Development
$185,800 - $260,100 USD

Senior Machine Learning Engineer, Platform - Remote

Coinbase

12 weeks ago

Join Coinbase as a Senior Machine Learning Engineer to enhance the platform's security and user experience through innovative machine learning solutions.

India
Full-time
Software Development

Senior Machine Learning Engineer - Platform - Remote

Gusto

12 weeks ago

Join Gusto as a Senior Machine Learning Engineer to develop and enhance ML infrastructure solutions.

Worldwide
Full-time
Software Development
$157,000 - $235,000/year

Senior Platform Engineer, Machine Learning - Remote

Fieldguide

50 weeks ago

Join Fieldguide as a Senior Platform Engineer, Machine Learning, to build and maintain infrastructure for ML solutions in a remote-first environment.

United States
Full-time
Software Development

Senior Staff Machine Learning Engineer - Platform - Remote

Coinbase

12 weeks ago

Join Coinbase as a Senior Staff Machine Learning Engineer to enhance infrastructure for the open financial system using cutting-edge machine learning technologies.

India
Full-time
Software Development