RemoteOtter

Data Engineer for Generative Image and Video Models - Remote

Posted 22 weeks ago
Software Development
Full-time
Worldwide

Overview

Black Forest Labs is a cutting-edge startup pioneering generative image and video models. Our team, which invented Stable Diffusion, Stable Video Diffusion, and FLUX.1, is currently looking for a strong candidate to join us in developing large-scale data pipelines for training frontier models.

In Short

  • Develop and maintain scalable infrastructure for large-scale image and video data acquisition
  • Manage and coordinate data transfers from various licensing partners
  • Implement and deploy state-of-the-art ML models for data cleaning, processing, and preparation
  • Implement scalable and efficient tools to visualize, cluster, and deeply understand the data
  • Optimize and parallelize data processing workflows to handle billion-scale datasets efficiently
  • Ensure data quality, diversity, and proper annotation (including captioning) for training readiness
  • Convert training data from alternative sources, such as user preferences, into a trainable format
  • Work closely in the model development loop to update data as necessitated by the training trajectory

Requirements

  • Proficiency in Python and various file systems for data-intensive manipulation and analysis
  • Familiarity with cloud computing platforms (AWS, GCP, or Azure) and Slurm/HPC environments for distributed data processing
  • Experience with image and video processing libraries (e.g., OpenCV, FFmpeg)
  • Demonstrated ability to optimize and parallelize data processing workflows across CPUs and GPUs
  • Familiarity with data annotation and captioning processes for ML training datasets
  • Knowledge of machine learning techniques for data cleaning and preprocessing

Nice to Have

  • Background or keen interest in developing large-scale data acquisition systems
  • Experience with natural language processing for image/video captioning
  • Experience with data deduplication techniques at scale
  • Experience with big data processing frameworks (e.g., Apache Spark, Hadoop)
  • Understanding of ethical considerations in data collection and usage

Black Forest Labs

Black Forest Labs is an innovative startup at the forefront of generative image and video technology. Known for developing groundbreaking models such as Stable Diffusion and Stable Video Diffusion, the company is dedicated to creating advanced AI media solutions. With a focus on building intuitive user interfaces and enhancing user experiences, Black Forest Labs collaborates closely with machine learning researchers and engineers. The company operates from key hubs in San Francisco, Germany, and London, while also considering remote work arrangements. Their mission is to revolutionize the way users interact with AI-generated content.

Similar Jobs:

Researcher in Generative Image and Video Models - Remote

Black Forest Labs

25 weeks ago

Join Black Forest Labs as a researcher to develop and train generative image and video models.

Worldwide
Full-time
Software Development

Image and Video Synthesis AI Engineer - Remote

SpreeAI Corporation

65 weeks ago

Join our team as an Image and Video Synthesis AI Engineer to develop advanced AI technologies for image and video synthesis.

Worldwide
Full-time
Software Development

Generative AI Engineer - Remote

Tech Holding

1 week ago

Join Tech Holding as a Generative AI Engineer to develop innovative AI chat solutions.

Mexico
Full-time
Software Development

Generative AI Engineer - Remote

SoundHound AI

7 weeks ago

Join SoundHound AI as a Generative AI Engineer to develop cutting-edge conversational AI solutions.

Toronto, Canada
Full-time
Software Development

Generative AI Engineer - Remote

Clarity AI

7 weeks ago

Clarity AI is seeking a Generative AI Engineer to develop and enhance their AI platform, focusing on sustainability and innovative technologies.

Spain
Full-time
Software Development