RemoteOtter

Pre-Training Data Engineer - Remote

Posted 2 days ago
Software Development
Full Time
Worldwide

Overview

As a Pre-Training Data Engineer, you will play a pivotal role in developing the data infrastructure that underpins Cohere’s advanced language models.

In Short

  • Design and build scalable data pipelines to ingest, clean, filter, and optimize diverse datasets.
  • Conduct data ablations to assess data quality and experiment with data mixtures.
  • Develop robust data modeling techniques for optimal training efficiency.
  • Research and implement innovative data curation methods.
  • Collaborate with cross-functional teams to ensure data pipelines meet model demands.

Requirements

  • Strong software engineering skills, with proficiency in Python.
  • Familiarity with data processing frameworks like Apache Spark or Pandas.
  • Experience with large-scale datasets.
  • Knowledge of data quality assessment techniques.
  • A passion for bridging research and engineering in AI.

Benefits

  • An open and inclusive culture and work environment.
  • Work closely with a team on cutting-edge AI research.
  • Weekly lunch stipend and in-office lunches.
  • Full health and dental benefits, including mental health support.
  • Remote-flexible work with offices in major cities.
  • 6 weeks of vacation.

Cohere

Cohere is a pioneering company dedicated to scaling intelligence to serve humanity through the development and deployment of advanced AI models. With a mission to enhance the capabilities of AI systems for developers and enterprises, Cohere focuses on creating transformative experiences in areas such as content generation, semantic search, and AI agents. The company prides itself on its diverse team of top-tier researchers, engineers, and designers who are committed to building high-quality products. Cohere fosters a culture of hard work, rapid innovation, and a strong emphasis on customer value, while also valuing inclusivity and diverse perspectives in the workplace.
