
AI Systems Evaluation Engineer - Remote

Posted 1 week ago

Overview

At Trunk Tools, we are tackling the massive $13 trillion+ construction industry. We’re an exceptional team of serial entrepreneurs, brought together by our shared mission: automating construction. Our founding team (SpaceX, Stanford, MIT, Carta, etc.) has built and deployed software for 140k+ users in construction and millions of users beyond the construction space, and has worked on $2 billion+ of built-environment projects. We aren’t another out-of-touch tech startup: most of our team comes from construction.

In Short

  • Design and implement rigorous evaluation frameworks and performance metrics for AI systems.
  • Develop tools, dashboards, and processes for AI development lifecycle observability.
  • Collaborate cross-functionally to embed monitoring and testing methodologies.
  • Identify bottlenecks and propose solutions for AI components.
  • Stay updated on industry trends in LLMs and agent architectures.

Requirements

  • MS/PhD in Computer Science, Machine Learning, or related field.
  • 2+ years of experience evaluating AI/ML systems.
  • Hands-on experience with observability and analytics platforms.
  • Proficiency in Python and machine learning frameworks.
  • Knowledge of retrieval-augmented generation (RAG) and agent-based workflows.
  • Experience with synthetic data generation or test automation.
  • Strong problem-solving skills and collaborative mindset.

Benefits

  • Collaborative early-stage startup environment.
  • Competitive salary and stock option equity packages.
  • Three medical plans, including a 100% covered option.
  • 401(k).
  • Learning & Growth stipend.
  • Free lunch in the NYC and Austin offices.
  • Unlimited PTO for work-life balance.
  • In-person retreats throughout the year.

Similar Jobs:


AI Systems Engineer - Remote

Anthropic

3 weeks ago

Join Anthropic as an AI Systems Engineer to develop and optimize agentic systems for Claude.

Machine Learning
Large Language Models
Software Engineering
Prompt Engineering
Worldwide
Full-time
Software Development
$315,000 - $425,000 USD/year

AI Systems Engineer - Remote

Goodnotes

8 weeks ago

Join our team to develop cutting-edge AI systems for a leading digital paper solution.

AI Applications
LLM
Python
Java
Worldwide
Full-time
Software Development

Information Systems AI Engineer - Remote

Silverfort

2 weeks ago

Join Silverfort as an Information Systems AI Engineer to develop AI-driven automation solutions in a cutting-edge cybersecurity environment.

AI
Machine Learning
Automation
NLP
Israel
Full-time
Software Development

Sr. AI Systems Engineer - Remote

AiFi

9 weeks ago

Lead the development of advanced computer vision systems as a Sr. AI Systems Engineer.

Computer Vision
Python
RESTful APIs
OpenAPI
United Kingdom
Full-time
Software Development

Edge AI Systems Engineer - Remote

Janeasystems

8 weeks ago

Join Janea Systems as an Edge AI Systems Engineer to develop and deploy AI/ML models on Edge AI devices in a fully remote role.

PyTorch
TensorFlow Lite
ONNX
Python
Worldwide
Full-time
Software Development