RemoteOtter

Senior Site Reliability Engineer - Remote

Posted Yesterday
DevOps / Sysadmin
Full Time
USA

Overview

As a Senior Site Reliability Engineer at Trase Systems, you will build and maintain the resilient, scalable infrastructure that powers cutting-edge AI, ensuring the reliability and performance of complex distributed systems.

In Short

  • Design and maintain core infrastructure on cloud platforms.
  • Automate deployment and management of production environments.
  • Ensure system reliability, performance, and capacity.
  • Manage ML infrastructure and CI/CD pipelines.
  • Lead incident response and post-mortem reviews.
  • Implement observability solutions for system health.
  • Optimize resource utilization and capacity planning.
  • Champion SRE best practices across the organization.

Requirements

  • Proven experience in Site Reliability Engineering or DevOps.
  • Expertise in cloud infrastructure (GCP, AWS, Azure).
  • Proficiency in Infrastructure as Code tools (Terraform, Ansible).
  • Knowledge of Docker and Kubernetes.
  • Strong programming skills, especially in Python.
  • Experience with monitoring tools like Prometheus and Grafana.
  • Background in building CI/CD pipelines.
  • Excellent problem-solving and communication skills.
  • Bachelor's or Master's degree in Computer Science or a related field.

Benefits

  • 100% employer-paid health care for you and your family.
  • 14 weeks of paid maternity and paternity leave.
  • Unlimited PTO with management approval.
  • Opportunities for professional development and educational reimbursements.
  • Optional 401(k), FSA, and equity incentives.
  • Mental health benefits available.

Trase Systems

Trase Systems, co-founded in 2023 by Joe Laws and Grant Verstandig, is an innovative technology company focused on simplifying the adoption of artificial intelligence (AI) for enterprises. The company provides an end-to-end solution that empowers enterprise leaders to leverage AI's full potential while minimizing complexity and risks. Trase specializes in bridging the 'last mile' of AI adoption, enabling organizations to implement, manage, and optimize AI technologies effectively, driving efficiency and significant cost savings.
