RemoteOtter

Site Reliability Engineer - Cloud - Remote

Posted 7 weeks ago
DevOps / Sysadmin
Full Time
USA
$136,000 - $212,750 USD/year

Overview

NVIDIA is seeking an outstanding Site Reliability Engineer to ensure the reliability and efficiency of its Digital Marketing Services, focusing on AWS Infrastructure and automation.

In Short

  • Ensure reliability and performance of Digital Marketing Services.
  • Lead and improve AWS Infrastructure.
  • Develop monitoring and alerting tools.
  • Automate deployment pipelines.
  • Debug and triage user-reported issues.
  • Onboard new applications onto AWS.
  • Implement monitors and SOPs for early issue detection.
  • Automate daily tasks with scripting.
  • Collaborate with internal and external stakeholders.
  • Work in East Coast time zones.

Requirements

  • MS or BS in Computer Science/Engineering or equivalent experience.
  • 5+ years of experience in live-site production environments.
  • Experience with Python/Java on Windows or Linux.
  • Strong knowledge of Kubernetes.
  • Experience with incident management processes.
  • On-call SRE experience.
  • Advanced scripting and development skills.
  • Strong problem-solving abilities.
  • Excellent communication and analytical skills.
  • Passion for technology and automation.

Benefits

  • Competitive salaries.
  • Generous benefits package.
  • Equity eligibility.
  • Diverse work environment.
  • Opportunity to work with leading technology professionals.
  • Autonomous and creative work culture.

NVIDIA USA

VN01 NVIDIA Vietnam Company Limited is a subsidiary of NVIDIA, a global leader in accelerated computing. The company focuses on pioneering technologies in AI and digital twins, transforming major industries and making a significant impact on society. With a commitment to innovation, NVIDIA Vietnam plays a crucial role in the manufacturing and engineering processes, ensuring high standards of manufacturability and production capabilities in a fast-paced environment. The team collaborates closely with global contract manufacturers and engineering teams to enhance production efficiency and drive continuous improvement.

Similar Jobs:

Cloud Site Reliability Engineer (SRE) - Remote

BDR Solutions

17 weeks ago

Join BDR Solutions as a Cloud Site Reliability Engineer, focusing on building and automating infrastructure services for SaaS solutions on Azure and AWS.

USA
Full-time
DevOps / Sysadmin

Cloud Site Reliability Engineer (SRE) - Remote

Ryzlabs

21 weeks ago

RYZ is looking for a Cloud SRE to enhance system resiliency and availability for self-driving robotic carriers.

Argentina, Uruguay
Full-time
DevOps / Sysadmin

Site Reliability Engineer - Cloud Team - Remote

MongoDB

20 weeks ago

Join MongoDB's Cloud Team as a Site Reliability Engineer to design and build global infrastructure for cloud services.

USA
Full-time
DevOps / Sysadmin
$127,000 - $249,000 USD/year
