Cloud Site Reliability Engineer (SRE) - Remote

Posted 72 weeks ago

DevOps / Sysadmin

Full Time

USA

Infrastructure as Code

Overview

BDR Solutions, LLC excels in delivering best-value services to U.S. Federal Civilian and Defense agencies, driving mission success with excellence and innovation. We specialize in modernizing government systems for health, social services, and disaster relief, enhancing veterans' lives. As a service-disabled veteran-owned, 8(a), HUBZone small business, our mission is to provide unparalleled support to veterans in all sectors. We are committed to creating a future where every veteran's well-being is prioritized, combining IT expertise with compassionate care. At BDR, we are known for reliable outcomes tailored to our clients' missions, ensuring our services positively impact veterans and all our clients.

We are seeking a Cloud SRE to join our growing team! This team works across the company, and with multiple cloud partners, to make using Smile Digital Health products simple for our customers. As part of the hosting operations team, the Cloud SRE will support the building, operating and automating of infrastructure services to deliver SaaS-based solutions on Azure/AWS. This role creates a bridge between development and operations by applying a software engineering mindset to system administration topics. The incumbent will divide their time between operations/on-call duties and administering systems and software which help increase site reliability and performance.

In Short

Collaborate with Security Operations teams to help define and implement best practices around Cloud Service Provider configuration for AWS, Azure and other cloud providers.
Develop, implement and coordinate a multi-tenant approach to service offerings such as databases, container platforms, authentication, certificates, and product registries.
Develop and maintain cost/utilization tracking and attribution processes for all Cloud Service Providers.
Create documentation around Cloud Service Provider offerings detailing use cases, best practices, and implementation details.
Develop and maintain technical relationships with our core Cloud Service Providers.
Implement and maintain a secure and scalable infrastructure platform for delivering Cloud Services applications.
Ensure that internal and external SLAs meet or exceed expectations, and that system-centric KPIs are continuously monitored and improved.
Create tools for automating deployment, monitoring and operations of the overall platform.
Participate in an on-call rotation to provide application support, incident management, and troubleshooting.
Provide ongoing maintenance and support of internal tools, and improve system health and reliability.

Requirements

Demonstrated expertise with cloud service providers and best practices for implementation and configuration, preferably managing Azure on behalf of multiple teams for a company that delivers SaaS products.
Experience with Kubernetes, OpenShift, Kafka, and the Elastic Stack.
Proven experience with security and compliance (SOC 2, HIPAA, ISO 27001) best practices and with implementing controls that support high-velocity software delivery teams.
Proficiency in Terraform, Ansible or Chef.
Expertise in troubleshooting, support escalation, on-call process optimization, and documenting knowledge.
Passion for Infrastructure as Code, automation, and developing solutions that help developers move quickly and safely.
Familiarity with infrastructure management and operations lifecycle concepts and ecosystem.
Experience operating and maintaining production systems in a Linux and public cloud environment.
Prior experience working with high-performance or distributed systems is valued, though we strive to hire at a variety of experience levels.
Working knowledge of industry best practices with regard to information security.
Previous experience building or maintaining a large-scale cloud service.
Proven ability to prioritize and track multiple projects in parallel.
Proven ability to be highly responsive and customer-focused.

Benefits

Military Veterans encouraged to apply.
Equal Opportunity Employer.
Consideration for employment without regard to race, color, religion, sex, age, national origin, marital status, disability, veteran status, sexual orientation, or genetic information.

BDR Solutions

BDR Solutions, LLC (BDR) is a dedicated partner to the U.S. Federal Government, focusing on enhancing the effectiveness of its operations and achieving its mission goals. The company specializes in understanding the unique needs of each client and seamlessly integrating solutions within various agencies to improve business and technical processes. BDR is committed to fostering a collaborative work environment and encourages military veterans to apply, reflecting its dedication to diversity and inclusion.
