Senior Manager, Site Reliability Engineering - Remote

Posted 1 week ago
DevOps / Sysadmin
Full Time
EU

Overview

SEON is the leading fraud prevention system of record, catching fraud before it happens at any point across the customer journey. Trusted by over 5,000 global companies, we combine your company’s data with our proprietary real-time signals to deliver actionable fraud insights tailored to your business outcomes. We deliver the fastest time to value in the market through a single API call, enabling quick and seamless onboarding and integration. By analyzing billions of transactions, we’ve prevented $200 billion in fraudulent activities, showcasing why the world’s most innovative companies choose SEON.

The Site Reliability Engineering (SRE) team at SEON ensures the reliability, scalability, and efficiency of our products and services. The team provides Incident Response, Reliability Engineering consulting, and limited embedded SRE engagements.

We seek a highly experienced and motivated SRE Manager to lead a team of Site Reliability Engineers. You will play a crucial role in maintaining the reliability and efficiency of our products and services while coordinating with cross-functional teams across various geographical regions. You will have a proven track record of leading top-performing teams in complex, fast-paced environments and will excel at organizing and motivating a team amid rapid growth and change.

This role offers flexibility. It can be based in Budapest with a hybrid schedule or anywhere in the European Union with a remote setup, including occasional travel to our other offices.

In Short

  • Lead and grow a high-performing SRE team responsible for the reliability, performance, and scalability of production systems.
  • Own the incident management process, postmortems, and root cause analysis to improve system resilience.
  • Drive implementation of SLAs, SLOs, and error budgets across services to align operational goals with business objectives.
  • Champion the use of automation to reduce manual work and improve deployment and recovery times.
  • Collaborate with software engineering and platform engineering teams to ensure systems are designed for reliability and operational efficiency.
  • Oversee system monitoring, alerting, and observability efforts using tools like Prometheus, Grafana, Datadog, or similar.
  • Manage on-call rotations and ensure that documentation, runbooks, and playbooks are maintained.
  • Identify and drive continuous improvement in system architecture, capacity planning, and deployment strategies.
  • Ensure compliance with security, privacy, and regulatory requirements within the infrastructure.
  • Provide mentorship, performance reviews, and career development opportunities for SRE team members.
  • Communicate effectively with stakeholders at all levels, providing updates on team performance, project status, and incident resolutions.
  • Advocate for the SRE team within the broader organization, representing its needs and concerns.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
  • Proven success in leading high-performing SRE or DevOps teams in a large-scale, fast-paced environment.
  • Extensive experience running high-availability web services at a large scale, with comprehensive knowledge of cloud-native architectures and advanced networking concepts.
  • Strategic vision to balance immediate operational needs with long-term reliability and scalability objectives.
  • Outstanding communication and interpersonal skills, with the ability to build strong relationships with team members and stakeholders.
  • Strong technical background with hands-on experience in cloud computing, system architecture, automation, and monitoring.
  • Excellent problem-solving skills with a focus on root cause analysis and proactive improvements.
  • Exceptional organizational skills, with the ability to manage multiple priorities and projects simultaneously.
  • Experience with tools and technologies such as AWS, Kubernetes, Terraform, Prometheus, Grafana, Jenkins, and similar.

Nice to Have

  • Cloud Architect Certification in one of the public clouds (AWS, GCP, Azure).
  • Good knowledge of security controls for SOC 2 and ISO certifications.

Benefits

  • Flexible work arrangements.
  • Opportunity to work in a dynamic and innovative environment.
  • Continuous learning and development opportunities.
  • Competitive salary and benefits.

SEON Technologies

SEON Technologies is a rapidly growing company dedicated to creating innovative solutions for fraud prevention and financial crime defense. With a strong focus on data engineering, SEON provides an API-first platform that empowers leading digital service providers across various industries, including financial services and entertainment. The company boasts a diverse team of over 250 professionals across global offices in Austin, Budapest, London, and Jakarta. Recognized as the world's fastest-growing fraud prevention company, SEON is committed to making the internet a safer place for businesses and customers alike, while fostering a culture of collaboration and continuous learning.
