Senior DevOps Engineer
Role Overview
As a Senior DevOps Engineer focused on Site Reliability Engineering at SEON, you will maintain and improve the reliability, scalability, and performance of cloud infrastructure, working closely with cross-functional teams. Your day-to-day involves implementing SRE best practices, managing monitoring and alerting systems, automating tasks, and ensuring systems meet security and compliance requirements. This senior role impacts the company's ability to provide robust, scalable services for fraud prevention and AML compliance.
Perks & Benefits
This role offers remote work flexibility, with an option for a hybrid schedule based in Budapest, Hungary, implying some time zone considerations for collaboration. The company emphasizes a culture of continuous improvement, proactive problem-solving, and cross-functional teamwork, with opportunities for career growth through staying current with new technologies and industry trends. The posting does not itemize specific benefits; note that the role includes on-call support in a fast-growing environment.
Full Job Description
SEON is the command center for fraud prevention and AML compliance, helping thousands of companies worldwide stop fraud, reduce risk, and protect revenue. Powered by 900+ real-time, first-party data signals, SEON enriches customer profiles, flags suspicious behavior, and streamlines compliance workflows, all from one place. SEON provides richer data, more flexible and transparent analysis, and faster time to value than any other provider on the market. We’ve helped companies reduce fraud by 95% and achieve 32x ROI, and we’re growing fast, thanks to our partnerships with some of the world’s most ambitious digital brands like Revolut, Wise, and Bilt.
As a Senior DevOps Engineer focused on Site Reliability Engineering (SRE) at SEON, you will play a crucial role in maintaining and improving the reliability, scalability, and performance of our cloud infrastructure. You will work closely with cross-functional teams to ensure our systems are robust, scalable, and efficient.
This role offers flexibility and can be based in Budapest, Hungary, with a hybrid schedule.
WHAT YOU’LL DO:
Ensure the reliability, availability, and performance of our systems by implementing SRE best practices
Develop and maintain comprehensive monitoring and alerting systems using tools such as Prometheus, Grafana, and the ELK stack
Manage incident response and root cause analysis for production issues
Conduct postmortems to learn from failures and drive continuous improvement in the system’s reliability
Continuously monitor and optimize the performance of cloud infrastructure to ensure efficient resource utilization and cost-effectiveness
Automate routine tasks and processes to reduce manual intervention and increase efficiency
Analyze current system capacity and plan for future growth to ensure the infrastructure can scale with increasing demands
Define, measure, and monitor SLOs and SLIs to ensure that services meet their reliability targets
Work closely with engineering and product teams to provide feedback and suggestions on new architectures, ensuring they meet reliability and performance standards
Develop and maintain comprehensive documentation for architecture, infrastructure, and troubleshooting processes
Provide on-call support to ensure the continuous availability of our applications and infrastructure
Ensure that systems meet security and compliance requirements, performing regular audits and assessments based on the internal security team’s guidelines
Stay current with new technologies and industry trends, evaluating their potential impact on our infrastructure and reliability practices
WHAT YOU’LL BRING:
6+ years of experience as an SRE or DevOps engineer, or in a similar engineering role, with a focus on reliability principles and practices
Strong hands-on experience working with Kubernetes (AWS EKS preferred)
Strong hands-on expertise in Terraform
Extensive experience working in multi-region and multi-account AWS setups
Strong experience with monitoring and logging tools such as Prometheus, Grafana, Elasticsearch, and Kibana
Strong experience deploying, maintaining, and troubleshooting scalable distributed components in a microservice-based architecture
Experience researching, troubleshooting, and improving customer-critical requests related to latency, availability, and performance issues
Ability to quickly troubleshoot complex issues related to infrastructure
Proficiency with incident management tools such as PagerDuty or Opsgenie
Familiarity with CI pipelines and tools (GitHub Actions preferred)
Experience working with GitOps practices and CD tools (ArgoCD preferred)
A proactive approach to identifying and resolving issues independently with a strong problem-solving attitude
Excellent communication and collaboration skills to work effectively with cross-functional teams
SEON Technologies collects and processes personal data in accordance with applicable data protection laws. If you are a European Job Applicant, see the privacy notice for further details.
SEON is an equal opportunity employer. We strive to embrace what makes each one of us unique; we each have our own story. Whether looking at our current staff or future team members, we believe that everyone has something to contribute, and our employment practices reflect that. We do not make an employment decision based upon race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Please let your recruiter know if you need reasonable adjustments to our recruitment process.