Senior Site Reliability Engineer

Role Overview

This senior-level role leads the modernization of LeanData's AWS cloud infrastructure, with a focus on automation, reliability, and scalability. Day-to-day responsibilities include designing high-availability systems, implementing Infrastructure as Code with Terraform, optimizing performance, and improving observability with New Relic. The hire will strengthen the company's ability to scale by driving zero-downtime deployments and improving security and CI/CD pipelines.

Perks & Benefits

This is a hybrid role requiring two days per week (Monday and Wednesday) in the Santa Clara, CA office, with remote work on the other days. Benefits include coverage of up to 90% of employee insurance premiums, stock options, flexible PTO, and a 401(k) plan. The role reports to the SVP of Engineering and offers autonomy and leadership opportunities, with a focus on strategic problem-solving and collaboration in a high-velocity environment.

Full Job Description

LeanData helps the world’s fastest-growing companies automate, simplify, and accelerate revenue.

We are looking for a Senior Site Reliability Engineer to lead the strategic evolution of our cloud infrastructure. Reporting directly to the SVP of Engineering, this role is designed for a builder: someone who wants to move beyond maintenance and into architectural transformation.

You will have the autonomy to evaluate our existing AWS footprint and lead the charge in modernizing our environment. Your mission is to take a high-velocity system and implement the best practices, guardrails, and automated architectures that will support our next 10x of scale. You will be the primary authority on reliability, performance, and infrastructure security.

Please note: This is a hybrid role based in our Santa Clara, CA office, with an in-office schedule of two days per week – Monday and Wednesday.

Key Responsibilities

Architectural Modernization: Lead the design and implementation of a scalable, "Cloud-First" AWS architecture. You will drive the transition toward fully automated, state-of-the-art Infrastructure as Code (Terraform).
High Availability & Resilience: Design and implement robust Disaster Recovery (DR) and Business Continuity plans, moving our services toward a zero-downtime deployment model.
Performance & Capacity Engineering: Own the strategy for capacity planning and autoscaling. You will optimize our compute resources (EC2, Lambda) to handle bursty traffic patterns with precision and cost-efficiency.
Advanced Observability: Define our monitoring and alerting philosophy using New Relic for deep APM and system insights. Pair this with IncidentIO to ensure we catch and resolve issues before they impact customers.
Streamlined CI/CD: Partner with feature teams to refine Change Management and CI/CD pipelines, ensuring code moves from "commit" to "production" safely and predictably.
Cloud Security: Harden our network architecture and application security posture, including WAF management and secure service-to-service communication.

The Tech Stack

Cloud Infrastructure: AWS (EC2, Lambda, SQS, SNS, ALB, API Gateway, S3, WAF).
Observability & Incident Response: New Relic (APM/Infrastructure), IncidentIO.
Automation & Tools: Terraform, Redis/ElastiCache, Shell Scripting, NPM/PM2.
Application Ecosystem: Node.js, Python, C#, Angular, Apex.
Integration: Salesforce Managed Packages, Microsoft Dynamics 365.

Who You Are

Experienced Architect: 5+ years of experience in SRE, DevOps, or Systems Engineering, with a proven track record of managing complex AWS environments.
Proven Incident Commander: You demonstrate calm, decisive leadership during high-pressure outages. You have extensive experience running blameless postmortems and, crucially, driving the remediation work needed to prevent recurrence.
Observability Pro: You have deep experience configuring New Relic (or similar platforms) to create meaningful dashboards, SLIs, and SLOs.
Automation Advocate: You believe that manual intervention is a bug. You have deep experience with Terraform and a "Code-First" approach to infrastructure.
Strategic Problem Solver: You can look at a complex, "needs-based" architecture and formulate a clear, prioritized roadmap to move it toward industry best practices.
Collaborative Leader: You enjoy working with feature engineers to help them build "reliability-by-design" into their services.
Education: A Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent professional experience).

Why work at LeanData:

LeanData covers up to 90% of employee insurance premiums
Stock options in LeanData for all full-time employees
Flexible PTO
401(k) plan

LeanData

leandata.com

LeanData provides lead management solutions that help businesses optimize their sales processes. Their primary users are sales and marketing teams in organizations of various sizes looking to improve lead routing, tracking, and reporting. The company offers products such as lead-to-account matching and automated lead distribution, which enhance the efficiency of sales operations. LeanData promotes a remote-friendly work culture, allowing employees to collaborate effectively regardless of their location.
