Customer Reliability Engineer, Infrastructure (Pacific)

Role Overview

This mid-level Customer Reliability Engineer role focuses on operating and maintaining the Astro platform to ensure high availability and reliability for customers. Day-to-day responsibilities include monitoring Kubernetes and cloud infrastructure across AWS, Azure, and GCP, troubleshooting issues, and working directly with customers to meet SLAs. The engineer will have a significant impact by improving platform reliability and building strong customer relationships.

Perks & Benefits

This role offers full remote work with flexible hours, requiring only a fixed 12PM-6PM Pacific time window Monday to Friday. Benefits include opportunities for career growth through exposure to various engineering disciplines, up to 20% time for side projects, and participation in 2-4 in-person events annually. The position also includes a competitive salary range of $140,000-$150,000, equity, and a supportive, distributed team environment.

Full Job Description

Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform powered by Apache Airflow®. Astro accelerates building reliable data products that unlock insights, unleash AI value, and power data-driven applications. Trusted by more than 800 of the world's leading enterprises, Astronomer lets businesses do more with their data. To learn more, visit www.astronomer.io.

About this role:

The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service. CREs operate, monitor, and maintain the platform so that customers' production environments are available, predictable, and reliable. As an infrastructure specialist within the team, you will become an expert on the reliability of Kubernetes and the underlying cloud infrastructure on all three major public clouds (AWS, Azure, and GCP). You will create strong relationships with customers and help them achieve their reliability goals.

When you learn a new piece of technology, do you aim not just to get started but to become the expert? Do you listen to the plumber when they tell you what was wrong with the pipes? Do you know how your router works? Are you the kind of person who takes an MIT OpenCourseWare course and actually finishes it? Then this role could be for you.

This position requires you to work from 12 PM to 6 PM US Pacific time, Monday to Friday. Your remaining work time is flexible.

What you get to do:

  • Learn and build expertise across several software engineering disciplines, including:

    • Kubernetes

    • Cloud engineering

    • Cloud networking

  • Gain exposure to the big picture; learn about product, engineering, customer relationship management, and more.

  • Spend up to 20% of your time on side projects that contribute to Astronomer’s overall success, such as contributing to the open-source Airflow repository or developing Astronomer’s internal monitoring and alerting systems built on Airflow.

  • Work on a modern, sophisticated, cloud-native product that customers use to connect to dozens of other systems. Gain depth and breadth of learning!

  • Work directly with our customers’ data engineers, system admins, DevOps teams, and management.

  • Provide feedback from your experience that can shape the direction of Astronomer’s products.

  • Own the customer experience, working directly with customers to prioritize and solve issues and meet SLAs.

  • Participate remotely within a fully distributed team, with approximately 2-4 in-person events per year.

  • Help maintain 24x7 coverage through a specified 6-hour pager period during your work day.

  • Participate in paid on-call rotation for weekend coverage.

What you bring to the role:

  • 4 years of professional experience

  • Experience with Kubernetes/Docker/Containers

  • Experience with any major cloud provider (AWS, GCP, Azure)

  • Motivation to learn

  • Commitment to excellence

  • Problem-solving and troubleshooting abilities

  • Willingness to identify and own problems through the full lifecycle, from vague problem to delivered solution

  • Strong written and verbal communication for connecting with our customers over our ticketing system and through Zoom

  • Demonstrable Linux familiarity

Bonus points if you have:

  • Previous experience working directly with customers (internal or external)

  • Experience with DevOps

  • Contributions to open-source projects

  • Experience with Splunk or Prometheus

The salary for this role is $140,000-$150,000, depending on experience level, along with an equity component.

At Astronomer, we value diversity. We are an equal opportunity employer: we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
