Senior AI & Cloud Operations Engineer

Role Overview

This senior engineering role involves designing and maintaining backend services and automation layers for AI-driven systems on Google Cloud, focusing on anomaly detection, cost optimization, and model health checks. The engineer will lead minor feature development, collaborate with architects to address risks, and dedicate time to internal R&D to eliminate manual tasks, ensuring high-impact improvements and production excellence in a continuous engineering environment.

Perks & Benefits

The job is listed as fully remote and is based in Croatia. It offers career growth through strategic advisory work, internal R&D opportunities, and a culture emphasizing engineering excellence, ownership, and innovation in AI and cloud operations, with no traditional support queue.

Full Job Description

About The Role: We are reimagining what it means to run Data and AI platforms. At Datatonic, Managed Services isn't about "maintenance"; it’s about Continuous Engineering. We are looking for a foundational Senior Engineer to join us in Croatia to bridge the gap between high-level architecture and production excellence.

This is an engineering-first role. You won't be sitting in a traditional support queue; you will be designing the AI-driven systems that monitor, optimize, and evolve our clients’ GCP environments. You will play a central role in shaping our "XOps" (FinOps, MLOps, AIOps) strategy, building autonomous agents that ensure some of the world’s most sophisticated AI platforms stay reliable and cost-effective.

Key Responsibilities:

  • Technical Ownership: Design and maintain the backend services and automation layers that power our managed platforms.

  • AI-Augmented Operations: Build and deploy AI Agents and evaluation frameworks to automate anomaly detection, cost optimization, and model health checks.

  • Platform Evolution: Lead "Minor Feature" development, implementing high-impact improvements and architectural tweaks to keep client platforms modern.

  • The Strategic Advisor: Collaborate with architects and client stakeholders to identify "Day 2" risks and provide technical solutions before they impact the business.

  • Engineering Excellence: Apply the "80/20" rule, dedicating a portion of your time to internal R&D and building the automation tools that eliminate manual toil.

What We’re Looking For:

  • Experience: 5+ years of software engineering experience, with a proven track record of building and running production systems.

  • GCP Expertise: Deep technical knowledge of Google Cloud and Google Cloud tools (e.g., BigQuery, Dataflow, Dataproc, Dataplex, Composer, Vertex, Looker). You understand how to architect for performance and cost.

  • Tech Stack: Proficiency in Python, and a solid grasp of modern CI/CD, IaC, and containerization.

  • AI/ML Literacy: A strong interest in MLOps and the challenges of deploying and monitoring Large Language Models (LLMs).

  • Mindset: You are a versatile engineer who takes extreme ownership. You enjoy the "puzzle" of finding why a system is sub-optimal and building the code to fix it permanently.
