Forward Deployed Data Scientist (Healthcare)

Role Overview

This senior-level role involves leading solution architecture and end-to-end program management for healthcare datasets, translating customer requirements into structured plans, and overseeing QA and delivery of complex healthcare data. As a Forward Deployed Data Scientist, you'll work cross-functionally to source, validate, and deliver HIPAA-compliant datasets while maintaining customer-facing responsibilities to ensure alignment and adapt solutions as needs evolve. You'll directly impact the next generation of AI models by building datasets that power AI partnerships in the healthcare vertical.

Perks & Benefits

This is a fully remote position with a fast-moving, high-trust team culture that values velocity and impact. The company offers opportunities to operate at the cutting edge of multimodal data where human judgment meets machine intelligence, with career growth through building datasets that power next-generation AI models. While specific benefits aren't detailed, typical remote tech roles include competitive compensation, flexible work arrangements, and professional development opportunities in a startup environment.

Full Job Description

Company Overview:

We are building Protege to solve the biggest unmet need in AI — getting access to the right training data. The process today is time intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Solving AI’s data problem is a generational opportunity. We’re backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI — and in tech.

We’re a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.

Role Overview:

As a Forward Deployed Data Scientist (Healthcare Solutions Lead) in the Healthcare vertical, you will guide prospects and customers through the definition and delivery of healthcare datasets. Your job will be to understand what customers are building, identify the data that best fits their needs, and assemble and QA high-quality samples and final deliveries that meet their technical and conceptual specs. Along the way, you’ll ensure timelines and milestones are clearly communicated from the first stages of feasibility to the final data delivery.

What You Will Own

  • Lead solution architecture and deal design, translating customer requirements into a structured plan and driving it forward through internal project management, quality assurance, and cross-functional execution — while maintaining a customer-facing presence to ensure alignment and adapt solutions as needs evolve

  • Lead end-to-end program management from data specification and preparation through QA and delivery, ensuring cross-functional coordination and on-time execution

  • Work with Protege data partners to source cutting-edge healthcare data into the Protege ecosystem

  • Oversee the QA, packaging, and delivery of complex datasets (EHR, claims, radiology, pathology, unstructured text), ensuring HIPAA compliance in collaboration with privacy partners

Who You Are

  • Proven customer-facing experience: skilled at managing expectations, leading customer conversations, and delivering technical outcomes with clarity and confidence

  • Bring an analyst-first mindset to challenges. You are an expert in using SQL and Python to query data, construct complex patient cohorts, analyze data readiness for model training, validate clinical coverage, and support other customer-specific needs

  • Find satisfaction in bringing order to multiple simultaneous projects and masterfully juggling competing (and sometimes changing) priorities

  • Deep expertise in healthcare data modalities, including EHR, claims, radiology, pathology, and unstructured text

  • Familiarity with privacy-preserving techniques for healthcare data

  • Experience in healthcare AI, ML products, or enterprise data platforms

  • Prior startup experience

  • You treat those around you with kindness

Why Protege

  • Be the connective tissue between Protege’s platform, our data, and our customers

  • Build datasets that directly power the next generation of AI models

  • Operate at the cutting edge of multimodal data — where human judgment meets machine intelligence
