Machine Learning Engineer — AI Architecture Research

Role Overview

This senior-level Machine Learning Engineer role involves designing and prototyping next-generation AI model architectures, with a focus on alternatives to Transformers and on long-context systems. You'll run architecture-level experiments, work with inference and systems engineers to make those architectures deployable, and contribute to internal research notes, benchmarks, and open-source efforts, directly influencing the technical direction of a Series-A startup.

Perks & Benefits

The role is fully remote, on a small, high-caliber team with fast feedback loops and direct impact on core model architecture. Benefits include competitive compensation, meaningful equity, and the opportunity to ship research into production in a research-driven, collaborative culture.

Full Job Description

About the Role

We’re looking for a Machine Learning Engineer focused on AI architecture research to help design, prototype, and validate next-generation model architectures. You’ll work at the intersection of research and production — turning new ideas into scalable, real-world systems.

This role is ideal for someone who enjoys questioning architectural assumptions, experimenting with novel model designs, and pushing beyond standard Transformer-style approaches.

What You’ll Work On

- Research and develop new neural network architectures (e.g. alternatives or extensions to Transformers, recurrent / hybrid models, long-context systems)
- Design and run architecture-level experiments (scaling laws, memory mechanisms, compute trade-offs)
- Prototype models end-to-end, from research code to training-ready implementations
- Collaborate with inference and systems engineers to ensure architectures are deployable and efficient
- Analyze model behavior, failure modes, and inductive biases
- Read, reproduce, and extend cutting-edge research papers
- Contribute to internal research notes, benchmarks, and open-source efforts (where applicable)

What We’re Looking For

- Strong background in machine learning fundamentals and deep learning
- Hands-on experience implementing model architectures from scratch
- Solid understanding of:
  - Attention mechanisms, RNNs, state-space models, or hybrid architectures
  - Training dynamics, scaling behavior, and optimization
  - Memory, latency, and compute constraints at the model level
- Comfortable working in PyTorch or JAX
- Ability to move fluidly between theory, experimentation, and engineering
- Clear communicator who can explain architectural trade-offs

Nice to Have

- Experience with non-Transformer architectures (RNN variants, SSMs, long-context models)
- Background in research-driven startups or open-source ML projects
- Experience with large-scale training or custom training loops
- Publications, preprints, or notable research contributions
- Familiarity with inference optimization and deployment constraints

Why Join

- Work on core model architecture, not just fine-tuning
- Direct influence on the technical direction of a Series-A company
- Small, high-caliber team with fast feedback loops
- Opportunity to ship research into production
- Competitive compensation and meaningful equity

Featherless AI

featherless.ai

Featherless AI specializes in developing lightweight and efficient artificial intelligence solutions tailored for resource-constrained environments. Their typical customers include tech startups, IoT device manufacturers, and enterprises seeking to integrate AI into mobile and edge computing applications. The company's main product is a suite of optimized AI models and tools that reduce computational overhead while maintaining high performance. As a fully remote organization, Featherless AI fosters a distributed work culture that emphasizes asynchronous communication and flexible scheduling to support a global team.

Industry: Artificial Intelligence

Fully remote

21 open positions

About this company (remote-wise)

Headquarters: Distributed / remote-first
Team style: Async-ish, remote-first

About the job

Posted on: Jan 22, 2026

Location: Remote

Skills: PyTorch, JAX, Neural Network Architecture Design, Attention Mechanisms, RNNs, State-Space Models, Large-Scale Training, Inference Optimization, Research Paper Reproduction
