AI Researcher — AI Architecture Research
Role Overview
This senior-level AI Researcher role involves designing and prototyping novel AI architectures (such as alternatives to Transformers and long-context models) through empirical studies, and publishing research at top ML conferences. You'll work on a small, technical team, collaborating closely with engineers to translate research into production systems, with high ownership of the research direction and direct impact on deployed models.
Perks & Benefits
The role offers remote work with a flexible setup, with collaboration expected across time zones in a fast-paced startup environment. Benefits include competitive compensation, meaningful equity, a clear path to publishing impactful work, and a tight feedback loop between research and real-world deployment, supported by a strong research culture with room for career growth.
Full Job Description
About the Role
We’re looking for an AI Researcher focused on AI architecture research to help design, analyze, and advance next-generation model architectures. You’ll work at the intersection of theory and production—publishing novel research while collaborating closely with engineers to turn ideas into real systems.
This role is ideal for someone who has published research papers and wants to see their work directly shape deployed models, not just benchmarks.
What You’ll Work On
Research and design novel AI architectures (e.g. alternatives to standard Transformer designs, long-context models, efficient sequence modeling, hybrid architectures)
Explore architectural improvements for scalability, efficiency, and stability
Prototype and evaluate new architectures through ablations, benchmarks, and empirical studies
Author and co-author research papers for top ML conferences and journals
Collaborate with engineering teams to translate research into training and inference systems
Stay current with state-of-the-art research and identify promising directions early
What We’re Looking For
Strong background in machine learning research, with a focus on model architecture
Publication record in ML/AI venues (e.g. NeurIPS, ICML, ICLR, COLM, ACL, EMNLP, arXiv)
Deep understanding of:
Neural network architectures
Sequence models and attention mechanisms
Training dynamics and optimization
Hands-on experience with PyTorch or JAX
Ability to reason rigorously, design clean experiments, and communicate results clearly
Comfortable working in a fast-moving startup environment
Nice to Have
Experience with non-Transformer architectures (e.g. RNN-based, state-space, hybrid models)
Work on long-context or memory-efficient models
Open-source research contributions
Experience bridging research and production systems
Background in efficient training or inference-aware architecture design
Why Join Us
High ownership over research direction and roadmap
Clear path to publishing impactful work
Tight feedback loop between research and real-world deployment
Small, highly technical team with strong research culture
Competitive compensation and meaningful equity