Sr AI Research Scientist, AI Evaluation and Reliability

Role Overview

This senior-level role leads applied research initiatives focused on AI evaluation, reliability, and robustness, including designing methods to measure and mitigate risks such as hallucination and other model failure modes. The hire will work within the AI Foundations team, partnering with engineering and product teams to translate research into production outcomes and influence technical direction across Upwork's AI systems. They will also mentor researchers and contribute to Upwork's external research presence through publications and community engagement.

Perks & Benefits

The role offers remote work flexibility, likely with expectations aligned with Toronto's time zone. The hire is initially employed through a partner while working with access to Upwork's resources and culture. The role includes opportunities for career growth through mentorship and cross-functional collaboration, and a possible transition to direct employment with Upwork. The company emphasizes a diverse, inclusive, and growth-oriented environment with a focus on innovation in AI.

Full Job Description

Upwork Inc.’s (Nasdaq: UPWK) family of companies connects businesses with global, AI-enabled talent across every contingent work type including freelance, fractional, and payrolled. This portfolio includes the Upwork Marketplace, which connects businesses with on-demand access to highly skilled talent across the globe, and Lifted, which provides a purpose-built solution for enterprise organizations to source, contract, manage, and pay talent across the full spectrum of contingent work. From Fortune 100 enterprises to entrepreneurs, businesses rely on Upwork Inc. to find and hire expert talent, leverage AI-powered work solutions, and drive business transformation. With access to professionals spanning more than 10,000 skills across AI & machine learning, software development, sales & marketing, customer support, finance & accounting, and more, the Upwork family of companies enables businesses of all sizes to scale, innovate, and transform their workforces for the age of AI and beyond.

Since its founding, Upwork Inc. has facilitated more than $30 billion in total transactions and services as it fulfills its purpose to create opportunity in every era of work. Learn more about the Upwork Marketplace at Upwork.com and follow us on LinkedIn, Facebook, Instagram, TikTok, and X; and learn more about Lifted at Go-Lifted and follow on LinkedIn.

Sr. Lead AI Research Scientist, AI Evaluation and Reliability

The AI Foundations team leads core research and development across the training, evaluation, and deployment of AI systems that power Uma, Upwork’s flagship AI model, and other customer-facing generative AI capabilities. As a Sr. Lead AI Research Scientist focused on AI Evaluation and Reliability, you will drive high-impact research initiatives that improve the trustworthiness, robustness, and real-world performance of AI systems operating at marketplace scale.

At the Sr. Lead level, this role combines deep technical expertise with cross-functional leadership. You will identify and lead research efforts that address systemic reliability challenges, partner closely with engineering and product teams to translate research into production outcomes, and help shape how Upwork evaluates AI performance in real work scenarios. Your work will support AI systems embedded in retrieval-based workflows, agentic architectures, and human plus AI collaboration patterns, while contributing to Upwork’s broader AI research strategy and external presence.

Responsibilities:

  • Lead applied research initiatives focused on AI evaluation, reliability, and robustness, defining success metrics tied to customer impact and production readiness.

  • Design and validate methods to measure and mitigate AI reliability risks, including uncertainty estimation, hallucination detection, and identification of model failure modes.

  • Partner cross-functionally with engineering, data science, and product teams to integrate research outcomes into customer-facing AI systems and workflows.

  • Own research projects end to end, from problem framing and hypothesis development through experimentation, prototyping, and synthesis of results.

  • Influence technical direction across teams by surfacing insights, proposing scalable solutions, and aligning stakeholders on priorities and tradeoffs.

  • Mentor researchers and engineers through technical guidance, feedback, and collaborative leadership on shared initiatives.

  • Contribute to Upwork’s external research footprint through publications, presentations, and engagement with the broader AI research community.

What it takes to catch our eye:

  • Proven experience leading applied AI research that balances scientific rigor with real-world deployment constraints and business impact.

  • A strong record of research contribution through publications, internal innovation, or demonstrable influence on production AI systems.

  • Deep proficiency with Python and modern deep learning frameworks such as PyTorch, with hands-on experience evaluating and improving large-scale models.

  • An adaptive approach to integrating AI tools into research and development workflows to accelerate experimentation, improve evaluation quality, and share best practices with others.

  • A collaborative, growth-oriented mindset with the ability to mentor peers, communicate complex ideas clearly, and thrive in a fast-evolving, bottom-up environment.

This position will initially be employed through a partner to ensure a seamless hiring process while we establish the hub. Once the hub is established, there may be opportunities to transition to employment with Upwork depending on business needs and other requirements. While employed by the partner, you’ll work as part of Upwork’s team, with access to our resources, culture, and growth opportunities.

Upwork is an Equal Opportunity Employer committed to recruiting and retaining a diverse and inclusive workforce. We do not discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, or other legally protected characteristics under federal, state, or local law.

Please note that a criminal background check may be required once a conditional job offer is made. Qualified applicants with arrest or conviction records will be considered in accordance with applicable law, including the California Fair Chance Act and local Fair Chance ordinances. The Company is committed to conducting an individualized assessment and giving all individuals a fair opportunity to provide relevant information or context before making any final employment decision.

To learn more about how Upwork processes and protects your personal information as part of the application process, please review our Global Job Applicant Privacy Notice.
