Sr. AI Research Scientist, AI Evaluation and Reliability

Role Overview

This senior-level role leads applied research on AI evaluation, reliability, and robustness, including designing methods to measure and mitigate reliability risks, for example through uncertainty estimation and hallucination detection. The hire will work within the AI Foundations team, partnering with engineering and product teams to translate research into production outcomes and influence technical direction across Upwork's AI systems. They will also mentor researchers and contribute to Upwork's external research presence through publications and community engagement.

Perks & Benefits

The role offers remote work flexibility for now, likely with expectations aligned with Toronto's time zone; it will require 3 days in office once Upwork's Toronto office opens. The hire is initially employed through a partner while working as part of Upwork's team, with access to its resources and culture. The role includes opportunities for career growth through mentorship and cross-functional collaboration, and a possible transition to direct employment with Upwork. The company emphasizes a diverse, inclusive, and growth-oriented environment with a focus on innovation in AI.

Full Job Description

Upwork Inc.’s (Nasdaq: UPWK) family of companies connects businesses with global, AI-enabled talent across every contingent work type including freelance, fractional, and payrolled. This portfolio includes the Upwork Marketplace, which connects businesses with on-demand access to highly skilled talent across the globe, and Lifted, which provides a purpose-built solution for enterprise organizations to source, contract, manage, and pay talent across the full spectrum of contingent work. From Fortune 100 enterprises to entrepreneurs, businesses rely on Upwork Inc. to find and hire expert talent, leverage AI-powered work solutions, and drive business transformation. With access to professionals spanning more than 10,000 skills across AI & machine learning, software development, sales & marketing, customer support, finance & accounting, and more, the Upwork family of companies enables businesses of all sizes to scale, innovate, and transform their workforces for the age of AI and beyond.

Since its founding, Upwork Inc. has facilitated more than $30 billion in total transactions and services as it fulfills its purpose to create opportunity in every era of work. Learn more about the Upwork Marketplace at Upwork.com and follow us on LinkedIn, Facebook, Instagram, TikTok, and X; and learn more about Lifted at Go-Lifted and follow on LinkedIn.

Sr. Lead AI Research Scientist, AI Evaluation and Reliability

The AI Foundations team leads core research and development across the training, evaluation, and deployment of AI systems that power Uma, Upwork’s flagship AI model, and other customer-facing generative AI capabilities. As a Sr. Lead AI Research Scientist focused on AI Evaluation and Reliability, you will drive high-impact research initiatives that improve the trustworthiness, robustness, and real-world performance of AI systems operating at marketplace scale.

At the Sr. Lead level, this role combines deep technical expertise with cross-functional leadership. You will identify and lead research efforts that address systemic reliability challenges, partner closely with engineering and product teams to translate research into production outcomes, and help shape how Upwork evaluates AI performance in real work scenarios. Your work will support AI systems embedded in retrieval-based workflows, agentic architectures, and human plus AI collaboration patterns, while contributing to Upwork’s broader AI research strategy and external presence.

Responsibilities:

Lead applied research initiatives focused on AI evaluation, reliability, and robustness, defining success metrics tied to customer impact and production readiness.
Design and validate methods to measure and mitigate AI reliability risks, including uncertainty estimation, hallucination detection, and identification of model failure modes.
Partner cross-functionally with engineering, data science, and product teams to integrate research outcomes into customer-facing AI systems and workflows.
Own research projects end to end, from problem framing and hypothesis development through experimentation, prototyping, and synthesis of results.
Influence technical direction across teams by surfacing insights, proposing scalable solutions, and aligning stakeholders on priorities and tradeoffs.
Mentor researchers and engineers through technical guidance, feedback, and collaborative leadership on shared initiatives.
Contribute to Upwork’s external research footprint through publications, presentations, and engagement with the broader AI research community.

What it takes to catch our eye:

Proven experience leading applied AI research that balances scientific rigor with real-world deployment constraints and business impact.
A strong record of research contribution through publications, internal innovation, or demonstrable influence on production AI systems.
Deep proficiency with Python and modern deep learning frameworks such as PyTorch, with hands-on experience evaluating and improving large-scale models.
An adaptive approach to integrating AI tools into research and development workflows to accelerate experimentation, improve evaluation quality, and share best practices with others.
A collaborative, growth-oriented mindset with the ability to mentor peers, communicate complex ideas clearly, and thrive in a fast-evolving, bottom-up environment.

Come change how the world works.

Upwork is establishing an operational hub in Toronto, Canada. The new office is expected to be fully operational by Q4 2026. This role will require 3 days in office once we have an office open.

This position will initially be employed through a partner to ensure a seamless hiring process while we establish the hub. Once the hub is established, there may be opportunities to transition to employment with Upwork, depending on business needs and other requirements. While employed by the partner, you’ll work as part of Upwork’s team, with access to our resources, culture, and growth opportunities.

Our partner will offer competitive benefits. When Upwork’s hub is established, we will be excited to offer employment and benefits directly as business needs require.

Upwork is committed to building a diverse, inclusive, and equitable workforce. Employment decisions are made without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, disability, or any other status protected by applicable law.

We use BrightHire, an AI-enabled tool, to record interviews and summarize interview transcripts. The tool allows the interviewer to focus on the discussion and does not score or evaluate candidates or make recommendations. The interview transcripts are reviewed, and decisions are only made by humans. Candidates who prefer not to have their interview recorded through BrightHire can opt out when the interview is scheduled.

To learn more about how Upwork processes and protects your personal information as part of the application process, please review our Global Job Applicant Privacy Notice and the Applicant Privacy Addendum (Canada).

Upwork

www.upwork.com

Upwork operates a global freelancing platform that connects businesses with independent professionals and agencies. The company serves a diverse range of clients from startups to Fortune 500 companies seeking talent for projects spanning web development, design, marketing, and administrative support. Their main service is a comprehensive online marketplace that facilitates hiring, collaboration, and payment processing for remote work engagements. As a remote-first organization, Upwork has built a distributed workforce model that enables employees to work from anywhere while maintaining strong virtual collaboration practices. This approach allows them to tap into global talent pools while providing flexibility for their team members. The platform has become essential for companies adapting to the growing trend of remote and flexible work arrangements.

Industry

Technology

Remote-first

121 open positions

About this company (remote-wise)

Headquarters: Santa Clara, California, USA
Typical working hours: Roughly US business hours

About the job

Posted on: May 5, 2026

Location: Toronto, Ontario, Canada

Skills

Python, PyTorch, AI Research, Deep Learning, Machine Learning, AI Evaluation, Uncertainty Estimation, Hallucination Detection, Cross-functional Leadership, Mentoring
