Multimodal AI PhD Intern (Summer 2026)


Role Overview

This PhD-level internship centers on cutting-edge research in multimodal deepfake detection across audio and vision modalities. You'll work with the AI team to propose new methods, implement and evaluate ideas in Python and PyTorch, and publish results at peer-reviewed venues. The role combines independent research with close collaboration, advancing the company's detection platform while contributing to academic publications.

Perks & Benefits

The internship is fully remote, with the option to work from the New York City HQ. It offers hands-on experience with a modern deep learning stack, including GPU-enabled cloud compute on AWS/GCP, and opportunities for career growth through publishing research and working alongside experienced AI researchers. The culture emphasizes innovation and mission-driven work in cybersecurity, with a focus on real-world impact against AI-generated media threats.

Full Job Description

Who we are.

Reality Defender is an award-winning cybersecurity company helping enterprises and governments detect deepfakes and AI-generated media. Using a patented multi-model approach, Reality Defender stays robust against the bleeding edge of generative platforms producing video, audio, imagery, and text. Reality Defender's API-first deepfake detection platform empowers teams and developers alike to identify fraud, disinformation campaigns, and harmful deepfakes in real time.

Backed by world-class investors including DCVC, Illuminate Financial, Y Combinator, Booz Allen Hamilton, IBM, Accenture, Rackhouse, and Argon VC, Reality Defender works with leading enterprise clients, financial institutions, and governments to ensure AI-generated media is not used for malicious purposes.

YouTube: Reality Defender Wins RSA Most Innovative Startup

The Multimodal AI Internship.

This 4-month internship pairs current PhD students with Reality Defender's AI team to conduct cutting-edge research and publish peer-reviewed papers. Your primary collaborators will be Surya Koppisetti and Yi Zhu, who will guide and advise your work on multi-modal deepfake detection. The internship can be performed remotely, though you're welcome to work from our HQ in New York City.

What you'll do.

  • Investigate and propose new methods for detecting generative multi-modal content, spanning audio and vision modalities.

  • Perform research on multi-modal deepfake detection and reasoning tasks.

  • Collaborate with researchers on the team.

  • Write up research results for internal reports and for submission to academic journals and workshops.

  • Independently implement and evaluate ideas on a modern deep learning stack: Python, PyTorch, and GPU-enabled cloud compute such as AWS/GCP.

Who you are.

  • PhD student in a relevant technical field, preferably three or more years into the program.

  • Experience in multi-modal learning, such as audio-visual classification or audio-language reasoning.

  • Proficient in Python and in building deep learning models with PyTorch.

  • Published peer-reviewed research papers in reputable AI and speech venues, e.g., CVPR, NeurIPS, ACL, or Interspeech.

  • Excited about Reality Defender's mission to build a best-in-class and comprehensive deepfake and AI-generated content detection platform.

  • Available to start May 1st, 2026, for a minimum duration of 4 months.
