Data Engineer

Role Overview

This entry-level Data Engineer role involves writing and deploying web crawling scripts, using Scala Spark for data transformation, and Python for parsing entity references. The engineer will diagnose bugs, analyze datasets, and collaborate within an agile team to support data collection and standardization efforts. The position is part of the Data team, working closely with Software and Product teams to enhance data quality and inform feature development.

Perks & Benefits

The job is remote within the US, offering flexibility in work location. It provides opportunities for career growth at a fast-growing company recognized for its workplace culture, with collaboration across engineering teams and exposure to agile methodologies. Specific benefits are not detailed in the posting, though perks at comparable companies typically include competitive compensation, professional development, and a supportive, innovative environment.

Full Job Description

About Sayari:

Sayari is a risk intelligence provider that equips the public and private sectors with immediate visibility into complex commercial relationships by delivering the largest commercially available collection of corporate and trade data from over 250 jurisdictions worldwide. Sayari's solutions enable risk resilience, mission-critical investigations, and better economic decisions. Headquartered in Washington, D.C., its solutions are trusted by Fortune 500 companies, financial institutions, and government agencies, and are used globally by thousands of users in over 35 countries. Funded by world-class investors, with a strategic $228 million investment by TPG Inc. (NASDAQ: TPG) in 2024, Sayari has been recognized by the Inc. 5000 and the Deloitte Technology Fast 500 as one of the fastest growing private companies in the United States and was featured as one of Inc.’s “Best Workplaces” for 2025.

POSITION DESCRIPTION

Sayari is looking for an Entry-Level Data Engineer to join our Data team located in Washington, DC. The Data team is an integral part of our Engineering division and works closely with our Software & Product teams, as well as other key stakeholders across the business.

JOB RESPONSIBILITIES:

- Write and deploy crawling scripts to collect source data from the web
- Write and run data transformers in Scala Spark to standardize bulk data sets
- Write and run modules in Python to parse entity references and relationships from source data
- Diagnose and fix bugs reported by internal and external users
- Analyze and report on internal datasets to answer questions and inform feature work
- Work collaboratively on and across a team of engineers using basic agile principles
- Give and receive feedback through code reviews
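To illustrate the Python parsing responsibility above, here is a minimal sketch of extracting entity references and relationships from a source record. The record schema, field names, and the `parse_entities` helper are hypothetical illustrations only; real source layouts vary by jurisdiction and this is not Sayari's actual pipeline.

```python
import json

def parse_entities(record_json: str):
    """Parse one hypothetical corporate-registry record into
    entity references and officer-to-company relationships."""
    record = json.loads(record_json)
    company = record["company_name"]
    entities = [{"name": company, "type": "company"}]
    relationships = []
    for officer in record.get("officers", []):
        # Each officer becomes a person entity plus a relationship
        # edge back to the company.
        entities.append({"name": officer["name"], "type": "person"})
        relationships.append({
            "source": officer["name"],
            "target": company,
            "role": officer.get("role", "unknown"),
        })
    return entities, relationships

# Example with a made-up record:
sample = json.dumps({
    "company_name": "Acme Trading Ltd",
    "officers": [{"name": "J. Doe", "role": "director"}],
})
entities, rels = parse_entities(sample)
```

In practice, modules like this would feed standardized entities and relationships into downstream Scala Spark transformers for bulk processing.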

SKILLS & EXPERIENCE
