Principal Data Engineer

Spotter, Inc Culver City, CA 90231

Posted 2 months ago

Overview:

Spotter is a platform for Creators, providing services and software designed to accelerate growth for the world's best Creators and brands. Creators working with Spotter can access the capital, knowledge, community, and personalized AI software products they need to succeed. With unique knowledge of how Creators work, the resources they need to grow, and the challenges they face, Spotter is empowering top YouTube Creators to succeed.

Spotter has already deployed over $940 million to YouTube Creators to reinvest in themselves and accelerate their growth, with plans to reach $1 billion in investment by 2024. With a premium catalog that spans over 725,000 videos, Spotter generates more than 88 billion monthly watch-time minutes, delivering a unique scaled media solution to Advertisers and Ad Agencies that is transparent, efficient, and 100% brand safe. For more information about Spotter, please visit https://spotter.com.

Role Overview:

The successful candidate will be responsible for processing huge data sets (billions of records) using distributed data processing frameworks such as Apache Spark.

Must have:

  • Extensive experience working with very large data sets, creating performant & scalable ETL pipelines

  • In-depth understanding of performance bottlenecks in large-scale data processing

What You'll Do:

Are you ready to help lead the charge in shaping the data-driven future of Spotter? We're in search of an exceptional Principal Data Engineer who will play a pivotal role in designing, building, and optimizing scalable data infrastructure. You will build data pipelines for acquiring and transforming large datasets, and optimize the storage and querying of varied data to support a wide range of use cases, from Analytics to Creator Products to Operations, using both traditional and ML-focused access patterns. You will be a key player in empowering us to make data-informed decisions that will fuel our innovation and growth.

  • Develop and maintain scalable data pipelines, including:

  • ETL pipelines, in both single-node and multi-node solutions

  • Build data quality assurance steps for new and existing pipelines

  • Create derived datasets with augmented properties

  • Work on analytics-ready datasets to power internal and Creator-facing tools

  • Troubleshoot issues when they arise, working directly with internal data consumers

  • Automate pipeline runs with scheduling and orchestration tools

  • Work with large-scale datasets

  • Use various external APIs to enhance data

  • Set up database tables for analytics users to consume the data collected by the Data Engineering team

  • Work with big data technologies to improve data availability and data quality in the cloud (AWS)

  • Lead development of projects involving other team members and act as a mentor

  • Actively participate in team discussions about technology, architecture, and solutions for new projects and for improving existing code and pipelines

Who You Are:

  • Bachelor's degree, preferably in Computer Science or Computer Information Systems

  • 6+ years of software engineering experience

  • 5+ years of data engineering experience with Apache Spark or Apache Flink

  • 4+ years of experience running software and services in the cloud

  • Proficiency in working with DataFrame APIs (Pandas and Spark) for parallel and single node processing

  • Proficiency with advanced language features and techniques in Python, Scala, etc., and with modern data-optimized file formats such as Parquet and Avro

  • Proficiency with SQL on RDBMS and data warehouse solutions like Redshift

  • Hands-on experience with Data Lake technologies like Delta Lake and Iceberg

  • Experience with large-scale, parallel data acquisition from external APIs

  • Experience supporting ML/AI projects: deploying pipelines that compute features and use models for inference on large datasets

Additional Valued Skills:

  • Experience with YouTube APIs

  • Experience with AWS Glue metastore

  • Experience with data mesh approaches

  • Experience with data cataloging, data lineage, and data governance tools and approaches

  • Experience with vector databases

Why Spotter:

  • Medical and vision insurance covered up to 100%

  • Dental insurance

  • 401(k) matching

  • Stock options

  • Complimentary gym access

  • Autonomy and upward mobility

  • Diverse, equitable, and inclusive culture, where your voice matters

In compliance with local law, we are disclosing the compensation, or a range thereof, for roles that will be performed in Culver City. Actual salaries will vary and may be above or below the range based on various factors including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The overall market range for roles in this area of Spotter is typically $100K-$500K salary per year. The range listed is just one component of Spotter's total compensation package for employees. Other rewards may include annual discretionary bonus and equity.

Spotter is an equal opportunity employer. Spotter does not discriminate in employment on the basis of race, religion, creed, color, national origin, ancestry, citizenship, physical or mental disability, medical condition, genetic characteristics or information, marital status, sex (including pregnancy, childbirth, breastfeeding, and related medical conditions), gender, gender identity, gender expression, age, sexual orientation, military status, veteran status, use of or request for family or medical leave, political affiliation, or any other status protected under applicable federal, state or local laws.

Equal access to programs, services and employment is available to all persons. Those applicants requiring reasonable accommodations as part of the application and/or interview process should notify a representative of the Human Resources Department.

