ZoomInfo Bethesda, MD 20813
Responsibilities:
Architect and develop large-scale, distributed data processing pipelines using technologies like Apache Spark, Apache Beam, and Apache Airflow for orchestration.
Design and implement efficient data ingestion, transformation, and storage solutions for structured and unstructured data.
Partner closely with Engineering Leaders, Architects, and Product Managers to understand business requirements and provide technical solutions within a larger roadmap.
Build and optimize real-time and batch data processing systems, ensuring high availability, fault tolerance, and scalability.
Collaborate with data engineers, analysts, and scientists to understand business requirements and translate them into technical solutions.
Implement best practices for data governance, data quality, and data security across the entire data lifecycle.
Mentor and guide junior engineers, fostering a culture of continuous learning and knowledge sharing.
Stay up-to-date with the latest trends, technologies, and industry best practices in the big data and data engineering domains.
Participate in code reviews, design discussions, and technical decision-making processes.
Contribute to the development and maintenance of CI/CD pipelines, ensuring efficient and reliable deployments.
Collaborate with cross-functional teams to ensure the successful delivery of projects and initiatives.
Requirements:
Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
Minimum of 10 years of experience in backend software development, with a strong focus on data engineering and big data technologies.
Proven expertise in Apache Spark, Apache Beam, and Airflow, with a deep understanding of distributed computing and data processing frameworks.
Proficient in Java, Scala, and SQL, with the ability to write clean, maintainable, and efficient code.
Proven experience building enterprise-grade software in a cloud-native environment (GCP or AWS) using cloud services such as GCS/S3, Dataflow/Glue, Dataproc/EMR, Cloud Functions/Lambda, BigQuery/Athena, and Bigtable/DynamoDB.
Experience with cloud platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes).
Experience with stream and batch data processing technologies such as Kafka, Spark, Google BigQuery, Google Dataflow, and HBase.
Familiarity with designing CI/CD pipelines using Jenkins, GitHub Actions, or similar tools.
Experience with SQL, particularly query performance optimization.
Experience with graph and vector databases or processing frameworks.
Strong knowledge of data modeling, data warehousing, and data integration best practices.
Familiarity with streaming data processing, real-time analytics, and machine learning pipelines.
Excellent problem-solving, analytical, and critical thinking skills.
Strong communication and collaboration skills, with the ability to work effectively in a team environment.
Experience in mentoring and leading technical teams.
The US base salary range for this position is $175,000 to $210,000, plus variable compensation and benefits.
Actual compensation offered will be based on factors such as the candidate's work location, qualifications, skills, experience, and/or training. Your recruiter can share more information about the specific salary range for your desired work location during the hiring process.
We want our employees and their families to thrive. In addition to comprehensive benefits, we offer holistic mind, body, and lifestyle programs designed for overall well-being. Learn more about ZoomInfo benefits here.