Chenoa Information Services, New York City, NY 11251
Posted 2 months ago
Job Title: Data Engineer
Office Location: Bethlehem, PA
Work Location: This is a hybrid role, and the selected resource will be required to work onsite in the Bethlehem, PA office a minimum of three days per week. Local candidates only.
We are seeking an experienced Data Engineer to join our Enterprise Data and Analytics organization. You will play a key role in building and delivering best-in-class data and analytics solutions that create value and impact for the organization and our customers. As a member of the data engineering team, you will help develop and deliver high-quality Data Products backed by best-in-class engineering.
You will collaborate with analytics, business, and IT partners to enable these solutions.

You will:
• Architect, build, and maintain scalable, reliable data pipelines with robust data quality checks built in, for consumption by the analytics and BI layers.
• Design, develop, and implement low-latency, high-availability, performant data applications, and recommend and implement innovative engineering solutions.
• Design, develop, test, and debug code in Python, SQL, PySpark, and Bash according to Client standards.
• Design and implement a data quality framework and apply it to critical data pipelines to make the data layer robust and trustworthy for downstream consumers.
• Design and develop an orchestration layer for data pipelines written in SQL, Python, and PySpark.
• Apply and provide guidance on software engineering techniques such as design patterns, code refactoring, framework design, code reusability, code versioning, performance optimization, and continuous integration and deployment (CI/CD) to make the data analytics team robust and efficient.
• Perform all job functions consistent with Client policies and procedures, including those that govern the handling of PHI and PII.
• Work closely with various IT and business teams to understand system opportunities and constraints in order to make the fullest use of the Client Enterprise Data Infrastructure.
• Develop relationships with business team members by being proactive, showing a growing understanding of the business processes, and recommending innovative solutions.
• Communicate project output in terms of customer value, business objectives, and product opportunity.

You have:
• 5 years of experience and a bachelor's or master's degree in computer science, engineering, applied mathematics, or a related field.
• Extensive hands-on development experience in Python, SQL, and Bash.
• Extensive experience in performance optimization of data pipelines.
• Extensive hands-on experience working with cloud data warehouse and data lake platforms like Databricks, Redshift, or Snowflake.
• Familiarity with building and deploying scalable data pipelines and data solutions using Python, SQL, and PySpark.
• Extensive experience in all stages of software development and expertise in applying software engineering best practices.
• Extensive experience in developing an end-to-end orchestration layer for data pipelines using frameworks like Apache Airflow, Prefect, or Databricks Workflows.
• Familiarity with RESTful web services (REST APIs) for integrating with other services; API gateways like Apigee for securing web service endpoints; and data pipelines, concurrency, and parallelism.
• Experience in creating and configuring continuous integration/continuous deployment (CI/CD) pipelines to build and deploy applications in various environments, and in using DevOps best practices to migrate code to the Production environment.
• Ability to investigate and repair application defects regardless of component (front-end, business logic, middleware, or database) to improve code quality and consistency, reduce delays, and identify bottlenecks or gaps in the implementation.
• Ability to write unit tests in Python using a testing library such as pytest.

Additional Qualifications (nice to have):
• Experience in using and implementing data observability platforms like Monte Carlo Data, Metaplane, Soda, Bigeye, or similar products.
• Expertise in debugging issues in a cloud environment by monitoring logs on the VM or using AWS features like CloudWatch.
• Experience with a DevOps tech stack like Jenkins and Terraform.
• Experience with the concept of observability in software and with tools like Splunk, Zenoss, Datadog, or similar.
• Experience in developing and implementing a Data Quality framework, either home-grown or using open-source frameworks like Great Expectations, Soda, or Deequ.
• Ability to learn and adapt to new concepts and frameworks and to create proofs of concept using newer technologies.
• Ability to use agile methodology throughout the development lifecycle and to provide regular updates, escalating issues or delays in a timely manner.