As a member of the Pfizer Analytics Lab team, a component of Pfizer's Business Technology organization, the Data Engineer will join a team of highly collaborative data scientists and engineers dedicated to leveraging data and advanced analytics to create a healthier world. This team member will contribute their dynamic perspective and knowledge to data engineering and advanced analytics, and will inspire colleagues and peers to develop and implement critical data-driven solutions within Pfizer's drug discovery efforts.
Specifically, this group is focused on developing a set of capabilities designed to enable highly efficient exploration, experimentation, and rapid hypothesis generation based on internal, public, and commercially available datasets, in continued support of Pfizer's data-driven, forward-thinking approach to data science.
Day-to-day, the Data Engineer will:
Build services and tooling around "scraping" databases, loading logs, fetching data from external stores or APIs
Automate data consumption from other source systems, files, etc.
Collaborate with other engineering, cloud infrastructure, security, and product management teams to understand requirements and develop highly scalable system designs and architecture
Integrate new data management technologies and software engineering tools into existing structures
Create custom software components and analytics applications
Employ a variety of languages and tools to marry systems together
Participate in assessing new technologies and identifying next-generation solution architectures
Develop efficient analytic pipelines that include components related to data acquisition, exploratory analysis, feature engineering, modeling, and interactive storytelling.
Share ownership of advancing the team's data engineering capabilities by implementing and executing state-of-the-art approaches
Co-develop reusable components that will serve as the foundation for a scalable approach to Pfizer's analytic maturation
Partner with other Business Technology teams to define and execute technology POCs using innovative technologies to advance Pfizer's analytic capabilities
Directly engage with key business stakeholders (Director level)
Provide informal leadership to project teams composed of Associate/Sr. Associate level colleagues
Bachelor's Degree in Computer Science, Operations Research, Physics, Applied Mathematics, or Statistics required
Advanced Degree in Computer Science, Operations Research, Physics, Applied Mathematics, Statistics, or a related field strongly preferred
5 years' experience working as a Data/ML Engineer
3 years' experience working with semi-structured and unstructured data
2 years' experience working in a cloud-based ecosystem, preferably Amazon Web Services
Ability to thrive in a fast-paced, multidisciplinary environment and to communicate effectively with a diverse audience
Ability to create technical examples, prototypes, and demonstrations based on rapidly changing data sets
Excellent written and verbal communication skills
Proven experience in at least two of the three following categories:
Data Science / Machine Learning
Expertise with general-purpose statistics/machine learning algorithms and at least one of the following sub-disciplines: Natural Language Processing, Deep Learning, Network Analysis.
Expertise with the implementation of algorithms within Python, R or Scala
Expertise with model tuning, validation and evaluation
Data Engineering
Expertise with SQL development, database administration and performance tuning
Expertise with data manipulation and extraction using modern programming languages and frameworks (Java, C++, C#, Python, Scala, Spark, etc.)
Experience with Unix/Linux development, package management, filesystems, and performance monitoring/troubleshooting
Experience with sourcing data from APIs; experience building APIs is a plus
Experience with a variety of data stores for unstructured and columnar data as well as traditional database systems, for example, Elasticsearch, MongoDB, Cassandra, HBase, MySQL, Postgres, and Vertica
Machine Learning Engineering
Experience building production implementations of data science and engineering pipelines
Experience building and running high-throughput real-time and batch data processing pipelines using Spark, Flink, Storm, Kafka, or equivalent technologies
EEO & Employment Eligibility
Pfizer is committed to equal opportunity in the terms and conditions of employment for all employees and job applicants without regard to race, color, religion, sex, sexual orientation, age, gender identity or gender expression, national origin, disability or veteran status. Pfizer also complies with all applicable national, state and local laws governing nondiscrimination in employment as well as work authorization and employment eligibility verification requirements of the Immigration and Nationality Act and IRCA. Pfizer is an E-Verify employer.
Pfizer reports payments and other transfers of value to health care providers as required by federal and state transparency laws and implementing regulations. These laws and regulations require Pfizer to provide government agencies with information such as a health care provider's name, address and the type of payments or other value received, generally for public disclosure. Subject to further legal review and statutory or regulatory clarification, which Pfizer intends to pursue, reimbursement of recruiting expenses for licensed physicians may constitute a reportable transfer of value under the federal transparency law commonly known as the Sunshine Act. Therefore, if you are a licensed physician who incurs recruiting expenses as a result of interviewing with Pfizer that we pay or reimburse, your name, address and the amount of payments made currently will be reported to the government. If you have questions regarding this matter, please do not hesitate to contact your Talent Acquisition representative.
Other Job Details:
Eligible for Employee Referral Bonus
This position can sit in La Jolla, CA; New York, NY; or Collegeville, PA
Pfizer is an equal opportunity employer and complies with all applicable equal employment opportunity legislation in each jurisdiction in which it operates.