Generate Biomedicines, Somerville, MA 02143
Posted 3 weeks ago
About Generate:Biomedicines
Generate:Biomedicines is a new kind of therapeutics company - existing at the intersection of machine learning, biological engineering, and medicine - pioneering Generative Biology to create breakthrough medicines where novel therapeutics are computationally generated, instead of being discovered. The Company has built a machine learning-powered biomedicines platform with the potential to generate new drugs across a wide range of biologic modalities. This platform represents a potentially fundamental shift in what is possible in the field of biotherapeutic development.
We pursue this audacious vision because we believe in the unique and revolutionary power of generative biology to radically transform the lives of billions, with an outsized opportunity for patients in need. We are seeking collaborative, relentless problem solvers who share our passion for impact to join us!
Generate:Biomedicines was founded in 2018 by Flagship Pioneering and has received nearly $700 million in funding, providing the resources to rapidly scale the organization. The Company has offices in Somerville and Andover, Massachusetts with over 300 employees.
The Role:
We are seeking a creative and motivated ML (Machine Learning) Ops Data Engineer to help us build a cutting-edge data platform that will empower Generate's machine learning research. As an integral member of the ML Ops group, you will play a key role in data warehousing, ETL, and optimizing data usage during model training across a diverse array of biological datasets. The successful candidate will collaborate closely with ML Scientists, Computational Biologists, and Informatics/IT Engineers to develop scalable data systems that rapidly advance our scientific programs. This role is based in our Somerville, MA office with flexibility for hybrid work.
Here's how you will contribute:
Assist with the design, implementation and maintenance of performant, scalable ETL pipelines
Expand and refine Generate's data warehousing capabilities
Engage with multidisciplinary research teams to develop and optimize data models tailored to accommodate diverse biological datasets
Support the management and improvement of the cloud infrastructure backing our data platform
Develop and integrate APIs to streamline data flow and support the automation of machine learning pipelines and data management tasks
Champion data engineering best practices, contributing to the development of, and adherence to, standards that enhance data quality, system reliability and workflow efficiency
The Ideal Candidate will have:
3+ years of experience working in a data engineer role
Bachelor's or Master's degree in computer science or a similar field
Extensive experience with major Cloud Service Providers (CSPs) such as AWS, Azure, and Google Cloud Platform (GCP), with a strong understanding of cloud-based solutions and infrastructure services
Advanced knowledge and understanding of data warehouse technology such as Redshift or BigQuery
Demonstrated ability in constructing large-scale ETL pipelines using popular frameworks such as Apache Airflow or Prefect
Proficiency in Python and strong object-oriented design skills coupled with a solid understanding of data structures and algorithms
Experience designing, deploying and managing database systems
Strong understanding of machine learning fundamentals
A strong interest in leveraging data engineering skills to unlock insights from complex biological datasets
Exceptional communication skills, with the ability to articulate complex data concepts in a way that is accessible and compelling to both technical and non-technical stakeholders
Generate:Biomedicines is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or veteran status.
COVID Safety:
Generate:Biomedicines enforces a mandatory vaccination policy for COVID-19. All employees must be fully vaccinated and have received a booster. The purpose of this policy is to safeguard the health of our employees, their families, and the community at large from infectious disease whose spread may be reduced by vaccination. The Company will make exceptions to this policy if required by applicable law and will consider requests for an exemption from this policy due to a medical reason, a sincerely held religious belief, or any other exemption that may be recognized by applicable law.
Recruitment & Staffing Agencies: Generate:Biomedicines does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Generate:Biomedicines or its employees is strictly prohibited unless contacted directly by the Company's internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Generate:Biomedicines and the Company will not owe any referral or other fees with respect thereto.