Big Data Engineer, IoMT Data Lake

Partners Healthcare System, Somerville, MA 02143

Posted 2 months ago

About Us:

As a not-for-profit organization, Partners HealthCare is committed to supporting patient care, research, teaching, and service to the community by leading innovation across our system. Founded by Brigham and Women's Hospital and Massachusetts General Hospital, Partners HealthCare supports a complete continuum of care including community and specialty hospitals, a managed care organization, a physician network, community health centers, home care and other health-related entities. Several of our hospitals are teaching affiliates of Harvard Medical School, and our system is a national leader in biomedical research.

We're focused on a people-first culture for our system's patients and our professional family. That's why we provide our employees with more ways to achieve their potential. Partners HealthCare is committed to aligning our employees' personal aspirations with projects that match their capabilities and creating a culture that empowers our managers to become trusted mentors. We support each member of our team to own their personal development, and we recognize success at every step.

Our employees use the Partners HealthCare values to govern decisions, actions and behaviors. These values guide how we get our work done: Patients, Affordability, Accountability & Service Commitment, Decisiveness, Innovation & Thoughtful Risk; and how we treat each other: Diversity & Inclusion, Integrity & Respect, Learning, Continuous Improvement & Personal Growth, Teamwork & Collaboration.

Principal Duties and Responsibilities:

  • Design, create, build, integrate, maintain and optimize multiple ETL data pipelines.

  • Aggregate and transform raw data from a variety of data sources (e.g., Microsoft SQL, Apache Hive, Apache HBase, Enterprise Data Warehouse, bedside monitors (HL7), EEG recordings (waveforms), web services, and others) to fulfill functional and non-functional requirements.

  • Design, create, optimize and maintain conceptual/physical data models, data catalogues and data architecture diagrams.

  • Participate actively in all facets of data lake development: business analysis, requirements gathering, functional and technical specification, infrastructure definition, data architecture design, development, implementation, testing, deployment, and support of new applications.

  • Create and maintain related documentation on Confluence including data models, dataflow diagrams, integration schemas, interoperability relationships, etc.

  • Use the Partners HealthCare values to govern decisions, actions and behaviors. These values guide how we get our work done: Patients, Affordability, Accountability & Service Commitment, Decisiveness, Innovation & Thoughtful Risk; and how we treat each other: Diversity & Inclusion, Integrity & Respect, Learning, Continuous Improvement & Personal Growth, Teamwork & Collaboration.

  • Perform other duties as assigned or required by the situation and circumstances.

Qualifications:

  • Bachelor's degree in Computer Science or another technical field. Master's degree strongly preferred.

  • 5 years of in-depth experience with Python and at least one more programming language (R, MATLAB, C, Java, Scala, Spark).

  • 5 years of experience designing, coding, testing and debugging multiple ETL integration interfaces of varying size and complexity.

  • 3 years of experience with schema design, data architecting and dimensional data modeling (star schema).

  • 3 years of experience writing complex SQL statements to extract data from data lakes.

  • Hands-on experience with Hadoop-based technologies for distributed near-real-time processing is highly desired.

  • Previous experience in the healthcare industry is highly desired.

Skills/Abilities/Competencies Required:

  • Proficiency in Python, shell scripting (Bash) and SQL is required.

  • Experience working with large volumes of structured, semi-structured & unstructured data.

  • Hands-on experience with Big Data frameworks/Hadoop-based technologies (Spark, Kafka, Hive, HBase, Sqoop, Ranger, HDFS) is required.

  • In-depth experience designing, developing, and implementing pipelines to perform various ETL, cleaning, integration and scrubbing tasks.

  • Hands-on experience creating and maintaining multidimensional data models, conceptual/physical data models, data catalogues and data architecture diagrams.

  • Strong understanding of object-oriented programming (OOP), DevOps principles, design patterns, CI/CD (GitLab CI, Jenkins), code version control (GitLab, Bitbucket), and industry software development best practices.

  • Exposure to the entire software development life cycle, supporting planned releases to different environments, such as QA, staging and production environments.

  • Experience working in an Agile/Scrum environment and familiarity with Jira and Confluence tools.

  • Strong verbal and written communication skills, including the ability to write clear technical documentation.

Highly desired:

  • Hands-on experience with near-real-time streaming data processing (Kafka, Storm, Apache NiFi) is a plus.

  • Experience with healthcare interoperability standards including HL7 messaging, DICOM or FHIR is a plus.

  • Experience with Azure, GCP, AWS or other cloud providers is a plus.

Working Conditions:

  • Standard office environment, with travel to hospital locations in the Boston metro area, including the data centers.

