Healthcare Data Engineer (AA - 11112025 - PTHDE)

About the job

Position: Healthcare Data Engineer

Number of hours: 20 hours/week
Schedule: 9 AM to 5 PM, UK time zone


Key Responsibilities:

  • ETL Pipeline Development: Design, implement, and manage scalable, reliable ETL/ELT pipelines that process healthcare data from multiple sources.

  • Data Integration: Extract and consolidate data from disparate sources, including electronic health records (EHRs), real-world datasets, pharmacy sell-out data, and disease-specific surveys.

  • Data Transformation & Cleansing: Cleanse, validate, and standardize raw healthcare data. Map data to standard medical terminologies (e.g., ICD-10, SNOMED CT, LOINC), remove duplicates, and resolve inconsistencies to ensure high data quality.

  • Cloud Management: Utilize AWS or Google Cloud services to build, deploy, and monitor data processing solutions and storage infrastructure.

  • Compliance and Security: Implement and maintain strict data security, privacy, and governance protocols to ensure compliance with regulations such as HIPAA and GDPR, including encryption, access control, and audit trails.

  • Collaboration: Work closely with data scientists, analysts, and business stakeholders to understand data requirements and ensure the data architecture supports advanced analytics and business intelligence needs.

Qualifications:

  • Proven experience as a Data Engineer, preferably within healthcare or life sciences.

  • Strong expertise in designing and managing ETL/ELT processes and data pipelines.

  • Hands-on experience with at least one major cloud platform:
    • AWS: Proficiency with services such as AWS Glue, Amazon S3, AWS Lambda, Amazon Redshift, and AWS HealthLake.

    • Google Cloud: Proficiency with services like Cloud Healthcare API, BigQuery, Dataflow, and Dataproc.

  • Solid understanding of healthcare data standards (e.g., HL7, FHIR, DICOM) and data interoperability challenges.

  • Proficiency in Python and PySpark, and experience with SQL.

  • Knowledge of data security best practices and experience implementing measures to protect sensitive health information.