Job Opening: Data Engineer

About the job

Job Type: Remote

Salary: 250K

In this role as a Data Engineer, you will:

  • Design and develop data architectures to standardize and integrate data sources for the Data Science department.
  • Implement and manage ETL pipelines using Airflow via AWS Managed Workflows for Apache Airflow (MWAA).
  • Develop and maintain data warehouses and data lakes utilizing technologies such as Snowflake, Amazon S3, and Redshift.
  • Collaborate with Data Scientists and Engineers to ensure data availability, quality, and accessibility for predictive ML models.
  • Establish and enhance CI/CD pipelines and contribute to DevOps practices to streamline integration and deployment processes.
  • Work alongside cross-functional teams, including a separate Data Engineering team within the company, to leverage organizational data initiatives.

HOW YOU'LL SUCCEED

Our ultimate goal is to smooth patient access to life-saving therapies. By centralizing and standardizing our data sources, you will enable our Data Science team to develop and deploy predictive models more efficiently and effectively. Your expertise in building robust data pipelines and ensuring high data quality will be crucial in delivering insights and products that accelerate pharmaceutical product development and improve patient outcomes. Your contributions will directly impact our ability to provide accurate, timely, and actionable intelligence to our clients.

WHAT IT TAKES

Essential Requirements

  • Degree at Master's level or higher in Computer Science, Engineering, or a related STEM field, or equivalent practical experience.
  • Strong experience with data warehousing technologies, including Snowflake, Amazon S3, and Redshift.
  • Proficiency in building and managing ETL pipelines using Airflow, particularly AWS Managed Workflows for Apache Airflow (MWAA).
  • Solid understanding of AWS services and cloud architecture.
  • Experience with DevOps practices and CI/CD tools, such as Jenkins, GitLab CI/CD, or AWS CodePipeline.
  • Strong programming skills in Python and SQL, with experience in writing efficient, maintainable, and scalable code.
  • Ability to work collaboratively with cross-functional teams and manage projects independently.
  • Excellent problem-solving and communication skills, with the ability to explain technical concepts to non-technical stakeholders.

Nice to have

  • Experience with data modeling and schema design for both relational and non-relational databases.
  • Knowledge of data security and compliance best practices, including GDPR and HIPAA regulations.
  • Familiarity with containerization technologies such as Docker and orchestration tools like Kubernetes.
  • Understanding of the pharmaceutical industry, particularly the stages of pharmaceutical product development.
  • Experience with additional orchestration tools and data integration patterns.
  • Knowledge of monitoring and logging tools, such as CloudWatch, ELK Stack, or Prometheus.

Desirable

  • AWS certifications.
  • Snowflake certifications.
  • Experience with data visualization tools such as Tableau, Power BI, or QuickSight.