M05 - Data Engineer

Scope of Work

Data Pipeline Development & Management

  • Design, build, and maintain robust data pipelines using AWS Glue
  • Implement ETL/ELT processes for data ingestion from multiple sources
  • Optimize data workflows for performance and scalability
  • Monitor and troubleshoot data pipeline failures and performance issues

Data Infrastructure & Engineering (with IT Department)

  • Manage and optimize Amazon Redshift data warehouse operations
  • Configure and maintain data storage solutions (Amazon S3, data lakes)
  • Implement data partitioning, indexing, and compression strategies
  • Support Infrastructure as Code (IaC) for data infrastructure deployment

CI/CD & DevOps for Data (with AWS partners and IT Department)

  • Develop and maintain CI/CD pipelines for data workflows using GitLab
  • Implement automated testing for data pipelines and data quality
  • Support version control and deployment strategies for data assets
  • Configure AWS Lambda functions for data processing automation

Monitoring & Support (with IT Department)

  • Set up monitoring and alerting for data pipeline health
  • Provide technical support for data-related issues
  • Collaborate with technical teams on data architecture requirements
  • Optimize query performance and database operations

Documentation & Reporting (with IT Department)

  • Document data pipeline architectures and technical specifications
  • Maintain runbooks and operational procedures
  • Conduct one-hour monthly progress meetings to report on system health
  • Track engineering tasks through SHIP-HATS Jira
  • Maintain technical documentation on SHIP-HATS Confluence

Required Skills & Experience

  • Strong background in data engineering and data pipeline development
  • Proficiency in SQL, Python, and shell scripting
  • Extensive experience with AWS data services (Redshift, S3, Glue, Lambda, CloudWatch)
  • Data warehouse design and optimization experience
  • Strong CI/CD pipeline knowledge (GitLab preferred)
  • Experience with Infrastructure as Code (IaC) tools (Terraform, CloudFormation)
  • Knowledge of data modeling and database design principles
  • Strong troubleshooting and performance optimization skills