Karachi, Pakistan

Data Engineer

Job Description:

We are looking for a skilled Data Engineer to design, build, and optimise our modern cloud data platform. You'll be responsible for architecting scalable, secure, and automated data pipelines and for enabling data-driven applications and AI/ML solutions across the organisation.

This is an exciting opportunity to shape the data foundation of a growing business, leveraging a modern AWS and engineering ecosystem.

Must Haves:

  • 3-5 years of experience in data engineering or a similar role working with large-scale, distributed data systems.
  • Strong proficiency in Python, SQL, and data modelling for analytical and transactional systems.
  • Deep understanding of modern AWS data and analytics services (e.g., S3, Glue, Lambda, Redshift, or equivalents).
  • Experience with Apache Spark / PySpark for batch and stream processing.
  • Solid knowledge of data governance, security, and compliance frameworks.
  • Familiarity with Infrastructure as Code (Terraform, CloudFormation) and CI/CD practices.
  • Understanding of AI/ML data workflows, including feature engineering and data preparation for LLMs or predictive models.
  • Strong problem-solving, collaboration, and communication skills.

Preferred (Nice to Have):

  • Knowledge of containerization (Docker, ECS, EKS) or serverless data architectures.
  • Understanding of AI/ML and NLP technologies, including integration of LLMs, automation frameworks, and RPA tools for intelligent data workflows.
  • Awareness of MLOps pipelines, model monitoring, and feature store management.

Responsibilities:

  • Design, build, and maintain large-scale, distributed, and event-driven data pipelines using modern AWS Data Engineering services (e.g., Glue, Lambda, Step Functions).
  • Develop ETL/ELT workflows integrating structured and unstructured data into Snowflake, Redshift, and S3-based data lakes.
  • Architect and optimise data ingestion, transformation, and storage layers for structured and unstructured data in a data lake or warehouse environment.
  • Implement and enforce data security, governance, and access control, including encryption, IAM, auditing, and compliance with best practices.
  • Develop and maintain metadata management, data lineage, and cataloguing frameworks to ensure traceability and consistency across datasets.
  • Automate infrastructure provisioning using Infrastructure-as-Code (IaC) tools such as Terraform or AWS CloudFormation.
  • Collaborate with analytics and AI/ML teams to deliver clean, high-quality, and feature-ready data for training and inference pipelines.
  • Integrate observability and monitoring into data workflows to ensure reliability, performance, and cost efficiency.
  • Contribute to CI/CD pipelines and Git-based version control for data workflows and infrastructure changes.
  • Mentor junior engineers and champion engineering best practices for data quality, reliability, and scalability.

Other Details:

  • Work Days: Monday-Friday (3 pm - 12 am)
  • Office location: Off Shahrah-e-Faisal, PECHS, Karachi

Required Skills:

ETL, Data Engineering, Pivot Tables, Data Quality Analysis, Pipelines, Web Services, AWS, Business Requirements, Programming Languages, Excel, Analytical Skills, Reliability, MS Excel, Programming, Computer Science, Data Analysis, Engineering, SQL, Python, Business, Science