About the job Senior Data Engineer

Responsibilities

  • Design, develop and optimize large-scale ETL pipelines with Apache Spark for both batch and streaming data workloads.
  • Build, schedule and monitor data workflows in Apache Airflow, authoring DAGs that automate ingestion, cleaning, pseudonymization, transformation and loading processes.
  • Lead the architecture, deployment and tuning of cloud-based data platforms on AWS and Azure, including Amazon Redshift and Snowflake data warehouses.
  • Implement modular, version-controlled in-warehouse transformations using DBT, with full lineage tracking, testing and documentation.
  • Configure and manage Trino (Presto) for federated querying across Redshift, S3 and other sources to enable unified, real-time analytics.
  • Develop and maintain CI/CD pipelines for data engineering deliverables; containerize services using Docker and orchestrate deployments in Kubernetes.
  • Collaborate with data science teams to provision feature pipelines and data feeds for ML workflows (scikit-learn, XGBoost, TensorFlow) and oversee experiment tracking and model versioning via MLflow.
  • Partner with analytics and BI teams to deliver interactive dashboards in Power BI, Metabase or Apache Superset, and to implement enterprise reporting solutions in AWS QuickSight with role-based access controls.
  • Mentor junior engineers, establish DevOps and data engineering best practices, and drive cross-functional alignment on data strategy and delivery.

Requirements

  • Bachelor's degree in Computer Science, Engineering or a related discipline.
  • Minimum of 4 years' experience in a Data Engineer or equivalent role, with a proven record of delivering production-grade data platforms.
  • Advanced Python and SQL proficiency for pipeline development, data transformation and performance tuning.
  • Demonstrated expertise with Apache Spark, Apache Airflow, DBT and Trino (Presto).
  • Hands-on experience architecting and managing Amazon Redshift and/or Snowflake environments.
  • Familiarity with ML support libraries (scikit-learn, XGBoost, TensorFlow) and ML lifecycle management in MLflow.
  • Experience building and maintaining BI solutions in Power BI, Metabase, Apache Superset and AWS QuickSight.
  • Exceptional problem-solving, communication and collaboration skills, with a track record of mentoring peers and driving complex, cross-team initiatives.