Job Openings
Data Engineer
About the job
We are looking for a Data Engineer who has working knowledge of building and maintaining scalable data pipelines on-premises and in the cloud. This includes understanding the input and output data sources, managing upstream and downstream dependencies, and ensuring data quality. A key aspect of this role is deprecating migrated workflows and, where needed, migrating workflows into new systems. The ideal candidate is experienced with tools and technologies such as Git, Apache Airflow, Apache Spark, and SQL, and with data migration and data validation.

Key Responsibilities:

1. Workflow Deprecation
   o Plan and execute the deprecation of migrated workflows by evaluating current workflows' dependencies and consumption.
   o Utilize tools and best practices to identify, mark, and communicate deprecated workflows to stakeholders.

2. Data Migration
   o Plan and execute data migration tasks to move data between different storage systems or formats.
   o Ensure the accuracy and completeness of data during migration processes.
   o Implement strategies to accelerate the pace of data migration by backfilling, validating, and making new data assets ready for use.

3. Data Validation
   o Define and implement data validation rules to ensure data accuracy, completeness, and reliability.
   o Utilize data validation solutions and anomaly detection methods to monitor data quality.

4. Workflow Management
   o Use Apache Airflow to schedule, monitor, and automate data workflows.
   o Develop and manage DAGs (Directed Acyclic Graphs) in Airflow to orchestrate complex data processing tasks.

5. Data Processing
   o Develop and maintain data processing scripts using SQL and Apache Spark.
   o Optimize data processing for performance and efficiency.

6. Version Control
   o Use Git for version control, collaborating with the team to manage the codebase and track changes.
   o Ensure best practices in code quality and repository management.

7. Continuous Improvement
   o Keep up to date with the latest developments in data engineering and related technologies.
   o Continuously improve and refactor data pipelines, tooling, and processes to enhance performance and reliability.

Skills and Qualifications:

· Bachelor's degree in Computer Science, Engineering, or a related field.
· Proficiency in Git for version control and collaborative development.
· Proficiency in SQL and experience with database technologies.
· Experience with data pipeline tools such as Apache Airflow.
· Strong knowledge of Apache Spark for data processing and transformation.
· Experience with data migration and validation techniques.
· Knowledge of data governance and security practices.
· Strong problem-solving skills and the ability to work both independently and in a team.
· Ability to communicate with a global team.
· Ability to work as part of a team in a high-performing environment.