Data Engineer
About the job
Hybrid position
Minimum requirements:
- Relevant 3-year tertiary qualification: BSc in Computer Science / Information Systems
- 2-4 years' senior data engineering experience
In-depth knowledge and understanding of:
- Azure Databricks
- Data governance and data management frameworks
- BI and Warehouse design
- Moving data into the cloud using tools such as Apache Kafka, Storm, or Flume for data ingestion, or the Amazon Web Services (AWS) Cloud Development Kit (CDK)
- Working with real-time streams, data warehouse queries, JSON, CSV, and raw data
- Scripting data pipelines for scheduled data movement
- Designing and developing ETL/ELT processes and data pipelines
- Experience working with Azure DevOps
- Data visualization, including Power BI / Tableau
- Exploratory Data Analysis (EDA)
- SQL
- Microsoft SQL Server
- NoSQL
- Python (PySpark)
- C# and JSON (calling APIs)
- PowerShell
- Apache Spark
- Kafka
- Scala
- Splunk
- ELK Stack
- Data Modelling
- Data warehousing solutions
- Data pipelines
- Data Cleansing
- Terraform
Responsibilities:
- Work as part of an agile data engineering team
- Ensure that data systems meet the company's regulatory commitments and that data is reliable and efficient
- Support business capabilities with high-performance database solutions
- Actively participate in team, cross-discipline, and vendor-driven collaboration sessions and forums, contributing to the relevant data governance frameworks to increase understanding of the working environment
- Partner with Technology, Business and other stakeholders to ensure that database structures, programmes and applications meet business delivery requirements
- Monitor adherence to processes which support the prescribed data architectural frameworks and ensure development/delivery teams align to the required standards and methodologies
- Design and implement scalable end-to-end database solutions including:
- Addressing data migration issues (validation, clean-up, and mapping) and consistently applying data dictionaries
- Data cleansing using Databricks
- Component design and development
- Identification and resolution of production and application development constraints
- Integration of new technologies and software into the existing landscape
- Developing data set processes for data modelling, mining, and production
- Optimal utilization of big data
- Documentation management and disaster recovery