About the job Senior Data Engineer (Kafka Streaming, Spark, Iceberg on Kubernetes)
Build and scale a next-generation real-time data platform with cutting-edge open-source technologies.
100% Remote | R100 000 – R110 000 per month
About Our Client
Our client is a rapidly growing technology-driven organization building high-performance data platforms to enable advanced analytics, AI, and business intelligence. The team operates at the forefront of real-time data processing and distributed systems, leveraging modern cloud-native infrastructure. They foster a culture of technical excellence, continuous learning, and collaboration across multidisciplinary engineering teams.
The Role: Senior Data Engineer
As a Senior Data Engineer, you will design, build, and optimize next-generation data pipelines and platforms. You'll lead the architecture and implementation of scalable, real-time data solutions using Kafka, Spark, and Apache Iceberg, deployed on Kubernetes. This is a hands-on, high-impact role within a forward-thinking data engineering team focused on performance, scalability, and innovation.
Key Responsibilities
- Design and implement scalable, highly available real-time data pipelines and architectures
- Build robust ETL and streaming pipelines using Apache Spark (Scala/Python) and Kafka Connect/Streams
- Develop and manage data lakes using Apache Iceberg with schema evolution and time travel capabilities
- Deploy and manage distributed data processing services on Kubernetes using containerization best practices
- Optimize performance and resource usage across Spark jobs, streaming apps, and Iceberg tables
- Define and uphold engineering best practices including testing, code standards, and CI/CD workflows
- Mentor junior engineers and contribute to building a high-performing data engineering team
About You
- 5+ years of experience in data engineering or related software engineering roles
- Advanced proficiency with Apache Spark (batch and streaming)
- In-depth experience with Apache Kafka (Connect, Streams, or ksqlDB)
- Hands-on experience with Apache Iceberg, including table evolution and performance tuning
- Skilled in Python (PySpark) or Scala
- Experience deploying and managing distributed systems on Kubernetes (Spark Operator is a plus)
- Solid understanding of data modeling and data warehousing concepts
- Advantageous: Experience with AWS, Azure, or GCP; familiarity with Flink or Trino
- Preferred: Bachelor's or Master's degree in Computer Science, Engineering, or a related field