Senior Data Engineer (AWS & Confluent) - WFH/WAH/Remote
About the job
Job Title: Senior Data Engineer (AWS & Confluent Data/AI Projects)
Work Set-up: Remote/WFH (must be a Filipino citizen or a PH permanent resident)
Shift Schedule: 10 am-6 pm SGT/PHT
Required Qualifications:
- Bachelor's degree in Computer Science or a related quantitative field.
- At least 3 years of experience in data engineering, with a significant focus on cloud-based solutions.
- Strong expertise in AWS data services (S3, Glue, EMR, Redshift, Kinesis, Lambda, etc.).
- Extensive hands-on experience with Confluent Platform/Apache Kafka for building real-time data streaming applications.
- Proficiency in programming languages and frameworks such as Python (including PySpark), Scala, or Java.
- Expertise in SQL and experience with various database systems (relational and NoSQL).
- Solid understanding of data warehousing, data lakes, and data modeling concepts (star schema, snowflake schema, etc.).
- Experience with CI/CD pipelines and DevOps practices (Git, Terraform, Jenkins, Azure DevOps, or similar).
- Must have a working laptop.
Preferred Qualifications (Nice to Have):
- AWS Certifications (e.g., AWS Certified Data Analytics - Specialty, AWS Certified Solutions Architect - Associate/Professional).
- Experience with other streaming technologies (e.g., Flink).
- Knowledge of containerization technologies (Docker, Kubernetes).
- Familiarity with Data Mesh or Data Fabric concepts.
- Experience with data visualization tools (e.g., Tableau, Power BI, QuickSight).
- Understanding of MLOps principles and tools.
Responsibilities:
- Architect and Design Data Solutions: Lead the design and architecture of scalable, secure, and efficient data pipelines for both batch and real-time data processing on AWS. This includes data ingestion, transformation, storage, and consumption layers.
- Confluent Kafka Expertise: Design, implement, and optimize highly performant and reliable data streaming solutions using Confluent Platform (Kafka, ksqlDB, Kafka Connect, Schema Registry). Ensure efficient data flow for real-time analytics and AI applications.
- AWS Cloud Native Development: Develop and deploy data solutions leveraging a wide range of AWS services, including but not limited to:
- Data Storage: S3 (Data Lake), RDS, DynamoDB, Redshift, Lake Formation.
- Data Processing: Glue, EMR (Spark), Lambda, Kinesis, MSK (for Kafka integration).
- Orchestration: AWS Step Functions, Airflow (on EC2 or MWAA).
- Analytics & ML: Athena, QuickSight, SageMaker (for MLOps integration).