Senior Data Engineer / AI Data Platform Engineer (Spark / ETL / Cloud)
Location: Santo Domingo, Dominican Republic (Hybrid: on-site with remote flexibility)
Employment Type: Full-Time
Industry: AI / Software Engineering / Data Platforms
About the Role
We are looking for a Senior Data Engineer / AI Data Platform Engineer to design and build high-performance, scalable data pipelines that power AI/ML systems and data-driven applications.
This role goes beyond traditional ETL — it focuses on distributed computing, large-scale data processing, and data infrastructure for AI workloads using technologies like Apache Spark, Python, and cloud-native architectures.
You will play a key role in enabling machine learning pipelines, real-time data processing, and high-volume data systems.
Key Responsibilities
- Design and build scalable data pipelines for AI/ML workloads
- Develop distributed data processing systems using Apache Spark (batch & streaming)
- Optimize large-scale data transformations using Python and SQL
- Architect and maintain data platforms on AWS, Azure, or Google Cloud
- Implement parallel processing, partitioning strategies, and performance tuning
- Enable data ingestion pipelines for structured and unstructured data (logs, events, APIs)
- Collaborate with ML Engineers and Software Engineers to support AI models
- Ensure data reliability, observability, and system scalability
Technical Requirements (MUST HAVE)
- Strong experience with Apache Spark (core + performance optimization)
- Advanced SQL (analytical + optimization level)
- Strong programming in Python (data + performance oriented)
- Proven experience building ETL / ELT pipelines at scale
- Experience with cloud-native architectures (AWS, Azure, or GCP)
- Deep understanding of distributed systems and data processing at scale
- Experience handling large datasets (10M–1B+ records)
Nice to Have (Highly Valued in AI Market)
- Experience supporting ML pipelines (feature engineering, data prep)
- Familiarity with Spark Streaming / Kafka / real-time pipelines
- Experience with Databricks / Snowflake / BigQuery
- Knowledge of data lakehouse architectures
- Experience with containerization (Docker, Kubernetes)
- Exposure to MLOps workflows
What We Offer
- Competitive salary aligned with AI/Engineering market (USD-based)
- Flexible work model (Hybrid Remote)
- Opportunity to work on AI-driven systems and scalable platforms
- High-impact engineering environment (not BI / not reporting-focused)
Application Requirements (MANDATORY)
To be considered, candidates MUST submit:
- Updated Resume (CV)
- Updated LinkedIn profile link
- A short written response to the question "Why should you be considered for this role?", covering:
- Your experience with Apache Spark and distributed data systems
- A description of the most complex data pipeline or system you have built
- Your experience working with large-scale data or AI-related systems
- Why you are a strong fit for this position
- Professional Summary (3–5 lines): a concise summary of your experience as a Data Engineer / AI Data Engineer