AI Automation Test Lead

Singapore

About the job

We are seeking an AI Automation Test Lead to drive the quality strategy for our clients' AI/LLM-powered products. You will build and scale test automation frameworks that validate not just functionality but also model behavior, data quality, and non-deterministic outputs. This role combines deep test automation expertise with hands-on experience in AI/ML systems, and you will lead a team to ensure our AI features are reliable, safe, and performant in production.

Key Responsibilities

1. Test Strategy & Leadership

Define and own the end-to-end QA strategy for AI/LLM products, including chatbots, agents, RAG pipelines, recommendation systems, and CV/NLP models
Establish best practices for testing non-deterministic systems: prompt evaluation, hallucination detection, bias/safety testing, latency & cost regression
Lead, mentor, and grow a team of SDETs/automation engineers. Set quality gates for CI/CD and release readiness
Partner with Product, Data Science, and Engineering to shift-left quality and define acceptance criteria for AI features

2. AI Test Automation Architecture

Architect automation frameworks for AI systems: prompt regression suites, golden dataset evaluation, synthetic data generation, LLM-as-judge pipelines
Build tooling to test RAG quality: context relevance, grounding, citation accuracy, retrieval latency
Automate testing of model APIs, vector DBs, embedding pipelines, and fine-tuning workflows
Implement eval harnesses using frameworks like DeepEval, RAGAS, LangSmith, Promptfoo, or custom solutions

3. Data & Model Quality

Design tests for data pipelines feeding AI: schema validation, drift detection, feature consistency between training/serving
Own offline/online eval pipelines. Track metrics: accuracy, faithfulness, toxicity, P50/P95 latency, token cost
Build canary & shadow testing for model deployments. Define rollback criteria based on guardrail violations

4. Traditional + AI System Testing

Drive API, UI, and integration test automation for services hosting AI models
Run performance, load, and chaos tests for LLM inference endpoints and real-time features
Security testing for prompt injection, jailbreak, data leakage, and PII handling

5. Governance & Reporting

Create quality dashboards: model eval trends, defect leakage, flaky rate, coverage for AI scenarios
Drive root cause analysis for AI incidents. Feed learnings back into dataset curation and test design
Ensure compliance with AI safety, privacy, and regulatory requirements

Required Qualifications

8+ years in software QA/test automation, with 2+ years leading teams
3+ years hands-on testing AI/ML systems, LLMs, or data-intensive platforms
Strong coding in Python for test framework development. Java/Go is a plus
Experience with test automation tooling: Pytest, Playwright, Selenium, REST/GraphQL API testing, and CI/CD with GitHub Actions or Jenkins
Deep understanding of LLM/RAG concepts: prompts, embeddings, vector DBs, chunking, eval metrics
Hands-on with Flink/Spark, SQL, Hive for validating data pipelines
Experience with cloud + K8s: AWS/GCP, Docker, Kubernetes, model serving on GPU/CPU
Experience building eval pipelines using LangSmith, Langfuse, Weights & Biases, MLflow, or similar
Strong grasp of statistics for A/B testing, significance, and measuring non-deterministic systems

Preferred Qualifications

Prior experience testing multi-agent systems, tool use, function calling
Knowledge of red-teaming, AI safety evals, bias/fairness testing
Contributions to open-source AI eval or testing frameworks
Experience with Doris, ClickHouse, Elasticsearch, Druid for test data analysis
Background in FinTech, E-commerce, or Search domains with real-time requirements