Kubernetes Administrator
Job Description:
Location: Remote
Contract Duration: 6 months
Contract details: B2B (PFA or SRL)
Role Overview
We are looking for a Senior Kubernetes Administrator to architect, operate, and continuously improve mission-critical Kubernetes platforms with strict availability and reliability requirements (99.95%+ uptime SLAs).
This role owns the end-to-end Kubernetes platform, from control-plane architecture and multi-region resilience to security, automation, observability, and operational excellence.
Key Responsibilities
Kubernetes Architecture & Operations
- Architect, operate, and scale highly available Kubernetes clusters meeting 99.95%+ uptime SLAs.
- Design and manage multi-region, multi-zone Kubernetes clusters, including:
  - zero-downtime upgrades
  - blue/green control-plane rollouts
  - surge-based node rotations
- Deeply understand and optimize Kubernetes internals, including:
  - API server performance design and tuning based on non-functional requirements (NFRs)
  - etcd clustering, backup, compaction, and disaster recovery
  - scheduler optimization and custom scheduling strategies
Advanced Troubleshooting & Incident Management
- Lead advanced troubleshooting of:
  - control-plane failures
  - node instability
  - networking and storage issues
  - complex, multi-layer platform incidents
- Act as a senior escalation point for critical production issues.
Platform, Networking & Storage
- Operate AKS or native Kubernetes platforms at enterprise scale.
- (Nice to have) Experience with bare-metal Kubernetes, including:
  - Cilium Load Balancer
  - BGP-based networking
  - node pool design
- Own stateful Kubernetes workloads, including:
  - StatefulSets
  - highly available CSI backends
  - storage performance tuning and I/O optimization
Infrastructure as Code & GitOps
- Lead Infrastructure as Code (IaC) and GitOps strategies using:
  - Terraform
  - Helm
  - Kustomize
- Implement declarative cluster configuration with Argo CD.
- Automate cluster creation, policy enforcement, compliance scanning, and drift remediation.
CI/CD & Automation
- Build and maintain enterprise-grade CI/CD pipelines with advanced security controls using:
  - Azure DevOps
  - GitHub Actions
- Integrate CI/CD pipelines with Kubernetes platforms for secure, reliable delivery.
Security & Identity
- Implement and enforce enterprise-grade Kubernetes security frameworks, including:
  - Zero Trust architectures
  - workload identity
  - mutual TLS
  - signed container images
- Enforce Pod Security Standards (PSS) and NetworkPolicies, including eBPF-based policy tracing.
- Integrate Kubernetes clusters with corporate identity systems (OIDC, Azure AD, IAM).
- Manage secrets lifecycle using:
  - HashiCorp Vault
  - Azure Key Vault
  - SOPS
- Implement admission control using:
  - OPA Gatekeeper
  - Kyverno
Observability & Reliability Engineering
- Own the full monitoring and observability stack, including:
  - Prometheus
  - Grafana
- Design and operate high-throughput logging pipelines using Loki.
- Implement distributed tracing and root-cause analysis with OpenTelemetry.
- Define, measure, and report:
  - Service Level Indicators (SLIs)
  - Service Level Objectives (SLOs)
  - error budgets
  - reliability dashboards
Required Skills & Experience
- Senior-level experience operating production Kubernetes platforms at enterprise scale
- Deep expertise in Kubernetes internals and control-plane architecture
- Strong hands-on experience with AKS or native Kubernetes
- Advanced knowledge of Kubernetes networking, storage, and security
- Proven experience with IaC, GitOps, and CI/CD automation
- Strong troubleshooting skills in complex, distributed systems