Head of Observability
Job Description:
We're hiring a Head of Observability to lead the strategy, tooling, and delivery of observability across complex platforms and services. This is a senior leadership role that combines deep technical understanding with strong stakeholder management and the ability to drive enterprise-scale observability adoption. You'll define the vision, build high-performing teams, and partner closely with engineering, SRE, security, and product to embed observability into the heart of how systems are built and operated.
Key Responsibilities:
- Own the observability strategy and roadmap across logs, metrics, traces, and digital experience monitoring
- Lead the design and delivery of scalable observability platforms
- Build and manage a team of observability engineers and specialists
- Define SLIs/SLOs in collaboration with product and engineering to measure and improve system reliability
- Drive adoption of observability tooling and best practices across engineering teams
- Partner with stakeholders to enable proactive incident detection, root cause analysis, and performance tuning
- Establish governance, standards, and automation for consistent observability implementation across services
What You Need:
- Deep experience in observability, monitoring, or SRE leadership roles
- Strong technical knowledge of modern observability stacks and telemetry standards such as OpenTelemetry, ELK, Datadog, Grafana, Prometheus, and Splunk
- Proven experience leading teams and delivering observability at scale in cloud-native environments (AWS, Azure, or GCP)
- Solid understanding of DevOps, site reliability, or platform engineering principles
- Ability to translate business goals into actionable observability outcomes
- Excellent communication and leadership skills with a strategic mindset
What's on Offer:
- Competitive salary and annual bonus
- Flexible working arrangements (remote or hybrid)
- 30 days holiday, pension contributions, private healthcare
Required Skills:
Adoption, Root Cause Analysis, Collaboration, Analysis, Splunk, Azure, ROOT, Hiring, Salary, Stakeholder Management, Healthcare, AWS, DevOps, Metrics, Reliability, Automation, Strategy, Security, Design, Engineering, Business, Leadership, Communication, Management