G64 - Full Stack Engineer

About the job

Key Responsibilities

Site Reliability & Operations

  • Manage and improve the reliability, availability, and operational excellence of the SHIP-HATS platform
  • Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
  • Lead incident management, troubleshooting, root cause analysis, and post-mortem reviews
  • Drive continuous improvements to reduce operational toil and prevent recurring incidents
  • Perform capacity planning, performance tuning, and system optimisation

Observability & Monitoring

  • Design and implement observability solutions across logging, metrics, and distributed tracing
  • Build dashboards, alerts, and monitoring strategies to provide deep visibility into platform health
  • Manage and maintain monitoring stacks such as Prometheus, Grafana, ELK, or equivalent tools

Infrastructure & Automation

  • Develop and maintain Infrastructure-as-Code (IaC) solutions using tools such as Terraform or Ansible
  • Automate infrastructure provisioning, deployment, and operational workflows
  • Support both cloud and on-premises infrastructure environments
  • Contribute to CI/CD pipeline improvements and platform automation initiatives

Collaboration & Engineering Excellence

  • Work closely with engineering and product teams to embed reliability and operability practices into the software development lifecycle
  • Review system architectures and recommend reliability, scalability, and resilience improvements
  • Advocate for DevSecOps, automation, and operational best practices across teams

Requirements

  • Degree in Computer Science, Information Technology, Engineering, or a related discipline
  • Hands-on experience with Kubernetes and other container orchestration platforms
  • Experience with CI/CD, DevSecOps, and collaboration tools such as GitLab, Jira, Confluence, Fortify, or similar platforms
  • Proficiency in at least one scripting or programming language such as Python, Go, or Bash
  • Experience with Infrastructure-as-Code tools such as Terraform or Ansible
  • Familiarity with cloud platforms such as AWS, Azure, or GCP
  • Experience implementing and managing observability and monitoring solutions such as ELK Stack, Prometheus, or Grafana
  • Good understanding of networking, system reliability, security hardening, and operational best practices
  • Strong analytical, troubleshooting, and problem-solving skills
  • Ability to work effectively in Agile and cross-functional environments
  • Good communication and stakeholder management skills

Good to Have

  • Experience with GitOps workflows and service mesh technologies such as Istio
  • Familiarity with secrets management tools such as HashiCorp Vault
  • Exposure to government ICT standards, IM8 policies, or regulated environments
  • AI-native mindset and software engineering capabilities
  • Experience supporting large-scale enterprise or public sector platforms