Job Title: Senior Site Reliability Engineer (SRE)
Location: Hybrid – San José, Costa Rica (2–3 days in office)
Type of Contract: Full-Time (Employer of Record (EOR), transitioning to Direct Employment)
Salary Range: Market Rates
Language Requirements: Advanced English (Required)
We are seeking a skilled Senior Site Reliability Engineer with strong hands-on experience in infrastructure automation, Kubernetes, and CI/CD pipelines to join our growing team. You will play a key role in building, securing, and optimizing scalable infrastructure and deployment systems across hybrid environments. Your work will directly impact system reliability, deployment efficiency, and the overall performance of mission-critical platforms.
Key Responsibilities
- Design and implement infrastructure automation to enable consistent, repeatable deployments across on-premises and customer-managed environments
- Develop and maintain CI/CD pipelines using tools such as GitHub Actions and ArgoCD to improve deployment speed and reliability
- Manage and optimize Kubernetes clusters, including application packaging and deployment using tools like Grafana Tanka and Kustomize
- Build and maintain observability systems (monitoring, logging, alerting) using the Grafana stack
- Troubleshoot and resolve performance and reliability issues, including scaling, latency, and resource allocation challenges
- Implement security best practices, including container security, vulnerability scanning, and network hardening
- Collaborate with engineering teams to support infrastructure needs, troubleshoot environments, and improve developer experience
Must-Have Qualifications
- 3–5 years of hands-on experience in Site Reliability Engineering, DevOps, or infrastructure engineering (practical experience is required; purely theoretical knowledge is not sufficient)
- Strong background in the software engineering lifecycle with an engineering-first mindset
- Proven hands-on experience with Kubernetes in production environments (deployment, operations, troubleshooting)
- Experience building and maintaining CI/CD pipelines (GitHub Actions, ArgoCD, or similar tools)
- Solid understanding of infrastructure-as-code and configuration management practices
- Experience with observability and monitoring tools, preferably the Grafana stack
- Strong problem-solving skills and the ability to clearly explain technical processes and decisions using real-world examples
Preferred Qualifications
- Experience working in hybrid or on-premises infrastructure environments (not focused solely on cloud-native platforms)
- Familiarity with VMware-based environments
- Experience with DevSecOps practices and security-focused infrastructure design
- Exposure to customer-managed or air-gapped deployment environments
- Prior experience mentoring junior engineers or supporting cross-functional teams