Site Reliability Engineer

DIGITALHUB is a Peruvian outsourcing company providing BPO and IT services. Our vision is a future in which every person can find the best job and our partners can discover the best of Latin American talent.

We are currently looking for a "Site Reliability Engineer" for remote work. To apply, you must meet the following requirements:

SUMMARY

In this role, you will be the linchpin of the product: the technical expert for the product, sales, marketing, and business teams, and the product expert for the technical teams. By combining intimate knowledge of customers, strong analytical skills, and technical acumen, you will drive a holistic product vision that energizes your teammates and delights your customers.

WORK MODE

Remote

DURATION

4 months, remote.

REQUIREMENTS

Senior-level Site Reliability Engineer experience; 5+ years of experience with EKS, Kubernetes, and containerization.
Inquisitive, with a passion for learning.
Fluent in English, with great verbal and written communication skills.
Some software development/coding experience.
Extensive experience with ECS, with a focus on optimizing service/task definitions.
Need to understand and be able to author and execute Terraform.
Ability to work independently and in a team.
Amazon (AWS) technologies and security.
The candidate must include a specific explanation of a project with EKS at the top of the resume.
Client interview process: two 1-hour interviews, possibly a third.

Plus:

Write/read/understand code: Java, PHP, Node, and GoLang.
Telemetry: New Relic, Datadog, CloudWatch.
Experience writing GitHub Actions.

RESPONSIBILITIES

The consultant will be on call 24 hours a day for one week every 6 weeks, starting at 9am on Monday and running through the following Monday.

While EKS administration/configuration experience is critical to the success of the candidate, there are other things they must also be able to do:

Understand the flow of data for cloud-based applications between the browser and the cloud resources.
Be able to lead calls with various interested parties when there are incidents in production that impact our customers, to help identify the root cause (triage). The successful candidate will OWN incidents and their resolution.
Understand AWS infrastructure in general.
Be able to troubleshoot and isolate issues to specific infrastructure or to the application logic/code.
Must be able to utilize application monitoring tools like New Relic APM or an equivalent.
Must be able to query and understand logs and alert results in a tool like Datadog or an equivalent.

You can send your CV to the following email addresses: reclutamiento@digitalhub.pe and omar.navarro@digitalhub.pe

Job type: Full-time
