About the job: Site Reliability Engineer

We are a consulting company made up of happy people with a passion for technology!

We love technology, we love design and we love quality. Our diversity makes us unique and creates an inclusive and welcoming workplace where each individual is highly valued.

With us, everyone can be themselves and respects others for who they are. We believe that when a fantastic mix of people come together and share their knowledge, experiences, and ideas, we can help our customers on a completely different level.

We are looking for people who want to grow with us!

With us, you have great opportunities to take real steps in your career and to take on significant responsibility.

Job Overview

We are looking for a skilled Site Reliability Engineer to join our team and support the daily operations of our cloud-based platform. This role involves monitoring, troubleshooting, automation, and incident management for services running on Google Cloud Platform (GCP) and Microsoft Azure. The ideal candidate will have strong experience in Python, Terraform, and cloud infrastructure, along with a keen eye for system reliability and performance optimization.

Key Responsibilities

  • Monitor and maintain the operational health of cloud infrastructure and applications.
  • Set up, configure, and manage monitoring tools (especially Splunk) for logs, metrics, and alerting.
  • Perform incident management in alignment with ITIL best practices, including troubleshooting, root cause analysis (RCA), and long-term resolution.
  • Automate operational tasks using Python scripts and Terraform to manage infrastructure as code.
  • Collaborate with engineering, DevOps, and support teams to improve system reliability and deployment processes.
  • Gain a deep understanding of the services and frameworks running on GCP and Azure (from an infrastructure and operations perspective, not application development).
  • Drive continuous improvement initiatives around performance, availability, and operational excellence.

Must-Have Skills

  • Strong understanding of, and hands-on experience with, ITIL processes (Incident, Problem, and Change Management).
  • Proficiency in Splunk for monitoring, log management, and setting up actionable alerts.
  • Solid experience in Python for automation and scripting.
  • Expertise in Terraform for infrastructure provisioning and management in cloud environments.
  • Working knowledge of cloud services and infrastructure on GCP and/or Azure.
  • Excellent troubleshooting, analytical, and problem-solving skills.

Notice Period: Immediate to 15 Days Only

Work Location: Bangalore (Hybrid)

Form of employment: Full-time until further notice. We apply a 6-month probationary period.

We interview candidates on an ongoing basis, so do not wait to submit your application.