Job Opening: Site Reliability Engineer

About the job: Site Reliability Engineer

Required Qualifications:

  • Candidates should have 3-4 years of experience.
  • Strong experience with observability stacks, including AppDynamics, Dynatrace, Prometheus, OpenTelemetry (OTel), and Grafana.
  • Experience designing and developing automation for routine tasks using GitOps and Python.
  • Expertise in AppDynamics and Dynatrace OneAgent deployment, configuration, customization, and troubleshooting.
  • Hands-on experience setting up synthetic monitoring.
  • Subject matter expert in full-stack observability concepts, including APM tools, metrics collection, and trace analysis, with a passion for automation.
  • Experience with Google Cloud Platform and a solid understanding of Kubernetes.
  • About 70% of this role is hands-on administration and automation work for AppDynamics and Dynatrace. The remaining 30% involves Site Reliability Engineering (SRE) tasks related to Prometheus, OTel, Grafana, and similar tools.
  • A strong monitoring subject matter expert (SME) who can collaborate with various teams to gather monitoring requirements and then design, develop, and implement effective monitoring solutions and alerting strategies.
  • Proficient in Java and Python.
  • Experience with Datadog is a plus.