Production Support Lead - Mammoth-AI

Job Summary:

We are seeking a proactive and technically skilled Production Support Lead to oversee the stability and performance of our live production systems. You will lead a team of support engineers, manage incident response, and drive continuous improvement of support processes and system reliability. This role requires a balance of hands-on technical troubleshooting and people leadership.

Key Responsibilities:
· Lead the production support function across critical applications and services.
· Manage and mentor a team of support engineers, setting priorities and ensuring high performance.
· Oversee the incident management process: triage, root cause analysis, resolution, and communication.
· Work closely with development, QA, DevOps, and infrastructure teams to address issues and deploy fixes.
· Implement monitoring and alerting solutions to proactively identify and resolve issues.
· Define and enforce SLAs, uptime targets, and escalation procedures.
· Ensure comprehensive documentation of incidents, fixes, and knowledge-base articles.
· Drive the implementation of automation tools and practices to improve efficiency.
· Track key metrics (uptime, MTTR, incident volume) and present reports to leadership.
· Own the change management and release validation process to minimize production risks.

Required Skills & Qualifications:
· 5+ years of experience in production support or IT operations.
· 2+ years of experience in a lead or managerial capacity.
· Experience with Java, Node.js, SOA, Spring Cloud, and Spring Boot.
· Experience supporting large-scale, mission-critical systems in a 24x7 environment.
· Strong troubleshooting skills across application, infrastructure, and network layers.
· Bachelor's degree in Computer Science, Information Technology, or a related field.
· Familiarity with Linux/Unix environments, databases (e.g., PostgreSQL, MySQL), and cloud infrastructure (AWS, Azure, or GCP).
· Experience with incident tracking tools (e.g., Jira, ServiceNow), monitoring tools (e.g., New Relic, Datadog, Prometheus), and logging tools (e.g., ELK stack, Splunk).
· Strong communication and leadership skills, especially under pressure.
· Excellent organizational and prioritization abilities.

Preferred Qualifications:
· ITIL Certification or familiarity with ITIL best practices.
· Experience with CI/CD pipelines and DevOps tooling.
· Knowledge of scripting (e.g., Bash, Python) for automation.
· Background in regulated environments (e.g., FinTech, Healthcare).