Posted 12 June, 2026

Site Reliability Engineer (SRE)

EXL
Alappuzha, KL, IN | Full Time
Reference: 2d3e001068176849

Job Description

Role Overview:

We are hiring an experienced SRE leader to manage a global Incident Management team and drive operational excellence for this engagement. The role involves leading teams, handling complex incidents, and improving overall incident response strategy.

Key Responsibilities:

  • Lead and mentor a team of SRE engineers (Level 7 ICs)
  • Own end-to-end incident management operations across regions
  • Establish and drive incident response processes and governance
  • Ensure an effective 16x7 delivery model across geographies
  • Act as the escalation point for critical incidents and stakeholder communication
  • Drive continuous improvements in MTTM and operational efficiency
  • Lead process enhancements, SOP creation, and knowledge transfer planning

Required Skills:

  • Strong experience in SRE / Incident Management leadership
  • Proven ability to manage high-impact, complex incidents
  • Excellent communication, stakeholder management, and leadership skills
  • Ability to drive alignment and influence cross-functional teams

Technical Skills:

  • Ability to proactively identify risks using monitoring tools such as Datadog and Grafana dashboards
  • Experience in incident response, with the capability to quickly restore services (restart, patch, or remediate live issues)
  • Strong focus on minimizing service downtime across environments
  • Hands-on experience supporting both on-premises (Linux) environments and cloud platforms (primarily Azure, with some exposure to GCP)
  • Solid understanding of networking concepts and system architecture
