Posted 18 June, 2026
Site Reliability Engineer (SRE)
EXL
Nellore, AP, IN
Full Time
Reference: e60b8f351613c7c4
Job Description
Role Overview:
We are hiring an experienced SRE leader to manage a global Incident Management team and drive operational excellence for this engagement. The role involves leading teams, handling complex incidents, and improving the overall incident response strategy.

Key Responsibilities:
Lead and mentor a team of SREs (Level 7 ICs)
Own end-to-end incident management operations across regions
Establish and drive incident response processes and governance
Ensure an effective 16x7 delivery model across geographies
Act as the escalation point for critical incidents and stakeholder communication
Drive continuous improvements in MTTM and operational efficiency
Lead process enhancements, SOP creation, and knowledge transfer planning

Required Skills:
Strong experience in SRE / Incident Management leadership
Proven ability to manage high-impact, complex incidents
Excellent communication, stakeholder management, and leadership skills
Ability to drive alignment and influence cross-functional teams

Technical Skills:
Ability to proactively identify risks using monitoring tools such as Datadog and Grafana dashboards
Experience in incident response, with the capability to quickly restore services (restart, patch, or remediate live issues)
Strong focus on minimizing service downtime across environments
Hands-on experience supporting both on-premises (Linux) environments and cloud platforms (primarily Azure, with some exposure to GCP)
Solid understanding of networking concepts and system architecture