Incident Management
Key Responsibilities:
Incident Coordination: Manage and coordinate the resolution of major and high-priority incidents across IT teams and business stakeholders.
Escalation Management: Act as the primary point of contact for escalations related to incidents and ensure timely communication to stakeholders.
Root Cause Analysis (RCA): Lead and document post-incident reviews, identify root causes, and hand over permanent fixes to Problem Management, tracking them to completion.
Communication: Provide clear, concise, and timely updates to internal and external stakeholders during incidents.
Process Management: Enforce incident management processes and procedures in line with ITIL best practices to ensure consistency and efficiency.
Monitoring & Detection: Work with monitoring tools and NOC teams to ensure potential issues are detected quickly and alerts are raised promptly.
Reporting: Generate and analyze incident trend reports, SLA/KPI dashboards, and performance metrics to support service improvement initiatives.
Tool Management: Use ITSM tools such as ServiceNow, BMC Remedy, or Cherwell to track and manage incident tickets.
Collaboration: Partner with Service Desk, Application Support, Infrastructure, and Security teams to improve incident response and service availability.
Continuous Improvement: Recommend and implement improvements to the incident management process to reduce recurrence and improve response times.
Compliance & Audits: Ensure incident documentation is audit-ready and compliant with regulatory or organizational standards.
Requirements:
Experience: 6+ years in IT Incident Management or related IT Service Management roles.
ITIL Knowledge: Strong understanding of the ITIL framework; ITIL v3 or v4 certification preferred.
Incident Handling: Proven experience managing P1/P2 incidents and driving resolution under pressure.
Tools Expertise: Proficiency with ITSM tools such as ServiceNow, BMC Remedy, or Jira Service Management.
Communication: Exceptional verbal and written communication skills; experience in stakeholder management, including executive reporting.
Problem Solving: Strong analytical and troubleshooting skills with a solution-oriented mindset.
Technical Acumen: Basic understanding of IT infrastructure, applications, databases, and cloud platforms to facilitate issue triage and resolution.
Documentation: Ability to create detailed RCA reports, incident timelines, and procedural documentation.
24x7 Support: Willingness to join an on-call rotation or work flexible hours during major incidents.