Posted 19 May, 2026

Director, Site Reliability (SRE, SLI/SLO, Monitoring, Automation)

Vertafore

Hyderabad, IN, TS 500081 (Full Time)

Reference: 101_471795_609331_1098

The Director, Site Reliability Engineering (SRE) will lead reliability, performance, and observability initiatives for a portfolio of Vertafore products. This role owns SLIs/SLOs, incident response, automation, and CI/CD practices for assigned product families. Directors will manage multiple teams and collaborate with Product Development, Cloud Operations, Information Security, and other SRE leaders to ensure operational excellence.

Key Responsibilities

Product Reliability Leadership

Define and enforce SLIs/SLOs for a subset of Vertafore flagship products.

Drive observability strategy across application and infrastructure layers.

Release Engineering & Automation

Oversee CI/CD pipelines for product deployments using tools such as GitLab, Jenkins, Ansible, and LaunchDarkly.

Implement Infrastructure-as-Code (Terraform, AWS CloudFormation/CDK) for application provisioning.

Incident Management

Define 24x7 on-call rotations for assigned products; ensure rapid resolution and blameless postmortems.

Cross-Functional Collaboration

Partner with Cloud Ops on capacity planning, OS patching (app tier), and load balancing (ALB, F5).

Align reliability goals with product roadmaps and customer SLAs.

Team Leadership

Manage a group of Managers and Engineers; mentor teams on automation, observability, and reliability best practices.

Qualifications

Bachelor’s degree in Computer Science, Information Systems, or related field.

18+ years in Software Engineering, SRE, DevOps, or reliability roles; 5+ years in leadership (Director).

Proven ability to leverage software engineering principles and practices to solve reliability and operational challenges.

Expertise in CI/CD, observability, and incident response.

Strong AWS knowledge and experience with container orchestration.

Proven ability to lead reliability programs across multiple SaaS products.

Experience architecting applications or infrastructure for high-growth cloud platforms.

Experience in B2B SaaS environments involving large-scale distributed systems.

Proven ability to communicate and influence at team, peer, and leadership levels.

Demonstrated experience driving operational excellence through metrics and KPIs.

(Preferred) Background supporting financial services, healthcare, or regulated industries.
