Posted 17 May, 2026

Engineer Sr Lead, Site Reliability

Zensar Technologies

Pune, Maharashtra, IN, 411014. Full Time

Reference: 218_649632_144817

JD - Engineer Sr Lead, Site Reliability

Mandatory Skills:

Experience in Resiliency Testing / Chaos Engineering
Strong knowledge of Service Health Monitoring using SLI/SLO frameworks
Solid understanding of Core SRE concepts
Hands-on experience in Performance Engineering / Performance Testing

What you bring:
Proficiency in development technologies, architectures, and platforms (web, API).
Experience with cloud platforms (AWS, Azure, Google Cloud) and IaC tools.
Hands-on experience with Docker and Kubernetes.
Knowledge of monitoring tools (Prometheus, Grafana, DataDog) and logging frameworks (Splunk, ELK Stack).
Experience in incident management and post-mortem reviews.
Strong troubleshooting skills for complex technical issues.
Proficiency in scripting languages (Python, Bash) and automation tools (Terraform, Ansible).
Experience with CI/CD pipelines (Jenkins, GitLab CI/CD, Azure DevOps).
Ownership approach to engineering and product outcomes.
Excellent interpersonal communication, negotiation, and influencing skills.

At Zensar, we're "experience-led everything". We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus.

Part of the $4.8 billion RPG Group, we're a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. Explore Life at Zensar and join us to Grow. Own. Achieve. Learn. to be the best version of yourself.

We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.

What you will be doing:
The Software Engineer/Site Reliability Engineer will play a critical role in driving innovation and growth for the Banking Solutions, Payments and Capital Markets business. In this role, the candidate will have the opportunity to make a lasting impact on the company's transformation journey, drive customer-centric innovation and automation, and position the organization as a leader in the competitive banking, payments and investment landscape. Specifically, the Site Reliability Engineer will be responsible for the following:
Design and maintain monitoring solutions for infrastructure, application performance, and user experience.
Implement automation tools to streamline tasks, scale infrastructure, and ensure seamless deployments.
Ensure application reliability, availability, and performance, minimizing downtime and optimizing response times.
Lead incident response, including identification, triage, resolution, and post-incident analysis.
Conduct capacity planning, performance tuning, and resource optimization.
Collaborate with security teams to implement best practices and ensure compliance.
Manage deployment pipelines and configuration management for consistent and reliable app deployments.
Develop and test disaster recovery plans and backup strategies.
Collaborate with development, QA, DevOps, and product teams to align on reliability goals and incident response processes.
Participate in on-call rotations and provide 24/7 support for critical incidents.
