Posted 19 May, 2026

Observability

Diverse Lynx
Bengaluru, Karnataka, 560063 Full Time
Reference: 365_569689_26-00426

Description:
Maintain and enhance observability systems, including alerting, dashboards, logs, traces, and metrics across the platform
Define, implement, and continuously monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure platform reliability and performance
Support incident detection, triage, troubleshooting, and root cause analysis to minimize downtime and improve operational resilience
Design and maintain proactive alerting strategies to reduce noise and enable faster incident response
Integrate observability practices and tooling into CI/CD pipelines to detect issues early, during the build and deployment stages
Experience: 8-10 years
Implement and manage observability solutions within a Kubernetes-based infrastructure, ensuring visibility into cluster, node, and application performance
Collaborate with development and platform teams to embed monitoring and reliability best practices across services
Analyze trends and metrics to identify performance bottlenecks and drive continuous improvement
Document observability standards, dashboards, alerts, and operational runbooks
Participate in post-incident reviews and contribute to reliability and availability improvements
Experience with large-scale AD modernization or cloud identity transformation
