| A new Job Posting has been submitted. Please respond as soon as possible. If you are unable to submit a qualified Job Seeker, please decline immediately. |
Job Posting ID
INFYSYJP00005016
|
Job Posting Title
564981 -Devops SRE Observability- India- DX
|
Description
We are seeking a highly experienced Senior OpenTelemetry & Observability Platform Engineer with 8–10 years of experience to design, implement, and operate enterprise-scale observability solutions. The role focuses on OpenTelemetry instrumentation, OpenTelemetry Collector operations, and administration of observability platforms including access control, governance, and platform reliability.
Key Responsibilities
• Design and implement OpenTelemetry-based instrumentation for distributed systems (Java, Python, Go, .NET, Node.js).
• Define and maintain standards for metrics, traces, and logs across microservices and cloud-native workloads.
• Operate, tune, and scale OpenTelemetry Collectors (agent and gateway modes).
• Design Collector pipelines including receivers, processors, exporters, batching, sampling, and tail-based sampling.
• Ensure high availability, performance, and cost optimization of telemetry pipelines.
• Troubleshoot telemetry data loss, latency, cardinality, and performance issues.
• Administer enterprise observability platforms (e.g., Grafana, Prometheus, Elastic, Datadog, New Relic, Dynatrace).
• Manage user access, groups, roles, RBAC, and tenancy models in observability platforms.
• Establish governance models for dashboards, alerts, naming standards, and retention.
• Integrate observability platforms with incident management and ITSM tools.
• Automate observability infrastructure using IaC tools (Terraform, Helm, Ansible).
• Collaborate with application, SRE, and platform teams to enable observability by default.
• Mentor junior engineers and promote observability best practices across teams.
Technical Skills
• Strong hands-on experience with OpenTelemetry SDKs and auto-instrumentation.
• Deep knowledge of OpenTelemetry Collector architecture and configuration.
• Experience with Kubernetes, Docker, and cloud platforms (AWS, Azure, GCP).
• Solid understanding of metrics (Prometheus), logs, and distributed tracing concepts.
• Proficiency in YAML, JSON, and configuration-driven systems.
• Experience with CI/CD pipelines and DevOps practices.
• Scripting and automation skills (Python, Bash, or equivalent).
Observability Platform Administration
• Expertise in observability platform administration and configuration.
• Managing users, teams, roles, permissions, and access policies.
• Implementing multi-tenant or multi-environment observability setups.
• Defining alerting strategies, SLOs, and SLIs.
• Ensuring compliance, auditability, and secure access to telemetry data.
Required Experience & Qualifications
• 8–10 years of experience in monitoring, observability, platform engineering, or SRE roles.
• Strong background in distributed systems and cloud-native architectures.
• Proven experience operating observability platforms at scale.
• Excellent troubleshooting, communication, and documentation skills.
Preferred Qualifications
• OpenTelemetry, Kubernetes, or cloud certifications.
• Experience with large-scale enterprise or SaaS environments.
• Exposure to FinOps or observability cost-management practices.
|
Job Posting Start Date
08/05/2026
|
Job Posting End Date
31/12/2026
|
Practice Unit (PU)
DX-PU1
|
Site
Not Applicable
|
Location
Not Applicable
|
Job Posting Owner
Supriya Sahu
|
This notification was sent by the SAP Fieldglass system. |
If you have any questions regarding this notice, access the SAP Fieldglass Help Center to find documentation or submit a support case. Please do not respond to this email.
|