Posted 21 May, 2026

DevOps Observability Engineer

ARTech

Hyderabad, Full Time

Reference: 365_625640_26-17094

Job Title: Observability / SRE Engineer

Location: Hyderabad Only

Job Description:

We are looking for an experienced Observability / Site Reliability Engineer (SRE) with strong expertise in monitoring, cloud-native technologies, and automation. The ideal candidate should have hands-on experience in Kubernetes environments, observability platforms, distributed tracing, and proactive incident management to improve system reliability and performance.

Required Experience:

10+ years of overall IT Infrastructure experience.
Minimum 8 years of experience in Observability, Monitoring, or Site Reliability Engineering (SRE) roles.

Required Skills:

Strong expertise in Kubernetes and containerized environments.
Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, and Dynatrace.
Experience with distributed tracing tools like Jaeger and OpenTelemetry.
Strong scripting and automation skills using Python or Go.
Experience with logging and log analytics tools such as Splunk, ELK Stack, Fluentd, and Loki.
Strong understanding of observability concepts including metrics, logging, and tracing.
Experience working with cloud platforms such as AWS, Azure, or GCP and integrating observability solutions in cloud-native environments.
Familiarity with databases such as MySQL and PostgreSQL.
Hands-on experience with Infrastructure as Code (IaC) tools like Terraform or Helm.

Key Responsibilities:

Design, implement, and maintain enterprise observability and monitoring solutions.
Drive self-healing mechanisms, intelligent monitoring, and proactive incident response strategies.
Collaborate with SRE, DevOps, Infrastructure, and Development teams to improve system reliability and operational efficiency.
Implement monitoring, logging, tracing, and alerting solutions for cloud-native applications and Kubernetes platforms.
Automate operational tasks, incident management workflows, and infrastructure monitoring processes.
Perform root cause analysis (RCA), troubleshooting, and performance optimization activities.
Ensure high availability, scalability, and reliability of enterprise applications and infrastructure environments.
