Posted 28 May, 2026

DevOps Kubernetes

Diverse Lynx
Bengaluru, Karnataka, 560063
Full Time
Reference: 365_569689_26-00550

Description:
We are looking for a skilled and pragmatic DevOps Engineer to own and evolve our infrastructure across the EMEIA region. This is a dual-horizon role: you will keep our existing VM-based systems healthy while leading a greenfield effort to design and build the managed environment that those solutions will migrate onto.
A significant proportion of what we build is produced rapidly using AI-assisted, structured development. That means our solutions can move from idea to deployment faster than ever, and our infrastructure needs to keep pace. We need someone who thrives in a fast-moving, ambiguous environment, can absorb change quickly, and treats adaptability as a core part of the job rather than an occasional demand.
The new managed environment is most likely to be based on Kube (the client's internal Kubernetes/EKS deployment), though the final architecture will be a team decision, and client-specific AWS remains an option for workloads requiring greater control. You will help inform that decision and then own the build-out, whichever direction is chosen.
You will work closely with data engineers, developers, and analysts, acting as the infrastructure backbone for a team that moves quickly and expects you to move with it. The role also involves working directly with third-party vendors who support some of the tools being deployed, and collaborating with teams outside of EMEIA — including WorldWide — to align on standards, share solutions, and resolve cross-regional dependencies.
KEY RESPONSIBILITIES
Platform Migration & Environment Design
· Lead the design and build-out of a new managed container environment to replace existing VM-based infrastructure — the most likely candidate is Kube (client's internal Kubernetes/EKS cluster), but the final decision will be made collaboratively as a team
· Contribute meaningfully to the environment selection decision: weigh trade-offs between managed solutions (Kube) and more directly controlled alternatives (client-specific AWS), considering maintenance overhead, operational control, and team capability
· Own the migration of existing VM-based workloads onto the new platform, managing sequencing, risk, and continuity of service throughout
· Establish and maintain the standard workflow for deploying solutions: build locally → containerise → publish to Kube → configure connectivity to client internal system dependencies
Client Internal Networking & Connectivity
· Configure and maintain networking between Kube and the client's internal systems, including Shield, Snowflake, Floodgate, and any other platform dependencies the team relies on
· Own namespace and compute provisioning on the shared Kube cluster, ensuring workloads are appropriately isolated and correctly configured
· Manage credentials, service accounts, and access controls across the full connectivity chain — from container to downstream service
· Act as the go-to expert on how things connect within the client's internal network topology
Infrastructure Management
· Own and manage cloud infrastructure across EMEIA using internal cloud tooling (client cloud and connected systems including Shield)
· Manage certificates, firewalls, resource pools, networking, and access controls
· Ensure infrastructure is appropriately sized, resilient, and cost-efficient
· Maintain accurate documentation of infrastructure topology and configuration
VM Provisioning & Automation (Existing Estate)
· Maintain and operate existing virtual machines, primarily on RHEL, while migration to the new environment is in progress
· Build and maintain standardised, repeatable provisioning processes (e.g. via Ansible, Terraform, or equivalent IaC tooling)
· Manage package deployment, software repositories, databases, and web servers
· Own the patching and update lifecycle for managed systems
Monitoring & Reliability
· Implement and maintain monitoring, alerting, and observability across both the existing VM estate and the new container environment
· Proactively identify risks, bottlenecks, and failure patterns before they impact users
· Define and track appropriate SLIs/SLOs for critical services
· Conduct post-incident reviews and drive lasting improvements
Supporting AI-Augmented Development
· A large proportion of the solutions you will support are built rapidly using structured AI-assisted development. You must be comfortable working with codebases and configurations that evolve quickly, may not have deep documentation histories, and may have been substantially generated with AI tooling
· Provide the infrastructure scaffold that allows AI-assisted solutions to move from local development to production reliably and safely
· Be a pragmatic partner to developers: unblock deployment quickly, catch infrastructure-level risks early, and help establish patterns that make rapid iteration safe at scale
· Actively use AI tools (e.g. Claude, Copilot, or similar) to accelerate your own work: writing scripts, diagnosing issues, generating runbooks, reviewing configurations
Diagnosis & Incident Response
· Take ownership of vague or ambiguous production issues (e.g. "it's running slow", "the server keeps falling over") and drive them through to resolution
· Deliver short-term fixes rapidly to restore service, while tracking and delivering long-term root cause resolutions
· Maintain a pragmatic balance between speed-of-recovery and quality-of-fix
SKILLS & EXPERIENCE
Essential
· Proven experience in a DevOps, infrastructure, or platform engineering role
· Hands-on experience with Kubernetes — deploying, configuring, and operating workloads in a shared or managed cluster environment
· Experience containerising applications: writing Dockerfiles, managing images, publishing to a registry, and debugging container-level issues
· Strong networking fundamentals: DNS, TLS/SSL certificates, firewall rules, load balancing, VPNs, and service-to-service connectivity
· Comfort operating in environments where the architecture is still being defined — able to contribute to the decision, then execute once direction is set
· Hands-on experience with RHEL (or equivalent enterprise Linux) — provisioning, hardening, package management (yum/dnf), systemd services
· Experience managing cloud infrastructure, ideally in an enterprise private/hybrid cloud environment
· Experience with infrastructure-as-code or configuration management tooling (e.g. Terraform, Ansible, Puppet, or similar)
· Solid scripting ability in Bash and at least one higher-level language (Python preferred)
· Experience with monitoring and observability tooling (e.g. Prometheus, Grafana, Datadog, or similar)
· Strong incident diagnosis skills — able to work from vague symptoms to root cause using logs, metrics, and reasoning
· Comfortable working with AI-generated or AI-assisted codebases: reading, extending, and debugging solutions without a full traditional authorship history
· Clear written and verbal communication — able to translate infrastructure complexity for non-technical stakeholders
Desirable
· Experience with AWS or client-specific AWS, particularly EKS
· Familiarity with the client's internal platform tooling: Kube, Shield, Floodgate, or similar
· Experience integrating with Snowflake, including managing drivers, credentials, and network access
· Experience with CI/CD pipelines (GitLab CI, Jenkins, GitHub Actions, or similar)
· Exposure to security tooling, vulnerability scanning, or compliance frameworks (e.g. CIS Benchmarks)
· Familiarity with secrets management tooling (Vault, CyberArk, or similar)
· Experience working in a regulated or enterprise environment with change management processes
WAYS OF WORKING
· You are comfortable with genuine ambiguity — including at the architectural level — and can make progress and contribute to decisions without waiting for everything to be resolved
· You default to automation: if you do something twice, you script it; if you do it three times, you build a process
· You adapt quickly: the tools, environments, and solutions you support can change fast, and you treat that as normal rather than exceptional
· You are pragmatic under pressure: you know when to stop the bleeding first and fix it properly later
· You are self-directed and comfortable owning problems end-to-end with minimal hand-holding
· You are a willing partner to developers who move fast — you keep up, add guardrails where they matter, and don't become a bottleneck
WHAT SUCCESS LOOKS LIKE
· A new managed container environment is designed, built, and running — with existing VM-based workloads migrated onto it in a controlled, sequenced way
· The standard deployment path (build → containerise → publish → connect) is well-established, documented, and easy for the team to use
· Connectivity from the new environment to client internal systems (Snowflake, Shield, Floodgate, etc.) is reliable, well-understood, and correctly secured
· Teams are unblocked quickly when they need new integrations, access, or capabilities — even when the solutions they are deploying have been built at speed
· Production issues are resolved rapidly, with lasting fixes following close behind
· Monitoring catches issues before users do
· The infrastructure estate — both old and new — is well-documented, well-understood, and in a known-good state
