AgenticOps Platform Engineer Lead
Job Description
We are looking for a senior, hands-on AgentOps Platform Engineer to design, build, and operate the cloud-native infrastructure that powers our AI agents at scale.
This is a lead-by-example role:
- You write the Terraform
- You build the pipelines
- You own the platform in production
GCP is your primary environment, but you will design with multi-cloud in mind (AWS, Azure), ensuring portability, resilience, and long-term flexibility. This role sits at the intersection of DevOps, MLOps, and AgentOps, with deep responsibility for reliability, security, observability, and cost.
KEY RESPONSIBILITIES
Platform & Infrastructure Ownership
- Design, build, and operate production-grade infrastructure for AI agents and LLM services
- Own Terraform-based Infrastructure as Code for all environments (dev, uat, prod)
- Lead infrastructure decisions through hands-on implementation, not diagrams
- Build scalable foundations for:
  - Agent orchestration
  - Inference services
  - RAG pipelines
  - Vector stores
- Optimise cloud resources for performance and cost efficiency
AgentOps & AI Platform Enablement
- Enable safe, continuous operation of autonomous agents
- Design agent runtime environments with:
  - Isolation & sandboxing
  - Failover and recovery strategies
  - Controlled rollout mechanisms
- Support prompt/version management, agent configuration, and tool/plugin lifecycle
- Work closely with Agentic RAG engineers to operationalise research into production
CI/CD & Automation
- Build and maintain CI/CD pipelines for:
  - Infrastructure
  - Agent services
  - Prompt and config changes
  - Model/version rollouts
- Automate workflows for:
  - Vector DB updates
  - RAG index refreshes
  - Agent memory stores
  - Tool registration and validation
- Reduce manual ops toil aggressively through automation
Observability & Production Readiness
- Design and implement deep observability for agent systems:
  - Platform health
  - Agent execution metrics
  - Latency, cost, and throughput
  - Failure modes and retries
- Build dashboards, alerts, and telemetry using Prometheus, Grafana, and OpenTelemetry (or equivalent)
- Enable visibility into agent decision traces and runtime behaviour
Security, Safety & Reliability
- Implement secure cloud architecture and IAM best practices
- Own production reliability, incident response, and recovery
- Enforce operational guardrails and safety controls for agent APIs
- Support responsible AI practices from an infrastructure and runtime perspective
Collaboration & Technical Leadership
- Work closely with:
  - Agentic RAG engineers
  - AI engineers
  - Product & CTO Office
- Define SLOs, reliability targets, and operational metrics
- Set the technical bar for AgentOps at BridgeAI
- Mentor engineers by example and code, not process overhead
REQUIRED SKILLS & EXPERIENCE
Core Platform & DevOps
- 5+ years in DevOps, Platform Engineering, SRE, or MLOps
- Strong, hands-on experience with GCP:
  - GKE / Compute Engine
  - Cloud Run / Cloud Functions
  - Cloud Storage, Pub/Sub
  - Vertex AI (or equivalent)
- Deep experience with Terraform (mandatory)
Containers, CI/CD & Automation
- Docker, Kubernetes, Helm
- CI/CD tooling (GitHub Actions, Jenkins, ArgoCD)
- Python and Bash for automation and platform glue code
Agentic & AI Systems
- Experience supporting LLM-based systems in production
- Understanding of:
  - Prompt/version management
  - Context handling & caching
  - Model rollout strategies
- Hands-on experience with vector databases and similarity-search libraries (Weaviate, Pinecone, FAISS)
- Familiarity with RAG pipelines and agent execution patterns
Observability & Security
- Monitoring and telemetry using Prometheus, Grafana, OpenTelemetry
- Strong understanding of cloud security, IAM, and operational safety
NICE TO HAVE
- Multi-cloud experience (AWS, Azure)
- Exposure to agent frameworks (LangChain, LangGraph, AutoGen, CrewAI)
- Event-driven and workflow orchestration systems (Temporal, Airflow)
- Experience with responsible AI operations or safety monitoring
WHAT SUCCESS LOOKS LIKE
- Infrastructure is reproducible, observable, and boring (in a good way)
- Agent failures are visible, debuggable, and recoverable
- Cloud costs are understood and controlled
- Engineers trust the platform and move faster because of it
- You are the go-to authority for AgentOps at BridgeAI
WHAT THIS ROLE IS (AND IS NOT)
- Deeply hands-on
- Terraform-first
- Production ownership
- Sets standards by building
- Not a people-manager role
- Not a ticket-based ops role
- Not a “just keep the lights on” job