Agentic AI Architect
Job Description
Interviews and onboarding are conducted virtually, reflecting our digital-first mindset.

Rooted in the region, we specialize in delivering tailored, impactful solutions in Data, Advanced Analytics and AI, Infrastructure, Cloud Security, and Application Modernization. Whether it’s enabling predictive analytics, transforming operations with automation, or driving customer engagement with intelligent platforms, we are the trusted partner for organizations ready to embrace a smarter, more efficient future.

About the Role

Senior AI/ML Technical Architect – Generative AI & Agentic Systems

Location: India

We are seeking a Senior AI/ML Solution Architect with deep expertise in Generative AI and agentic systems to lead the architecture and delivery of enterprise-scale AI solutions. This role requires a strong combination of hands-on technical capability across Large Language Models (LLMs) and Small Language Models (SLMs), along with the architectural leadership to design, integrate, and deploy AI systems across cloud, edge, and on-premises environments.

The successful candidate will architect scalable agentic platforms, build advanced RAG and fine-tuning pipelines, and design robust integration frameworks connecting AI services with enterprise applications.
This role sits at the core of AI transformation initiatives, balancing cutting-edge innovation with production-grade engineering, performance optimization, and security-first design.

Experience Requirements

8+ years of experience in software engineering, data, or AI systems
2+ years of hands-on experience with Generative AI and LLM-based solutions
4+ years of experience designing and architecting enterprise-scale platforms and distributed systems

Key Responsibilities

Architecture & System Design
Architect scalable agentic systems using advanced LLM and SLM capabilities
Design multi-agent orchestration frameworks for complex, automated workflows
Define context, memory, and state-management architectures for persistent agent interactions
Implement Model Context Protocol (MCP)–based integrations across enterprise services

AI & Platform Implementation
Design and optimize Retrieval-Augmented Generation (RAG) architectures
Build agent solutions using LangChain, LangGraph, Semantic Kernel, Agno, and custom frameworks
Architect and deploy model inference pipelines across cloud, edge, and on-prem environments
Develop fine-tuning strategies for LLMs and SLMs, including domain and task specialization
Lead model compression, quantization, and performance-optimization initiatives

Integration & Enterprise Connectivity
Architect secure REST, gRPC, and GraphQL APIs for AI platform services
Design event-driven architectures using message buses and webhooks
Implement authentication and authorization systems (SSO, OIDC, token-based security)
Build and govern enterprise connectors (Slack, Jira, Salesforce, ERP/CRM platforms)

Data, Models & Evaluation
Design data preprocessing and governance pipelines (cleaning, deduplication, PII handling)
Architect embedding generation, re-indexing, and retrieval optimization workflows
Define chunking, windowing, and content processing strategies
Establish model evaluation, benchmarking, and selection frameworks

Required Technical Skills

Core AI & GenAI
Deep experience with GPT-4, Claude, LLaMA, and enterprise/open-source LLM ecosystems
Expertise deploying and optimizing SLMs (Phi-3, Gemma, TinyLlama)
Advanced agent frameworks: LangChain, LangGraph, Semantic Kernel, Agno
RAG systems, vector databases, semantic and hybrid retrieval

Fine-Tuning & Model Optimization
Parameter-efficient tuning: LoRA, QLoRA, DoRA, AdaLoRA
Prompt-tuning, adapters, prefix tuning, P-tuning v2
RLHF / RLAIF pipelines
Compression techniques: quantization (INT8/INT4, GPTQ, AWQ, GGML), pruning, distillation

Deployment & Performance
Multi-environment deployment (cloud, edge, on-prem)
Autoscaling, rate-limiting, and resource governance
Real-time inference, streaming pipelines, adaptive reasoning control

Engineering & Platforms
TensorFlow, PyTorch, Hugging Face, LlamaIndex
API-driven system design and full-stack AI service development
AWS, Azure, GCP AI platforms
CI/CD, Docker, Kubernetes, monitoring and observability

Preferred Qualifications

Master’s or PhD in Computer Science, AI, ML, or related field
Contributions to open-source AI projects or research publications
Experience with multi-modal and cross-modal systems
Strong grounding in MLOps and full model lifecycle management
Experience designing compliant and regulated AI systems
Demonstrated leadership in enterprise AI transformation programs
Cloud certifications (AWS, Azure, GCP – AI/ML focus)

Technical Competencies Assessed

Distributed AI system architecture and design
Production code quality, scalability, and performance optimization
Model benchmarking, evaluation, and cost optimization
Security, privacy, and AI governance engineering
Enterprise-grade deployment and scaling strategies

What We Offer:
At Delphi, we are dedicated to creating an environment where you can thrive, both professionally and personally.
Our competitive compensation package, performance-based incentives, and health benefits are designed to ensure you're well-supported. We believe in your continuous growth and offer company-sponsored certifications, training programs, and skill-building opportunities to help you succeed.
We foster a culture of inclusivity and support, with remote work options and a fully supported work-from-home setup for your comfort and productivity. Our positive and inclusive culture includes team activities as well as wellness and mental health programs.