Posted 03 June, 2026

AI Engineer

Virtusa
Bangalore, Karnataka, India
Full Time
Reference: 55_537753_78308

Core Infrastructure & Deployment
Containerization: Master Docker for packaging models.
Orchestration: Use Kubernetes (K8s) for managing clusters.
GPU Architecture: Understand NVIDIA CUDA, ROCm, and vGPU allocation.
Model Serving: Deploy via vLLM, Ollama, TGI, or Triton Inference Server.

Model Management & Optimization
Quantization: Apply AWQ, GPTQ, or GGUF to fit models into local VRAM.
Fine-Tuning: Execute PEFT and LoRA for local domain adaptation.
Open-Source Selection: Evaluate weights from Hugging Face (e.g., Llama, Mistral, Phi).
Hardware Benchmarking: Calculate memory bandwidth and compute requirements.

Agent Architecture & Integration
Framework Mastery: Build workflows using LangChain, LangGraph, or AutoGen.
Vector Databases: Deploy and manage local instances of Pinecone, Qdrant, Milvus, or PGVector.
RAG Pipelines: Construct robust Retrieval-Augmented Generation systems without cloud APIs.
State Management: Handle complex agent memory loops locally.

On-Premise Security & Networking
Air-Gapped Workflows: Install dependencies and models without internet access.
Data Privacy: Implement strict local data governance and access controls.
Local Auditing: Set up private logging and guardrails (e.g., NeMo Guardrails).

Software Engineering & DevOps
Languages: Python, plus Rust or C++ for speed; Java for microservices.
CI/CD Pipelines: Automate local builds with GitLab CI or Jenkins.
Monitoring: Track system metrics using OpenTelemetry, Prometheus, and Grafana.
