Posted 16 June, 2026
AI Engineer
HabileLabs
Jaipur, RJ, IN
Full Time
Reference: 35c93cef4f1032d1
Job Description
AI Engineer — Voice & Language
4+ Years · Full-Time · Jaipur · Competitive Salary
About the Role
We're hiring a senior AI Engineer to design, build, and ship production AI systems, with a strong emphasis on Voice AI. You'll own the full lifecycle: architecture, training, deployment, and monitoring across language and voice modalities.
What You'll Do
- LLM & GenAI: Fine-tune and deploy LLMs; build RAG pipelines and agentic workflows (LangChain, LlamaIndex).
- Voice Pipelines: Architect real-time ASR → LLM → TTS pipelines with <300 ms latency.
- Voice Agents: Build production voice agents with turn-taking, barge-in handling, and emotion-aware dialogue.
- Speech Fine-Tuning: Adapt ASR/TTS models for domain-specific accents, terminology, and speaking styles.
- MLOps: Build reproducible ML pipelines (Kubeflow / MLflow); maintain CI/CD, monitoring, and model versioning.
- Inference Optimization: Apply quantization (GGUF, GPTQ), distillation, and hardware-aware inference (TensorRT, vLLM) to cut cost and latency.
- APIs & Services: Ship high-performance inference APIs in Python (FastAPI) or Go on Kubernetes.
- Data & Evaluation: Curate text + speech corpora; define eval harnesses covering WER, MOS, latency P95, and safety.
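For candidates unfamiliar with the evaluation terms above, here is a minimal illustrative sketch of two of them: word error rate (WER) and P95 latency. It is not HabileLabs' actual harness. The function names, the whitespace tokenization, and the nearest-rank percentile method are all assumptions chosen for brevity.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are split on whitespace with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def p95_latency(latencies_ms: list[float]) -> float:
    """95th-percentile latency using the nearest-rank method (an assumption)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    print(word_error_rate("turn on the light", "turn the lights"))
    print(p95_latency([10.0 * i for i in range(1, 21)]))
```

A production harness would typically normalize text (casing, punctuation, numerals) before scoring and would report MOS and safety metrics through separate tooling.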
Requirements
Must-Have
- 4+ yrs ML/software engineering; 2+ yrs on production AI systems
- Strong Python; PyTorch or TensorFlow
- LLM fine-tuning: LoRA / QLoRA / PEFT
- End-to-end ML pipeline experience (train → serve)
- Cloud (AWS / GCP / Azure) + Docker / Kubernetes
- ASR & TTS integration in real-time streaming systems
- VAD, noise suppression, and barge-in handling
- Telephony APIs (Twilio, Vonage) or WebRTC experience
Nice-to-Have
- Whisper / wav2vec fine-tuning for domain adaptation
- Audio-language models (AudioPaLM, Qwen-Audio, Gemini Audio)
- Speaker diarization (pyannote.audio) or voice biometrics
- Prosody control, SSML, expressive TTS synthesis
- Multilingual ASR/TTS and code-switching pipelines
- RLHF / Constitutional AI alignment
- Vector DBs (Pinecone, Weaviate, pgvector)
- Open-source contributions or published research
Tech Stack
- Core: Python, PyTorch, FastAPI / Go, Kubernetes, MLflow
- LLM & GenAI: OpenAI / HuggingFace, LangChain, LlamaIndex, vLLM, RAG / Agents
- Voice AI: STT, TTS, WebRTC / WebSockets, pyannote.audio, Twilio / Vonage
- Audio Processing: librosa / FFmpeg, Silero VAD, openWakeWord, SSML / Prosody, AEC / Noise Suppression