Posted 12 June, 2026

Generative AI Engineer

Nuvo AI
Remote Nationwide, IN Full Time
Reference: 9791baf4737f0ff7

Job Description

We are looking for a Senior Generative AI Engineer to lead the development and deployment of our next-generation Automated Drafting Tool. This role requires ownership of the complete AI lifecycle, from local prototyping using Ollama to scaling production-grade AI systems through OpenAI APIs.

The ideal candidate has a Full-Stack AI mindset with strong expertise in:

- Retrieval-Augmented Generation (RAG)
- Vector databases and embeddings
- Prompt engineering
- LLM orchestration frameworks
- AI infrastructure and deployment

You should be passionate about building reliable, context-aware, and production-ready AI systems that generate high-quality drafts with strong grounding and minimal hallucinations.

Key Responsibilities

1. AI Architecture & Drafting Logic

- Design and implement end-to-end RAG pipelines optimized for automated document drafting.
- Develop advanced prompt engineering strategies to manage:
  - Tone consistency
  - Legal/technical compliance
  - Formatting requirements
  - Context preservation
- Implement hybrid AI workflows using:
  - Ollama for local development, testing, and privacy-sensitive workloads
  - OpenAI models (GPT-4o/o1) for advanced production reasoning
- Build “Agentic RAG” workflows capable of:
  - Multi-step reasoning
  - Self-correction
  - Context verification

2. Data & Vector Engineering

- Build and maintain scalable vector databases such as:
  - Pinecone
  - Weaviate
  - Milvus
  - FAISS
  - pgvector
- Optimize document ingestion pipelines, including:
  - Chunking strategies
  - Embedding model selection
  - Metadata filtering
  - Retrieval ranking
- Improve retrieval precision and contextual relevance for drafting workflows.
- Implement retrieval evaluation and grounding mechanisms to reduce hallucinations.

3. Deployment & MLOps (Local to Cloud)

- Bridge local AI experimentation with scalable cloud deployment environments.
- Deploy AI services using:
  - Docker
  - Kubernetes
  - Cloud infrastructure (AWS/GCP/Azure)
- Manage:
  - API latency
  - Rate limits
  - Token optimization
  - Cost efficiency
- Establish monitoring systems for:
  - Hallucination detection
  - Groundedness metrics
  - AI quality evaluation
  - Tracing and observability

Required Skills & Qualifications

Mandatory Experience

- 3 years of experience in:
  - AI/ML Engineering
  - Backend Engineering
  - Generative AI-focused product development
- Hands-on expertise with:
  - LangChain
  - LlamaIndex
- Strong experience with:
  - OpenAI API ecosystem
  - Ollama and local model runners
- Proven experience implementing and optimizing:
  - RAG pipelines
  - Vector databases
  - Embedding workflows
- Advanced Python development skills using:
  - FastAPI
  - Flask
  - Asynchronous programming
- Exposure to:
  - JIRA
  - Confluence

Technical Stack

Models
- OpenAI GPT-4 / GPT-4o
- Ollama
- Llama 3
- Mistral
- Mixtral

Frameworks & Tools
- LangChain
- LlamaIndex
- LangSmith

Databases
- Pinecone
- ChromaDB
- pgvector

Infrastructure & DevOps
- Docker
- Kubernetes
- AWS / GCP / Azure
- GitHub Actions (CI/CD)

What We Look For (The “Hacker” Mindset)

Production-Proven
You have successfully taken at least one GenAI product from a Jupyter Notebook / local prototype to a live production environment with real users.

Problem Solver
You understand the stochastic nature of LLMs and know how to:
- Build guardrails
- Reduce hallucinations
- Improve reliability
- Ensure grounded AI outputs

Architecture-First Thinking
You care deeply about:
- Scalability
- Latency optimization
- Token efficiency
- Cost management
- Output quality

Preferred Qualifications

- Experience building AI-powered drafting or document automation systems
- Knowledge of evaluation frameworks for LLM outputs
- Familiarity with multi-agent systems and agent orchestration
- Experience with enterprise AI security and privacy considerations
- Strong debugging and performance optimization skills

Why Join Us?

- Work on cutting-edge Generative AI products with real-world impact
- Build scalable AI systems from prototype to production
- Collaborate with a highly technical and innovation-driven team
- Opportunity to shape the future of AI-powered drafting and automation systems
