Posted 24 May, 2026

InXiteOut - Data Science Lead (NLP & GenAI)

Nexthire
IN Full Time
Reference: 136_762505_47748e432d87_1358352502

Data Science Lead (NLP & GenAI)

Summary

We are seeking a highly experienced and innovative Data Science Lead with 8+ years of expertise in core data science concepts, including at least 2 years of focused, hands-on experience in Natural Language Processing (NLP) and Generative AI (GenAI). You will lead strategic AI/ML initiatives, mentor junior data scientists, and deliver intelligent solutions that drive business value using both classical and modern machine learning techniques.

Key Responsibilities

Lead end-to-end design and delivery of data science solutions, from problem definition to deployment.

Design, build, and fine-tune NLP and GenAI models for tasks such as summarization, classification, question answering, translation, and chatbot applications.

Apply statistical modeling, predictive analytics, and machine learning algorithms on structured and unstructured datasets.

Collaborate with product, engineering, and business teams to translate high-level business problems into data science solutions.

Ensure scalability, reproducibility, and performance optimization in all machine learning workflows.

Work with large-scale data processing tools and frameworks in cloud-based environments.

Mentor junior data scientists, review their work, and collaborate on research and experimentation.

Track advancements in GenAI, LLMs, and NLP frameworks and bring innovation to enterprise AI use cases.

Mandatory Skills

Python: Strong proficiency in Python for data science, modeling, and scripting

Machine Learning: Hands-on experience with classical and ensemble models (e.g., Random Forest, XGBoost)

NLP (2+ years): Experience with transformers, tokenization, embeddings, sentiment analysis

GenAI & LLMs: Working with GPT-like models, fine-tuning, prompt engineering

Deep Learning (PyTorch / TensorFlow): Building and training deep learning models for NLP and other domains

Model Deployment: Deploying models via REST APIs, Docker, or cloud-native services

SQL & Data Manipulation: Strong ability to query, clean, and process data

Statistical Analysis: Applied statistics, hypothesis testing, and A/B testing

Version Control (Git): Experience using Git in collaborative environments

Optional / Nice-to-Have Skills

Vector Databases: Experience with FAISS, Pinecone, or ChromaDB for semantic search

RAG Architecture: Building Retrieval-Augmented Generation pipelines

LLM Orchestration: LangChain, LlamaIndex, or similar frameworks

Cloud Platforms (Azure/GCP/AWS): Cloud-based ML workflows, pipelines, and infrastructure

MLOps: Model tracking, monitoring, CI/CD with MLflow, Kubeflow, etc.

Big Data Tools: Spark, Databricks, or Hadoop ecosystem familiarity

Experiment Tracking: Tools like Weights & Biases, MLflow

Academic Research / Publications: Experience publishing whitepapers or research contributions

Hands-on experience with Databricks, preferably the Azure Databricks platform.

Hands-on experience with Delta Lake, preferably on the Azure Databricks and ADLS Gen2 platforms.

Educational Qualifications

Master's or PhD in Computer Science, Data Science, AI/ML, Statistics, or a related field.

Certifications (preferred but not mandatory)

Google Cloud or Azure AI Engineer / Data Scientist Associate

Databricks Certified Machine Learning Professional

DeepLearning.AI Generative AI certification

Hugging Face Transformers certification

Employment Type: Full Time
