Posted 17 June, 2026

AI/Data Engineer

All European Careers
Noida, UP, IN Full Time
Reference: ea467ee2d0461094

Job Description

For an international project in Rome, we are urgently looking for an AI/Data Engineer. The AI/Data Engineer should have strong hands-on experience in data engineering, search technologies, and the implementation of AI-powered search or Retrieval-Augmented Generation (RAG) applications.


Candidates need to be fluent in English.


Tasks and responsibilities:

  • Design and implement scalable data ingestion pipelines that load structured, semi-structured, and unstructured content from various enterprise data sources (SharePoint, Liferay, web crawls, etc.) into Elasticsearch or an equivalent search index;
  • Develop connectors and ingestion jobs capable of supporting batch, incremental, and near-real-time indexing patterns;
  • Design mechanisms to track document versions, source provenance, access permissions, update timestamps, and deletion events to ensure the search index remains accurate and current;
  • Develop content extraction pipelines for various file formats, including PDF, Word, Excel, PowerPoint, HTML, emails, scanned documents, and other common enterprise formats, and convert them into standard Markdown to improve AI readability;
  • Design and implement semantic chunking strategies optimized for retrieval quality, including chunk size, overlap, section-aware splitting, heading preservation, table handling, and context retention;
  • Implement metadata extraction and enrichment, and deduplicate content during ingestion;
  • Develop hybrid search capabilities combining keyword-based search, semantic vector search, metadata filtering, and contextual retrieval;
  • Develop re-ranking pipelines using cross-encoder models, LLM-based ranking, or other suitable re-ranking techniques to improve the relevance of retrieved results;
  • Design and build the Retrieval-Augmented Generation pipeline that retrieves relevant enterprise content and uses large language models to generate grounded answers;
  • Implement agentic workflows in which the AI application can invoke tools, perform multi-step reasoning, call enterprise APIs, refine searches, and retrieve additional context to answer user queries;


Profile:

  • Master's degree;
  • 5+ years of professional IT work experience;
  • 3+ years of hands-on experience building AI-powered search, semantic search, Retrieval-Augmented Generation, or LLM-based applications is required;
  • Experience building production-grade RAG applications with observability, monitoring, evaluation, security controls, and CI/CD practices is required;
  • Experience implementing agentic AI workflows, tool calling, or enterprise AI assistants is required;
  • Proficiency in Python and experience with data processing frameworks and libraries;
  • Strong hands-on experience with Elasticsearch, OpenSearch, Azure AI Search, or similar enterprise search platforms;
  • Experience designing index mappings, metadata filters, ranking profiles, synonyms, and relevance tuning;
  • Experience with vector search, approximate nearest neighbour search, hybrid retrieval, and semantic search;
  • Experience implementing semantic chunking strategies that split content into contextually coherent sections while preserving headings, structure, metadata, and parent-document relationships to improve retrieval accuracy in RAG applications;
  • Experience designing ingestion pipelines that convert enterprise documents into structured Markdown using tools such as Marker, Docling, or equivalent document conversion frameworks to preserve layout, tables, headings, and metadata for downstream RAG indexing;
  • Fluent in English;
