Oct 4, 2025

IntelliDoc Assistant

Tags: devfestdelhi2025, google, gemma, ai, rag, share your build!, project showcase, gdgnewdelhi, innovation, devfestdilli2025, buildwithgoogle, aiprojects, llms, langchain, vector db, vector embeddings, hugging face

What this Build is About

IntelliDoc Assistant is an AI-powered RAG-based system that enables intelligent querying of PDFs by extracting insights from text, tables and scanned data using LangChain, Google Gemma 3, PaddleOCR, FAISS and Ollama.

IntelliDoc Assistant is designed to make PDF documents interactive. It uses Docling for structured parsing and PaddleOCR to extract text from scanned or screenshot-based PDFs, then splits the content into chunks and encodes them as semantic embeddings so that unstructured content becomes queryable. FAISS stores the vectors and LangChain orchestrates retrieval, so a single query can search across text, tables, and OCR-derived data. A large language model (LLM), Gemma-3-4B-Instruct served locally through Ollama, generates context-aware answers grounded in the retrieved document chunks. This makes the assistant a multimodal solution for enterprise documents that mix text, tables, and scans.
Built on a Retrieval-Augmented Generation (RAG) architecture, it combines the following components to handle textual, tabular, and scanned image data:

  • Document Ingestion & OCR → Powered by Docling and PaddleOCR, enabling structured extraction from digital and scanned PDFs.

  • Embeddings & Vectorization → Uses Hugging Face Sentence Transformers (all-MiniLM-L6-v2) for semantic embeddings, so that retrieval matches on meaning rather than on exact keywords.

  • Vector Storage & Retrieval → Employs FAISS for high-speed similarity search with Maximum Marginal Relevance (MMR) to improve diversity in retrieval.

  • Language Models → Runs local inference through Ollama and supports Google’s Gemma-3-4B-Instruct and Gemini models for context-grounded natural language responses.

  • RAG Orchestration → Uses LangChain to connect embeddings, retrievers and LLMs into a single question-answering (QA) pipeline.
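
To illustrate the retrieval step listed above, here is a minimal pure-Python sketch of Maximum Marginal Relevance (MMR) selection. It is not the project's code: the project relies on FAISS and LangChain for this, and the function names (`cosine`, `mmr_select`), the `lambda_mult` parameter name, and the toy two-dimensional vectors are assumptions made for illustration. The sketch only shows the idea of trading relevance to the query against redundancy with chunks already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, chunk_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k chunks chosen by Maximum Marginal Relevance.

    lambda_mult = 1.0 ranks purely by relevance to the query;
    lower values penalise chunks similar to ones already selected.
    """
    relevance = [cosine(query_vec, v) for v in chunk_vecs]
    candidates = list(range(len(chunk_vecs)))
    selected = []
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            # Redundancy: highest similarity to any chunk picked so far.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Hypothetical toy embeddings: chunks 0 and 1 are near-duplicates, chunk 2 differs.
query = [1.0, 0.0]
chunks = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]

relevance_only = mmr_select(query, chunks, k=2, lambda_mult=1.0)
diverse = mmr_select(query, chunks, k=2, lambda_mult=0.3)
print(relevance_only, diverse)
```

With pure relevance ranking the two near-duplicate chunks are both returned, while a lower `lambda_mult` swaps the second one for the less similar chunk. That is the diversity effect the MMR bullet refers to.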

Motivation

  • Organizations like NHPC deal with thousands of policy documents, amendments, bills, and audit reports.

  • Employees spend hours searching for relevant clauses, allowances, or rules buried in text or tables.

  • Traditional keyword-based search fails when dealing with scanned images, inconsistent tables, or queries requiring reasoning (e.g., “Show me allowances between 20–25%”).

  • The motivation was to build a smart, multimodal assistant that reduces information retrieval time and improves decision-making.


Why I Built It

The motivation for building this assistant came from real-world challenges faced by employees in organizations like NHPC, who deal with large volumes of policy documents, amendments, audit reports, and circulars. Manually searching through these documents is time-consuming, especially when information is hidden in complex tables or scanned images. Traditional keyword-based search often fails in such cases, as it lacks semantic understanding and cannot handle numeric reasoning (e.g., "Find allowances between 20–25%"). I built this project to create a scalable, intelligent solution that reduces information retrieval time, ensures accuracy, and empowers users with a conversational way to access enterprise knowledge.

How It Can Be Useful for Others

This system can be extended far beyond NHPC’s internal use:

  • Enterprises - Employees can instantly query HR policies, financial reports, contracts, and notices, saving hours of manual search.

  • Legal/Government Sector - Acts, laws, and amendments can be queried conversationally, improving accessibility for officials and citizens.

  • Education & Research - Students and researchers can interact with study materials, technical papers, and e-books without scanning through lengthy PDFs.

  • General Users - Anyone managing invoices, scanned bills, or official forms can retrieve specific details quickly with natural language queries.

By combining OCR, table normalization, fuzzy matching, vector embeddings and LLM-based reasoning, IntelliDoc Assistant demonstrates how AI can bridge the gap between static documents and intelligent knowledge systems.
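
As a rough illustration of how table normalization and fuzzy matching could support a numeric query such as "allowances between 20–25%", the sketch below uses only the Python standard library. Everything in it is an assumption for illustration: the helper names (`parse_percent`, `fuzzy_column`, `rows_in_range`), the 0.6 similarity threshold, and the sample table with an OCR-damaged header are hypothetical and are not taken from the project.

```python
import re
from difflib import SequenceMatcher


def parse_percent(cell):
    """Normalize a cell such as ' 22,5 % ' to a float; return None if it has no number."""
    match = re.search(r"\d+(?:[.,]\d+)?", cell)
    return float(match.group().replace(",", ".")) if match else None


def fuzzy_column(headers, wanted, threshold=0.6):
    """Return the index of the header most similar to `wanted`, or None if below threshold."""
    scored = [
        (SequenceMatcher(None, header.strip().lower(), wanted.lower()).ratio(), i)
        for i, header in enumerate(headers)
    ]
    best_ratio, best_idx = max(scored)
    return best_idx if best_ratio >= threshold else None


def rows_in_range(table, column, low, high):
    """Return data rows whose value in the fuzzily matched column lies in [low, high]."""
    headers, *rows = table
    idx = fuzzy_column(headers, column)
    if idx is None:
        return []
    result = []
    for row in rows:
        value = parse_percent(row[idx])
        if value is not None and low <= value <= high:
            result.append(row)
    return result


# Hypothetical OCR output: the header "Rate (%)" was read as "Rat e (%)".
table = [
    ["Allowance Type", "Rat e (%)"],
    ["House Rent", "24 %"],
    ["Transport", "10%"],
    ["Hardship", "20.0 %"],
    ["Special Duty", "n/a"],
]

matches = rows_in_range(table, "Rate (%)", 20, 25)
print(matches)
```

The range check runs on normalized numbers rather than on embeddings, since similarity search on its own has no notion of "between 20 and 25". In a full pipeline this kind of filter would presumably sit alongside vector retrieval rather than replace it.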

Discussion
Disha Arora, 7 months ago

Yes, done!!

Arpan Garg, 7 months ago

Thanks for sharing! Have you registered here? https://www.commudle.com/fill-form/3897
