Feb 21, 2026

DataSage AI – Intelligent Metadata & Data Quality Copilot


Overview

DataSage AI is an AI-powered metadata intelligence platform that automatically extracts database schema, evaluates data quality, and generates business-friendly documentation. It bridges the gap between technical database structures and business understanding.

In many organizations, database documentation is outdated or too technical, making it difficult for business users to understand what data means and how reliable it is. DataSage AI solves this by combining automated profiling with AI-driven interpretation and conversational querying.

 

Problem We Solve

Enterprise databases often lack:

  • Updated documentation

  • Business context for technical schema

  • Clear data reliability indicators

  • Easy accessibility for non-technical users

This leads to slow analytics, low data trust, and inefficient decision-making.

Our Solution

DataSage AI automatically:

  • Connects to PostgreSQL databases

  • Extracts complete schema metadata (tables, columns, keys)

  • Performs intelligent data profiling (null %, uniqueness, DQ score)

  • Generates AI-powered business summaries

  • Enables natural language chat with the database schema

  • Generates SQL queries from user questions

We don’t just document databases; we make them understandable.

Key Features

🔹 Automated Metadata Extraction

Extracts tables, columns, primary keys, and relationships automatically.
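
A minimal sketch of how this extraction could look with SQLAlchemy's inspector (SQLAlchemy is listed in the tech stack). The function name extract_schema and the shape of the returned dictionary are illustrative assumptions, not the project's actual code.

```python
from sqlalchemy import create_engine, inspect


def extract_schema(engine) -> dict:
    """Collect tables, columns, primary keys, and foreign keys into a plain dict."""
    inspector = inspect(engine)
    schema = {}
    for table in inspector.get_table_names():
        schema[table] = {
            "columns": [
                {
                    "name": col["name"],
                    "type": str(col["type"]),
                    "nullable": col["nullable"],
                }
                for col in inspector.get_columns(table)
            ],
            "primary_key": inspector.get_pk_constraint(table)["constrained_columns"],
            # Foreign keys stand in for the "relationships" between tables.
            "foreign_keys": [
                {
                    "columns": fk["constrained_columns"],
                    "referred_table": fk["referred_table"],
                    "referred_columns": fk["referred_columns"],
                }
                for fk in inspector.get_foreign_keys(table)
            ],
        }
    return schema


# Hypothetical usage against PostgreSQL (placeholder credentials):
# engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydb")
# schema = extract_schema(engine)
```

Because the inspector API is dialect-neutral, the same function should work unchanged against other databases that SQLAlchemy supports.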

🔹 Intelligent Data Profiling

Calculates:

  • Null percentage

  • Distinct count

  • Completeness

  • Uniqueness

  • Data Quality Score
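
The metrics above can be sketched with Pandas roughly as follows. The source does not say how the Data Quality Score is computed, so the weighted blend of completeness and uniqueness in dq_score (and the 0.7 / 0.3 weights) is purely an assumption for illustration.

```python
import pandas as pd


def profile_column(series: pd.Series) -> dict:
    """Compute null %, distinct count, completeness, and uniqueness for one column."""
    total = len(series)
    if total == 0:
        return {"null_pct": 0.0, "distinct_count": 0, "completeness": 0.0, "uniqueness": 0.0}
    null_count = int(series.isna().sum())
    non_null = total - null_count
    distinct = int(series.nunique(dropna=True))
    return {
        "null_pct": round(100 * null_count / total, 2),
        "distinct_count": distinct,
        # Completeness: share of rows that have a value.
        "completeness": round(non_null / total, 4),
        # Uniqueness: share of non-null values that are distinct.
        "uniqueness": round(distinct / non_null, 4) if non_null else 0.0,
    }


def dq_score(profile: dict, w_completeness: float = 0.7, w_uniqueness: float = 0.3) -> float:
    """Hypothetical 0-100 score; the weights are an assumption, not the project's formula."""
    blended = w_completeness * profile["completeness"] + w_uniqueness * profile["uniqueness"]
    return round(100 * blended, 1)


def profile_table(df: pd.DataFrame) -> dict:
    """Profile every column of a DataFrame and attach a score to each."""
    result = {}
    for name in df.columns:
        stats = profile_column(df[name])
        stats["dq_score"] = dq_score(stats)
        result[name] = stats
    return result
```

Note that high uniqueness is only desirable for identifier-like columns; a categorical column such as a status field would be penalized unfairly by this simple blend, so a real score would likely weight columns by their role.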

🔹 AI-Generated Business Documentation

Converts technical schema into:

  • Business summary

  • Use cases

  • Risk insights

  • Data quality recommendations
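
One plausible way to request those four outputs is to assemble a prompt from the extracted schema and the profiling stats. The sketch below only builds the prompt text; the function name build_doc_prompt, the input layout, and the wording are assumptions, and the call to the Gemini API is deliberately left out.

```python
SECTIONS = [
    "Business summary",
    "Use cases",
    "Risk insights",
    "Data quality recommendations",
]


def build_doc_prompt(table: str, columns: list[dict], profiles: dict[str, dict]) -> str:
    """Assemble a documentation prompt from schema metadata and profiling stats."""
    lines = [f"Table: {table}", "Columns:"]
    for col in columns:
        stats = profiles.get(col["name"], {})
        lines.append(
            f"- {col['name']} ({col['type']}), "
            f"null %: {stats.get('null_pct', 'n/a')}, "
            f"distinct: {stats.get('distinct_count', 'n/a')}"
        )
    lines.append("")
    lines.append("Write documentation for a non-technical reader with these sections:")
    lines.extend(f"{i}. {name}" for i, name in enumerate(SECTIONS, start=1))
    return "\n".join(lines)


# Example input (made-up table and stats, for illustration only).
prompt = build_doc_prompt(
    "orders",
    [{"name": "order_id", "type": "INTEGER"}, {"name": "coupon", "type": "TEXT"}],
    {"coupon": {"null_pct": 62.5, "distinct_count": 14}},
)
print(prompt)
```

The resulting string would then be sent to whichever model client the project uses. Including the profiling stats in the prompt gives the model something concrete to base the risk and quality sections on.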

🔹 Conversational Schema Intelligence

Users can ask:

  • “Which table contains revenue?”

  • “Is customer data reliable?”

  • “Which columns have high null values?”
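
Questions like these are typically answered by first retrieving the most relevant table descriptions and then passing them to the model. The tech stack lists ChromaDB for this retrieval step; the sketch below swaps in a plain bag-of-words cosine similarity as a dependency-free stand-in, so it shows the retrieval idea rather than the project's actual vector search. All names and sample descriptions are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count word-like tokens."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k table descriptions most similar to the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda name: cosine(q, tokenize(docs[name])), reverse=True)
    return ranked[:k]


# Made-up table descriptions, for illustration only.
table_docs = {
    "orders": "orders table: order_id, customer_id, revenue amount per order",
    "customers": "customers table: customer name, email, signup date",
}
best = retrieve("Which table contains revenue?", table_docs, k=1)
```

A real embedding-based search would also match paraphrases (for example "sales" against "revenue"), which simple word overlap cannot do.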

🔹 SQL Generation

Natural language → Correct PostgreSQL query.
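
Since a generated query would run against a live database, one reasonable safeguard is to accept only single, read-only statements before executing them. The check below is a hypothetical, conservative sketch and is not described in the source; it is a keyword filter, not a SQL parser, and may reject valid queries that mention these keywords inside string literals.

```python
import re

# Keywords that indicate a write or DDL operation.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT or WITH
    and contains no write/DDL keywords."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or more than one statement
    if not re.match(r"(?is)^(select|with)\b", statement):
        return False
    return FORBIDDEN.search(statement) is None
```

Connecting with a database role that has only SELECT privileges would be a stronger guarantee than string checks alone.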

Tech Stack

Backend:

  • Flask

  • SQLAlchemy

  • Pandas

AI Layer:

  • Gemini API

  • RAG (Vector Search using ChromaDB)

Database:

  • PostgreSQL

Frontend:

  • Bootstrap Dashboard

Future Scope

  • Multi-database support (Snowflake, SQL Server)

  • Real-time schema monitoring

  • Data lineage visualization

  • Enterprise SaaS deployment

  • Role-based access control

  • ML-based anomaly detection

Impact

DataSage AI:

  • Reduces manual documentation effort by up to 80%

  • Improves data trust across teams

  • Empowers non-technical users

  • Accelerates analytics workflows

This project has strong potential to evolve into a scalable enterprise data intelligence SaaS platform.

This build was uploaded as a hackathon project.

Hackathon

HackFest 2.0
