DataLensAi: Understand your data instantly.
Link to open source: https://github.com/Shourya523/data-dictionary
Link to Live Project: https://data-dictionary-flax.vercel.app/
The Intelligent Data Dictionary Agent
By Team LocalHost
The Intelligent Data Dictionary Agent is an AI-powered platform that transforms complex enterprise schemas into a business-friendly, continuously updated knowledge layer. It automates documentation and governance to improve data trust and accessibility across organizations.
Core Goals
- Automated Enrichment: Uses AI to generate clear, user-friendly descriptions and summaries for technical metadata.
- Intelligent Governance: Performs real-time quality analysis (completeness and freshness) and automatically flags PII for compliance.
- Natural Language Accessibility: Democratizes data through a conversational chat interface, letting users ask what data means without writing SQL.
- Automated Lineage Mapping: Constructs dynamic lineage graphs to visualize how data flows between tables and systems, identify dependencies, and perform impact analysis.
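To make the governance goal concrete, here is a minimal TypeScript sketch of how completeness, freshness, and a name-based PII hint could be computed for one column. The `ColumnProfile` shape, the `PII_PATTERNS` list, and the 7-day staleness threshold are assumptions made for illustration and are not taken from the project's code; a production detector would likely also inspect sampled values or use the AI provider.

```typescript
// Hypothetical shapes: names are illustrative, not the project's actual types.
interface ColumnProfile {
  name: string;
  rowCount: number;
  nullCount: number;
  lastUpdated: Date; // most recent write observed for the column's table
}

interface QualityReport {
  column: string;
  completeness: number; // share of non-null values, between 0 and 1
  stale: boolean; // true when the data is older than the allowed age
  piiHint: string | null; // label of the first matching pattern, if any
}

// Assumed name-based patterns; matching on column names alone is a heuristic.
const PII_PATTERNS: Array<[string, RegExp]> = [
  ["email", /e[-_]?mail/i],
  ["phone", /phone|mobile/i],
  ["national_id", /ssn|passport|national[-_]?id/i],
  ["birth_date", /dob|birth/i],
];

function detectPii(columnName: string): string | null {
  for (const [label, pattern] of PII_PATTERNS) {
    if (pattern.test(columnName)) return label;
  }
  return null;
}

function assessColumn(col: ColumnProfile, now: Date, maxAgeDays = 7): QualityReport {
  // Completeness: fraction of rows with a value; an empty table scores 0.
  const completeness =
    col.rowCount === 0 ? 0 : (col.rowCount - col.nullCount) / col.rowCount;
  // Freshness: age of the last update in days, compared with the threshold.
  const ageDays = (now.getTime() - col.lastUpdated.getTime()) / (24 * 60 * 60 * 1000);
  return {
    column: col.name,
    completeness,
    stale: ageDays > maxAgeDays,
    piiHint: detectPii(col.name),
  };
}
```

For example, a `customer_email` column with 200 rows and 50 nulls, last updated nine days before `now`, would be reported with a completeness of 0.75, marked stale, and hinted as `email`.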
How It Works
- Ingestion: Connects securely to source databases (PostgreSQL, Snowflake, etc.) using read-only connectors to extract schema metadata.
- AI Enrichment: Gemini or OpenAI models generate business context, detect sensitive data, and provide impact analysis.
- Graph Construction: The system parses foreign keys, query logs, and join patterns to build a relationship graph representing the data's lineage.
- Storage: Metadata and AI summaries are stored in Neon (PostgreSQL), using pgvector for semantic retrieval.
- Discovery: Users interact via a natural language chat that retrieves context from the vector store to answer data questions.
- Sync: Employs incremental updates so that documentation and lineage stay current with schema changes.
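The graph construction and impact analysis steps above can be sketched as follows. This is a simplified illustration that uses foreign keys only (query logs and join patterns are left out), and the `ForeignKey` shape and function names are hypothetical rather than the project's actual API. Edges point from a referenced table to the tables that reference it, so walking the graph from a changed table yields everything downstream of it.

```typescript
// Hypothetical foreign-key record; field names are assumptions for illustration.
interface ForeignKey {
  fromTable: string; // table holding the referencing column
  toTable: string; // table being referenced
}

// Map each table to the tables that depend on it (referenced -> referencing).
function buildLineageGraph(fks: ForeignKey[]): Map<string, string[]> {
  const graph = new Map<string, string[]>();
  for (const fk of fks) {
    if (!graph.has(fk.toTable)) graph.set(fk.toTable, []);
    if (!graph.has(fk.fromTable)) graph.set(fk.fromTable, []);
    const dependents = graph.get(fk.toTable)!;
    if (!dependents.includes(fk.fromTable)) dependents.push(fk.fromTable);
  }
  return graph;
}

// Breadth-first walk: every table reachable downstream of `changed`.
function impactedTables(graph: Map<string, string[]>, changed: string): string[] {
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  const impacted: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift()!;
    const dependents = graph.get(current) ?? [];
    for (const dependent of dependents) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        impacted.push(dependent);
        queue.push(dependent);
      }
    }
  }
  return impacted;
}

// Usage with made-up tables: orders references customers, and
// order_items references both orders and products.
const lineage = buildLineageGraph([
  { fromTable: "orders", toTable: "customers" },
  { fromTable: "order_items", toTable: "orders" },
  { fromTable: "order_items", toTable: "products" },
]);
const downstream = impactedTables(lineage, "customers");
// downstream is ["orders", "order_items"]
```

The `seen` set keeps the walk from looping if the schema contains circular references, which a lineage tool has to tolerate.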
The Tech Stack
- Frontend: Next.js and Tailwind CSS.
- Database & Storage: Neon (PostgreSQL) for metadata and pgvector for embeddings.
- ORM: Drizzle for schema management and multi-dialect adapters.
- AI Providers: Gemini and OpenAI.
- Infrastructure: Edge Functions for low-latency processing and Clerk/Neon Auth for secure authentication.
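To illustrate what storing embeddings buys, the following in-memory TypeScript sketch ranks documented entries by cosine similarity to a query embedding. In the described stack this ranking would presumably happen inside PostgreSQL through pgvector rather than in application code; the `DocEntry` shape and the tiny two-dimensional vectors are assumptions chosen to keep the example readable, and real embeddings from Gemini or OpenAI have far more dimensions.

```typescript
// Hypothetical stored entry: a documented column plus its embedding vector.
interface DocEntry {
  id: string;
  summary: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat its similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}

// Return the k entries most similar to the query embedding, best first.
function topK(query: number[], entries: DocEntry[], k: number): DocEntry[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.entry);
}
```

The chat interface would then pass the summaries of the top-ranked entries to the language model as context for answering the user's question.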
Outcome
This solution reduces operational costs through automated processing, accelerates decision-making with instant data discovery, and ensures proactive compliance via automatic PII detection and visual lineage tracking.
This build was submitted as a hackathon project.












