TiempoLens AI by team Tiempo Legends
Link to open source: https://github.com/AMP0075/TiempoLegends_HackFest2
# TiempoLens AI
**Intelligent Data Dictionary Agent**
Automatically generate comprehensive, AI-enhanced data dictionaries from enterprise databases.
---
## Problem Statement
Enterprise organizations struggle with **tribal knowledge**, **stale documentation**, and **disconnected metadata** across their data estates. Data engineers waste hours manually documenting schemas, analysts can't find what data means, and data quality issues go undetected until they cause downstream failures.
**TiempoLens AI** solves this by connecting to your databases and automatically generating rich, AI-enhanced data dictionaries with quality profiling, lineage visualization, and an interactive chat interface for querying metadata in natural language.
---
## Key Features
| Feature | Description |
|---|---|
| **Multi-Database Connectivity** | PostgreSQL, MySQL, SQL Server, Snowflake, MongoDB |
| **Auto Schema Extraction** | Tables, columns, indexes, constraints, relationships |
| **DS-STAR AI Agents** | Planner → Coder → Verifier → Router pipeline for intelligent analysis |
| **Data Quality Profiling** | Completeness, freshness, uniqueness, validity scoring with alerts |
| **Time Series Forecasting** | Prophet-powered trend analysis on quality metrics |
| **Interactive AI Chat** | Natural language queries with streaming responses & voice I/O |
| **Data Lineage** | Visual table & column-level lineage graphs |
| **Doc Generation** | Markdown & JSON data dictionary export |
| **RAG Pipeline** | pgvector-powered semantic search over schema metadata |
| **Professional UI** | Dark/light themes, responsive layout, real-time updates |

---
## Architecture
```
+--------------+      +---------------------------------------------+
|   Next.js    |      |               FastAPI Backend               |
|   Frontend   |----->|                                             |
|  (React 18)  |      |  +----------+  +-----------+  +----------+  |
+------+-------+      |  |   REST   |  | WebSocket |  |  Celery  |  |
       |              |  |   API    |  | Streaming |  | Workers  |  |
  +----+----+         |  +-----+----+  +-----+-----+  +-----+----+  |
  |  Nginx  |         |        |             |              |       |
  | Reverse |         |  +-----+-------------+--------------+----+  |
  |  Proxy  |         |  |             Service Layer             |  |
  +---------+         |  |  Connection | Schema | Quality | RAG  |  |
                      |  +-------------------+-------------------+  |
                      |                      |                      |
                      |  +-------------------+-------------------+  |
                      |  |          DS-STAR Agent Layer          |  |
                      |  |  Planner → Coder → Verifier → Router  |  |
                      |  +-------------------+-------------------+  |
                      |                      |                      |
                      +----------------------+----------------------+
                                             |
                      +----------------------+----------------------+
                      |  PostgreSQL          |  Redis               |
                      |  + pgvector          |  Cache / PubSub      |
                      |  (Metadata DB)       |  (Celery Broker)     |
                      +----------------------+----------------------+
```
### DS-STAR Agent Pattern
Based on the research paper *"Data Science with LLM and Agents"*, our multi-agent system follows the DS-STAR architecture:
1. **Router**: Classifies incoming queries and routes them to the right agent
2. **Planner**: Decomposes complex questions into executable steps
3. **Coder**: Generates SQL, Python code, or analytical answers
4. **Verifier**: Validates outputs for correctness and completeness
5. **Analyzer**: Specialized metadata analysis with business-friendly explanations
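
The agent flow listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's LangGraph implementation: the `AgentState` fields, the keyword-based router, and the stubbed planner, coder, analyzer, and verifier are all hypothetical stand-ins for LLM calls.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical state passed between agents (not the project's real schema)."""
    query: str
    route: str = ""
    plan: list[str] = field(default_factory=list)
    output: str = ""
    verified: bool = False


def router(state: AgentState) -> AgentState:
    # Stand-in for an LLM classifier: a keyword check picks the route.
    metadata_words = ("describe", "meaning", "what is")
    is_metadata = any(word in state.query.lower() for word in metadata_words)
    state.route = "analyzer" if is_metadata else "planner"
    return state


def planner(state: AgentState) -> AgentState:
    # Decompose the question into ordered steps (hard-coded for the sketch).
    state.plan = ["identify tables", "write query", "summarize result"]
    return state


def coder(state: AgentState) -> AgentState:
    # A real coder agent would generate SQL or Python from the plan.
    state.output = "SELECT COUNT(*) FROM orders;"
    return state


def analyzer(state: AgentState) -> AgentState:
    # A real analyzer would explain metadata in business terms.
    state.output = f"Metadata explanation for: {state.query}"
    return state


def verifier(state: AgentState) -> AgentState:
    # Minimal check: accept any non-empty output.
    state.verified = bool(state.output.strip())
    return state


def run_pipeline(query: str) -> AgentState:
    """Route the query, run the chosen branch, then verify the output."""
    state = router(AgentState(query=query))
    if state.route == "analyzer":
        state = analyzer(state)
    else:
        state = coder(planner(state))
    return verifier(state)


if __name__ == "__main__":
    result = run_pipeline("How many orders were placed last month?")
    print(result.route, result.verified, result.output)
```

Keeping every agent as a function from state to state makes each step easy to test in isolation, which is the same shape a graph-based orchestrator expects from its nodes.
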
---
## Tech Stack
### Backend
- **Python 3.12** + **FastAPI** (async, HATEOAS REST)
- **SQLAlchemy 2.0** (async ORM with AsyncAdaptedQueuePool)
- **LangChain / LangGraph** (agent orchestration)
- **pgvector** (vector similarity search for RAG)
- **Celery** + **Redis** (background task processing)
- **Prophet** (time series forecasting)
### Frontend
- **Next.js 14** + **React 18** + **TypeScript** (strict mode)
- **Shadcn UI** + **Tailwind CSS** (component library)
- **Zustand** (state management)
- **ECharts** / **React Flow** (visualizations)
- **Web Speech API** (voice input/output)
### Infrastructure
- **Docker Compose** (7 services)
- **Nginx** (reverse proxy, rate limiting, WebSocket)
- **PostgreSQL 16** + pgvector (metadata & vector store)
- **Redis 7** (cache, pub/sub, Celery broker)
---
With the services running locally, the application is reachable at:
- **Frontend**: [http://localhost:3000](http://localhost:3000)
- **Backend API**: [http://localhost:8000](http://localhost:8000)
- **API Docs**: [http://localhost:8000/api/docs](http://localhost:8000/api/docs)
---
## Project Structure
```
├── backend/
│   ├── app/
│   │   ├── agents/              # DS-STAR agent layer
│   │   │   ├── graph.py         # LangGraph state machine
│   │   │   ├── planner.py       # Query decomposition
│   │   │   ├── coder.py         # Code/SQL generation
│   │   │   ├── verifier.py      # Output validation
│   │   │   ├── router_agent.py  # State routing
│   │   │   └── analyzer.py      # Metadata analysis
│   │   ├── api/v1/              # REST API routes
│   │   ├── core/                # Cache, security, middleware
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── schemas/             # Pydantic request/response DTOs
│   │   ├── services/            # Business logic layer
│   │   ├── workers/             # Celery tasks
│   │   ├── config.py            # App configuration
│   │   ├── database.py          # Async DB engine
│   │   └── main.py              # FastAPI app entry point
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── app/                 # Next.js app router
│   │   ├── components/
│   │   │   ├── layout/          # Sidebar, header
│   │   │   ├── ui/              # Shadcn primitives
│   │   │   └── views/           # Dashboard, explorer, quality, chat, lineage, settings
│   │   └── lib/                 # Store, API client, utils
│   ├── Dockerfile
│   └── package.json
├── nginx/
│   └── nginx.conf
├── scripts/
│   └── init-db.sql              # Database schema initialization
├── docker-compose.yml
├── .env.example
└── README.md
```
---
## API Endpoints
### Connections
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/connections` | List all connections |
| POST | `/api/v1/connections` | Create connection |
| POST | `/api/v1/connections/test` | Test connection |
| POST | `/api/v1/connections/{id}/sync` | Sync schema |

### Schema
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/schemas/{conn_id}/overview` | Schema overview |
| GET | `/api/v1/schemas/{conn_id}/tables/{table}` | Table detail |
| GET | `/api/v1/schemas/{conn_id}/relationships` | All relationships |

### Data Quality
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/quality/{conn_id}/overview` | Quality dashboard |
| POST | `/api/v1/quality/{conn_id}/profile/{table}` | Profile table |
| GET | `/api/v1/quality/{conn_id}/alerts` | Quality alerts |
| POST | `/api/v1/quality/{conn_id}/timeseries` | Time series analysis |

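
To make the profiling dimensions concrete, here is a minimal sketch of how completeness, uniqueness, and freshness scores could be computed for a single column. The formulas are assumptions chosen for illustration (in particular the linear freshness decay and the `max_age` parameter); the README does not specify how the backend scores these metrics.

```python
from datetime import datetime, timedelta, timezone


def completeness(values: list) -> float:
    """Share of non-null values in a column (1.0 for an empty column)."""
    if not values:
        return 1.0
    return sum(v is not None for v in values) / len(values)


def uniqueness(values: list) -> float:
    """Share of distinct values among the non-null values."""
    present = [v for v in values if v is not None]
    if not present:
        return 1.0
    return len(set(present)) / len(present)


def freshness(last_updated: datetime, now: datetime, max_age: timedelta) -> float:
    """1.0 when just updated, falling linearly to 0.0 at max_age (assumed model)."""
    age = now - last_updated
    return min(1.0, max(0.0, 1.0 - age / max_age))


if __name__ == "__main__":
    emails = ["a@x.com", None, "b@x.com", "a@x.com"]
    now = datetime(2026, 2, 22, tzinfo=timezone.utc)
    print(completeness(emails))          # 0.75
    print(round(uniqueness(emails), 3))  # 0.667
    print(freshness(now - timedelta(hours=6), now, timedelta(hours=24)))  # 0.75
```
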
### Chat
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/chat/sessions` | Create chat session |
| POST | `/api/v1/chat/sessions/{id}/messages` | Send message |
| WS | `/api/v1/chat/ws/{session_id}` | Streaming WebSocket |

### Export
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/export/{conn_id}/dictionary` | Generate data dictionary |
| POST | `/api/v1/export/{conn_id}/ai-docs` | Generate AI documentation |

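
A Markdown data dictionary export could look roughly like the sketch below. The input shape (a mapping of table name to column dictionaries with `name`, `type`, `nullable`, and `description` keys) and the `render_dictionary` helper are hypothetical; the project's actual export format may differ.

```python
def render_dictionary(tables: dict[str, list[dict]]) -> str:
    """Render table metadata as a Markdown data dictionary."""
    lines = ["# Data Dictionary", ""]
    for table_name in sorted(tables):
        lines += [
            f"## {table_name}",
            "",
            "| Column | Type | Nullable | Description |",
            "|---|---|---|---|",
        ]
        for col in tables[table_name]:
            # Columns are treated as nullable unless stated otherwise.
            nullable = "yes" if col.get("nullable", True) else "no"
            description = col.get("description", "")
            lines.append(f"| {col['name']} | {col['type']} | {nullable} | {description} |")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "orders": [
            {"name": "order_id", "type": "uuid", "nullable": False, "description": "Primary key"},
            {"name": "placed_at", "type": "timestamp", "description": "Order creation time"},
        ]
    }
    print(render_dictionary(sample))
```

The same `tables` structure can be passed to `json.dumps` to produce the JSON variant of the export.
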
### Lineage
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/lineage/{conn_id}/graph` | Full lineage graph |
| GET | `/api/v1/lineage/{conn_id}/table/{table}` | Table lineage |

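
One simple way to picture table-level lineage is as a graph walk over foreign-key relationships. The sketch below assumes lineage edges come from `(child, parent)` foreign-key pairs, which is an assumption for illustration; the README does not say how the backend derives its lineage graph, and the table names are made up.

```python
from collections import defaultdict, deque


def build_graph(relationships: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Map each table to the tables it references (child -> parents)."""
    graph: dict[str, set[str]] = defaultdict(set)
    for child, parent in relationships:
        graph[child].add(parent)
    return graph


def upstream(graph: dict[str, set[str]], table: str) -> list[str]:
    """Breadth-first search for every table that `table` depends on."""
    seen: set[str] = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for parent in graph.get(current, ()):
            if parent not in seen and parent != table:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical foreign keys, loosely modelled on an e-commerce schema.
    fks = [
        ("order_items", "orders"),
        ("order_items", "products"),
        ("orders", "customers"),
        ("reviews", "orders"),
    ]
    graph = build_graph(fks)
    print(upstream(graph, "order_items"))  # ['customers', 'orders', 'products']
```
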
---
## Database Schema
Four PostgreSQL schemas organize all metadata:
- **`metadata`**: connections, schemas, tables, columns, relationships, indexes, constraints
- **`quality`**: table_profiles, column_profiles, alerts, trend_metrics
- **`ai`**: embeddings (pgvector), documentation_artifacts, chat_sessions, chat_messages
- **`ai`**-adjacent auditing lives in **`audit`**: schema_versions, activity_log
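
The semantic search over stored embeddings can be illustrated without a database. pgvector's `<=>` operator returns cosine distance (one minus cosine similarity), and the sketch below reproduces that ranking in pure Python. The three-dimensional vectors, the column names used as keys, and the `top_k` helper are toy assumptions; real embeddings have hundreds of dimensions and would be ranked inside PostgreSQL.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


def top_k(query: list[float], documents: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k document keys closest to the query vector."""
    ranked = sorted(documents, key=lambda name: cosine_distance(query, documents[name]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy embeddings for schema elements (hypothetical values).
    embeddings = {
        "orders.order_date": [0.9, 0.1, 0.0],
        "customers.email": [0.0, 0.2, 0.9],
        "orders.total_amount": [0.7, 0.6, 0.1],
    }
    print(top_k([1.0, 0.2, 0.0], embeddings))
    # ['orders.order_date', 'orders.total_amount']
```
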
---
## Recommended Test Datasets
| Dataset | Source | Tables | Description |
|---|---|---|---|
| **Olist Brazilian E-Commerce** | Kaggle | 9 | Orders, customers, products, reviews |
| **Bike Store** | SQLServerTutorial | 9 | Sales, inventory, staff |
| **Chinook** | GitHub | 11 | Digital media store |

---
## Team Tiempo Legends
Built for **HackFest 2.0** (February 21-22, 2026)
---
This build was uploaded as a hackathon project.








