Data Compliance Agent
Link to open source: https://github.com/ManishKudtarkar/Data-Policy-Agent
Link to Live Project: https://data-policy-agent-iavh.onrender.com/
This project is built around the concept of Automated Regulatory Intelligence. It solves a major pain point for businesses: the gap between "what the rules say" (legal PDFs) and "what the data shows" (company databases).
In a traditional company, compliance officers have to manually read 100-page policy documents and then manually check spreadsheets or databases to see if any rules were broken. This project replaces that manual process with an AI Agent that can read, think, and execute audits.
Core Value Proposition
The project essentially builds a "Digital Compliance Officer" that performs three main tasks:
- Semantic Translation: It uses an LLM (Gemini) to "read" legal prose and translate it into logic (SQL queries).
- Continuous Enforcement: It doesn't just check once; it monitors the database continuously for new violations.
- Explainable Auditing: For every flagged record, it doesn't just say "Violation"; it provides a human-readable justification based on the specific section of the policy it read.
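The three tasks above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Rule` dataclass, the `audit` helper, the `transactions` table, and the policy wording are all hypothetical, and the SQL is written by hand to stand in for what the LLM would generate.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Rule:
    """One policy clause translated into an executable check (hypothetical shape)."""
    section: str  # where the clause lives in the policy document
    text: str     # the original legal prose
    sql: str      # query returning ids of violating rows with id > :last_id


def audit(conn, rule, last_id=0):
    """Run one rule against rows newer than last_id and explain each hit."""
    rows = conn.execute(rule.sql, {"last_id": last_id}).fetchall()
    findings = [
        {"record_id": row[0],
         "justification": f"Violates {rule.section}: {rule.text}"}
        for row in rows
    ]
    # The highest id seen so far lets the next run skip already-audited rows.
    newest = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM transactions"
    ).fetchone()[0]
    return findings, newest


# Assumed demo schema and data, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount REAL, approved INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions (amount, approved) VALUES (?, ?)",
    [(500.0, 0), (15000.0, 0), (20000.0, 1)],
)

# Hand-written stand-in for an LLM-generated rule.
rule = Rule(
    section="Section 4.2",
    text="Transfers above 10,000 require prior approval.",
    sql=(
        "SELECT id FROM transactions "
        "WHERE amount > 10000 AND approved = 0 AND id > :last_id"
    ),
)

findings, last_id = audit(conn, rule)

# A second pass after new data arrives only inspects the new rows.
conn.execute(
    "INSERT INTO transactions (amount, approved) VALUES (?, ?)", (12000.0, 0)
)
new_findings, last_id = audit(conn, rule, last_id)
```

Keeping the policy section and original wording next to the SQL is what makes each finding explainable, and passing `last_id` back in is one simple way to approximate continuous monitoring with periodic runs.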
The "Fusion" Strategy
To make the agent robust and "production-ready," the project combines three datasets that cover multiple compliance domains:
| Dataset | Domain | What the Agent Learns |
| --- | --- | --- |
| IBM AML | Finance | How to detect "Money Laundering" patterns (Fan-In/Fan-Out). |
| PaySim | Mobile Money | How to detect "Fraud" and balance-manipulation patterns. |
| Employee Policy | HR | How to detect internal policy breaches (Leave, Attendance). |
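As an illustration of the kind of query the first row calls for, a fan-in pattern (many distinct senders paying into one account) can be expressed as a grouped SQL query. The `transfers` table, its column names, and the threshold of three senders are assumptions made for this sketch; they are not taken from the IBM AML dataset schema.

```python
import sqlite3

# Assumed demo table: one row per transfer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (sender TEXT, receiver TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?, ?)",
    [
        ("A", "X", 900.0), ("B", "X", 950.0), ("C", "X", 980.0),  # three senders -> X
        ("A", "Y", 100.0), ("A", "Y", 200.0),                      # one sender -> Y
    ],
)

# Fan-in: receivers funded by at least :min_senders distinct senders.
FAN_IN_SQL = """
    SELECT receiver, COUNT(DISTINCT sender) AS n_senders, SUM(amount) AS total
    FROM transfers
    GROUP BY receiver
    HAVING COUNT(DISTINCT sender) >= :min_senders
    ORDER BY receiver
"""


def find_fan_in(conn, min_senders=3):
    """Return (receiver, distinct sender count, total received) for suspects."""
    return conn.execute(FAN_IN_SQL, {"min_senders": min_senders}).fetchall()


suspects = find_fan_in(conn)
```

Fan-out is the mirror image: group by `sender` and count distinct receivers instead.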
Key Technical Features
- Multimodal Ingestion: The system can "see" a PDF. It doesn't just scrape text; it understands the structure of a policy document.
- Agentic Reasoning: Instead of hardcoded rules, the agent generates the rules dynamically. If the company updates its policy PDF tomorrow, the agent automatically updates its SQL queries without a developer needing to rewrite code.
- Production Backend: With a structured project layout (FastAPI + Pydantic + model.pkl), the system can be deployed as a web service for any company to use.
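Having rules follow the policy document without a code change implies some way of noticing that the document changed. One simple approach, sketched here as an assumption rather than a description of the project's code, is to fingerprint the extracted policy text and regenerate rules only when the fingerprint differs. `RuleCache` and `fake_generate` are hypothetical names; `fake_generate` stands in for the LLM call.

```python
import hashlib
from typing import Callable, List, Optional


class RuleCache:
    """Regenerate rules only when the policy text actually changes."""

    def __init__(self, generate_rules: Callable[[str], List[str]]):
        self._generate = generate_rules
        self._fingerprint: Optional[str] = None
        self.rules: List[str] = []

    def refresh(self, policy_text: str) -> bool:
        """Return True if the rules were regenerated, False if unchanged."""
        fingerprint = hashlib.sha256(policy_text.encode("utf-8")).hexdigest()
        if fingerprint == self._fingerprint:
            return False
        self.rules = self._generate(policy_text)
        self._fingerprint = fingerprint
        return True


calls = []


def fake_generate(policy_text: str) -> List[str]:
    """Placeholder for the LLM step; records each call for demonstration."""
    calls.append(policy_text)
    return [f"-- rule derived from {len(policy_text)} characters of policy text"]


cache = RuleCache(fake_generate)
first = cache.refresh("Leave requests need manager approval.")
second = cache.refresh("Leave requests need manager approval.")  # unchanged
third = cache.refresh("Leave requests need manager and HR approval.")
```

Hashing the text avoids paying for an LLM call on every audit run when the policy has not changed.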
Summary of Goal
The goal is to build an end-to-end software platform that takes a raw PDF, connects it to a large data warehouse, and identifies business risks, with every flag traceable to a policy section and no manual record-by-record checking.
This build was uploaded as a hackathon project.


