1. EdTech (5 PS)
Create AI-powered tools for smarter learning, personalized education, skill development, assessments, and digital classrooms to transform the future of education
Problem statements on next page.
AI-Based Notes & Study Material Distribution System
Problem Statement 2: The "Smart Librarian" (Resource Distribution)
1. The Context (The Pain Point)
Teachers are overwhelmed by students messaging them: "Sir, please send Unit 2 notes," "Ma'am, where is the assignment?" Teachers end up manually resharing the same PDFs on 10 different WhatsApp groups. This is a massive waste of academic time.
2. The Problem
Create an On-Demand Study Material Bot that retrieves specific academic resources instantly based on natural language requests.
3. What to Build (The Solution)
- The Repository: A cloud database where teachers dump all files (PDFs, PPTs, Recordings) with simple tags.
- The GenAI Search: A bot that understands vague requests.
  - Student: "I need the notes for the topic we studied last Tuesday."
  - AI: checks the schedule → identifies topic "DBMS Normalization" → fetches "Unit-3_DBMS.pdf".
- Content Generation: If a student asks for a summary instead of the full file, the AI generates a "One-Page Cheat Sheet" from the PDF on the fly.
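The "last Tuesday" flow above (schedule lookup, then topic, then file) can be sketched as follows. This is a minimal illustration, not a prescribed design: the `SCHEDULE` and `REPOSITORY` tables, the sample date, and the function names are all hypothetical, and a real bot would first use an LLM or parser to pull the weekday out of the student's message.

```python
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# Hypothetical class schedule and tagged repository (illustrative data only).
SCHEDULE = {date(2024, 3, 5): "DBMS Normalization"}
REPOSITORY = {"DBMS Normalization": "Unit-3_DBMS.pdf"}


def last_weekday(name: str, today: date) -> date:
    """Return the most recent date strictly before `today` that falls on `name`."""
    target = WEEKDAYS.index(name.lower())
    days_back = (today.weekday() - target) % 7 or 7
    return today - timedelta(days=days_back)


def resolve_request(weekday: str, today: date) -> Optional[str]:
    """Map 'last <weekday>' to a topic via the schedule, then to a file name."""
    topic = SCHEDULE.get(last_weekday(weekday, today))
    return REPOSITORY.get(topic) if topic else None
```

Returning `None` when nothing was scheduled lets the bot ask a follow-up question instead of guessing a file.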
4. Key Features Required
- Semantic Search: It should match "Database design notes" with "DBMS_Unit2.pdf" even if keywords don't match exactly.
- Context Awareness: If a student asks "Send the assignment," it sends the pending assignment for their specific batch.
- Analytics Dashboard: Show teachers which topics are being requested most (identifying learning gaps).
5. Evaluation Criteria
- Retrieval Speed: Can it find the right file in under 3 seconds?
- Query Understanding: Handling vague queries like "Last lecture's PDF".
AI-Powered Lecture Summarizer & Revision Helper
Problem Statement 3: The "Lecture Pilot" (Video-to-Knowledge)
1. The Context (The Pain Point)
Recorded lectures are often 1-2 hours long. Students revising for exams do not have time to re-watch the whole video to find one specific concept (e.g., "Schrodinger's Equation"). Manually scrubbing through video timelines is inefficient.
2. The Problem
Build a "Lecture Summarization & Revision Engine" that processes long educational videos and transforms them into "Exam-Ready" assets.
3. What to Build (The Solution)
- The Indexer: Uses Speech-to-Text (Whisper) to transcribe the full video.
- The Segmenter: Breaks the video into "Chapters" with timestamps (e.g., 00:00 Intro, 15:30 Thermodynamics Law 1, 45:00 Numerical Problem).
- The "Doubts" Layer: A chatbot that "lives" inside the video player. Students can ask, "What did the professor say about Entropy?" and the AI answers using only the transcript context.
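One way to keep the "Doubts" layer grounded in the transcript is to retrieve matching segments first and answer only from those. The sketch below uses plain keyword overlap as a stand-in for real retrieval; the `SEGMENTS` data, the stop-word list, and the function names are assumptions made for illustration.

```python
import re

# Hypothetical transcript segments: (start_seconds, text), shaped like ASR output.
SEGMENTS = [
    (0, "Welcome to today's lecture on thermodynamics."),
    (930, "The first law of thermodynamics is about conservation of energy."),
    (1800, "Entropy measures the disorder of a system and never decreases in isolation."),
]

STOP_WORDS = {"what", "did", "the", "say", "about", "professor", "a", "of", "is"}


def timestamp(seconds: int) -> str:
    """Format seconds as MM:SS, matching the chapter style above."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def find_segments(question: str, segments=SEGMENTS):
    """Return (timestamp, text) for segments sharing a content word with the question."""
    words = set(re.findall(r"[a-z]+", question.lower())) - STOP_WORDS
    hits = []
    for start, text in segments:
        if words & set(re.findall(r"[a-z]+", text.lower())):
            hits.append((timestamp(start), text))
    return hits
```

If `find_segments` returns an empty list, the chatbot can reply that the topic was not covered rather than inventing an answer.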
4. Key Features Required
- Auto-Notes: Generate a bulleted PDF summary of the lecture automatically.
- Formula Extraction: Identify mathematical formulas written on the board or spoken, and list them separately.
- "Explain Like I'm 5": A button to simplify complex sections of the transcript for weaker students.
5. Evaluation Criteria
- Segmentation Accuracy: Do the chapters match the actual topic changes?
- Hallucination Check: The AI must not invent concepts not taught in the lecture.
Open Innovation: Participants are welcome to propose disruptive ideas in any EdTech-related domain, but the given problem statements are encouraged.
Anonymous Doubt Resolution System for Students to increase their confidence, learning pace, and exam performance
Problem Statement 4: The "Anonymous Doubt" Forum
1. The Context (The Pain Point)
In Indian classrooms, "Log kya kahenge" (What will people say) kills curiosity. Students fear being judged for asking "silly" questions, leading to weak foundational concepts. They need a safe space to ask without fear.
2. The Problem
Build an Anonymous Doubt Resolution Platform that encourages curiosity by masking identity, while using AI to filter toxicity and provide instant preliminary answers.
3. What to Build (The Solution)
- The Safe Space: A mobile interface where students post questions. The frontend hides their name from peers (showing "Anonymous Rabbit" etc.).
- The "First Responder" AI: Before the teacher answers, an LLM provides an immediate, detailed explanation.
  - Benefit: 80% of basic doubts are solved instantly.
- Teacher Dashboard: The teacher sees the real names (optional) or just the class statistics ("50% of class is confused about Newton's 3rd Law").
4. Key Features Required
- Toxicity Filter: AI must block bullying or inappropriate comments in the anonymous chat.
- Smart Grouping: If 10 students ask the same question, the AI groups them into one thread so the teacher answers once.
- Gamification: "Question of the Day" awards to encourage asking.
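Smart Grouping can be prototyped without a model by comparing word sets. The sketch below uses Jaccard similarity and a greedy pass; the 0.5 threshold, the stop-word list, and the function names are assumptions, and an embedding-based comparison would likely catch paraphrases that share few words.

```python
import re

STOP_WORDS = {"what", "is", "the", "a", "an", "of", "can", "someone",
              "explain", "please", "how", "does"}


def tokens(text: str) -> frozenset:
    """Lowercased word set with a few filler words removed."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def jaccard(a: frozenset, b: frozenset) -> float:
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def group_questions(questions, threshold=0.5):
    """Greedy grouping: attach each question to the first thread it resembles."""
    threads = []  # each thread is a list; its first question is the representative
    for q in questions:
        for thread in threads:
            if jaccard(tokens(q), tokens(thread[0])) >= threshold:
                thread.append(q)
                break
        else:
            threads.append([q])
    return threads
```

The length of each thread also gives the teacher dashboard a simple "how many students asked this" count.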
5. Evaluation Criteria
- Safety: How robust is the toxicity filter?
- AI Quality: Does the AI's answer accurately help the student while they wait for the teacher?
- User Experience: Is it easy to ask a doubt in 2 clicks?
Automated Response Bot for Admission & Course Queries
Problem Statement 1: The "Admission Officer" Agent (Omnichannel Sales)
1. The Context (The Pain Point)
During admission season, institutes are bombarded with thousands of repetitive queries ("Fees?", "Placement?", "Batch Date?") across WhatsApp, Instagram, and calls. Human counselors burn out answering the same questions, leading to delayed replies and lost leads (students joining competitors).
2. The Problem
Build an Omnichannel AI Admissions Agent that acts as a 24/7 counselor. It shouldn't just answer FAQs; it should "Sell" the course by understanding the student's intent, sharing persuasive brochures, and capturing the lead into a CRM.
3. What to Build (The Solution)
- Unified Inbox: A system that listens to Instagram DMs, WhatsApp, and Website Chat simultaneously.
- RAG-Based Brain: Instead of hardcoded replies, the AI reads the institute’s actual Admission Brochure PDF and website to generate accurate, natural answers.
- Lead Qualifier: The bot must intelligently steer the conversation to get the student's Phone Number and Interest Area before handing over to a human.
4. Key Features Required
- Intent Recognition: Distinguish between a "Casual Browser" (send brochure) vs. "Ready to Join" (send payment link).
- Rich Media Responses: Send a specific "Campus Tour Video" or "Placement Report PDF" when asked about facilities, not just text.
- Human Handoff: If a student asks a complex specific question (e.g., "Can I pay via EMI with Bajaj Finance?"), alert a human staff member immediately.
5. Evaluation Criteria
- Conversion Focus: Does the bot try to capture the lead?
- Accuracy: Does it strictly stick to the brochure facts (no hallucinations about fake scholarships)?
- Tone: Is it professional yet welcoming?
2. CyberSecurity (4 PS)
Develop intelligent systems to detect threats, prevent cyber attacks, protect data, and ensure secure digital ecosystems using AI-driven security solutions
Open Innovation: Participants are welcome to propose disruptive ideas in any CyberSecurity-related domain, but the given problem statements are encouraged.
Design an 'AI Firewall' proxy that sits between the user and the LLM, detecting adversarial prompts (Jailbreaks) and preventing PII leakage in the output
3. The "LLM Firewall" (Adversarial AI Defense)
Context:
As companies integrate Large Language Models (LLMs) into their internal tools, they face "Prompt Injection" attacks. Attackers can use clever inputs to "jailbreak" the AI, forcing it to reveal sensitive corporate data or ignore safety guidelines.
Problem Statement:
"Design an 'AI Firewall' proxy that sits between the user and the LLM, detecting adversarial prompts (Jailbreaks) and preventing PII leakage in the output."
Proposed Solution:
A middleware API that intercepts prompts. It uses a specialized classification model (like a fine-tuned BERT) to detect malicious patterns (e.g., "Ignore previous instructions") and scans the output to redact sensitive data before it reaches the user.
Key Features Required:
- Jailbreak Detection: Classify prompts as "Safe" or "Malicious" (detects DAN, Grandma exploit, etc.).
- PII Redaction: Automatically mask emails, phone numbers, or credit card numbers in the response.
- Canary Token Integration: Inject hidden strings; if they appear in the output, block the response immediately.
- Low Latency: Security checks must add <200ms overhead.
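The output-side checks (PII Redaction and Canary Token Integration) can be sketched with regular expressions alone. Everything named here is an assumption for illustration: the `CANARY` value, the pattern set, and the placeholder labels. The patterns are deliberately simple and would miss many real-world formats.

```python
import re

CANARY = "CANARY-7f3a9c"  # hypothetical hidden string planted in the system prompt

# Order matters: card numbers are masked before the looser phone pattern runs.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
}


def filter_output(text: str) -> str:
    """Block the response if the canary leaked; otherwise mask PII."""
    if CANARY in text:
        return "[BLOCKED: possible system prompt leak]"
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Because this is pure string processing, it adds very little to the latency budget; the classifier on the input side is where most of the 200ms allowance would likely go.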
Evaluation Criteria:
- Defense Rate: What % of standard jailbreak prompts does it block?
- Utility: Does it block innocent questions by mistake? (False Positive Rate).
- Adaptability: How easily can new attack signatures be added?
A real-time detection tool (Browser Extension or API) that analyzes audio/video streams to flag potential deepfakes or synthesized media signatures
1. The "Deepfake Defender" (Media Forensics)
Context:
Generative AI has democratized the creation of hyper-realistic fake audio and video. This fuels voice cloning scams, CEO fraud, and disinformation campaigns. Detecting synthesized media is now a critical security frontier.
Problem Statement:
"Build a real-time detection tool (Browser Extension or API) that analyzes audio/video streams to flag potential deepfakes or synthesized media signatures."
Proposed Solution:
A lightweight AI tool that analyzes media for artifacts—such as irregular blinking patterns, lip-sync mismatches, or specific audio frequency cuts typical of AI generation (e.g., lack of breath sounds). It provides a "Fake Probability Score" instantly.
Key Features Required:
- Explainability: The tool must highlight why it flagged the content (e.g., "Inconsistent lighting on face" or "Audio spectrum anomaly").
- Real-Time Processing: Must analyze a live stream or 30s clip in under 5 seconds.
- Audio Forensics: Detect cloned voices by analyzing background noise consistency.
- Edge Compatibility: Ideally runs in the browser to preserve user privacy.
Evaluation Criteria:
- False Positive Rate: Does it flag real videos as fake? (Must be <5%).
- Robustness: Can it detect a deepfake even if the video is compressed (like on WhatsApp)?
- Speed: Latency per video frame.
A browser plugin that uses Computer Vision and NLP to detect visually deceptive phishing sites and AI-generated spear-phishing emails
2. The "Zero-Day Phishing Sniffer" (Vision & NLP)
Context:
Modern phishing emails no longer look like "Nigerian Prince" scams. They use perfect grammar (thanks to AI) and clone real login pages pixel-perfectly. Standard text-based filters often miss them.
Problem Statement:
"Develop a browser plugin that uses Computer Vision and NLP to detect visually deceptive phishing sites and AI-generated spear-phishing emails."
Proposed Solution:
A tool that looks at the visual rendering of a webpage. If a site looks exactly like the Microsoft Login page but the URL is micro-soft-login.com, the computer vision model flags it (Visual Similarity Search).
Key Features Required:
- Visual Similarity Matching: Compare screenshots of the visited site against a database of known brands (PayPal, Gmail, Bank of America).
- Homoglyph Detection: Flag URLs using deceptive characters (e.g., G00gle.com).
- Urgency Detection (NLP): Analyze email text for manipulation tactics ("Act now or account deleted!").
- Heatmap Warning: Highlight the specific suspicious elements on the page.
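Homoglyph Detection can start from a small substitution table. The sketch below normalizes lookalike characters and compares the result with a brand list; the `LOOKALIKES` map, the `KNOWN_BRANDS` set, and the choice to inspect only the first hostname label are simplifying assumptions. It does not handle Unicode confusables or hyphenated lures such as micro-soft-login.com, which are left to the visual similarity check.

```python
# Hypothetical lookalike map and brand list; real tables would be far larger.
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s", "@": "a"})
KNOWN_BRANDS = {"google", "paypal", "gmail", "microsoft"}


def spoofed_brand(hostname: str):
    """Return the brand a hostname imitates, or None if no imitation is detected."""
    label = hostname.lower().split(".")[0]   # e.g. "g00gle" from "G00gle.com"
    normalized = label.translate(LOOKALIKES)
    # Flag only when normalization changed the label into a known brand.
    if normalized in KNOWN_BRANDS and label != normalized:
        return normalized
    return None
```

Running entirely on strings, this check can stay local to the browser, which fits the privacy criterion.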
Evaluation Criteria:
- Detection Accuracy: Can it catch a brand new phishing site not yet on any blacklist?
- Performance: Does it slow down web browsing?
- Privacy: Does it transmit user browsing history? (Should run locally).
3. HealthTech (4 PS)
Innovate with AI to improve healthcare delivery, diagnosis, patient monitoring, hospital management, and overall medical efficiency. Build solutions that save lives and enhance patient care
Open Innovation: Participants are welcome to propose disruptive ideas in any HealthTech-related domain, but the given problem statements are encouraged.
Build a computer vision-based 'Virtual Physiotherapist' that analyzes patient movements in real-time via a webcam and provides instant audio-visual feedback on form
3. The "AI Rehab Coach" (Computer Vision & Pose Estimation)
Context:
Physical therapy relies on correct movement execution. Patients recovering at home often perform exercises incorrectly, delaying recovery. Physiotherapists cannot monitor every patient 24/7.
Problem Statement:
"Build a computer vision-based 'Virtual Physiotherapist' that analyzes patient movements in real-time via a webcam and provides instant audio-visual feedback on form."
Proposed Solution:
A web app using MediaPipe or MoveNet to track human joints. As the user performs a squat or arm raise, the AI calculates joint angles. If the knee collapses inward, the AI immediately speaks: "Keep your knees aligned with your toes."
Key Features Required:
- Real-Time Latency: Must run at 30 FPS on a standard laptop (no GPU required).
- Angle Logic: Geometric calculation of "safe zones" for specific exercises.
- Rep Counter: Only counting reps that meet the form quality threshold.
- Privacy Preserving: Processing must happen on-device (Edge AI); no video is sent to the cloud.
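The Angle Logic and Rep Counter features reduce to plane geometry once a pose model has produced joint coordinates. The sketch below assumes 2D (x, y) points for three joints (for a knee: hip, knee, ankle); the `safe_zone` range is a placeholder assumption, not clinical guidance.

```python
import math


def joint_angle(a, b, c) -> float:
    """Angle at point b, in degrees, between segments b->a and b->c (2D points)."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang) % 360
    return 360 - ang if ang > 180 else ang


def rep_is_valid(min_knee_angle: float, safe_zone=(70.0, 110.0)) -> bool:
    """Count a squat rep only if the deepest knee angle lies in the assumed safe zone."""
    low, high = safe_zone
    return low <= min_knee_angle <= high
```

Calling `joint_angle` once per frame on three landmarks is cheap, so the frame-rate budget is dominated by the pose model rather than this logic.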
Evaluation Criteria:
- Accuracy: Does it detect bad form as well as a human trainer?
- Feedback Speed: Is the correction given during the rep, or 5 seconds later?
- Gamification: Does the user get a "score" to motivate them?
An AI-powered conversational triage assistant that assesses patient symptoms, assigns a risk score, and directs them to the appropriate level of care (Self-care, Tele-consult, or ER)
1. The "Intelligent Triage" (NLP & Clinical Logic)
Context:
Emergency Rooms (ERs) are overwhelmed. Patients with minor issues clog queues, while critical patients might be delayed. Efficient "digital front doors" are needed to assess patients before they even step foot in a hospital.
Problem Statement:
"Create an AI-powered conversational triage assistant that assesses patient symptoms, assigns a risk score, and directs them to the appropriate level of care (Self-care, Tele-consult, or ER)."
Proposed Solution:
A WhatsApp/Web chatbot that uses an LLM (like Llama 3 or GPT-4o) fine-tuned on clinical protocols (e.g., Manchester Triage System). It interviews the patient, detects "Red Flag" keywords (e.g., "radiating chest pain"), and generates a structured summary for the doctor.
Key Features Required:
- Medical Entity Recognition (MER): Extracting symptoms, duration, and pain severity from free text.
- Risk Scoring Algorithm: Logic that maps symptoms to urgency (e.g., chest pain in a male over 45 = Critical).
- Doctor Handover: Generating a concise "SBAR" (Situation, Background, Assessment, Recommendation) summary for the physician.
- Voice-Enabled: Allowing elderly users to speak their symptoms instead of typing.
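A rule-based layer is one way to make the Risk Scoring Algorithm auditable, with the LLM handling only the interview and entity extraction. The sketch below is illustrative only: the `RED_FLAGS` phrases, score weights, and thresholds are hypothetical placeholders and not clinical guidance. It errs toward escalation, so any red flag routes to ER regardless of age.

```python
# Illustrative red-flag phrases only; a real list must come from a clinical protocol.
RED_FLAGS = ("chest pain", "difficulty breathing")


def triage(age: int, complaint: str, pain_severity: int):
    """Return (risk_score, care_level) from already-extracted fields.

    pain_severity is a 0-10 self-reported value. All weights are assumptions.
    """
    text = complaint.lower()
    score = pain_severity
    if any(flag in text for flag in RED_FLAGS):
        score += 10          # any red flag forces the top tier
        if age > 45:
            score += 5       # age raises urgency further, as in the example above
    if score >= 10:
        level = "ER"
    elif score >= 4:
        level = "Tele-consult"
    else:
        level = "Self-care"
    return score, level
```

Keeping the rules in plain code makes it straightforward to run the full test set of critical cases on every change.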
Evaluation Criteria:
- Safety: Does it correctly identify 10/10 critical cases in the test set? (Zero tolerance for missing a heart attack).
- Empathy: Is the bot's tone calming and professional?
- Efficiency: Can it reach a conclusion in under 2 minutes?
Develop a medication management tool that uses OCR to digitize prescription labels and uses a Knowledge Graph to flag potential drug-to-drug interactions
2. The "Snap-and-Check" Safety (OCR & Knowledge Graph)
Context:
Polypharmacy (using multiple medications) causes dangerous adverse reactions, especially in the elderly. Patients often lose track of their schedules or fail to realize that a new prescription conflicts with an old one.
Problem Statement:
"Develop a medication management tool that uses OCR to digitize prescription labels and uses a Knowledge Graph to flag potential drug-to-drug interactions."
Proposed Solution:
A mobile app where a user photographs their pill bottles. The app uses Google Lens/Tesseract (OCR) to read the drug names, cross-references them with an open FDA database, and warns if two drugs are dangerous to take together.
Key Features Required:
- Robust OCR: Reading curved text on pill bottles or messy handwritten prescriptions.
- Interaction Engine: Checking pairs of drugs against a known interaction database (e.g., "Aspirin + Warfarin = Bleeding Risk").
- Visual Schedule: Auto-creating a "Morning/Afternoon/Night" pill calendar from the data.
- Caregiver Alerts: Notification sent to family if a critical interaction is found.
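The Interaction Engine is essentially a lookup over unordered drug pairs. The sketch below stores each pair as a `frozenset` so that order does not matter; the `INTERACTIONS` table holds only the single example quoted above and is an assumption standing in for a vetted interaction database.

```python
from itertools import combinations

# Single illustrative entry taken from the example above; not a medical reference.
INTERACTIONS = {frozenset({"aspirin", "warfarin"}): "Bleeding Risk"}


def check_interactions(drugs):
    """Return (drug_a, drug_b, warning) for every known interacting pair."""
    names = sorted({d.strip().lower() for d in drugs})  # normalize OCR output
    return [
        (a, b, INTERACTIONS[frozenset({a, b})])
        for a, b in combinations(names, 2)
        if frozenset({a, b}) in INTERACTIONS
    ]
```

A non-empty result is the natural trigger for the caregiver alert, while an empty list means no known pair matched, which is weaker than a guarantee of safety.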
Evaluation Criteria:
- Detection Rate: How many drug names were correctly transcribed from a blurry photo?
- False Alarm Rate: Does it flag harmless combinations as dangerous? (Too many warnings cause "alert fatigue").
- UX Design: Is the warning clear and red?
4. AgriTech (4 PS)
Develop technology-driven solutions for challenges in the agricultural sector, often using artificial intelligence (AI), data analytics, etc.
An offline-first mobile solution that uses lightweight Computer Vision (Edge AI) to detect crop diseases/pests in real-time and provides actionable, localized remedies
1. The "Pocket Agronomist" (Edge AI & Computer Vision)
Context:
Smallholder farmers often lack access to timely expert advice. In remote areas, internet connectivity is spotty, making cloud-heavy solutions unreliable. A delay in diagnosing a pest attack or nutrient deficiency can ruin an entire season's yield.
Problem Statement:
"Develop an offline-first mobile solution that uses lightweight Computer Vision (Edge AI) to detect crop diseases/pests in real-time and provides actionable, localized remedies."
Proposed Solution:
A mobile app running a quantized (compressed) model, such as YOLOv8 Nano or another model converted to TensorFlow Lite, directly on the phone. It scans leaves, identifies diseases (e.g., Early Blight), and suggests remedies in the local language without needing the cloud.
Key Features Required:
- Edge Inference: AI model must run locally on the device (no API calls for inference).
- Severity Grading: AI should classify infection not just by type, but by severity (Low/Medium/Critical) to recommend dosage.
- Multilingual Output: Results mapped to local vernacular audio/text.
- Geo-Hotspots: Background tagging of location to create a "heat map" of disease outbreaks once connectivity is restored.
Evaluation Criteria:
- Model Accuracy & Size: High mAP (mean Average Precision) with a model size under 20MB.
- Inference Speed: Detection within <2 seconds on a mid-range phone.
- Offline Capability: Full functionality in "Airplane Mode."
Open Innovation: Participants are welcome to propose disruptive ideas in any AgriTech-related domain, but the given problem statements are encouraged.
Build a predictive market intelligence engine that leverages historical data and external factors to forecast short-term crop prices and optimize selling timing
2. The "Fair Price" Forecaster (Predictive Analytics)
Context:
Farmers often panic-sell produce at low rates to middlemen because they lack visibility into future price trends. Without data-driven insights, they cannot decide whether to sell immediately or store their harvest for better returns.
Problem Statement:
"Build a predictive market intelligence engine that leverages historical data and external factors to forecast short-term crop prices and optimize selling timing."
Proposed Solution:
An AI dashboard that ingests historical market prices, weather patterns, and transportation fuel costs. It uses a Time-Series Forecasting model (like LSTM or Prophet) to predict price trends for the next 14–30 days, advising farmers to "Sell Now" or "Hold."
Key Features Required:
- Predictive Model: A regression or time-series AI model trained on historical market data (Mandi prices).
- Sentiment Analysis: Scrape news or local trade forums to gauge market sentiment (optional but high value).
- Personalized Recommendation: Logic that calculates storage costs vs. predicted profit (e.g., "Holding for 1 week costs $10 but gains $50").
- Visual Trends: Simple traffic-light system (Red=Wait, Green=Sell) for literacy-agnostic usability.
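The Personalized Recommendation logic is a small calculation layered on top of the forecast. The sketch below takes the predicted price as an input and applies the traffic-light convention stated above (Red=Wait, Green=Sell); the function name and parameters are assumptions, and it ignores forecast uncertainty and spoilage, which a real tool would need to weigh.

```python
def recommend(current_price, predicted_price, quantity, storage_cost):
    """Compare selling now against holding until the forecast date.

    Returns (net_gain_from_holding, signal). Prices are per unit of quantity.
    """
    gain = (predicted_price - current_price) * quantity
    net = gain - storage_cost
    signal = "Red (Wait)" if net > 0 else "Green (Sell)"
    return net, signal
```

With the example above (holding costs 10 and gains 50), `recommend(100, 105, 10, 10)` gives a net gain of 40 and a "Red (Wait)" signal.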
Evaluation Criteria:
- Forecasting Logic: Soundness of the algorithm and feature selection (e.g., did the team account for seasonality?).
- Usability: How easily can a non-expert interpret the "Sell vs. Hold" advice?
- Data Visualization: Quality of the graphs/charts presented.
Develop a Multi-Modal RAG (Retrieval-Augmented Generation) assistant that ingests soil health cards, weather data, and satellite imagery to simulate yield scenarios and answer complex queries
3. The "Hyper-Local Yield Simulator" (Generative AI / RAG)
Context:
Farming is a high-stakes bet against nature. Farmers often make decisions based on intuition. With Generative AI, we can move beyond simple forecasts to "scenario simulation," allowing farmers to ask complex "What if?" questions about their specific plot.
Problem Statement:
"Develop a Multi-Modal RAG (Retrieval-Augmented Generation) assistant that ingests soil health cards, weather data, and satellite imagery to simulate yield scenarios and answer complex queries."
Proposed Solution:
A "Digital Twin" chat interface. The farmer uploads a photo of their soil report. The AI parses the text (OCR), retrieves agronomy best practices (Vector DB), and combines it with weather forecasts to answer questions like: "My soil pH is 5.5 and rain is delayed—should I plant Soybeans now?"
Key Features Required:
- Multi-Modal Ingestion: Ability to process both text (questions) and images (soil reports/leaf photos) simultaneously.
- RAG Architecture: Grounding the LLM's answers in verified agricultural handbooks to prevent hallucinations.
- Causal Reasoning: The AI must explain the why behind its advice (e.g., "High acidity + rain delay increases root rot risk").
- Voice-First Interface: Speech-to-Text inputs for accessibility.
Evaluation Criteria:
- Hallucination Rate: Does the AI invent fake chemicals or farming methods? (Strict penalty).
- Context Awareness: Does the answer change if the location/weather context changes?
- Complexity Handling: Ability to handle multi-part questions.
5. FinTech (5 PS)
Build smart financial solutions for payments, fraud detection, credit scoring, wealth management, and digital banking using AI and data-driven intelligence
Open Innovation: Participants are welcome to propose disruptive ideas in any FinTech-related domain, but the given problem statements are encouraged.
Build a "Visual Underwriting Agent" that eliminates data entry. The user should be able to simply walk around their room recording a video, and the AI should generate a complete, priced insurance quote instantly
Problem Statement 3: The "Video-to-Quote" Insurance Assistant
1. The Context (The Pain Point)
Getting "Home Insurance" or "Renters Insurance" is a tedious process involving endless forms where users must list every item they own (TV, Sofa, Jewelry) and their values. Most users guess or lie, leading to "Under-Insurance" (coverage that is too low) or disputes during claims.
2. The Problem
Build a "Visual Underwriting Agent" that eliminates data entry. The user should be able to simply walk around their room recording a video, and the AI should generate a complete, priced insurance quote instantly.
3. What to Build (The Solution)
- Video Analysis Module: A mobile-friendly web app that accepts a video file.
- Multimodal AI Processing: Use a Vision-Language Model (like Gemini 1.5 Pro or GPT-4o) to analyze the video frame-by-frame. It must:
  - Detect Objects: Identify high-value assets (Electronics, Furniture, Art).
  - Assess Condition: Estimate if items look "New," "Used," or "Damaged."
- The Pricing Agent: Connect the identified item list to a real-time Search API (like Google Shopping/SerpApi) to fetch current market prices.
- The Quote Generator: Sum up the values and present a "Recommended Coverage Amount."
4. Key Features Required
- Granularity: It should distinguish between a "Generic Laptop" and a "MacBook Pro" if the logo is visible.
- Transparency: The user must be able to see the itemized list (Inventory) generated by the AI and edit it if the price is wrong.
5. Evaluation Criteria
- The "Wow" Factor: Smoothness of the Video-to-Text pipeline.
- Accuracy: Does it correctly identify items in a cluttered room?
- Pricing Logic: Real-time price fetching vs. static guessing.
Build a Real-Time Fact-Checking Browser Extension that "listens" to financial videos as the user watches them. It must instantaneously analyze the spoken advice, cross-reference it with live market data, and verify the creator's credibility history
Problem Statement 4: The "FinTok" Truth Detector
1. The Context (The Pain Point)
Social media platforms (YouTube, Instagram, TikTok) are flooded with "Finfluencers" giving aggressive financial advice (e.g., "Buy this penny stock, it will 10x next week!" or "Market is crashing, sell everything!"). Most retail investors follow this advice blindly without knowing:
- Is this influencer actually credible? (What was their past track record?)
- Is the financial data they are quoting actually true?
2. The Problem
Build a Real-Time Fact-Checking Browser Extension that "listens" to financial videos as the user watches them. It must instantaneously analyze the spoken advice, cross-reference it with live market data, and verify the creator's credibility history.
3. What to Build (The Solution)
- The "Ear" (Transcription Agent): A browser extension that captures system audio from the active tab (YouTube/Reels). It uses an Automatic Speech Recognition (ASR) model (like OpenAI Whisper) to transcribe the video in real-time.
- The "Brain" (Claim Extraction): An LLM that processes the transcript to extract specific financial claims.
  - Spoken: "Suzlon Energy is going to hit 100 rupees by Friday because profits doubled."
  - Extracted Claim: {Asset: "Suzlon", Direction: "Buy", Reason: "Profits doubled"}.
- The "Judge" (Verification Engine):
  - Market Check: Connects to a Stock API (Yahoo Finance/AlphaVantage) to check if profits actually doubled.
  - Reputation Check: Scrapes the influencer's past videos to calculate a "Win Rate" (e.g., "Only 20% of his predictions came true").
- The "Overlay" (UI): A floating widget on the video player that displays a "Trust Score" (Green/Red) and context notes.
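The Reputation Check and the Green/Red overlay can be tied together with a small calculation once past predictions have been labeled. The sketch below assumes each prediction is a dict with a boolean `came_true` field and uses a 0.5 cut-off; both the record shape and the threshold are hypothetical, and how a prediction gets judged "true" is the harder problem left open here.

```python
def win_rate(predictions):
    """Share of past predictions that came true, or None when there is no history."""
    if not predictions:
        return None
    return sum(1 for p in predictions if p["came_true"]) / len(predictions)


def trust_label(rate, threshold=0.5):
    """Map a win rate to the overlay colour; no history is shown as Unknown."""
    if rate is None:
        return "Unknown"
    return "Green" if rate >= threshold else "Red"
```

Treating an empty history as "Unknown" avoids labeling a new creator as untrustworthy simply because no track record exists yet.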
4. Key Features Required
- Real-Time Sentiment Analysis: Detects "Hype" vs. "Fact." (e.g., Excessive use of words like "Guaranteed," "Moon," "1000x").
- Visual Warning System: If the influencer mentions a stock that is currently raising red flags (e.g., SEBI investigation), flash a warning instantly.
- "Receipts" Mode: A button that says "Show Track Record" which pulls up a list of the influencer's past failed predictions.
5. Evaluation Criteria
- Latency: Can it fact-check within 5-10 seconds of the sentence being spoken?
- Accuracy: Does it correctly identify the stock ticker (e.g., distinguishing "Apple" the fruit from "AAPL" the stock)?
- GenAI Integration: Effective use of LLMs to separate "Opinion" from "Financial Advice."
Build a Peer-to-Peer (P2P) Offline Payment SDK that allows two smartphones to complete a transaction without any active internet connection on either device. The system must securely transmit transaction data, verify the intent, and queue the ledger update for when connectivity returns
Problem Statement 1: The "Sound-Wave" Offline Payment System
1. The Context (The Pain Point)
Digital payments (UPI) have revolutionized India, but they have a single point of failure: The Internet. In crowded festivals, underground metros, rural areas, or during natural disasters, internet connectivity fails, rendering UPI useless. This forces users back to cash, breaking the digital economy loop.
2. The Problem
Build a Peer-to-Peer (P2P) Offline Payment SDK that allows two smartphones to complete a transaction without any active internet connection on either device. The system must securely transmit transaction data, verify the intent, and queue the ledger update for when connectivity returns.
3. What to Build (The Solution)
- Data-Over-Sound Protocol: A mechanism to encode transaction details (Amount, Sender ID, Token) into an encrypted audio signal (ultrasonic or audible chirp).
- The Receiver: A listening module that captures the sound, decodes it, and cryptographically verifies the signature offline.
- The "Edge AI" Risk Engine (The GenAI Twist): Since the bank server cannot verify the balance in real-time, implement a Small Language Model (SLM) or lightweight AI model on the device. This model should analyze the user's past spending behavior and local device signals to assign a "Trust Score" to the transaction, authorizing it only if the risk of fraud is low.
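Before any audio encoding, the token itself needs integrity and replay protection. The sketch below builds a payload with a one-time nonce and timestamp, then authenticates it with HMAC-SHA256 from Python's standard library. Several simplifications are assumed: `SHARED_KEY` is a hypothetical pre-provisioned symmetric key (a real SDK would more likely use per-device asymmetric signatures), the payload is authenticated but not encrypted, and the sound modulation step is out of scope.

```python
import hashlib
import hmac
import json
import secrets
import time

SHARED_KEY = b"demo-key-provisioned-while-online"  # hypothetical; set up before going offline


def make_token(amount: int, sender_id: str, key: bytes = SHARED_KEY) -> bytes:
    """Serialize a payment with a one-time nonce and append an HMAC-SHA256 tag."""
    payload = json.dumps(
        {"amount": amount, "sender": sender_id,
         "nonce": secrets.token_hex(8), "ts": int(time.time())},
        sort_keys=True,
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + tag


def verify_token(token: bytes, seen_nonces: set, key: bytes = SHARED_KEY, max_age: int = 120):
    """Return the payment dict if the tag is valid, fresh, and unseen; otherwise None."""
    payload, _, tag = token.rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return None  # tampered or signed with a different key
    data = json.loads(payload)
    if data["nonce"] in seen_nonces or time.time() - data["ts"] > max_age:
        return None  # replayed or stale recording
    seen_nonces.add(data["nonce"])
    return data
```

The receiver keeps `seen_nonces` locally, so a recorded chirp played back a second time is rejected even without a network connection.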
4. Key Features Required
- Transmitter UI: Enter amount -> Generate Audio Token.
- Receiver UI: "Listening Mode" -> Decode & Success Screen.
- Sync Mechanism: Auto-sync transaction logs to the server once the internet is restored.
5. Evaluation Criteria
- Reliability: Does the sound transfer work in a noisy room?
- Security: Is the audio token encrypted and protected against replay attacks?
- GenAI Usage: Innovative use of on-device AI for risk/fraud scoring without internet.
Create an Intelligent Document Processing (IDP) Agent that ingests messy, unstructured financial documents and automatically converts them into a standardized ESG (Environmental, Social, and Governance) Compliance Report
Problem Statement 2: The "Green Invoice" ESG Auditor
1. The Context (The Pain Point)
Banks and financial institutions are under immense pressure to report the Carbon Footprint (Scope 3 Emissions) of their lending portfolios. However, small businesses (MSMEs) do not have "Sustainability Teams." Their data is locked in thousands of unstructured PDF invoices (fuel bills, electricity receipts, raw material purchase orders), making manual calculation impossible.
2. The Problem
Create an Intelligent Document Processing (IDP) Agent that ingests messy, unstructured financial documents and automatically converts them into a standardized ESG (Environmental, Social, and Governance) Compliance Report.
3. What to Build (The Solution)
- The Ingestion Engine: A drag-and-drop interface for bulk uploading PDFs/Images of invoices.
- The GenAI Extractor: Use an LLM (e.g., GPT-4o, Gemini) to extract specific line items (e.g., "500 Liters of Diesel," "1000 Units of Electricity," "Plastic Packaging").
- The Calculator: Map these extracted items to a global Carbon Emission Factor Database (e.g., 1L Diesel = 2.6kg CO2) to calculate the total footprint.
- The Dashboard: Visualize the "Carbon Intensity" of the business.
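The Calculator step is a lookup and a multiplication, with source references carried along so each figure can be traced back to an invoice row. In the sketch below, only the diesel factor comes from the example above; the item keys, the `source_ref` strings, and the function name are assumptions, and any other factor would have to come from a sourced emission factor database.

```python
# kg CO2 per unit. Only the diesel figure is taken from the example above;
# other factors are intentionally left out rather than guessed.
EMISSION_FACTORS = {"diesel_l": 2.6}


def footprint(line_items):
    """Sum emissions for (item, quantity, source_ref) rows.

    Returns (total_kg, cited_rows, unmapped_items) so nothing is silently dropped.
    """
    total, cited, unmapped = 0.0, [], []
    for item, quantity, source_ref in line_items:
        factor = EMISSION_FACTORS.get(item)
        if factor is None:
            unmapped.append(item)   # surface for manual review instead of guessing
            continue
        kg = quantity * factor
        total += kg
        cited.append((item, kg, source_ref))
    return total, cited, unmapped
```

For the 500 L diesel example this yields about 1300 kg CO2, and unmapped items such as packaging are reported separately for review.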
4. Key Features Required
- Hallucination Control: The system must cite where in the document it found the data (e.g., highlight the specific row in the invoice).
- Categorization: Automatically distinguish between "OpEx" (Office supplies - Low Carbon) and "Energy" (Fuel - High Carbon).
5. Evaluation Criteria
- Accuracy: How well does it extract data from low-quality scans?
- Logic: Is the Carbon mapping logic sound?
- Utility: Is the final dashboard actionable for a bank?

