🎥 QueryTube

Transform any video into an intelligent Q&A chatbot using AI-powered semantic search

Created by Vivek Kumar Singh (@reachvivek)

📖 Table of Contents

Overview
Key Features
Architecture
Data Flow
Getting Started
Configuration
Database Schema
API Documentation
Free Tier Setup
Contributing
License
Contact

🌟 Overview

QueryTube turns a public YouTube video or an uploaded media file (up to 25MB) into a Q&A assistant that answers natural-language questions and cites the video timestamps its answers come from.

Why QueryTube?

100% FREE to run (using Mistral AI + Groq + Pinecone free tiers)
Semantic Search: Understands meaning, not just keywords
Source Citations: Every answer includes exact video timestamps
Multi-Language: Auto-detects and processes any language
Self-Hosted: You own your data and control costs
Per-Video Sessions: Each video maintains independent processing state

✨ Key Features

Video Processing

✅ YouTube URLs (public videos)
✅ Direct file uploads (MP4, MP3, WAV, M4A, WebM, up to 25MB)
✅ Auto language detection (EN/FR/other)
✅ YouTube captions extraction (instant, free)
✅ Groq Whisper AI transcription fallback (when captions unavailable)
✅ Retry mechanism for re-extracting transcripts

Intelligent Features

✅ Semantic embedding generation (Mistral/OpenAI)
✅ Vector-based similarity search (Pinecone)
✅ Enhanced context-aware Q&A with detailed citations
✅ Structured AI-generated video summaries with key topics
✅ Chat history persistence per video
✅ Session management (resume after refresh)
✅ Duplicate video detection

User Experience

✅ Magic link authentication (Better Auth v4)
✅ Clean, intuitive dashboard
✅ Real-time progress tracking with ETA
✅ Per-video isolated sessions
✅ Custom modal confirmations (no browser alerts)
✅ Direct chat access from dashboard
✅ Complete video deletion (DB + vectors)
✅ Retry button for transcript re-extraction

Analytics & Management

✅ Provider tracking (Mistral/OpenAI for embeddings, Groq/OpenAI/Claude for Q&A)
✅ Usage statistics with real database data
✅ Q&A history per video
✅ Processing status monitoring
✅ Weekly activity charts

🏗 Architecture

Technology Stack

Frontend:

Next.js 16 (App Router)
TypeScript 5
Tailwind CSS
shadcn/ui components
Lucide icons

Backend:

Next.js API Routes
Prisma ORM
SQLite database

AI Services (FREE Tier):

Mistral AI: Embeddings (1B tokens/month free)
Groq: Q&A generation (14.4K requests/day free)
Pinecone: Vector storage (100K vectors free)

Video Processing:

yt-dlp (YouTube download)
youtube-caption-extractor (caption extraction)

System Architecture

┌─────────────────────────────────────────────────────────┐
│                    USER INTERFACE                        │
│  (Next.js Frontend + shadcn/ui components)              │
└────────────────┬────────────────────────────────────────┘
                 │
    ┌────────────┴────────────┐
    │                         │
┌───▼──────┐         ┌────────▼─────┐
│ SQLite DB│         │  Pinecone    │
│ (Videos, │         │ Vector DB    │
│ Chunks,  │         │ (Embeddings) │
│Analytics)│         └──────────────┘
└──────────┘
    │
    │         ┌──────────────────────────────┐
    │         │    External AI Services      │
    │         │  • Mistral (Embeddings)     │
    └─────────┤  • Groq (Q&A Generation)    │
              │  • OpenAI (Fallback)        │
              └──────────────────────────────┘

🔄 Data Flow

Complete Processing Pipeline

┌─────────────────────────────────────────────────────────────┐
│                    STEP 1: UPLOAD VIDEO                     │
│  YouTube URL or File Upload → Validation → Create Video DB  │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│              STEP 2: EXTRACT TRANSCRIPT                     │
│                                                              │
│  ┌──────────────┐      ┌────────────────┐                 │
│  │ YouTube      │ YES  │ Extract        │                 │
│  │ Has Captions?├─────►│ Captions       │                 │
│  └──────┬───────┘      └────────┬───────┘                 │
│         │ NO                     │                          │
│         ▼                        ▼                          │
│  ┌──────────────┐      ┌────────────────┐                 │
│  │ Use Whisper  │      │ Split into     │                 │
│  │ Transcription│─────►│ Time-stamped   │                 │
│  └──────────────┘      │ Chunks (~90s)  │                 │
│                        └────────┬───────┘                 │
│                                 │                          │
│                        ┌────────▼───────┐                 │
│                        │ Save chunks to │                 │
│                        │ SQLite with    │                 │
│                        │ videoId FK     │                 │
│                        └────────────────┘                 │
└─────────────────────────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                STEP 3: VECTORIZE                            │
│                                                              │
│  ┌────────────────┐      ┌────────────────┐               │
│  │ Generate       │      │ Upload to      │               │
│  │ Embeddings     │─────►│ Pinecone       │               │
│  │ (Mistral AI)   │      │ Vector DB      │               │
│  │ 1024 dimensions│      └────────┬───────┘               │
│  └────────────────┘               │                        │
│                                    ▼                        │
│                           ┌────────────────┐               │
│                           │ Update chunks  │               │
│                           │ with vectorId  │               │
│                           │ & provider     │               │
│                           └────────┬───────┘               │
│                                    │                        │
│                                    ▼                        │
│                           ┌────────────────┐               │
│                           │ Generate AI    │               │
│                           │ Summary (Groq) │               │
│                           └────────────────┘               │
└─────────────────────────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│              STEP 4: QUERY & ANSWER (RAG)                   │
│                                                              │
│  User Question                                              │
│       │                                                      │
│       ▼                                                      │
│  ┌────────────────┐                                        │
│  │ Convert to     │                                        │
│  │ Embedding      │                                        │
│  │ (Mistral AI)   │                                        │
│  └────────┬───────┘                                        │
│           │                                                 │
│           ▼                                                 │
│  ┌────────────────┐                                        │
│  │ Search         │                                        │
│  │ Pinecone for   │                                        │
│  │ Similar Chunks │                                        │
│  └────────┬───────┘                                        │
│           │                                                 │
│           ▼                                                 │
│  ┌────────────────┐                                        │
│  │ Retrieve Top-K │                                        │
│  │ Relevant       │                                        │
│  │ Segments       │                                        │
│  └────────┬───────┘                                        │
│           │                                                 │
│           ▼                                                 │
│  ┌────────────────┐                                        │
│  │ Generate       │                                        │
│  │ Answer (Groq/  │                                        │
│  │ OpenAI/Claude) │                                        │
│  └────────┬───────┘                                        │
│           │                                                 │
│           ▼                                                 │
│  ┌────────────────┐                                        │
│  │ Save to        │                                        │
│  │ Analytics DB   │                                        │
│  │ with provider  │                                        │
│  └────────────────┘                                        │
└─────────────────────────────────────────────────────────────┘
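Step 2 of the pipeline groups transcript segments into time-stamped chunks of roughly 90 seconds. The sketch below shows one way that grouping could work. The `Segment` shape, the function names, and the greedy grouping rule are assumptions made for illustration, not the project's actual code; only the ~90s target and the "MM:SS" timestamp format come from this README.

```typescript
// Hypothetical shape of one transcript segment (a caption line or a Whisper segment).
interface Segment {
  text: string;
  start: number; // seconds from the start of the video
  end: number; // seconds from the start of the video
}

interface TranscriptChunk {
  chunkIndex: number;
  text: string;
  startTime: number;
  endTime: number;
  timestamp: string; // "MM:SS", as in the Chunk table
}

// Format whole seconds as "MM:SS"; minutes keep counting past 59 for long videos.
function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const mm = String(Math.floor(s / 60)).padStart(2, "0");
  const ss = String(s % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Greedily group consecutive segments; start a new chunk when adding the
// next segment would stretch the current chunk past targetSeconds.
function chunkSegments(segments: Segment[], targetSeconds = 90): TranscriptChunk[] {
  const chunks: TranscriptChunk[] = [];
  let current: Segment[] = [];

  const flush = (): void => {
    if (current.length === 0) return;
    const startTime = Math.floor(current[0].start);
    const endTime = Math.ceil(current[current.length - 1].end);
    chunks.push({
      chunkIndex: chunks.length,
      text: current.map((seg) => seg.text.trim()).join(" "),
      startTime,
      endTime,
      timestamp: formatTimestamp(startTime),
    });
    current = [];
  };

  for (const seg of segments) {
    if (current.length > 0 && seg.end - current[0].start > targetSeconds) {
      flush();
    }
    current.push(seg);
  }
  flush();
  return chunks;
}
```

Grouping on segment boundaries keeps each chunk's start time aligned with a real caption line, so a cited timestamp points at the moment the text is actually spoken.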

Session Management Flow

┌─────────────────────────────────────────────────────────┐
│              SESSION STORAGE ARCHITECTURE                │
│                                                          │
│  localStorage Structure:                                 │
│  ├── youtube-qa-last-video          (Last active video) │
│  ├── youtube-qa-session-{videoId1}  (Video 1 session)  │
│  ├── youtube-qa-session-{videoId2}  (Video 2 session)  │
│  ├── chat-history-{videoId1}        (Video 1 chat)     │
│  └── chat-history-{videoId2}        (Video 2 chat)     │
│                                                          │
│  Benefits:                                               │
│  • Each video has isolated processing state             │
│  • Switch between videos without losing progress       │
│  • Chat history persists per video                     │
│  • Resume from exactly where you left off              │
└─────────────────────────────────────────────────────────┘
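The key layout above can be wrapped in a few small helpers. This is a minimal sketch: the key names are taken from the diagram, while the `VideoSession` fields, the helper names, and the `KeyValueStore` interface (a subset of the Web Storage API, so the code can run outside a browser) are assumptions made for illustration.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it in a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const LAST_VIDEO_KEY = "youtube-qa-last-video";
const sessionKey = (videoId: string): string => `youtube-qa-session-${videoId}`;
const chatKey = (videoId: string): string => `chat-history-${videoId}`;

// Hypothetical session payload; the real fields are not specified in this README.
interface VideoSession {
  step: number;
  updatedAt: string;
}

// Persist one video's state and remember it as the last active video.
function saveSession(store: KeyValueStore, videoId: string, session: VideoSession): void {
  store.setItem(sessionKey(videoId), JSON.stringify(session));
  store.setItem(LAST_VIDEO_KEY, videoId);
}

// Return the stored session, or null when none exists or the entry is corrupted.
function loadSession(store: KeyValueStore, videoId: string): VideoSession | null {
  const raw = store.getItem(sessionKey(videoId));
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as VideoSession;
  } catch {
    return null;
  }
}

// Remove everything stored for one video (session, chat history, last-video pointer).
function clearVideoState(store: KeyValueStore, videoId: string): void {
  store.removeItem(sessionKey(videoId));
  store.removeItem(chatKey(videoId));
  if (store.getItem(LAST_VIDEO_KEY) === videoId) {
    store.removeItem(LAST_VIDEO_KEY);
  }
}
```

Because every key embeds the video ID, saving or clearing one video never touches another video's entries, which is what makes the per-video isolation work.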

Delete Operation Flow

┌─────────────────────────────────────────────────────────┐
│              VIDEO DELETION PROCESS                      │
│                                                          │
│  User clicks Delete                                      │
│       │                                                  │
│       ▼                                                  │
│  ┌────────────────┐                                    │
│  │ Show Custom    │                                    │
│  │ Confirmation   │                                    │
│  │ Modal (shadcn) │                                    │
│  └────────┬───────┘                                    │
│           │ Confirmed                                   │
│           ▼                                             │
│  ┌────────────────┐                                    │
│  │ 1. Fetch all   │                                    │
│  │    chunks with │                                    │
│  │    vectorIds   │                                    │
│  └────────┬───────┘                                    │
│           │                                             │
│           ▼                                             │
│  ┌────────────────┐                                    │
│  │ 2. Delete      │                                    │
│  │    vectors from│                                    │
│  │    Pinecone    │                                    │
│  │    (batches of │                                    │
│  │    100)        │                                    │
│  └────────┬───────┘                                    │
│           │                                             │
│           ▼                                             │
│  ┌────────────────┐                                    │
│  │ 3. Delete      │                                    │
│  │    video from  │                                    │
│  │    SQLite      │                                    │
│  │    (CASCADE    │                                    │
│  │    deletes     │                                    │
│  │    chunks &    │                                    │
│  │    analytics)  │                                    │
│  └────────┬───────┘                                    │
│           │                                             │
│           ▼                                             │
│  ┌────────────────┐                                    │
│  │ 4. Clear       │                                    │
│  │    localStorage│                                    │
│  │    sessions    │                                    │
│  └────────────────┘                                    │
└─────────────────────────────────────────────────────────┘
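Step 2 of the delete flow removes vectors in batches of 100. A minimal sketch of the batching itself is shown below. The helper name `toBatches` is hypothetical, and the call that actually deletes each batch from Pinecone is deliberately left out.

```typescript
// Split an array into consecutive batches of at most `size` items.
function toBatches<T>(items: readonly T[], size = 100): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Example: 250 vector IDs become batches of 100, 100, and 50.
const exampleIds = Array.from({ length: 250 }, (_, i) => `vec-${i}`);
const exampleBatches = toBatches(exampleIds);
console.log(exampleBatches.map((batch) => batch.length));
```

Each batch would then be handed to the vector store's delete call in turn, so one oversized request never has to carry every ID for a long video.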

🚀 Getting Started

Prerequisites

Node.js 18+
One of the following:
- FREE Stack: Mistral AI API Key + Groq API Key + Pinecone Account
- Paid Stack: OpenAI API Key + Pinecone Account
yt-dlp (for YouTube videos)

Installation

1. Install yt-dlp

Windows:

winget install yt-dlp

macOS:

brew install yt-dlp

Linux:

sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp

2. Clone & Install

git clone https://github.com/reachvivek/QueryTube.git
cd QueryTube
npm install

3. Configure Environment

cp .env.example .env.local

Edit .env.local:

# FREE Stack Configuration (Recommended)
MISTRAL_API_KEY=your-mistral-key-here
GROQ_API_KEY=your-groq-key-here
DEFAULT_EMBEDDING_PROVIDER=mistral
DEFAULT_AI_PROVIDER=groq
DEFAULT_MODEL=llama-3.3-70b-versatile

# Pinecone Configuration (Required)
PINECONE_API_KEY=your-pinecone-key-here
PINECONE_INDEX=youtube-qa
PINECONE_ENVIRONMENT=us-east-1-aws

# Optional: OpenAI (Fallback)
OPENAI_API_KEY=your-openai-key-here

# Optional: Claude (Alternative)
ANTHROPIC_API_KEY=your-claude-key-here

4. Set Up Pinecone Index

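Go to the Pinecone Console and create a new index:
- Name: youtube-qa (must match PINECONE_INDEX)
- Dimensions: 1024 (for Mistral) or 1536 (for OpenAI). An index has a fixed dimension, so pick the value that matches your DEFAULT_EMBEDDING_PROVIDER.
- Metric: Cosine
- Cloud Provider: Your choice (free tier available)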
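The dimension and metric settings matter because every stored vector and every query vector must have the same length, and the index ranks matches by cosine similarity. The sketch below computes that ranking in memory to show what the index does conceptually. The function names and the `StoredVector` shape are assumptions for illustration; this is not the Pinecone client API.

```typescript
// Embedding sizes listed in the index setup above.
const EMBEDDING_DIMENSIONS = { mistral: 1024, openai: 1536 } as const;

interface StoredVector {
  id: string;
  values: readonly number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and keep the k best.
function topKMatches(
  query: readonly number[],
  vectors: readonly StoredVector[],
  k: number,
): { id: string; score: number }[] {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, Math.max(0, k));
}
```

The dimension check is the in-memory analogue of what happens when a 1536-dimension OpenAI embedding is sent to a 1024-dimension index: the request is rejected rather than silently compared.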

5. Initialize Database

npx prisma generate
npx prisma migrate dev

6. Run Development Server

npm run dev

Open http://localhost:3000

⚙️ Configuration

Environment Variables

Variable	Required	Default	Description
`MISTRAL_API_KEY`	For FREE stack	-	Mistral AI API key for embeddings
`GROQ_API_KEY`	For FREE stack	-	Groq API key for Q&A
`PINECONE_API_KEY`	✅ Yes	-	Pinecone API key for vector storage
`PINECONE_INDEX`	✅ Yes	`youtube-qa`	Pinecone index name
`PINECONE_ENVIRONMENT`	✅ Yes	-	Pinecone region
`DEFAULT_EMBEDDING_PROVIDER`	No	`mistral`	Embedding provider (mistral/openai)
`DEFAULT_AI_PROVIDER`	No	`groq`	Q&A provider (groq/openai/claude)
`DEFAULT_MODEL`	No	`llama-3.3-70b-versatile`	AI model name
`OPENAI_API_KEY`	Optional	-	OpenAI API key (fallback)
`ANTHROPIC_API_KEY`	Optional	-	Claude API key (alternative)
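Which keys are actually required depends on the providers you select. The sketch below checks an environment object against the table above. The function name and the exact rules are assumptions derived from the table (for example, that choosing `openai` as a provider requires `OPENAI_API_KEY`), not the project's own startup validation.

```typescript
type Env = Record<string, string | undefined>;

// Return the names of variables that are needed for the selected providers
// but are missing or blank. Defaults follow the table: mistral and groq.
function missingEnvVars(env: Env): string[] {
  const required = ["PINECONE_API_KEY", "PINECONE_INDEX", "PINECONE_ENVIRONMENT"];

  const embeddingProvider = env.DEFAULT_EMBEDDING_PROVIDER ?? "mistral";
  required.push(embeddingProvider === "openai" ? "OPENAI_API_KEY" : "MISTRAL_API_KEY");

  const aiProvider = env.DEFAULT_AI_PROVIDER ?? "groq";
  if (aiProvider === "groq") {
    required.push("GROQ_API_KEY");
  } else if (aiProvider === "claude") {
    required.push("ANTHROPIC_API_KEY");
  } else {
    required.push("OPENAI_API_KEY");
  }

  // De-duplicate (OPENAI_API_KEY may be listed twice), then keep only unset names.
  return Array.from(new Set(required)).filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Next.js app this could be called once at startup with `process.env`, failing fast with a readable message instead of surfacing a provider error on the first request.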

🗄 Database Schema

Entity Relationship Diagram

┌─────────────────────────────────────────┐
│              Video                      │
├─────────────────────────────────────────┤
│ id                 (PK, UUID)           │
│ youtubeUrl         (String?)           │
│ youtubeId          (String? UNIQUE)    │
│ title              (String)            │
│ description        (String?)           │
│ duration           (Int - seconds)     │
│ durationFormatted  (String)            │
│ thumbnail          (String?)           │
│ uploader           (String?)           │
│ language           (String)            │
│ status             (String)            │
│ errorMessage       (String?)           │
│ transcriptSource   (String?)           │
│ transcript         (String?)           │
│ summary            (String?)           │  ← AI-generated
│ uploadedAt         (DateTime)          │
│ processedAt        (DateTime?)         │
└──────────────┬──────────────────────────┘
               │ 1
               │
               │ N
┌──────────────▼──────────────────────────┐
│              Chunk                      │
├─────────────────────────────────────────┤
│ id                 (PK, UUID)           │
│ videoId            (FK → Video) CASCADE │  ← Deletes on video delete
│ chunkIndex         (Int)                │
│ text               (String)             │
│ startTime          (Int - seconds)      │
│ endTime            (Int - seconds)      │
│ timestamp          (String "MM:SS")     │
│ vectorId           (String? UNIQUE)     │  ← Pinecone vector ID
│ embeddingProvider  (String?)            │  ← mistral/openai tracking
│ createdAt          (DateTime)           │
└──────────────┬──────────────────────────┘
               │  (layout only: Analytics references Video
               │   through videoId, not Chunk)
┌──────────────▼──────────────────────────┐
│            Analytics                    │
├─────────────────────────────────────────┤
│ id                 (PK, UUID)           │
│ videoId            (FK → Video) CASCADE │  ← Deletes on video delete
│ question           (String)             │
│ answer             (String)             │
│ responseTime       (Float - seconds)    │
│ provider           (String)             │  ← groq/openai/claude
│ model              (String)             │  ← llama-3.3-70b-versatile, etc
│ chunksUsed         (Int)                │
│ timestamp          (DateTime)           │
└─────────────────────────────────────────┘

Indexes:
• Video: status, youtubeId
• Chunk: videoId, vectorId, (videoId + chunkIndex UNIQUE)
• Analytics: videoId, timestamp, provider

Key Relationships

Video → Chunks: One-to-Many with CASCADE delete
- When a video is deleted, all its chunks are automatically removed
- Each chunk links to Pinecone via vectorId
Video → Analytics: One-to-Many with CASCADE delete
- When a video is deleted, all Q&A history is removed
- Tracks which AI provider generated each answer
Chunk → Pinecone: One-to-One
- Each chunk maps to one vector in Pinecone
- embeddingProvider tracks which service created the embedding
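The relationships above imply that deleting a video has two parts: SQLite removes chunks and analytics through CASCADE, while the Pinecone vectors have to be collected and deleted explicitly. The in-memory model below illustrates that split. The row types keep only a few columns from the schema, and `deleteVideoCascade` is a hypothetical helper, not a Prisma call.

```typescript
interface VideoRow { id: string; title: string }
interface ChunkRow { id: string; videoId: string; chunkIndex: number; vectorId: string | null }
interface AnalyticsRow { id: string; videoId: string; question: string }

interface InMemoryDb {
  videos: VideoRow[];
  chunks: ChunkRow[];
  analytics: AnalyticsRow[];
}

// Simulate the cascade: drop the video plus its chunks and analytics, and
// return the vector IDs that still need to be removed from the vector store.
function deleteVideoCascade(
  db: InMemoryDb,
  videoId: string,
): { db: InMemoryDb; vectorIds: string[] } {
  const vectorIds = db.chunks
    .filter((c) => c.videoId === videoId)
    .map((c) => c.vectorId)
    .filter((id): id is string => id !== null);

  return {
    db: {
      videos: db.videos.filter((v) => v.id !== videoId),
      chunks: db.chunks.filter((c) => c.videoId !== videoId),
      analytics: db.analytics.filter((a) => a.videoId !== videoId),
    },
    vectorIds,
  };
}
```

The vector IDs are read before the rows disappear. Once the cascade has run, the only record of which Pinecone vectors belonged to the video is gone.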

📚 API Documentation

Core Endpoints

1. Validate Video

GET /api/validate-video?url={youtube_url}

2. Process Transcript

POST /api/process-transcript
Body: { videoId, url }

3. Generate Embeddings

POST /api/embeddings
Body: { chunks, model?, provider? }

4. Upload to Pinecone

POST /api/upload-vectors
Body: { videoId, chunks, embeddings, embeddingProvider, metadata }

5. Generate Summary

POST /api/generate-summary
Body: { videoId }

6. Ask Question (RAG)

POST /api/qa
Body: { question, videoId?, provider?, model?, topK? }

7. Get Video Details

GET /api/videos/{id}

8. Update Video

PATCH /api/videos/{id}
Body: { transcript?, chunks?, summary?, status?, ... }

9. Delete Video

DELETE /api/videos/{id}

Deletes:

Video record from SQLite
All chunks (CASCADE)
All analytics (CASCADE)
All vectors from Pinecone
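A client can prepare requests for these endpoints with a couple of small helpers. The sketch below builds the validate-video URL and the Q&A request options. The helper names are hypothetical, and sending the body as JSON with a `Content-Type: application/json` header is an assumption; only the paths and body fields come from the endpoint list above.

```typescript
// Body fields for POST /api/qa, as listed above; only `question` is required.
interface QaRequestBody {
  question: string;
  videoId?: string;
  provider?: string;
  model?: string;
  topK?: number;
}

interface JsonPostOptions {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// The YouTube URL contains "?", "&" and "=", so it must be percent-encoded
// before it is placed in the query string.
function validateVideoUrl(baseUrl: string, youtubeUrl: string): string {
  return `${baseUrl}/api/validate-video?url=${encodeURIComponent(youtubeUrl)}`;
}

// Build options that can be passed as the second argument to fetch().
function buildQaRequest(body: QaRequestBody): JsonPostOptions {
  if (body.question.trim() === "") {
    throw new Error("question must not be empty");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

With these in place, a call would look like `fetch("http://localhost:3000/api/qa", buildQaRequest({ question: "What is covered at the start?", videoId }))`, where `videoId` is the ID of an already processed video.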

💰 Free Tier Setup

100% FREE Configuration

Service	Free Tier	What It Provides
Mistral AI	1B tokens/month	Embeddings (1024 dimensions)
Groq	14,400 requests/day	Q&A generation (Llama 3.3 70B)
Pinecone	100K vectors	Vector storage & search

Cost Comparison: FREE vs PAID

Per 45-minute video (~30 chunks):

Provider	FREE Stack	PAID Stack (OpenAI)
Embeddings	$0.00 (Mistral)	$0.002 (OpenAI)
Q&A (100 questions)	$0.00 (Groq)	$0.05 (GPT-4o-mini)
Vector Storage	$0.00 (Pinecone free)	$0.00 (Pinecone free)
Total	$0.00	~$0.052

Monthly Example (50 videos, 5,000 questions):

FREE Stack: $0/month 🎉
PAID Stack: ~$18.50 first month, ~$2.50/month after
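The per-video figures above can be turned into a small calculator. This sketch uses only numbers from this section (the ~90s chunk size, the per-video paid costs, and the 100K-vector free tier); the function names are hypothetical. It reproduces the per-video total and the recurring monthly figure, but does not attempt to reproduce the higher first-month number, which is not derived from the table.

```typescript
const CHUNK_SECONDS = 90; // approximate chunk length used by the pipeline
const PAID_EMBEDDING_COST_PER_VIDEO = 0.002; // USD, OpenAI embeddings for a 45-minute video
const PAID_QA_COST_PER_100_QUESTIONS = 0.05; // USD, GPT-4o-mini
const PINECONE_FREE_VECTORS = 100_000;

// Number of ~90s chunks (and therefore vectors) for a video of the given length.
function estimateChunks(minutes: number): number {
  return Math.ceil((minutes * 60) / CHUNK_SECONDS);
}

// Paid-stack cost in USD, rounded to a tenth of a cent.
function paidStackCost(videos: number, questions: number): number {
  const embedding = videos * PAID_EMBEDDING_COST_PER_VIDEO;
  const qa = (questions / 100) * PAID_QA_COST_PER_100_QUESTIONS;
  return Math.round((embedding + qa) * 1000) / 1000;
}

// How many videos of a given length fit in the Pinecone free tier.
function freeTierVideoCapacity(minutesPerVideo: number): number {
  return Math.floor(PINECONE_FREE_VECTORS / estimateChunks(minutesPerVideo));
}

console.log(estimateChunks(45)); // 30 chunks for a 45-minute video
console.log(paidStackCost(1, 100)); // 0.052 for one video with 100 questions
console.log(paidStackCost(50, 5000)); // 2.6 for the monthly example
console.log(freeTierVideoCapacity(45)); // 3333 videos before the vector limit
```

At this scale the Q&A calls dominate the paid-stack cost, while the vector limit is the first free-tier ceiling a large library would reach.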

Getting Free API Keys

Mistral AI: console.mistral.ai → Create API key
Groq: console.groq.com → Get API key
Pinecone: app.pinecone.io → Free tier, no credit card

🤝 Contributing

This is a proprietary project. See LICENSE.md for usage restrictions.

For bugs and feature requests, contact @reachvivek.

📝 License

Custom Proprietary License - See LICENSE.md

TL;DR:

✅ Personal & educational use allowed
✅ Fork for learning (with attribution)
❌ Commercial use prohibited
❌ Redistribution prohibited

📧 Contact

Vivek Kumar Singh (reachvivek)

🔗 LinkedIn: linkedin.com/in/reachvivek
💻 GitHub: @reachvivek
📸 Instagram: @rogerthatvivek

Built with ❤️ by Vivek Kumar Singh

⭐ Star this repo if you find it useful!

Report Bug · Request Feature

QueryTube™ is a proprietary project.
