asktube

AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation (RAG) 🤖. Run it entirely on your local machine with Ollama, or cloud-based models like Claude, OpenAI, Gemini, Mistral, and more.

Stars: 54

AskTube is an AI-powered YouTube video summarizer and QA assistant that utilizes Retrieval Augmented Generation (RAG) technology. It offers a comprehensive solution with Q&A functionality and aims to provide a user-friendly experience for local machine usage. The project integrates various technologies including Python, JS, Sanic, Peewee, Pytubefix, Sentence Transformers, Sqlite, Chroma, and NuxtJs/DaisyUI. AskTube supports multiple providers for analysis, AI services, and speech-to-text conversion. The tool is designed to extract data from YouTube URLs, store embedding chapter subtitles, and facilitate interactive Q&A sessions with enriched questions. It is not intended for production use but rather for end-users on their local machines.

README:


🤷🏽 Why does this project exist?

  • I’ve seen several GitHub repositories offering AI-powered summaries for YouTube videos, but none include Q&A functionality.
  • I want to implement a more comprehensive solution while also gaining experience with AI to build my own RAG application.

🔨 Technology

  • Language: Python, JS
  • Server: Python@3.10, Bun@v1
  • Framework/Lib: Sanic, Peewee, Pytubefix, Sentence Transformers, Sqlite, Chroma, NuxtJs/DaisyUI, etc.
  • Embedding Provider (Analysis Provider):
    • [x] OpenAI
    • [x] Gemini
    • [x] VoyageAI
    • [x] Mistral
    • [x] Sentence Transformers (Local)
  • AI Provider:
    • [x] OpenAI
    • [x] Claude
    • [x] Gemini
    • [x] Mistral
    • [x] Ollama (Local)
  • Speech To Text:

🗓️ Next Todo Tasks

  • [ ] Implement Speech To Text for cloud models
    • [ ] AssemblyAI
    • [ ] OpenAI
    • [ ] Gemini
  • [ ] Enhance
    • [x] Skip using RAG for short videos
    • [ ] Chat prompts, chat messages by context limit
    • [ ] RAG: Implement Query Translation
      • [x] Multiquery
      • [ ] Fusion
      • [ ] Decomposition
      • [ ] Step back
      • [ ] HyDE

🚀 How to run ?

On the first run, the program may be a bit slow because it needs to download the local models.

Run on your machine

  • Ensure you have installed:

    • Python 3.10
      • Windows users: please download here
      • Linux and macOS users: please use Homebrew or your system package manager (apt, dnf, etc.)
      • Or use conda
    • Poetry
      • Windows User open Powershell and run:
      (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
      • Linux, MacOS User open Terminal and run:
      curl -sSL https://install.python-poetry.org | python3 -
    • Bun
    • ffmpeg
      • MacOS User
      brew install ffmpeg
      • Linux User
      # Ubuntu
      sudo apt install ffmpeg
      # Fedora
      sudo dnf install -y ffmpeg
  • Clone the repository

    git clone https://github.com/jonaskahn/asktube.git
  • Create file .env in asktube/engine directory:

  • Run program

    • You may need to run first:
    poetry env use python
    • Open terminal/cmd/powershell in asktube/engine directory, then run:
    poetry install && poetry run python engine/server.py
    • Open terminal/cmd/powershell in asktube/web directory, then run:
    bun install && bun run dev
  • Open web: http://localhost:3000
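
Before running the install commands, it can help to confirm that the prerequisites listed above are on your PATH. The following is a minimal sketch, not part of AskTube itself. The helper name `find_missing_tools` and the executable names in `REQUIRED_TOOLS` are assumptions (for example, Python may be exposed as `python` or `py` on Windows).

```python
import shutil

# Executable names are assumptions; adjust them for your platform
# (e.g. "python" or "py" instead of "python3" on Windows).
REQUIRED_TOOLS = ["python3", "poetry", "bun", "ffmpeg", "git"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The `which` parameter exists only so the lookup can be swapped out in tests. In normal use the default `shutil.which` is enough.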

With Docker (in progress)

Before You Start

  1. I built Docker images for these services, but if you want to build the images locally, run build.local.bat on Windows, or build.local.amd64.sh or build.local.aarch64.sh on macOS and Linux.
  2. If you have a GPU (CUDA or ROCm), refer to the ENV settings above and adjust the parameters accordingly.

Locally

  • Use local.yaml compose file to start
  • Open terminal/cmd/powershell in asktube directory
docker compose -f compose/local.yaml pull && docker compose -f compose/local.yaml up -d
  • After the containers are running, install the Ollama models qwen2 and llama3.1 for QA
docker exec -it ollama ollama run qwen2
docker exec -it ollama ollama run llama3.1

Free (with rate limit)

  • Register an account with Google Gemini and VoyageAI, then generate your own API keys:
    • Gemini is free with your Google Account
    • VoyageAI (recommended by Claude) gives you 50M free tokens (a huge amount), but you need to add a credit card first.
  • Set your ENV values in the free compose file (compose/free.yaml), then start Docker
  • Open terminal/cmd/powershell in asktube directory
docker compose -f compose/free.yaml pull && docker compose -f compose/free.yaml up -d

Ideal

  • Use VoyageAI for text embeddings
  • Use OpenAI and Claude for QA: register accounts and generate your own API keys
  • Set your ENV values in the ideal compose file (compose/ideal.yaml), then start Docker
  • Open terminal/cmd/powershell in asktube directory
docker compose -f compose/ideal.yaml pull && docker compose -f compose/ideal.yaml up -d

Result


💡 Architecture

The real implementation may differ from these diagrams because of its complexity.

1️⃣ Extract data from given URL

P1.png
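
As a rough illustration of the first part of this step, the sketch below pulls the video ID out of common YouTube URL shapes using only the standard library. It is an assumption about how a URL might be normalized before fetching. The project lists Pytubefix for the download work, and how AskTube actually parses URLs is not shown here. The function name `extract_video_id` is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the YouTube video ID from common URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in ("youtube.com", "m.youtube.com", "music.youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
            # Path-based links: /embed/<id>, /shorts/<id>, /live/<id>
            return parts[1]
    return None
```

Returning `None` for unrecognized URLs lets the caller reject bad input before any network request is made.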

2️⃣ Storing embeddings of chapter subtitles

P2.png
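
One way to picture this step is to group subtitle lines under the chapter they fall in, producing one text document per chapter that can then be embedded and stored. The sketch below covers only that grouping, using the standard library. The data shapes (tuples of start time and text) and the function name `group_subtitles_by_chapter` are assumptions for illustration, and the embedding and Chroma storage calls are deliberately left out.

```python
from bisect import bisect_right


def group_subtitles_by_chapter(chapters, segments):
    """Group subtitle text under the chapter whose time range contains it.

    chapters: list of (start_seconds, title), sorted by start time.
    segments: list of (start_seconds, text) subtitle lines.
    Returns a list of dicts, one per chapter, ready to be embedded.
    """
    if not chapters:
        # No chapter metadata: treat the whole video as a single chapter.
        chapters = [(0.0, "Full video")]
    starts = [start for start, _ in chapters]
    documents = [
        {"chapter": title, "start": start, "text": []}
        for start, title in chapters
    ]
    for seg_start, text in segments:
        # Index of the last chapter that starts at or before this segment.
        index = bisect_right(starts, seg_start) - 1
        if index < 0:
            index = 0  # segment before the first chapter: attach to the first
        documents[index]["text"].append(text.strip())
    for doc in documents:
        doc["text"] = " ".join(doc["text"])
    return documents
```

Each returned dict keeps the chapter title and start time alongside the text, so that metadata can travel with the embedding when it is stored.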

3️⃣ Asking (including question enrichment)

P3.png
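
The todo list above marks Multiquery as done. The general idea is to rewrite the user's question into several variants, retrieve for each, and merge the results. The sketch below is a toy version under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding provider, the query variants are passed in by the caller rather than generated by an LLM, and all function names are hypothetical. It is not AskTube's implementation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=2):
    """Return up to top_k (score, doc) pairs with a positive score, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


def multi_query_retrieve(queries, documents, top_k=2):
    """Run every query variant and merge results, keeping each doc's best score."""
    best = {}
    for query in queries:
        for score, doc in retrieve(query, documents, top_k):
            best[doc] = max(score, best.get(doc, 0.0))
    return sorted(best, key=best.get, reverse=True)
```

Merging by best score per document means a chunk only needs to match one phrasing of the question well to be included, which is the main benefit of querying with several variants.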


🪧 Notice

  1. Do not use this in production. It is aimed at end users running it on their local machines.
  2. Do not request any advanced features for management.

🏃🏽‍➡️ Demo & Screenshot

Demo image 2

Demo image 1

Update 09/19/2024

  • New UI, mobile responsive

Demo newui 1

✍🏿 For development


⁉️ FAQ and Troubleshooting

