gurubase

Gurubase is an open-source RAG system that lets you create AI-powered Q&A assistants by indexing websites, PDF documents, YouTube videos, and GitHub code repositories.

Stars: 103

Gurubase is an open-source RAG system that enables users to create AI-powered Q&A assistants ('Gurus') for various topics by integrating web pages, PDFs, YouTube videos, and GitHub repositories. It offers advanced LLM-based question answering, accurate context-aware responses through the RAG system, multiple data sources integration, easy website embedding, creation of custom AI assistants, real-time updates, personalized learning paths, and self-hosting options. Users can request Guru creation, manage existing Gurus, update datasources, and benefit from the system's features for enhancing user engagement and knowledge sharing.

README:

Gurubase - AI-powered Q&A assistants for any topic

What is Gurubase

Gurubase is an open-source RAG system that lets you create AI-powered Q&A assistants ("Gurus") for any topic or need. Create a new Guru by adding:

  • 📄 Webpages
  • 📑 PDFs
  • 🎥 YouTube videos
  • 💻 GitHub repositories

Start asking questions directly on Gurubase, or embed it on your website to let your users ask questions about your product. It's already being used by hundreds of open-source repositories. You can also install the entire system on your own server. See INSTALL.md for instructions on how to self-host Gurubase.

Features

  • 🤖 AI-Powered Q&A: Advanced LLM-based question answering, including an instant evaluation mechanism that minimizes hallucinations
  • 🔄 RAG System: Retrieval Augmented Generation for accurate, context-aware responses
  • 📚 Multiple Data Sources: Add web pages, PDFs, videos, and GitHub repositories as data sources for your Guru.
  • 🔌 Easy Integration: Embeddable widget for your website. Discord and Slack Bots coming soon
  • 🎯 Custom Gurus: Create specialized AI assistants for specific topics
  • 🔄 Real-time Updates: Keep the data sources up to date by reindexing them with one click
  • ⛬ Binge: Visualize your learning path while talking with a Guru. You can navigate through it and create a personalized path
  • 🛠 Self-hosted Option: Full control over your deployment. Install the entire system on your servers

Quick Install

If you prefer not to use Gurubase.io, you can install the entire system on your own servers.

curl -fsSL https://raw.githubusercontent.com/Gurubase/gurubase/refs/heads/master/gurubase.sh -o gurubase.sh
bash gurubase.sh

See INSTALL.md for detailed installation instructions and prerequisites.

How to Create a Guru

Currently, only the Gurubase team can create a Guru on Gurubase.io. Please open an issue on this repository with the title "Guru Creation Request" and include the GitHub repository link in the issue content. We prioritize Guru creation requests from the maintainers of the tools. Please mention whether you are the maintainer of the tool. If you are not the maintainer, it would be helpful to obtain the maintainer's permission before opening a creation request for the tool.

How to Claim a Guru

Although you can't create a Guru yourself on Gurubase.io, you can manage an existing one. For example, you can add, remove, or reindex its datasources. To claim a Guru, you must have a Gurubase account and be one of the tool's maintainers. Please open an issue with the title "Guru Claim Request". Include the link to the Guru (e.g., https://gurubase.io/g/anteon), your Gurubase username, and a link proving you are one of the maintainers of the tool, such as a PR merged by you.

Showcase Your Guru

1. Widget

Add an "Ask AI" widget to your website by importing a small JS script. For an example, check the Anteon docs.

2. Badge

Like hundreds of GitHub repositories, add a badge to your README to guide your users to learn about your tool on Gurubase.

Example Badge:

[![Gurubase](https://img.shields.io/badge/Gurubase-Ask%20OpenCost%20Guru-006BFF)](https://gurubase.io/g/opencost)
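To produce the same badge for another tool, the Markdown line can be generated from the tool name and the Guru slug. The following is a minimal sketch, not part of Gurubase: the function names are hypothetical, and the URL pattern is simply copied from the example badge above. It assumes the usual Shields.io static-badge convention of doubling literal dashes and underscores.

```python
from urllib.parse import quote

# Both base URLs are taken from the example badge in this README.
SHIELDS_BASE = "https://img.shields.io/badge"
GURU_BASE = "https://gurubase.io/g"


def _escape_badge_text(text: str) -> str:
    """Escape text for a Shields.io static badge path segment."""
    # "-" separates label, message, and color in the badge path, so literal
    # dashes and underscores are doubled before percent-encoding.
    escaped = text.replace("-", "--").replace("_", "__")
    return quote(escaped, safe="")


def gurubase_badge(tool_name: str, guru_slug: str, color: str = "006BFF") -> str:
    """Return a Markdown badge linking to a Guru page (hypothetical helper)."""
    label = _escape_badge_text("Gurubase")
    message = _escape_badge_text(f"Ask {tool_name} Guru")
    image_url = f"{SHIELDS_BASE}/{label}-{message}-{color}"
    guru_url = f"{GURU_BASE}/{quote(guru_slug, safe='')}"
    return f"[![Gurubase]({image_url})]({guru_url})"


print(gurubase_badge("OpenCost", "opencost"))
```

Calling `gurubase_badge("OpenCost", "opencost")` reproduces the example badge line above; substitute your own tool name and the slug from your Guru's URL.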

How to Update Datasources

Datasources can include your tool's documentation webpages, YouTube videos, or PDF files. You can add new ones, remove existing ones, or reindex them. Reindexing ensures your Guru is updated based on changes to the indexed datasources. For example, if you update your tool's documentation, you can reindex those pages so your Guru generates answers based on the latest data.

Once you claim your Guru, you will see your Gurus in the "My Gurus" section.

Click the Guru you want to update. On the edit page, click "Reindex" for the datasource you want to reindex.

You can also see the "Last Index Date" on the URL pages.

License

Licensed under the Apache 2.0 License.

All the content generated by gurubase.io aligns with the license of the datasources used to generate answers. More details can be found on the Terms of Usage page, Section 2.

Help

We prefer Discord for written communication. Join our channel! To stay updated on new features, you can follow us on X, Mastodon, and Bluesky.

Used By

Gurubase currently hosts hundreds of Gurus, and it grows every day. Here are some repositories that showcase their Gurus in their READMEs or documentation.


Sunshine
21.7K ★

Teable
15K ★

Albumentations
14.5K ★

Open IM
14.3K ★

Sandboxie
14.2K ★

Quarkus
14K ★

Navidrome
12.9K ★

Vanna
12.6K ★

Tamagui
11.9K ★

Carla
11.9K ★

Duplicati
11.5K ★

Mongoose
11.3K ★

Assimp
11.2K ★

WatermelonDB
10.7K ★

Gorse
8.7K ★

SQLFluff
8.4K ★

Databend
8.1K ★

Nhost
8K ★

ast-grep(sg)
7.9K ★

DoWhy
7.2K ★

100+ more

Frequently Asked Questions

What is Gurubase?

Gurubase is an open-source RAG system that creates AI-powered Q&A assistants ("Gurus"). It processes various data sources like web pages, videos, PDFs, and GitHub code repositories to provide context-aware answers.

How does Gurubase work?

Gurubase uses a modern RAG architecture:

  1. Indexing: Processes and chunks data sources
  2. Embedding: Converts text into vector representations
  3. Storage: Stores vectors in Milvus for efficient similarity search
  4. Retrieval: Finds relevant context when questions are asked
  5. Generation: Uses LLMs to generate accurate answers based on retrieved context
  6. Evaluation: Evaluates the contexts to prevent hallucinations

Check the ARCHITECTURE.md file for more details.
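The six steps above can be illustrated with a toy, dependency-free sketch. This is not Gurubase's implementation: the bag-of-words "embedding", the in-memory list standing in for Milvus, the similarity threshold used as the evaluation step, and every function and class name here are assumptions made for illustration. A real deployment would use an embedding model, a vector database, and an LLM call in their place.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list:
    """Indexing: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Embedding: toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Storage: a plain list standing in for a vector database such as Milvus."""

    def __init__(self):
        self.items = []  # (vector, chunk_text) pairs

    def add(self, document: str) -> None:
        for piece in chunk(document):
            self.items.append((embed(piece), piece))

    def search(self, question: str, top_k: int = 2) -> list:
        """Retrieval: return the top_k (score, chunk) pairs for a question."""
        query = embed(question)
        scored = [(cosine(query, vec), piece) for vec, piece in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


def answer(store: InMemoryStore, question: str, min_score: float = 0.2) -> str:
    hits = store.search(question)
    # Evaluation (assumed form): drop weakly related contexts instead of guessing.
    contexts = [piece for score, piece in hits if score >= min_score]
    if not contexts:
        return "I don't have enough context to answer that."
    # Generation: placeholder where an LLM would be prompted with the contexts.
    return "Based on the indexed sources: " + " ".join(contexts)


store = InMemoryStore()
store.add("Reindexing refreshes a Guru after its documentation changes.")
store.add("Milvus stores vectors for similarity search.")
print(answer(store, "What does Milvus store vectors for?"))
print(answer(store, "How tall is Everest?"))
```

The second question shares no terms with the indexed text, so the sketch declines to answer rather than generating from irrelevant context, which is the role the evaluation step plays in the list above.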

What types of data sources can I use?

Gurubase supports multiple data source types:

  • 📄 Web Pages
  • 📑 PDF Documents
  • 🎥 YouTube Videos
  • 💻 GitHub repositories for codebase indexing
  • More formats coming soon! Open an issue if you want a new data source type.

What's the system architecture?

Gurubase follows a microservices architecture, deployed with Docker Compose.

  • Frontend: Next.js 14 with TailwindCSS
  • Backend: Django REST framework
  • Vector Store: Milvus
  • Message Queue: RabbitMQ
  • Cache: Redis
  • Database: PostgreSQL

See ARCHITECTURE.md for details.

What are the system requirements?

Minimum requirements:

  • CPU: 4 cores
  • RAM: 8GB
  • Storage: 10GB SSD
  • OS: Linux or macOS (Windows via WSL2)

See INSTALL.md for detailed requirements.
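Before running the installer, you may want to compare a host against these minimums. The script below is a hypothetical pre-flight check, not something shipped with Gurubase: the thresholds are copied from the list above, and RAM detection relies on `os.sysconf`, which is assumed to be available on Linux and macOS (the check is skipped if it is not).

```python
import os
import shutil
from typing import List, Optional

# Minimums copied from the requirements list in this README.
MIN_CORES = 4
MIN_RAM_GB = 8
MIN_DISK_GB = 10


def check_requirements(cores: int, ram_gb: Optional[float], disk_gb: float) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"CPU: {cores} cores found, {MIN_CORES} required")
    # ram_gb is None when total memory could not be detected; skip that check.
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB found, {MIN_RAM_GB} GB required")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Storage: {disk_gb:.1f} GB free, {MIN_DISK_GB} GB required")
    return problems


def detect_ram_gb() -> Optional[float]:
    """Best-effort total RAM in GiB via sysconf; None if unsupported."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3
    except (ValueError, OSError, AttributeError):
        return None


if __name__ == "__main__":
    cores = os.cpu_count() or 0
    free_disk_gb = shutil.disk_usage(".").free / 1024 ** 3
    issues = check_requirements(cores, detect_ram_gb(), free_disk_gb)
    print("All minimum requirements met." if not issues else "\n".join(issues))
```

A machine sold as "8GB" often reports slightly less usable memory, so treat a marginal RAM warning as a prompt to check INSTALL.md rather than a hard failure.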

How can I use the Gurus created on Gurubase?

  1. You can use it on Gurubase.io (or on Gurubase Self-hosted if you’ve installed it on your servers).
  2. You can embed an Ask AI widget into your website.
  3. You can add a Gurubase badge to your GitHub repository README.
  4. We will release an API soon.

Are there Discord/Slack integrations?

Discord and Slack integrations are currently in development. Join our Discord for updates.

What is Binge?

Binge lets you:

  • Create personalized learning paths on any Guru.
  • Ask follow-up questions to dive deeper into the content.
  • Visualize your learning path on the Binge Map and navigate it easily and efficiently.
  • Save your progress to pick up where you left off.

How often is data reindexed?

  • Manual reindexing is available anytime. See the "How to Update Datasources" section to learn more.
  • Periodic reindexing will be available soon

Is there an API available?

A public API is in development. Features will include:

  • Question answering
  • Data source management
  • Analytics and usage stats

Join our Discord for API release updates.

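What's the license for self-hosted Gurubase?

As stated in the License section of this README, Gurubase is licensed under the Apache 2.0 License.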

How is data handled and secured?

  • All data is stored locally in self-hosted deployments including the API keys
  • No data is sent to external servers except LLM API calls
  • Optional telemetry can be disabled

What is Gurubase.io?

Gurubase.io is a hosted version of Gurubase. It's a great way to get started with Gurubase without the hassle of self-hosting.
