Lumos

A RAG LLM co-pilot for browsing the web, powered by local LLMs

Stars: 1257

Visit

Lumos is a Chrome extension powered by a local LLM co-pilot for browsing the web. It allows users to summarize long threads, news articles, and technical documentation. Users can ask questions about reviews and product pages. The tool requires a local Ollama server for LLM inference and embedding database. Lumos supports multimodal models and file attachments for processing text and image content. It also provides options to customize models, hosts, and content parsers. The extension can be easily accessed through keyboard shortcuts and offers tools for automatic invocation based on prompts.

README:

Lumos

A RAG LLM co-pilot for browsing the web, powered by local LLMs.

This Chrome extension is powered by Ollama. Inference is done on your local machine without any remote server support. However, due to security constraints in the Chrome extension platform, the app does rely on local server support to run the LLM. This app is inspired by the Chrome extension example provided by the Web LLM project and the local LLM examples provided by LangChain.

Lumos. Nox. Lumos. Nox.

Use Cases

Summarize long threads on issue tracking sites, forums, and social media sites.
Summarize news articles.
Ask questions about reviews on business and product pages.
Ask questions about long, technical documentation.
... what else?

Ollama Server

A local Ollama server is needed for the embedding database and LLM inference. Download and install Ollama and the CLI here.

Pull Image

Example:

ollama pull llama2

Start Server

Example:

OLLAMA_ORIGINS=chrome-extension://* ollama serve

Terminal output:

2023/11/19 20:55:16 images.go:799: total blobs: 6
2023/11/19 20:55:16 images.go:806: total unused blobs removed: 0
2023/11/19 20:55:16 routes.go:777: Listening on 127.0.0.1:11434 (version 0.1.10)

Note: The environment variable OLLAMA_ORIGINS must be set to chrome-extension://* to allow requests originating from the Chrome extension. The following error will occur in the Chrome extension if OLLAMA_ORIGINS is not set properly.

Access to fetch at 'http://localhost:11434/api/tags' from origin 'chrome-extension://<extension_id>' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

macOS

Run launchctl setenv to set OLLAMA_ORIGINS.

launchctl setenv OLLAMA_ORIGINS "chrome-extension://*"

Setting environment variables on Mac (Ollama)

Docker

The Ollama server can also be run in a Docker container. The container should have the OLLAMA_ORIGINS environment variable set to chrome-extension://*.

Run docker run with the -e flag to set the OLLAMA_ORIGINS environment variable:

docker run -e OLLAMA_ORIGINS="chrome-extension://*" -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Chrome Extension

In the project directory, you can run:

`npm test`

Launches the test runner in the interactive watch mode.
See the section about running tests for more information.

`npm run lint`

Runs eslint and prettier on src and __tests__ files.

`npm run build`

Builds the app for production to the dist folder.
It correctly bundles React in production mode and optimizes the build for the best performance.

The build is minified and the filenames include the hashes.
Your app is ready to be deployed!

See the section about deployment for more information.

Load Unpacked Extension (Install)

https://developer.chrome.com/docs/extensions/mv3/getstarted/development-basics/#load-unpacked

Keyboard Shortcut

Create a keyboard shortcut to make Lumos easily accessible.

Navigate to chrome://extensions/shortcuts.
Configure a shortcut for Lumos (Activate the extension). For example, ⌘L (command key + L).

Releases

If you don't have npm installed, you can download the pre-built extension package from the Releases page.

Lumos Options

Right-click on the extension icon and select Options to access the extension's Options page.

Ollama Model: Select desired model (e.g. llama2)
Ollama Embedding Model: Select desired embedding model (e.g. nomic-embed-text). Caution: Using a different embedding model requires Ollama to swap models, which may incur undesired latency in the app. This is a known limitation in Ollama and may be improved in the future.
Ollama Host: Select desired host (defaults to http://localhost:11434)
Vector Store TTL (minutes): Number of minutes to store a URL's content in the vector store cache.
Content Parser Config: Lumos's default content parser will extract all text content between a page's <body></body> tag. To customize the content parser, add an entry to the configuration.
Enable/Disable Tools: Enable or disable individual tools. If a tool is enabled, a custom prefix trigger (e.g. "calc:") can be specified to override the app's internal prompt classification mechanism.
Enable/Disable Dark Arts: 😈

Content Parser Config

Each URL path can have its own content parser. The content parser config for the longest URL path will be matched.

chunkSize: Number of characters to chunk page content into for indexing into RAG vectorstore
chunkOverlap: Number of characters to overlap in chunks for indexing into RAG vectorstore
selectors: document.querySelector() queries to perform to retrieve page content
selectorsAll: document.querySelectorAll() queries to perform to retrieve page content

For example, given the following config, if the URL path of the current tab is domain.com/path1/subpath1/subsubpath1, then the config for domain.com/path1/subpath1 will be used (i.e. chunkSize=600).

{
  "domain.com/path1/subpath1": {
    "chunkSize": 600,
    "chunkOverlap": 200,
    "selectors": [
      "#id"
    ],
    "selectorsAll": []
  },
  "domain.com/path1": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [
      ".className"
    ],
    "selectorsAll": []
  }
}

See docs for How to Create a Custom Content Parser. See documentation for querySelector() and querySelectorAll() to confirm all querying capabilities.

Example:

{
  "default": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [
      "body"
    ],
    "selectorsAll": []
  },
  "medium.com": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [
      "article"
    ],
    "selectorsAll": []
  },
  "reddit.com": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [],
    "selectorsAll": [
      "shreddit-comment"
    ]
  },
  "stackoverflow.com": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [
      "#question-header",
      "#mainbar"
    ],
    "selectorsAll": []
  },
  "wikipedia.org": {
    "chunkSize": 2000,
    "chunkOverlap": 500,
    "selectors": [
      "#bodyContent"
    ],
    "selectorsAll": []
  },
  "yelp.com": {
    "chunkSize": 500,
    "chunkOverlap": 0,
    "selectors": [
      "#location-and-hours",
      "#reviews"
    ],
    "selectorsAll": []
  }
}

Highlighted Content

Alternatively, if content is highlighted on a page (e.g. highlighted text), that content will be parsed instead of the content produced from the content parser configuration.

Note: Content that is highlighted will not be cached in the vector store cache. Each subsequent prompt containing highlighted content will generate new embeddings.

Shortcuts

cmd + c: Copy last message to clipboard.
cmd + j: Toggle Disable content parsing checkbox.
cmd + k: Clear all messages.
cmd + ;: Open/close Chat History panel.
ctrl + c: Cancel request (LLM request/streaming or embeddings generation)
ctrl + x: Remove file attachment.
ctrl + r: Regenerate last LLM response.

Multimodal

Lumos supports multimodal models! Images that are present on the current page will be downloaded and bound to the model for prompting. See documentation and examples here.

File Attachments

File attachments can be uploaded to Lumos. The contents of a file will be parsed and processed through Lumos's RAG workflow (similar to processing page content). By default, the text content of a file will be parsed if the extension type is not explicitly listed below.

Supported extension types:

.csv
.json
.pdf
any plain text file format (.txt, .md, .py, etc)

Note: If an attachment is present, page content will not be parsed. Remove the file attachment to resume parsing page content.

Image Files

Image files will be processed through Lumos's multimodal workflow (requires multimodal model).

Supported image types:

.jpeg, .jpg
.png

Tools (Experimental)

Lumos invokes Tools automatically based on the provided prompt. See documentation and examples here.

Reading

For Tasks:

Click tags to check more tools for each tasks

summarize threads ask questions about reviews summarize news articles browse technical documentation process file attachments

For Jobs:

web developer data analyst content creator research analyst product manager

Alternative AI tools for Lumos

Similar Open Source Tools

Lumos

github

: 1.3k

OpenAI

OpenAI is a Swift community-maintained implementation over OpenAI public API. It is a non-profit artificial intelligence research organization founded in San Francisco, California in 2015. OpenAI's mission is to ensure safe and responsible use of AI for civic good, economic growth, and other public benefits. The repository provides functionalities for text completions, chats, image generation, audio processing, edits, embeddings, models, moderations, utilities, and Combine extensions.

github

: 2.4k

bot-on-anything

The 'bot-on-anything' repository allows developers to integrate various AI models into messaging applications, enabling the creation of intelligent chatbots. By configuring the connections between models and applications, developers can easily switch between multiple channels within a project. The architecture is highly scalable, allowing the reuse of algorithmic capabilities for each new application and model integration. Supported models include ChatGPT, GPT-3.0, New Bing, and Google Bard, while supported applications range from terminals and web platforms to messaging apps like WeChat, Telegram, QQ, and more. The repository provides detailed instructions for setting up the environment, configuring the models and channels, and running the chatbot for various tasks across different messaging platforms.

github

: 4.0k

call-center-ai

Call Center AI is an AI-powered call center solution that leverages Azure and OpenAI GPT. It is a proof of concept demonstrating the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI to build an automated call center solution. The project showcases features like accessing claims on a public website, customer conversation history, language change during conversation, bot interaction via phone number, multiple voice tones, lexicon understanding, todo list creation, customizable prompts, content filtering, GPT-4 Turbo for customer requests, specific data schema for claims, documentation database access, SMS report sending, conversation resumption, and more. The system architecture includes components like RAG AI Search, SMS gateway, call gateway, moderation, Cosmos DB, event broker, GPT-4 Turbo, Redis cache, translation service, and more. The tool can be deployed remotely using GitHub Actions and locally with prerequisites like Azure environment setup, configuration file creation, and resource hosting. Advanced usage includes custom training data with AI Search, prompt customization, language customization, moderation level customization, claim data schema customization, OpenAI compatible model usage for the LLM, and Twilio integration for SMS.

github

: 119

json-repair

JSON Repair is a toolkit designed to address JSON anomalies that can arise from Large Language Models (LLMs). It offers a comprehensive solution for repairing JSON strings, ensuring accuracy and reliability in your data processing. With its user-friendly interface and extensive capabilities, JSON Repair empowers developers to seamlessly integrate JSON repair into their workflows.

github

: 135

swarmzero

SwarmZero SDK is a library that simplifies the creation and execution of AI Agents and Swarms of Agents. It supports various LLM Providers such as OpenAI, Azure OpenAI, Anthropic, MistralAI, Gemini, Nebius, and Ollama. Users can easily install the library using pip or poetry, set up the environment and configuration, create and run Agents, collaborate with Swarms, add tools for complex tasks, and utilize retriever tools for semantic information retrieval. Sample prompts are provided to help users explore the capabilities of the agents and swarms. The SDK also includes detailed examples and documentation for reference.

github

: 218

memobase

Memobase is a user profile-based memory system designed to enhance Generative AI applications by enabling them to remember, understand, and evolve with users. It provides structured user profiles, scalable profiling, easy integration with existing LLM stacks, batch processing for speed, and is production-ready. Users can manage users, insert data, get memory profiles, and track user preferences and behaviors. Memobase is ideal for applications that require user analysis, tracking, and personalized interactions.

github

: 994

hf-waitress

HF-Waitress is a powerful server application for deploying and interacting with HuggingFace Transformer models. It simplifies running open-source Large Language Models (LLMs) locally on-device, providing on-the-fly quantization via BitsAndBytes, HQQ, and Quanto. It requires no manual model downloads, offers concurrency, streaming responses, and supports various hardware and platforms. The server uses a `config.json` file for easy configuration management and provides detailed error handling and logging.

github

: 64

claim-ai-phone-bot

AI-powered call center solution with Azure and OpenAI GPT. The bot can answer calls, understand the customer's request, and provide relevant information or assistance. It can also create a todo list of tasks to complete the claim, and send a report after the call. The bot is customizable, and can be used in multiple languages.

github

: 65

instructor

Instructor is a popular Python library for managing structured outputs from large language models (LLMs). It offers a user-friendly API for validation, retries, and streaming responses. With support for various LLM providers and multiple languages, Instructor simplifies working with LLM outputs. The library includes features like response models, retry management, validation, streaming support, and flexible backends. It also provides hooks for logging and monitoring LLM interactions, and supports integration with Anthropic, Cohere, Gemini, Litellm, and Google AI models. Instructor facilitates tasks such as extracting user data from natural language, creating fine-tuned models, managing uploaded files, and monitoring usage of OpenAI models.

github

: 10.0k

hayhooks

Hayhooks is a tool that simplifies the deployment and serving of Haystack pipelines as REST APIs. It allows users to wrap their pipelines with custom logic and expose them via HTTP endpoints, including OpenAI-compatible chat completion endpoints. With Hayhooks, users can easily convert their Haystack pipelines into API services with minimal boilerplate code.

github

: 51

gfm-rag

The GFM-RAG is a graph foundation model-powered pipeline that combines graph neural networks to reason over knowledge graphs and retrieve relevant documents for question answering. It features a knowledge graph index, efficiency in multi-hop reasoning, generalizability to unseen datasets, transferability for fine-tuning, compatibility with agent-based frameworks, and interpretability of reasoning paths. The tool can be used for conducting retrieval and question answering tasks using pre-trained models or fine-tuning on custom datasets.

github

: 54

ai-artifacts

AI Artifacts is an open source tool that replicates Anthropic's Artifacts UI in the Claude chat app. It utilizes E2B's Code Interpreter SDK and Core SDK for secure AI code execution in a cloud sandbox environment. Users can run AI-generated code in various languages such as Python, JavaScript, R, and Nextjs apps. The tool also supports running AI-generated Python in Jupyter notebook, Next.js apps, and Streamlit apps. Additionally, it offers integration with Vercel AI SDK for tool calling and streaming responses from the model.

github

: 2.4k

mcphost

MCPHost is a CLI host application that enables Large Language Models (LLMs) to interact with external tools through the Model Context Protocol (MCP). It acts as a host in the MCP client-server architecture, allowing language models to access external tools and data sources, maintain consistent context across interactions, and execute commands safely. The tool supports interactive conversations with Claude 3.5 Sonnet and Ollama models, multiple concurrent MCP servers, dynamic tool discovery and integration, configurable server locations and arguments, and a consistent command interface across model types.

github

: 172

langserve

LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

github

: 1.9k

genaiscript

GenAIScript is a scripting environment designed to facilitate file ingestion, prompt development, and structured data extraction. Users can define metadata and model configurations, specify data sources, and define tasks to extract specific information. The tool provides a convenient way to analyze files and extract desired content in a structured format. It offers a user-friendly interface for working with data and automating data extraction processes, making it suitable for various data processing tasks.

github

: 2.5k

For similar tasks

Lumos

github

: 1.3k

open-source-slack-ai

This repository provides a ready-to-run basic Slack AI solution that allows users to summarize threads and channels using OpenAI. Users can generate thread summaries, channel overviews, channel summaries since a specific time, and full channel summaries. The tool is powered by GPT-3.5-Turbo and an ensemble of NLP models. It requires Python 3.8 or higher, an OpenAI API key, Slack App with associated API tokens, Poetry package manager, and ngrok for local development. Users can customize channel and thread summaries, run tests with coverage using pytest, and contribute to the project for future enhancements.

github

: 142

For similar jobs

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

daily-poetry-image

Daily Chinese ancient poetry and AI-generated images powered by Bing DALL-E-3. GitHub Action triggers the process automatically. Poetry is provided by Today's Poem API. The website is built with Astro.

github

: 492

exif-photo-blog

EXIF Photo Blog is a full-stack photo blog application built with Next.js, Vercel, and Postgres. It features built-in authentication, photo upload with EXIF extraction, photo organization by tag, infinite scroll, light/dark mode, automatic OG image generation, a CMD-K menu with photo search, experimental support for AI-generated descriptions, and support for Fujifilm simulations. The application is easy to deploy to Vercel with just a few clicks and can be customized with a variety of environment variables.

github

: 992

SillyTavern

SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.

github

: 13.2k

Twitter-Insight-LLM

This project enables you to fetch liked tweets from Twitter (using Selenium), save it to JSON and Excel files, and perform initial data analysis and image captions. This is part of the initial steps for a larger personal project involving Large Language Models (LLMs).

github

: 401

AISuperDomain

Aila Desktop Application is a powerful tool that integrates multiple leading AI models into a single desktop application. It allows users to interact with various AI models simultaneously, providing diverse responses and insights to their inquiries. With its user-friendly interface and customizable features, Aila empowers users to engage with AI seamlessly and efficiently. Whether you're a researcher, student, or professional, Aila can enhance your AI interactions and streamline your workflow.

github

: 1.2k

ChatGPT-On-CS

This project is an intelligent dialogue customer service tool based on a large model, which supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, Zhihu, etc. You can choose GPT3.5/GPT4.0/ Lazy Treasure Box (more platforms will be supported in the future), which can process text, voice and pictures, and access external resources such as operating systems and the Internet through plug-ins, and support enterprise AI applications customized based on their own knowledge base.

github

: 768

obs-localvocal

LocalVocal is a live-streaming AI assistant plugin for OBS that allows you to transcribe audio speech into text and perform various language processing functions on the text using AI / LLMs (Large Language Models). It's privacy-first, with all data staying on your machine, and requires no GPU, cloud costs, network, or downtime.

github

: 248