ha-llmvision

Let Home Assistant see!

Stars: 124

LLM Vision is a Home Assistant integration that lets users analyze images, videos, and camera feeds using multimodal LLMs. It supports providers such as OpenAI, Anthropic, Google Gemini, LocalAI, and Ollama. Images and videos can come from camera entities or local files, and images can optionally be downscaled for faster processing. The project's documentation covers how to set up LLM Vision and each supported provider, along with usage examples and service call parameters.

README:

Image and video analyzer for Home Assistant using multimodal LLMs

LLM Vision is a Home Assistant integration that analyzes images, videos and camera feeds using the vision capabilities of multimodal LLMs.
Supported providers are OpenAI, Anthropic, Google Gemini, LocalAI, Ollama and any OpenAI-compatible API.

Features

  • Compatible with OpenAI, Anthropic Claude, Google Gemini, LocalAI, Ollama and custom OpenAI-compatible APIs
  • Takes images and video from camera entities as input
  • Takes local image and video files as input
  • Images can be downscaled for faster processing
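
The downscaling feature trades image detail for speed: a smaller image means less data to upload and process. As a rough illustration of the idea, and not the integration's actual code, the sketch below computes the output size for a given target width while keeping the aspect ratio. The function name `downscaled_size` is hypothetical, and leaving smaller images untouched (no upscaling) is an assumption made here.

```python
from __future__ import annotations


def downscaled_size(width: int, height: int, target_width: int) -> tuple[int, int]:
    """Return (width, height) scaled down to target_width, keeping the aspect ratio.

    Hypothetical helper for illustration only. Images that are already
    at or below target_width are returned unchanged (assumption: no upscaling).
    """
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    if width <= target_width:
        return width, height
    # Multiply before dividing so exact ratios stay exact in floating point.
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


if __name__ == "__main__":
    # A 4K camera frame reduced to a width of 1280 pixels.
    print(downscaled_size(3840, 2160, 1280))
```

A lower target width gives faster responses at the cost of fine detail, so small or distant objects may be harder for the model to recognize.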

Resources

Check the docs for detailed instructions on how to set up LLM Vision and each of the supported providers, get inspiration from examples or join the discussion on the Home Assistant Community.

Installation

Open this repository in the Home Assistant Community Store (HACS) and install the integration. Then:

  1. Search for LLM Vision in Home Assistant under Settings > Devices & services
  2. Select your provider
  3. Follow the instructions to add your AI providers.

Detailed instructions on how to set up LLM Vision and each of the supported providers are available here: https://llm-vision.gitbook.io/getting-started/

Debugging

To enable debugging, add the following to your configuration.yaml:

logger:
  logs:
    custom_components.llmvision: debug

Roadmap

[!NOTE] These are planned features and ideas. They are subject to change and may not be implemented in the order listed or at all.

  1. New Provider: NVIDIA ChatRTX
  2. HACS: Include in HACS default
  3. [x] Animation Support: Support for animated GIFs
  4. [x] New Provider: Custom (OpenAI API compatible) Providers
  5. [x] Feature: HTTPS support for LocalAI and Ollama
  6. [x] Feature: Support for video files
  7. [x] Feature: Analyze Frigate recordings using Frigate's event_id

How to report a bug or request a feature

[!IMPORTANT] Bugs: If you encounter any bugs and have followed the instructions carefully, feel free to file a bug report.
Feature Requests: If you have an idea for a feature, create a feature request.


Support

You can support this project by starring this GitHub repository. If you want, you can also buy me a coffee.
