edgen

edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.

Stars: 279

Visit
 screenshot

Edgen is a local GenAI API server that serves as a drop-in replacement for OpenAI's API. It provides multi-endpoint support for chat completions and speech-to-text, is model agnostic, offers optimized inference, and features model caching. Built in Rust, Edgen is natively compiled for Windows, MacOS, and Linux, eliminating the need for Docker. It allows users to utilize GenAI locally on their devices for free and with data privacy. With features like session caching, GPU support, and support for various endpoints, Edgen offers a scalable, reliable, and cost-effective solution for running GenAI applications locally.

README:

GitHub

A Local GenAI API Server: A drop-in replacement for OpenAI's API for Local GenAI

| Documentation | Blog | Discord | Roadmap |

EdgenChat, a local chat app powered by ⚡Edgen

EdgenChat, a local chat app powered by ⚡Edgen

  • [x] OpenAI Compliant API: ⚡Edgen implements an OpenAI compatible API, making it a drop-in replacement.
  • [x] Multi-Endpoint Support: ⚡Edgen exposes multiple AI endpoints such as chat completions (LLMs) and speech-to-text (Whisper) for audio transcriptions.
  • [x] Model Agnostic: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
  • [x] Optimized Inference: You don't need to take a PhD in AI optimization. ⚡Edgen abstracts the complexity of optimizing inference for different hardware, platforms and models.
  • [x] Modular: ⚡Edgen is model and runtime agnostic. New models can be added easily and ⚡Edgen can select the best runtime for the user's hardware: you don't need to keep up about the latest models and ML runtimes - ⚡Edgen will do that for you.
  • [x] Model Caching: ⚡Edgen caches foundational models locally, so 1 model can power hundreds of different apps - users don't need to download the same model multiple times.
  • [x] Native: ⚡Edgen is built in 🦀Rust and is natively compiled to all popular platforms: Windows, MacOS and Linux. No docker required.
  • [ ] Graphical Interface: A graphical user interface to help users efficiently manage their models, endpoints and permissions.

⚡Edgen lets you use GenAI in your app, completely locally on your user's devices, for free and with data-privacy. It's a drop-in replacement for OpenAI (it uses the a compatible API), supports various functions like text generation, speech-to-text and works on Windows, Linux, and MacOS.

Features

  • [x] Session Caching: ⚡Edgen maintains top performance with big contexts (big chat histories), by caching sessions. Sessions are auto-detected in function of the chat history.
  • [x] GPU support: CUDA, Vulkan. Metal

Endpoints

Supported Models

Check in the documentation

Supported platforms

  • [x] Windows
  • [x] Linux
  • [x] MacOS

🔥 Hot Topics

Why local GenAI?

  • Data Private: On-device inference means users' data never leave their devices.

  • Scalable: More and more users? No need to increment cloud computing infrastructure. Just let your users use their own hardware.

  • Reliable: No internet, no downtime, no rate limits, no API keys.

  • Free: It runs locally on hardware the user already owns.

Quickstart

  1. Download and start ⚡Edgen
  2. Chat with ⚡EdgenChat

Ready to start your own GenAI application? Checkout our guides!

⚡Edgen usage:

Usage: edgen [<command>] [<args>]

Toplevel CLI commands and options. Subcommands are optional. If no command is provided "serve" will be invoked with default options.

Options:
  --help            display usage information

Commands:
  serve             Starts the edgen server. This is the default command when no
                    command is provided.
  config            Configuration-related subcommands.
  version           Prints the edgen version to stdout.
  oasgen            Generates the Edgen OpenAPI specification.

edgen serve usage:

Usage: edgen serve [-b <uri...>] [-g]

Starts the edgen server. This is the default command when no command is provided.

Options:
  -b, --uri         if present, one or more URIs/hosts to bind the server to.
                    `unix://` (on Linux), `http://`, and `ws://` are supported.
                    For use in scripts, it is recommended to explicitly add this
                    option to make your scripts future-proof.
  -g, --nogui       if present, edgen will not start the GUI; the default
                    behavior is to start the GUI.
  --help            display usage information

GPU Support

⚡Edgen also supports compilation and execution on a GPU, when building from source, through Vulkan, CUDA and Metal. The following cargo features enable the GPU:

  • llama_vulkan - execute LLM models using Vulkan. Requires a Vulkan SDK to be installed.
  • llama_cuda - execute LLM models using CUDA. Requires a CUDA Toolkit to be installed.
  • llama_metal - execute LLM models using Metal.
  • whisper_cuda - execute Whisper models using CUDA. Requires a CUDA Toolkit to be installed.

Note that, at the moment, llama_vulkan, llama_cuda and llama_metal cannot be enabled at the same time.

Example usage (building from source, you need to first install the prerequisites):

cargo run --features llama_vulkan --release -- serve

Architecture Overview

⚡Edgen architecture overview

⚡Edgen architecture overview

Contribute

If you don't know where to start, check Edgen's roadmap! Before you start working on something, see if there's an existing issue/pull-request. Pop into Discord to check with the team or see if someone's already tackling it.

Communication Channels

Special Thanks

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for edgen

Similar Open Source Tools

For similar tasks

For similar jobs