lollms_legacy

Lord of LLMS

Stars: 294

Visit

Lord of Large Language Models (LoLLMs) Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications. The tool supports multiple personalities for generating text with different styles and tones, real-time text generation with WebSocket-based communication, RESTful API for listing personalities and adding new personalities, easy integration with various applications and frameworks, sending files to personalities, running on multiple nodes to provide a generation service to many outputs at once, and keeping data local even in the remote version.

README:

Lord of Large Language Models (LoLLMs)

Features

Fully integrated library with access to bindings, personalities and helper tools.
Generate text using large language models.
Supports multiple personalities for generating text with different styles and tones.
Real-time text generation with WebSocket-based communication.
RESTful API for listing personalities and adding new personalities.
Easy integration with various applications and frameworks.
Possibility to send files to personalities
Possibility to run on multiple nodes and provide a generation service to many outputs at once.
Data stays local even in the remote version. Only generations are sent to the host node. The logs, data and discussion history are kept in your local disucssion folder.

Installation

You can install LoLLMs using pip, the Python package manager. Open your terminal or command prompt and run the following command:

pip install --upgrade lollms

Or if you want to get the latest version from the git:

pip install --upgrade git+https://github.com/ParisNeo/lollms_legacy.git

GPU support

If you want to use cuda. Either install it directly or use conda to install everything:

conda create --name lollms python=3.10

Activate the environment

conda activate lollms

Install cudatoolkit

conda install -c anaconda cudatoolkit

Install lollms

pip install --upgrade lollms

Now you are ready.

To simply configure your environment run the settings app:

lollms-settings

The tool is intuitive and will guide you through configuration process.

The first time you will be prompted to select a binding.

Once the binding is selected, you have to install at least a model. You have two options:

1- install from internet. Just give the link to a model on hugging face. For example. if you select the default llamacpp python bindings (7), you can install this model:

https://huggingface.co/TheBloke/airoboros-7b-gpt4-GGML/resolve/main/airoboros-7b-gpt4.ggmlv3.q4_1.bin

2- install from local drive. Just give the path to a model on your pc. The model will not be copied. We only create a reference to the model. This is useful if you use multiple clients so that you can mutualize your models with other tools.

Now you are ready to use the server.

Library example

Here is the smallest possible example that allows you to use the full potential of the tool with nearly no code

from lollms.console import Conversation 

cv = Conversation(None)
cv.start_conversation()

Now you can reimplement the start_conversation method to do the things you want:

from lollms.console import Conversation 

class MyConversation(Conversation):
  def __init__(self, cfg=None):
    super().__init__(cfg, show_welcome_message=False)

  def start_conversation(self):
    prompt = "Once apon a time"
    def callback(text, type=None):
        print(text, end="", flush=True)
        return True
    print(prompt, end="", flush=True)
    output = self.safe_generate(prompt, callback=callback)

if __name__ == '__main__':
  cv = MyConversation()
  cv.start_conversation()

Or if you want here is a conversation tool written in few lines

from lollms.console import Conversation 

class MyConversation(Conversation):
  def __init__(self, cfg=None):
    super().__init__(cfg, show_welcome_message=False)

  def start_conversation(self):
    full_discussion=""
    while True:
      prompt = input("You: ")
      if prompt=="exit":
        return
      if prompt=="menu":
        self.menu.main_menu()
      full_discussion += self.personality.user_message_prefix+prompt+self.personality.link_text
      full_discussion += self.personality.ai_message_prefix
      def callback(text, type=None):
          print(text, end="", flush=True)
          return True
      print(self.personality.name+": ",end="",flush=True)
      output = self.safe_generate(full_discussion, callback=callback)
      full_discussion += output.strip()+self.personality.link_text
      print()

if __name__ == '__main__':
  cv = MyConversation()
  cv.start_conversation()

Here we use the safe_generate method that does all the cropping for you ,so you can chat forever and will never run out of context.

Socket IO Server Usage

Once installed, you can start the LoLLMs Server using the lollms-server command followed by the desired parameters.

lollms-server --host <host> --port <port> --config <config_file> --bindings_path <bindings_path> --personalities_path <personalities_path> --models_path <models_path> --binding_name <binding_name> --model_name <model_name> --personality_full_name <personality_full_name>

Parameters

--host: The hostname or IP address to bind the server (default: localhost).
--port: The port number to run the server (default: 9600).
--config: Path to the configuration file (default: None).
--bindings_path: The path to the Bindings folder (default: "./bindings_zoo").
--personalities_path: The path to the personalities folder (default: "./personalities_zoo").
--models_path: The path to the models folder (default: "./models").
--binding_name: The default binding to be used (default: "llama_cpp_official").
--model_name: The default model name (default: "Manticore-13B.ggmlv3.q4_0.bin").
--personality_full_name: The full name of the default personality (default: "personality").

Examples

Start the server with default settings:

lollms-server

Start the server on a specific host and port:

lollms-server --host 0.0.0.0 --port 5000

API Endpoints

WebSocket Events

connect: Triggered when a client connects to the server.
disconnect: Triggered when a client disconnects from the server.
list_personalities: List all available personalities.
add_personality: Add a new personality to the server.
generate_text: Generate text based on the provided prompt and selected personality.

RESTful API

GET /personalities: List all available personalities.
POST /personalities: Add a new personality to the server.

Sure! Here are examples of how to communicate with the LoLLMs Server using JavaScript and Python.

JavaScript Example

// Establish a WebSocket connection with the server
const socket = io.connect('http://localhost:9600');

// Event: When connected to the server
socket.on('connect', () => {
  console.log('Connected to the server');

  // Request the list of available personalities
  socket.emit('list_personalities');
});

// Event: Receive the list of personalities from the server
socket.on('personalities_list', (data) => {
  const personalities = data.personalities;
  console.log('Available Personalities:', personalities);

  // Select a personality and send a text generation request
  const selectedPersonality = personalities[0];
  const prompt = 'Once upon a time...';
  socket.emit('generate_text', { personality: selectedPersonality, prompt: prompt });
});

// Event: Receive the generated text from the server
socket.on('text_generated', (data) => {
  const generatedText = data.text;
  console.log('Generated Text:', generatedText);

  // Do something with the generated text
});

// Event: When disconnected from the server
socket.on('disconnect', () => {
  console.log('Disconnected from the server');
});

Python Example

import socketio

# Create a SocketIO client
sio = socketio.Client()

# Event: When connected to the server
@sio.on('connect')
def on_connect():
    print('Connected to the server')

    # Request the list of available personalities
    sio.emit('list_personalities')

# Event: Receive the list of personalities from the server
@sio.on('personalities_list')
def on_personalities_list(data):
    personalities = data['personalities']
    print('Available Personalities:', personalities)

    # Select a personality and send a text generation request
    selected_personality = personalities[0]
    prompt = 'Once upon a time...'
    sio.emit('generate_text', {'personality': selected_personality, 'prompt': prompt})

# Event: Receive the generated text from the server
@sio.on('text_generated')
def on_text_generated(data):
    generated_text = data['text']
    print('Generated Text:', generated_text)

    # Do something with the generated text

# Event: When disconnected from the server
@sio.on('disconnect')
def on_disconnect():
    print('Disconnected from the server')

# Connect to the server
sio.connect('http://localhost:9600')

# Keep the client running
sio.wait()

Make sure to have the necessary dependencies installed for the JavaScript and Python examples. For JavaScript, you need the socket.io-client package, and for Python, you need the python-socketio package.

Contributing

Contributions to the LoLLMs Server project are welcome and appreciated. If you would like to contribute, please follow the guidelines outlined in the CONTRIBUTING.md file.

License

LoLLMs Server is licensed under the Apache 2.0 License. See the LICENSE file for more information.

Repository

The source code for LoLLMs Server can be found on GitHub

For Tasks:

Click tags to check more tools for each tasks

generate text list personalities add new personality communicate with server integrate text generation

For Jobs:

software developer data scientist ai engineer web developer content creator

Alternative AI tools for lollms_legacy

Similar Open Source Tools

lollms_legacy

github

: 294

lollms

LoLLMs Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications.

github

: 287

hugging-chat-api

Unofficial HuggingChat Python API for creating chatbots, supporting features like image generation, web search, memorizing context, and changing LLMs. Users can log in, chat with the ChatBot, perform web searches, create new conversations, manage conversations, switch models, get conversation info, use assistants, and delete conversations. The API also includes a CLI mode with various commands for interacting with the tool. Users are advised not to use the application for high-stakes decisions or advice and to avoid high-frequency requests to preserve server resources.

github

: 780

IntelliNode

IntelliNode is a javascript module that integrates cutting-edge AI models like ChatGPT, LLaMA, WaveNet, Gemini, and Stable diffusion into projects. It offers functions for generating text, speech, and images, as well as semantic search, multi-model evaluation, and chatbot capabilities. The module provides a wrapper layer for low-level model access, a controller layer for unified input handling, and a function layer for abstract functionality tailored to various use cases.

github

: 201

suno-api

Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.

github

: 1.7k

langgraph4j

Langgraph4j is a Java library for language processing tasks such as text classification, sentiment analysis, and named entity recognition. It provides a set of tools and algorithms for analyzing text data and extracting useful information. The library is designed to be efficient and easy to use, making it suitable for both research and production applications.

github

: 916

generative-ai

The 'Generative AI' repository provides a C# library for interacting with Google's Generative AI models, specifically the Gemini models. It allows users to access and integrate the Gemini API into .NET applications, supporting functionalities such as listing available models, generating content, creating tuned models, working with large files, starting chat sessions, and more. The repository also includes helper classes and enums for Gemini API aspects. Authentication methods include API key, OAuth, and various authentication modes for Google AI and Vertex AI. The package offers features for both Google AI Studio and Google Cloud Vertex AI, with detailed instructions on installation, usage, and troubleshooting.

github

: 145

langserve

LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

github

: 1.9k

aimeos-laravel

Aimeos Laravel is a professional, full-featured, and ultra-fast Laravel ecommerce package that can be easily integrated into existing Laravel applications. It offers a wide range of features including multi-vendor, multi-channel, and multi-warehouse support, fast performance, support for various product types, subscriptions with recurring payments, multiple payment gateways, full RTL support, flexible pricing options, admin backend, REST and GraphQL APIs, modular structure, SEO optimization, multi-language support, AI-based text translation, mobile optimization, and high-quality source code. The package is highly configurable and extensible, making it suitable for e-commerce SaaS solutions, marketplaces, and online shops with millions of vendors.

github

: 7.6k

jina

Jina is a tool that allows users to build multimodal AI services and pipelines using cloud-native technologies. It provides a Pythonic experience for serving ML models and transitioning from local deployment to advanced orchestration frameworks like Docker-Compose, Kubernetes, or Jina AI Cloud. Users can build and serve models for any data type and deep learning framework, design high-performance services with easy scaling, serve LLM models while streaming their output, integrate with Docker containers via Executor Hub, and host on CPU/GPU using Jina AI Cloud. Jina also offers advanced orchestration and scaling capabilities, a smooth transition to the cloud, and easy scalability and concurrency features for applications. Users can deploy to their own cloud or system with Kubernetes and Docker Compose integration, and even deploy to JCloud for autoscaling and monitoring.

github

: 21.0k

Gemini-API

Gemini-API is a reverse-engineered asynchronous Python wrapper for Google Gemini web app (formerly Bard). It provides features like persistent cookies, ImageFx support, extension support, classified outputs, official flavor, and asynchronous operation. The tool allows users to generate contents from text or images, have conversations across multiple turns, retrieve images in response, generate images with ImageFx, save images to local files, use Gemini extensions, check and switch reply candidates, and control log level.

github

: 160

hydraai

Generate React components on-the-fly at runtime using AI. Register your components, and let Hydra choose when to show them in your App. Hydra development is still early, and patterns for different types of components and apps are still being developed. Join the discord to chat with the developers. Expects to be used in a NextJS project. Components that have function props do not work.

github

: 90

graphiti

Graphiti is a framework for building and querying temporally-aware knowledge graphs, tailored for AI agents in dynamic environments. It continuously integrates user interactions, structured and unstructured data, and external information into a coherent, queryable graph. The framework supports incremental data updates, efficient retrieval, and precise historical queries without complete graph recomputation, making it suitable for developing interactive, context-aware AI applications.

github

: 18.5k

exospherehost

Exosphere is an open source infrastructure designed to run AI agents at scale for large data and long running flows. It allows developers to define plug and playable nodes that can be run on a reliable backbone in the form of a workflow, with features like dynamic state creation at runtime, infinite parallel agents, persistent state management, and failure handling. This enables the deployment of production agents that can scale beautifully to build robust autonomous AI workflows.

github

: 65

aiomqtt

aiomqtt is an idiomatic asyncio MQTT client that allows users to interact with MQTT brokers using asyncio in Python. It eliminates the need for callbacks and return codes, providing a more streamlined experience. The tool supports MQTT versions 5.0, 3.1.1, and 3.1, and offers graceful disconnection handling. It is fully type-hinted, making it easier to work with. Users can publish and subscribe to MQTT topics with ease, making it a versatile tool for MQTT communication in Python.

github

: 462

raglite

RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite. It offers configurable options for choosing LLM providers, database types, and rerankers. The toolkit is fast and permissive, utilizing lightweight dependencies and hardware acceleration. RAGLite provides features like PDF to Markdown conversion, multi-vector chunk embedding, optimal semantic chunking, hybrid search capabilities, adaptive retrieval, and improved output quality. It is extensible with a built-in Model Context Protocol server, customizable ChatGPT-like frontend, document conversion to Markdown, and evaluation tools. Users can configure RAGLite for various tasks like configuring, inserting documents, running RAG pipelines, computing query adapters, evaluating performance, running MCP servers, and serving frontends.

github

: 1.1k

For similar tasks

lollms_legacy

github

: 294

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

LocalAI

LocalAI is a free and open-source OpenAI alternative that acts as a drop-in replacement REST API compatible with OpenAI (Elevenlabs, Anthropic, etc.) API specifications for local AI inferencing. It allows users to run LLMs, generate images, audio, and more locally or on-premises with consumer-grade hardware, supporting multiple model families and not requiring a GPU. LocalAI offers features such as text generation with GPTs, text-to-audio, audio-to-text transcription, image generation with stable diffusion, OpenAI functions, embeddings generation for vector databases, constrained grammars, downloading models directly from Huggingface, and a Vision API. It provides a detailed step-by-step introduction in its Getting Started guide and supports community integrations such as custom containers, WebUIs, model galleries, and various bots for Discord, Slack, and Telegram. LocalAI also offers resources like an LLM fine-tuning guide, instructions for local building and Kubernetes installation, projects integrating LocalAI, and a how-tos section curated by the community. It encourages users to cite the repository when utilizing it in downstream projects and acknowledges the contributions of various software from the community.

github

: 35.5k

AiTreasureBox

AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.

github

: 368

glide

Glide is a cloud-native LLM gateway that provides a unified REST API for accessing various large language models (LLMs) from different providers. It handles LLMOps tasks such as model failover, caching, key management, and more, making it easy to integrate LLMs into applications. Glide supports popular LLM providers like OpenAI, Anthropic, Azure OpenAI, AWS Bedrock (Titan), Cohere, Google Gemini, OctoML, and Ollama. It offers high availability, performance, and observability, and provides SDKs for Python and NodeJS to simplify integration.

github

: 110

jupyter-ai

Jupyter AI connects generative AI with Jupyter notebooks. It provides a user-friendly and powerful way to explore generative AI models in notebooks and improve your productivity in JupyterLab and the Jupyter Notebook. Specifically, Jupyter AI offers: * An `%%ai` magic that turns the Jupyter notebook into a reproducible generative AI playground. This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, Kaggle, VSCode, etc.). * A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant. * Support for a wide range of generative model providers, including AI21, Anthropic, AWS, Cohere, Gemini, Hugging Face, NVIDIA, and OpenAI. * Local model support through GPT4All, enabling use of generative AI models on consumer grade machines with ease and privacy.

github

: 3.5k

langchain_dart

LangChain.dart is a Dart port of the popular LangChain Python framework created by Harrison Chase. LangChain provides a set of ready-to-use components for working with language models and a standard interface for chaining them together to formulate more advanced use cases (e.g. chatbots, Q&A with RAG, agents, summarization, extraction, etc.). The components can be grouped into a few core modules: * **Model I/O:** LangChain offers a unified API for interacting with various LLM providers (e.g. OpenAI, Google, Mistral, Ollama, etc.), allowing developers to switch between them with ease. Additionally, it provides tools for managing model inputs (prompt templates and example selectors) and parsing the resulting model outputs (output parsers). * **Retrieval:** assists in loading user data (via document loaders), transforming it (with text splitters), extracting its meaning (using embedding models), storing (in vector stores) and retrieving it (through retrievers) so that it can be used to ground the model's responses (i.e. Retrieval-Augmented Generation or RAG). * **Agents:** "bots" that leverage LLMs to make informed decisions about which available tools (such as web search, calculators, database lookup, etc.) to use to accomplish the designated task. The different components can be composed together using the LangChain Expression Language (LCEL).

github

: 497

infinity

Infinity is an AI-native database designed for LLM applications, providing incredibly fast full-text and vector search capabilities. It supports a wide range of data types, including vectors, full-text, and structured data, and offers a fused search feature that combines multiple embeddings and full text. Infinity is easy to use, with an intuitive Python API and a single-binary architecture that simplifies deployment. It achieves high performance, with 0.1 milliseconds query latency on million-scale vector datasets and up to 15K QPS.

github

: 4.1k

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

github

: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

github

: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

github

: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

github

: 668

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

github

: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

github

: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

github

: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

github

: 2.2k