GhostOS

A framework offers an OS simulator within a Python Code Interface for AI Agents

Stars: 58

Visit

GhostOS is an AI Agent framework designed to replace JSON Schema with a Turing-complete code interaction interface (Moss Protocol). It aims to create intelligent entities capable of continuous learning and growth through code generation and project management. The framework supports various capabilities such as turning Python files into web agents, real-time voice conversation, body movements control, and emotion expression. GhostOS is still in early experimental development and focuses on out-of-the-box capabilities for AI agents.

README:

GhostOS

The AI Ghosts wander in the Shells.

(This document is translated from zh-cn to english by Moonshot)

Example

Using Python code SpheroBoltGPT, an intelligent robot with a SpheroBolt as its body is defined. If you have a SpheroBolt, running ghostos web ghostos.demo.sphero.bolt_gpt can start this robot.

The demo initially implements the following features:

Real-time voice conversation.
Control of body movements and drawing graphics on an 8x8 LED matrix.
Learning skills that include actions and animations through natural language dialogue.
Expressing emotions through movements during conversation.

Introduce

GhostOS is an AI Agent framework designed to replace JSON Schema with a Turing-complete code interaction interface (Moss Protocol), becoming the core method for interaction between LLM and Agent system capabilities. For more details: MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents

The expected objects called through code include tools, personality, agent swarm, workflows, thinking, planning, knowledge, and memory. This allows a Meta-Agent to become an intelligent entity capable of continuous learning and growth through code generation and project management.

And such an intelligent agent implemented with a code repository can also be shared and installed in the form of a repository.

GhostOS Still in the early experimental developing, the current version mainly implements out-of-the-box capabilities, including:

[x] Turn a python file into a web agent
[x] Agent web UI built by Streamlit Web
[x] Support llms like OpenAI, Moonshot
[x] Support OpenAI vision
[x] Support OpenAI Realtime Beta

Quick Start

GhostOS remains a beta AI project, strongly recommending installation in containers such as Docker rather than running locally.

Install GhostOS package:

pip install ghostos

Initialize workspace (directory app as default), The runtime files of the current version will be stored in the directory.

ghostos init

Configure the model. Default to use OpenAI gpt-4o, requiring the environment variable OPENAI_API_KEY.

export OPENAI_API_KEY="your openai api key"
# Optionals: 
export OPENAI_PROXY="sock5://localhost:[your-port]" # setup openai proxy
export DEEPSEEK_API_KEY="your deepseek api key"
epoxrt MOONSHOT_API_KEY="your moonshot api key"

Or you can use configuration ui by streamlit:

ghostos config

Then test the default agent:

# run an agent with python filename or modulename
ghostos web ghostos.demo.agents.jojo

Or turn a local Python file into an Agent, that can be instructed to call functions or methods within the file through natural language conversations.

ghostos web [my_path_file_path]

some demo agents

ghostos web ghostos.demo.agents.jojo
ghostos web ghostos.demo.test_agents.moonshot         # moonshot-v1-32k model
ghostos web ghostos.demo.test_agents.deepseek_chat    # deepseek chat model
ghostos web ghostos.demo.test_agents.openai_o1_mini   # openai o1 mini model

You can create a local Python file and define your own Agents. For more details

Chatbot: simplest chatbot
MossAgent: an agent that can interact with the python module

Install Realtime

GhostOS support OpenAI Realtime, using pyaudio to handle realtime audio i/o. Need to install the dependencies first:

pip install 'ghostos[realtime]'

You may face some difficulties while install pyaudio on your device, I'm sure gpt-4o, google or stackoverflow will offer you solutions.

Use In Python

from ghostos.bootstrap import make_app_container, get_ghostos
from ghostos.ghosts.chatbot import Chatbot

# create your own root ioc container.
# register or replace the dependencies by IoC service providers.
container = make_app_container(...)

# fetch the GhostOS instance.
ghostos = get_ghostos(container)

# Create a shell instance, which managing sessions that keep AI Ghost inside it.
# and initialize the shell level dependency providers.
shell = ghostos.create_shell("your robot shell")
# Shell can handle parallel ghosts running, and communicate them through an EventBus.
# So the Multi-Agent swarm in GhostOS is asynchronous.
shell.background_run()  # Optional

# need an instance implements `ghostos.abcd.Ghost` interface.
my_chatbot: Chatbot = ...

# use Shell to create a synchronous conversation channel with the Ghost.
conversation = shell.sync(my_chatbot)

# use the conversation channel to talk
event, receiver = conversation.talk("hello?")
with receiver:
    for chunk in receiver.recv():
        print(chunk.content)

Developing Features

[ ] Out-of-the-box Agent capability libraries.
[ ] Variable type messaging and Streamlit rendering.
[ ] Asynchronous Multi-Agent.
[ ] Long-term task planning and execution.
[ ] Atomic thinking capabilities.
[ ] Automated execution and management of tree-based projects.
[ ] Configurable components of the framework.
[ ] Experiments with toy-level embodied intelligence.

GhostOS, as a personal project, currently lacks the energy to focus on improving documentation, storage modules, stability, or security issues.

The project's iteration will be centered on validating three directions for a long time: code-driven embodied intelligence, code-based thinking capabilities, and code-based learning. I will also aim to optimize out-of-the-box agent abilities.

So What is GhostOS purpose?

The GhostOS project is developed by the author for exploring AI applications. The basic idea is as follows:

AI Agent technology has two parallel evolutionary paths: one is the perfection of the model's own capabilities, and the other is the evolution of the Agent engineering framework. The productivity level of the Agent framework determines the feasibility of AI models in practical application scenarios.

GhostOS reflects the capabilities of an Agent from code into prompts, providing them to AI models, and the code generated by the models runs directly in the environment. Expecting the large language model do everything through a Turing-complete programming language interface, including computation, tool invocation, body control, personality switching, thinking paradigms, state scheduling, Multi-Agent, memory and recall, and other actions.

This will have stronger interaction capabilities and lower overhead than methods based on JSON schema. The conversation data generated in this process can be used for post-training or reinforcement learning of the model, thereby continuously optimizing the code generation.

The AI Agent itself is also defined by code. Therefore, a Meta-Agent can develop other Agents just like a normal programming task.

Ideally, the Meta-Agent can write code, write its own tools, define memories and chain of thoughts with data structures, and develop other Agents for itself.

Furthermore, most complex tasks with rigorous steps can be described using tree or graph data structures. Constructing a nested graph or tree using methods like JSON is very difficult, while using programming languages is the most efficient.

models can consolidate the results learned from conversations into nodes in the code, and then plan them into trees or graphs, thereby executing sufficiently complex tasks.

In this way, an AI Agent can store the knowledge and capabilities learned from natural language in the form of files and code, thereby evolving itself. This is a path of evolution beyond model iteration.

Based on this idea, GhostOS aims to turn an Agent swarm into a project constructed through code. The Agents continuously precipitate new knowledge and capabilities in the form of code, enriching the project. The Agent project can be copied, shared, or deployed in the form of repositories,

In this new form of productivity, interacting purely through code is the most critical step.

The author's ultimate goal is not GhostOS itself, but to verify and promote the code interaction design and applications. The hope is that one day, agents, paradigms, bodies, and tools for AI Agents can all be designed based on the same programming language protocols, achieving cross-project universality.

For Tasks:

Click tags to check more tools for each tasks

create web agents control body movements express emotions handle real-time conversation implement learning skills

For Jobs:

ai developer machine learning engineer software engineer data scientist research scientist

Alternative AI tools for GhostOS

Similar Open Source Tools

GhostOS

github

: 58

2p-kt

2P-Kt is a Kotlin-based and multi-platform reboot of tuProlog (2P), a multi-paradigm logic programming framework written in Java. It consists of an open ecosystem for Symbolic Artificial Intelligence (AI) with modules supporting logic terms, unification, indexing, resolution of logic queries, probabilistic logic programming, binary decision diagrams, OR-concurrent resolution, DSL for logic programming, parsing modules, serialisation modules, command-line interface, and graphical user interface. The tool is designed to support knowledge representation and automatic reasoning through logic programming in an extensible and flexible way, encouraging extensions towards other symbolic AI systems than Prolog. It is a pure, multi-platform Kotlin project supporting JVM, JS, Android, and Native platforms, with a lightweight library leveraging the Kotlin common library.

github

: 86

gptauthor

GPT Author is a command-line tool designed to help users write long form, multi-chapter stories by providing a story prompt and generating a synopsis and subsequent chapters using ChatGPT. Users can review and make changes to the generated content before finalizing the story output in Markdown and HTML formats. The tool aims to unleash storytelling genius by combining human input with AI-generated content, offering a seamless writing experience for creating engaging narratives.

github

: 73

generative-ai-sagemaker-cdk-demo

This repository showcases how to deploy generative AI models from Amazon SageMaker JumpStart using the AWS CDK. Generative AI is a type of AI that can create new content and ideas, such as conversations, stories, images, videos, and music. The repository provides a detailed guide on deploying image and text generative AI models, utilizing pre-trained models from SageMaker JumpStart. The web application is built on Streamlit and hosted on Amazon ECS with Fargate. It interacts with the SageMaker model endpoints through Lambda functions and Amazon API Gateway. The repository also includes instructions on setting up the AWS CDK application, deploying the stacks, using the models, and viewing the deployed resources on the AWS Management Console.

github

: 65

aisuite

Aisuite is a simple, unified interface to multiple Generative AI providers. It allows developers to easily interact with various Language Model (LLM) providers like OpenAI, Anthropic, Azure, Google, AWS, and more through a standardized interface. The library focuses on chat completions and provides a thin wrapper around python client libraries, enabling creators to test responses from different LLM providers without changing their code. Aisuite maximizes stability by using HTTP endpoints or SDKs for making calls to the providers. Users can install the base package or specific provider packages, set up API keys, and utilize the library to generate chat completion responses from different models.

github

: 9.5k

modus

Modus is an open-source, serverless framework for building APIs powered by WebAssembly. It simplifies integrating AI models, data, and business logic with sandboxed execution. Modus extracts metadata, compiles code with optimizations, caches compiled modules, prepares invocation plans, generates API schema, and activates endpoints. Users query the endpoint, and Modus loads compiled code into a sandboxed environment, runs code securely, queries data and AI models, and responds via API. It provides a production-ready scalable endpoint for AI-enabled apps, optimized for sub-second response times. Modus supports programming languages like AssemblyScript and Go, and can be hosted on Hypermode or any platform. Developed by Hypermode as an open-source project, Modus welcomes external contributions.

github

: 352

curate-gpt

CurateGPT is a prototype web application and framework for performing general purpose AI-guided curation and curation-related operations over collections of objects. It allows users to load JSON, YAML, or CSV data, build vector database indexes for ontologies, and interact with various data sources like GitHub, Google Drives, Google Sheets, and more. The tool supports ontology curation, knowledge base querying, term autocompletion, and all-by-all comparisons for objects in a collection.

github

: 56

LeanAide

LeanAide is a work in progress AI tool designed to assist with development using the Lean Theorem Prover. It currently offers a tool that translates natural language statements to Lean types, including theorem statements. The tool is based on GPT 3.5-turbo/GPT 4 and requires an OpenAI key for usage. Users can include LeanAide as a dependency in their projects to access the translation functionality.

github

: 69

ScreenAgent

ScreenAgent is a project focused on creating an environment for Visual Language Model agents (VLM Agent) to interact with real computer screens. The project includes designing an automatic control process for agents to interact with the environment and complete multi-step tasks. It also involves building the ScreenAgent dataset, which collects screenshots and action sequences for various daily computer tasks. The project provides a controller client code, configuration files, and model training code to enable users to control a desktop with a large model.

github

: 175

honcho

Honcho is a platform for creating personalized AI agents and LLM powered applications for end users. The repository is a monorepo containing the server/API for managing database interactions and storing application state, along with a Python SDK. It utilizes FastAPI for user context management and Poetry for dependency management. The API can be run using Docker or manually by setting environment variables. The client SDK can be installed using pip or Poetry. The project is open source and welcomes contributions, following a fork and PR workflow. Honcho is licensed under the AGPL-3.0 License.

github

: 168

curategpt

CurateGPT is a prototype web application and framework designed for general purpose AI-guided curation and curation-related operations over collections of objects. It provides functionalities for loading example data, building indexes, interacting with knowledge bases, and performing tasks such as chatting with a knowledge base, querying Pubmed, interacting with a GitHub issue tracker, term autocompletion, and all-by-all comparisons. The tool is built to work best with the OpenAI gpt-4 model and OpenAI ada-text-embedding-002 for embedding, but also supports alternative models through a plugin architecture.

github

: 81

BentoDiffusion

BentoDiffusion is a BentoML example project that demonstrates how to serve and deploy diffusion models in the Stable Diffusion (SD) family. These models are specialized in generating and manipulating images based on text prompts. The project provides a guide on using SDXL Turbo as an example, along with instructions on prerequisites, installing dependencies, running the BentoML service, and deploying to BentoCloud. Users can interact with the deployed service using Swagger UI or other methods. Additionally, the project offers the option to choose from various diffusion models available in the repository for deployment.

github

: 325

admet_ai

ADMET-AI is a platform for ADMET prediction using Chemprop-RDKit models trained on ADMET datasets from the Therapeutics Data Commons. It offers command line, Python API, and web server interfaces for making ADMET predictions on new molecules. The platform can be easily installed using pip and supports GPU acceleration. It also provides options for processing TDC data, plotting results, and hosting a web server. ADMET-AI is a machine learning platform for evaluating large-scale chemical libraries.

github

: 56

llamabot

LlamaBot is a Pythonic bot interface to Large Language Models (LLMs), providing an easy way to experiment with LLMs in Jupyter notebooks and build Python apps utilizing LLMs. It supports all models available in LiteLLM. Users can access LLMs either through local models with Ollama or by using API providers like OpenAI and Mistral. LlamaBot offers different bot interfaces like SimpleBot, ChatBot, QueryBot, and ImageBot for various tasks such as rephrasing text, maintaining chat history, querying documents, and generating images. The tool also includes CLI demos showcasing its capabilities and supports contributions for new features and bug reports from the community.

github

: 132

hugescm

HugeSCM is a cloud-based version control system designed to address R&D repository size issues. It effectively manages large repositories and individual large files by separating data storage and utilizing advanced algorithms and data structures. It aims for optimal performance in handling version control operations of large-scale repositories, making it suitable for single large library R&D, AI model development, and game or driver development.

github

: 97

BTGenBot

BTGenBot is a tool that generates behavior trees for robots using lightweight large language models (LLMs) with a maximum of 7 billion parameters. It fine-tunes on a specific dataset, compares multiple LLMs, and evaluates generated behavior trees using various methods. The tool demonstrates the potential of LLMs with a limited number of parameters in creating effective and efficient robot behaviors.

github

: 65

For similar tasks

GhostOS

github

: 58

For similar jobs

promptflow

**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

github

: 9.2k

deepeval

DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

github

: 5.8k

MegaDetector

MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

github

: 106

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

github

: 255

llava-docker

This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

github

: 59

carrot

The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

github

: 17.1k

TrustLLM

TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

github

: 535

AI-YinMei

AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.

github

: 529