
GhostOS
A framework that offers an OS simulator within a Python code interface for AI agents
Stars: 58

GhostOS is an AI Agent framework designed to replace JSON Schema with a Turing-complete code interaction interface (Moss Protocol). It aims to create intelligent entities capable of continuous learning and growth through code generation and project management. The framework supports capabilities such as turning Python files into web agents, real-time voice conversation, body movement control, and emotion expression. GhostOS is still in early experimental development and focuses on out-of-the-box capabilities for AI agents.
README:
The AI Ghosts wander in the Shells.
(This document is translated from zh-CN to English by Moonshot.)
The Python module SpheroBoltGPT defines an intelligent robot with a SpheroBolt as its body. If you have a SpheroBolt, you can start this robot by running:
ghostos web ghostos.demo.sphero.bolt_gpt
The demo initially implements the following features:
- Real-time voice conversation.
- Control of body movements and drawing graphics on an 8x8 LED matrix.
- Learning skills that include actions and animations through natural language dialogue.
- Expressing emotions through movements during conversation.
GhostOS is an AI Agent framework designed to replace JSON Schema with a Turing-complete code interaction interface (Moss Protocol), making code the core method of interaction between the LLM and the capabilities of the Agent system. For more details:
MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents
The objects expected to be called through code include tools, personality, agent swarms, workflows, thinking, planning, knowledge, and memory.
This allows a Meta-Agent to become an intelligent entity capable of continuous learning and growth through code generation and project management. Such an intelligent agent, implemented as a code repository, can also be shared and installed in the form of a repository.
GhostOS is still in early experimental development; the current version mainly implements out-of-the-box capabilities, including:
- [x] Turn a Python file into a web agent
- [x] Agent web UI built with Streamlit
- [x] Support LLMs such as OpenAI and Moonshot
- [x] Support OpenAI vision
- [x] Support OpenAI Realtime Beta
GhostOS remains a beta AI project; installing it in a container such as Docker, rather than running it locally, is strongly recommended.
Install the GhostOS package:
pip install ghostos
Initialize the workspace (directory app by default). The runtime files of the current version will be stored in this directory:
ghostos init
Configure the model. The default is OpenAI gpt-4o, which requires the environment variable OPENAI_API_KEY:
export OPENAI_API_KEY="your openai api key"
# Optional:
export OPENAI_PROXY="socks5://localhost:[your-port]" # set up an OpenAI proxy
export DEEPSEEK_API_KEY="your deepseek api key"
export MOONSHOT_API_KEY="your moonshot api key"
Or you can use the Streamlit configuration UI:
ghostos config
Then test the default agent:
# run an agent by Python filename or module name
ghostos web ghostos.demo.agents.jojo
Or turn a local Python file into an agent that can be instructed, through natural language conversation, to call functions or methods defined in the file (see the sketch below):
ghostos web [my_python_file_path]
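A minimal sketch of what such a file might look like, assuming plain top-level functions with type hints and docstrings are enough for the agent to discover and call; the file name and functions are hypothetical, and GhostOS may expect additional conventions not covered in this README:

# hello_tools.py -- hypothetical example of a file to serve with ghostos web
def fahrenheit_to_celsius(value: float) -> float:
    """Convert a temperature from Fahrenheit to Celsius."""
    return (value - 32.0) * 5.0 / 9.0

def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

You could then run ghostos web hello_tools.py and ask the agent in conversation to convert a temperature or count the words in a sentence.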
Some demo agents:
ghostos web ghostos.demo.agents.jojo
ghostos web ghostos.demo.test_agents.moonshot # moonshot-v1-32k model
ghostos web ghostos.demo.test_agents.deepseek_chat # deepseek chat model
ghostos web ghostos.demo.test_agents.openai_o1_mini # openai o1 mini model
You can create a local Python file and define your own agents; see the project documentation for more details.
GhostOS supports OpenAI Realtime, using pyaudio to handle real-time audio I/O. Install the dependencies first:
pip install 'ghostos[realtime]'
You may face some difficulties while installing pyaudio on your device; I'm sure gpt-4o, Google, or Stack Overflow will offer you solutions.
from ghostos.bootstrap import make_app_container, get_ghostos
from ghostos.ghosts.chatbot import Chatbot
# create your own root ioc container.
# register or replace the dependencies by IoC service providers.
container = make_app_container(...)
# fetch the GhostOS instance.
ghostos = get_ghostos(container)
# Create a shell instance, which manages the sessions that keep AI Ghosts inside it,
# and initializes the shell-level dependency providers.
shell = ghostos.create_shell("your robot shell")
# The Shell can run ghosts in parallel and lets them communicate through an EventBus,
# so the Multi-Agent swarm in GhostOS is asynchronous.
shell.background_run() # Optional
# Needs an instance that implements the `ghostos.abcd.Ghost` interface.
my_chatbot: Chatbot = ...
# use Shell to create a synchronous conversation channel with the Ghost.
conversation = shell.sync(my_chatbot)
# use the conversation channel to talk
event, receiver = conversation.talk("hello?")
with receiver:
    for chunk in receiver.recv():
        print(chunk.content)
Planned features:
- [ ] Out-of-the-box Agent capability libraries.
- [ ] Variable type messaging and Streamlit rendering.
- [ ] Asynchronous Multi-Agent.
- [ ] Long-term task planning and execution.
- [ ] Atomic thinking capabilities.
- [ ] Automated execution and management of tree-based projects.
- [ ] Configurable components of the framework.
- [ ] Experiments with toy-level embodied intelligence.
GhostOS is a personal project, and the author currently lacks the bandwidth to focus on improving documentation, storage modules, stability, or security.
For a long time to come, the project's iteration will center on validating three directions: code-driven embodied intelligence, code-based thinking capabilities, and code-based learning. The author will also aim to optimize out-of-the-box agent abilities.
The GhostOS project is developed by the author for exploring AI applications. The basic idea is as follows:
AI Agent technology has two parallel evolutionary paths: one is the perfection of the model's own capabilities, and the other is the evolution of the Agent engineering framework. The productivity level of the Agent framework determines the feasibility of AI models in practical application scenarios.
GhostOS reflects an agent's capabilities from code into prompts and provides them to AI models, and the code generated by the models runs directly in the environment. The expectation is that the large language model does everything through a Turing-complete programming-language interface, including computation, tool invocation, body control, personality switching, thinking paradigms, state scheduling, Multi-Agent, memory and recall, and other actions.
This approach offers stronger interaction capabilities and lower overhead than methods based on JSON Schema. The conversation data generated in this process can be used for post-training or reinforcement learning of the model, thereby continuously optimizing code generation.
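As a rough illustration of that contrast (this is not the Moss Protocol itself; the objects filesystem, reader, and memory are hypothetical), a JSON-Schema tool call constrains the model to one structured invocation per step, while a code interface lets it compose, branch, and loop within a single reply:

# JSON-Schema style: the model emits one constrained object per tool call,
# which the framework then dispatches.
json_style_call = {
    "tool": "search_files",
    "arguments": {"pattern": "*.md", "limit": 10},
}

# Code-interface style: the model writes ordinary Python against objects
# reflected into its prompt, combining several capabilities in one step.
code_style_reply = """
files = filesystem.search("*.md", limit=10)
summaries = [reader.summarize(f) for f in files if "agent" in f.lower()]
memory.store("md_summaries", summaries)
"""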
The AI Agent itself is also defined by code. Therefore, a Meta-Agent can develop other Agents just like a normal programming task.
Ideally, the Meta-Agent can write code, write its own tools, define memories and chain of thoughts with data structures, and develop other Agents for itself.
Furthermore, most complex tasks with rigorous steps can be described using tree or graph data structures. Constructing a nested graph or tree using methods like JSON is very difficult, while using programming languages is the most efficient.
Models can consolidate the results learned from conversations into nodes in the code, and then plan them into trees or graphs, thereby executing sufficiently complex tasks.
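A framework-agnostic sketch of that idea (the TaskNode class below is hypothetical, not a GhostOS API): consolidated steps become nodes defined in code, and a plan is a tree that can be stored in the repository and executed later:

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class TaskNode:
    name: str
    action: Callable[[], None]
    children: List["TaskNode"] = field(default_factory=list)

    def run(self) -> None:
        # Execute this node's action, then its subtasks depth-first.
        self.action()
        for child in self.children:
            child.run()

# A skill learned from conversation, consolidated into nodes and planned as a tree.
plan = TaskNode(
    name="write_report",
    action=lambda: print("gather sources"),
    children=[
        TaskNode("draft", lambda: print("draft sections")),
        TaskNode("review", lambda: print("review and revise")),
    ],
)
plan.run()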
In this way, an AI Agent can store the knowledge and capabilities learned from natural language in the form of files and code, thereby evolving itself. This is a path of evolution beyond model iteration.
Based on this idea, GhostOS aims to turn an Agent swarm into a project constructed through code. The agents continuously accumulate new knowledge and capabilities in the form of code, enriching the project. The Agent project can be copied, shared, or deployed in the form of a repository.
In this new form of productivity, interacting purely through code is the most critical step.
The author's ultimate goal is not GhostOS itself, but to verify and promote the code-interaction design and its applications. The hope is that one day, agents, paradigms, bodies, and tools for AI Agents can all be designed based on the same programming-language protocols, achieving cross-project universality.
Similar Open Source Tools

2p-kt
2P-Kt is a Kotlin-based and multi-platform reboot of tuProlog (2P), a multi-paradigm logic programming framework written in Java. It consists of an open ecosystem for Symbolic Artificial Intelligence (AI) with modules supporting logic terms, unification, indexing, resolution of logic queries, probabilistic logic programming, binary decision diagrams, OR-concurrent resolution, DSL for logic programming, parsing modules, serialisation modules, command-line interface, and graphical user interface. The tool is designed to support knowledge representation and automatic reasoning through logic programming in an extensible and flexible way, encouraging extensions towards other symbolic AI systems than Prolog. It is a pure, multi-platform Kotlin project supporting JVM, JS, Android, and Native platforms, with a lightweight library leveraging the Kotlin common library.

gptauthor
GPT Author is a command-line tool designed to help users write long form, multi-chapter stories by providing a story prompt and generating a synopsis and subsequent chapters using ChatGPT. Users can review and make changes to the generated content before finalizing the story output in Markdown and HTML formats. The tool aims to unleash storytelling genius by combining human input with AI-generated content, offering a seamless writing experience for creating engaging narratives.

generative-ai-sagemaker-cdk-demo
This repository showcases how to deploy generative AI models from Amazon SageMaker JumpStart using the AWS CDK. Generative AI is a type of AI that can create new content and ideas, such as conversations, stories, images, videos, and music. The repository provides a detailed guide on deploying image and text generative AI models, utilizing pre-trained models from SageMaker JumpStart. The web application is built on Streamlit and hosted on Amazon ECS with Fargate. It interacts with the SageMaker model endpoints through Lambda functions and Amazon API Gateway. The repository also includes instructions on setting up the AWS CDK application, deploying the stacks, using the models, and viewing the deployed resources on the AWS Management Console.

aisuite
Aisuite is a simple, unified interface to multiple Generative AI providers. It allows developers to easily interact with various Language Model (LLM) providers like OpenAI, Anthropic, Azure, Google, AWS, and more through a standardized interface. The library focuses on chat completions and provides a thin wrapper around python client libraries, enabling creators to test responses from different LLM providers without changing their code. Aisuite maximizes stability by using HTTP endpoints or SDKs for making calls to the providers. Users can install the base package or specific provider packages, set up API keys, and utilize the library to generate chat completion responses from different models.

curate-gpt
CurateGPT is a prototype web application and framework for performing general purpose AI-guided curation and curation-related operations over collections of objects. It allows users to load JSON, YAML, or CSV data, build vector database indexes for ontologies, and interact with various data sources like GitHub, Google Drives, Google Sheets, and more. The tool supports ontology curation, knowledge base querying, term autocompletion, and all-by-all comparisons for objects in a collection.

LeanAide
LeanAide is a work in progress AI tool designed to assist with development using the Lean Theorem Prover. It currently offers a tool that translates natural language statements to Lean types, including theorem statements. The tool is based on GPT 3.5-turbo/GPT 4 and requires an OpenAI key for usage. Users can include LeanAide as a dependency in their projects to access the translation functionality.

ScreenAgent
ScreenAgent is a project focused on creating an environment for Visual Language Model agents (VLM Agent) to interact with real computer screens. The project includes designing an automatic control process for agents to interact with the environment and complete multi-step tasks. It also involves building the ScreenAgent dataset, which collects screenshots and action sequences for various daily computer tasks. The project provides a controller client code, configuration files, and model training code to enable users to control a desktop with a large model.

KrillinAI
KrillinAI is a video subtitle translation and dubbing tool based on AI large models, featuring speech recognition, intelligent sentence segmentation, professional translation, and one-click deployment of the entire process. It provides a one-stop workflow from video downloading to the final product, empowering cross-language cultural communication with AI. The tool supports multiple languages for input and translation, integrates features like automatic dependency installation, video downloading from platforms like YouTube and Bilibili, high-speed subtitle recognition, intelligent subtitle segmentation and alignment, custom vocabulary replacement, professional-level translation engine, and diverse external service selection for speech and large model services.

honcho
Honcho is a platform for creating personalized AI agents and LLM powered applications for end users. The repository is a monorepo containing the server/API for managing database interactions and storing application state, along with a Python SDK. It utilizes FastAPI for user context management and Poetry for dependency management. The API can be run using Docker or manually by setting environment variables. The client SDK can be installed using pip or Poetry. The project is open source and welcomes contributions, following a fork and PR workflow. Honcho is licensed under the AGPL-3.0 License.

curategpt
CurateGPT is a prototype web application and framework designed for general purpose AI-guided curation and curation-related operations over collections of objects. It provides functionalities for loading example data, building indexes, interacting with knowledge bases, and performing tasks such as chatting with a knowledge base, querying Pubmed, interacting with a GitHub issue tracker, term autocompletion, and all-by-all comparisons. The tool is built to work best with the OpenAI gpt-4 model and OpenAI ada-text-embedding-002 for embedding, but also supports alternative models through a plugin architecture.

BentoDiffusion
BentoDiffusion is a BentoML example project that demonstrates how to serve and deploy diffusion models in the Stable Diffusion (SD) family. These models are specialized in generating and manipulating images based on text prompts. The project provides a guide on using SDXL Turbo as an example, along with instructions on prerequisites, installing dependencies, running the BentoML service, and deploying to BentoCloud. Users can interact with the deployed service using Swagger UI or other methods. Additionally, the project offers the option to choose from various diffusion models available in the repository for deployment.

admet_ai
ADMET-AI is a platform for ADMET prediction using Chemprop-RDKit models trained on ADMET datasets from the Therapeutics Data Commons. It offers command line, Python API, and web server interfaces for making ADMET predictions on new molecules. The platform can be easily installed using pip and supports GPU acceleration. It also provides options for processing TDC data, plotting results, and hosting a web server. ADMET-AI is a machine learning platform for evaluating large-scale chemical libraries.

llamabot
LlamaBot is a Pythonic bot interface to Large Language Models (LLMs), providing an easy way to experiment with LLMs in Jupyter notebooks and build Python apps utilizing LLMs. It supports all models available in LiteLLM. Users can access LLMs either through local models with Ollama or by using API providers like OpenAI and Mistral. LlamaBot offers different bot interfaces like SimpleBot, ChatBot, QueryBot, and ImageBot for various tasks such as rephrasing text, maintaining chat history, querying documents, and generating images. The tool also includes CLI demos showcasing its capabilities and supports contributions for new features and bug reports from the community.

hugescm
HugeSCM is a cloud-based version control system designed to address R&D repository size issues. It effectively manages large repositories and individual large files by separating data storage and utilizing advanced algorithms and data structures. It aims for optimal performance in handling version control operations of large-scale repositories, making it suitable for single large library R&D, AI model development, and game or driver development.

BTGenBot
BTGenBot is a tool that generates behavior trees for robots using lightweight large language models (LLMs) with a maximum of 7 billion parameters. It fine-tunes on a specific dataset, compares multiple LLMs, and evaluates generated behavior trees using various methods. The tool demonstrates the potential of LLMs with a limited number of parameters in creating effective and efficient robot behaviors.
For similar jobs

promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".

leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, evaluation and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to the project website.

AI-YinMei
AI-YinMei is an AI virtual anchor (VTuber) development tool (NVIDIA GPU version). It supports fastgpt knowledge-base chat dialogue with a complete LLM stack of [fastgpt] + [one-api] + [Xinference], bilibili live-stream barrage replies and welcome speech for viewers entering the stream, Microsoft edge-tts speech synthesis, Bert-VITS2 speech synthesis, GPT-SoVITS speech synthesis, expression control via VTube Studio, painting with stable-diffusion-webui output to an OBS live room, NSFW image filtering for generated images, image search via duckduckgo (requires a VPN) and Baidu image search (no VPN required), an AI reply chat box [html plug-in], AI singing via Auto-Convert-Music, a playlist [html plug-in], a dancing function, expression video playback, head-touching and gift-smashing actions, automatic dancing when singing starts, automatic swaying during chat and singing, multi-scene switching, background music switching, automatic day/night scene switching, and enabling singing and painting with the AI automatically judging the content.