
hal-9100
Edge full-stack LLM platform. Written in Rust
Stars: 353

This repository is now archived and the code is privately maintained. If you are interested in this infrastructure, please contact the maintainer directly.
README:
- [x] Code Interpreter: Generates and runs Python code in a sandboxed environment autonomously. (beta)
- [x] Knowledge Retrieval: Retrieves external knowledge or documents autonomously.
- [x] Function Calling: Defines and executes custom functions autonomously.
- [x] Actions: Execute requests to external APIs autonomously.
- [x] Files: Supports a range of file formats.
- [x] OpenAI compatible: Works with OpenAI (Assistants) SDK
HAL-9100 is a good fit if:
- You want to increase customization (e.g. use your own models, extend the API, etc.)
- You work in a data-sensitive environment (healthcare, IoT, military, law, etc.)
- Your product has poor or no internet access (military, IoT, edge, extreme environment, etc.)
- (not our main focus) You operate on a large scale and want to reduce your costs
- (not our main focus) You operate on a large scale and want to increase your speed
First, our definition of Software 3.0, as it is a loaded term: Software 3.0 is the bridge connecting the cognitive capabilities of Large Language Models with the practical needs of human digital activity. It is a comprehensive approach that allows LLMs to:
- perform the same activities in the digital world as humans, or perform them better
- more generally, allow the user to perform more operations without conscious effort
HAL-9100 is in continuous development, with the aim of always offering better infrastructure for Edge Software 3.0. To achieve this, it is based on several principles that define its functionality and scope.
Less prompt is more
As few prompts as possible should be hard-coded into the infrastructure, just enough to bridge the gap between Software 1.0 and Software 3.0, giving the client as much control as possible over the prompts.
Edge-first
By focusing on open-source LLMs, HAL-9100 does not require internet access, which means you own your data and your models. It runs on a Raspberry Pi (LLM included).
OpenAI-compatible
OpenAI put a great deal of thought into designing this API, which makes it an excellent experience for developers. Support for OpenAI's own LLMs, however, is not a priority.
Reliable and deterministic
HAL-9100 focuses on reliability and on being as deterministic as possible by default. That's why everything has to be tested and benchmarked.
Flexible
A minimal number of hard-coded prompts and behaviors; a wide range of supported models, infrastructure components, and deployment options; and good interoperability with the open-source ecosystem, while only integrating projects that have stood the test of time.
Get started in less than a minute through GitHub Codespaces, or clone the repository locally:
git clone https://github.com/llm-edge/hal-9100
cd hal-9100
To get started quickly, let's use the Anyscale API.
Get an API key from Anyscale and replace model_api_key in hal-9100.toml with your API key.
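For reference, the relevant part of hal-9100.toml could look like the following minimal sketch; only the model_api_key and model_url keys come from this README, while the exact file layout and the Anyscale endpoint URL are assumptions:

model_api_key = "YOUR_ANYSCALE_API_KEY" # replace with your Anyscale key
model_url = "https://api.endpoints.anyscale.com/v1/chat/completions" # assumed Anyscale chat completions endpoint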
Usage w/ ollama
- Use model_url = "http://localhost:11434/v1/chat/completions"
- Set the model to gemma:2b in examples/quickstart.js
- Run: ollama run gemma:2b & docker compose --profile api -f docker/docker-compose.yml up
Install OpenAI SDK: npm i openai
Start the infra:
docker compose --profile api -f docker/docker-compose.yml up
Run the quickstart:
node examples/quickstart.js
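The contents of examples/quickstart.js are not reproduced here; the following is a minimal sketch of what such a script could look like using the OpenAI Node SDK against the local server, where the port, placeholder API key, and assistant details are assumptions rather than the actual example:

import OpenAI from "openai";

// Point the OpenAI SDK at the local HAL-9100 server (port is an assumption).
const openai = new OpenAI({
  baseURL: "http://localhost:3000",
  apiKey: "EMPTY", // the local server does not need a real OpenAI key
});

async function main() {
  // Standard Assistants flow: create an assistant, a thread, a message, then a run.
  const assistant = await openai.beta.assistants.create({
    name: "Math Tutor",
    instructions: "You are a helpful math tutor.",
    model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
  });
  const thread = await openai.beta.threads.create();
  await openai.beta.threads.messages.create(thread.id, {
    role: "user",
    content: "Solve 3x + 11 = 14.",
  });
  const run = await openai.beta.threads.runs.create(thread.id, {
    assistant_id: assistant.id,
  });
  console.log("Created run:", run.id);
}

main();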
Is there a hosted version?
No. HAL-9100 is not a hosted service; it's software that you deploy on your own infrastructure. We can help you deploy it. Contact us.
Which LLM API can I use?
Examples of OpenAI-compatible LLM APIs that you can use:
- ollama
- MLC-LLM
- FastChat (good if you have a mac)
- vLLM (good if you have a modern gpu)
- Perplexity API
- Mistral API
- anyscale
- together ai
We recommend these models:
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/mistral-7b
Other models have not been extensively tested and may not work as expected, but you can try them.
What's the difference with LangChain?
1. LangChain spans proprietary and open-source LLMs, among the thousands of things it covers; HAL-9100 laser-focuses on Software 3.0 for the edge.
2. You can write AI products in 50 lines of code instead of 5,000, without having to learn a whole new abstraction.
Are you related to OpenAI?
No.
I don't use the Assistants API. Can I use this?
We recommend switching to the Assistants API for a more streamlined experience, allowing you to focus more on your product than on infrastructure.
Does the Assistants API support audio and images?
Images soon, working on it.
Alternative AI tools for hal-9100
Similar Open Source Tools

momentum-core
Momentum is an open-source behavioral auditor for backend code that helps developers generate powerful insights into their codebase. It analyzes code behavior, tests it at every git push, and ensures readiness for production. Momentum understands backend code, visualizes dependencies, identifies behaviors, generates test code, runs code in the local environment, and provides debugging solutions. It aims to improve code quality, streamline testing processes, and enhance developer productivity.

CodeGPT
CodeGPT is an extension for JetBrains IDEs that provides access to state-of-the-art large language models (LLMs) for coding assistance. It offers a range of features to enhance the coding experience, including code completions, a ChatGPT-like interface for instant coding advice, commit message generation, reference file support, name suggestions, and offline development support. CodeGPT is designed to keep privacy in mind, ensuring that user data remains secure and private.

colors_ai
Colors AI is a cross-platform color scheme generator that uses deep learning from public API providers. It is available for all mainstream operating systems, including mobile. Features: - Choose from open APIs, with the ability to set up custom settings - Export section with many export formats to save or clipboard copy - URL providers to other static color generators - Localized to several languages - Dark and light theme - Material Design 3 - Data encryption - Accessibility - And much more

portia-sdk-python
Portia AI is an open source developer framework for predictable, stateful, authenticated agentic workflows. It allows developers to have oversight over their multi-agent deployments and focuses on production readiness. The framework supports iterating on agents' reasoning, extensive tool support including MCP support, authentication for API and web agents, and is production-ready with features like attribute multi-agent runs, large inputs and outputs storage, and connecting any LLM. Portia AI aims to provide a flexible and reliable platform for developing AI agents with tools, authentication, and smart control.

lighteval
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron. We're releasing it with the community in the spirit of building in the open. Note that it is still very much early, so don't expect 100% stability ^^' In case of problems or questions, feel free to open an issue!

haystack
Haystack is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search and more. Whether you want to perform retrieval-augmented generation (RAG), document search, question answering or answer generation, Haystack can orchestrate state-of-the-art embedding models and LLMs into pipelines to build end-to-end NLP applications and solve your use case.

flow-like
Flow-Like is an enterprise-grade workflow operating system built upon Rust for uncompromising performance, efficiency, and code safety. It offers a modular frontend for apps, a rich set of events, a node catalog, a powerful no-code workflow IDE, and tools to manage teams, templates, and projects within organizations. With typed workflows, users can create complex, large-scale workflows with clear data origins, transformations, and contracts. Flow-Like is designed to automate any process through seamless integration of LLM, ML-based, and deterministic decision-making instances.

qdrant
Qdrant is a vector similarity search engine and vector database. It is written in Rust, which makes it fast and reliable even under high load. Qdrant can be used for a variety of applications, including semantic search, image search, product recommendations, chatbots, and anomaly detection. It offers features such as payload storage and filtering, hybrid search with sparse vectors, vector quantization and on-disk storage, distributed deployment, query planning, payload indexes, SIMD hardware acceleration, async I/O, and write-ahead logging. Qdrant is available as a fully managed cloud service or as open-source software that can be deployed on-premises.

openllmetry-js
OpenLLMetry-JS is a set of extensions built on top of OpenTelemetry that gives you complete observability over your LLM application. Because it uses OpenTelemetry under the hood, it can be connected to your existing observability solutions - Datadog, Honeycomb, and others. It's built and maintained by Traceloop under the Apache 2.0 license. The repo contains standard OpenTelemetry instrumentations for LLM providers and Vector DBs, as well as a Traceloop SDK that makes it easy to get started with OpenLLMetry-JS, while still outputting standard OpenTelemetry data that can be connected to your observability stack. If you already have OpenTelemetry instrumented, you can just add any of our instrumentations directly.

TaskingAI
TaskingAI brings Firebase's simplicity to **AI-native app development**. The platform enables the creation of GPTs-like multi-tenant applications using a wide range of LLMs from various providers. It features distinct, modular functions such as Inference, Retrieval, Assistant, and Tool, seamlessly integrated to enhance the development process. TaskingAI's cohesive design ensures an efficient, intelligent, and user-friendly experience in AI application development.

kalavai-client
Kalavai is an open-source platform that transforms everyday devices into an AI supercomputer by aggregating resources from multiple machines. It facilitates matchmaking of resources for large AI projects, making AI hardware accessible and affordable. Users can create local and public pools, connect with the community's resources, and share computing power. The platform aims to be a management layer for research groups and organizations, enabling users to unlock the power of existing hardware without needing a devops team. Kalavai CLI tool helps manage both versions of the platform.

ProxyAI
ProxyAI is an open-source AI copilot for JetBrains, offering advanced code assistance features powered by top-tier language models. Users can customize their coding experience, receive AI-suggested code changes, autocomplete suggestions, and context-aware naming suggestions. The tool also allows users to chat with images, reference project files and folders, web docs, git history, and search the web. ProxyAI prioritizes user privacy by not collecting sensitive information and only gathering anonymous usage data with consent.

llm-answer-engine
This repository contains the code and instructions needed to build a sophisticated answer engine that leverages the capabilities of Groq, Mistral AI's Mixtral, Langchain.JS, Brave Search, Serper API, and OpenAI. Designed to efficiently return sources, answers, images, videos, and follow-up questions based on user queries, this project is an ideal starting point for developers interested in natural language processing and search technologies.

PlanExe
PlanExe is a planning AI tool that helps users generate detailed plans based on vague descriptions. It offers a Gradio-based web interface for easy input and output. Users can choose between running models in the cloud or locally on a high-end computer. The tool aims to provide a straightforward path to planning various tasks efficiently.

upscayl
Upscayl is a free and open-source AI image upscaler that uses advanced AI algorithms to enlarge and enhance low-resolution images without losing quality. It is a cross-platform application built with the Linux-first philosophy, available on all major desktop operating systems. Upscayl utilizes Real-ESRGAN and Vulkan architecture for image enhancement, and its backend is fully open-source under the AGPLv3 license. It is important to note that a Vulkan compatible GPU is required for Upscayl to function effectively.