machinascript-for-robots
Build LLM-powered robots in your garage with MachinaScript For Robots!
Stars: 158
MachinaScript For Robots is a dynamic set of tools and a LLM-JSON-based language designed to empower humans in the creation of their own robots. It facilitates the animation of generative movements, the integration of personality, and the teaching of new skills with a high degree of autonomy. With MachinaScript, users can control a wide range of electronic components, including Arduinos, Raspberry Pis, servo motors, cameras, sensors, and more. The tool enables the creation of intelligent robots accessible to everyone, allowing for complex tasks to be performed with elegance and precision.
README:
MACHINA3 embodies a loop of perception and action that simulates the flow of human thought. Through the integration of a vision system and a serial-to-analog signal parser, MACHINA3 interprets visual data and crafts responses in near real time, enabling machines to perform complex tasks with elegance and some level of precision - a fantastic achievement for such an early stage of the technology.
Added support for Llama 3.2 Vision with ultra-fast Groq inference (300+ tps). Added support for JSON mode.
See the full release here.
Intro • How MachinaScript Works • Getting Started • Installation • Community
MachinaScript is a dynamic set of tools and a LLM-JSON-based language designed to empower humans in the creation of their own robots.
It facilitates the animation of generative movements, the integration of personality, and the teaching of new skills with a high degree of autonomy. With MachinaScript, you can control a wide range of electronic components, including Arduinos, Raspberry Pis, servo motors, cameras, sensors, and much more.
MachinaScript's mission is to make cutting-edge intelligent robotics accessible for everyone.
Read all about it in the Medium article.
Read the user manual in the code directory here.
- Input Reception: Upon receiving an input, the brain unit (a central processing unit such as a Raspberry Pi or a computer of your choice) initiates the process - for example, listening for a wake word, or continuously reading images in real time with a multimodal LLM.
- Instruction Generation: A large language model (LLM) then crafts a sequence of instructions for actions, movements, and skills. These are formatted in MachinaScript and optimized for sequential execution.
- Instruction Parsing: The robot's brain unit interprets the generated MachinaScript instructions.
- Action Serialization: Instructions are relayed to the microcontroller, the component governing the robot's physical operations such as servo motors and sensors (see the sketch after this list).
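To make the loop concrete, here is a minimal, hypothetical sketch of a brain-unit loop in Python. It assumes an OpenAI-compatible chat endpoint with JSON mode and a microcontroller listening on a serial port; the command format, JSON keys, and helper names are illustrative, not the official MachinaScript code.

```python
# Minimal brain-unit loop sketch (illustrative - adapt to your own robot).
import json
import serial                  # pyserial: pip install pyserial
from openai import OpenAI      # any OpenAI-compatible endpoint works

client = OpenAI()                                # assumes OPENAI_API_KEY is set
arduino = serial.Serial("/dev/ttyUSB0", 115200)  # microcontroller running your firmware

# System message = language rules + your project specs (both files are explained below).
system_prompt = (
    open("machinascript_language.txt").read()
    + "\n"
    + open("machinascript_project_specs.txt").read()
)

def generate_machinascript(user_input: str) -> dict:
    """Ask the LLM for a MachinaScript instruction set and parse it as JSON."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # or any local / multimodal model you prefer
        response_format={"type": "json_object"},  # JSON mode keeps the output parseable
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return json.loads(response.choices[0].message.content)

def execute(machinascript: dict) -> None:
    """Serialize each movement to the microcontroller as a simple text command."""
    for action in machinascript.get("actions", []):
        for move in action.get("movements", []):
            command = f"{move['motor']}:{move['degrees']}:{move['speed']}\n"
            arduino.write(command.encode())       # firmware decodes this and drives the servo

while True:
    execute(generate_machinascript(input("You: ")))  # or trigger on a wake word / camera frame
```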
The MachinaScript language's LLM-JSON-based syntax is highly modular because it is generative. It is composed of three major nested components: Actions, Movements, and Skills.
Actions: a set of instructions to be executed in a specific order. They may contain multiple movements and multiple skill usages.
Movements: they specify which motors to move and parameters such as degrees and speed. This can be used to create very personal animations.
Skills: function calling the MachinaScript way, used to operate cameras and sensors, and even to speak via text-to-speech.
As long as your brain unit code is adapted to interpret it, there is no end to your creativity.
The repository includes an example of the complete language structure in its current latest version. Note that you can change the entire syntax of the language structure to fit your needs, no strings attached - just make sure it still works with your brain module's generation, parsing, and serialization.
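As a rough illustration (the exact key names are defined by the language file in the repository - the ones below are assumptions), a generated instruction set nests Movements and Skills inside Actions, roughly like this:

```json
{
  "actions": [
    {
      "action_id": "look_up_and_snap",
      "movements": [
        {"motor": "motor_neck_vertical", "degrees": 120, "speed": "medium"},
        {"motor": "motor_neck_horizontal", "degrees": 90, "speed": "slow"}
      ],
      "skills": [
        {"skill": "photograph"},
        {"skill": "blink_led", "times": 3}
      ]
    }
  ]
}
```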
The project was designed to be used across the wide ecosystem of large language models - multimodal and non-multimodal, local and non-local. Note that autopilot units like Machina2 require some form of multimodality to sense the world through images and plan actions on their own.
To instruct an LLM to speak the MachinaScript syntax, we pass a system message that looks like this:
You are a MachinaScript for Robots generator.
MachinaScript is a LLM-JSON-based format used to define robotic actions, including
motor movements and skill usage, under specific contexts given by the user.
Each action can involve multiple movements, motors and skills, with defined parameters
like motor positions, speeds, and skill-specific details, like this:
(...)
Please generate a new MachinaScript using the exact given format and project specifications.
This piece of code is referred to as machinascript_language.txt and is recommended to stay unchanged.
Ideally, you will only change the specs of your project.
No two artisanal robots are the same - they are all beautifully unique.
One of the most mind-blowing things about MachinaScript is that it can embody any design. You just need to describe, in a set of specs, the robot's physical properties and limitations, as well as instructions for the LLM's behavior. Should it be funny? Serious? What are its goals? Favorite color? The machinascript_project_specs.txt file is where you put everything related to your robot's personality.
For this to work, we append a little extra information to the system message, like the following:
Project specs:
```json
{
  "Motors": [
    {"id": "motor_neck_vertical", "range": [0, 180]},
    {"id": "motor_neck_horizontal", "range": [0, 180]}
  ],
  "Skills": [
    {"id": "photograph", "description": "Captures a photograph using an attached camera and sends it to a multimodal LLM."},
    {"id": "blink_led", "parameters": {"led_pin": 10, "duration": 500, "times": 3}, "description": "Blinks an LED to indicate action."}
  ],
  "Limitations": [
    {"motor": "motor_neck_vertical", "max_speed": "medium"},
    {"motor_speeds": ["slow", "medium", "high"]}
  ],
  "Personality": "Funny, delicate",
  "Agency Level": "high"
}
```
Note that the JSON style here can be completely reworked into any kind of text you want - you can even describe it in a single paragraph if you feel like it. However, for the sake of human readability and developer experience, you can use this template to better mentally map your project specs. This is all in very early beta, so take it with a grain of salt.
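One practical use of keeping the specs machine-readable is letting the brain code sanity-check what the LLM generates before anything moves. A hypothetical sketch (it assumes the specs file is valid JSON and that movement keys follow the illustrative structure shown earlier - neither is required by MachinaScript):

```python
import json

# Hypothetical guard: clamp generated movements to the motor ranges declared in the specs.
# Assumes machinascript_project_specs.txt is valid JSON, which MachinaScript does not require.
with open("machinascript_project_specs.txt") as f:
    specs = json.load(f)
motor_ranges = {motor["id"]: motor["range"] for motor in specs["Motors"]}

def clamp_movement(move: dict) -> dict:
    """Keep a generated movement inside the declared [min, max] range for its motor."""
    low, high = motor_ranges[move["motor"]]
    move["degrees"] = max(low, min(high, move["degrees"]))
    return move
```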
We are releasing a set of fine-tuned models for MachinaScript soon to make its generations even better. You can also fine-tune models for your own specific use case.
An action can contain multiple movements in a given order to perform animations (sets of movements). It may even embody personality in the motion.
Check out Disney's latest robot, which combines engineering with their team of motion designers to create a more human-friendly machine in the style of BD-1.
You can learn more about the 12 principles of animation here.
- Begin with Arduino: The easiest entry point is to start by programming your robot with Arduino code.
  - Construct your robot and get it moving with simple programmed commands.
  - Modify the Arduino code to accept dynamic commands, similar to how a remote-controlled car operates.
- Components: Utilize a variety of components to enhance your robot:
  - Servo motors, sensors, buttons, LEDs, and any other compatible electronics.
- Connect the Hardware: Link your Arduino to a computing device of your choice. This could be a Raspberry Pi, a personal computer, or even an older laptop with internet access.
- Edit the Brain Code:
  - Map Arduino components within your code and establish their rules and functions for interaction. For instance, a servo motor might be named head_motor_vertical and programmed to move up to 180 degrees.
  - Modify the "system prompt" passed to the LLM with your defined rules and component names.
  - Skills encompass any function callable from the LLM, ranging from complex movement sequences (e.g., making a drink, dancing) to interactive tasks like taking pictures or utilizing text-to-speech (see the dispatch sketch after this list).
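Here is a small, hypothetical sketch of how skills could be dispatched on the brain unit. The skill ids mirror the example project specs above; the registry and function bodies are illustrative placeholders, not part of the official code.

```python
# Hypothetical skill dispatch on the brain unit (skill ids mirror the example specs above).
def photograph() -> None:
    # In a real robot: capture a frame and hand it to a multimodal LLM.
    print("capturing photograph...")

def blink_led(led_pin: int = 10, duration: int = 500, times: int = 3) -> None:
    # In a real robot: relay a blink command to the microcontroller.
    print(f"blinking LED on pin {led_pin}, {times}x for {duration} ms")

SKILLS = {"photograph": photograph, "blink_led": blink_led}

def run_skill(skill_call: dict) -> None:
    """Look up the skill named in a MachinaScript instruction and call it with its parameters."""
    skill = SKILLS[skill_call["skill"]]
    parameters = {key: value for key, value in skill_call.items() if key != "skill"}
    skill(**parameters)

run_skill({"skill": "blink_led", "times": 3})  # example: a skill entry as the LLM might emit it
```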
Here's a quick overview:
- Clone/Download: Clone or download this repository into a chosen directory.
- Edit the Brain Code: Customize the brain code's system prompt to describe your robot's capabilities.
- Connect Hardware: Integrate your robot's locomotion and sensory systems as previously outlined.
Ready to share your projects to the world? Join our community on discord: https://discord.gg/SQFZNkQP3x
MachinaScript is my gift to the maker community,
which has taught me so much about being a human.
Let the robots live forever.
Made with love for all the makers out there!
This project is and always will be free and open source for everyone.
babycommando