ai-starter-kit
SambaNova AI Starter Kits is a collection of open-source examples and guides designed to facilitate the deployment of AI-driven use cases for developers and enterprises. The kits cover various categories such as Data Ingestion & Preparation, Model Development & Optimization, Intelligent Information Retrieval, and Advanced AI Capabilities. Users can obtain a free API key using SambaNova Cloud or deploy models using SambaStudio. Most examples are written in Python but can be applied to any programming language. The kits provide resources for tasks like text extraction, fine-tuning embeddings, prompt engineering, question-answering, image search, post-call analysis, and more.
README:
SambaNova AI Starter Kits are a collection of open-source examples and guides designed to facilitate the deployment of AI-driven use cases for both developers and enterprises.
To run these examples, you can obtain a free API key using SambaNova Cloud. Alternatively, if you are a current SambaNova customer, you can deploy your models using SambaStudio. Most of the code examples are written in Python, although the concepts can be applied to any programming language.
Questions? Just message us on SambaNova Community or create an issue on GitHub. We're happy to help live!
The table below lists the available kits, which are grouped into four categories: 1) Data Ingestion & Preparation, 2) Model Development & Optimization, 3) Intelligent Information Retrieval, and 4) Advanced AI Capabilities.
Note: For each kit, we specify whether it is compatible with SambaNova Cloud, SambaStudio, or both.
Name | Kit Description | Compatible APIs | Category |
---|---|---|---|
Data Extraction | Series of notebooks that demonstrate methods for extracting text from documents in different input formats. | SambaNova Cloud, SambaStudio | Data Ingestion & Preparation |
YoDA: Your Data Your model | Sample training recipe to train a Language Model (LLM) using a customer's private data. | SambaStudio | Data Ingestion & Preparation |
Fine tuning embeddings | Example workflow for fine-tuning embeddings from unstructured data, leveraging Large Language Models (LLMs) and open-source embedding models to enhance NLP task performance. | SambaStudio | Model Development & Optimization |
Fine tuning SQL | Example workflow for fine-tuning an SQL model for Question-Answering purposes, leveraging Large Language Models (LLMs) and open-source embedding models to enhance SQL generation task performance. | SambaStudio | Model Development & Optimization |
Prompt Engineering | Starting point demo for prompt engineering using SambaNova's API to experiment with different use case templates. Provides useful resources to improve prompt crafting, making it an ideal entry point for those new to this AISK. | SambaNova Cloud, SambaStudio | Model Development & Optimization |
Enterprise Knowledge Retrieval | Sample implementation of the semantic search workflow using the SambaNova platform to get answers to questions about your documents. Includes a runnable demo. | SambaNova Cloud, SambaStudio | Intelligent Information Retrieval |
Image Search | This example workflow shows a simple approach to image search by image description or image similarity. All workflows are built using the SambaNova platform. | SambaStudio | Intelligent Information Retrieval |
Multimodal Knowledge Retriever | Sample implementation of the semantic search workflow leveraging the SambaNova platform to get answers using text, tables, and images to questions about your documents. Includes a runnable demo. | SambaNova Cloud, SambaStudio | Intelligent Information Retrieval |
Post Call Analysis | Example workflow that shows a systematic approach to post-call analysis including Automatic Speech Recognition (ASR), diarization, large language model analysis, and retrieval augmented generation (RAG) workflows. All workflows are built using the SambaNova platform. | SambaNova Cloud, SambaStudio | Intelligent Information Retrieval |
RAG Evaluation Kit | A tool for evaluating the performance of LLM APIs using the RAG Evaluation methodology. | SambaStudio | Intelligent Information Retrieval |
Search Assistant | Sample implementation of the semantic search workflow built using the SambaNova platform to get answers to your questions using search engine snippets, and website crawled information as the source. Includes a runnable demo. | SambaNova Cloud, SambaStudio | Intelligent Information Retrieval |
Web Crawled Data Retrieval | Sample implementation of a semantic search workflow built using the SambaNova platform to get answers to your questions using website crawled information as the source. Includes a runnable demo. | SambaNova Cloud, SambaStudio | Intelligent Information Retrieval |
Benchmarking | This kit evaluates the performance of multiple LLM models hosted in SambaStudio. It offers various performance metrics and configuration options. Users can also see these metrics within a chat interface. | SambaNova Cloud, SambaStudio | Advanced AI Capabilities |
Code Copilot | This example guide shows a simple integration with the Continue extension for VS Code and JetBrains, using SambaNova platforms to provide SambaNova's hosted models as your custom coding assistant. | SambaStudio | Advanced AI Capabilities |
Bundle jump start | This kit demonstrates how to call SambaNova Bundle models using the Langchain framework. The script offers different approaches for calling Bundle models, including using SambaStudio with a named expert, and using SambaStudio with routing. | SambaStudio | Advanced AI Capabilities |
Financial Assistant | This app demonstrates the capabilities of LLMs in extracting and analyzing financial data using function calling, web scraping, and RAG. | SambaNova Cloud, SambaStudio | Advanced AI Capabilities |
Function Calling | Example of tools calling implementation and a generic function calling module that can be used inside your application workflows. | SambaNova Cloud, SambaStudio | Advanced AI Capabilities |
SambaNova Scribe | Example implementation of a transcription and summarization workflow. | SambaNova Cloud | Advanced AI Capabilities |
SambaCloud - Google Integration | Google Apps Script code intended for those with SambaCloud API keys to integrate LLMs into Google Workspace. | SambaNova Cloud | Advanced AI Capabilities |
Currently, there are two ways to obtain an API key from SambaNova. You can get a free API key using SambaNova Cloud. Alternatively, if you are a current SambaNova customer, you can deploy your models using SambaStudio.
For more information and to obtain your API key, visit the SambaNova Cloud webpage.
To integrate SambaNova Cloud LLMs with this AI starter kit, update the API information by configuring the environment variables in the ai-starter-kit/.env file:
- Create the .env file at ai-starter-kit/.env if the file does not exist.
- Enter the SambaNova Cloud API key in the .env file, for example:
SAMBANOVA_API_KEY = "456789abcdef0123456789abcdef0123"
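To verify the key works, here is a minimal sketch, assuming the python-dotenv and openai packages are installed (SambaNova Cloud exposes an OpenAI-compatible endpoint at https://api.sambanova.ai/v1, the same base URL used in the JavaScript example later in this README):

import os
from dotenv import load_dotenv  # assumes the python-dotenv package
from openai import OpenAI       # assumes the openai package; SambaNova Cloud is OpenAI-compatible

load_dotenv('.env')

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",   # SambaNova Cloud endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],  # read from the .env file
)
response = client.chat.completions.create(
    model="Meta-Llama-3.1-70B-Instruct",      # a model name used elsewhere in this README
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)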
Begin by deploying your LLM of choice (e.g., Llama 3 8B) to an endpoint for inference in SambaStudio. Use either the GUI or CLI, as described in the SambaStudio endpoint documentation.
To integrate your LLM deployed on SambaStudio with this AI starter kit, update the API information by configuring the environment variables in the ai-starter-kit/.env file:
- Create the .env file at ai-starter-kit/.env if the file does not exist.
- Set your SambaStudio variables. For example, an endpoint with the URL "https://api-stage.sambanova.net/api/predict/nlp/12345678-9abc-def0-1234-56789abcdef0/456789ab-cdef-0123-4567-89abcdef0123" is entered in the .env file as:
SAMBASTUDIO_URL="https://api-stage.sambanova.net/api/predict/nlp/12345678-9abc-def0-1234-56789abcdef0/456789ab-cdef-0123-4567-89abcdef0123"
SAMBASTUDIO_API_KEY="89abcdef-0123-4567-89ab-cdef01234567"
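As a quick sanity check, the following minimal sketch (assuming the python-dotenv package) loads the .env file and fails fast if either variable is missing:

import os
from dotenv import load_dotenv  # assumes the python-dotenv package

load_dotenv('.env')

# Fail fast if a required SambaStudio variable is missing from the .env file
for var in ("SAMBASTUDIO_URL", "SAMBASTUDIO_API_KEY"):
    if not os.getenv(var):
        raise EnvironmentError(f"{var} is not set in your .env file")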
Currently, you can set your embedding models on CPU or SambaStudio. Note that embedding models are not available yet through SambaNova Cloud, but they will be in future releases.
You can run the Hugging Face embedding models locally on CPU. In this case, no information is needed in the .env file.
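For example, a minimal CPU-only sketch, assuming the langchain-community and sentence-transformers packages are installed (the model name here is an illustrative default, not one mandated by the kits):

from langchain_community.embeddings import HuggingFaceEmbeddings

# Runs locally on CPU; downloads the model on first use
embedding = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector = embedding.embed_query("hello world")
print(len(vector))  # dimensionality of the embedding vector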
Alternatively, you can use SambaStudio embedding model endpoints instead of the CPU-based Hugging Face embeddings to increase inference speed. Please follow this guide to deploy your SambaStudio embedding model.
To integrate your embedding model deployed on SambaStudio with this AI starter kit, update the API information by configuring the environment variables in the ai-starter-kit/.env file:
- Create the .env file at ai-starter-kit/.env if the file does not exist.
- Set your SambaStudio variables. For example, an endpoint with the URL "https://api-stage.sambanova.net/api/predict/generic/12345678-9abc-def0-1234-56789abcdef0/456789ab-cdef-0123-4567-89abcdef0123" is entered in the .env file as:
SAMBASTUDIO_EMBEDDINGS_BASE_URL="https://api-stage.sambanova.net"
SAMBASTUDIO_EMBEDDINGS_BASE_URI="api/predict/generic"
SAMBASTUDIO_EMBEDDINGS_PROJECT_ID="12345678-9abc-def0-1234-56789abcdef0"
SAMBASTUDIO_EMBEDDINGS_ENDPOINT_ID="456789ab-cdef-0123-4567-89abcdef0123"
SAMBASTUDIO_EMBEDDINGS_API_KEY="89abcdef-0123-4567-89ab-cdef01234567"
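Note that these variables are just the example endpoint URL split into its components. A minimal sketch (assuming the python-dotenv package) that reassembles and prints the full endpoint URL:

import os
from dotenv import load_dotenv  # assumes the python-dotenv package

load_dotenv('.env')

# Reassemble the full endpoint URL from its components
full_url = "/".join([
    os.environ["SAMBASTUDIO_EMBEDDINGS_BASE_URL"].rstrip("/"),
    os.environ["SAMBASTUDIO_EMBEDDINGS_BASE_URI"],
    os.environ["SAMBASTUDIO_EMBEDDINGS_PROJECT_ID"],
    os.environ["SAMBASTUDIO_EMBEDDINGS_ENDPOINT_ID"],
])
print(full_url)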
Go to the README.md of the starter kit you want to use and follow the instructions. See Available AI Starter Kits.
Use SambaNova's LLMs and LangChain wrappers
Set your environment as shown in integrate your model.
- Import the SambaStudio langchain community wrapper in your project and define your SambaStudio LLM:
- If using a Bundle endpoint:
from dotenv import load_dotenv  # needed for load_dotenv below
from langchain_community.llms.sambanova import SambaStudio

load_dotenv('.env')

llm = SambaStudio(
    model_kwargs={
        "do_sample": False,
        "max_tokens_to_generate": 512,
        "temperature": 0.0,
        "select_expert": "Meta-Llama-3-8B-Instruct",  # the Bundle expert to route to
        "process_prompt": "False"
    },
)
- If using a single model endpoint:
from dotenv import load_dotenv  # needed for load_dotenv below
from langchain_community.llms.sambanova import SambaStudio

load_dotenv('.env')

llm = SambaStudio(
    model_kwargs={
        "do_sample": False,
        "max_tokens_to_generate": 512,
        "temperature": 0.0,
        "process_prompt": "False"
    },
)
- Use the model
llm.invoke("your prompt")
See utils/usage.ipynb for an example.
- Import our SambaNovaCloud langchain internal wrapper in your project and define your SambaNovaCloud LLM:
from dotenv import load_dotenv  # needed for load_dotenv below
from utils.model_wrappers.llms.langchain_llms import SambaNovaCloud

load_dotenv('.env')
llm = SambaNovaCloud(model='Meta-Llama-3.1-70B-Instruct')
- Use the model
llm.invoke("your prompt")
See utils/usage.ipynb for an example.
- Import the SambaStudioEmbedding langchain community wrapper in your project and define your SambaStudioEmbeddings embedding:
- If using a Bundle endpoint:
from dotenv import load_dotenv  # needed for load_dotenv below
from langchain_community.embeddings import SambaStudioEmbeddings

load_dotenv('.env')

embedding = SambaStudioEmbeddings(
    batch_size=1,
    model_kwargs={
        "select_expert": "e5-mistral-7b-instruct"  # expert name must be a quoted string
    }
)
- If using a single embedding model endpoint:
from dotenv import load_dotenv  # needed for load_dotenv below
from langchain_community.embeddings import SambaStudioEmbeddings

load_dotenv('.env')
embedding = SambaStudioEmbeddings(batch_size=32)
Note that choosing a different embedding option (CPU or SambaStudio) may change the results, and each option is configured with its own settings and parameters.
- Use your embedding model in your langchain pipeline
See utils/usage.ipynb for an example.
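For instance, here is a minimal sketch that ranks a few documents against a query by cosine similarity, assuming embedding was defined as in one of the snippets above (the documents and query are illustrative):

# Embed a few documents and a query, then rank documents by cosine similarity
docs = ["SambaStudio hosts LLM endpoints.", "Poppler converts PDF documents."]
doc_vectors = embedding.embed_documents(docs)
query_vector = embedding.embed_query("Where are LLM endpoints hosted?")

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot / (norm(a) * norm(b))

best_doc = max(zip(docs, doc_vectors), key=lambda pair: cosine(query_vector, pair[1]))[0]
print(best_doc)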
- Before running the code, ensure that you have Node.js installed on your system. You can download the latest version from the official Node.js website.
- Set up the environment by running the following commands in your terminal:
npm init -y
npm install @langchain/openai @langchain/core
These commands will create a new package.json file and install the required dependencies.
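Note: the code in the next step uses ES-module import syntax and top-level await, so you may also need to mark the package as an ES module (a standard Node.js requirement, not stated in the original guide). One way to do this:
npm pkg set type=module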
- Create a new file named app.js and add the following code:
import { ChatOpenAI } from "@langchain/openai";
const SambaNovaCloudBaseURL = "https://api.sambanova.ai/v1";
const apiKey = "your-api-key";
const SambaNovaCloudChatModel = new ChatOpenAI({
temperature: 0.9,
model: "Meta-Llama-3.1-70B-Instruct",
configuration: {
baseURL: SambaNovaCloudBaseURL,
apiKey: apiKey,
},
});
const response = await SambaNovaCloudChatModel.invoke("Hi there, tell me a joke!");
console.log(response.content);
- To run the app, execute the following command in your terminal:
node app.js
Setting up your virtual environment
There are two approaches to setting up your virtual environment for the AI Starter Kits:
- Individual Kit Setup (Traditional Method)
- Base Environment Setup (WIP)
Each starter kit has its own README.md and requirements.txt file. You can set up a separate virtual environment for each kit by following the instructions in their respective directories. This method is suitable if you're only interested in running a single kit or prefer isolated environments for each project.
To use this method (example commands are shown after this list):
- Navigate to the specific kit's directory
- Create a virtual environment
- Install the requirements
- Follow the kit-specific instructions
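For example, a typical sequence might look like the following. This is a minimal sketch using a standard venv/pip workflow; the kit directory name is illustrative:
cd enterprise_knowledge_retrieval
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Then follow the kit's README.md for any kit-specific steps.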
For users who plan to work with multiple kits or prefer a unified development environment, we recommend setting up a base environment. This approach uses a Makefile to automate the setup of a consistent Python environment that works across all kits.
Benefits of the base environment approach:
- Consistent Python version across all kits
- Centralized dependency management
- Simplified setup process
- Easier switching between different kits
Prerequisites:
- pyenv: The Makefile will attempt to install pyenv if it's not already installed.
- Docker: (Optional) If you want to use the Docker-based setup, ensure Docker is installed on your system.
The base environment setup:
- Installs pyenv and Poetry if they are not already installed.
- Sets up a Python virtual environment using a specified Python version (default is 3.11.3).
- Installs all necessary dependencies for the base environment.
- Sets up the parsing service required by some kits.
- Installs system dependencies like Tesseract OCR and Poppler.
- Provides Docker-based setup options for consistent environments across different systems.
- Install and Set Up the Base Environment:
make all
This command will set up the base ai-starter-kit environment, including installing all necessary tools and dependencies.
- Activate the Base Environment:
source .venv/bin/activate
- Navigate to Your Chosen Starter Kit:
cd path/to/starter_kit
Within the starter kit there will be instructions on how to start the kit. You can skip the virtual environment creation part in the kit's README.md, as we've already done it here.
For certain kits, we utilise a standard parsing service. By default, it is started automatically with the base environment. To work with this service in isolation, follow the steps in this section.
- Start Parsing Service:
make start-parsing-service
- Stop Parsing Service:
make stop-parsing-service
- Check Parsing Service Status:
make parsing-status
- View Parsing Service Logs:
make parsing-log
To use the Docker-based setup:
- Ensure Docker is installed on your system.
- Build the Docker image:
make docker-build
- Run a specific kit in the Docker container:
make docker-run-kit KIT=<kit_name>
Replace <kit_name> with the name of the starter kit you want to run (e.g., function_calling).
- To open a shell in the Docker container:
make docker-shell
To clean up all virtual environments created by the Makefile and stop the parsing service, run the following command:
make clean
This command removes all virtual environments created with the makefile, stops the parsing service, and cleans up any temporary files.
Troubleshooting
If you encounter issues while setting up or running the AI Starter Kit, here are some common problems and their solutions:
If you're having problems with Python versions:
- Ensure you have pyenv installed:
make ensure-pyenv
- Install the required Python versions:
make install-python-versions
- If issues persist, check your system's Python installation and PATH settings.
If you're experiencing dependency conflicts:
- Try cleaning your environment:
make clean
- Update the lock file:
poetry lock --no-update
- Reinstall dependencies:
make install
If you encounter an error while installing pikepdf, such as:
ERROR: Failed building wheel for pikepdf
Failed to build pikepdf
This is likely due to a missing qpdf dependency. The Makefile should automatically install qpdf for you, but if you're still encountering issues:
- Ensure you have proper permissions to install system packages.
- If you're on macOS, you can manually install qpdf using Homebrew:
brew install qpdf
- On Linux, you can install it using your package manager, e.g., on Ubuntu:
sudo apt-get update && sudo apt-get install -y qpdf
- After installing qpdf, try running make install again.
If you continue to face issues, please ensure your system meets all the requirements for building pikepdf and consider checking the pikepdf documentation for more detailed installation instructions.
If the parsing service isn't starting or is behaving unexpectedly:
- Check its status:
make parsing-status
- View its logs:
make parsing-log
- Try stopping and restarting it: make stop-parsing-service followed by make start-parsing-service
If you encounter issues related to Tesseract OCR or Poppler:
- Ensure the Makefile has successfully installed these dependencies.
- On macOS, you can manually install them using Homebrew:
brew install tesseract poppler
- On Linux (Ubuntu/Debian), you can install them manually:
sudo apt-get update && sudo apt-get install -y tesseract-ocr poppler-utils
- On Windows, you may need to install these dependencies manually and ensure they are in your system PATH.
If you're using the Docker-based setup and encounter issues:
- Ensure Docker is properly installed and running on your system.
- Try rebuilding the Docker image:
make docker-build
- Check Docker logs for any error messages.
- Ensure your firewall or antivirus is not blocking Docker operations.
- Ensure all prerequisites (Python, pyenv, Poetry) are correctly installed.
- Try cleaning and rebuilding the environment:
make clean all
- Check for any error messages in the console output and address them specifically.
- Ensure your .env file is correctly set up in the ai-starter-kit root with all necessary environment variables.
If you continue to experience issues, please open an issue with details about your environment, the full error message, and steps to reproduce the problem.
- Ensure you have sufficient permissions to install software on your system.
- The setup process may take several minutes, especially when installing Python versions or large dependencies.
- If you encounter any issues during setup, check the error messages and ensure your system meets all prerequisites.
- Always activate the base environment before navigating to and running a specific starter kit.
- Some kits may require additional setup steps. Always refer to the specific README of the kit you're using.
Note: These AI Starter Kit code samples are provided "as-is" and are not production-ready or supported code. Bug fixes and support will be provided on a best-effort basis only. The code may use third-party open-source software. You are responsible for performing due diligence per your organization's policies before using it in your applications.