openai-chat-api-workflow
An Alfred 5 Workflow for using the OpenAI Chat API to interact with GPT-3.5/GPT-4. It also allows image generation, image understanding, speech-to-text conversion, and text-to-speech synthesis.
Stars: 277
README:
Download OpenAI Chat API Workflow (version 2.9.9.4)
You can execute all the above features using:
- Alfred UI
- Selected text
- A dedicated web UI
The web UI is constructed by the workflow and runs locally on your Mac. The API call is made directly between the workflow and OpenAI, ensuring your chat messages are not shared online with anyone other than OpenAI. Furthermore, OpenAI does not use the data from the API Platform for training.
You can export the chat data to a simple JSON file and continue the chat later by importing it.
- Install Homebrew
- Run the following command in a terminal:
brew install pandoc mpv sox jq duti
- Download and run OpenAI Chat API Workflow
- Set your OpenAI API key
Setup Hotkeys
You can set up hotkeys in the settings screen of the workflow. To set up hotkeys, double-click on the light purple workflow elements.
- Open Web UI (Recommended)
- Direct Query
- Send Selected Text
- Screen Capture for Image Understanding
- Speech to Text
Dependencies
- Alfred 5 Powerpack
- OpenAI API key
- Pandoc: to convert Markdown to HTML
- MPV: to play text-to-speech audio streams
- Sox: to record voice input
- jq: to handle chat history in JSON
- duti: to detect the default web browser
To start using this workflow, you must set the environment variable apikey, which you can get by creating a new OpenAI account. See also the Configuration section below.
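As a quick sanity check (my suggestion, not a step the workflow requires), you can verify that the key is valid with a direct call to the public OpenAI models endpoint:

```sh
# List available models; a JSON model list means the key works,
# while a 401 error means it does not.
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" | head
```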
You will also need to install the pandoc and sox programs. Pandoc allows this workflow to convert the Markdown response from OpenAI to HTML and display the result in your default web browser with syntax highlighting enabled (especially useful when using this workflow to generate program code). Sox allows you to record voice audio to convert to text using the Whisper speech-to-text API.
To set up the dependencies (pandoc, mpv, sox, jq, and duti), first install Homebrew, then run the following command:
brew install pandoc mpv sox jq duti
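If you later want to confirm that all five tools are on your PATH, a small shell loop such as this (illustrative, not part of the workflow) will do:

```sh
# Report any of the five dependencies that are still missing.
for cmd in pandoc mpv sox jq duti; do
  command -v "$cmd" >/dev/null || echo "$cmd is missing"
done
```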
Change Log
Recent Change Log
- 2.9.9.4: Smoother response text streaming; max_tokens can be set to 0 (sent to the API as null)
- 2.9.9.2: gpt-4-turbo-2024-04-09 supported
- 2.9.9.1: System prompt modifiable in the web UI
- 2.9.9: Issue concerning chat containing images fixed
- 2.9.8: JSON export and cancel button behavior improved
- 2.9.7: Stability improvements; Brave browser supported
- 2.9.6: System prompt modifiable in the web UI
- 2.9.5: gpt-3.5-turbo-0125 supported (default model)
- 2.9.4: Copy code snippet button; fix dark mode issue
- 2.9.2: Default model set to gpt-3.5-turbo-1106; new model (gpt-4-0125-preview) supported
- 2.9.0: Image understanding (using specified files or screen captures)
There are three methods to run the workflow: 1) using commands within the Alfred UI, 2) passing selected text to the workflow, or 3) using the web UI. Additionally, there is a convenient method for making brief inquiries to GPT.
Commands within the Alfred UI
You can enter a query text directly into Alfred textbox:
- Method 1: Alfred textbox → keyword (openai) → space/tab → input query text → select a command (see below)
- Method 2: Alfred textbox → input query text → select fallback search (OpenAI Query)
Passing Selected Text
You can select any text on your Mac and send it to the workflow:
- Method 1: select text → universal action hotkey → select OpenAI Query
- Method 2: set up a custom hotkey to Send selected text to OpenAI
Using Web Interface
You can open the web interface:
- Method 1: Alfred textbox → keyword (openai-webui)
- Method 2: set up a custom hotkey to Open web interface
Using the Default Browser
If your default browser is set to one of the following and the duti command is installed on your system, the web interface will automatically open in your chosen browser. If not, Safari will be used as the default.
- Google Chrome (Stable, Beta, Dev, etc.)
- Microsoft Edge (Stable, Beta, Dev, etc.)
- Brave Browser
If the web UI does not work as expected after changing the default browser, restart the OpenAI Workflow server by executing openai-restart-server.
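If you are curious which handler duti has detected on your system, the following command (purely diagnostic; the workflow's own detection may query duti differently) prints the default application registered for .html files:

```sh
# Show the default application for .html files, e.g. Safari or Chrome.
duti -x html
```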
Web UI Modes
Switch modes (light/dark/auto) with the Web UI Mode selector in the settings.
Simple Direct Query/Chat
To quickly chat with GPT:
- Method 1: Alfred textbox → keyword (gpt) → space/tab → input query text (e.g. "gpt what is a large language model?")
- Method 2: set up a custom hotkey to OpenAI Direct Query
With Direct Query, the input text is sent directly to the OpenAI Chat API as a prompt. You can also create a query by prepending or appending text to the input text:
- Direct query: the input text is sent as a prompt to the OpenAI Chat API as-is.
- Prepend text: after the initial text is entered, you are prompted for additional text; the additional text is added before the initial text, and the resulting text is used as the query.
- Append text: after the initial text is entered, you are prompted for additional text; the additional text is added after the initial text, and the resulting text is used as the query.
- Generate image: the DALL-E API (dall-e-3 or dall-e-2) is used to generate images according to the prompts entered. See Image Generation below.
Some of the examples shown on OpenAI's Examples page are incorporated into this workflow as commands. Features not provided as dedicated commands can be achieved by giving appropriate prompts to the basic commands above.
GPT generates program code and example output according to the text entered. You can specify the purpose of the program, its function, the language and technology to be used, etc.
Example Input
Create a command line program that takes an English sentence and returns syntactically parsed output. Provide program code in Python and example usage.
Example Output
You can ask questions in the language set in the variable first_language.
Note: If the value of first_language is not English (e.g. Japanese), the response may be somewhat less accurate.
GPT translates text in the language specified in the variable first_language to the language specified in the variable second_language.
GPT translates text in the language specified in the variable second_language to the language specified in the variable first_language.
GPT corrects sentences that may contain grammatical errors. See OpenAI's description.
GPT assists you in brainstorming innovative ideas based on any given text.
GPT provides study notes of a given topic. See OpenAI's description for this example.
GPT creates analogies. See OpenAI's description for this example.
GPT generates an outline for a research topic. See OpenAI's description for this example.
GPT summarizes a given text. See OpenAI's description for this example.
GPT translates complex text into more straightforward concepts. See OpenAI's description for this example.
GPT extracts keywords from a block of text. See OpenAI's description for this example.
The image generation can be executed through one of the above commands. It is also possible to use the web UI, which lets you interactively refine the prompt to get closer to the desired image.
When the image generation mode is set to dall-e-3, the user's prompt is automatically expanded into a more detailed and specific prompt. You can also edit the expanded prompt and regenerate the image.
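For orientation, a dall-e-3 request at the API level looks roughly like the sketch below; the endpoint and fields are the public OpenAI Images API, and the workflow makes an equivalent call on your behalf:

```sh
# Generate one 1024x1024 image with dall-e-3; the response contains an image URL.
curl -s https://api.openai.com/v1/images/generations \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-3",
    "prompt": "a watercolor painting of a lighthouse at dawn",
    "size": "1024x1024",
    "quality": "standard",
    "style": "vivid",
    "n": 1
  }'
```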
The image understanding can be executed through the openai-vision command. It starts a capture mode and lets you specify a part of the screen to be analyzed. Alternatively, you can specify an image file (jpg, jpeg, png, gif) using the "OpenAI Vision" file action. This mode uses the gpt-4-turbo model for image understanding irrespective of the model set in the settings.
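At the API level, image understanding is a chat request whose message content includes the image. A minimal sketch under the public Chat Completions API (the workflow's actual request may differ in detail):

```sh
# Ask gpt-4-turbo a question about a base64-encoded PNG.
# macOS base64 emits a single unwrapped line, which is what the data URL needs.
IMAGE_B64=$(base64 < screenshot.png)
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,'"$IMAGE_B64"'"}}
      ]
    }]
  }'
```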
Most text-to-speech and speech-to-text features are available on the web UI. However, there are certain specific features that are provided as commands, such as audio file to text conversion and transcription with timestamps.
Text-to-Speech Synthesis
Text entered or response text from GPT can be read out in a natural voice using OpenAI's text-to-speech API.
- Method 1: press the Play TTS button on the web UI
- Method 2: select text → universal action hotkey → select OpenAI Text-to-Speech
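For reference, the underlying call is OpenAI's public audio/speech endpoint; a minimal sketch (the workflow streams the audio to mpv instead of saving a file):

```sh
# Synthesize speech with tts-1 and the alloy voice, saving the result as MP3.
curl -s https://api.openai.com/v1/audio/speech \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "voice": "alloy", "input": "Hello from Alfred!"}' \
  --output speech.mp3
```

Playing the saved file with mpv speech.mp3 approximates what the workflow does automatically.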
Speech-to-Text Conversion
The Whisper API can convert speech into text in a variety of languages. Please refer to the Whisper API FAQ for available languages and other limitations.
- Method 1: press the Voice Input button on the web UI
- Method 2: Alfred textbox → keyword (openai-speech)
Audio File to Text
You can select an audio file in mp3, mp4, flac, webm, wav, or m4a format (under 25MB) and send it to the workflow:
- Select the file → universal action hotkey → select OpenAI Speech-to-Text
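Under the hood this is a Whisper transcription request; a minimal equivalent call (the file name is just an example):

```sh
# Transcribe a local audio file with whisper-1; plain text is returned by default.
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -F file=@recording.mp3 \
  -F model=whisper-1
```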
Record Voice Audio and Transcribe
You can record voice audio and send it to the workflow for transcription using the Whisper API. Recording time is limited to 30 minutes; recording stops automatically after this limit.
- Alfred textbox → keyword (openai-speech) → a Terminal window opens and recording starts
- Speak into the internal or external microphone → press Enter to finish recording
- Choose the processes to apply to the recorded audio:
  - transcribe (+ delete recording)
  - transcribe (+ save recording to desktop)
  - transcribe and query (+ delete recording)
  - transcribe and query (+ save recording to desktop)
  - exit (+ delete recording)
  - exit (+ save recording to desktop)
You can choose the format of the transcribed text from text, srt, or vtt in the workflow's settings. Below are examples of the text and srt formats:
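As a stand-in for the original screenshots, here is a hypothetical transcript (wording invented for illustration) in the two formats:

```text
text:
Hello, this is a short test of the Whisper transcription.

srt:
1
00:00:00,000 --> 00:00:03,400
Hello, this is a short test of the Whisper transcription.
```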
Import/Export
You can export your chat data to a straightforward JSON format file, and resume your conversation later by importing it back in.
To export data, simply click Show Entire Chat in the chat window to navigate to the chat history page, then select Export Data. To import data, just hit Import Data on either the home page or the chat history page.
Monitor API Usage
To review your token usage for the current billing cycle on the OpenAI Usage Page, type the keyword openai-usage. For more details on billing, visit OpenAI's Billing Overview.
You can set various parameters in the settings panel of this workflow. Some of the parameters set here are used as default values, but you can make temporary changes to the values on the web UI. You can also access the settings panel by clicking Open Config from the web UI.
Required Settings
- OpenAI API Key: Set your secret API key for OpenAI. Sign up for OpenAI and get your API key at https://platform.openai.com/account/api-keys/.
- Base URL: The base URL of the OpenAI Chat API. (default: https://api.openai.com/v1)
Web UI Parameters
- Loopback Address: Either localhost or 127.0.0.1 can be used as the loopback address of the UI server. If the web UI does not work as expected, try the other. (default: 127.0.0.1)
- Stream Output: Show results in the default web browser. If unchecked, Alfred's "Large Type" feature is used to display the result. (default: enabled)
- Hide Speech Buttons: When enabled, the buttons for TTS playback and voice input are hidden on the web UI.
- Web UI Mode: Set your preferred UI mode (light/dark/auto). (default: auto)
Chat Parameters
- Model: OpenAI's chat model used for the workflow (default: gpt-3.5-turbo). If you want to use the latest and greatest model, set it to gpt-4-turbo. Here are the models currently available:
  - gpt-3.5-turbo-0125
  - gpt-3.5-turbo-1106
  - gpt-3.5-turbo-16k
  - gpt-3.5-turbo (default)
  - gpt-4-turbo (latest and greatest)
  - gpt-4
- Max Tokens: Maximum number of tokens to be generated upon completion (default: 2048). If this parameter is set to 0, null is sent to the API as the default value (the maximum number of tokens is not specified). See OpenAI's documentation.
- Temperature: See OpenAI's documentation. (default: 0.3)
- Top P: See OpenAI's documentation. (default: 1.0)
- Frequency Penalty: See OpenAI's documentation. (default: 0.0)
- Presence Penalty: See OpenAI's documentation. (default: 0.0)
- Memory Span: Set the number of past utterances sent to the API as context. Setting this parameter to 4 means two conversation turns (user → assistant → user → assistant) will be sent as context for a new query. The larger the value, the more tokens will be consumed. (default: 4)
- Max Characters: Maximum number of characters that can be included in a query. (default: 50000)
- Timeout: The number of seconds to wait before opening the socket and connecting to the API (default: 10). If the connection fails, reconnection will be attempted (up to 20 times) after 1 second.
- Add Emoji: If enabled, the response text from GPT will contain emoji characters appropriate for the content. This is realized by adding the sentence "Add emojis that are appropriate to the content of the response." at the end of the system content. (default: enabled)
- System Content: Text sent with every query to the API as general information about the specification of the chat. The default value is as follows: "You are a friendly but professional consultant who answers various questions, make decent suggestions, and give helpful advice in response to a prompt from the user. Your response must be consise, suggestive, and accurate."
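To make these settings concrete, here is roughly the request body they map onto; the endpoint and field names are the public OpenAI Chat Completions API, with the workflow defaults filled in (the system content is abbreviated, and the workflow's actual request may differ in detail):

```sh
# A chat request using the workflow's default parameter values.
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "system", "content": "You are a friendly but professional consultant..."},
      {"role": "user", "content": "What is a large language model?"}
    ],
    "max_tokens": 2048,
    "temperature": 0.3,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0
  }'
```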
Image Understanding Parameters
- Max Size for Image Understanding: The maximum pixel value (512 to 2000) of the longer side of the image data sent to the image understanding API. Larger images will be resized accordingly. (default: 512)
Image Generation Parameters
- Image Generation Model: dall-e-3 and dall-e-2 are available. (default: dall-e-3)
- Image Size (for dall-e-3): Set the size of images to generate from 1024x1024, 1024x1792, or 1792x1024. (default: 1024x1024)
- Quality (for dall-e-3): Choose the quality of the image from standard and hd. (default: standard)
- Style (for dall-e-3): Choose the style of the image from vivid and natural. (default: vivid)
- Number of Images (for dall-e-2): Set the number of images to generate in image generation mode, from 1 to 10. (default: 1)
- Image Size (for dall-e-2): Set the size of images to generate from 256x256, 512x512, or 1024x1024. (default: 256x256)
Text-to-Speech Parameters
- Text-to-Speech Model: One of the available TTS models: tts-1 or tts-1-hd. (default: tts-1)
- Text-to-Speech Voice: The voice to use when generating the audio. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. (default: alloy)
- Text-to-Speech Speed: The speed of the generated audio. Select a value from 0.25 to 4.0. (default: 1.0)
- Automatic Text to Speech: If enabled, the results will be read aloud using the system's default text-to-speech language and voice. (default: disabled)
Speech-to-Text Parameters
- Transcription Format: Set the format of the text transcribed from the microphone input or audio files from text, srt, or vtt. (default: text)
- Processes after Recording: Set the default choice of which processes follow after audio recording finishes (default: transcribe [+ delete recording]):
  - Transcribe [+ delete recording]
  - Transcribe [+ save recording to desktop]
  - Transcribe and query [+ delete recording]
  - Transcribe and query [+ save recording to desktop]
- Audio to English: When enabled, the Whisper API will transcribe the input audio and output text translated into English. (default: disabled)
Other Settings
- Your First Language: Set your first language. This language is used when using GPT for translation. (default: English)
- Your Second Language: Set your second language. This language is used when using GPT for translation. (default: Japanese)
- Sound: If checked, a notification sound will play when the response is returned. (default: disabled)
- Save File Path: If set, the results will be saved in the specified path as a Markdown file. (default: not set)
Environment Variables
Environment variables can be accessed by clicking the [x] button located at the top right of the workflow settings screen. Normally, there is no need to change their values.
- http_keep_alive: This workflow starts an HTTP server when the web UI is first displayed. If the web UI is then not used for the time (in seconds) set by this variable, the server will stop. (default: 7200 = 2 hours)
- http_port: Specifies the port number for the web UI. (default: 80)
- http_server_wait: Specifies the wait time (in seconds) from when the HTTP server is started until the page is displayed in the browser. (default: 2.5)
- websocket_port: Specifies the port number for the WebSocket communication used to stream responses on the web UI. (default: 8080)
Yoichiro Hasebe ([email protected])
The MIT License
The author assumes no responsibility for any potential damages arising from the use of this software.