ChatGPT-OpenAI-Smart-Speaker

This AI Smart Speaker uses speech recognition and text-to-speech to enable voice-driven conversations and vision capabilities with OpenAI and Agents. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.

Stars: 188

Visit

ChatGPT Smart Speaker is a project that enables speech recognition and text-to-speech functionalities using OpenAI and Google Speech Recognition. It provides scripts for running on PC/Mac and Raspberry Pi, allowing users to interact with a smart speaker setup. The project includes detailed instructions for setting up the required hardware and software dependencies, along with customization options for the OpenAI model engine, language settings, and response randomness control. The Raspberry Pi setup involves utilizing the ReSpeaker hardware for voice feedback and light shows. The project aims to offer an advanced smart speaker experience with features like wake word detection and response generation using AI models.

README:

ChatGPT Smart Speaker (speech recognition and text-to-speech using OpenAI and Google Speech Recognition)

Video Demos

Video Demo using activation word "Jeffers"

Video Demo with Vision

Equipment List:

- Raspberry Pi 4b 4GB

- VMini External USB Stereo Speaker

- VReSpeaker 4-Mic Array

- ANSMANN 10,000mAh Type-C 20W PD Power Bank

Running on your PC/Mac (use the chat.py or test.py script)

The chat.py and test.py scripts run directly on your PC/Mac. They both allow you to use speech recognition to input a prompt, send the prompt to OpenAI to generate a response, and then use gTTS to convert the response to an audio file and play the audio file on your Mac/PC. Your PC/Mac must have a working default microphone and speakers for this script to work. Please note that these scripts were designed on a Mac, so additional dependencies may be required on Windows and Linux. The difference between them is that chat.py is faster and always on and test.py acts like a standard smart speaker - only working once it hears the activation command (currently set to 'Jeffers').

Running on Raspberry Pi (use the pi.py script)

The pi.py script is a new and more advanced custom version of the smart_speaker.py script and is the most advanced script similar to a real smart speaker. The purpose of this script is to offload the wake up word to a custom model build via PicoVoice (https://console.picovoice.ai/). This improves efficiency and long term usage reliability. This script will be the main script for development moving forward due to greater reliability and more advanced features to be added regularly.

Prerequisites - chat.py

You need to have a valid OpenAI API key. You can sign up for a free API key at https://platform.openai.com.
You'll need to be running Python version 3.7.3 or higher. I am using 3.11.4 on a Mac and 3.7.3 on Raspberry Pi.
Run brew install portaudio after installing HomeBrew: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
You need to install the following packages: openai, gTTS, pyaudio, SpeechRecognition, playsound, python-dotenv and pyobjc if you are on a Mac. You can install these packages using pip or use pipenv if you wish to contain a virtual environment.
Firstly, update your tools: pip install --upgrade pip setuptools then pip install openai pyaudio SpeechRecognition gTTS playsound python-dotenv apa102-pi gpiozero pyobjc

Prerequisites - pi.py

To run pi.py you will need a Raspberry Pi 4b (I'm using the 4GB model but 2GB should be enough), ReSpeaker 4-Mic Array for Raspberry Pi and USB speakers.

You will also need a developer account and API key with OpenAI (https://platform.openai.com/overview), a Tavily Search agent API key (https://app.tavily.com/sign-in) and an Access Key and Custom Voice Model with PicoVoice (https://console.picovoice.ai/) and (https://console.picovoice.ai/ppn respectively. Please create your own voice model and download the correct version for use on a Raspberry Pi)

Now on to the Pi setup. Let's get started!

Run the following on your Raspberry Pi terminal:

sudo apt update
sudo apt install python3-gpiozero
git clone https://github.com/Olney1/ChatGPT-OpenAI-Smart-Speaker
Firstly, update your tools: pip install --upgrade pip setuptools then pip install openai pyaudio SpeechRecognition gTTS pydub python-dotenv apa102-pi gpiozero Next, install the dependencies, pip install -r requirements.txt. I am using Python 3.9 #!/usr/bin/env python3.9. You can install these packages using pip or use pipenv if you wish to contain a virtual environment.
PyAudio relies on PortAudio as a dependency. You can install it using the following command: sudo apt-get install portaudio19-dev
Pydub dependencies: You need to have ffmpeg installed on your system. On a Raspberry Pi you can install it using: sudo apt-get install ffmpeg. You may also need simpleaudio if you run into issues with the script hanging when finding the wake word, so it's best to install these packages just in case: sudo apt-get install python3-dev (for development headers to compile) and install simpleaudio (for a different backend to play mp3 files) and sudo apt-get install libasound2-dev (necessary dependencies).
If you are using the RESPEAKER, follow this guide to install the required dependencies: (https://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspberry_Pi/#getting-started). Then install support for the lights on the RESPEAKER board. You'll need APA102 LED: sudo apt install -y python3-rpi.gpio and then sudo pip3 install apa102-pi.
Activate SPI: sudo raspi-config; Go to "Interface Options"; Go to "SPI"; Enable SPI; While you are at it: Do change the default password! Exit the tool and reboot.
Get the Seeed voice card source code, install and reboot: git clone https://github.com/HinTak/seeed-voicecard.git cd seeed-voicecard sudo ./install.sh sudo reboot now
Finally, load audio output on Raspberry Pi sudo raspi-config -Select 1 System options -Select S2 Audio -Select your preferred Audio output device -Select Finish

Usage - applies to chat.py:

You'll need to set up the environment variables for your Open API Key. To do this create a .env file in the same directory and add your API Key to the file like this: OPENAI_API_KEY="API KEY GOES HERE". This is safer than hard coding your API key into the program. You must not change the name of the variable OPENAI_API_KEY.
Run the script using python chat.py.
The script will prompt you to say something. Speak a sentence into your microphone. You may need to allow the program permission to access your microphone on a Mac, a prompt should appear when running the program.
The script will send the spoken sentence to OpenAI, generate a response using the text-to-speech model, and play the response as an audio file.

Usage - applies to pi.py

You'll need to set up the environment variables for your Open API Key, PicoVoice Access Key and Tavily API key for agent searches. To do this create a .env file in the same directory and add your API Keys to the file like this: OPENAI_API_KEY="API KEY GOES HERE" and ACCESS_KEY="PICOVOICE ACCESS KEY GOES HERE" and TAVILY_API_KEY="API KEY GOES HERE". This is safer than hard coding your API key into the program.
Ensure that you have the pi.py script along with apa102.py and alexa_led_pattern.py scripts in the same folder saved on your Pi if using ReSpeaker.
Run the script using python3 pi.py or python3 pi.py 2> /dev/null on the Raspberry Pi. The second option omits all developer warnings and errors to keep the console focused purely on the print statements.
The script will prompt you to say the wake word which is programmed into the wake word custom model by Picovoice as 'Jeffers'. You can change this to any name you want. Once the wake word has been detected the lights will light up blue. It will now be ready for you to ask your question. When you have asked your question, or when the microphone picks up and processes noise, the lights will rotate a blue colour meaning that your recording sample/question is being sent to OpenAI.
The script will then generate a response using the text-to-speech model, and play the response as an audio file.

Customisation

You can change the OpenAI model engine by modifying the value of model_engine. For example, to use the "gpt-3.5-turbo" model for a cheaper and quicker response but with a knowledge cut-off to Sep 2021, set model_engine = "gpt-3.5-turbo".
You can change the language of the generated audio file by modifying the value of language. For example, to generate audio in French, set language = 'fr'.
You can adjust the temperature parameter in the following line to control the randomness of the generated response:

response = client.chat.completions.create(
        model=model_engine,
        messages=[{"role": "system", "content": "You are a helpful smart speaker called Jeffers!"}, # Play about with more context here.
                  {"role": "user", "content": prompt}],
        max_tokens=1024,
        n=1,
        temperature=0.7,
    )
    return response

Higher values of temperature will result in more diverse and random responses, while lower values will result in more deterministic responses.

Important notes for Raspberry Pi Installation

If you are using the same USB speaker in my video you will need to run sudo apt-get install pulseaudio to install support for this. This may also require you to set a command to start pulseaudio on every boot: pulseaudio --start.

Adding a Start Command on Boot

Open the terminal and type: sudo nano /etc/rc. local

After important network/start commands add this: su -l pi -c '/usr/bin/python3 /home/pi/ChatGPT-OpenAI-Smart-Speaker/ && pulseaudio --start && python3 pi.py 2> /dev/null’

Be sure to leave the line exit 0 at the end, then save the file and exit. In nano, to exit, type Ctrl-x, and then Y

ReSpeaker

If you want to use ReSpeaker for the lights, you can purchase this from most of the major online stores that stock Raspberry Pi. Here is the online guide: https://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspberry_Pi/

To test your microphone and speakers install Audacity on your Raspberry Pi:

sudo apt update

sudo apt install audacity

audacity

Other Possible Issues

On the raspberry pi you may encounter an error regarding the installation of flac.

See here for the resolution: https://raspberrypi.stackexchange.com/questions/137630/im-unable-to-install-flac-on-my-raspberry-pi-3

The files you will need are going to be here: https://archive.raspbian.org/raspbian/pool/main/f/flac/
Please note the links below may have changed or be updated, so please refer back to this link above for the latest file names and then update your command below.

sudo apt-get install libogg0

$ wget https://archive.raspbian.org/raspbian/pool/main/f/flac/libflac8_1.3.2-3+deb10u3_armhf.deb

$ wget https://archive.raspbian.org/raspbian/pool/main/f/flac/flac_1.3.2-3+deb10u3_armhf.deb

$ sudo dpkg -i libflac8_1.3.2-3+deb10u3_armhf.deb

$ sudo dpkg -i flac_1.3.2-3+deb10u3_armhf.deb

$ which flac /usr/bin/flac

sudo reboot

$ flac --version flac 1.3.2

You may find you need to install GStreamer if you encounter errors regarding Gst.

Install GStreamer: Open a terminal and run the following command to install GStreamer and its base plugins:

sudo apt-get install gstreamer1.0-tools gstreamer1.0-plugins-base gstreamer1.0-plugins-good This installs the GStreamer core, along with a set of essential and good-quality plugins.

Next, you need to install the Python bindings for GStreamer. Use this command:

sudo apt-get install python3-gst-1.0 This command installs the GStreamer bindings for Python 3.

Install Additional GStreamer Plugins (if needed): Depending on the audio formats you need to work with, you might need additional GStreamer plugins. For example, to install plugins for MP3 playback, use:

sudo apt-get install gstreamer1.0-plugins-ugly

To quit a running script on Pi from boot: ALT + PrtScSysRq (or Print button) + K

Credit to:

https://github.com/tinue/apa102-pi & Seeed Technology Limited for supplementary code.

For Tasks:

Click tags to check more tools for each tasks

generate responses interact with users detect wake word play audio files control lights

For Jobs:

software developer ai engineer hardware engineer voice technology specialist system administrator

Alternative AI tools for ChatGPT-OpenAI-Smart-Speaker

Similar Open Source Tools

ChatGPT-OpenAI-Smart-Speaker

github

: 188

seer

Seer is a service that provides AI capabilities to Sentry by running inference on Sentry issues and providing user insights. It is currently in early development and not yet compatible with self-hosted Sentry instances. The tool requires access to internal Sentry resources and is intended for internal Sentry employees. Users can set up the environment, download model artifacts, integrate with local Sentry, run evaluations for Autofix AI agent, and deploy to a sandbox staging environment. Development commands include applying database migrations, creating new migrations, running tests, and more. The tool also supports VCRs for recording and replaying HTTP requests.

github

: 87

LLM_AppDev-HandsOn

This repository showcases how to build a simple LLM-based chatbot for answering questions based on documents using retrieval augmented generation (RAG) technique. It also provides guidance on deploying the chatbot using Podman or on the OpenShift Container Platform. The workshop associated with this repository introduces participants to LLMs & RAG concepts and demonstrates how to customize the chatbot for specific purposes. The software stack relies on open-source tools like streamlit, LlamaIndex, and local open LLMs via Ollama, making it accessible for GPU-constrained environments.

github

: 383

ai-town

AI Town is a virtual town where AI characters live, chat, and socialize. This project provides a deployable starter kit for building and customizing your own version of AI Town. It features a game engine, database, vector search, auth, text model, deployment, pixel art generation, background music generation, and local inference. You can customize your own simulation by creating characters and stories, updating spritesheets, changing the background, and modifying the background music.

github

: 6.3k

aides-jeunes

The user interface (and the main server) of the simulator of aids and social benefits for young people. It is based on the free socio-fiscal simulator Openfisca.

github

: 79

opencommit

OpenCommit is a tool that auto-generates meaningful commits using AI, allowing users to quickly create commit messages for their staged changes. It provides a CLI interface for easy usage and supports customization of commit descriptions, emojis, and AI models. Users can configure local and global settings, switch between different AI providers, and set up Git hooks for integration with IDE Source Control. Additionally, OpenCommit can be used as a GitHub Action to automatically improve commit messages on push events, ensuring all commits are meaningful and not generic. Payments for OpenAI API requests are handled by the user, with the tool storing API keys locally.

github

: 5.9k

Open-LLM-VTuber

Open-LLM-VTuber is a project in early stages of development that allows users to interact with Large Language Models (LLM) using voice commands and receive responses through a Live2D talking face. The project aims to provide a minimum viable prototype for offline use on macOS, Linux, and Windows, with features like long-term memory using MemGPT, customizable LLM backends, speech recognition, and text-to-speech providers. Users can configure the project to chat with LLMs, choose different backend services, and utilize Live2D models for visual representation. The project supports perpetual chat, offline operation, and GPU acceleration on macOS, addressing limitations of existing solutions on macOS.

github

: 1.9k

temporal-ai-agent

Temporal AI Agent is a demo showcasing a multi-turn conversation with an AI agent running inside a Temporal workflow. The agent collects information towards a goal using a simple DSL input. It is currently set up to search for events, book flights around those events, and create an invoice for those flights. The AI agent responds with clarifications and prompts for missing information. Users can configure the agent to use ChatGPT 4o or a local LLM via Ollama. The tool requires Rapidapi key for sky-scrapper to find flights and a Stripe key for creating invoices. Users can customize the agent by modifying tool and goal definitions in the codebase.

github

: 64

unstructured

The `unstructured` library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of `unstructured` revolve around streamlining and optimizing the data processing workflow for LLMs. `unstructured` modular functions and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient in transforming unstructured data into structured outputs.

github

: 10.5k

pyrfuniverse

pyrfuniverse is a python package used to interact with RFUniverse simulation environment. It is developed with reference to ML-Agents and produce new features. The package allows users to work with RFUniverse for simulation purposes, providing tools and functionalities to interact with the environment and create new features.

github

: 56

vectorflow

VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. It provides a simple API endpoint for ingesting large volumes of raw data, processing, and storing or returning the vectors quickly and reliably. The tool supports text-based files like TXT, PDF, HTML, and DOCX, and can be run locally with Kubernetes in production. VectorFlow offers functionalities like embedding documents, running chunking schemas, custom chunking, and integrating with vector databases like Pinecone, Qdrant, and Weaviate. It enforces a standardized schema for uploading data to a vector store and supports features like raw embeddings webhook, chunk validation webhook, S3 endpoint, and telemetry. The tool can be used with the Python client and provides detailed instructions for running and testing the functionalities.

github

: 639

TypeGPT

TypeGPT is a Python application that enables users to interact with ChatGPT or Google Gemini from any text field in their operating system using keyboard shortcuts. It provides global accessibility, keyboard shortcuts for communication, and clipboard integration for larger text inputs. Users need to have Python 3.x installed along with specific packages and API keys from OpenAI for ChatGPT access. The tool allows users to run the program normally or in the background, manage processes, and stop the program. Users can use keyboard shortcuts like `/ask`, `/see`, `/stop`, `/chatgpt`, `/gemini`, `/check`, and `Shift + Cmd + Enter` to interact with the application in any text field. Customization options are available by modifying files like `keys.txt` and `system_prompt.txt`. Contributions are welcome, and future plans include adding support for other APIs and a user-friendly GUI.

github

: 135

qb

QANTA is a system and dataset for question answering tasks. It provides a script to download datasets, preprocesses questions, and matches them with Wikipedia pages. The system includes various datasets, training, dev, and test data in JSON and SQLite formats. Dependencies include Python 3.6, `click`, and NLTK models. Elastic Search 5.6 is needed for the Guesser component. Configuration is managed through environment variables and YAML files. QANTA supports multiple guesser implementations that can be enabled/disabled. Running QANTA involves using `cli.py` and Luigi pipelines. The system accesses raw Wikipedia dumps for data processing. The QANTA ID numbering scheme categorizes datasets based on events and competitions.

github

: 167

dravid

Dravid (DRD) is an advanced, AI-powered CLI coding framework designed to follow user instructions until the job is completed, including fixing errors. It can generate code, fix errors, handle image queries, manage file operations, integrate with external APIs, and provide a development server with error handling. Dravid is extensible and requires Python 3.7+ and CLAUDE_API_KEY. Users can interact with Dravid through CLI commands for various tasks like creating projects, asking questions, generating content, handling metadata, and file-specific queries. It supports use cases like Next.js project development, working with existing projects, exploring new languages, Ruby on Rails project development, and Python project development. Dravid's project structure includes directories for source code, CLI modules, API interaction, utility functions, AI prompt templates, metadata management, and tests. Contributions are welcome, and development setup involves cloning the repository, installing dependencies with Poetry, setting up environment variables, and using Dravid for project enhancements.

github

: 114

SiriLLama

Siri LLama is an Apple shortcut that allows users to access locally running LLMs through Siri or the shortcut UI on any Apple device connected to the same network as the host machine. It utilizes Langchain and supports open source models from Ollama or Fireworks AI. Users can easily set up and configure the tool to interact with various language models for chat and multimodal tasks. The tool provides a convenient way to leverage the power of language models through Siri or the shortcut interface, enhancing user experience and productivity.

github

: 146

redbox

Redbox is a retrieval augmented generation (RAG) app that uses GenAI to chat with and summarise civil service documents. It increases organisational memory by indexing documents and can summarise reports read months ago, supplement them with current work, and produce a first draft that lets civil servants focus on what they do best. The project uses a microservice architecture with each microservice running in its own container defined by a Dockerfile. Dependencies are managed using Python Poetry. Contributions are welcome, and the project is licensed under the MIT License. Security measures are in place to ensure user data privacy and considerations are being made to make the core-api secure.

github

: 111

For similar tasks

reolink_aio

The 'reolink_aio' Python package is designed to integrate Reolink devices (NVR/cameras) into your application. It implements Reolink IP NVR and camera API, allowing users to subscribe to Reolink ONVIF SWN events for real-time event notifications via webhook. The package provides functionalities to obtain and cache NVR or camera settings, capabilities, and states, as well as enable features like infrared lights, spotlight, and siren. Users can also subscribe to events, renew timers, and disconnect from the host device.

github

: 94

ChatGPT-OpenAI-Smart-Speaker

github

: 188

aiohue

Aiohue is an asynchronous library designed to control Philips Hue lights. It requires Python 3.10+ and utilizes asyncio and aiohttp. The library supports both V1 and V2 APIs of the Hue Bridge, with V2 API offering event-based updates to eliminate the need for polling. The contribution guidelines emphasize matching object hierarchy and property/method names with the Philips Hue API.

github

: 57

aioshelly

Aioshelly is an asynchronous library designed to control Shelly devices. It is currently under development and requires Python version 3.11 or higher, along with dependencies like bluetooth-data-tools, aiohttp, and orjson. The library provides examples for interacting with Gen1 devices using CoAP protocol and Gen2/Gen3 devices using RPC and WebSocket protocols. Users can easily connect to Shelly devices, retrieve status information, and perform various actions through the provided APIs. The repository also includes example scripts for quick testing and usage guidelines for contributors to maintain consistency with the Shelly API.

github

: 51

semantic-router

Semantic Router is a superfast decision-making layer for your LLMs and agents. Rather than waiting for slow LLM generations to make tool-use decisions, we use the magic of semantic vector space to make those decisions — _routing_ our requests using _semantic_ meaning.

github

: 2.5k

hass-ollama-conversation

The Ollama Conversation integration adds a conversation agent powered by Ollama in Home Assistant. This agent can be used in automations to query information provided by Home Assistant about your house, including areas, devices, and their states. Users can install the integration via HACS and configure settings such as API timeout, model selection, context size, maximum tokens, and other parameters to fine-tune the responses generated by the AI language model. Contributions to the project are welcome, and discussions can be held on the Home Assistant Community platform.

github

: 113

luna-ai

Luna AI is a virtual streamer driven by a 'brain' composed of ChatterBot, GPT, Claude, langchain, chatglm, text-generation-webui, 讯飞星火, 智谱AI. It can interact with viewers in real-time during live streams on platforms like Bilibili, Douyin, Kuaishou, Douyu, or chat with you locally. Luna AI uses natural language processing and text-to-speech technologies like Edge-TTS, VITS-Fast, elevenlabs, bark-gui, VALL-E-X to generate responses to viewer questions and can change voice using so-vits-svc, DDSP-SVC. It can also collaborate with Stable Diffusion for drawing displays and loop custom texts. This project is completely free, and any identical copycat selling programs are pirated, please stop them promptly.

github

: 1.2k

KULLM

KULLM (구름) is a Korean Large Language Model developed by Korea University NLP & AI Lab and HIAI Research Institute. It is based on the upstage/SOLAR-10.7B-v1.0 model and has been fine-tuned for instruction. The model has been trained on 8×A100 GPUs and is capable of generating responses in Korean language. KULLM exhibits hallucination and repetition phenomena due to its decoding strategy. Users should be cautious as the model may produce inaccurate or harmful results. Performance may vary in benchmarks without a fixed system prompt.

github

: 527

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

github

: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

github

: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

github

: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

github

: 620

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

github

: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

github

: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

github

: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

github

: 2.2k

ChatGPT-OpenAI-Smart-Speaker

README:

ChatGPT Smart Speaker (speech recognition and text-to-speech using OpenAI and Google Speech Recognition)

Video Demos

Equipment List:

- Raspberry Pi 4b 4GB

- VMini External USB Stereo Speaker

- VReSpeaker 4-Mic Array

- ANSMANN 10,000mAh Type-C 20W PD Power Bank

Running on your PC/Mac (use the chat.py or test.py script)

Running on Raspberry Pi (use the pi.py script)

Prerequisites - chat.py

Prerequisites - pi.py

Usage - applies to chat.py:

Usage - applies to pi.py

Customisation

Important notes for Raspberry Pi Installation

Adding a Start Command on Boot

ReSpeaker

Other Possible Issues

Credit to:

Read more about what is next for the project

For Tasks:

For Jobs:

Alternative AI tools for ChatGPT-OpenAI-Smart-Speaker

Similar Open Source Tools

ChatGPT-OpenAI-Smart-Speaker

seer

LLM_AppDev-HandsOn

ai-town

aides-jeunes

opencommit

Open-LLM-VTuber

temporal-ai-agent

unstructured

pyrfuniverse

vectorflow

TypeGPT

qb

dravid

SiriLLama

redbox

For similar tasks

reolink_aio

ChatGPT-OpenAI-Smart-Speaker

aiohue

aioshelly

semantic-router

hass-ollama-conversation

luna-ai

KULLM

For similar jobs

sweep

teams-ai

ai-guide

classifai

chatbot-ui

BricksLLM

uAgents

griptape