efficient-recorder
Attempt to create an Open Source Privacy Focused Rewind.ai Alternative for data capture
Stars: 148
Efficient Recorder is a battery-life friendly tool designed to stream video, screen, mic, and system audio to any S3-compatible cloud storage service. It captures audio, screenshots, and webcam photos at configurable fps, utilizing low-energy volume detection for audio recording. The tool streams data to a configurable S3 endpoint or a custom server using MinIO. It aims to be storage and battery efficient, providing queued upload processing and minimal system resource overhead. The tool requires SoX for audio recording and webcam capture tools for operation. Users can specify various command line options for customization, such as enabling screenshot and webcam capture with specific intervals and image quality settings.
README:
Today I was inspired by this tweet after which I fell into this rabbit hole to create an efficient recorder for your Screen, System audio, and Mic.
After trying to implement this using Claude in Swift I completely failed (see folder swift-version) so I decided to create a simpler version (also using Claude) in Node.js
Goal: Create the most battery-life friendly recorder to stream video/screen/mic/system-audio to any S3-compatible cloud-storage-service of choice, open source.
- Captures Mic audio, screenshots, and webcam photos at configurable fps
- For audio, uses low-energy volume detection to determine what to record
- Streams all data to configurable S3 endpoint (Can also use your own server using MinIO)
I really want this to exist, but won't pursue it for now. This repo serves as an example. Open to contributors! DM me
After the hackernews thread, I saw there are a couple other similar projects in various levels of completion (but none focuses on battery):
- Screenpipe https://github.com/mediar-ai/screenpipe
- Memento https://github.com/apirrone/Memento
- Rem https://github.com/jasonjmcghee/rem
Update: Also, I made a spin-off from this called yaptabber - it only records when you talk
Update January 10, 2025: I made a ip-camera streamer and this is super storage and battery efficient: https://github.com/janwilmake/ip-camera-to-s3-macos
This package requires:
-
SoX (Sound eXchange) for audio recording
- Linux:
sudo apt-get install sox libsox-fmt-all - MacOS:
brew install sox - Windows: Download from SourceForge
- Linux:
-
Webcam capture tools
- Ubuntu:
sudo apt-get install fswebcam - Arch Linux:
sudo pamac build fswebcam - MacOS:
brew install imagesnap - Windows: Standalone exe included in node-webcam
- Ubuntu:
You'll need:
- S3-compatible storage endpoint
- Access key
- Secret key
- A bucket named "recordings" (or modify the code to use a different bucket name)
Run the recorder using npx:
npx efficient-recorder \
--endpoint YOUR_S3_ENDPOINT \
--key YOUR_ACCESS_KEY \
--secret YOUR_SECRET_KEY \
--enable-screenshot \
--screenshot-interval 5000 \
--enable-webcam \
--webcam-interval 3000 \
--image-quality 80-
--endpoint: Your S3-compatible storage endpoint URL -
--key: Your AWS/S3 access key -
--secret: Your AWS/S3 secret key -
--enable-screenshot: Enable screenshot capture -
--screenshot-interval: Interval between screenshots (ms) -
--enable-webcam: Enable webcam capture -
--webcam-interval: Interval between webcam captures (ms) -
--webcam-device: Specify webcam device (optional) -
--image-quality: Image quality for webcam/screenshots (1-100)
-
Audio Monitoring
- Continuous low-quality audio monitoring
- Switches to high-quality recording when speech is detected
- Automatic recording start and stop based on sound levels
-
Screenshot Capture
- Captures screenshots at specified intervals
- Immediate upload to S3
- Configurable capture frequency
-
Webcam Capture
- Captures webcam images at specified intervals
- Supports multiple webcam devices
- Immediate upload to S3
-
Efficient Upload
- Queued upload processing
- Concurrent uploads with multi-part support
- Minimal system resource overhead
Audio
- Monitoring: 8kHz, Mono
- Recording: 44.1kHz, Stereo
- Format: WAV (16-bit PCM)
Screenshots
- Captured at system screen resolution
- Uploaded as PNG
Webcam
- Resolution: 1280x720
- Format: JPEG
- Configurable quality
-
"Command not found: rec"
- Ensure SoX is installed correctly
- Verify SoX is in your system PATH
-
S3 Upload Issues
- Check S3 credentials
- Verify bucket exists and write permissions are granted
-
No Audio/Video Input
- Check system input devices
- Verify microphone and webcam permissions
MIT License
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for efficient-recorder
Similar Open Source Tools
efficient-recorder
Efficient Recorder is a battery-life friendly tool designed to stream video, screen, mic, and system audio to any S3-compatible cloud storage service. It captures audio, screenshots, and webcam photos at configurable fps, utilizing low-energy volume detection for audio recording. The tool streams data to a configurable S3 endpoint or a custom server using MinIO. It aims to be storage and battery efficient, providing queued upload processing and minimal system resource overhead. The tool requires SoX for audio recording and webcam capture tools for operation. Users can specify various command line options for customization, such as enabling screenshot and webcam capture with specific intervals and image quality settings.
voice-pro
Voice-Pro is an integrated solution for subtitles, translation, and TTS. It offers features like multilingual subtitles, live translation, vocal remover, and supports OpenAI Whisper and Open-Source Translator. The tool provides a Studio tab for various functions, Whisper Caption tab for subtitle creation, Translate tab for translation, TTS tab for text-to-speech, Live Translation tab for real-time voice recognition, and Batch tab for processing multiple files. Users can download YouTube videos, improve voice recognition accuracy, create automatic subtitles, and produce multilingual videos with ease. The tool is easy to install with one-click and offers a Web-UI for user convenience.
FunGen-AI-Powered-Funscript-Generator
FunGen is a Python-based tool that uses AI to generate Funscript files from VR and 2D POV videos. It enables fully automated funscript creation for individual scenes or entire folders of videos. The tool includes features like automatic system scaling support, quick installation guides for Windows, Linux, and macOS, manual installation instructions, NVIDIA GPU setup, AMD GPU acceleration, YOLO model download, GUI settings, GitHub token setup, command-line usage, modular systems for funscript filtering and motion tracking, performance and parallel processing tips, and more. The project is still in early development stages and is not intended for commercial use.
easydiffusion
Easy Diffusion 3.0 is a user-friendly tool for installing and using Stable Diffusion on your computer. It offers hassle-free installation, clutter-free UI, task queue, intelligent model detection, live preview, image modifiers, multiple prompts file, saving generated images, UI themes, searchable models dropdown, and supports various image generation tasks like 'Text to Image', 'Image to Image', and 'InPainting'. The tool also provides advanced features such as custom models, merge models, custom VAE models, multi-GPU support, auto-updater, developer console, and more. It is designed for both new users and advanced users looking for powerful AI image generation capabilities.
swift-chat
SwiftChat is a fast and responsive AI chat application developed with React Native and powered by Amazon Bedrock. It offers real-time streaming conversations, AI image generation, multimodal support, conversation history management, and cross-platform compatibility across Android, iOS, and macOS. The app supports multiple AI models like Amazon Bedrock, Ollama, DeepSeek, and OpenAI, and features a customizable system prompt assistant. With a minimalist design philosophy and robust privacy protection, SwiftChat delivers a seamless chat experience with various features like rich Markdown support, comprehensive multimodal analysis, creative image suite, and quick access tools. The app prioritizes speed in launch, request, render, and storage, ensuring a fast and efficient user experience. SwiftChat also emphasizes app privacy and security by encrypting API key storage, minimal permission requirements, local-only data storage, and a privacy-first approach.
whispering-ui
Whispering Tiger UI is a Native-UI tool designed to control the Whispering Tiger application, a free and Open-Source tool that can listen/watch to audio streams or in-game images on your machine and provide transcription or translation to a web browser using Websockets or over OSC. It features a Native-UI for Windows, easy access to all Whispering Tiger features including transcription, translation, text-to-speech, and in-game image recognition. The tool supports loopback audio device, configuration saving/loading, plugin support for additional features, and auto-update functionality. Users can create profiles, configure audio devices, select A.I. devices for speech-to-text, and install/manage plugins for extended functionality.
VibeSurf
VibeSurf is an open-source AI agentic browser that combines workflow automation with intelligent AI agents, offering faster, cheaper, and smarter browser automation. It allows users to create revolutionary browser workflows, run multiple AI agents in parallel, perform intelligent AI automation tasks, maintain privacy with local LLM support, and seamlessly integrate as a Chrome extension. Users can save on token costs, achieve efficiency gains, and enjoy deterministic workflows for consistent and accurate results. VibeSurf also provides a Docker image for easy deployment and offers pre-built workflow templates for common tasks.
LLM-Hub
LLM Hub is an open-source Android app optimized for mobile usage, supporting multiple model formats for on-device LLM chat and image generation. It offers six AI tools including chat, writing aid, image generator, translator, transcriber, and scam detector. Privacy-first with on-device processing and zero data collection. Advanced capabilities include GPU/NPU acceleration, text-to-speech, RAG with global memory, and custom model import. Developed using Kotlin + Jetpack Compose, LLM Runtime, and various model runtimes.
Alice
Alice is an open-source AI companion designed to live on your desktop, providing voice interaction, intelligent context awareness, and powerful tooling. More than a chatbot, Alice is emotionally engaging and deeply useful, assisting with daily tasks and creative work. Key features include voice interaction with natural-sounding responses, memory and context management, vision and visual output capabilities, computer use tools, function calling for web search and task scheduling, wake word support, dedicated Chrome extension, and flexible settings interface. Technologies used include Vue.js, Electron, OpenAI, Go, hnswlib-node, and more. Alice is customizable and offers a dedicated Chrome extension, wake word support, and various tools for computer use and productivity tasks.
CodeNomad
CodeNomad is a fast, multi-instance workspace designed for users who spend extended hours in OpenCode. It provides a premium, low-latency environment with features like managing multiple OpenCode sessions side-by-side, global command palette for keyboard-first control, rich media previews, and browser support via CodeNomad Server. Users can choose between a Desktop App (Electron-based) with global shortcuts and deeper system integration, a Tauri App for lightweight high-performance experience, or run CodeNomad as a local server accessed via web browser. The tool supports multi-instance workspace, long-session native scrolling, command palette for easy navigation, and deep task awareness to monitor background tasks and child sessions without interruptions.
edgen
Edgen is a local GenAI API server that serves as a drop-in replacement for OpenAI's API. It provides multi-endpoint support for chat completions and speech-to-text, is model agnostic, offers optimized inference, and features model caching. Built in Rust, Edgen is natively compiled for Windows, MacOS, and Linux, eliminating the need for Docker. It allows users to utilize GenAI locally on their devices for free and with data privacy. With features like session caching, GPU support, and support for various endpoints, Edgen offers a scalable, reliable, and cost-effective solution for running GenAI applications locally.
note-companion
Note Companion is an AI-powered Obsidian plugin that automatically organizes and formats notes. It provides organizing suggestions, custom format AI prompts, automated workflows, handwritten note digitization, audio transcription, atomic note generation, YouTube summaries, and context-aware AI chat. Key use cases include smart vault management, handwritten notes digitization, and intelligent meeting notes. The tool offers advanced features like custom AI templates and multi-modal support for processing various content types. Users can seamlessly integrate with mobile workflows and utilize iOS shortcuts for sending Apple Notes to Obsidian. Note Companion enhances productivity by streamlining note organization and management tasks with AI assistance.
Apt
Apt. is a free and open-source AI productivity tool designed to enhance user productivity while ensuring privacy and data security. It offers efficient AI solutions such as built-in ChatGPT, batch image and video processing, and more. Key features include free and open-source code, privacy protection through local deployment, offline operation, no installation needed, and multi-language support. Integrated AI models cover ChatGPT for intelligent conversations, image processing features like super-resolution and color restoration, and video processing capabilities including super-resolution and frame interpolation. Future plans include integrating more AI models. The tool provides user guides and technical support via email and various platforms, with a user-friendly interface for easy navigation.
obsidian-systemsculpt-ai
SystemSculpt AI is a comprehensive AI-powered plugin for Obsidian, integrating advanced AI capabilities into note-taking, task management, knowledge organization, and content creation. It offers modules for brain integration, chat conversations, audio recording and transcription, note templates, and task generation and management. Users can customize settings, utilize AI services like OpenAI and Groq, and access documentation for detailed guidance. The plugin prioritizes data privacy by storing sensitive information locally and offering the option to use local AI models for enhanced privacy.
VideoLingo
VideoLingo is an all-in-one video translation and localization dubbing tool designed to generate Netflix-level high-quality subtitles. It aims to eliminate stiff machine translation, multiple lines of subtitles, and can even add high-quality dubbing, allowing knowledge from around the world to be shared across language barriers. Through an intuitive Streamlit web interface, the entire process from video link to embedded high-quality bilingual subtitles and even dubbing can be completed with just two clicks, easily creating Netflix-quality localized videos. Key features and functions include using yt-dlp to download videos from Youtube links, using WhisperX for word-level timeline subtitle recognition, using NLP and GPT for subtitle segmentation based on sentence meaning, summarizing intelligent term knowledge base with GPT for context-aware translation, three-step direct translation, reflection, and free translation to eliminate strange machine translation, checking single-line subtitle length and translation quality according to Netflix standards, using GPT-SoVITS for high-quality aligned dubbing, and integrating package for one-click startup and one-click output in streamlit.
AudioMuse-AI
AudioMuse-AI is a deep learning-based tool for audio analysis and music generation. It provides a user-friendly interface for processing audio data and generating music compositions. The tool utilizes state-of-the-art machine learning algorithms to analyze audio signals and extract meaningful features for music generation. With AudioMuse-AI, users can explore the possibilities of AI in music creation and experiment with different styles and genres. Whether you are a music enthusiast, a researcher, or a developer, AudioMuse-AI offers a versatile platform for audio analysis and music generation.
For similar tasks
efficient-recorder
Efficient Recorder is a battery-life friendly tool designed to stream video, screen, mic, and system audio to any S3-compatible cloud storage service. It captures audio, screenshots, and webcam photos at configurable fps, utilizing low-energy volume detection for audio recording. The tool streams data to a configurable S3 endpoint or a custom server using MinIO. It aims to be storage and battery efficient, providing queued upload processing and minimal system resource overhead. The tool requires SoX for audio recording and webcam capture tools for operation. Users can specify various command line options for customization, such as enabling screenshot and webcam capture with specific intervals and image quality settings.
screenpipe
Screenpipe is an open source application that turns your computer into a personal AI, capturing screen and audio to create a searchable memory of your activities. It allows you to remember everything, search with AI, and keep your data 100% local. The tool is designed for knowledge workers, developers, researchers, people with ADHD, remote workers, and anyone looking for a private, local-first alternative to cloud-based AI memory tools.
react-native-vision-camera
VisionCamera is a powerful, high-performance Camera library for React Native. It features Photo and Video capture, QR/Barcode scanner, Customizable devices and multi-cameras ("fish-eye" zoom), Customizable resolutions and aspect-ratios (4k/8k images), Customizable FPS (30..240 FPS), Frame Processors (JS worklets to run facial recognition, AI object detection, realtime video chats, ...), Smooth zooming (Reanimated), Fast pause and resume, HDR & Night modes, Custom C++/GPU accelerated video pipeline (OpenGL).
videokit
VideoKit is a full-featured user-generated content solution for Unity Engine, enabling video recording, camera streaming, microphone streaming, social sharing, and conversational interfaces. It is cross-platform, with C# source code available for inspection. Users can share media, save to camera roll, pick from camera roll, stream camera preview, record videos, remove background, caption audio, and convert text commands. VideoKit requires Unity 2022.3+ and supports Android, iOS, macOS, Windows, and WebGL platforms.
LLMs-Zero-to-Hero
LLMs-Zero-to-Hero is a repository dedicated to training large language models (LLMs) from scratch, covering topics such as dense models, MOE models, pre-training, supervised fine-tuning, direct preference optimization, reinforcement learning from human feedback, and deploying large models. The repository provides detailed learning notes for different chapters, code implementations, and resources for training and deploying LLMs. It aims to guide users from being beginners to proficient in building and deploying large language models.
VoiceStreamAI
VoiceStreamAI is a Python 3-based server and JavaScript client solution for near-realtime audio streaming and transcription using WebSocket. It employs Huggingface's Voice Activity Detection (VAD) and OpenAI's Whisper model for accurate speech recognition. The system features real-time audio streaming, modular design for easy integration of VAD and ASR technologies, customizable audio chunk processing strategies, support for multilingual transcription, and secure sockets support. It uses a factory and strategy pattern implementation for flexible component management and provides a unit testing framework for robust development.
addon-aircast
AirCast is a Home Assistant Community Add-on that provides AirPlay capabilities for Chromecast players. It bridges the compatibility gap between Apple's AirPlay and Google's Chromecast by creating virtual AirPlay devices for Chromecast players on the network. The add-on is based on the AirConnect project and allows users to stream audio from Apple devices to Chromecast players.
fastrtc
FastRTC is a real-time communication library for Python that allows users to turn any Python function into a real-time audio and video stream over WebRTC or WebSockets. It provides features like automatic voice detection, UI launching, WebRTC support, WebSocket support, telephone support, and customizable backend for production applications. The library offers various examples and usage scenarios for audio and video streaming, object detection, voice APIs, chat applications, and more.
For similar jobs
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
daily-poetry-image
Daily Chinese ancient poetry and AI-generated images powered by Bing DALL-E-3. GitHub Action triggers the process automatically. Poetry is provided by Today's Poem API. The website is built with Astro.
exif-photo-blog
EXIF Photo Blog is a full-stack photo blog application built with Next.js, Vercel, and Postgres. It features built-in authentication, photo upload with EXIF extraction, photo organization by tag, infinite scroll, light/dark mode, automatic OG image generation, a CMD-K menu with photo search, experimental support for AI-generated descriptions, and support for Fujifilm simulations. The application is easy to deploy to Vercel with just a few clicks and can be customized with a variety of environment variables.
SillyTavern
SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.
Twitter-Insight-LLM
This project enables you to fetch liked tweets from Twitter (using Selenium), save it to JSON and Excel files, and perform initial data analysis and image captions. This is part of the initial steps for a larger personal project involving Large Language Models (LLMs).
AISuperDomain
Aila Desktop Application is a powerful tool that integrates multiple leading AI models into a single desktop application. It allows users to interact with various AI models simultaneously, providing diverse responses and insights to their inquiries. With its user-friendly interface and customizable features, Aila empowers users to engage with AI seamlessly and efficiently. Whether you're a researcher, student, or professional, Aila can enhance your AI interactions and streamline your workflow.
ChatGPT-On-CS
This project is an intelligent dialogue customer service tool based on a large model, which supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, Zhihu, etc. You can choose GPT3.5/GPT4.0/ Lazy Treasure Box (more platforms will be supported in the future), which can process text, voice and pictures, and access external resources such as operating systems and the Internet through plug-ins, and support enterprise AI applications customized based on their own knowledge base.
obs-localvocal
LocalVocal is a live-streaming AI assistant plugin for OBS that allows you to transcribe audio speech into text and perform various language processing functions on the text using AI / LLMs (Large Language Models). It's privacy-first, with all data staying on your machine, and requires no GPU, cloud costs, network, or downtime.
