Windrecorder
Windrecorder is a memory search app that records everything on your screen at a small file size, letting you rewind what you have seen, query it through OCR text or image descriptions, and get activity statistics. It was developed as a Windows alternative to the macOS app Rewind.ai.
Stars: 893
Windrecorder is an open-source tool that helps you retrieve memory cues by recording everything on your screen. It can search based on OCR text or image descriptions and provides a summary of your activities. All of its capabilities run entirely locally, without the need for an internet connection or uploading any data, giving you complete ownership of your data.
README:
An open-source alternative to Rewind for Windows that helps you retrieve memory cues.
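Windrecorder (捕风记录仪) is an app that enables memory search by recording everything on your screen. It can search by OCR text or by a description of the image, and it lets you browse a summary of your activity. All of its capabilities run entirely locally, with no internet connection or data upload required, so you fully own all of your data.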
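Windrecorder can currently:
- Record multiple screens or a single screen stably and continuously at a small file size;
- Index only the frames that changed, saving their OCR text, page title, and other information to the database; when nobody is using the computer, it automatically maintains the database and cleans up and compresses videos;
- Offer a full-featured webui for rewinding the screen and running OCR or image-semantic queries;
- Provide data summaries such as activity statistics, word clouds, timelines, lightboxes, and scatter plots;
- Support multiple languages. Built in so far: 简体中文, English, and 日本語. Contributions of translations and improvements to copywriting quality are welcome.
- Coming soon... Please follow our PRs.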
Current limitations of Windrecorder:
- FFmpeg may use a large amount of memory in some cases;
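[!WARNING] This project is still at an early stage of development, so you may run into minor problems while using it. You are welcome to open issues with feedback, follow updates, and start discussions or view the roadmap in Discussions. Help with optimizing and building the project through PRs and reviews is also welcome.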
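[!IMPORTANT]
Due to a small coding mistake, versions before 0.0.5 may be unable to detect updates or to upgrade through `install_update.bat`. If this affects you, type `cmd` in the path bar of the Windrecorder root directory to open a command prompt, then run `git pull` to update. 🙇♀️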
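- Download ffmpeg (the download is named `ffmpeg-master-latest-win64-gpl-shared.zip`) and copy all files in its bin directory (not the bin directory itself) to `C:\Windows\System32` or to another directory on PATH;
  - ffmpeg may have a bug where the cursor flickers while the screen is being recorded; you can apply the fix described in the Q&A below before copying the files to the system directory;
- Install Git; accepting the default at every step is fine;
- Install Python, making sure to check `Add python.exe to PATH` during installation;
  - Note: python 3.12 is not supported yet. python 3.11 is recommended, which is the version the link above points to;
- In the file manager, navigate to the directory where you want to install the tool (a partition with plenty of free space is recommended) and download it with the terminal command `git clone https://github.com/yuka-friends/Windrecorder`;
  - You can open the target folder, type `cmd` in the path bar, and press Enter to open a terminal in that directory, then paste and run the command above;
  - If the directory path contains spaces, starting the app may fail; #110
- Open `install_update.bat` in the directory to install and configure the tool. If all goes well, you are ready to start using it!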
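The prerequisites in the steps above can be checked before running `install_update.bat`. The following is a minimal sketch, not part of Windrecorder itself: the function name `check_prerequisites` and its parameters are hypothetical, and it only tests the conditions the steps mention (ffmpeg and git reachable on PATH, a Python version below 3.12, and an install path without spaces).

```python
import shutil
import sys
from pathlib import Path


def check_prerequisites(install_dir, version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration; Windrecorder does not ship this function.
    """
    problems = []
    # ffmpeg and git must be reachable through PATH.
    for tool in ("ffmpeg", "git"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    # The install steps state that python 3.12 is not supported yet.
    if tuple(version_info[:2]) >= (3, 12):
        problems.append("python 3.12 or newer is not supported yet; 3.11 is recommended")
    # Spaces in the install path may make the app fail to start (#110).
    if " " in str(Path(install_dir)):
        problems.append("the install path contains spaces")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(Path.cwd()):
        print("WARNING:", problem)
```

Run it from the directory you plan to clone into; no output means none of the listed problems were detected. The `version_info` and `which` parameters exist only so the checks can be exercised without changing the real environment.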
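- Open `start_app.bat` in the directory. The tool runs in the system tray and is used through its right-click menu;
- All data (videos, database, statistics) is stored in the same directory as Windrecorder. To copy or move the tool (for example, to a new computer), delete `.venv` in the directory, move the folder, and then run `install_update.bat` again to reinstall the virtual environment;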
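[!TIP] Best practice: enable start on boot in the webui to record everything without having to think about it.
Recording pauses automatically when the screen does not change or when the screen sleeps. When the computer is idle, the tool automatically maintains the database and compresses and cleans up expired videos.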
Just set it and forget it!
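Once recording starts, Windrecorder records video in 15-minute segments and indexes each segment after it finishes recording (so queries may lag behind by 15 minutes). When the screen does not change, the window title is on the skip list, or the computer is locked, recording pauses automatically and idle-time maintenance runs (compressing and cleaning up videos, running image embedding recognition, and so on) until the user returns and resumes using the computer.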
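The pause rule described above amounts to a three-way check. Here is a minimal sketch of it under stated assumptions: `ScreenState` and `should_pause` are hypothetical names, not Windrecorder's actual code, and matching skip-list entries as case-insensitive substrings of the window title is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScreenState:
    changed: bool       # did the screen content change since the last check?
    window_title: str   # title of the foreground window
    locked: bool        # is the computer on the lock screen?


def should_pause(state, skip_titles):
    """Return True when any of the three pause conditions holds.

    Assumption: a skip-list entry matches if it appears anywhere in the
    window title, ignoring case. The real matching rule may differ.
    """
    title = state.window_title.lower()
    title_skipped = any(skip.lower() in title for skip in skip_titles)
    return (not state.changed) or title_skipped or state.locked
```

For example, `should_pause(ScreenState(True, "Private Browsing", False), ["private"])` returns `True` because the title is on the skip list, even though the screen is changing.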
- Image embedding indexing is provided as an extension and can be installed from the directory `extension/install_img_embedding_module`
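Video recording size | SQLite database size |
---|---|
Per hour: 2-100 MB (depending on screen changes and the number of monitors) | |
Per month: 10-20 GB (depending on screen time). Different video compression presets can reduce this data to 0.1-0.7 times its size | Per month: about 160 MB |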
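The figures in the table can be combined into a rough monthly disk estimate. This is a back-of-the-envelope sketch only: the function name `monthly_storage_gb` is hypothetical, it assumes the 0.1-0.7 compression ratio applies to the whole monthly video figure, and it treats 1 GB as 1000 MB.

```python
def monthly_storage_gb(raw_video_gb, compression_ratio, database_mb=160.0):
    """Estimate monthly disk usage in GB from the table's figures.

    raw_video_gb: uncompressed monthly video size (the table gives 10-20 GB).
    compression_ratio: fraction remaining after compression (the table gives 0.1-0.7).
    database_mb: monthly SQLite growth (the table gives about 160 MB).
    """
    if not 0.0 < compression_ratio <= 1.0:
        raise ValueError("compression_ratio must be in (0, 1]")
    return raw_video_gb * compression_ratio + database_mb / 1000.0


low = monthly_storage_gb(10, 0.1)   # least video, strongest compression
high = monthly_storage_gb(20, 0.7)  # most video, weakest compression
print(f"{low:.2f} GB to {high:.2f} GB per month")
```

Under these assumptions the estimate spans roughly 1.16 GB to 14.16 GB per month; actual usage depends on screen time and the chosen preset.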
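The recording method may be improved in the future to reduce ffmpeg's resource usage and to make rewinding available without waiting. At present, ffmpeg may use a large amount of memory while recording.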
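Q: The mouse cursor flickers during recording
- A: This is a long-standing FFmpeg issue. You can try the method from this post:
  - Open the `avdevice-XX.dll` file in the previously downloaded `FFmpeg/bin` with any hex editor (such as HxD);
  - Search for the hex code (byte sequence) `20 00 cc 40` and change its last byte `40` to `00`;
  - Save the file;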
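The hex edit above can also be scripted instead of done by hand. The sketch below is an illustration, not an official fix: `patch_bytes` and `patch_file` are hypothetical names, and refusing to patch unless the sequence occurs exactly once is a safety assumption that the original instructions do not state. It writes a `.bak` copy before modifying the file.

```python
from pathlib import Path

OLD = bytes.fromhex("2000cc40")  # byte sequence named in the answer above
NEW = bytes.fromhex("2000cc00")  # same sequence with the last byte changed from 40 to 00


def patch_bytes(data):
    """Return a patched copy of data.

    Assumption: patching is only attempted when the sequence occurs
    exactly once; otherwise a ValueError is raised and nothing changes.
    """
    count = data.count(OLD)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return data.replace(OLD, NEW)


def patch_file(dll_path):
    """Patch the file in place, keeping a .bak copy of the original next to it."""
    path = Path(dll_path)
    original = path.read_bytes()
    patched = patch_bytes(original)  # raises before any file is written
    path.with_name(path.name + ".bak").write_bytes(original)
    path.write_bytes(patched)
```

Call `patch_file` with the path to your own `avdevice-XX.dll` (the `XX` in the file name varies by build). If the function raises, inspect the file in a hex editor as described above rather than forcing the change.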
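Q: When opening the webui, data from the recent period is missing.
- A: While the tool is indexing data, the webui does not create the latest temporary database file. Solution: wait a while until indexing finishes and then refresh the webui, or delete the database files ending in `_TEMP_READ.db` in the db directory and then refresh. (If a database corruption message appears, don't worry; the tool may still be indexing, so try refreshing or deleting again later.) This behavior will be fixed and refactored in the future. #26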
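Q: When opening the webui, this error appears: FileNotFoundError: [WinError 2] The system cannot find the file specified: './db\\user_2023-10_wind.db-journal'
- A: This usually happens on the first visit to the webui while the tool is still indexing data. Solution: after background indexing finishes, delete the corresponding database file ending in `_TEMP_READ.db` in the db folder and then refresh.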
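Both answers above come down to removing files ending in `_TEMP_READ.db` from the db directory. A minimal sketch of that cleanup follows; the function name `remove_temp_read_dbs` is hypothetical, and it defaults to a dry run so you can see what would be deleted first.

```python
from pathlib import Path


def remove_temp_read_dbs(db_dir, dry_run=True):
    """Return the files ending in _TEMP_READ.db in db_dir, deleting them unless dry_run.

    Only the top level of db_dir is searched; other database files are left alone.
    """
    matches = sorted(Path(db_dir).glob("*_TEMP_READ.db"))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Call `remove_temp_read_dbs("db")` from the Windrecorder directory to list the candidates, then repeat with `dry_run=False` once indexing has finished, and refresh the webui.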
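Q: Windows.Media.Ocr.Cli OCR is unavailable or its recognition rate is too low
- A1: Check whether the language pack or input method for the target language has been added to the system: https://learn.microsoft.com/en-us/uwp/api/windows.media.ocr
- A2: Windows.Media.Ocr.Cli may recognize small text poorly. Turning on the "similar glyph search" option in the settings can improve recall when searching. Support for more local OCR tools will be added in the future.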
This project builds on the help of these projects:
- https://github.com/DayBreak-u/chineseocr_lite
- https://github.com/zh-h/Windows.Media.Ocr.Cli
- https://github.com/unum-cloud/uform
- https://github.com/streamlit/streamlit
🧡 Like this tool? Feel free to listen to the gentle music of 長瀬有花 / YUKA NAGASE on YouTube and music streaming platforms. Thank you!
"Your tools suck, check out my girl Yuka Nagase, she's amazing, I code 10 times faster when listening to her." -- @jpswing
Vote for Windrecorder on Product Hunt:
Also check out:
- 🧡 after-you: a local-first AI diary app, responding to your heart's call
Alternative AI tools for Windrecorder
Similar Open Source Tools
Noi
Noi is an AI-enhanced customizable browser designed to streamline digital experiences. It includes curated AI websites, allows adding any URL, offers prompts management, Noi Ask for batch messaging, various themes, Noi Cache Mode for quick link access, cookie data isolation, and more. Users can explore, extend, and empower their browsing experience with Noi.
LLM-Finetune-Guide
This project provides a comprehensive guide to fine-tuning large language models (LLMs) with efficient methods like LoRA and P-tuning V2. It includes detailed instructions, code examples, and performance benchmarks for various LLMs and fine-tuning techniques. The guide also covers data preparation, evaluation, prediction, and running inference on CPU environments. By leveraging this guide, users can effectively fine-tune LLMs for specific tasks and applications.
TempCompass
TempCompass is a benchmark designed to evaluate the temporal perception ability of Video LLMs. It encompasses a diverse set of temporal aspects and task formats to comprehensively assess the capability of Video LLMs in understanding videos. The benchmark includes conflicting videos to prevent models from relying on single-frame bias and language priors. Users can clone the repository, install required packages, prepare data, run inference using examples like Video-LLaVA and Gemini, and evaluate the performance of their models across different tasks such as Multi-Choice QA, Yes/No QA, Caption Matching, and Caption Generation.
PromptClip
PromptClip is a tool that allows developers to create video clips using LLM prompts. Users can upload videos from various sources, prompt the video in natural language, use different LLM models, instantly watch the generated clips, finetune the clips, and add music or image overlays. The tool provides a seamless way to extract specific moments from videos based on user queries, making video editing and content creation more efficient and intuitive.
llama-assistant
Llama Assistant is an AI-powered assistant that helps with daily tasks, such as voice recognition, natural language processing, summarizing text, rephrasing sentences, answering questions, and more. It runs offline on your local machine, ensuring privacy by not sending data to external servers. The project is a work in progress with regular feature additions.
ASTRA.ai
ASTRA is an open-source platform designed for developing applications utilizing large language models. It merges the ideas of Backend-as-a-Service and LLM operations, allowing developers to swiftly create production-ready generative AI applications. Additionally, it empowers non-technical users to engage in defining and managing data operations for AI applications. With ASTRA, you can easily create real-time, multi-modal AI applications with low latency, even without any coding knowledge.
cb-tumblebug
CB-Tumblebug (CB-TB) is a system for managing multi-cloud infrastructure consisting of resources from multiple cloud service providers. It provides an overview, features, and architecture. The tool supports various cloud providers and resource types, with ongoing development and localization efforts. Users can deploy a multi-cloud infra with GPUs, enjoy multiple LLMs in parallel, and utilize LLM-related scripts. The tool requires Linux, Docker, Docker Compose, and Golang for building the source. Users can run CB-TB with Docker Compose or from the Makefile, set up prerequisites, contribute to the project, and view a list of contributors. The tool is licensed under an open-source license.
OutofFocus
Out of Focus v1.0 is a flexible Gradio tool that modifies images through prompt manipulation, reconstructing them via a diffusion inversion process. It is the first version of the image modification tool by Out of AI.
beta9
Beta9 is an open-source platform for running scalable serverless GPU workloads across cloud providers. It allows users to scale out workloads to thousands of GPU or CPU containers, achieve ultrafast cold-start for custom ML models, automatically scale to zero to pay for only what is used, utilize flexible distributed storage, distribute workloads across multiple cloud providers, and easily deploy task queues and functions using simple Python abstractions. The platform is designed for launching remote serverless containers quickly, featuring a custom, lazy loading image format backed by S3/FUSE, a fast redis-based container scheduling engine, content-addressed storage for caching images and files, and a custom runc container runtime.
TEN-Agent
TEN Agent is an open-source multimodal agent powered by the world’s first real-time multimodal framework, TEN Framework. It offers high-performance real-time multimodal interactions, multi-language and multi-platform support, edge-cloud integration, flexibility beyond model limitations, and real-time agent state management. Users can easily build complex AI applications through drag-and-drop programming, integrating audio-visual tools, databases, RAG, and more.
onnxruntime-server
ONNX Runtime Server is a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference. It aims to offer simple, high-performance ML inference and a good developer experience. Users can provide inference APIs for ONNX models without writing additional code by placing the models in the directory structure. Each session can choose between CPU or CUDA, analyze input/output, and provide Swagger API documentation for easy testing. Ready-to-run Docker images are available, making it convenient to deploy the server.
stm32ai-modelzoo
The STM32 AI model zoo is a collection of reference machine learning models optimized to run on STM32 microcontrollers. It provides a large collection of application-oriented models ready for re-training, scripts for easy retraining from user datasets, pre-trained models on reference datasets, and application code examples generated from user AI models. The project offers training scripts for transfer learning or training custom models from scratch. It includes performances on reference STM32 MCU and MPU for float and quantized models. The project is organized by application, providing step-by-step guides for training and deploying models.
agentscope
AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: Easy-to-Use, High Robustness, and Actor-Based Distribution. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs, including OpenAI API, DashScope API, Gemini API, and ollama. It also enables developers to rapidly deploy local model services using libraries such as ollama (CPU inference), Flask + Transformers, Flask + ModelScope, FastChat, and vllm. AgentScope supports various services, including Web Search, Data Query, Retrieval, Code Execution, File Operation, and Text Processing. Example applications include Conversation, Game, and Distribution. AgentScope is released under Apache License 2.0 and welcomes contributions.
ScaleLLM
ScaleLLM is a cutting-edge inference system engineered for large language models (LLMs) and designed to meet the demands of production environments. It supports a wide range of popular open-source models, including Llama3, Gemma, Bloom, GPT-NeoX, and more. ScaleLLM is under active development, with ongoing work on efficiency and additional features; see the project's Roadmap for details. Key features:
- High Efficiency: Excels in high-performance LLM inference, leveraging state-of-the-art techniques and technologies like Flash Attention, Paged Attention, Continuous batching, and more.
- Tensor Parallelism: Utilizes tensor parallelism for efficient model execution.
- OpenAI-compatible API: An efficient golang REST API server that is compatible with OpenAI.
- Huggingface models: Seamless integration with most popular HF models, supporting safetensors.
- Customizable: Offers flexibility for customization to meet your specific needs, and provides an easy way to add new models.
- Production Ready: Engineered with production environments in mind, ScaleLLM is equipped with robust system monitoring and management features to ensure a seamless deployment experience.
pgx
Pgx is a collection of GPU/TPU-accelerated parallel game simulators for reinforcement learning (RL). It provides JAX-native game simulators for various games like Backgammon, Chess, Shogi, and Go, offering super fast parallel execution on accelerators and beautiful visualization in SVG format. Pgx focuses on faster implementations while also being sufficiently general, allowing environments to be converted to the AEC API of PettingZoo for running Pgx environments through the PettingZoo API.
For similar jobs
khoj
Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and Whatsapp, and you can share PDF, markdown, org-mode, notion files, and GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.
forge
Forge is a free and open-source digital collectible card game (CCG) engine written in Java. It is designed to be easy to use and extend, and it comes with a variety of features that make it a great choice for developers who want to create their own CCGs. Forge is used by a number of popular CCGs, including Ascension, Dominion, and Thunderstone.
userscripts
Greasemonkey userscripts. A userscript manager such as Tampermonkey is required to run these scripts.
freeGPT
freeGPT provides free access to text and image generation models. It supports various models, including gpt3, gpt4, alpaca_7b, falcon_40b, prodia, and pollinations. The tool offers both asynchronous and non-asynchronous interfaces for text completion and image generation. It also features an interactive Discord bot that provides access to all the models in the repository. The tool is easy to use and can be integrated into various applications.
open-saas
Open SaaS is a free and open-source React and Node.js template for building SaaS applications. It comes with a variety of features out of the box, including authentication, payments, analytics, and more. Open SaaS is built on top of the Wasp framework, which provides a number of features to make it easy to build SaaS applications, such as full-stack authentication, end-to-end type safety, jobs, and one-command deploy.
AIGODLIKE-ComfyUI-Translation
A plugin for multilingual translation of ComfyUI. It translates the resident menu bar, search bar, right-click context menu, nodes, and more.
free-for-life
A massive list of products and services that are completely free. Its table of contents includes: APIs, Data & ML; Artificial Intelligence; BaaS; Code Editors; Code Generation; DNS; Databases; Design & UI; Domains; Email; Font; For Students; Forms; Linux Distributions; Messaging & Streaming; PaaS; Payments & Billing; SSL.