GameSentenceMiner


An all-in-one immersion toolkit for learning languages through games and other visual media.

Stars: 133


GameSentenceMiner (GSM) is an immersion toolkit designed to assist with language learning through games. It enhances Anki cards with automated audio capture, manual trim options, screenshot capture, multi-line support, and AI translation. GSM also offers OCR with easier setup, exclusion zones, a two-pass OCR system, consistent audio timing, and support for multiple languages. A game launcher simplifies game setup and launching. Basic requirements are an Anki card creation tool, a method of extracting text from games, and, of course, a game. GSM provides documentation and an FAQ to help users understand how it works and troubleshoot issues. Users can seek support through the project's Discord channel or by creating issues on the repository.

README:

GSM - An Immersion toolkit for Games.

English | 日本語 | 简体中文.

An application designed to assist with language learning through games.

Short Demo (Watch this first): https://www.youtube.com/watch?v=FeFBL7py6HY

Installation: https://www.youtube.com/watch?v=sVL9omRbGc4

Discord: https://discord.gg/yP8Qse6bb8

Features

Anki Card Enhancement

GSM significantly enhances your Anki cards with rich contextual information:

  • Automated Audio Capture: Automatically records the voice line associated with the text.

    • Automatic Trim: Simple arithmetic around the time the text event arrived, combined with a Voice Activity Detection (VAD) library, produces neatly cut audio.
    • Manual Trim: If the automatic trim is not perfect, you can open the audio in an external program and trim it by hand.
  • Screenshot: Captures a screenshot of the game at the moment the voice line is spoken.

  • Multi-Line: Capture multiple lines at once, together with their sentence audio, using GSM's own Texthooker.

  • AI Translation: Integrates AI to provide quick translations of the captured sentence. Custom prompts are also supported. (Optional; bring your own API key.)
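
To make the "simple arithmetic" behind Automatic Trim more concrete, here is a minimal sketch. It is not GSM's actual code: the function name `compute_trim`, the `TrimWindow` type, and the `start_offset` and `end_padding` parameters are all hypothetical. It assumes a VAD library has already reported speech segments as (start, end) pairs in seconds from the start of the recording.

```python
from dataclasses import dataclass


@dataclass
class TrimWindow:
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def compute_trim(recording_start, text_event_time, speech_segments,
                 start_offset=0.0, end_padding=0.5):
    """Return the slice of a recording that should hold the voice line.

    recording_start and text_event_time are wall-clock timestamps in seconds.
    speech_segments is a list of (start, end) pairs, in seconds from the
    start of the recording, as a VAD library might report them.
    All names here are illustrative assumptions, not GSM's API.
    """
    # The text event marks where the line begins inside the recording.
    start = max(0.0, text_event_time - recording_start + start_offset)

    # Ignore speech that finished before the line started
    # (for example, the tail of the previous line).
    voiced = [(s, e) for s, e in speech_segments if e > start]
    if not voiced:
        return None  # no speech detected after the text event

    # Cut shortly after the last detected speech segment.
    end = voiced[-1][1] + end_padding
    return TrimWindow(start=start, end=end)
```

For example, with a recording that started at t = 100.0 s, a text event at t = 102.5 s, and speech segments `[(0.5, 1.0), (2.7, 4.0), (4.3, 6.1)]`, the first segment is discarded because it ends before the text event, and the window runs from 2.5 s to about 6.6 s.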

Game Example (Has Audio)

https://github.com/user-attachments/assets/df6bc38e-d74d-423e-b270-8a82eec2394c


VN Example (Has Audio)

https://github.com/user-attachments/assets/ee670fda-1a8b-4dec-b9e6-072264155c6e

OCR

GSM runs a fork of OwOCR to provide accurate text capture from games that do not have a hook. Here are some improvements GSM makes on stock OwOCR:

  • Easier Setup: With GSM's managed Python install, setup is only a matter of clicking a few buttons.

  • Exclusion Zones: Instead of choosing an area to OCR, you can choose an area to exclude from OCR. Useful if you have a static interface in your game and text appears randomly throughout.

  • Two-Pass OCR: To cut down on API calls and keep output clean, GSM features a "Two-Pass" OCR system. A local OCR engine runs continuously; once the text on screen stabilizes, GSM runs a second, more accurate scan and sends the result to the clipboard/WebSocket.

  • Consistent Audio Timing: With the two-pass system, we can still get accurate audio recorded and into Anki without the use of crazy offsets or hacks.

  • More Language Support: Stock OwOCR is hard-coded to Japanese, while in GSM you can use a variety of languages.
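
The two-pass idea can be sketched roughly as follows. This is an illustration under assumptions, not GSM's or OwOCR's implementation: the class `TwoPassOCR`, the `stable_reads` parameter, and the `fast_ocr` / `accurate_ocr` callables are hypothetical stand-ins for a cheap local engine and a slower, more accurate one.

```python
class TwoPassOCR:
    """Run a cheap OCR on every frame and an accurate OCR only once the
    cheap result has stopped changing. Hypothetical sketch, not GSM's API."""

    def __init__(self, fast_ocr, accurate_ocr, stable_reads=2):
        self.fast_ocr = fast_ocr          # callable: frame -> str (first pass)
        self.accurate_ocr = accurate_ocr  # callable: frame -> str (second pass)
        self.stable_reads = stable_reads  # identical reads needed to be "stable"
        self._last_text = None
        self._repeat_count = 0
        self._last_sent = None

    def process_frame(self, frame):
        """Return the accurate text when the screen stabilizes, else None."""
        text = self.fast_ocr(frame).strip()

        # Count how many consecutive frames produced the same first-pass text.
        if text == self._last_text:
            self._repeat_count += 1
        else:
            self._last_text = text
            self._repeat_count = 1

        # Still changing (text scrolling in) or nothing on screen.
        if not text or self._repeat_count < self.stable_reads:
            return None

        # Avoid sending the same line again while it stays on screen.
        if text == self._last_sent:
            return None

        self._last_sent = text
        return self.accurate_ocr(frame)
```

In this sketch the expensive second pass runs once per stabilized line rather than once per frame, which is the point of the design: text that is still scrolling in never reaches the accurate engine.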

https://github.com/user-attachments/assets/07240472-831a-40e6-be22-c64b880b0d66

Game Launcher Capabilities (WIP)

This is probably the feature I care least about, but if you are lazy like me, you may find this helpful.

  • Launch: GSM can launch your games directly, simplifying the setup process.

  • Hook: Streamlines the process of hooking your games (Agent).

This feature simplifies the process of launching games and (potentially) hooking them, making the entire workflow more efficient.


Basic Requirements

  • An Anki card creation tool.

  • A method of extracting text from games.

  • A game.

Documentation

For help with installation, setup, and other information, please visit the project's Wiki.

FAQ

How Does It Work?

This is a common question, and understanding this process will help clarify any issues you might encounter while using GSM.

  1. The beginning of the voice line is marked by a text event. This usually comes from Textractor, Agent, or another texthooker. GSM can receive these events from the clipboard and/or a WebSocket server (configurable in GSM).

  2. The end of the voice line is detected using a Voice Activity Detection (VAD) library running locally. (Example)

In essence, GSM relies on accurately timed text events to capture the corresponding audio.
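
To illustrate step 2, here is a deliberately simple stand-in for end-of-speech detection. Real VAD libraries typically use trained models; this sketch uses a plain energy threshold instead, and every name and default in it (`find_speech_end`, `threshold`, `hangover_ms`) is an assumption made for illustration rather than anything GSM exposes.

```python
import math


def rms(frame):
    """Root-mean-square energy of a list of samples in the range [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def find_speech_end(samples, sample_rate, frame_ms=30,
                    threshold=0.02, hangover_ms=300):
    """Return the time (seconds) at which the first run of speech ends,
    or None if no frame is loud enough to count as speech.

    Energy-threshold stand-in for a real VAD; parameters are illustrative.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hangover_frames = max(1, hangover_ms // frame_ms)

    last_voiced_end = None  # sample index just past the last voiced frame
    silent_run = 0          # consecutive silent frames since that point

    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        if rms(frame) >= threshold:
            last_voiced_end = min(i + frame_len, len(samples))
            silent_run = 0
        elif last_voiced_end is not None:
            silent_run += 1
            # Enough continuous silence after speech: the line is over.
            if silent_run >= hangover_frames:
                break

    if last_voiced_end is None:
        return None
    return last_voiced_end / sample_rate
```

The sketch also shows why loud background music causes trouble: with a pure energy threshold, music counts as "voiced" and the silent run never gets long enough to end the line. Model-based VAD handles this better, but it is not immune.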

GSM provides settings to accommodate less-than-ideal hooks. However, if you experience significant audio inconsistencies, they likely stem from a poorly timed hook, loud background music, or other external factors, rather than GSM itself. The core audio trimming logic has been stable and effective for many users across various games.

Contact

If you encounter issues, please ask for help in my Discord or create an issue here.

Acknowledgements

Donations

If you've found this or any of my other projects helpful, please consider supporting my work through GitHub Sponsors or Ko-fi.
