noScribe
Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)
Stars: 1364
noScribe is an AI-based software designed for automated audio transcription, specifically tailored for transcribing interviews for qualitative social research or journalistic purposes. It is a free and open-source tool that runs locally on the user's computer, ensuring data privacy. The software can differentiate between speakers and supports transcription in around 60 languages. It includes a user-friendly editor for reviewing and correcting transcripts. Developed by Kai Dröge, who holds a PhD in sociology and has a background in computer science, noScribe aims to streamline the transcription process and enhance the efficiency of qualitative analysis.
README:
- An AI-based software that transcribes interviews for qualitative social research or journalistic use
- noScribe is free and open source (GPL-3.0)
- It runs completely locally on your computer. No data is sent to the internet. No cloud, no worries
- It can distinguish different speakers and understands around 60 languages (more or less, see below)
- It includes a nice editor to review, verify and correct the resulting transcript
- It is standing on the shoulders of giants: Whisper from OpenAI, faster-whisper by Guillaume Klein and pyannote from Hervé Bredin
(The transcript is from this interview which I did in May 2022 with the Russian sociologist Natalia Savelyeva.)
- noScribe needs a fairly up-to-date computer, or the transcription will take a very long time. (Consider letting it run overnight on a slower machine.)
- Since it uses sophisticated AI models, the download is quite large – about 3.7 GB
- Poor audio quality will lead to poor transcription results.
- No automatic transcription is perfect; there will always be some manual revision necessary. Use the included Editor to check your transcripts thoroughly. (See also "Factors Influencing the Quality" and "Known Issues" below.)
- If you want to know more and can understand German, Rebecca Schmidt from the University of Paderborn wrote a nice review of noScribe, also discussing its limitations. The German computer magazine c't also recommended noScribe in a recent review.
The Urban Dictionary defines scribe as "a person whose entire miserable existence has been reduced to academic grunge and pain". I hope this software will make your academic life a little less painful and grungy, hence the name noScribe :)
Kai Dröge, PhD in sociology (with a background in computer science), qualitative researcher and teacher, Lucerne University for Applied Science (Switzerland) and Institute for Social Research, Frankfurt/M. (Germany).
Current Version Number: 0.6.2 (see changelog)
All releases are hosted on SWITCHdrive, a secure data sharing platform for Swiss universities.
- The general-purpose version for normal PCs without an NVIDIA graphics card: https://drive.switch.ch/index.php/s/HtKDKYRZRNaYBeI?path=%2FWindows%2Fnormal2
- A special version using CUDA acceleration on NVIDIA graphics cards with at least 6 GB of VRAM: https://drive.switch.ch/index.php/s/HtKDKYRZRNaYBeI?path=%2FWindows%2Fcuda1. You must also install the CUDA toolkit from here (a reboot is required afterwards).
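If you are unsure whether your machine qualifies for the CUDA build, the short Python sketch below reports your GPU and its VRAM. This is not part of noScribe; it is an illustrative check that assumes you have Python with PyTorch installed.

# Check for an NVIDIA GPU with at least 6 GB of VRAM (the requirement for
# the CUDA build of noScribe). Assumes PyTorch is installed.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
    print("CUDA build OK" if vram_gb >= 6 else "Use the general-purpose build")
else:
    print("No CUDA-capable GPU found - use the general-purpose build")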
Installation:
- Start the downloaded setup file. This may take a while, be patient.
- If you get a warning that "Windows protected your PC" and the app comes from an "Unknown publisher", you have to trust us and click "Run anyway"
- To do a silent install on a larger group of computers, start the setup with the argument /S.
macOS (ported by gernophil)
Newer Macs with Apple Silicon M1-M4 processors
- Download: https://drive.switch.ch/index.php/s/HtKDKYRZRNaYBeI?path=%2FmacOS%2Farm64%20(Apple%20Silicon)
- Double-click on the downloaded dmg-file, then drag noScribe and noScribeEdit into the link to your applications folder (labeled "drag both here to install").
- You will need Apple's Rosetta2 Intel emulator since one component (ffmpeg) is still built for Intel CPUs. If you don't have it installed already, install it as follows:
- Open the Terminal (located at /Applications/Utilities/Terminal.app).
- Type softwareupdate --install-rosetta or softwareupdate --install-rosetta --agree-to-license.
- Hit enter and follow the instructions on the screen.
- Start noScribe and/or noScribeEdit by double-clicking the app within your Applications folder.
Older Macs with Intel processors
Note: Version 0.6.2 on Intel-based Macs is currently experimental and may not fully work. Please help us test it. You can download it here.
Otherwise, you can use the stable version 0.5:
- for macOS Sonoma (14) and Sequoia (15): https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.5%2FmacOS%2Fx86_64%20(Intel)
- for macOS 11 (Big Sur), 12 (Monterey) and 13 (Ventura): https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.5%2FmacOS%2Fx86_64_legacy%20(old%20Intel)
- Note: Unfortunately, we are currently not able to sign the x86_64 package correctly, so you will get a warning that noScribe and noScribeEdit are from unregistered developers. You have to manually allow noScribe and noScribeEdit to be executed, if your Gatekeeper is active. Follow these steps:
- Double-click the downloaded dmg-file.
- Drag noScribe and noScribeEdit into the link to your applications folder (labeled "drag both here to install").
- Start noScribe by double-clicking the app within your applications folder. You will get an error that noScribe is from an unregistered developer. Do the same with the noScribe Editor.
- Go to Settings -> Privacy and Security -> Scroll down until you see a message stating noScribe was prevented from starting and click "open anyway". Again, do the same with the noScribe Editor.
- From now on, both programs should start without issues.
ported by Eckhard Kadasch and Florian Dobener; executable generated by gernophil.
Linux
Executable installation:
- Download the CUDA or CPU version of noScribe 0.6.2 for Linux here: https://drive.switch.ch/index.php/s/HtKDKYRZRNaYBeI?path=%2FLinux
- Untar the file using the terminal command tar -xzvf noScribe_0.6.2_cpu_linux_amd64.tar.gz or tar -xzvf noScribe_0.6.2_cuda_linux_amd64.tar.gz.
- Execute noScribe from the terminal by cd'ing into the noScribe folder and running ./noScribe.
- Optionally: edit the files noScribe.desktop and noScribeEdit.desktop with a text editor and enter the complete path in the lines starting with Exec= and Icon=.
Manual installation from source (based on instructions by mael-lenoc):

# release (must be > 0.6 in order to include the latest fixes for Linux!)
NOS_REL=0.6.1
wget https://github.com/kaixxx/noScribe/archive/refs/tags/v${NOS_REL}.tar.gz
tar xvzf v${NOS_REL}.tar.gz
cd noScribe-${NOS_REL}/
# from here on, everything happens in this directory

# alternative: current main branch
wget -O noScribe-main.zip https://github.com/kaixxx/noScribe/archive/refs/heads/main.zip
unzip noScribe-main.zip
cd noScribe-main
# from here on, everything happens in this directory

# install noScribeEdit
rm -rf noScribeEdit/
git clone https://github.com/kaixxx/noScribeEditor.git noScribeEdit

# venv
python3 -m venv .venv
source .venv/bin/activate
# from here on, everything happens in the venv

# requirements
pip install -r environments/requirements_linux.txt
pip install -r noScribeEdit/environments/requirements.txt

# models/precise
# this assumes you have git large file support enabled: apt install git-lfs
rm -rf models/precise
git clone https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo models/precise

# models/fast
for f in config.json model.bin preprocessor_config.json tokenizer.json vocabulary.json; do
  wget -O models/fast/$f "https://huggingface.co/mukowaty/faster-whisper-int8/resolve/main/faster-whisper-large-v3-turbo-int8/${f}?download=true"
done

# run
python3 ./noScribe.py
How to cite noScribe (replace XXX with the version you used):
Dröge, K. (2024). noScribe. AI-powered Audio Transcription (Version XXX) [Computer software]. https://github.com/kaixxx/noScribe
Usage:
- Select your audio file. noScribe supports almost any audio or video format.
- Select the filename for the transcript. You can also choose the file type: *.html is the default, also supported by the noScribe editor; *.vtt is a video-subtitle format, especially useful if you want to import your transcript into EXMARaLDA for further annotation; *.txt exports the transcript as plain text.
- Start and Stop accept timestamps in the format hh:mm:ss. Use this to limit the transcription to a particular part of the recording. This is especially helpful for testing your settings with a small sample before committing to transcribing the whole interview, which may take several hours. Leave Stop empty if you want to transcribe until the end of the audio file.
- Language: Select the language of your transcript, set it to 'auto' to detect the language, or choose "multilingual" if your audio contains more than one language (experimental).
- Quality: 'Precise' is the recommended setting for the most accurate transcript. On slower machines, you may opt for the 'fast' option. This will be quicker but might necessitate more manual revision later. You can also install custom models, fine-tuned for specific languages, etc.
- Mark Pause: If enabled, parts of your audio without voice activity will be marked as pauses. Pauses are transcribed as round brackets with one dot per second inside, e.g., '(..)' for a two-second pause. Pauses longer than 10 seconds are written out as '(XX seconds pause)' or '(XX minutes pause)'. You have the option to mark either pauses of one second and more ('1sec+'), two seconds and more ('2sec+'), or only the longer ones of three seconds and more ('3sec+'). Choose 'none' to disable this feature entirely. (This marker format is illustrated in the sketch after this list.)
- Speaker Detection: This feature uses the Pyannote AI model to identify distinct speakers in your audio and organizes the transcript accordingly. Choose the number of speakers if known, or select 'auto.' Opting for 'none' bypasses this step altogether, reducing the processing time by approximately half. However, the resultant transcript will be a continuous block of text without any indicators of speaker transitions.
- Overlapping Speech: If enabled, noScribe attempts to mark instances where two people speak simultaneously. The overlapping section is demarcated with //double slashes//. (Note: This is an experimental feature.)
- Disfluencies: If enabled, common speech disfluencies like filler words ("um"), unfinished words or sentences, etc. will also be transcribed.
- Timestamps: When enabled, noScribe incorporates timestamps in the format [hh:mm:ss] into the transcript, either at every change of speaker or every 60 seconds. I find these timestamps somewhat distracting, hence my decision to disable them by default. However, they can be quite useful in certain contexts. Even with timestamps disabled, determining the audio timecode for a specific segment is straightforward: simply open the transcript in the noScribe Editor, navigate through the text, and the corresponding timecode will appear in the bottom right corner of the app. (The timestamp format is also illustrated in the sketch after this list.)
- If you are ready, click the Start-button in the bottom left. Cancel will abort the process.
- Be aware that a one-hour interview can take up to three hours processing time and will put a heavy load on your machine. Doing this on battery-power is not recommended.
- A progress indicator at the bottom of the app will show how far you are into the whole process.
- The main window will log progress-messages and errors. It will also show the text of your interview during the last step of the transcription.
- The transcript will be auto saved every few seconds under the given filename.
- By default, noScribe produces an HTML-file. This can be opened in every common word editor (including MS Word, LibreOffice) or QDA-package (MAXQDA, ATLAS.ti, QualCoder...).
- Before working with the transcript though, you should check it with the included editor. There will always be some errors.
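For readers who want the conventions above in compact form, here is a minimal Python sketch. It is illustrative only, not noScribe's actual implementation; it simply mimics the pause-marker and timestamp formats described in the Mark Pause and Timestamps options.

# Illustrative sketch only -- not noScribe's actual code.

def pause_marker(seconds: int) -> str:
    # Pauses up to 10 seconds: one dot per second inside round brackets,
    # e.g. '(..)' for a two-second pause.
    if seconds <= 10:
        return "(" + "." * seconds + ")"
    # Longer pauses are written out.
    if seconds < 60:
        return f"({seconds} seconds pause)"
    return f"({seconds // 60} minutes pause)"  # pluralization simplified here

def timestamp(total_seconds: int) -> str:
    # Timestamps use the format [hh:mm:ss].
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{seconds:02d}]"

print(pause_marker(2))   # (..)
print(pause_marker(42))  # (42 seconds pause)
print(timestamp(3725))   # [01:02:05]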
The included editor to check the final transcript.
The noScribe Editor is a separate app. It will open automatically once the transcript is finished, but it can also be run independently of noScribe. It contains some handy features to check your finished transcript for errors and correct them:
- Press Ctrl + Spacebar (^Space on Mac) or the orange button in the toolbar to hear the audio which corresponds to your current position in the text.
- The selection of the text will follow the audio that you hear. If you want to make changes, click anywhere in the text with your mouse or use the arrow keys to move the cursor. The audio will stop, and you can edit the text.
- You can also stop the audio by pressing Ctrl + Spacebar again or clicking the orange button.
- If you want to speed up or slow down the audio, change the "100%"-field next to the "Play/Pause Audio"-Button to the appropriate speed.
- To change the speaker names, use the Search & Replace feature, accessible from the magnifying glass icon or the Edit menu.
- Use the plus and minus icons in the toolbar to zoom in or out.
- You will find the most common features of a basic text editor in the toolbar as well as in the menu at the top (basic text formatting, cut, copy & paste, undo & redo).
- Your typical hotkeys will also work (e.g., Ctrl+S for Save, Ctrl+F for Find & Replace). You can see all the hotkeys if you open the menu. As already mentioned, 'Ctrl+Space' is the hotkey you'll use the most as it starts or pauses the audio.
The source code of the editor can be found here: https://github.com/kaixxx/noScribeEditor
Factors Influencing the Quality of the Transcription
- A good audio recording with clear voices and no ambient noise is crucial for a high-quality transcription. Investing some effort in the quality of the recording will save you much time in the manual revision process later.
- Whisper (the AI powering noScribe) understands around 60 different languages, but the quality of the transcription varies widely between them. Spanish, Italian, English, Portuguese and German are best supported (see here for more info).
- Whisper handles dialects fairly well (e.g., Swiss-German), but the transcript might need more manual work in the revision.
Known Issues
- The Whisper AI can sometimes get stuck in a loop of repeating text, especially on longer audio files. If this happens, try transcribing shorter sections (using the "Start" and "Stop" fields in noScribe) and join them manually. (One way to pre-split the audio is sketched after this list.)
- Multilingual audio is now supported, but still experimental.
- Nonverbal expressions like laughter are not included in the transcript and must be added later in the editor if you need them.
- Speaker identification: In some recordings, the AI used by noScribe may not be able to tell the voices of certain speakers apart, even if they sound quite different to the human ear. Check the results carefully.
- The Whisper AI can sometimes hallucinate, especially in silent parts of the recording when it interprets background noise as 'text' (see this study from Cornell University for more info about the issue).
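If you prefer to split a long recording into smaller files instead of using the Start and Stop fields, the sketch below shows one way to do it. This is not a noScribe feature: it assumes Python and a system-wide ffmpeg installation on your PATH, and the file names are placeholders.

# Split a long recording into 30-minute chunks with ffmpeg, then transcribe
# the chunks separately in noScribe. Assumes ffmpeg is on your PATH.
import subprocess

SOURCE = "interview.wav"   # placeholder: your long recording
CHUNK = 30 * 60            # chunk length in seconds
NUM_CHUNKS = 3             # placeholder: adjust to the length of your audio

for i in range(NUM_CHUNKS):
    subprocess.run([
        "ffmpeg", "-y",
        "-ss", str(i * CHUNK),   # start offset in seconds
        "-t", str(CHUNK),        # duration of this chunk
        "-i", SOURCE,
        "-c", "copy",            # copy the stream, no re-encoding
        f"part_{i + 1}.wav",
    ], check=True)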
Advanced Settings
- After the app has run for the first time, you will find a file named config.yml in the user config directory (on Windows: C:\Users\<username>\AppData\Local\noScribe\noScribe\config.yml; on Mac: ~/Library/Application Support/noscribe/config.yml). Here, you can change a few extra settings, e.g., the language of the user interface.
- Also in the user config directory you will find a folder named log with detailed log-files for every transcript (also unfinished ones). This can be helpful in the case of any errors. Be aware though that these files also contain the text of your transcripts which might include sensitive information.
- If you want to use custom whisper models with noScribe, follow the instructions in the Wiki.
Development and Contribution
- I developed noScribe in Python 3.12.
- I cannot host the whisper-models on GitHub because they are too large. There is a readme in the models-folder with instructions on how to get them.
- I am happy to review tests, bug reports and pull requests (if my time allows it)
- The noScribe UI has already been translated into many languages (thanks mlynar-czyk).
- Since most of the translations have been created with ChatGPT, there will be problems. Please report any errors you find and, if possible, submit a pull request with a better translation.
- You will find the language files in the folder "trans".
- If you change anything in the language files, make sure to follow the conventions of the YAML language.
- If you want to change the language of the user interface, you have to change the value of the "locale" setting in the advanced settings (see above). A small sketch of this edit follows below.
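For convenience, here is a minimal Python sketch of that edit. It is illustrative only: it assumes PyYAML is installed, uses the Windows config path mentioned above (adjust it for your system), and "en" is just an example value -- check the "trans" folder for available languages.

# Change the UI language by editing the "locale" value in config.yml.
from pathlib import Path
import yaml  # PyYAML

# Windows location of the config file (see Advanced Settings above);
# on a Mac, use ~/Library/Application Support/noscribe/config.yml instead.
config_path = Path.home() / "AppData" / "Local" / "noScribe" / "noScribe" / "config.yml"

config = yaml.safe_load(config_path.read_text(encoding="utf-8"))
config["locale"] = "en"  # example value
config_path.write_text(yaml.safe_dump(config), encoding="utf-8")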
If you are interested in open source software for the analysis of qualitative data, take a look at QualCoder and Taguette.
Similar Open Source Tools
WDoc
WDoc is a powerful Retrieval-Augmented Generation (RAG) system designed to summarize, search, and query documents across various file types. It supports querying tens of thousands of documents simultaneously, offers tailored summaries to efficiently manage large amounts of information, and includes features like supporting multiple file types, various LLMs, local and private LLMs, advanced RAG capabilities, advanced summaries, trust verification, markdown formatted answers, sophisticated embeddings, extensive documentation, scriptability, type checking, lazy imports, caching, fast processing, shell autocompletion, notification callbacks, and more. WDoc is ideal for researchers, students, and professionals dealing with extensive information sources.
roam-extension-live-ai-assistant
Live AI is an AI Assistant tailor-made for Roam, providing access to the latest LLMs directly in Roam blocks. Users can interact with AI to extend their thinking, explore their graph, and chat with structured responses. The tool leverages Roam's features to write prompts, query graph parts, and chat with content. Users can dictate, translate, transform, and enrich content easily. Live AI supports various tasks like audio and video analysis, PDF reading, image generation, and web search. The tool offers features like Chat panel, Live AI context menu, and Ask Your Graph agent for versatile usage. Users can control privacy levels, compare AI models, create custom prompts, and apply styles for response formatting. Security concerns are addressed by allowing users to control data sent to LLMs.
MARS5-TTS
MARS5 is a novel English speech model (TTS) developed by CAMB.AI, featuring a two-stage AR-NAR pipeline with a unique NAR component. The model can generate speech for various scenarios like sports commentary and anime with just 5 seconds of audio and a text snippet. It allows steering prosody using punctuation and capitalization in the transcript. Speaker identity is specified using an audio reference file, enabling 'deep clone' for improved quality. The model can be used via torch.hub or HuggingFace, supporting both shallow and deep cloning for inference. Checkpoints are provided for AR and NAR models, with hardware requirements of 750M+450M params on GPU. Contributions to improve model stability, performance, and reference audio selection are welcome.
AI-Player
AI-Player is a Minecraft mod that adds an 'intelligent' second player to the game to combat loneliness while playing solo. It aims to enhance gameplay by providing companionship and interactive features. The mod leverages advanced AI algorithms and integrates with external tools to enhance the player experience. Developed with a focus on addressing the social aspect of gaming, AI-Player is a community-driven project that continues to evolve with user feedback and contributions.
local_multimodal_ai_chat
Local Multimodal AI Chat is a hands-on project that teaches you how to build a multimodal chat application. It integrates different AI models to handle audio, images, and PDFs in a single chat interface. This project is perfect for anyone interested in AI and software development who wants to gain practical experience with these technologies.
among-llms
Among LLMs is a terminal-based chatroom game where you are the only human among AI agents trying to determine and eliminate you through voting. Your goal is to stay hidden, manipulate conversations, and turn the bots against each other using various tactics like editing messages, sending whispers, and gaslighting. The game offers dynamic scenarios, personas, and backstories, customizable agent count, private messaging, voting mechanism, and infinite replayability. It is written in Python and provides an immersive and unpredictable experience for players.
HydraDragonAntivirus
Hydra Dragon Antivirus is a comprehensive tool that combines dynamic and static analysis using Sandboxie for Windows with ClamAV, YARA-X, machine learning AI, behavior analysis, NLP-based detection, website signatures, Ghidra, and Snort. The tool provides a Machine Learning Malware and Benign Database for training, along with a guide for compiling from source. It offers features like Ghidra source code analysis, Java Development Kit setup, and detailed logs for malware detections. Users can join the Discord community server for support and follow specific guidelines for preparing the analysis environment. The tool emphasizes security measures such as cleaning up directories, avoiding sharing IP addresses, and ensuring ClamAV database installation. It also includes tips for effective analysis and troubleshooting common issues.
wingman-ai
Wingman AI allows you to use your voice to talk to various AI providers and LLMs, process your conversations, and ultimately trigger actions such as pressing buttons or reading answers. Our Wingmen are like characters and your interface to this world, and you can easily control their behavior and characteristics, even if you're not a developer. AI is complex and it scares people. It's also not just ChatGPT. We want to make it as easy as possible for you to get started. That's what Wingman AI is all about. It's a framework that allows you to build your own Wingmen and use them in your games and programs. The idea is simple, but the possibilities are endless. For example, you could:
- Role play with an AI while playing for more immersion. Have air traffic control (ATC) in Star Citizen or Flight Simulator. Talk to Shadowheart in Baldur's Gate 3 and have her respond in her own (cloned) voice.
- Get live data such as trade information, build guides, or wiki content and have it read to you in-game by a character and voice you control.
- Execute keystrokes in games/applications and create complex macros. Trigger them in natural conversations with no need for exact phrases. The AI understands the context of your dialog and is quite smart in recognizing your intent. Say "It's raining! I can't see a thing!" and have it trigger a command you simply named WipeVisors.
- Automate tasks on your computer
- Improve accessibility
- ... and much more
talk-to-chatgpt
Talk-To-ChatGPT is a Google Chrome and Microsoft Edge extension that enables users to interact with the ChatGPT AI using voice commands for speech recognition and text-to-speech responses. The tool enhances the conversational experience by allowing users to speak to the AI and receive spoken responses, making interactions more natural and engaging. It also supports ElevenLabs API integration for creating custom voices for text-to-speech. The extension provides settings for voice, language, and more, and can be installed from the Chrome and Edge web stores or manually. While the project has been discontinued due to upcoming desktop apps from OpenAI, it has been used to assist individuals with disabilities and the elderly in interacting with ChatGPT.
feedgen
FeedGen is an open-source tool that uses Google Cloud's state-of-the-art Large Language Models (LLMs) to improve product titles, generate more comprehensive descriptions, and fill missing attributes in product feeds. It helps merchants and advertisers surface and fix quality issues in their feeds using Generative AI in a simple and configurable way. The tool relies on GCP's Vertex AI API to provide both zero-shot and few-shot inference capabilities on GCP's foundational LLMs. With few-shot prompting, users can customize the model's responses towards their own data, achieving higher quality and more consistent output. FeedGen is an Apps Script based application that runs as an HTML sidebar in Google Sheets, allowing users to optimize their feeds with ease.
OSHW-SenseCAP-Watcher
SenseCAP Watcher is a monitoring device built on ESP32S3 with Himax WiseEye2 HX6538 AI chip, excelling in image and vector data processing. It features a camera, microphone, and speaker for visual, auditory, and interactive capabilities. With LLM-enabled SenseCraft suite, it understands commands, perceives surroundings, and triggers actions. The repository provides firmware, hardware documentation, and applications for the Watcher, along with detailed guides for setup, task assignment, and firmware flashing.
EdgeChains
EdgeChains is an open-source chain-of-thought engineering framework tailored for Large Language Models (LLMs) like OpenAI GPT, LLama2, Falcon, etc., with a focus on enterprise-grade deployability and scalability. EdgeChains is specifically designed to orchestrate such applications. At EdgeChains, we take a unique approach to Generative AI: we think Generative AI is a deployment and configuration management challenge rather than a UI and library design pattern challenge. We build on top of a tech that has solved this problem in a different domain (Kubernetes Config Management) and bring that to Generative AI. EdgeChains is built on top of jsonnet, originally built by Google based on their experience managing a vast amount of configuration code in the Borg infrastructure.
M.I.L.E.S
M.I.L.E.S. (Machine Intelligent Language Enabled System) is a voice assistant powered by GPT-4 Turbo, offering a range of capabilities beyond existing assistants. With its advanced language understanding, M.I.L.E.S. provides accurate and efficient responses to user queries. It seamlessly integrates with smart home devices, Spotify, and offers real-time weather information. Additionally, M.I.L.E.S. possesses persistent memory, a built-in calculator, and multi-tasking abilities. Its realistic voice, accurate wake word detection, and internet browsing capabilities enhance the user experience. M.I.L.E.S. prioritizes user privacy by processing data locally, encrypting sensitive information, and adhering to strict data retention policies.
sdk
Vikit.ai SDK is a software development kit that enables easy development of video generators using generative AI and other AI models. It serves as a langchain to orchestrate AI models and video editing tools. The SDK allows users to create videos from text prompts with background music and voice-over narration. It also supports generating composite videos from multiple text prompts. The tool requires Python 3.8+, specific dependencies, and tools like FFMPEG and ImageMagick for certain functionalities. Users can contribute to the project by following the contribution guidelines and standards provided.
For similar tasks
AivisSpeech-Engine
AivisSpeech-Engine is a powerful open-source tool for speech recognition and synthesis. It provides state-of-the-art algorithms for converting speech to text and text to speech. The tool is designed to be user-friendly and customizable, allowing developers to easily integrate speech capabilities into their applications. With AivisSpeech-Engine, users can transcribe audio recordings, create voice-controlled interfaces, and generate natural-sounding speech output. Whether you are building a virtual assistant, developing a speech-to-text application, or experimenting with voice technology, AivisSpeech-Engine offers a comprehensive solution for all your speech processing needs.
NotelyVoice
Notely Voice is a free, modern, cross-platform AI voice transcription and note-taking application. It offers powerful Whisper AI Voice to Text capabilities, making it ideal for students, professionals, doctors, researchers, and anyone in need of hands-free note-taking. The app features rich text editing, simple search, smart filtering, organization with folders and tags, advanced speech-to-text, offline capability, seamless integration, audio recording, theming, cross-platform support, and sharing functionality. It includes memory-efficient audio processing, chunking configuration, and utilizes OpenAI Whisper for speech recognition technology. Built with Kotlin, Compose Multiplatform, Coroutines, Android Architecture, ViewModel, Koin, Material 3, Whisper AI, and Native Compose Navigation, Notely follows Android Architecture principles with distinct layers for UI, presentation, domain, and data.
vocotype-cli
VocoType is a free desktop voice input method designed for professionals who value privacy and efficiency. All recognition is done locally, ensuring offline operation and no data upload. The CLI open-source version of the VocoType core engine on GitHub is mainly targeted at developers.
For similar jobs
SLR-FC
This repository provides a comprehensive collection of AI tools and resources to enhance literature reviews. It includes a curated list of AI tools for various tasks, such as identifying research gaps, discovering relevant papers, visualizing paper content, and summarizing text. Additionally, the repository offers materials on generative AI, effective prompts, copywriting, image creation, and showcases of AI capabilities. By leveraging these tools and resources, researchers can streamline their literature review process, gain deeper insights from scholarly literature, and improve the quality of their research outputs.
paper-ai
Paper-ai is a tool that helps you write papers using artificial intelligence. It provides features such as AI writing assistance, reference searching, and editing and formatting tools. With Paper-ai, you can quickly and easily create high-quality papers.
paper-qa
PaperQA is a minimal package for question and answering from PDFs or text files, providing very good answers with in-text citations. It uses OpenAI Embeddings to embed and search documents, and follows a process of embedding docs and queries, searching for top passages, creating summaries, scoring and selecting relevant summaries, putting summaries into prompt, and generating answers. Users can customize prompts and use various models for embeddings and LLMs. The tool can be used asynchronously and supports adding documents from paths, files, or URLs.
ChatData
ChatData is a robust chat-with-documents application designed to extract information and provide answers by querying the MyScale free knowledge base or uploaded documents. It leverages the Retrieval Augmented Generation (RAG) framework, millions of Wikipedia pages, and arXiv papers. Features include self-querying retriever, VectorSQL, session management, and building a personalized knowledge base. Users can effortlessly navigate vast data, explore academic papers, and research documents. ChatData empowers researchers, students, and knowledge enthusiasts to unlock the true potential of information retrieval.
AIStudyAssistant
AI Study Assistant is an app designed to enhance learning experience and boost academic performance. It serves as a personal tutor, lecture summarizer, writer, and question generator powered by Google PaLM 2. Features include interacting with an AI chatbot, summarizing lectures, generating essays, and creating practice questions. The app is built using 100% Kotlin, Jetpack Compose, Clean Architecture, and MVVM design pattern, with technologies like Ktor, Room DB, Hilt, and Kotlin coroutines. AI Study Assistant aims to provide comprehensive AI-powered assistance for students in various academic tasks.
data-to-paper
Data-to-paper is an AI-driven framework designed to guide users through the process of conducting end-to-end scientific research, starting from raw data to the creation of comprehensive and human-verifiable research papers. The framework leverages a combination of LLM and rule-based agents to assist in tasks such as hypothesis generation, literature search, data analysis, result interpretation, and paper writing. It aims to accelerate research while maintaining key scientific values like transparency, traceability, and verifiability. The framework is field-agnostic, supports both open-goal and fixed-goal research, creates data-chained manuscripts, involves human-in-the-loop interaction, and allows for transparent replay of the research process.
k2
K2 (GeoLLaMA) is a large language model for geoscience, trained on geoscience literature and fine-tuned with knowledge-intensive instruction data. It outperforms baseline models on objective and subjective tasks. The repository provides K2 weights, core data of GeoSignal, GeoBench benchmark, and code for further pretraining and instruction tuning. The model is available on Hugging Face for use. The project aims to create larger and more powerful geoscience language models in the future.

