
RAGHub
A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.
Stars: 1159

RAGHub is a community-driven project focused on cataloging new and emerging frameworks, projects, and resources in the Retrieval-Augmented Generation (RAG) ecosystem. It aims to help users stay ahead of changes in the field by providing a platform for the latest innovations in RAG. The repository includes information on RAG frameworks, evaluation frameworks, optimization frameworks, citation frameworks, engines, search reranker frameworks, projects, resources, and real-world use cases across industries and professions.
README:
Welcome to RAGHub, a living collection of new and emerging frameworks, projects, and resources in the Retrieval-Augmented Generation (RAG) ecosystem. This is a community-driven project for r/RAG, where we aim to catalog the rapid growth of RAG tools and projects that are pushing the boundaries of the field.
Each day, it feels like a new tool or framework emerges, and choosing the right one is becoming more of an art than a science. Is the framework from three months ago still relevant? Or was it just hype, rehashing old concepts with a fresh look? RAGHub exists to help you stay ahead of these changes, providing a platform for the latest innovations in RAG.
This is a community project, and we welcome contributions from everyone! If you’d like to add a new framework, project, or resource, please check out our Contribution Guidelines for details on how to get started.
Name | Description | Website | Github | Stars | Activity |
---|---|---|---|---|---|
Dcup Open-Source RAG-as-a-Service | Connect your app to user data in minutes with self-hostable RAG pipelines. | Website | Github | 1h ago | |
LangChain | Building applications with LLMs | Website | Github | 9h ago | |
Scout | Building apps with LLMs/vector databases/web scraping | Website | Github | 1h ago | |
Haystack | A framework for building search engines using neural networks | Website | Github | Last week | |
LlamaIndex | A framework for building data-driven LLM applications | Website | Github | 7h ago | |
BentoML | Build Inference APIs, LLM apps, Multi-model chains, RAG | Website | Github | 1h ago | |
Contextual AI | End-to-end RAG including document understanding, retrieval, grounded generation, and evaluation | Website | GitHub | -- | |
LightRAG | Simple and fast Retrieval-Augmented Generation | Website | Github | 1d ago | |
Swarm by OpenAI | Educational framework for lightweight multi-agent orchestration | - | Github | 1d ago | |
Langroid | Python framework to easily build LLM-powered applications | Website | Github | 10h ago | |
NeMo-Guardrails | Add programmable guardrails to LLM-based applications | Website | Github | Last week | |
Swiftide | A Rust library for building fast, streaming applications with LLMs | Website | GitHub | 1h ago | |
Korvus | The entire RAG pipeline in a single database query | Website | GitHub | Last month | |
semantic-router | A framework for routing LLM requests using semantic vectors | Website | GitHub | 4h ago | |
AWS Bedrock Knowledge Bases | Service to build, scale, and deploy RAG-powered applications | Website | - | - | 1h ago |
langflow | Build, scale, and deploy RAG and multi-agent AI apps | Website | GitHub | 1h ago | |
dspy | Build language model apps with modular programming | Website | GitHub | 13h ago | |
mem0 | The Memory layer for your AI apps | Website | GitHub | 2h ago | |
RAGLite | A Python package for building RAG applications | Website | GitHub | 18h ago | |
cognee | Memory framework for building GraphRAG applications | Website | GitHub | 2h ago | |
ragbits | Building blocks for rapid development of GenAI applications | Website | GitHub | 1h ago | |
Interchange | End-to-end API for RAG, from document upload to search | Website | GitHub | 1h ago | |
ZeroEntropy | Rerankers, embeddings and end-to-end retrieval API | Website | GitHub | - | 1h ago |
memori | Multi-Agent Memory Engine that gives your AI agents human-like memory | Website | GitHub | 3d ago |
Name | Description | Website | GitHub | Stars | Activity |
---|---|---|---|---|---|
Trulens | Measures and enhances LLM app quality with feedback functions for scalable evaluation | Website | GitHub | 11h ago | |
Phoenix | AI observability platform designed for experimentation, evaluation, and troubleshooting | Website | GitHub | 1d ago | |
ragas | Evaluates and quantifies the performance of RAG pipelines that enhance LLM context with external data | Website | GitHub | 3h ago | |
LMUnit | Language model optimized for evaluating natural language unit tests | Website | - | - | -- |
Deepchecks | Continuous validation of AI & ML models, detecting data drift and model issues | Website | GitHub | 8m ago | |
AutoRAG | End-to-end RAG optimization: parsing, chunking, evaluation dataset creation, and pipeline deployment | Website | GitHub | 1h ago | |
evalmy.ai | Fine-tuned lightweight RAG evaluation service + Python client library | Website | GitHub | -- | |
TextGrad | A framework for LLM-based text optimization, focusing on reducing hallucinations and improving prompts | Website | GitHub | 24h ago | |
langfuse | Traces, evals, prompt management, and metrics to debug and improve your LLM application. | Website | GitHub | 1h ago | |
Vectara HHEM | Hallucination evaluation model for RAG | Huggingface | -- | -- | -- |
StepsTrack | An observability tool built to track, inspect, and visualize every step in a pipeline | - | GitHub | 15h ago | |
syftr | Multi-objective end-to-end agentic RAG optimization. | - | GitHub | 1h ago | |
zbench | Annotation and evaluation framework for retrieval and reranking | Website | GitHub | - | 1h ago |
Name | Description | Website | GitHub | Stars | Activity |
---|---|---|---|---|---|
Agentset | Open-source agentic RAG platform. | Website | GitHub | 1d ago | |
Engramic | RAG engine focused on long-term memory and advanced context management | Website | Github | 2h ago | |
TrustGraph | LLM Agnostic Agent Development Platform | Website | GitHub | 2d ago | |
R2R | The Elasticsearch for RAG, helps you quickly build and launch scalable RAG solutions | Website | GitHub | 6h ago | |
RAGFlow | Open-source RAG engine based on deep document understanding | Website | GitHub | 1h ago | |
Liquid Index | The Unified RAG Platform. One API. Every Tool You Need | Website | - | - | 1h ago |
Vertex AI Knowledge Engine | A data framework for context-augmented LLM applications | Website | - | - | 1d ago |
Embedchain | Open Source Framework for personalizing LLM responses under 10 lines of code | Website | GitHub | Last week | |
txtai | All-in-one embeddings database for semantic search, LLM orchestration, and RAG workflows | Website | GitHub | Last week | |
dsRAG | High-performance retrieval engine for unstructured data | - | GitHub | Last week | |
Flash-Rank | Use Pairwise or Listwise rerankers to improve search accuracy before passing to LLMs. | - | GitHub | 2w ago | |
Graphlit | API-first platform for building knowledge-driven AI applications and agents | Website | GitHub | 8h ago | |
rag-citation | Combines RAG with automatic citation generation to enhance content credibility | Website | GitHub | Last week | |
PostgresML | Postgres + GPUs with functions for chunking, embedding, transforming and ranking | Website | GitHub | Yesterday | |
chainlit | Build production-ready Conversational AI applications in minutes, not weeks | Website | GitHub | 24h ago | |
pathway | Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG. | Website | GitHub | 7h ago | |
cognita | RAG framework for modular, open-source production apps. | Website | GitHub | 2 days ago | |
FlashRAG | A Python Toolkit for Efficient RAG Research | - | GitHub | 3h ago | |
RAGatouille | Easily train and use advanced retrieval methods in any RAG pipeline. | - | GitHub | 4 months ago | |
pgai | A suite of tools to develop RAG, semantic search, and other AI applications just in PostgreSQL | Website | GitHub | 10h ago | |
Vectara | The trusted RAG platform for quickly building AI assistants and agents. | Website | GitHub | - | - |
mode | RAG framework with expert models, smart clustering, and efficient retrieval for small datasets. | - | GitHub | 2 days ago | |
haiku.rag | Open-Source RAG framework with monitoring, CLI, search, Q/A, MCP support on SQLite. | - | Github | 3h ago | |
ZeroEntropy AI | Open-Weight Rerankers, Embeddings and End-to-End Retrieval API | Website | Github | - | 3h ago |
Name | Description | Website | GitHub | Stars | Activity |
---|---|---|---|---|---|
CocoIndex | ETL framework to build fresh index | Website | Github | 1h ago | |
Gitana.io | Content platform for editorial approval and scheduled deployment of trained data sets to RAG vector DBs | Website | - | - | - |
Chonkie | No-nonsense, lightweight and fast RAG chunking library | Website | GitHub | 1h ago |
Name | Description | Website | GitHub | Stars | Activity |
---|---|---|---|---|---|
LlamaParse | GenAI-native document parsing platform | Website | GitHub | 2d ago | |
Langchain-extract | Web server to extract information from text and files using LLMs | Website | GitHub | 4m ago | |
Needle | Production-ready RAG pipelines out of the box. | Website | GitHub | 1h ago | |
Unstructured.io | Build custom preprocessing pipelines for labeling, training, or production ML | Website | GitHub | 3d ago | |
Verba | RAG chatbot powered by Weaviate | Website | GitHub | 2w ago | |
Unstract | No-code platform to launch APIs and ETL Pipelines to structure unstructured documents | Website | GitHub | 4h ago | |
Humata.ai | Ask questions across all of your document files | Website | - | - | 4h ago |
Ragie.ai | Fully managed RAG-as-a-Service for developers. | Website | GitHub | - | 12h ago |
Reducto | Parses complex documents and creates LLM-ready inputs | Website | GitHub | - | 2w ago |
Midship | Extract document data straight into your spreadsheet/ERP/CRM | Website | - | - | - |
DocuPanda | Convert documents into a structured, standard set of fields and values | Website | - | - | - |
contextual-doc-retrieval-opneai-reranker | Using GPT-4 and Cohere for query expansion and re-ranking with BM25 | - | GitHub | Last week | |
Raggenie | Low-code platform to build custom RAG-based AI applications | Website | GitHub | 10h ago | |
Chunkr | Vision model-based PDF chunking and OCR, optimized for fast processing of large datasets | Website | GitHub | 11h ago | |
tldw | Open-source project similar to NotebookLM | Website | GitHub | Yesterday | |
Cerbos | Access control for RAG and LLMs. | Website | GitHub | 14h ago | |
extractous | Extremely fast data extraction for your AI applications | Website | GitHub | - | |
SWIRL | AI search & RAG for your workplace. Get AI insights from your company's knowledge instantly. | Website | GitHub | 2w ago | |
ChatDOC PDF Parser | Precision PDF parsing that transforms documents into flawless structured data for RAG systems. | Website | - | - | - |
Gurubase | Create AI-powered Q&A assistants by indexing websites, PDF documents, YouTube videos, and GitHub code repositories. | Website | GitHub | 1d ago | |
Archive Agent | Open-source semantic file tracker with OCR + AI search. Smart indexer with RAG engine. | - | GitHub | - | |
MidrasAI | Simple API for Colpali, a multi-modal retrieval model. | - | Github | 6m ago | |
EmbeddingBridge | Version control and migration tool for embeddings | - | Github | - | |
Stream-Rag-Agent | Streaming RAG Agent for Kafka | - | Github | - | |
zchunk | Open-Source efficient LLM-based chunking | Website | Github | - | 2h ago |
hydrot | A production-ready Retrieval-Augmented Generation (RAG) system designed for enterprise documentation, with first-class support for markdown content. Built with a microservices architecture for horizontal scaling and flexibility. | - | Github | - |
Site/Article | Description | Link |
---|---|---|
Contextual Retrieval | Anthropic introducing Contextual Retrieval | Website |
Open-RAG | Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Website |
ColPali | Efficient Document Retrieval with Vision Language Models | Website |
RAG_Techniques | Showcases various advanced techniques for RAG systems | Website |
GenAI_Agents | Tutorials and implementations for various AI Agent techniques | Website |
Name | Description | Link |
---|---|---|
Artificial Analysis | LLM Comparison | Website |
HuggingFace/mteb | Embedding models leaderboard | Website |
Vectara Hallucination Leaderboard | Hallucination leaderboard for LLMs | Website |
If you're looking for mainstream RAG frameworks and techniques, check out the excellent repository by Nir Diamant: RAG Techniques. This repository focuses on more established tools and methods that have already gained traction in the community.
This project is licensed under the MIT License. See the LICENSE file for details.
This project is part of the r/RAG community. Have feedback or suggestions? Feel free to open an issue, start a discussion, or join the conversation on our Discord server! We want to make this repository a valuable resource for everyone exploring the RAG ecosystem, and your input is crucial.
Alternative AI tools for RAGHub
Similar Open Source Tools

nntrainer
NNtrainer is a software framework for training neural network models on devices with limited resources. It enables on-device fine-tuning of neural networks using user data for personalization. NNtrainer supports various machine learning algorithms and provides examples for tasks such as few-shot learning, ResNet, VGG, and product rating. It is optimized for embedded devices and utilizes CBLAS and CUBLAS for accelerated calculations. NNtrainer is open source and released under the Apache License version 2.0.

TRACE
TRACE is a temporal grounding video model that uses causal event modeling to capture the inherent structure of videos. It presents a task-interleaved video LLM tailored for sequential encoding and decoding of timestamps, salient scores, and textual captions. The project includes model checkpoints for different training stages and for fine-tuning on specific datasets. It provides evaluation code for tasks such as VTG, MVBench, and VideoMME. The repository also offers annotation files and links to projects for preparing the raw videos. Users can train the model on different tasks and evaluate performance with metrics like CIDER, METEOR, SODA_c, F1, mAP, Hit@1, etc. TRACE has been enhanced with trace-retrieval and trace-uni models, showing improved performance on dense video captioning and general video understanding tasks.

LLamaTuner
LLamaTuner is a repository for the Efficient Finetuning of Quantized LLMs project, focusing on building and sharing instruction-following Chinese baichuan-7b/LLaMA/Pythia/GLM model tuning methods. The project enables training on a single Nvidia RTX-2080TI and RTX-3090 for multi-round chatbot training. It utilizes bitsandbytes for quantization and is integrated with Huggingface's PEFT and transformers libraries. The repository supports various models, training approaches, and datasets for supervised fine-tuning, LoRA, QLoRA, and more. It also provides tools for data preprocessing and offers models in the Hugging Face model hub for inference and finetuning. The project is licensed under Apache 2.0 and acknowledges contributions from various open-source contributors.

kangaroo
Kangaroo is an AI-powered SQL client and admin tool for popular databases like SQLite, MySQL, PostgreSQL, etc. It supports various functionalities such as table design, query, model, sync, export/import, and more. The tool is designed to be comfortable, fun, and developer-friendly, with features like code intellisense and autocomplete. Kangaroo aims to provide a seamless experience for database management across different operating systems.

aidea-server
AIdea Server is an open-source Golang-based server that integrates mainstream large language models and drawing models. It supports OpenAI's GPT-3.5 and GPT-4, Anthropic's Claude instant and Claude 2.1, Google's Gemini Pro, as well as Chinese models like Tongyi Qianwen, Wenxin Yiyan, and more. It also supports open-source large models like Yi 34B, Llama2, and AquilaChat 7B. Additionally, it provides features for text-to-image, super-resolution, coloring black and white images, and generating art fonts and QR codes, among others.

cool-ai-stuff
This repository contains an uncensored list of free-to-use APIs and sites for several AI models. This list is mainly managed by @zukixa, the queen of zukijourney, so any decisions may have bias! Scroll down for the sites, APIs come first!
WARNING: We are not endorsing any of the listed services! Some of them might be considered controversial. We are not responsible for any legal, technical or any other damage caused by using the listed services. Data is provided without warranty of any kind. Use these at your own risk!
API List: This list solely covers all providers I (@zukixa) was able to collect metrics in. Any mistakes are not my responsibility, as I am either banned, or not aware of x API. User counts last updated 4/14/24.
Service | # of Users | Link | Stability | NSFW Ok? | Open Source? | Owner(s) | Other Notes |
---|---|---|---|---|---|---|---|
zukijourney | 4441 | D | High | On /unf/, not /v1/ | ✅, Here | @zukixa | Largest & Oldest GPT-4 API still continuously around. Offers other popular AI-related Bots too. |
Hyzenberg | 1234 | D | High | Forbidden | ❌ | @thatlukinhasguy & @voidiii | Experimental sister API to Zukijourney. Successor to HentAI |
NagaAI | 2883 | D | High | Forbidden | ❌ | @zentixua | Honorary successor to ChimeraGPT, the largest API in history (15k users). |
WebRaftAI | 993 | D | High | Forbidden | ❌ | @ds_gamer | Largest API by model count. Provides a lot of service/hosting related stuff too. |
KrakenAI | 388 | D | High | Discouraged | ❌ | @paninico | It is an API of all time. |
ShuttleAI | 3585 | D | Medium | Generally Permitted | ❌ | @xtristan | Faked GPT-4 Before 1, 2 |
Mandrill | 931 | D | Medium | Enterprise-Tier-Only | ❌ | @fredipy | DALL-E-3 access pioneering API. Has some issues with speed & stability nowadays. |
oxygen | 742 | D | Medium | Donator-Only | ❌ | @thesketchubuser | Bri'ish 🤮 & Fren'sh 🤮 |
Skailar | 399 | D | Medium | Forbidden | ❌ | @aquadraws | Service is the personification of the word 'feature creep'. Lots of things announced, not much operational. |

auto-dev-vscode
AutoDev for VSCode is an AI-powered coding wizard with multilingual support, auto code generation, and a bug-slaying assistant. It offers customizable prompts and features like Auto Dev/Testing/Document/Agent. The tool aims to enhance coding productivity and efficiency by providing intelligent assistance and automation capabilities within the Visual Studio Code environment.

awesome-llm-webapps
This repository is a curated list of open-source, actively maintained web applications that leverage large language models (LLMs) for various use cases, including chatbots, natural language interfaces, assistants, and question answering systems. The projects are evaluated based on key criteria such as licensing, maintenance status, complexity, and features, to help users select the most suitable starting point for their LLM-based applications. The repository welcomes contributions and encourages users to submit projects that meet the criteria or suggest improvements to the existing list.

Awesome-LLM-Tabular
This repository is a curated list of research papers that explore the integration of Large Language Model (LLM) technology with tabular data. It aims to provide a comprehensive resource for researchers and practitioners interested in this emerging field. The repository includes papers on a wide range of topics, including table-to-text generation, table question answering, and tabular data classification. It also includes a section on related datasets and resources.

nx
Nx is a build system optimized for monorepos, featuring AI-powered architectural awareness and advanced CI capabilities. It provides faster task scheduling, caching, and more for existing workspaces. Nx Cloud enhances CI by offering remote caching, task distribution, automated e2e test splitting, and task flakiness detection. The tool aims to scale monorepos efficiently and improve developer productivity.

Awesome-Tabular-LLMs
This repository is a collection of papers on Tabular Large Language Models (LLMs) specialized for processing tabular data. It includes surveys, models, and applications related to table understanding tasks such as Table Question Answering, Table-to-Text, Text-to-SQL, and more. The repository categorizes the papers based on key ideas and provides insights into the advancements in using LLMs for processing diverse tables and fulfilling various tabular tasks based on natural language instructions.

visionOS-examples
visionOS-examples is a repository containing accelerators for Spatial Computing. It includes examples such as Local Large Language Model, Chat Apple Vision Pro, WebSockets, Anchor To Head, Hand Tracking, Battery Life, Countdown, Plane Detection, Timer Vision, and PencilKit for visionOS. The repository showcases various functionalities and features for Apple Vision Pro, offering tools for developers to enhance their visionOS apps with capabilities like hand tracking, plane detection, and real-time cryptocurrency prices.

llm-deploy
LLM-Deploy focuses on the theory and practice of model/LLM reasoning and deployment, aiming to be your partner in mastering the art of LLM reasoning and deployment. Whether you are a newcomer to this field or a senior professional seeking to deepen your skills, you can find the key path to successfully deploy large language models here. The project covers reasoning and deployment theories, model and service optimization practices, and outputs from experienced engineers. It serves as a valuable resource for algorithm engineers and individuals interested in reasoning deployment.

web-builder
Web Builder is a low-code front-end framework based on Material for Angular, offering a rich component library for excellent digital innovation experience. It allows rapid construction of modern responsive UI, multi-theme, multi-language web pages through drag-and-drop visual configuration. The framework includes a beautiful admin theme, complete front-end solutions, and AI integration in the Pro version for optimizing copy, creating components, and generating pages with a single sentence.

AIO-Firebog-Blocklists
AIO-Firebog-Blocklists is a comprehensive tool that combines various sources into a single, cohesive blocklist. It offers customizable options to suit individual preferences and needs, ensuring regular updates to stay up-to-date with the latest threats. The tool focuses on performance optimization to minimize impact while maintaining effective filtering. It is designed to help users with ad blocking, malware protection, tracker prevention, and content filtering.
For similar tasks

lance
Lance is a modern columnar data format optimized for ML workflows and datasets. It offers high-performance random access, vector search, zero-copy automatic versioning, and ecosystem integrations with Apache Arrow, Pandas, Polars, and DuckDB. Lance is designed to address the challenges of the ML development cycle, providing a unified data format for collection, exploration, analytics, feature engineering, training, evaluation, deployment, and monitoring. It aims to reduce data silos and streamline the ML development process.

ai-powered-search
AI-Powered Search provides code examples for the book 'AI-Powered Search' by Trey Grainger, Doug Turnbull, and Max Irwin. The book teaches modern machine learning techniques for building search engines that continuously learn from users and content to deliver more intelligent and domain-aware search experiences. It covers semantic search, retrieval augmented generation, question answering, summarization, fine-tuning transformer-based models, personalized search, machine-learned ranking, click models, and more. The code examples are in Python, leveraging PySpark for data processing and Apache Solr as the default search engine. The repository is open source under the Apache License, Version 2.0.

Co-LLM-Agents
This repository contains code for building cooperative embodied agents modularly with large language models. The agents are trained to perform tasks in two different environments: ThreeDWorld Multi-Agent Transport (TDW-MAT) and Communicative Watch-And-Help (C-WAH). TDW-MAT is a multi-agent environment where agents must transport objects to a goal position using containers. C-WAH is an extension of the Watch-And-Help challenge, which enables agents to send messages to each other. The code in this repository can be used to train agents to perform tasks in both of these environments.

GPT4Point
GPT4Point is a unified framework for point-language understanding and generation. It aligns 3D point clouds with language, providing a comprehensive solution for tasks such as 3D captioning and controlled 3D generation. The project includes an automated point-language dataset annotation engine, a novel object-level point cloud benchmark, and a 3D multi-modality model. Users can train and evaluate models using the provided code and datasets, with a focus on improving models' understanding capabilities and facilitating the generation of 3D objects.

asreview
The ASReview project implements active learning for systematic reviews, utilizing AI-aided pipelines to assist in finding relevant texts for search tasks. It accelerates the screening of textual data with minimal human input, saving time and increasing output quality. The software offers three modes: Oracle for interactive screening, Exploration for teaching purposes, and Simulation for evaluating active learning models. ASReview LAB is designed to support decision-making in any discipline or industry by improving efficiency and transparency in screening large amounts of textual data.

Groma
Groma is a grounded multimodal assistant that excels in region understanding and visual grounding. It can process user-defined region inputs and generate contextually grounded long-form responses. The tool presents a unique paradigm for multimodal large language models, focusing on visual tokenization for localization. Groma achieves state-of-the-art performance in referring expression comprehension benchmarks. The tool provides pretrained model weights and instructions for data preparation, training, inference, and evaluation. Users can customize training by starting from intermediate checkpoints. Groma is designed to handle tasks related to detection pretraining, alignment pretraining, instruction finetuning, instruction following, and more.

amber-train
Amber is the first model in the LLM360 family, an initiative for comprehensive and fully open-sourced LLMs. It is a 7B English language model with the same architecture as LLaMA-7B, licensed under Apache 2.0. The resources available include training code, data preparation, metrics, and fully processed Amber pretraining data. The model has been trained on various datasets like Arxiv, Book, C4, Refined-Web, StarCoder, StackExchange, and Wikipedia. The hyperparameters include a total of 6.7B parameters, hidden size of 4096, intermediate size of 11008, 32 attention heads, 32 hidden layers, RMSNorm ε of 1e-6, max sequence length of 2048, and a vocabulary size of 32000.
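As a rough sanity check, the 6.7B total can be recovered from the hyperparameters listed above. The sketch below is not taken from the amber-train repository. It assumes a standard LLaMA-style layout: untied input and output embeddings, bias-free attention and MLP projections, a three-matrix gated MLP, two RMSNorm weight vectors per layer plus a final one, and rotary position embeddings with no learned parameters. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size, num_layers,
                      tied_embeddings=False):
    """Count parameters of a LLaMA-style decoder (assumed layout, see lead-in)."""
    embed = vocab_size * hidden_size                      # token embedding table
    lm_head = 0 if tied_embeddings else vocab_size * hidden_size  # output projection
    attention = 4 * hidden_size * hidden_size             # q, k, v, o projections, no biases
    mlp = 3 * hidden_size * intermediate_size             # gate, up, down projections
    norms = 2 * hidden_size                               # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    final_norm = hidden_size                              # RMSNorm before the output head
    return embed + lm_head + num_layers * per_layer + final_norm


# Values taken from the description above.
AMBER_PARAMS = llama_param_count(
    vocab_size=32000,
    hidden_size=4096,
    intermediate_size=11008,
    num_layers=32,
)
print(f"{AMBER_PARAMS:,}")                 # 6,738,415,616
print(f"{AMBER_PARAMS / 1e9:.1f}B")        # 6.7B
```

The number of attention heads does not enter the count, because splitting the 4096-wide projections into 32 heads changes their shape but not their size.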
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
* Self-contained, with no need for a DBMS or cloud service.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
* Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.