awesome-llm
Awesome series for Large Language Models (LLMs)
Stars: 58
Awesome LLM is a curated list of resources related to Large Language Models (LLMs), including models, projects, datasets, benchmarks, materials, papers, posts, GitHub repositories, HuggingFace repositories, and reading materials. It provides detailed information on various LLMs, their parameter sizes, announcement dates, and contributors. The repository covers a wide range of LLM-related topics and serves as a valuable resource for researchers, developers, and enthusiasts interested in the field of natural language processing and artificial intelligence.
README:
Awesome series for Large Language Models (LLMs)
| Name | Parameter size | Announcement date |
|---|---|---|
| BERT-Large (336M) | 336 million | 2018 |
| T5 (11B) | 11 billion | 2020 |
| Gopher (280B) | 280 billion | 2021 |
| GPT-J (6B) | 6 billion | 2021 |
| LaMDA (137B) | 137 billion | 2021 |
| Megatron-Turing NLG (530B) | 530 billion | 2021 |
| T0 (11B) | 11 billion | 2021 |
| Macaw (11B) | 11 billion | 2021 |
| GLaM (1.2T) | 1.2 trillion | 2021 |
| Flan-PaLM (540B) | 540 billion | 2022 |
| OPT-175B (175B) | 175 billion | 2022 |
| ChatGPT (175B) | 175 billion | 2022 |
| GPT 3.5 (175B) | 175 billion | 2022 |
| AlexaTM (20B) | 20 billion | 2022 |
| Bloom (176B) | 176 billion | 2022 |
| Bard | Not yet announced | 2023 |
| GPT 4 | Not yet announced | 2023 |
| AlphaCode (41.4B) | 41.4 billion | 2022 |
| Chinchilla (70B) | 70 billion | 2022 |
| Sparrow (70B) | 70 billion | 2022 |
| PaLM (540B) | 540 billion | 2022 |
| NLLB (54.5B) | 54.5 billion | 2022 |
| Galactica (120B) | 120 billion | 2022 |
| UL2 (20B) | 20 billion | 2022 |
| Jurassic-1 (178B) | 178 billion | 2021 |
| LLaMA (65B) | 65 billion | 2023 |
| Stanford Alpaca (7B) | 7 billion | 2023 |
| GPT-NeoX 2.0 (20B) | 20 billion | 2023 |
| BloombergGPT | 50 billion | 2023 |
| Dolly | 6 billion | 2023 |
| Jurassic-2 | Not yet announced | 2023 |
| OpenAssistant LLaMa | 30 billion | 2023 |
| Koala | 13 billion | 2023 |
| Vicuna | 13 billion | 2023 |
| PaLM 2 | Not announced (smaller than PaLM) | 2023 |
| LIMA | 65 billion | 2023 |
| MPT | 7 billion | 2023 |
| Falcon | 40 billion | 2023 |
| Llama 2 | 70 billion | 2023 |
| Google Gemini | Not yet announced | 2023 |
| Microsoft Phi-2 | 2.7 billion | 2023 |
| Grok-0 | 33 billion | 2023 |
| Grok-1 | 314 billion | 2023 |
| Solar | 10.7 billion | 2024 |
| Gemma | 7 billion | 2024 |
| Grok-1.5 | Not yet announced | 2024 |
| DBRX | 132 billion | 2024 |
| Claude 3 | Not yet announced | 2024 |
| Gemma 1.1 | 7 billion | 2024 |
| Llama 3 | 70 billion | 2024 |
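The parameter sizes in the table above set a floor on serving memory: fp16 weights take roughly 2 bytes per parameter, so a model needs about twice its parameter count in gigabytes for weights alone. A back-of-the-envelope sketch in plain Python (no dependencies; the figures are approximations that ignore activations, KV cache, and optimizer state):

```python
# Rough memory-footprint estimate for the models in the table above.
# Weights only -- a sketch for intuition, not a deployment guide.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(n_params: float, dtype: str = "fp16") -> float:
    """Approximate weight memory in GB for a model with n_params parameters."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for name, params in [("LLaMA 65B", 65e9), ("Falcon 40B", 40e9), ("Phi-2 2.7B", 2.7e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB in fp16")
# LLaMA 65B: ~130 GB, Falcon 40B: ~80 GB, Phi-2 2.7B: ~5 GB
```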
- T5 (11B) - Announced by Google / 2020
- Flan-PaLM (540B) - Announced by Google / 2022
- T0 (11B) - Announced by BigScience (HuggingFace) / 2021
- OPT-175B (175B) - Announced by Meta / 2022
- UL2 (20B) - Announced by Google / 2022
- Bloom (176B) - Announced by BigScience (HuggingFace) / 2022
- BERT-Large (336M) - Announced by Google / 2018
- GPT-NeoX 2.0 (20B) - Announced by EleutherAI / 2023
- GPT-J (6B) - Announced by EleutherAI / 2021
- Macaw (11B) - Announced by AI2 / 2021
- Stanford Alpaca (7B) - Announced by Stanford University / 2023
- Visual ChatGPT - Announced by Microsoft / 2023
- LMOps - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.
- GPT 4 (Parameter size unannounced, gpt-4-32k) - Announced by OpenAI / 2023
- ChatGPT (175B) - Announced by OpenAI / 2022
- ChatGPT Plus (175B) - Announced by OpenAI / 2023
- GPT 3.5 (175B, text-davinci-003) - Announced by OpenAI / 2022
- Gemini - Announced by Google DeepMind / 2023
- Bard - Announced by Google / 2023
- Codex (11B) - Announced by OpenAI / 2021
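Several of the OpenAI entries above (ChatGPT, GPT-3.5, GPT-4) are reachable only through an API rather than as downloadable weights. A minimal, hedged sketch using the official `openai` Python SDK (v1.x interface; the model id `gpt-4` is a stand-in, and an `OPENAI_API_KEY` environment variable is assumed):

```python
# Requires: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",  # stand-in id; available models and names change over time
    messages=[{"role": "user", "content": "Summarize RLHF in one sentence."}],
)
print(response.choices[0].message.content)
```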
- Sphere - Announced by Meta / 2022
  - 134M documents split into 906M passages as the web corpus.
- Common Crawl
  - 3.15B pages and over 380 TiB of data; public and free to use.
- SQuAD 2.0
  - 100,000+ question dataset for QA.
- Pile
  - An 825 GiB diverse, open-source language modelling dataset.
- RACE
  - A large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions.
- Wikipedia
  - Wikipedia dataset containing cleaned articles in all languages.
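Several of the datasets above are mirrored on the Hugging Face Hub, so they can be pulled with the `datasets` library. A hedged sketch: the hub ids `squad_v2` and `wikipedia` are the commonly used names, but exact arguments vary by library version.

```python
# Requires: pip install datasets
from datasets import load_dataset

# SQuAD 2.0: 100,000+ QA pairs.
squad = load_dataset("squad_v2", split="validation")
print(squad[0]["question"])

# Wikipedia needs a dump date and language config; streaming avoids a full
# download. Recent `datasets` versions may require trust_remote_code=True
# or the wikimedia/wikipedia mirror instead.
wiki = load_dataset("wikipedia", "20220301.en", split="train", streaming=True)
print(next(iter(wiki))["title"])
```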
- Megatron-Turing NLG (530B) - Announced by NVIDIA and Microsoft / 2021
- LaMDA (137B) - Announced by Google / 2021
- GLaM (1.2T) - Announced by Google / 2021
- PaLM (540B) - Announced by Google / 2022
- AlphaCode (41.4B) - Announced by DeepMind / 2022
- Chinchilla (70B) - Announced by DeepMind / 2022
- Sparrow (70B) - Announced by DeepMind / 2022
- NLLB (54.5B) - Announced by Meta / 2022
- LLaMA (65B) - Announced by Meta / 2023
- AlexaTM (20B) - Announced by Amazon / 2022
- Gopher (280B) - Announced by DeepMind / 2021
- Galactica (120B) - Announced by Meta / 2022
- PaLM2 Tech Report - Announced by Google / 2023
- LIMA - Announced by Meta / 2023
- Llama 2 (70B) - Announced by Meta / 2023
- Luminous (13B) - Announced by Aleph Alpha / 2021
- Turing NLG (17B) - Announced by Microsoft / 2020
- Claude (52B) - Announced by Anthropic / 2021
- Minerva (Parameter size unannounced) - Announced by Google / 2022
- BloombergGPT (50B) - Announced by Bloomberg / 2023
- Dolly (6B) - Announced by Databricks / 2023
- Jurassic-1 (178B) - Announced by AI21 / 2021
- Jurassic-2 - Announced by AI21 / 2023
- Koala - Announced by Berkeley Artificial Intelligence Research(BAIR) / 2023
- Gemma - Announced by Google / 2024 ("Gemma: Introducing new state-of-the-art open models")
- Grok-1 - Announced by xAI / 2023 (weights openly released in 2024)
- Grok-1.5 - Announced by xAI / 2024
- DBRX - Announced by Databricks / 2024
- BigScience - Maintained by HuggingFace
- HuggingChat - Maintained by HuggingFace / 2023
- OpenAssistant - Maintained by Open Assistant / 2023
- StableLM - Maintained by Stability AI / 2023
- EleutherAI Language Models - Maintained by EleutherAI / 2023
- Falcon LLM - Maintained by Technology Innovation Institute / 2023
- Gemma - Maintained by Google / 2024
- Stanford Alpaca - A repository for the Stanford Alpaca project, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations.
- Dolly - A large language model trained on the Databricks Machine Learning Platform.
- AutoGPT - An experimental open-source attempt to make GPT-4 fully autonomous.
- dalai - A CLI tool to run LLaMA on a local machine.
- LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 hour and with 1.2M parameters.
- alpaca-lora - Instruct-tune LLaMA on consumer hardware.
- llama_index - A project that provides a central interface to connect your LLMs with external data (see the sketch after this list).
- openai/evals - A framework for evaluating LLMs, with an open-source registry of benchmarks.
- trlx - A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF).
- pythia - A suite of 16 LLMs, all trained on public data seen in exactly the same order, ranging in size from 70M to 12B parameters.
- Embedchain - A framework to create ChatGPT-like bots over your dataset.
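As a concrete example of the llama_index project mentioned above, here is a minimal retrieval-augmented query sketch. It assumes the post-v0.10 `llama_index.core` import layout (earlier releases used `from llama_index import ...`), a local `data/` directory of documents, and an OpenAI API key for the default LLM:

```python
# Requires: pip install llama-index, plus an LLM API key (OpenAI by default).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # ./data holds your files
index = VectorStoreIndex.from_documents(documents)     # embed and index them
response = index.as_query_engine().query("What does this corpus say about LLMs?")
print(response)
```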
- OpenAssistant SFT 6 - A 30B LLaMA-based chat model released on HuggingFace by the Open Assistant project.
- Vicuna Delta v0 - An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
- MPT 7B - A decoder-style transformer pre-trained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.
- Falcon 7B - A 7B-parameter causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora (loadable via `transformers`, as sketched below).
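Open-weight entries such as Falcon 7B can be loaded directly with Hugging Face `transformers`. A hedged sketch using Falcon's published hub id `tiiuae/falcon-7b` (MPT and Vicuna follow the same pattern with their own ids; some checkpoints need `trust_remote_code=True`):

```python
# Requires: pip install transformers accelerate torch
# (and enough RAM/VRAM for ~14 GB of fp16 weights at 7B parameters)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```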
- Phi-2: The surprising power of small language models
- StackLLaMA: A hands-on guide to train LLaMA with RLHF
- PaLM2
- PaLM2 and Future work: Gemini model
We welcome contributions to the Awesome LLM list! If you'd like to suggest an addition or make a correction, please follow these guidelines:
- Fork the repository and create a new branch for your contribution.
- Make your changes to the README.md file.
- Ensure that your contribution is relevant to the topic of LLM.
- Use the following format to add your contribution:
  `[Name of Resource](Link to Resource) - Description of resource` (see the example below these guidelines)
- Add your contribution in alphabetical order within its category.
- Make sure that your contribution is not already listed.
- Provide a brief description of the resource and explain why it is relevant to LLM.
- Create a pull request with a clear title and description of your changes.
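For instance, a correctly formatted entry would look like this (the name, link, and description here are hypothetical, for illustration only):

`[Example-LLM](https://github.com/example/example-llm) - A hypothetical 7B open-weight instruction-following model.`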
We appreciate your contributions and thank you for helping to make the Awesome LLM list even more awesome!
Alternative AI tools for awesome-llm
Similar Open Source Tools
tt-metal
TT-NN is a Python and C++ neural network op library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.
langfuse
Langfuse is a powerful tool that helps you develop, monitor, and test your LLM applications. With Langfuse, you can: * **Develop:** Instrument your app and start ingesting traces to Langfuse, inspect and debug complex logs, and manage, version, and deploy prompts from within Langfuse. * **Monitor:** Track metrics (cost, latency, quality) and gain insights from dashboards & data exports, collect and calculate scores for your LLM completions, run model-based evaluations, collect user feedback, and manually score observations in Langfuse. * **Test:** Track and test app behaviour before deploying a new version, test expected in and output pairs and benchmark performance before deploying, and track versions and releases in your application. Langfuse is easy to get started with and offers a generous free tier. You can sign up for Langfuse Cloud or deploy Langfuse locally or on your own infrastructure. Langfuse also offers a variety of integrations to make it easy to connect to your LLM applications.
awesome-mobile-llm
Awesome Mobile LLMs is a curated list of Large Language Models (LLMs) and related studies focused on mobile and embedded hardware. The repository includes information on various LLM models, deployment frameworks, benchmarking efforts, applications, multimodal LLMs, surveys on efficient LLMs, training LLMs on device, mobile-related use-cases, industry announcements, and related repositories. It aims to be a valuable resource for researchers, engineers, and practitioners interested in mobile LLMs.
IDvs.MoRec
This repository contains the source code for the SIGIR 2023 paper 'Where to Go Next for Recommender Systems? ID- vs. Modality-based Recommender Models Revisited'. It provides resources for evaluating foundation, transferable, multi-modal, and LLM recommendation models, along with datasets, pre-trained models, and training strategies for IDRec and MoRec using in-batch debiased cross-entropy loss. The repository also offers large-scale datasets, code for SASRec with in-batch debias cross-entropy loss, and information on joining the lab for research opportunities.
Korean-SAT-LLM-Leaderboard
The Korean SAT LLM Leaderboard is a benchmarking project that allows users to test their fine-tuned Korean language models on a 10-year dataset of the Korean College Scholastic Ability Test (CSAT). The project provides a platform to compare human academic ability with the performance of large language models (LLMs) on various question types to assess reading comprehension, critical thinking, and sentence interpretation skills. It aims to share benchmark data, utilize a reliable evaluation dataset curated by the Korea Institute for Curriculum and Evaluation, provide annual updates to prevent data leakage, and promote open-source LLM advancement for achieving top-tier performance on the Korean CSAT.
DataFlow
DataFlow is a data preparation and training system designed to parse, generate, process, and evaluate high-quality data from noisy sources, improving the performance of large language models in specific domains. It constructs diverse operators and pipelines, validated to enhance domain-oriented LLM's performance in fields like healthcare, finance, and law. DataFlow also features an intelligent DataFlow-agent capable of dynamically assembling new pipelines by recombining existing operators on demand.
FlipAttack
FlipAttack is a jailbreak attack tool designed to exploit black-box Language Model Models (LLMs) by manipulating text inputs. It leverages insights into LLMs' autoregressive nature to construct noise on the left side of the input text, deceiving the model and enabling harmful behaviors. The tool offers four flipping modes to guide LLMs in denoising and executing malicious prompts effectively. FlipAttack is characterized by its universality, stealthiness, and simplicity, allowing users to compromise black-box LLMs with just one query. Experimental results demonstrate its high success rates against various LLMs, including GPT-4o and guardrail models.
visionOS-examples
visionOS-examples is a repository containing accelerators for Spatial Computing. It includes examples such as Local Large Language Model, Chat Apple Vision Pro, WebSockets, Anchor To Head, Hand Tracking, Battery Life, Countdown, Plane Detection, Timer Vision, and PencilKit for visionOS. The repository showcases various functionalities and features for Apple Vision Pro, offering tools for developers to enhance their visionOS apps with capabilities like hand tracking, plane detection, and real-time cryptocurrency prices.
vlmrun-cookbook
VLM Run Cookbook is a repository containing practical examples and tutorials for extracting structured data from images, videos, and documents using Vision Language Models (VLMs). It offers comprehensive Colab notebooks demonstrating real-world applications of VLM Run, with complete code and documentation for easy adaptation. The examples cover various domains such as financial documents and TV news analysis.
oumi
Oumi is an open-source platform for building state-of-the-art foundation models, offering tools for data preparation, training, evaluation, and deployment. It supports training and fine-tuning models with various parameters, working with text and multimodal models, synthesizing and curating training data, deploying models efficiently, evaluating models comprehensively, and running on different platforms. Oumi provides a consistent API, reliability, and flexibility for research purposes.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
EVE
EVE is an official PyTorch implementation of Unveiling Encoder-Free Vision-Language Models. The project aims to explore the removal of vision encoders from Vision-Language Models (VLMs) and transfer LLMs to encoder-free VLMs efficiently. It also focuses on bridging the performance gap between encoder-free and encoder-based VLMs. EVE offers a superior capability with arbitrary image aspect ratio, data efficiency by utilizing publicly available data for pre-training, and training efficiency with a transparent and practical strategy for developing a pure decoder-only architecture across modalities.
chat-your-doc
Chat Your Doc is an experimental project exploring various applications based on LLM technology. It goes beyond being just a chatbot project, focusing on researching LLM applications using tools like LangChain and LlamaIndex. The project delves into UX, computer vision, and offers a range of examples in the 'Lab Apps' section. It includes links to different apps, descriptions, launch commands, and demos, aiming to showcase the versatility and potential of LLM applications.
KwaiAgents
KwaiAgents is a series of Agent-related works open-sourced by the [KwaiKEG](https://github.com/KwaiKEG) from [Kuaishou Technology](https://www.kuaishou.com/en). The open-sourced content includes: 1. **KAgentSys-Lite**: a lite version of the KAgentSys in the paper. While retaining some of the original system's functionality, KAgentSys-Lite has certain differences and limitations when compared to its full-featured counterpart, such as: (1) a more limited set of tools; (2) a lack of memory mechanisms; (3) slightly reduced performance capabilities; and (4) a different codebase, as it evolves from open-source projects like BabyAGI and Auto-GPT. Despite these modifications, KAgentSys-Lite still delivers comparable performance among numerous open-source Agent systems available. 2. **KAgentLMs**: a series of large language models with agent capabilities such as planning, reflection, and tool-use, acquired through the Meta-agent tuning proposed in the paper. 3. **KAgentInstruct**: over 200k Agent-related instructions finetuning data (partially human-edited) proposed in the paper. 4. **KAgentBench**: over 3,000 human-edited, automated evaluation data for testing Agent capabilities, with evaluation dimensions including planning, tool-use, reflection, concluding, and profiling.
For similar tasks
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.
sorrentum
Sorrentum is an open-source project that aims to combine open-source development, startups, and brilliant students to build machine learning, AI, and Web3 / DeFi protocols geared towards finance and economics. The project provides opportunities for internships, research assistantships, and development grants, as well as the chance to work on cutting-edge problems, learn about startups, write academic papers, and get internships and full-time positions at companies working on Sorrentum applications.
tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.
zep-python
Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.
telemetry-airflow
This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)
mojo
Mojo is a new programming language that bridges the gap between research and production by combining Python syntax and ecosystem with systems programming and metaprogramming features. Mojo is still young, but it is designed to become a superset of Python over time.
pandas-ai
PandasAI is a Python library that makes it easy to ask questions to your data in natural language. It helps you to explore, clean, and analyze your data using generative AI.
databend
Databend is an open-source cloud data warehouse that serves as a cost-effective alternative to Snowflake. With its focus on fast query execution and data ingestion, it's designed for complex analysis of the world's largest datasets.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.
