Awesome-Colorful-LLM

A collection of recent advancements propelled by large language models (LLMs) across domains including Vision, Audio, Agent, Robotics, Fundamental Sciences such as Mathematics, and Ominous.

Stars: 106

Awesome-Colorful-LLM is a meticulously assembled anthology of vibrant multimodal research focusing on advancements propelled by large language models (LLMs) in domains such as Vision, Audio, Agent, Robotics, and Fundamental Sciences like Mathematics. The repository contains curated collections of works, datasets, benchmarks, projects, and tools related to LLMs and multimodal learning. It serves as a comprehensive resource for researchers and practitioners interested in exploring the intersection of language models and various modalities for tasks like image understanding, video pretraining, 3D modeling, document understanding, audio analysis, agent learning, robotic applications, and mathematical research.

README:

Colorful Multimodal Research

Welcome to our meticulously assembled anthology of vibrant multimodal research, covering Vision, Audio, Agent, Robotics, Fundamental Sciences such as Mathematics, and Ominous (anything else you want). Our collection primarily focuses on the advancements propelled by large language models (LLMs), complemented by an assortment of related collections.

Table of Contents

👀 Vision

🖼 Image

Collection of works about Image + LLMs, Diffusion, see Image for details

  • Image Understanding
    • Reading List
    • Datasets & Benchmarks
  • Image Generation
    • Reading List
  • Open-source Projects

Related Collections (Understanding)

  • VLM_survey, This is the repository of "Vision Language Models for Vision Tasks: a Survey", a systematic survey of VLM studies in various visual recognition tasks including image classification, object detection, semantic segmentation, etc.
  • Awesome-Multimodal-Large-Language-Models, A curated list of Multimodal Large Language Models (MLLMs), including datasets, multimodal instruction tuning, multimodal in-context learning, multimodal chain-of-thought, LLM-aided visual reasoning, foundation models, and others. This list will be updated in real time.
  • LLM-in-Vision, Recent LLM (Large Language Models)-based CV and multi-modal works
  • Awesome-Transformer-Attention, This repo contains a comprehensive paper list of Vision Transformer & Attention, including papers, codes, and related websites
  • Multimodal-AND-Large-Language-Models, Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.
  • Efficient_Foundation_Model_Survey, This repo contains the paper list and figures for A Survey of Resource-efficient LLM and Multimodal Foundation Models.
  • CVinW_Readings, A collection of papers on the topic of Computer Vision in the Wild (CVinW)
  • Awesome-Vision-and-Language, A curated list of awesome vision and language resources
  • Awesome-Multimodal-Research, This repo is reorganized from Awesome-Multimodal-ML
  • Awesome-Multimodal-ML, Reading list for research topics in multimodal machine learning
  • Awesome-Referring-Image-Segmentation, A collection of referring image (video, 3D) segmentation papers and datasets.
  • Awesome-Prompting-on-Vision-Language-Model, This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
  • Mamba-in-CV, A paper list of some recent Mamba-based CV works. If you find some ignored papers, please open issues or pull requests.
  • Efficient-Multimodal-LLMs-Survey, Efficient Multimodal Large Language Models: A Survey

Related Collections (Evaluation)

Related Collections (Generation)

  • Awesome-VQVAE, A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application
  • Awesome-Diffusion-Models, This repository contains a collection of resources and papers on Diffusion Models
  • Awesome-Controllable-Diffusion, Collection of papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, and others.
  • Awesome-LLMs-meet-Multimodal-Generation, A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

Tutorials

📺 Video

Collection of works about Video-Language Pretraining, Video + LLMs, see Video for details

  • Video Understanding
    • Reading List
    • Pretraining Tasks
    • Datasets
      • Pretraining Corpora
      • Video Instructions
    • Benchmarks
      • Common Downstream Tasks
      • Advanced Downstream Tasks
        • Task-Specific Benchmarks
        • Multifaceted Benchmarks
    • Metrics
    • Projects & Tools
  • Video Generation
    • Reading List
    • Metrics
    • Projects

Related Collections (datasets)

Related Collections (understanding)

Related Collections (generation)

  • i2vgen-xl, VGen is an open-source video synthesis codebase developed by the Tongyi Lab of Alibaba Group, featuring state-of-the-art video generative models.

📷 3D

Collection of works about 3D+LLM, see 3D for details

  • Reading List

Related Collections

📰 Document

Related Collections

  • Awesome Document Understanding, A curated list of resources for the Document Understanding (DU) topic, related to Intelligent Document Processing (IDP), which in turn relates to Robotic Process Automation (RPA) from unstructured data, especially from Visually Rich Documents (VRDs).

Vision Encoder

Collection of existing popular vision encoders, see Vision Encoder for details

  • Image Encoder
  • Video Encoder
  • Audio Encoder

👂 Audio

Collection of works about audio+LLM, see Audio for details

  • Reading List

Related Collections

  • awesome-large-audio-models, Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
  • speech-trident, Awesome speech/audio LLMs, representation learning, and codec models
  • Audio-AI-Timeline, Here we will keep track of the latest AI models for waveform based audio generation, starting in 2023!

🔧 Agent

Collection of works about agent learning, see Agent for details

  • Reading List
  • Datasets & Benchmarks
  • Projects
  • Applications

Related Collections

  • LLM-Agent-Paper-Digest, To benefit the research community and promote the LLM-powered agent direction, we organize papers related to LLM-powered agents that were recently published at top conferences
  • LLMAgentPapers, Must-read Papers on Large Language Model Agents.
  • LLM-Agent-Paper-List, In this repository, we provide a systematic and comprehensive survey on LLM-based agents, and list some must-read papers.
  • XLang Paper Reading, Paper collection on building and evaluating language model agents via executable language grounding
  • Awesome-LLMOps, An awesome & curated list of best LLMOps tools for developers
  • Awesome LLM-Powered Agent, Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
  • Awesome LMs with Tools, Language models (LMs) are powerful yet mostly for text-generation tasks. Tools have substantially enhanced their performance for tasks that require complex skills.
  • ToolLearningPapers, Must-read papers on tool learning with foundation models
  • Awesome-ALM, This repo collects research papers about leveraging the capabilities of language models, which can be a good reference for building upper-layer applications
  • LLM-powered Autonomous Agents, Lil'Log, Overview: planning, memory, tool use
  • World Model Papers, Paper collection of the continuing line of work that started from World Models

🤖 Robotic

Collection of works about robotics+LLM, see Robotic for details

  • Reading List

Related Collections (Robotics)

  • Awesome-Robotics-Foundation-Models, This is the partner repository for the survey paper "Foundation Models in Robotics: Applications, Challenges, and the Future". The authors hope this repository can act as a quick reference for roboticists who wish to read the relevant papers and implement the associated methods.
  • Awesome-LLM-Robotics, This repo contains a curated list of papers using Large Language/Multi-Modal Models for Robotics/RL
  • Simulately, A website where we gather useful information about physics simulators for cutting-edge robot learning research. It is still under active development, so stay tuned!
  • Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation, Temporal Action Detection & Weakly Supervised & Semi Supervised Temporal Action Detection & Temporal Action Proposal Generation & Open-Vocabulary Temporal Action Detection.
  • Awesome-TimeSeries-SpatioTemporal-LM-LLM, A professionally curated list of Large (Language) Models and Foundation Models (LLM, LM, FM) for Temporal Data (Time Series, Spatio-temporal, and Event Data) with awesome resources (paper, code, data, etc.), which aims to comprehensively and systematically summarize the recent advances to the best of our knowledge.
  • PromptCraft-Robotics, The PromptCraft-Robotics repository serves as a community for people to test and share interesting prompting examples for large language models (LLMs) within the robotics domain
  • Awesome-Robotics, A curated list of awesome links and software libraries that are useful for robots

Related Collections (embodied)

Related Collections (autonomous driving)

  • Awesome-LLM4AD, A curated list of awesome LLM for Autonomous Driving resources (continually updated)

🔬 Science

♾️ AI for Math

Collection of works about Mathematics + LLMs, see AI4Math for details

  • Reading List

Related Collections

  • Awesome-Scientific-Language-Models, A curated list of pre-trained language models in scientific domains (e.g., mathematics, physics, chemistry, biology, medicine, materials science, and geoscience), covering different model sizes (from <100M to 70B parameters) and modalities (e.g., language, vision, molecule, protein, graph, and table)

🌐 Ominous

Collection of works about LLM + ominous modality, see Ominous for details

  • Reading List

Contributing

Feel free to create a pull request or drop me an email: [email protected]