
Awesome-Latent-CoT

This repository contains a regularly updated paper list for Latent Chain-of-Thought (CoT) Reasoning: Large Language Models (LLMs) reasoning in latent space. It covers pre-training, supervised finetuning, analysis, interpretability, multimodal reasoning, and applications, and aims to showcase advances in reasoning with latent thoughts and continuous concepts.
> *Whereof one cannot speak, thereof one must be silent.* -- Ludwig Wittgenstein
Reasoning in latent space shifts the way AI models think, moving beyond language tokens to represent thought processes in a more abstract, non-language space. Just as humans often think without words, latent space allows for more flexible and efficient reasoning.
- Richer Thought Representation: Latent space captures complex, non-verbal thoughts that language alone can't express.
- Lower Latency: It allows for higher information density, reducing the need for token-based decoding and speeding up reasoning.
This approach brings AI closer to human-like cognition, enabling faster, more flexible, and powerful models for real-world tasks.
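The contrast between token-based and latent reasoning can be made concrete with a toy sketch. The snippet below is illustrative only, not taken from any of the listed papers: the weights are random numpy stand-ins for a real transformer, and all names are hypothetical. Explicit CoT forces every intermediate step through an argmax-decoded token that is re-embedded, while latent CoT feeds the hidden state straight back, so intermediate thoughts are never collapsed into one of a finite vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, HIDDEN = 16, 8
W_embed = rng.normal(size=(VOCAB, HIDDEN))         # token embedding table
W_step  = rng.normal(size=(HIDDEN, HIDDEN)) * 0.3  # stand-in for one model step
W_out   = rng.normal(size=(HIDDEN, VOCAB))         # unembedding / LM head

def step(h):
    """One model step: toy stand-in for a transformer forward pass."""
    return np.tanh(h @ W_step)

def explicit_cot(h, n_steps):
    """Verbalized CoT: each intermediate state is collapsed to a discrete
    token and re-embedded, discarding anything the token cannot express."""
    for _ in range(n_steps):
        h = step(h)
        token = np.argmax(h @ W_out)  # collapse to one of VOCAB symbols
        h = W_embed[token]            # re-embed: information bottleneck
    return h

def latent_cot(h, n_steps):
    """Latent CoT: the hidden state is fed back directly, so intermediate
    'thoughts' stay continuous and full-dimensional."""
    for _ in range(n_steps):
        h = step(h)  # no decode/re-embed round trip
    return h

h0 = rng.normal(size=HIDDEN)
print(latent_cot(h0, 4).shape)  # (8,)
```

In the explicit loop every intermediate state is restricted to one of `VOCAB` embedding rows, while the latent loop carries the full continuous state between steps; this is the information-density argument in miniature.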
If you find our survey useful for your research, please consider citing the following paper:
```bibtex
@article{eit2025latentcot,
  title={Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning},
  author={Xinghao Chen and Anhao Zhao and Heming Xia and Xuan Lu and Hanlin Wang and Yanjun Chen and Wei Zhang and Jian Wang and Wenjie Li and Xiaoyu Shen},
  year={2025},
  eprint={2505.16782},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.16782},
}
```
- 2025-05-22: 📝 The survey is now available on arXiv!
- 2025-02-16: 🚀 Latent CoT Repo launched!
---

- **Think before you speak: Training language models with pause tokens.** Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan. [pdf], 2023.10.
- **Guiding Language Model Reasoning with Planning Tokens.** Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni. [pdf], [code], 2023.10.
- **Thinking Tokens for Language Modeling.** David Herel, Tomas Mikolov. [pdf], 2024.05.
- **Let's think dot by dot: Hidden computation in transformer language models.** Jacob Pfau, William Merrill, Samuel R. Bowman. [pdf], [code], 2024.04.
- **Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking.** Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman. [pdf], 2024.03.
- **Reasoning to Learn from Latent Thoughts.** Yangjun Ruan, Neil Band, Chris J. Maddison, Tatsunori Hashimoto. [pdf], [code], 2025.03.
- **Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning.** Yoichi Ishibashi, Taro Yano, Masafumi Oyamada. [pdf], 2025.03.
- **Disentangling Memory and Reasoning Ability in Large Language Models.** Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang. [pdf], [code], 2024.11.
- **Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning.** DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng. [pdf], 2025.02.
- **Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes.** Zhuocheng Gong, Jian Guan, Wei Wu, Huishuai Zhang, Dongyan Zhao. [pdf], 2025.02.
- **Efficient Pretraining Length Scaling.** Bohong Wu, Shen Yan, Sijun Zhang, Jianqiao Lu, Yutao Zeng, Ya Wang, Xun Zhou. [pdf], 2025.04.
---

- **Training Large Language Models to Reason in a Continuous Latent Space.** Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian. [pdf], [code], 2024.12.
- **Compressed Chain of Thought: Efficient Reasoning Through Dense Representations.** Jeffrey Cheng, Benjamin Van Durme. [pdf], 2024.12.
- **Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding.** Tianqiao Liu, Zui Chen, Zitao Liu, Mi Tian, Weiqi Luo. [pdf], 2024.09.
- **LightThinker: Thinking Step-by-Step Compression.** Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang. [pdf], 2025.02.
- **CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation.** Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, Yulan He. [pdf], 2025.02.
- **SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs.** Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao. [pdf], 2025.02.
- **SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning.** Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao. [pdf], 2025.05.
- **LLM Pretraining with Continuous Concepts.** Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason Weston, Xian Li. [pdf], [code], 2025.02.
- **Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space.** Zhen Zhang, Xuehai He, Weixiang Yan, Ao Shen, Chenyang Zhao, Shuohang Wang, Yelong Shen, Xin Eric Wang. [pdf], [code], 2025.05.
- **Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains.** Wenhui Tan, Jiaze Li, Jianzhong Ju, Zhenbo Luo, Jian Luan, Ruihua Song. [pdf], [code], 2025.05.
- **Hybrid Latent Reasoning via Reinforcement Learning.** Zhenrui Yue, Bowen Jin, Huimin Zeng, Honglei Zhuang, Zhen Qin, Jinsung Yoon, Lanyu Shang, Jiawei Han, Dong Wang. [pdf], [code], 2025.05.
- **Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space.** Zhenrui Yue, Bowen Jin, Huimin Zeng, Honglei Zhuang, Zhen Qin, Jinsung Yoon, Lanyu Shang, Jiawei Han, Dong Wang. [pdf], [code], 2025.05.
- **Enhancing Latent Computation in Transformers with Latent Tokens.** Yuchang Sun, Yanxi Chen, Yaliang Li, Bolin Ding. [pdf], [code], 2025.05.
- **System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts.** Xiaoqiang Wang, Suyuchen Wang, Yun Zhu, Bang Liu. [pdf], 2025.05.
- **Efficient Post-Training Refinement of Latent Reasoning in Large Language Models.** Xinyuan Wang, Dongjie Wang, Wangyang Ying, Haoyue Bai, Nanxu Gong, Sixun Dong, Kunpeng Liu, Yanjie Fu. [pdf], [code], 2025.06.
- **DART: Distilling Autoregressive Reasoning to Silent Thought.** Nan Jiang, Ziming Wu, De-Chuan Zhan, Fuming Lai, Shaobing Lian. [pdf], 2025.06.
- **Parallel Continuous Chain-of-Thought with Jacobi Iteration.** Haoyi Wu, Zhihao Teng, Kewei Tu. [pdf], 2025.06.
- **SynAdapt: Learning Adaptive Reasoning in Large Language Models via Synthetic Continuous Chain-of-Thought.** Jianwei Wang, Ziming Wu, Fuming Lai, Shaobing Lian, Ziqian Zeng. [pdf], 2025.08.
- **LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning.** Jiaqi Wang, Binquan Ji, Haibo Luo, Yiyang Qi, Ruiting Li, Huiyan Wang, Yuantao Han, Cangyi Yang, Jiaxu Zhang, Feiliang Ren. [pdf], 2025.09.
---

- **CoTFormer: A Chain-of-Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference.** Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi. [pdf], 2024.08.
- **Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach.** Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein. [pdf], [code], [model], 2025.02.
- **Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning.** Qifan Yu, Zhenyu He, Sijie Li, Xun Zhou, Jun Zhang, Jingjing Xu, Di He. [pdf], [code], 2025.02.
- **Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking.** Yilong Chen, Junyuan Shang, Zhenyu Zhang, Yanxi Xie, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang. [pdf], 2025.02.
- **Reasoning with Latent Thoughts: On the Power of Looped Transformers.** Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sashank J. Reddi, Sanjiv Kumar. [pdf], 2025.01.
- **Pretraining Language Models to Ponder in Continuous Space.** Boyi Zeng, Shixiang Song, Siyuan Huang, Yixuan Wang, He Li, Ziwei He, Xinbing Wang, Zhiyu Li, Zhouhan Lin. [pdf], 2025.05.
- **The 4th Dimension for Scaling Model Size.** Ruike Zhu, Hanwen Zhang, Tianyu Shi, Chi Wang, Tianyi Zhou, Zengyi Qin. [pdf], 2025.05.
- **Hierarchical Reasoning Model.** Guan Wang, Jin Li, Yuhao Sun, Xing Chen, Changling Liu, Yue Wu, Meng Lu, Sen Song, Yasin Abbasi Yadkori. [pdf], [code], 2025.06.
- **Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs.** Ziyue Li, Yang Li, Tianyi Zhou. [pdf], 2025.07.
- **Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation.** Sangmin Bae, Yujin Kim, Reza Bayat, Sungnyun Kim, Jiyoun Ha, Tal Schuster, Adam Fisch, Hrayr Harutyunyan, Ziwei Ji, Aaron Courville, Se-Young Yun. [pdf], 2025.07.
---

- **Implicit Chain of Thought Reasoning via Knowledge Distillation.** Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber. [pdf], [code], 2023.11.
- **From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step.** Yuntian Deng, Yejin Choi, Stuart Shieber. [pdf], [code], 2024.05.
- **Distilling System 2 into System 1.** Ping Yu, Jing Xu, Jason Weston, Ilia Kulikov. [pdf], 2024.06.
---

- **On the Biology of a Large Language Model.** Anthropic. [pdf], 2025.03.
- **Jump to Conclusions: Short-Cutting Transformers with Linear Transformations.** Alexander Yom Din, Taelin Karidi, Leshem Choshen, Mor Geva. [pdf], 2023.03.
- **Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models.** Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan. [pdf], 2023.10.
- **A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task.** Jannik Brinkmann, Abhay Sheshadri, Victor Levoso, Paul Swoboda, Christian Bartelt. [pdf], 2024.02.
- **Do Large Language Models Latently Perform Multi-Hop Reasoning?** Sohee Yang, Elena Gribovskaya, Nora Kassner, Mor Geva, Sebastian Riedel. [pdf], 2024.02.
- **Understanding and Patching Compositional Reasoning in LLMs.** Zhaoyi Li, Gangwei Jiang, Hong Xie, Linqi Song, Defu Lian, Ying Wei. [pdf], 2024.02.
- **Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning.** Yuval Shalev, Amir Feder, Ariel Goldstein. [pdf], 2024.06.
- **Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization.** Boshi Wang, Xiang Yue, Yu Su, Huan Sun. [pdf], 2024.05.
- **Can Language Models Learn to Skip Steps?** Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang. [pdf], [code], 2024.09.
- **Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Reasoning.** Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Ana Brassard, Keisuke Sakaguchi, Kentaro Inui. [pdf], 2024.12.
- **Do LLMs Really Think Step-by-step In Implicit Reasoning?** Yijiong Yu. [pdf], [code], 2024.11.
- **Implicit Reasoning in Transformers is Reasoning through Shortcuts.** Tianhe Lin, Jian Xie, Siyu Yuan, Deqing Yang. [pdf], 2025.03.
- **Uncovering Latent Chain of Thought Vectors in Language Models.** Jason Zhang, Scott Viteri. [pdf], 2024.09.
- **Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation.** Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Rui Wang. [pdf], 2024.10.
- **Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs.** Zhipeng Yang, Junzhuo Li, Siyu Xia, Xuming Hu. [pdf], [code], 2025.05.
- **To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers.** Kevin Xu, Issei Sato. [pdf], 2025.05.
- **Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought.** Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian. [pdf], 2025.05.
- **Continuous Chain of Thought Enables Parallel Exploration and Reasoning.** Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang, Hrayr Harutyunyan, Ankit Singh Rawat, Samet Oymak. [pdf], 2025.05.
- **Do Language Models Use Their Depth Efficiently?** Róbert Csordás, Christopher D. Manning, Christopher Potts. [pdf], 2025.05.
- **Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer.** Wenquan Lu, Yuechuan Yang, Kyle Lee, Yanshu Li, Enqi Liu. [pdf], 2025.07.
- **LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models.** Chünhung Wu, Jinliang Lu, Zixuan Ren, Gangqiang Hu, Zhi Wu, Dai Dai, Hua Wu. [pdf], 2025.08.
---

- **Efficient Reasoning with Hidden Thinking.** Xuan Shen, Yizhou Wang, Xiangxi Shi, Yanzhi Wang, Pu Zhao, Jiuxiang Gu. [pdf], [code], 2025.01.
- **Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search.** Yifan Ji, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Shi Yu, Yishan Li, Zhiyuan Liu, Yu Gu, Ge Yu, Maosong Sun. [pdf], [code], 2025.02.
- **Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation.** Jiakai Tang, Sunhao Dai, Teng Shi, Jun Xu, Xu Chen, Wen Chen, Wu Jian, Yuning Jiang. [pdf], [code], 2025.03.
- **Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning.** Hongfei Xue, Yufeng Tang, Hexin Liu, Jun Zhang, Xuelong Geng, Lei Xie. [pdf], 2025.04.
---

- **Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models.** Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong. [pdf], 2024.02.
- **Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models.** Zemin Huang, Zhiyang Chen, Zijun Wang, Tiancheng Li, Guo-Jun Qi. [pdf], 2025.05.
- **Multimodal Latent Language Modeling with Next-Token Diffusion.** Yutao Sun, Hangbo Bao, Wenhui Wang, Zhiliang Peng, Li Dong, Shaohan Huang, Jianyong Wang, Furu Wei. [pdf], 2024.12.
- **SEAL: Steerable Reasoning Calibration of Large Language Models for Free.** Runjin Chen, Zhenyu Zhang, Junyuan Hong, Souvik Kundu, Zhangyang Wang. [pdf], [code], 2025.04.
- **SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning.** Yang Liu, Ming Ma, Xiaomin Yu, Pengxiang Ding, Han Zhao, Mingyang Sun, Siteng Huang, Donglin Wang. [pdf], [code], 2025.05.
- **Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models.** Thilo Hagendorff, Sarah Fabi. [pdf], 2025.04.
- **Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens.** Zeyuan Yang, Xueyang Yu, Delin Chen, Maohao Shen, Chuang Gan. [pdf], [code], 2025.06.
- **Bridging Search and Recommendation through Latent Cross Reasoning.** Teng Shi, Weicong Qin, Weijie Yu, Xiao Zhang, Ming He, Jianping Fan, Jun Xu. [pdf], 2025.08.
- **LARES: Latent Reasoning for Sequential Recommendation.** Enze Liu, Bowen Zheng, Xiaolei Wang, Wayne Xin Zhao, Jinpeng Wang, Sheng Chen, Ji-Rong Wen. [pdf], 2025.06.
- **Reinforced Latent Reasoning for LLM-based Recommendation.** Yang Zhang, Wenxin Xu, Xiaoyan Zhao, Wenjie Wang, Fuli Feng, Xiangnan He, Tat-Seng Chua. [pdf], 2025.05.
- **Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models.** Tan-Hanh Pham, Chris Ngo. [pdf], 2025.08.
For the most recent Efficient Reasoning research, see Awesome-Efficient-Reasoning and Awesome-Efficient-Reasoning-Models [Paper].

LatentCoT-Horizon is a paper list covering a broader scope of latent reasoning.

If we've accidentally missed your paper, please reach out to us, and we'll make sure to add it as soon as possible!