llms-learning

A repository sharing literature about large language models

A repository sharing literature and resources about Large Language Models (LLMs) and beyond. It collects tutorials, notebooks, and course assignments alongside papers organized by category: the development stages of LLMs (modeling, inference, training), their applications (autonomous driving, code, math, embodied intelligence), studies, and basics. Topics span language models, Transformers, state-space models, multi-modal language models, training recipes, and more.

README:

llms-learning 📚 🦙

A repository sharing literature and resources about Large Language Models (LLMs) and beyond.

Hope you find this repository handy and helpful for your LLM learning journey! 😊

News 🔥

  • 2025.01.25
    • DeepSeek has unveiled its latest large reasoning model (LRM), DeepSeek-R1, trained using its large-scale reinforcement-learning recipe, GRPO, built upon its newest pretrained large language model (LLM), DeepSeek-V3 (a sketch of GRPO's group-relative objective appears after this news list). This achievement marks a significant milestone from two key perspectives:
        1. In the intense AGI competition sparked by OpenAI, it stands out as the first Chinese model to not only match but frequently surpass the performance of the state-of-the-art LRM OpenAI-o1, all while operating at a fraction of the computational cost.
        2. As for the exploration of AGI, it further validates the effectiveness of the inference scaling law and of emergent reasoning abilities through the Long-CoT method introduced by OpenAI-o1, which mimics the System-2 slow-thinking pattern of human intelligence.
  • 2025.01.15
    • MiniMax has officially open-sourced their latest Mixture-of-Experts (MoE) model featuring Lightning Attention, named MiniMax-01, along with the paper, the code, and the model!
    • I'm truly honored to have contributed as one of the authors of this groundbreaking work 😆!
  • 2024.10.24
    • Welcome to watch our new free online LLM intro course on bilibili!
    • We have also open-sourced the course assignments so you can take a deep dive into LLMs.
    • If you like this course or this repository, you can subscribe to the teacher's bilibili account and maybe ⭐ this GitHub repo 😜.
  • 2024.03.07
    • We offer a comprehensive notebook tutorial on efficient GPU kernel coding with Triton, building upon the official tutorials and extending them with additional hands-on examples, such as the Flash Attention 2 forward/backward kernels (a minimal Triton kernel sketch follows this news list).
    • In addition, we provide a step-by-step math derivation of Flash Attention 2 (its core online-softmax recurrence is also sketched below), enabling a deeper understanding of its underlying mechanics.
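For readers who want a taste of Triton before opening the notebook, here is a minimal sketch of the canonical element-wise kernel pattern (a block of offsets, a bounds mask, then load/compute/store) that the official tutorials start from. The names `add_kernel` and `add` are illustrative, not taken from this repository:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The Flash Attention 2 derivation rests on the online-softmax recurrence: processing key/value blocks $(K_j, V_j)$ one at a time, the kernel maintains a running row-max $m$, a running normalizer $\ell$, and an unnormalized output accumulator $O$, normalizing only once at the end. A condensed sketch of the standard recurrence (not the repository's full derivation):

$$
\begin{aligned}
S_j &= Q K_j^{\top}, \qquad m_j = \max\!\big(m_{j-1},\, \mathrm{rowmax}(S_j)\big),\\
P_j &= \exp\!\big(S_j - m_j\big), \qquad \ell_j = e^{\,m_{j-1} - m_j}\,\ell_{j-1} + \mathrm{rowsum}(P_j),\\
O_j &= e^{\,m_{j-1} - m_j}\, O_{j-1} + P_j V_j, \qquad O = \mathrm{diag}(\ell_T)^{-1} O_T .
\end{aligned}
$$

As referenced in the 2025.01.25 item, GRPO (Group Relative Policy Optimization) replaces PPO's learned value baseline with a group-relative one: for each prompt $q$, sample $G$ responses with rewards $r_1, \dots, r_G$ and normalize each reward within its group. A condensed sketch following the DeepSeek papers, with notation lightly simplified to the sequence level:

$$
\hat{A}_i = \frac{r_i - \mathrm{mean}(r_{1..G})}{\mathrm{std}(r_{1..G})}, \qquad
\mathcal{J}(\theta) = \mathbb{E}\!\left[\frac{1}{G}\sum_{i=1}^{G} \min\!\Big(\rho_i \hat{A}_i,\; \mathrm{clip}\big(\rho_i,\, 1-\varepsilon,\, 1+\varepsilon\big)\,\hat{A}_i\Big)\right] - \beta\, \mathbb{D}_{\mathrm{KL}}\!\big(\pi_\theta \,\|\, \pi_{\mathrm{ref}}\big),
$$

where $\rho_i = \pi_\theta(o_i \mid q) \,/\, \pi_{\theta_{\mathrm{old}}}(o_i \mid q)$ is the importance ratio for response $o_i$.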

Table of Contents


Note:

  • Each markdown file contains collected papers roughly sorted by publication year in descending order; in other words, newer papers generally appear at the top. This ordering is not guaranteed to be exact, since the publication year is not always clear.

  • The taxonomy is complex and not strictly orthogonal, so don't be surprised if the same paper appears multiple times under different tracks.
