awesome-generative-ai-guide
A one-stop repository for generative AI research updates, interview resources, notebooks and much more!
Stars: 4522
This repository serves as a comprehensive hub for updates on generative AI research, interview materials, notebooks, and more. It includes a monthly list of the best GenAI papers, interview resources, free courses, and code repositories/notebooks for developing generative AI applications. The repository is regularly updated with the latest additions to keep users informed and engaged in the field of generative AI.
README:
Generative AI is experiencing rapid growth, and this repository serves as a comprehensive hub for updates on generative AI research, interview materials, notebooks, and more!
Explore the following resources:
- Monthly Best GenAI Papers List
- GenAI Interview Resources
- Applied LLMs Mastery 2024 (created by Aishwarya Naresh Reganti) course material
- List of all GenAI-related free courses (over 70 listed)
- List of code repositories/notebooks for developing generative AI applications
We'll be updating this repository regularly, so keep an eye out for the latest additions!
Happy Learning!
- Applied LLMs Mastery full course content has been released!!! (Click Here)
- 5-day roadmap to learn LLM foundations out now! (Click Here)
- 60 Common GenAI Interview Questions out now! (Click Here)
- ICLR 2024 paper summaries (Click Here)
- List of free GenAI courses (Click Here)
- Generative AI resources and roadmaps
*Updated at the end of every month
Date | Name | Summary | Topics |
---|---|---|---|
29 March 2024 | Gecko: Versatile Text Embeddings Distilled from Large Language Models | Gecko introduces a novel approach for creating compact and efficient text embeddings by distilling knowledge from large language models into a retriever. Utilizing a two-step distillation process that generates diverse, synthetic paired data, Gecko achieves superior retrieval performance. With a focus on compactness, it outperforms larger models and higher-dimensional embeddings on the Massive Text Embedding Benchmark (MTEB), demonstrating its efficacy and potential in improving information retrieval tasks. | LLM Embeddings |
28 March 2024 | Grok-1.5 | Grok-1.5 offers enhanced reasoning capabilities and a context length of 128,000 tokens. It showcases significant advancements in coding, math-related tasks, and long context understanding. With improvements in MATH, GSM8K, and HumanEval benchmarks, Grok-1.5 offers expanded memory capacity and exceptional retrieval capabilities. Built on a custom distributed training framework, it promises efficiency and reliability for large-scale language model research. | Foundational LLM |
28 March 2024 | Don't Use Your Data All at Once: sDPO | sDPO introduces a method for language model training that uses preference datasets in a stepwise manner. Instead of applying all the data at once, it employs parts of the dataset progressively, which leads to more precise reference models and better alignment with human preferences. The resulting model outperforms other popular LLMs, even those with more parameters. | Instruction Tuning |
28 March 2024 | Jamba: AI21's SSM-Transformer Model | AI21 Labs announced Jamba, a novel SSM-Transformer model offering a 256K context window that aims to balance the SSM model's efficiency with the Transformer's capability. It shows significant performance improvements across various benchmarks. Jamba is open-source under Apache 2.0, available on Hugging Face, and coming soon to NVIDIA's API catalog, marking a significant advancement in hybrid model architecture. | Foundational LLM |
28 March 2024 | STaR-GATE: Teaching Language Models to Ask Clarifying Questions | This paper presents STaR-GATE, a novel approach for enhancing language models' interaction skills by training them to ask clarifying questions. By employing a strategic teacher-student learning framework, STaR-GATE aims to improve the models' ability to clarify ambiguities in user queries, thereby enhancing communication effectiveness and accuracy in understanding and responding to complex requests. | Prompt Engineering |
27 March 2024 | Long-form factuality in large language models | This paper tackles the challenge of factuality in LLM-generated content on open-ended topics. It introduces LongFact, a set of prompts for evaluating long-form factuality, and proposes the Search-Augmented Factuality Evaluator (SAFE) method. SAFE assesses the accuracy of facts in LLM responses through a multi-step reasoning process, comparing supported facts against Google Search results. The findings indicate LLMs' potential for superhuman factuality assessment, offering a cost-effective alternative to human annotation. | LLM Factuality |
27 March 2024 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Mini-Gemini presents a framework to enhance multi-modal Vision Language Models (VLMs) by improving visual tokens, constructing high-quality datasets, and guiding VLM-based generation for better performance. It uses an additional visual encoder for high-resolution refinement without increasing visual token count, aiming to enhance image understanding, reasoning, and simultaneous generation capabilities of VLMs. Mini-Gemini has shown leading performance in zero-shot benchmarks, even surpassing well-developed private models. | Multimodal LLM |
27 March 2024 | DBRX | A state-of-the-art open large language model surpassing established models like GPT-3.5 and competing with Gemini 1.0 Pro. DBRX excels in programming and general LLM capabilities, featuring a fine-grained mixture-of-experts architecture for enhanced training and inference efficiency. It's 40% the size of Grok-1, offering faster inference and reduced compute requirements. The model is available on Hugging Face, emphasizing Databricks' commitment to open models and enabling customers to pretrain DBRX-class models with their infrastructure. | Foundational LLM |
25 March 2024 | AIOS: LLM Agent Operating System | AIOS is designed as an LLM agent operating system to optimize resource allocation, enable concurrent execution, and provide access control. It embeds LLMs into operating systems, presenting an "OS with soul" toward AGI. The system improves the performance and efficiency of LLM agents, offering a pioneering platform for the AIOS ecosystem development. | Agents |
22 March 2024 | RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners | The paper introduces RankPrompt, a novel prompting method aimed at improving the reasoning capabilities of Large Language Models like ChatGPT and GPT-4. Unlike existing solutions requiring human annotations or failing in inconsistent scenarios, RankPrompt enables LLMs to self-rank their responses by comparing diverse outputs. Experiments across 11 reasoning tasks demonstrate significant performance enhancements, with up to a 13% improvement. Moreover, RankPrompt aligns with human judgments 74% of the time in open-ended evaluations and exhibits robustness to response variations. This method proves effective in eliciting high-quality feedback from LLMs, offering promising avenues for advancing reasoning abilities. | Prompt Engineering |
22 March 2024 | Mora: Enabling Generalist Video Generation via A Multi-Agent Framework | Mora proposes a new multi-agent framework to address the gap in generalist video generation capabilities, aiming to match the performance of the pioneering model Sora. It leverages multiple visual AI agents to achieve text-to-video generation, image-to-video conversion, video extension, editing, connection, and digital world simulation, demonstrating close performance to Sora across various tasks but with a noticeable gap when assessed holistically. | Multimodal LLM |
22 March 2024 | LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement | This study introduces LLM2LLM, a data enhancement strategy utilizing a teacher-student LLM framework for improving performance in tasks with limited data. It involves fine-tuning a student LLM on initial seed data, identifying errors, and generating new data based on these errors using a teacher LLM. This iterative process significantly boosts LLM performance in low-data regimes across various datasets, demonstrating substantial improvements over traditional fine-tuning and other augmentation methods. | Data Augmentation |
21 March 2024 | Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity | The paper introduces a novel adaptive QA framework. It dynamically selects the most appropriate strategy for handling queries of varying complexities, from simple to sophisticated, by integrating retrieval-augmented LLMs with a complexity-level classifier. This approach aims to balance efficiency and accuracy in response generation across different query types, showing improvements over existing models and adaptive retrieval methods. | RAG |
20 March 2024 | Evaluating Frontier Models for Dangerous Capabilities | This paper pioneers "dangerous capability" evaluations, focusing on areas like persuasion, cyber-security, self-proliferation, and self-reasoning, using Gemini 1.0 models. While no strong dangerous capabilities were found, early warning signs were identified. The study aims to advance the science of evaluating such capabilities in AI models, preparing for future advancements. | LLM Attacks |
19 March 2024 | Evolutionary Optimization of Model Merging Recipes | This paper presents a new approach for automating the creation of powerful foundation models by merging diverse open-source models. It optimizes beyond individual model weights, facilitating cross-domain merging and achieving state-of-the-art performance, notably in Japanese language tasks. This approach introduces a new paradigm for automated model composition, offering efficient alternatives for foundation model development. | Model Merging |
19 March 2024 | Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | This paper addresses the challenge of integrating agent abilities into Large Language Models for improved performance in NLP tasks. It identifies key observations regarding the entanglement of agent training data, varying learning speeds of LLMs, and side-effects of existing approaches. Introducing Agent-FLAN, a method for Fine-tuning LANguage models for Agents, the paper proposes a novel approach to address these challenges. By carefully redesigning the training corpus and incorporating negative samples, Agent-FLAN enables significant performance improvements, outperforming prior works by 3.5% across multiple evaluation datasets. Moreover, it mitigates hallucination issues and enhances LLMs' agent capabilities, even with scaled model sizes, while slightly improving their general capability. | Agents, Hallucination |
18 March 2024 | What Are Tools Anyway? A Survey from the Language Model Perspective | This paper dives into the role of tools in enhancing the performance of language models for text generation tasks. It addresses the ambiguity surrounding the term "tool" and explores how tools aid LMs. Through a systematic review, the paper defines tools as external programs utilized by LMs and examines different tooling scenarios and approaches. Empirical studies assess the efficiency of various tooling methods by measuring compute requirements and performance gains across benchmarks. The survey also identifies challenges and potential avenues for future research in LM tooling. | Agents, Tools, Survey |
17 March 2024 | Grok-1 | Grok-1 is an autoregressive Transformer-based model designed for next-token prediction, fine-tuned with feedback from Grok-0 models and humans. Released in November 2023, it boasts a context length of 8,192 tokens and is geared towards various NLP tasks like question answering and coding assistance. However, while Grok-1 excels in information processing, human review is essential to ensure accuracy as it lacks independent web-search capabilities. Despite access to external sources, the model may still hallucinate. Trained on data up to Q3 2023 from the internet and AI Tutors, its performance was evaluated on reasoning tasks and foreign math questions, with ongoing testing involving early adopters for further refinement. | Foundational LLM |
15 March 2024 | RAFT: Adapting Language Model to Domain Specific RAG | This paper introduces Retrieval Augmented FineTuning (RAFT), a training approach aimed at enhancing the ability of Large Language Models to answer questions in domain-specific settings. RAFT leverages retrieval augmented fine-tuning to enable the model to effectively incorporate new knowledge into its reasoning process. By training the model to disregard irrelevant documents (distractor documents) and cite relevant sequences from retrieved documents, RAFT improves the model's ability to provide accurate and coherent responses. Experimental results across various datasets demonstrate the effectiveness of RAFT in domain-specific Retrieval Augmented Generation, offering a valuable post-training recipe for enhancing pre-trained LLMs in domain-specific contexts. | RAG, Fine-Tuning |
14 March 2024 | Logits of API-Protected LLMs Leak Proprietary Information | This paper reveals that even with restricted API access to proprietary Large Language Models, significant proprietary information can be inferred from a small number of API queries. By exploiting a softmax bottleneck present in most modern LLMs, the research demonstrates the ability to unveil hidden aspects of the model architecture and obtain full-vocabulary outputs. This includes efficiently discovering hidden model sizes, identifying different model updates, and estimating output layer parameters. Empirical investigations on OpenAI's gpt-3.5-turbo reveal its embedding size to be approximately 4,096. The paper concludes by discussing potential measures for LLM providers to mitigate such attacks and suggests viewing these capabilities as opportunities for enhanced transparency and accountability rather than vulnerabilities. | LLM Attacks, Privacy |
14 March 2024 | Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | This paper introduces Quiet-STaR, a method aimed at enabling language models to learn to generate rationales to explain future text, thereby improving their predictive abilities. Building upon the Self-Taught Reasoner (STaR) framework, Quiet-STaR allows LMs to infer unstated rationales in arbitrary text. Key challenges addressed include computational costs, the LM's initial unfamiliarity with generating internal thoughts, and predicting beyond individual tokens. The proposed method involves tokenwise parallel sampling, learnable tokens for indicating thought boundaries, and extended teacher-forcing techniques. Quiet-STaR leads to significant improvements in LM performance on tasks like GSM8K and CommonsenseQA without requiring fine-tuning on those tasks, marking a step towards more general and scalable reasoning capabilities in LMs. | Prompt Engineering |
14 March 2024 | MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training | This paper explores the development of high-performing Multimodal Large Language Models (MLLMs) and investigates the significance of various architecture components and data choices. Through meticulous ablations of the image encoder, vision language connector, and pre-training data options, several crucial design insights are uncovered. For instance, the careful integration of image-caption, interleaved image-text, and text-only data is shown to be essential for achieving state-of-the-art few-shot results across multiple benchmarks. Additionally, the impact of image resolution and token count in the image encoder is highlighted, while the vision-language connector design is found to be comparatively less critical. Scaling up the proposed approach results in MM1, a family of multimodal models with up to 30B parameters, including dense models and mixture-of-experts variants. MM1 achieves state-of-the-art pre-training metrics and competitive performance on various multimodal benchmarks, benefiting from enhanced in-context learning and multi-image reasoning capabilities enabled by large-scale pre-training. | Multimodal LLM |
13 March 2024 | Knowledge Conflicts for LLMs: A Survey | This survey dives into the intricacies of knowledge conflicts encountered by large language models, focusing on the blending of contextual and parametric knowledge. It identifies three main categories of conflicts: context-memory, inter-context, and intra-memory conflicts, which can significantly impact LLM trustworthiness and performance, particularly in real-world scenarios with noise and misinformation. Through categorization, exploration of causes, observation of LLM behaviors, and review of existing solutions, the survey aims to provide insights into strategies for enhancing LLM robustness, serving as a valuable resource for advancing research in this domain. | LLM Robustness |
12 March 2024 | MoAI: Mixture of All Intelligence for Large Language and Vision Models | MoAI introduces an innovative approach to combine the strengths of large language and vision models with specialized computer vision models for tasks like segmentation and OCR. By leveraging auxiliary visual information and blending it with language features through a unique modular design, MoAI achieves superior performance in various zero-shot visual language tasks, particularly in real-world scene understanding, without increasing model size or requiring additional visual instruction datasets. | Multimodal LLM |
12 March 2024 | Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | This paper explores methods for efficiently training Large Language Models to excel in multiple specialized domains such as coding, math reasoning, and world knowledge. Introducing Branch-Train-MiX (BTX), the approach starts with a seed model and branches to train experts in parallel, reducing communication costs. After training, BTX combines the experts' feedforward parameters into Mixture-of-Expert (MoE) layers, followed by an MoE-finetuning stage to learn token-level routing. BTX encompasses two special cases: Branch-Train-Merge, which lacks the MoE finetuning stage, and sparse upcycling, which skips asynchronous training. Results demonstrate that BTX offers the best accuracy-efficiency tradeoff compared to alternative methods. | MoEs, Foundational LLM |
11 March 2024 | Stealing Part of a Production Language Model | This paper presents the first model-stealing attack capable of extracting precise information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. By leveraging typical API access, the attack can recover the embedding projection layer of a transformer model, including symmetries. Remarkably, the attack achieves this for under $20 USD, revealing hidden dimensions of 1024 and 2048 for OpenAI's Ada and Babbage models, respectively. Additionally, the exact hidden dimension size of the gpt-3.5-turbo model is recovered, with an estimated cost of under $2,000 in queries to retrieve the entire projection matrix. The paper concludes with discussions on potential defenses and mitigations, as well as implications for future work that could extend the attack. | LLM Attacks |
8 March 2024 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation | This paper introduces Retrieval Augmented Thoughts (RAT), a method aimed at enhancing large language models' reasoning and generation abilities in long-horizon generation tasks while reducing hallucination. RAT iteratively revises a chain of thoughts by incorporating relevant retrieved information at each step. Applied to GPT-3.5, GPT-4, and CodeLLaMA-7b, RAT significantly improves performance across various tasks, with average rating score increases of 13.63% in code generation, 16.96% in mathematical reasoning, 19.2% in creative writing, and 42.78% in embodied task planning. | RAG, Prompt Engineering |
7 March 2024 | Common 7B Language Models Already Possess Strong Math Capabilities | This research reveals that smaller, 7B-sized language models, specifically LLaMA-2, already exhibit strong mathematical abilities, challenging previous assumptions that such capabilities require very large models or extensive math-focused pre-training. By leveraging synthetic data and scaling strategies, the study significantly improves the model's math-solving accuracy, surpassing previous benchmarks and demonstrating that with appropriate training, even relatively small models can achieve remarkable math performance. | Domain Specific LLMs |
7 March 2024 | ShortGPT: Layers in Large Language Models are More Redundant Than You Expect | This paper introduces ShortGPT, which demonstrates a high degree of redundancy across the layers of large language models. By evaluating the necessity of each layer through a metric called Block Influence (BI), the authors propose a straightforward pruning method. Their approach, which simplifies the model by removing redundant layers, shows significant improvements in efficiency without compromising on the model's performance, marking a step forward in optimizing LLM architectures. | Smaller LLMs |
7 March 2024 | Can Large Language Models Reason and Plan? | This paper questions the ability of large language models to perform self-critique and correct their erroneous guesses, a capability humans occasionally demonstrate. This inquiry underscores the distinct nature of human cognitive processes compared to the computational mechanisms of LLMs, challenging the assumption of equivalent reasoning and self-correction abilities between the two. | Prompt Engineering |
6 March 2024 | GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection | This paper proposes a novel training strategy called GaLore. This approach aims to reduce the memory requirements of training large language models by implementing gradient low-rank projection, significantly cutting down the memory used by optimizer states without sacrificing performance. It allows for the efficient training of large models on consumer-grade GPUs, marking a significant advancement in the accessibility of AI model training. | Memory Optimization |
5 March 2024 | KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | The work introduces KnowAgent, a novel approach designed to enhance large language models' planning capabilities by incorporating explicit action knowledge. This integration aims to address the inadequacies in current models that lack built-in action knowledge, leading to planning hallucination. KnowAgent uses an action knowledge base and a self-learning strategy to guide planning trajectories, resulting in more accurate and efficient problem-solving across various domains. | Agents |
4 March 2024 | The Claude 3 Model Family: Opus, Sonnet, Haiku | This technical report from Anthropic introduces Claude 3, a new family of large multimodal models designed to address various needs within the AI landscape. Claude 3 comprises three distinct offerings: Opus, Sonnet, and Haiku, each tailored to different requirements in terms of capability, speed, and cost-effectiveness. All models feature vision capabilities for image data processing. Across benchmark evaluations, the Claude 3 family demonstrates robust performance, setting new standards in reasoning, math, and coding tasks. Claude 3 Opus achieves state-of-the-art results on several evaluations, while Haiku performs comparably to Claude 2 on text-based tasks, and Sonnet and Opus significantly surpass it. Moreover, these models exhibit enhanced fluency in non-English languages, enhancing their versatility for a global audience. The report also includes an in-depth analysis of evaluations, focusing on core capabilities, safety considerations, societal impacts, and adherence to Responsible Scaling Policy. | Foundational LLM |
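As a rough illustration of one technique from the table above, the ShortGPT entry describes scoring each layer with a Block Influence (BI) metric and pruning the layers that matter least. The sketch below assumes BI is one minus the average cosine similarity between the hidden states entering and leaving a layer. The function names and the toy hidden states are hypothetical and are not taken from the paper's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def block_influence(layer_inputs, layer_outputs):
    """Assumed BI definition: 1 - mean cosine similarity over token positions.

    A score near 0 means the layer barely changes the direction of the
    hidden states, which marks it as a pruning candidate.
    """
    sims = [cosine(x, y) for x, y in zip(layer_inputs, layer_outputs)]
    return 1.0 - sum(sims) / len(sims)


def rank_layers(hidden_states, num_to_prune):
    """Score every layer and return (scores, indices of lowest-BI layers).

    hidden_states[i] holds the token vectors entering layer i, and
    hidden_states[i + 1] holds the vectors leaving it.
    """
    scores = [
        block_influence(hidden_states[i], hidden_states[i + 1])
        for i in range(len(hidden_states) - 1)
    ]
    by_score = sorted(range(len(scores)), key=lambda i: scores[i])
    return scores, sorted(by_score[:num_to_prune])


# Toy hidden states (made up for illustration): 3 layers, 2 tokens, 2 dims.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],  # input to layer 0
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0 leaves the states unchanged
    [[0.0, 1.0], [1.0, 0.0]],  # layer 1 rotates every token vector
    [[0.0, 2.0], [2.0, 0.0]],  # layer 2 only rescales
]
scores, prunable = rank_layers(toy_states, num_to_prune=2)
print(scores, prunable)
```

In this toy setup, layers 0 and 2 preserve the direction of every token vector, so they receive the lowest scores and are the ones selected for removal. In practice the hidden states would come from running a real model over a calibration set.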
Join 1000+ students on this 10-week adventure as we delve into the application of LLMs across a variety of use cases
Link to the course website
[Feb 2024] Registrations are still open; click here to register
*Week 1 [Jan 15 2024]*: Practical Introduction to LLMs
- Applied LLM Foundations
- Real World LLM Use Cases
- Domain and Task Adaptation Methods
*Week 2 [Jan 22 2024]*: Prompting and Prompt Engineering
- Basic Prompting Principles
- Types of Prompting
- Applications, Risks and Advanced Prompting
*Week 3 [Jan 29 2024]*: LLM Fine-tuning
- Basics of Fine-Tuning
- Types of Fine-Tuning
- Fine-Tuning Challenges
*Week 4 [Feb 5 2024]*: RAG (Retrieval-Augmented Generation)
- Understanding the concept of RAG in LLMs
- Key components of RAG
- Advanced RAG Methods
*Week 5 [Feb 12 2024]*: Tools for building LLM Apps
- Fine-tuning Tools
- RAG Tools
- Tools for observability, prompting, serving, vector search etc.
*Week 6 [Feb 19 2024]*: Evaluation Techniques
- Types of Evaluation
- Common Evaluation Benchmarks
- Common Metrics
*Week 7 [Feb 26 2024]*: Building Your Own LLM Application
- Components of LLM application
- Build your own LLM App end to end
*Week 8 [March 4 2024]*: Advanced Features and Deployment
- LLM lifecycle and LLMOps
- LLM Monitoring and Observability
- Deployment strategies
*Week 9 [March 11 2024]*: Challenges with LLMs
- Scaling Challenges
- Behavioral Challenges
- Future directions
*Week 10 [March 18 2024]*: Emerging Research Trends
- Smaller and more performant models
- Multimodal models
- LLM Alignment
*Week 11 *Bonus* [March 25 2024]*: Foundations
- Generative Models Foundations
- Self-Attention and Transformers
- Neural Networks for Language
- Large Language Models by ETH Zurich
- Understanding Large Language Models by Princeton
- Transformers course by Huggingface
- NLP course by Huggingface
- CS324 - Large Language Models by Stanford
- Generative AI with Large Language Models by Coursera
- Introduction to Generative AI by Coursera
- Generative AI Fundamentals by Google Cloud
- Introduction to Large Language Models by Google Cloud
- Introduction to Generative AI by Google Cloud
- Generative AI Concepts by DataCamp (Daniel Tedesco Data Lead @ Google)
- 1 Hour Introduction to LLM (Large Language Models) by WeCloudData
- LLM Foundation Models from the Ground Up | Primer by Databricks
- Generative AI Explained by Nvidia
- Transformer Models and BERT Model by Google Cloud
- Introduction to Responsible AI by Google Cloud
- Fundamentals of Generative AI by Microsoft Azure
- Generative AI for Beginners by Microsoft
- ChatGPT for Beginners: The Ultimate Use Cases for Everyone by Udemy
- [1hr Talk] Intro to Large Language Models by Andrej Karpathy
- ChatGPT for Everyone by Learn Prompting
- LLMOps: Building Real-World Applications With Large Language Models by Udacity
- Full Stack LLM Bootcamp by FSDL
- Generative AI for beginners by Microsoft
- Large Language Models: Application through Production by Databricks
- Generative AI Foundations by AWS
- LLM University by Cohere
- LLM Learning Lab by Lightning AI
- Functions, Tools and Agents with LangChain by Deeplearning.AI
- LangChain for LLM Application Development by Deeplearning.AI
- LLMOps by DeepLearning.AI
- Automated Testing for LLMOps by DeepLearning.AI
- Building RAG Agents with LLMs by Nvidia
- Building Generative AI Applications Using Amazon Bedrock by AWS
- Efficiently Serving LLMs by DeepLearning.AI
- Building Systems with the ChatGPT API by DeepLearning.AI
- Serverless LLM apps with Amazon Bedrock by DeepLearning.AI
- Building Applications with Vector Databases by DeepLearning.AI
- Automated Testing for LLMOps by DeepLearning.AI
- LLMOps by DeepLearning.AI
- Build LLM Apps with LangChain.js by DeepLearning.AI
- Advanced Retrieval for AI with Chroma by DeepLearning.AI
- Operationalizing LLMs on Azure by Coursera
- Generative AI Full Course – Gemini Pro, OpenAI, Llama, Langchain, Pinecone, Vector Databases & More by freeCodeCamp.org
- Training & Fine-Tuning LLMs for Production by Activeloop
- LangChain & Vector Databases in Production by Activeloop
- Reinforcement Learning from Human Feedback by DeepLearning.AI
- Building Applications with Vector Databases by DeepLearning.AI
- Finetuning Large Language Models by Deeplearning.AI
- LangChain: Chat with Your Data by Deeplearning.AI
- Building Systems with the ChatGPT API by Deeplearning.AI
- Prompt Engineering with Llama 2 by Deeplearning.AI
- Building Applications with Vector Databases by Deeplearning.AI
- ChatGPT Prompt Engineering for Developers by Deeplearning.AI
- Advanced RAG Orchestration series by LlamaIndex
- Prompt Engineering Specialization by Coursera
- Augment your LLM Using Retrieval Augmented Generation by Nvidia
- Knowledge Graphs for RAG by Deeplearning.AI
- Open Source Models with Hugging Face by Deeplearning.AI
- Vector Databases: from Embeddings to Applications by Deeplearning.AI
- Understanding and Applying Text Embeddings by Deeplearning.AI
- JavaScript RAG Web Apps with LlamaIndex by Deeplearning.AI
- Quantization Fundamentals with Hugging Face by Deeplearning.AI
- Preprocessing Unstructured Data for LLM Applications by Deeplearning.AI
- Retrieval Augmented Generation for Production with LangChain & LlamaIndex by Activeloop
- Building and Evaluating Advanced RAG Applications by DeepLearning.AI
- Evaluating and Debugging Generative AI Models Using Weights and Biases by Deeplearning.AI
- Quality and Safety for LLM Applications by Deeplearning.AI
- Red Teaming LLM Applications by Deeplearning.AI
- How Diffusion Models Work by DeepLearning.AI
- How to Use Midjourney, AI Art and ChatGPT to Create an Amazing Website by Brad Hussey
- Build AI Apps with ChatGPT, DALL-E and GPT-4 by Scrimba
- 11-777: Multimodal Machine Learning by Carnegie Mellon University
- Avoiding AI Harm by Coursera
- Developing AI Policy by Coursera
- Common GenAI Interview Questions
- Prompting and Prompt Engineering
- Model Fine-Tuning
- Model Evaluation
- MLOps for GenAI
- Generative Models Foundations
- Latest Research Trends
- Designing an LLM-Powered Search Engine
- Building a Customer Support Chatbot
- Building a system for natural language interaction with your data.
- Building an AI Co-pilot
- Designing a Custom Chatbot for Q/A on Multimodal Data (Text, Images, Tables, CSV Files)
- Building an Automated Product Description and Image Generation System for E-commerce
- AWS Bedrock Workshop Tutorials by Amazon Web Services
- Langchain Tutorials by gkamradt
- LLM Applications for production by ray-project
- LLM tutorials by Ollama
- LLM Hub by mallahyari
- LLM Fine-tuning tutorials by ashishpatel26
- PEFT example notebooks by Huggingface
- Free LLM Fine-Tuning Notebooks by Youssef Hosni
If you want to add to the repository or find any issues, please feel free to raise a PR and ensure correct placement within the relevant section or category.
To cite this guide, use the below format:
@article{areganti_generative_ai_guide,
author = {Reganti, Aishwarya Naresh},
journal = {https://github.com/aishwaryanr/awesome-generative-ai-resources},
month = {01},
title = {{Generative AI Guide}},
year = {2024}
}
[MIT License]
Similar Open Source Tools
awesome-llms-fine-tuning
This repository is a curated collection of resources for fine-tuning Large Language Models (LLMs) like GPT, BERT, RoBERTa, and their variants. It includes tutorials, papers, tools, frameworks, and best practices to aid researchers, data scientists, and machine learning practitioners in adapting pre-trained models to specific tasks and domains. The resources cover a wide range of topics related to fine-tuning LLMs, providing valuable insights and guidelines to streamline the process and enhance model performance.
Building-AI-Applications-with-ChatGPT-APIs
This repository is for the book 'Building AI Applications with ChatGPT APIs' published by Packt. It provides code examples and instructions for mastering ChatGPT, Whisper, and DALL-E APIs through building innovative AI projects. Readers will learn to develop AI applications using ChatGPT APIs, integrate them with frameworks like Flask and Django, create AI-generated art with DALL-E APIs, and optimize ChatGPT models through fine-tuning.
SuperKnowa
SuperKnowa is a fast framework to build Enterprise RAG (Retriever Augmented Generation) Pipelines at Scale, powered by watsonx. It accelerates Enterprise Generative AI applications to get prod-ready solutions quickly on private data. The framework provides pluggable components for tackling various Generative AI use cases using Large Language Models (LLMs), allowing users to assemble building blocks to address challenges in AI-driven text generation. SuperKnowa is battle-tested from 1M to 200M private knowledge base & scaled to billions of retriever tokens.
nlp-llms-resources
The 'nlp-llms-resources' repository is a comprehensive resource list for Natural Language Processing (NLP) and Large Language Models (LLMs). It covers a wide range of topics including traditional NLP datasets, data acquisition, libraries for NLP, neural networks, sentiment analysis, optical character recognition, information extraction, semantics, topic modeling, multilingual NLP, domain-specific LLMs, vector databases, ethics, costing, books, courses, surveys, aggregators, newsletters, papers, conferences, and societies. The repository provides valuable information and resources for individuals interested in NLP and LLMs.
SeaLLMs
SeaLLMs are a family of language models optimized for Southeast Asian (SEA) languages. They were pre-trained from Llama-2, on a tailored publicly-available dataset, which comprises texts in Vietnamese 🇻🇳, Indonesian 🇮🇩, Thai 🇹🇭, Malay 🇲🇾, Khmer 🇰🇭, Lao 🇱🇦, Tagalog 🇵🇭 and Burmese 🇲🇲. The SeaLLM-chat underwent supervised finetuning (SFT) and specialized self-preferencing DPO using a mix of public instruction data and a small number of queries used by SEA language native speakers in natural settings, which **adapt to the local cultural norms, customs, styles and laws in these areas**. SeaLLM-13b models exhibit superior performance across a wide spectrum of linguistic tasks and assistant-style instruction-following capabilities relative to comparable open-source models. Moreover, they outperform **ChatGPT-3.5** in non-Latin languages, such as Thai, Khmer, Lao, and Burmese.
End-to-End-LLM
The End-to-End LLM Bootcamp is a comprehensive training program that covers the entire process of developing and deploying large language models. Participants learn to preprocess datasets, train models, optimize performance using NVIDIA technologies, understand guardrail prompts, and deploy AI pipelines using Triton Inference Server. The bootcamp includes labs, challenges, and practical applications, with a total duration of approximately 7.5 hours. It is designed for individuals interested in working with advanced language models and AI technologies.
awesome-llm-json
This repository is an awesome list dedicated to resources for using Large Language Models (LLMs) to generate JSON or other structured outputs. It includes terminology explanations, hosted and local models, Python libraries, blog articles, videos, Jupyter notebooks, and leaderboards related to LLMs and JSON generation. The repository covers various aspects such as function calling, JSON mode, guided generation, and tool usage with different providers and models.
ByteMLPerf
ByteMLPerf is an AI Accelerator Benchmark that focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware. ByteMLPerf has the following characteristics: models and runtime environments are more closely aligned with practical business use cases; for ASIC hardware evaluation, besides evaluating performance and accuracy, it also measures metrics like compiler usability and coverage; and performance and accuracy results obtained from testing on the open Model Zoo serve as reference metrics for evaluating ASIC hardware integration.
awesome-transformer-nlp
This repository contains a hand-curated list of great machine (deep) learning resources for Natural Language Processing (NLP) with a focus on Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT), attention mechanism, Transformer architectures/networks, Chatbot, and transfer learning in NLP.
hallucination-index
LLM Hallucination Index - RAG Special is a comprehensive evaluation of large language models (LLMs) focusing on context length and open vs. closed-source attributes. The index explores the impact of context length on model performance and tests the assumption that closed-source LLMs outperform open-source ones. It also investigates the effectiveness of prompting techniques like Chain-of-Note across different context lengths. The evaluation includes 22 models from various brands, analyzing major trends and declaring overall winners based on short, medium, and long context insights. Methodologies involve rigorous testing with different context lengths and prompting techniques to assess models' abilities in handling extensive texts and detecting hallucinations.
llvm-aie
This repository extends the LLVM framework to generate code for use with AMD/Xilinx AI Engine processors. AI Engine processors are in-order, exposed-pipeline VLIW processors focused on application acceleration for AI, Machine Learning, and DSP applications. The repository adds LLVM support for specific features like non-power of 2 pointers, operand latencies, resource conflicts, negative operand latencies, slot assignment, relocations, code alignment restrictions, and register allocation. It includes support for Clang, LLD, binutils, Compiler-RT, and LLVM-LIBC.
fms-fsdp
The 'fms-fsdp' repository is a companion to the Foundation Model Stack, providing a (pre)training example to efficiently train FMS models, specifically Llama2, using native PyTorch features like FSDP for training and SDPA implementation of Flash attention v2. It focuses on leveraging FSDP for training efficiently, not as an end-to-end framework. The repo benchmarks training throughput on different GPUs, shares strategies, and provides installation and training instructions. It trained a model on IBM curated data achieving high efficiency and performance metrics.
ianvs
Ianvs is a distributed synergy AI benchmarking project incubated in KubeEdge SIG AI. It aims to test the performance of distributed synergy AI solutions following recognized standards, providing end-to-end benchmark toolkits, test environment management tools, test case control tools, and benchmark presentation tools. It also collaborates with other organizations to establish comprehensive benchmarks and related applications. The architecture includes critical components like Test Environment Manager, Test Case Controller, Generation Assistant, Simulation Controller, and Story Manager. Ianvs documentation covers quick start, guides, dataset descriptions, algorithms, user interfaces, stories, and roadmap.
do-not-answer
Do-Not-Answer is an open-source dataset curated to evaluate Large Language Models' safety mechanisms at a low cost. It consists of prompts to which responsible language models do not answer. The dataset includes human annotations and model-based evaluation using a fine-tuned BERT-like evaluator. The dataset covers 61 specific harms and collects 939 instructions across five risk areas and 12 harm types. Response assessment is done for six models, categorizing responses into harmfulness and action categories. Both human and automatic evaluations show the safety of models across different risk areas. The dataset also includes a Chinese version with 1,014 questions for evaluating Chinese LLMs' risk perception and sensitivity to specific words and phrases.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models, from images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It has several key features: it is self-contained, with no need for a DBMS or cloud service; it provides an OpenAPI interface that is easy to integrate with existing infrastructure (e.g., a Cloud IDE); and it supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.