
ai_summer
Summary repository for AI Summer 2024
Stars: 59

AI Summer is a repository focused on providing workshops and resources for developing foundational skills in generative AI models and transformer models. The repository offers practical applications for inferencing and training, with a specific emphasis on understanding and utilizing advanced AI chat models like BingGPT. Participants are encouraged to engage in interactive programming environments, decide on projects to work on, and actively participate in discussions and breakout rooms. The workshops cover topics such as generative AI models, retrieval-augmented generation, building AI solutions, and fine-tuning models. The goal is to equip individuals with the necessary skills to work with AI technologies effectively and securely, both locally and in the cloud.
README:
Summary repository for AI Summer 2024. Introduction to generative AI, with practical applications to inferencing and training
Presented by Vanderbilt Data Science Institute data scientists:
- Dr. Jesse Spencer-Smith, Chief Data Scientist
- Dr. Charreau Bell, Senior Data Scientist
- Myranda Shirk, Senior Data Scientist
- Umang Chaudhry, Data Scientist
- Dr. Abigail Petulante, DSI Postdoctoral Fellow
- Dr. Joshua Su, DSI Postdoctoral Fellow
The objective of these workshops is to develop foundational skills in understanding, inferencing and training generative AI models and other transformer models.
Practice your Python skills using the below documents. Choose either a Google Colab for interactive programming environment, or alternatively read through the Google Doc.
You’ll want to use the most advanced AI chat model that you can get access to. Microsoft just opened access to BingGPT through Bing Chat, which is based on an early version of GPT4, currently the most advanced AI chat model available to the public. You’ll need to install the Edge browser (https://www.microsoft.com › edge › download) and go to bing. com. Click on “Chat”.
Think about any data you might want to bring to the workshop. Also begin thinking about any projects you might want to accomplish during our month. We’ll have office hours for you to work with us to get your first project off the ground!
Session will run live from 9am-11am, with an office hour from 11am to noon (all times Central).
No class Friday (Vanderbilt Commencement)
Weeks 2, 5/13 - 5/17: Retrieval-Augmented Generation (RAG), Assistants, Agents, and Intro to Diffusion Models
Week 3, 5/20 - 5/24: Building AI Solutions, Running AI Securely Locally or in the Cloud, Introduction to Training Models
Monday:
-
Homework: Watch the following videos: General Backprop and (math-centric backprop](https://youtu.be/tIeHLnjs5U8?si=mnT36GTL7YqU8qBO)
Wednesday: Recording: (https://vanderbilt.zoom.us/rec/share/fswTlpFMlqAVgxRDDBza920i9brAuxaSiteHpDNUwpm9YQzedJa5g_2oZSSr2Eq1.wF73yKYGD5eY3cyY?startTime=1716392393000)
Friday: Recording: (https://vanderbilt.zoom.us/rec/share/plozihJcLFBIfjPxQ8Bsv9IdqHh39qFinkVUChsYtuiuiGAc8O2TcvTEbTE5cAUW.3XYBPJfbdZJ1GzAS?startTime=1716558902000)
No class Monday (Memorial Day)
Wednesday:
Papers/Blogs discussed:
https://arxiv.org/pdf/2405.17247
https://proceedings.mlr.press/v139/radford21a/radford21a.pdf
https://arxiv.org/pdf/2405.09818
https://arxiv.org/pdf/2304.10592
https://arxiv.org/pdf/2310.03744
https://huggingface.co/papers/2311.05437
https://arxiv.org/pdf/2311.05437
https://llava-vl.github.io/blog/2024-01-30-llava-next/
https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/
https://llava-vl.github.io/blog/2024-04-30-llava-next-video/
https://arxiv.org/abs/2310.02239
Remember we are all learning and exploring
- Please share your video upon entering the room and unmute
- Share your screens--someone volunteer to share their screen upon entering, and everyone be ready to share your screen to show what you’ve found
- Make notes of what you’ve discussed in the Response Reports below
- Everyone be ready to report out (random)
- Make some friends
- Breakout Rooms Worksheets
Google Docs has a limit of 100 people viewing/editing a document at one time.
Please be sure your display name is set in Zoom. If you are in one of the following special groups, please pre-pend your name with one of the following qualifiers.
- Data Science for Social Good: DSSG
- Center for AI in Protein Dynamics: Protein
- If you are in a lab and would like your own breakout room: Labname (keep it short, please!)
- If you are faculty and would like to be in a breakout room with other faculty: Faculty
For example, I might be DSSG-Jesse Spencer-Smith
Video recordings of these workshops can be found on our YouTube channel AI Summer playlist
Looking for the code resources for Summer 2023? View the 2023 repo version here.
- Prompt Engineering paper https://arxiv.org/abs/2302.11382
- Prompt Engineering Courserea Course: https://www.coursera.org/learn/prompt-engineering
- Visual overview of Generative AI from 3Blue1Brown: https://www.youtube.com/watch?v=wjZofJX0v4M
- Semester-long course on transformer models, DS 5690. Graduate students and advanced undergraduates can register by contacting me. I welcome auditing by a select number of postdoctoral fellows, and drop-ins from faculty!
DGX A100 Compute Grant: https://forms.gle/2mGfEy9DB4JU2GpZ8
- Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra and Thomas Wolf. If you are affiliated with Vanderbilt University, you can access this pre-print book (and any book by O’Reilly) free by logging into O'Reilly Media using your Vanderbilt email address. Vanderbilt licenses all content from O’Reilly. The book covers Transformers for purposes beyond text.
To get the most out of this workshop:
- Open Colab (workbook) notebooks and actively write code along with the instructor
- Actively participate in discussions
- Actively participate in breakout rooms
- Work on homework assignments before coming to class
- Relax your mind and ask questions
- Open the Edge browser (yes, Edge) and navigate to www.bing.com
- Select "chat". A new window should open saying you need the new Bing.
- Select "Start chatting" at the bottom of this window. This should prompt you to sign in to a Microsoft account. Do not use an organizational/school email (such as Vanderbilt). Instead, select "No account? Create a new one" and create one with your personal email. Note: if you get stuck in the "use the new Bing" window, go back to Bing.com and select "Sign in" instead. Follow instructions for Step 3.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for ai_summer
Similar Open Source Tools

ai_summer
AI Summer is a repository focused on providing workshops and resources for developing foundational skills in generative AI models and transformer models. The repository offers practical applications for inferencing and training, with a specific emphasis on understanding and utilizing advanced AI chat models like BingGPT. Participants are encouraged to engage in interactive programming environments, decide on projects to work on, and actively participate in discussions and breakout rooms. The workshops cover topics such as generative AI models, retrieval-augmented generation, building AI solutions, and fine-tuning models. The goal is to equip individuals with the necessary skills to work with AI technologies effectively and securely, both locally and in the cloud.

intro-to-intelligent-apps
This repository introduces and helps organizations get started with building AI Apps and incorporating Large Language Models (LLMs) into them. The workshop covers topics such as prompt engineering, AI orchestration, and deploying AI apps. Participants will learn how to use Azure OpenAI, Langchain/ Semantic Kernel, Qdrant, and Azure AI Search to build intelligent applications.

kai
Kai is an AI-enabled tool that simplifies the process of modernizing application source code to a new platform. It uses Large Language Models (LLMs) guided by static code analysis, along with data from Konveyor. This data provides insights into how the organization solved similar problems in the past, helping streamline and automate the code modernization process. Kai assists developers by providing suggestions and solutions to common problems through Retrieval Augmented Generation (RAG), working with LLMs using Konveyor analysis reports about the codebase and generating solutions based on previously solved examples.

llmops-duke-aipi
LLMOps Duke AIPI is a course focused on operationalizing Large Language Models, teaching methodologies for developing applications using software development best practices with large language models. The course covers various topics such as generative AI concepts, setting up development environments, interacting with large language models, using local large language models, applied solutions with LLMs, extensibility using plugins and functions, retrieval augmented generation, introduction to Python web frameworks for APIs, DevOps principles, deploying machine learning APIs, LLM platforms, and final presentations. Students will learn to build, share, and present portfolios using Github, YouTube, and Linkedin, as well as develop non-linear life-long learning skills. Prerequisites include basic Linux and programming skills, with coursework available in Python or Rust. Additional resources and references are provided for further learning and exploration.

TI-Mindmap-GPT
TI MINDMAP GPT is an AI-powered tool designed to assist cyber threat intelligence teams in quickly synthesizing and visualizing key information from various Threat Intelligence sources. The tool utilizes Large Language Models (LLMs) to transform lengthy content into concise, actionable summaries, going beyond mere text reduction to provide insightful encapsulations of crucial points and themes. Users can leverage their own LLM keys for personalized and efficient information processing, streamlining data analysis and enabling teams to focus on strategic decision-making.

ai2apps
AI2Apps is a visual IDE for building LLM-based AI agent applications, enabling developers to efficiently create AI agents through drag-and-drop, with features like design-to-development for rapid prototyping, direct packaging of agents into apps, powerful debugging capabilities, enhanced user interaction, efficient team collaboration, flexible deployment, multilingual support, simplified product maintenance, and extensibility through plugins.

SuperKnowa
SuperKnowa is a fast framework to build Enterprise RAG (Retriever Augmented Generation) Pipelines at Scale, powered by watsonx. It accelerates Enterprise Generative AI applications to get prod-ready solutions quickly on private data. The framework provides pluggable components for tackling various Generative AI use cases using Large Language Models (LLMs), allowing users to assemble building blocks to address challenges in AI-driven text generation. SuperKnowa is battle-tested from 1M to 200M private knowledge base & scaled to billions of retriever tokens.

doc2plan
doc2plan is a browser-based application that helps users create personalized learning plans by extracting content from documents. It features a Creator for manual or AI-assisted plan construction and a Viewer for interactive plan navigation. Users can extract chapters, key topics, generate quizzes, and track progress. The application includes AI-driven content extraction, quiz generation, progress tracking, plan import/export, assistant management, customizable settings, viewer chat with text-to-speech and speech-to-text support, and integration with various Retrieval-Augmented Generation (RAG) models. It aims to simplify the creation of comprehensive learning modules tailored to individual needs.

EngAce
EngAce is a cutting-edge, generative AI-powered application revolutionizing Vietnamese English learning. It offers personalized learning experiences combining AI with comprehensive features. The repository contains source code, documentation, and resources for the app.

kitops
KitOps is a CNCF open standards project for packaging, versioning, and securely sharing AI/ML projects. It provides a unified solution for packaging, versioning, and managing assets in security-conscious enterprises, governments, and cloud operators. KitOps elevates AI artifacts to first-class, governed assets through ModelKits, which are tamper-proof, signable, and compatible with major container registries. The tool simplifies collaboration between data scientists, developers, and SREs, ensuring reliable and repeatable workflows for both development and operations. KitOps supports packaging for various types of models, including large language models, computer vision models, multi-modal models, predictive models, and audio models. It also facilitates compliance with the EU AI Act by offering tamper-proof, signable, and auditable ModelKits.

arch
Arch is an intelligent Layer 7 gateway designed to protect, observe, and personalize LLM applications with APIs. It handles tasks like detecting and rejecting jailbreak attempts, calling backend APIs, disaster recovery, and observability. Built on Envoy Proxy, it offers features like function calling, prompt guardrails, traffic management, and standards-based observability. Arch aims to improve the speed, security, and personalization of generative AI applications.

awesome-agent-failures
Awesome AI Agent Failures is a community-curated repository documenting known failure modes for AI agents, real-world case studies, and techniques to avoid failures. It provides insights into common failure modes such as tool hallucination, response hallucination, goal misinterpretation, plan generation failures, incorrect tool use, verification & termination failures, and prompt injection. The repository also includes resources like research papers, industry resources, books, external resources, and related awesome lists to help AI engineers build more reliable AI agents by learning from production failures.

stagehand
Stagehand is an AI web browsing framework that simplifies and extends web automation using three simple APIs: act, extract, and observe. It aims to provide a lightweight, configurable framework without complex abstractions, allowing users to automate web tasks reliably. The tool generates Playwright code based on atomic instructions provided by the user, enabling natural language-driven web automation. Stagehand is open source, maintained by the Browserbase team, and supports different models and model providers for flexibility in automation tasks.

advisingapp
**Advising App™** is a software solution created by Canyon GBS™ that includes a robust personal assistant designed to support student service professionals in their day-to-day roles. The assistant can help with research tasks, draft communication, language translation, content creation, student profile analysis, project planning, ideation, and much more. The software also includes a student service CRM designed to support the management of prospective and enrolled students. Key features of the CRM include record management, email and SMS, service management, caseload management, task management, interaction tracking, files and documents, and much more.

MonikA.I
MonikA.I. submod is a project that enhances Monika After Story mod with various AI features. It utilizes multiple AI models for text generation, text-to-speech, speech-to-text, emotion detection, and NLI classification. Users can interact with Monika through chatbots, voice commands, and game actions. The project is compatible with MAS v0.12.15 and supports Windows, Linux, and MacOS. It offers a user-friendly installation process and detailed usage instructions for different AI functionalities.

twinny
Twinny is a free and open-source AI code completion plugin for Visual Studio Code and compatible editors. It integrates with various tools and frameworks, including Ollama, llama.cpp, oobabooga/text-generation-webui, LM Studio, LiteLLM, and Open WebUI. Twinny offers features such as fill-in-the-middle code completion, chat with AI about your code, customizable API endpoints, and support for single or multiline fill-in-middle completions. It is easy to install via the Visual Studio Code extensions marketplace and provides a range of customization options. Twinny supports both online and offline operation and conforms to the OpenAI API standard.
For similar tasks

ai_summer
AI Summer is a repository focused on providing workshops and resources for developing foundational skills in generative AI models and transformer models. The repository offers practical applications for inferencing and training, with a specific emphasis on understanding and utilizing advanced AI chat models like BingGPT. Participants are encouraged to engage in interactive programming environments, decide on projects to work on, and actively participate in discussions and breakout rooms. The workshops cover topics such as generative AI models, retrieval-augmented generation, building AI solutions, and fine-tuning models. The goal is to equip individuals with the necessary skills to work with AI technologies effectively and securely, both locally and in the cloud.

CS7320-AI
CS7320-AI is a repository containing lecture materials, simple Python code examples, and assignments for the course CS 5/7320 Artificial Intelligence. The code examples cover various chapters of the textbook 'Artificial Intelligence: A Modern Approach' by Russell and Norvig. The repository focuses on basic AI concepts rather than advanced implementation techniques. It includes HOWTO guides for installing Python, working on assignments, and using AI with Python.

dynamiq
Dynamiq is an orchestration framework designed to streamline the development of AI-powered applications, specializing in orchestrating retrieval-augmented generation (RAG) and large language model (LLM) agents. It provides an all-in-one Gen AI framework for agentic AI and LLM applications, offering tools for multi-agent orchestration, document indexing, and retrieval flows. With Dynamiq, users can easily build and deploy AI solutions for various tasks.

craftium
Craftium is an open-source platform based on the Minetest voxel game engine and the Gymnasium and PettingZoo APIs, designed for creating fast, rich, and diverse single and multi-agent environments. It allows for connecting to Craftium's Python process, executing actions as keyboard and mouse controls, extending the Lua API for creating RL environments and tasks, and supporting client/server synchronization for slow agents. Craftium is fully extensible, extensively documented, modern RL API compatible, fully open source, and eliminates the need for Java. It offers a variety of environments for research and development in reinforcement learning.

turing
Viglet Turing ES is an open source solution with Semantic Navigation and Chat bot features. It indexes all content in Solr as a search engine.

LLM-Learn-PK
LLM-Learn-PK is a repository for testing various LLM and RAG tests. It serves as a learning platform where the creator experiments with different tests and learns in the process.

learning-ai
This repository is a collection of notes and code examples related to AI, covering topics such as Tokenization, Architectures, GGML, Llama.cpp, Position Embeddings, GPUs, Vector Databases, and Vision. It also includes in-progress work on Model Context Protocol (MCP) and Voice Activity Detection (VAD) for whisper.cpp. The repository offers exploration code for various AI-related concepts and tools like GGML, Llama.cpp, GPU technologies (CUDA, Kompute, Metal, OpenCL, ROCm, Vulkan), Word embeddings, Huggingface API, and Qdrant Vector Database in both Rust and Python.

miles-credit
CREDIT is an open software platform for training and deploying AI atmospheric prediction models. It offers fast models with flexible configuration options for input data and neural network architecture. The user-friendly interface enables quick setup and iteration. Developed by the MILES group and NSF National Center for Atmospheric Research, CREDIT combines advanced AI/ML with atmospheric science expertise. It provides a stable release with various models, training, and deployment options, with ongoing development. Detailed documentation is available for installation, training, deployment, config file interpretation, and API usage.
For similar jobs

NanoLLM
NanoLLM is a tool designed for optimized local inference for Large Language Models (LLMs) using HuggingFace-like APIs. It supports quantization, vision/language models, multimodal agents, speech, vector DB, and RAG. The tool aims to provide efficient and effective processing for LLMs on local devices, enhancing performance and usability for various AI applications.

mslearn-ai-fundamentals
This repository contains materials for the Microsoft Learn AI Fundamentals module. It covers the basics of artificial intelligence, machine learning, and data science. The content includes hands-on labs, interactive learning modules, and assessments to help learners understand key concepts and techniques in AI. Whether you are new to AI or looking to expand your knowledge, this module provides a comprehensive introduction to the fundamentals of AI.

awesome-ai-tools
Awesome AI Tools is a curated list of popular tools and resources for artificial intelligence enthusiasts. It includes a wide range of tools such as machine learning libraries, deep learning frameworks, data visualization tools, and natural language processing resources. Whether you are a beginner or an experienced AI practitioner, this repository aims to provide you with a comprehensive collection of tools to enhance your AI projects and research. Explore the list to discover new tools, stay updated with the latest advancements in AI technology, and find the right resources to support your AI endeavors.

go2coding.github.io
The go2coding.github.io repository is a collection of resources for AI enthusiasts, providing information on AI products, open-source projects, AI learning websites, and AI learning frameworks. It aims to help users stay updated on industry trends, learn from community projects, access learning resources, and understand and choose AI frameworks. The repository also includes instructions for local and external deployment of the project as a static website, with details on domain registration, hosting services, uploading static web pages, configuring domain resolution, and a visual guide to the AI tool navigation website. Additionally, it offers a platform for AI knowledge exchange through a QQ group and promotes AI tools through a WeChat public account.

AI-Notes
AI-Notes is a repository dedicated to practical applications of artificial intelligence and deep learning. It covers concepts such as data mining, machine learning, natural language processing, and AI. The repository contains Jupyter Notebook examples for hands-on learning and experimentation. It explores the development stages of AI, from narrow artificial intelligence to general artificial intelligence and superintelligence. The content delves into machine learning algorithms, deep learning techniques, and the impact of AI on various industries like autonomous driving and healthcare. The repository aims to provide a comprehensive understanding of AI technologies and their real-world applications.

promptpanel
Prompt Panel is a tool designed to accelerate the adoption of AI agents by providing a platform where users can run large language models across any inference provider, create custom agent plugins, and use their own data safely. The tool allows users to break free from walled-gardens and have full control over their models, conversations, and logic. With Prompt Panel, users can pair their data with any language model, online or offline, and customize the system to meet their unique business needs without any restrictions.

ai-demos
The 'ai-demos' repository is a collection of example code from presentations focusing on building with AI and LLMs. It serves as a resource for developers looking to explore practical applications of artificial intelligence in their projects. The code snippets showcase various techniques and approaches to leverage AI technologies effectively. The repository aims to inspire and educate developers on integrating AI solutions into their applications.

ai_summer
AI Summer is a repository focused on providing workshops and resources for developing foundational skills in generative AI models and transformer models. The repository offers practical applications for inferencing and training, with a specific emphasis on understanding and utilizing advanced AI chat models like BingGPT. Participants are encouraged to engage in interactive programming environments, decide on projects to work on, and actively participate in discussions and breakout rooms. The workshops cover topics such as generative AI models, retrieval-augmented generation, building AI solutions, and fine-tuning models. The goal is to equip individuals with the necessary skills to work with AI technologies effectively and securely, both locally and in the cloud.