![awesome-generative-ai-data-scientist](/statics/github-mark.png)
awesome-generative-ai-data-scientist
A curated list of 100+ resources for building and deploying generative AI, with a focus on helping you become a Generative AI Data Scientist working with LLMs.
Stars: 209
![screenshot](/screenshots_githubs/business-science-awesome-generative-ai-data-scientist.jpg)
README:
The Future is using AI and ML Together
A curated list of 100+ resources to help you become a Generative AI Data Scientist. This repository includes resources on building GenAI Data Science applications with Large Language Models (LLMs) and deploying LLMs and Generative AI/ML with Cloud-based solutions.
Please ⭐ us on GitHub (it takes 2 seconds and means a lot).
- Awesome Real World AI Use Cases
- Python Libraries
- Examples and Cookbooks
- Newsletters
- Courses and Training
- 🚀🚀 AI Data Science Team In Python: An AI-powered data science team of copilots that uses agents to help you perform common data science tasks 10X faster. Examples | Github
- 🚀 Awesome LLM Apps: LLM RAG AI Apps with Step-By-Step Tutorials
- AI Hedge Fund: Proof of concept for an AI-powered hedge fund
- AI Financial Agent: A financial agent for investment research
- Structured Report Generation (LangGraph): How to build an agent that can orchestrate the end-to-end process of report planning, web research, and writing. We show that this agent can produce reports in varying, easily configurable formats. Video | Blog | Code
- Uber QueryGPT: Uber's QueryGPT uses large language models (LLMs), vector databases, and similarity search to generate complex queries from natural-language questions provided by the user. The tool enhances the productivity of engineers, operations managers, and data scientists at Uber.
- Nir Diamant GenAI Agents: Tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems. GitHub
- AI Engineering Hub: Real-world AI agent applications, LLM and RAG tutorials, with examples to implement. GitHub
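Uber's QueryGPT pipeline above (retrieve the most relevant table schemas, then prompt an LLM to write SQL) can be sketched in a few lines of plain Python. The token-overlap similarity and toy schemas below are illustrative stand-ins for the real embeddings and catalog such a system would use:

```python
# Conceptual sketch of a QueryGPT-style pipeline: pick the table schemas
# most similar to the user's question, then assemble an LLM prompt.
# Token overlap (Jaccard) stands in for real vector embeddings here.
import re

def tokens(text: str) -> set[str]:
    """Lowercased word/identifier tokens, punctuation stripped."""
    return set(re.findall(r"[a-z_]+", text.lower()))

def jaccard(a: str, b: str) -> float:
    sa, sb = tokens(a), tokens(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Toy schema catalog (hypothetical tables, for illustration only).
SCHEMAS = {
    "trips": "trips(trip_id, driver_id, city, fare, started_at)",
    "drivers": "drivers(driver_id, name, rating, signup_date)",
    "payments": "payments(payment_id, trip_id, amount, method)",
}

def retrieve_schemas(question: str, k: int = 2) -> list[str]:
    """Return the k schemas whose text best matches the question."""
    ranked = sorted(SCHEMAS.values(), key=lambda s: jaccard(question, s), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(retrieve_schemas(question))
    return f"Schemas:\n{context}\n\nWrite SQL for: {question}"

prompt = build_prompt("total fare per city from trips")
```

In a production system, the Jaccard scorer would be replaced by embedding similarity and the prompt sent to an LLM for SQL generation.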
- 🚀🚀 AI Data Science Team In Python: AI Agents to help you perform common data science tasks 10X faster. Examples | Github
- 🚀 PandasAI: Open Source AI Agents for Data Analysis. Documentation | Github
- Qwen-Agent: A framework for developing LLM applications based on the instruction following, tool usage, planning, and memory capabilities of Qwen. It also comes with example applications such as Browser Assistant, Code Interpreter, and Custom Assistant. Documentation | Examples | Github
- LangChain: A framework for developing applications powered by large language models (LLMs). Documentation | Github Cookbook
- LangGraph: A library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Documentation | Tutorials
- LangSmith: LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can quickly and confidently ship. Documentation | Github
- LlamaIndex: LlamaIndex is a framework for building context-augmented generative AI applications with LLMs. Documentation | Github
- LlamaIndex Workflows: A mechanism for orchestrating actions in the increasingly complex AI applications we see our users building.
- CrewAI: Streamline workflows across industries with powerful AI agents. Documentation | Github
- AutoGen: Microsoft's programming framework for agentic AI.
- Pydantic AI: Python agent framework designed to make building production-grade applications with Generative AI less painful. Github
- ControlFlow: Prefect's Python framework for building agentic AI workflows. Documentation | Github
- FlatAI: Frameworkless LLM Agents.
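A recurring idea across these frameworks, most explicitly in LangGraph, is the stateful graph: nodes update a shared state, and a router decides which node runs next. A minimal pure-Python sketch of that pattern (not any framework's actual API):

```python
# Conceptual sketch of the stateful-graph pattern (NOT LangGraph's API):
# nodes are functions that update a shared state dict, and a router
# decides which node runs next until the graph reaches "end".

def research(state: dict) -> dict:
    state["notes"] = f"notes about {state['topic']}"
    return state

def write(state: dict) -> dict:
    state["draft"] = f"Report on {state['topic']}: {state['notes']}"
    return state

NODES = {"research": research, "write": write}

def router(state: dict) -> str:
    """Pick the next node based on what the state still lacks."""
    if "notes" not in state:
        return "research"
    if "draft" not in state:
        return "write"
    return "end"

def run_graph(state: dict) -> dict:
    while (node := router(state)) != "end":
        state = NODES[node](state)
    return state

result = run_graph({"topic": "LLM agents"})
```

Real frameworks add checkpointing, parallel branches, and human-in-the-loop interrupts on top of this core loop.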
- LangGraph Studio: IDE that enables visualization, interaction, and debugging of complex agentic applications
- Langflow: A low-code tool that makes building powerful AI agents and workflows that can use any API, model, or database easier. Documentation | Github
- Pyspur: Graph-Based Editor for LLM Workflows Documentation | Github
- LangWatch: Monitor, Evaluate & Optimize your LLM performance with 1-click. Drag and drop interface for LLMOps platform. Documentation | GitHub
- AutoGen Studio: AutoGen Studio is a low-code interface built to help you rapidly prototype AI agents, enhance them with tools, compose them into teams and interact with them to accomplish tasks. It is built on AutoGen AgentChat - a high-level API for building multi-agent applications.
- OpenAI: The official Python library for the OpenAI API
- Hugging Face Models: Open LLM models by Meta, Mistral, and hundreds of other providers
- Anthropic Claude: The official Python library for the Anthropic API
- Meta Llama Models: The open source AI model you can fine-tune, distill and deploy anywhere.
- Google Gemini: The official Python library for the Google Gemini API
- Ollama: Get up and running with large language models locally.
- Groq: The official Python library for the Groq API
- DeepSeek-R1: DeepSeek's first-generation reasoning model, competitive with OpenAI o1.
- Qwen: Alibaba's Qwen models
- Llama: Meta's foundational models
- LangChain: A framework for developing applications powered by large language models (LLMs). Documentation | Github Cookbook
- LangGraph: A library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Documentation | Tutorials
- LangSmith: LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can quickly and confidently ship. Documentation | Github
- Huggingface: An open-source platform for machine learning (ML) and artificial intelligence (AI) tools and models. Documentation
- Transformers: Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models.
- Tokenizers: Tokenizers provides an implementation of today’s most used tokenizers, with a focus on performance and versatility. Documentation | Github
- Sentence Transformers: Sentence Transformers (a.k.a. SBERT) is the go-to Python module for accessing, using, and training state-of-the-art text and image embedding models.
- smolagents: The simplest framework out there to build powerful agents Documentation | Github
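Underneath the Tokenizers and Transformers entries above sits byte-pair encoding (BPE), the algorithm behind most modern tokenizers. One training step, sketched in plain Python, repeatedly merges the most frequent adjacent symbol pair into a new token:

```python
# Minimal sketch of one byte-pair-encoding (BPE) training step:
# count adjacent symbol pairs across the corpus, then merge the most
# frequent pair everywhere it occurs.
from collections import Counter

def most_frequent_pair(words: list[list[str]]) -> tuple[str, str]:
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pairs[(a, b)] += 1
    return max(pairs, key=pairs.get)

def merge_pair(words: list[list[str]], pair: tuple[str, str]) -> list[list[str]]:
    """Replace every occurrence of the pair with a single merged token."""
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == pair:
                out.append(w[i] + w[i + 1])
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

corpus = [list("lower"), list("lowest"), list("low")]
pair = most_frequent_pair(corpus)   # ('l', 'o'), the most frequent adjacent pair
corpus = merge_pair(corpus, pair)
```

Production tokenizers repeat this merge step tens of thousands of times over byte-level corpora; libraries like Tokenizers implement it in Rust for speed.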
- ChromaDB: The fastest way to build Python or JavaScript LLM apps with memory!
- FAISS: A library for efficient similarity search and clustering of dense vectors.
- Qdrant: High-Performance Vector Search at Scale
- Pinecone: The official Pinecone Python SDK.
- Milvus: Milvus is an open-source vector database built to power embedding similarity search and AI applications.
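What all of the vector stores above accelerate is nearest-neighbor search over embeddings. A brute-force version in plain Python shows the core operation; engines like FAISS and Qdrant replace the linear scan with approximate indexes to handle millions of vectors:

```python
# What a vector database accelerates, in miniature: exact k-nearest-
# neighbor search by cosine similarity over a dict of embeddings.
# The 3-dimensional vectors here are toy stand-ins for real embeddings.
import math

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def knn(query: list[float], vectors: dict, k: int = 2) -> list[str]:
    """Return ids of the k stored vectors most similar to the query."""
    ranked = sorted(vectors, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:k]

DOCS = {
    "doc_a": [1.0, 0.0, 0.1],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.9],
}

hits = knn([1.0, 0.0, 0.0], DOCS)
```

The linear scan is O(n) per query; HNSW and IVF indexes in the databases above trade a little recall for sub-linear search.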
- PyTorch: PyTorch is an open-source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing.
- TensorFlow: TensorFlow is an open-source machine learning library developed by Google.
- JAX: Google’s library for high-performance computing and automatic differentiation.
- tinygrad: A minimalistic deep learning library with a focus on simplicity and educational use, created by George Hotz.
- micrograd: A simple, lightweight autograd engine for educational purposes, created by Andrej Karpathy.
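micrograd's core idea, shared by PyTorch's autograd, fits in a few lines: each value records how it was produced, and backward() walks the chain rule through those records. A minimal sketch supporting addition and multiplication:

```python
# A scalar autograd engine in the spirit of micrograd: each Value
# remembers its parents and the local derivative w.r.t. each parent,
# and backward() propagates gradients by the chain rule.

class Value:
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fns = grad_fns   # local derivative w.r.t. each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda: 1.0, lambda: 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda: other.data, lambda: self.data))

    def backward(self, seed=1.0):
        """Accumulate the gradient flowing along every path to this node."""
        self.grad += seed
        for parent, fn in zip(self._parents, self._grad_fns):
            parent.backward(seed * fn())

x, y = Value(3.0), Value(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
```

micrograd itself adds a topological sort so shared subgraphs are visited efficiently; the recursive path-sum above is the same math, just slower.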
- Transformers: Hugging Face Transformers is a popular library for Natural Language Processing (NLP) tasks, including fine-tuning large language models.
- Unsloth: Finetune Llama 3.2, Mistral, Phi-3.5 & Gemma 2-5x faster with 80% less memory!
- LitGPT: 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale.
- AutoTrain: No code fine-tuning of LLMs and other machine learning tasks.
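A quick back-of-the-envelope on why adapter methods such as LoRA, used by several of the fine-tuning tools above, cut memory so dramatically: instead of updating a full d_out x d_in weight matrix, you train two small matrices B (d_out x r) and A (r x d_in) with rank r much smaller than either dimension, applying W + B @ A at inference:

```python
# Parameter counts for full fine-tuning vs. a rank-r LoRA adapter on
# one weight matrix. The 4096x4096 shape is a typical transformer
# projection size, used here purely for illustration.

def full_params(d_out: int, d_in: int) -> int:
    """Trainable weights when updating the full matrix."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable weights for the low-rank factors B and A."""
    return d_out * r + r * d_in

d_out, d_in, r = 4096, 4096, 8
full = full_params(d_out, d_in)      # 16,777,216 trainable weights
lora = lora_params(d_out, d_in, r)   # 65,536 trainable weights
savings = full / lora                # 256x fewer trainable parameters
```

Optimizer state scales with trainable parameters, so the same ratio applies to the Adam moments that dominate fine-tuning memory.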
- LangSmith: LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can quickly and confidently ship. Documentation | Github
- LangWatch: Monitor, Evaluate & Optimize your LLM performance with 1-click. Drag and drop interface for LLMOps platform. Documentation | GitHub
- Opik: Opik is an open-source platform for evaluating, testing and monitoring LLM applications
- MLflow Tracing and Evaluation: MLflow has a suite of features for LLMs. MLflow LLM Documentation | Model Tracing | Model Evaluation | GitHub
- LangChain Document Loaders: LangChain has hundreds of integrations with various data sources to load data from: Slack, Notion, Google Drive, etc.
- Embedchain: Create an AI app on your own data in a minute. Documentation | Github Repo
- Docling by IBM: Parse documents and export them to the desired format with ease and speed. Github
- Markitdown by Microsoft: Python tool for converting files and office documents to Markdown.
- Gitingest: Turn any Git repository into a simple text ingest of its codebase. This is useful for feeding a codebase into any LLM. Github
- Crawl4AI: Open-source, blazing-fast, AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Documentation | Github
- GPT Crawler: Crawl a site to generate knowledge files to create your own custom GPT from a URL. Documentation | Github
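Loaders and crawlers like those above usually feed a chunking step before embedding: text is split into overlapping windows so no fact is cut cleanly at a chunk boundary. A minimal word-based chunker:

```python
# Sliding-window chunking for RAG pipelines: fixed-size word windows
# with overlap, so context that straddles a boundary appears in both
# neighboring chunks.

def chunk(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `size`, overlapping by `overlap`."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + size]
        chunks.append(" ".join(piece))
        if start + size >= len(words):
            break
    return chunks

# A synthetic 120-word document for demonstration.
doc = " ".join(f"word{i}" for i in range(120))
parts = chunk(doc, size=50, overlap=10)
```

Production chunkers split on sentence or markdown boundaries rather than raw word counts, but the window-plus-overlap idea is the same.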
- LangChain Agents: Build agents with LangChain.
- LangChain Tools: Integrate Tools (Function Calling) with LangChain.
- smolagents: The simplest framework out there to build powerful agents Documentation | Github
- Agentarium: An open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for designing complex, interactive environments where agents can act, learn, and evolve. GitHub
- AutoGen AgentChat: Build applications quickly with preset agents.
- Phidata: An open-source platform to build, ship and monitor agentic systems. Documentation | Github
- Composio: Integration Platform for AI Agents & LLMs (works with LangChain, CrewAI, etc). Documentation | Github
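Most of these agent toolkits wrap the same tool-calling ("function calling") loop: the model emits a tool name with JSON arguments, the application executes the tool, and the result is fed back to the model. A runnable sketch with a stubbed model, so no API key is needed:

```python
# The tool-calling loop behind agent frameworks, with the LLM replaced
# by a stub that always chooses the 'add' tool. Real systems parse the
# same tool-name-plus-JSON-arguments structure from model output.
import json

TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def fake_model(prompt: str) -> str:
    """Stand-in for an LLM deciding which tool to call and with what args."""
    return json.dumps({"tool": "add", "args": {"a": 2, "b": 40}})

def run_tool_call(prompt: str):
    """Parse the model's tool request, dispatch it, and return the result."""
    call = json.loads(fake_model(prompt))
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

answer = run_tool_call("What is 2 + 40?")
```

Frameworks add schema validation, retries on malformed JSON, and multi-turn loops where the tool result is appended to the conversation.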
- Mem0: Mem0 is a self-improving memory layer for LLM applications, enabling personalized AI experiences that save costs and delight users. Documentation | Github
- Memary: Open Source Memory Layer For Autonomous Agents
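A memory layer like Mem0 or Memary boils down to persisting facts across turns and recalling only the ones relevant to the new query. The keyword scoring below is an illustrative stand-in for the semantic retrieval real systems use:

```python
# An agent memory layer in miniature: store facts, then recall the
# top-k facts whose words overlap most with the incoming query.
# Keyword overlap stands in for embedding-based retrieval here.

class Memory:
    def __init__(self):
        self.facts: list[str] = []

    def add(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, k: int = 1) -> list[str]:
        """Return the k stored facts sharing the most words with the query."""
        qwords = set(query.lower().split())
        scored = sorted(
            self.facts,
            key=lambda f: len(qwords & set(f.lower().split())),
            reverse=True,
        )
        return scored[:k]

mem = Memory()
mem.add("the user prefers concise answers")
mem.add("the user works in python")
relevant = mem.recall("which language does the user like to code in python")
```

Recalled facts are typically prepended to the prompt, which is how "personalization" reaches a stateless LLM.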
- LangWatch: Monitor, Evaluate & Optimize your LLM performance with 1-click. Drag and drop interface for LLMOps platform. Documentation | GitHub
- MLflow: MLflow Tracing for LLM Observability Documentation
- Agenta: Open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place. Documentation
- LLMOps: Best practices designed to support your LLMOps initiatives
- Helicone: Open-source LLM observability platform for developers to monitor, debug, and improve production-ready applications. Documentation | Github
- E2B: E2B is an open-source runtime for executing AI-generated code in secure cloud sandboxes. Made for agentic & AI use cases. Documentation | Github
- AutoGen Docker Code Executor: Executes code through a command line environment in a Docker container
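The execute-and-capture loop these executors implement can be sketched with the standard library, minus the real isolation. To be clear: a plain subprocess is not a security boundary; sandboxes like E2B and the Docker executor exist precisely to provide one.

```python
# Run LLM-generated code in a subprocess with a timeout and capture its
# stdout. This shows the loop's shape only; real agent runtimes isolate
# execution in containers or cloud sandboxes.
import subprocess
import sys

def run_generated_code(code: str, timeout: float = 5.0) -> str:
    """Execute a Python snippet in a fresh interpreter, return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()

output = run_generated_code("print(sum(range(10)))")
```

The timeout guards against runaway loops; stderr and the exit code (in `result.stderr` and `result.returncode`) are what an agent would feed back to the model for self-correction.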
- Browser-Use: Make websites accessible for AI agents Documentation | GitHub
- AI Suite: Simple, unified interface to multiple Generative AI providers.
- AdalFlow: The library to build & auto-optimize LLM applications, from Chatbot, RAG, to Agent by SylphAI.
- DSPy: The framework for programming, rather than prompting, foundation models.
- AutoPrompt: A framework for prompt tuning using Intent-based Prompt Calibration.
- Promptify: A library for prompt engineering that simplifies NLP tasks (e.g., NER, classification) using LLMs like GPT.
- LiteLLM: Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format.
- Jupyter Agent: Let a LLM agent write and execute code inside a notebook
- Jupyter AI: A generative AI extension for JupyterLab Documentation
- AI Agent Service Toolkit: Full toolkit for running an AI agent service built with LangGraph, FastAPI and Streamlit App | GitHub
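AI Suite and LiteLLM above share one design: a single call signature in front of many providers, routed by a model-string prefix. A sketch of that pattern with stub providers (not either library's actual API):

```python
# The unified-gateway pattern behind LiteLLM and AI Suite: one
# completion() signature, dispatched to per-provider backends by the
# "provider/model-name" prefix. Backends here are stubs; the real
# libraries call the vendor SDKs.

def _openai_stub(prompt: str) -> str:
    return f"[openai] {prompt}"

def _anthropic_stub(prompt: str) -> str:
    return f"[anthropic] {prompt}"

PROVIDERS = {
    "openai": _openai_stub,
    "anthropic": _anthropic_stub,
}

def completion(model: str, prompt: str) -> str:
    """Route 'provider/model-name' strings to the matching backend."""
    provider, _, _model_name = model.partition("/")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    return PROVIDERS[provider](prompt)

reply = completion("anthropic/claude-sonnet", "hello")
```

Keeping the caller on one signature is what makes swapping providers (or adding fallbacks and cost routing) a one-line change.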
- AWS Bedrock: Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon
- Microsoft Azure AI Services: Azure AI services help developers and organizations rapidly create intelligent, cutting-edge, market-ready, and responsible applications with out-of-the-box, prebuilt, and customizable APIs and models.
- Google Vertex AI: Vertex AI is a fully-managed, unified AI development platform for building and using generative AI.
- NVIDIA NIM: NVIDIA NIM™, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations.
- LangChain Cookbook: Example code for building applications with LangChain, with an emphasis on more applied and end-to-end examples.
- LangGraph Examples: Example code for building applications with LangGraph
- Llama Index Examples: Example code for building applications with Llama Index
- Streamlit LLM Examples: Streamlit LLM app examples for getting started
- Azure Generative AI Examples: Prompt Flow and RAG Examples for use with the Microsoft Azure Cloud platform
- Amazon Bedrock Workshop: Introduces how to leverage foundation models (FMs) through Amazon Bedrock
- Microsoft Generative AI for Beginners: 21 lessons teaching everything you need to know to start building Generative AI applications. Github
- Microsoft Intro to Generative AI Course
- Google Vertex AI Examples: Notebooks, code samples, sample apps, and other resources that demonstrate how to use, develop and manage machine learning and generative AI workflows using Google Cloud Vertex AI
- Google Generative AI Examples: Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
- NVIDIA NIM Anywhere: An entry point for developing with NIMs that natively scales out to full-sized labs and up to production environments.
- NVIDIA NIM Deploy: Reference implementations, example documents, and architecture guides that can be used as a starting point to deploy multiple NIMs and other NVIDIA microservices into Kubernetes and other production deployment environments.
- Python AI/ML Tips: Free newsletter on Generative AI and Data Science.
- unwind ai: Latest AI news, tools, and tutorials for AI Developers
- Generative AI Data Scientist Workshops: Get free training on how to build and deploy Generative AI/ML solutions. Register for the next free workshop here.
- 8-Week AI Bootcamp To Become A Generative AI-Data Scientist: Focused on helping you become a Generative AI Data Scientist. Learn How To Build and Deploy AI-Powered Data Science Solutions using LangChain, LangGraph, Pandas, Scikit Learn, Streamlit, AWS, Bedrock, and EC2.
Alternative AI tools for awesome-generative-ai-data-scientist
Similar Open Source Tools
![openvino_build_deploy Screenshot](/screenshots_githubs/openvinotoolkit-openvino_build_deploy.jpg)
openvino_build_deploy
The OpenVINO Build and Deploy repository provides pre-built components and code samples to accelerate the development and deployment of production-grade AI applications across various industries. With the OpenVINO Toolkit from Intel, users can enhance the capabilities of both Intel and non-Intel hardware to meet specific needs. The repository includes AI reference kits, interactive demos, workshops, and step-by-step instructions for building AI applications. Additional resources such as Jupyter notebooks and a Medium blog are also available. The repository is maintained by the AI Evangelist team at Intel, who provide guidance on real-world use cases for the OpenVINO toolkit.
![document-ai-samples Screenshot](/screenshots_githubs/GoogleCloudPlatform-document-ai-samples.jpg)
document-ai-samples
The Google Cloud Document AI Samples repository contains code samples and Community Samples demonstrating how to analyze, classify, and search documents using Google Cloud Document AI. It includes various projects showcasing different functionalities such as integrating with Google Drive, processing documents using Python, content moderation with Dialogflow CX, fraud detection, language extraction, paper summarization, tax processing pipeline, and more. The repository also provides access to test document files stored in a publicly-accessible Google Cloud Storage Bucket. Additionally, there are codelabs available for optical character recognition (OCR), form parsing, specialized processors, and managing Document AI processors. Community samples, like the PDF Annotator Sample, are also included. Contributions are welcome, and users can seek help or report issues through the repository's issues page. Please note that this repository is not an officially supported Google product and is intended for demonstrative purposes only.
![edgeai Screenshot](/screenshots_githubs/TexasInstruments-edgeai.jpg)
edgeai
Embedded inference of Deep Learning models is quite challenging due to high compute requirements. TI’s Edge AI software product helps optimize and accelerate inference on TI’s embedded devices. It supports heterogeneous execution of DNNs across cortex-A based MPUs, TI’s latest generation C7x DSP, and DNN accelerator (MMA). The solution simplifies the product life cycle of DNN development and deployment by providing a rich set of tools and optimized libraries.
![applied-ai-engineering-samples Screenshot](/screenshots_githubs/GoogleCloudPlatform-applied-ai-engineering-samples.jpg)
applied-ai-engineering-samples
The Google Cloud Applied AI Engineering repository provides reference guides, blueprints, code samples, and hands-on labs developed by the Google Cloud Applied AI Engineering team. It contains resources for Generative AI on Vertex AI, including code samples and hands-on labs demonstrating the use of Generative AI models and tools in Vertex AI. Additionally, it offers reference guides and blueprints that compile best practices and prescriptive guidance for running large-scale AI/ML workloads on Google Cloud AI/ML infrastructure.
![bedrock-engineer Screenshot](/screenshots_githubs/daisuke-awaji-bedrock-engineer.jpg)
bedrock-engineer
Bedrock Engineer is an AI assistant for software development tasks powered by Amazon Bedrock. It combines large language models with file system operations and web search functionality to support development processes. The autonomous AI agent provides interactive chat, file system operations, web search, project structure management, code analysis, code generation, data analysis, agent and tool customization, chat history management, and multi-language support. Users can select agents, customize them, select tools, and customize tools. The tool also includes a website generator for React.js, Vue.js, Svelte.js, and Vanilla.js, with support for inline styling, Tailwind.css, and Material UI. Users can connect to design system data sources and generate AWS Step Functions ASL definitions.
![obsidian-systemsculpt-ai Screenshot](/screenshots_githubs/SystemSculpt-obsidian-systemsculpt-ai.jpg)
obsidian-systemsculpt-ai
SystemSculpt AI is a comprehensive AI-powered plugin for Obsidian, integrating advanced AI capabilities into note-taking, task management, knowledge organization, and content creation. It offers modules for brain integration, chat conversations, audio recording and transcription, note templates, and task generation and management. Users can customize settings, utilize AI services like OpenAI and Groq, and access documentation for detailed guidance. The plugin prioritizes data privacy by storing sensitive information locally and offering the option to use local AI models for enhanced privacy.
![fiction Screenshot](/screenshots_githubs/fictionco-fiction.jpg)
fiction
Fiction is a next-generation CMS and application framework designed to streamline the creation of AI-generated content. The first-of-its-kind platform empowers developers and content creators by integrating cutting-edge AI technologies with a robust content management system.
![ai-dial Screenshot](/screenshots_githubs/epam-ai-dial.jpg)
ai-dial
AI DIAL is an open-source project that provides a platform for developing and deploying conversational AI applications. It includes components such as DIAL Core for API exposure, DIAL SDK for development, and DIAL Chat for default UI. The project offers tutorials for launching AI DIAL Chat with different models and applications, along with a user manual and configuration guide. Additionally, there are various open-source repositories related to DIAL, including DIAL Helm for helm chart, DIAL Assistant for model agnostic assistant implementation, and DIAL Analytics Realtime for usage analytics. The project aims to simplify the development and deployment of AI-powered chat applications.
![aws-genai-llm-chatbot Screenshot](/screenshots_githubs/aws-samples-aws-genai-llm-chatbot.jpg)
aws-genai-llm-chatbot
This repository provides code to deploy a chatbot powered by Multi-Model and Multi-RAG using AWS CDK on AWS. Users can experiment with various Large Language Models and Multimodal Language Models from different providers. The solution supports Amazon Bedrock, Amazon SageMaker self-hosted models, and third-party providers via API. It also offers additional resources like AWS Generative AI CDK Constructs and Project Lakechain for building generative AI solutions and document processing. The roadmap and authors are listed, along with contributors. The library is licensed under the MIT-0 License with information on changelog, code of conduct, and contributing guidelines. A legal disclaimer advises users to conduct their own assessment before using the content for production purposes.
![Building-AI-Applications-with-ChatGPT-APIs Screenshot](/screenshots_githubs/PacktPublishing-Building-AI-Applications-with-ChatGPT-APIs.jpg)
Building-AI-Applications-with-ChatGPT-APIs
This repository is for the book 'Building AI Applications with ChatGPT APIs' published by Packt. It provides code examples and instructions for mastering ChatGPT, Whisper, and DALL-E APIs through building innovative AI projects. Readers will learn to develop AI applications using ChatGPT APIs, integrate them with frameworks like Flask and Django, create AI-generated art with DALL-E APIs, and optimize ChatGPT models through fine-tuning.
![AI-Playground Screenshot](/screenshots_githubs/intel-AI-Playground.jpg)
AI-Playground
AI Playground is an open-source project and AI PC starter app designed for AI image creation, image stylizing, and chatbot functionalities on a PC powered by an Intel Arc GPU. It leverages libraries from GitHub and Huggingface, providing users with the ability to create AI-generated content and interact with chatbots. The tool requires specific hardware specifications and offers packaged installers for ease of setup. Users can also develop the project environment, link it to the development environment, and utilize alternative models for different AI tasks.
![devb.io Screenshot](/screenshots_githubs/sunithvs-devb.io.jpg)
devb.io
devb.io is an innovative platform that automatically generates professional developer portfolios directly from GitHub profiles, leveraging AI to enhance and update professional representations. It offers one-click GitHub profile connection, automatic portfolio generation, AI-powered bio generation, dynamic activity tracking, and zero manual maintenance. The tech stack includes HTML, CSS for frontend, Fast API for backend, Redis for database, Groq for AI services, and Python for scripting.
![swirl-search Screenshot](/screenshots_githubs/swirlai-swirl-search.jpg)
swirl-search
Swirl is an open-source software that allows users to simultaneously search multiple content sources and receive AI-ranked results. It connects to various data sources, including databases, public data services, and enterprise sources, and utilizes AI and LLMs to generate insights and answers based on the user's data. Swirl is easy to use, requiring only the download of a YML file, starting in Docker, and searching with Swirl. Users can add credentials to preloaded SearchProviders to access more sources. Swirl also offers integration with ChatGPT as a configured AI model. It adapts and distributes user queries to anything with a search API, re-ranking the unified results using Large Language Models without extracting or indexing anything. Swirl includes five Google Programmable Search Engines (PSEs) to get users up and running quickly. Key features of Swirl include Microsoft 365 integration, SearchProvider configurations, query adaptation, synchronous or asynchronous search federation, optional subscribe feature, pipelining of Processor stages, results stored in SQLite3 or PostgreSQL, built-in Query Transformation support, matching on word stems and handling of stopwords, duplicate detection, re-ranking of unified results using Cosine Vector Similarity, result mixers, page through all results requested, sample data sets, optional spell correction, optional search/result expiration service, easily extensible Connector and Mixer objects, and a welcoming community for collaboration and support.
![awesome-gpt-security Screenshot](/screenshots_githubs/cckuailong-awesome-gpt-security.jpg)
awesome-gpt-security
Awesome GPT + Security is a curated list of awesome security tools, experimental case or other interesting things with LLM or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.
![AISuperDomain Screenshot](/screenshots_githubs/win4r-AISuperDomain.jpg)
AISuperDomain
Aila Desktop Application is a powerful tool that integrates multiple leading AI models into a single desktop application. It allows users to interact with various AI models simultaneously, providing diverse responses and insights to their inquiries. With its user-friendly interface and customizable features, Aila empowers users to engage with AI seamlessly and efficiently. Whether you're a researcher, student, or professional, Aila can enhance your AI interactions and streamline your workflow.
For similar tasks
![python-tutorial-notebooks Screenshot](/screenshots_githubs/dcavar-python-tutorial-notebooks.jpg)
python-tutorial-notebooks
This repository contains Jupyter-based tutorials for NLP, ML, AI in Python for classes in Computational Linguistics, Natural Language Processing (NLP), Machine Learning (ML), and Artificial Intelligence (AI) at Indiana University.
![open-parse Screenshot](/screenshots_githubs/Filimoa-open-parse.jpg)
open-parse
Open Parse is a Python library for visually discerning document layouts and chunking them effectively. It is designed to fill the gap in open-source libraries for handling complex documents. Unlike text splitting, which converts a file to raw text and slices it up, Open Parse visually analyzes documents for superior LLM input. It also supports basic markdown for parsing headings, bold, and italics, and has high-precision table support, extracting tables into clean Markdown formats with accuracy that surpasses traditional tools. Open Parse is extensible, allowing users to easily implement their own post-processing steps. It is also intuitive, with great editor support and completion everywhere, making it easy to use and learn.
![MoonshotAI-Cookbook Screenshot](/screenshots_githubs/MoonshotAI-MoonshotAI-Cookbook.jpg)
MoonshotAI-Cookbook
The MoonshotAI-Cookbook provides example code and guides for accomplishing common tasks with the MoonshotAI API. To run these examples, you'll need an MoonshotAI account and associated API key. Most code examples are written in Python, though the concepts can be applied in any language.
![AHU-AI-Repository Screenshot](/screenshots_githubs/DylanAo-AHU-AI-Repository.jpg)
AHU-AI-Repository
This repository is dedicated to the learning and exchange of resources for the School of Artificial Intelligence at Anhui University. Notes will be published on this website first: https://www.aoaoaoao.cn and will be synchronized to the repository regularly. You can also contact me at [email protected].
![modern_ai_for_beginners Screenshot](/screenshots_githubs/chunhuizhang-modern_ai_for_beginners.jpg)
modern_ai_for_beginners
This repository provides a comprehensive guide to modern AI for beginners, covering both theoretical foundations and practical implementation. It emphasizes the importance of understanding both the mathematical principles and the code implementation of AI models. The repository includes resources on PyTorch, deep learning fundamentals, mathematical foundations, transformer-based LLMs, diffusion models, software engineering, and full-stack development. It also features tutorials on natural language processing with transformers, reinforcement learning, and practical deep learning for coders.
![Building-AI-Applications-with-ChatGPT-APIs Screenshot](/screenshots_githubs/PacktPublishing-Building-AI-Applications-with-ChatGPT-APIs.jpg)
Building-AI-Applications-with-ChatGPT-APIs
This repository is for the book 'Building AI Applications with ChatGPT APIs' published by Packt. It provides code examples and instructions for mastering ChatGPT, Whisper, and DALL-E APIs through building innovative AI projects. Readers will learn to develop AI applications using ChatGPT APIs, integrate them with frameworks like Flask and Django, create AI-generated art with DALL-E APIs, and optimize ChatGPT models through fine-tuning.
![examples Screenshot](/screenshots_githubs/pinecone-io-examples.jpg)
examples
This repository contains a collection of sample applications and Jupyter Notebooks for hands-on experience with Pinecone vector databases and common AI patterns, tools, and algorithms. It includes production-ready examples for review and support, as well as learning-optimized examples for exploring AI techniques and building applications. Users can contribute, provide feedback, and collaborate to improve the resource.
![lingoose Screenshot](/screenshots_githubs/henomis-lingoose.jpg)
lingoose
LinGoose is a modular Go framework designed for building AI/LLM applications. It offers the flexibility to import only the necessary modules, abstracts features for customization, and provides a comprehensive solution for developing AI/LLM applications from scratch. The framework simplifies the process of creating intelligent applications by allowing users to choose preferred implementations or create their own. LinGoose empowers developers to leverage its capabilities to streamline the development of cutting-edge AI and LLM projects.
For similar jobs
![promptflow Screenshot](/screenshots_githubs/microsoft-promptflow.jpg)
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
![deepeval Screenshot](/screenshots_githubs/confident-ai-deepeval.jpg)
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
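The unit-testing idea can be illustrated with a toy metric. The following is a conceptual sketch using simple token overlap — not DeepEval's actual API, which provides richer, model-based metrics such as G-Eval:

```python
import string

def tokenize(text: str) -> set:
    """Lowercase, strip punctuation, and split into a set of words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())

def relevancy_score(answer: str, question: str) -> float:
    """Toy relevancy metric: fraction of question words present in the answer."""
    q = tokenize(question)
    if not q:
        return 0.0
    return len(q & tokenize(answer)) / len(q)

def assert_relevant(answer: str, question: str, threshold: float = 0.5) -> None:
    """Unit-test style check: fail if the answer scores below the threshold."""
    score = relevancy_score(answer, question)
    assert score >= threshold, f"relevancy {score:.2f} below threshold {threshold}"

assert_relevant("Paris is the capital of France.",
                "What is the capital of France?", threshold=0.5)
```

Frameworks like DeepEval follow this assert-on-a-score shape, but compute the score with LLM-based or embedding-based metrics instead of word overlap, which is why they plug naturally into CI/CD pipelines.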
![MegaDetector Screenshot](/screenshots_githubs/agentmorris-MegaDetector.jpg)
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). The model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".
![leapfrogai Screenshot](/screenshots_githubs/defenseunicorns-leapfrogai.jpg)
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped, resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, an API, and a UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base Python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobufs and Python utilities for implementing backends with gRPC. LeapfrogAI offers UI options for common use cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, the United States Navy, the United States Air Force, and the United States Space Force.
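Because the API mirrors OpenAI's, existing clients generally only need their base URL repointed at the self-hosted deployment. A minimal sketch of the request shape follows; the endpoint URL and model name here are hypothetical placeholders, not documented LeapfrogAI values:

```python
import json

# Hypothetical self-hosted endpoint; a real deployment supplies its own URL.
BASE_URL = "http://localhost:8080/openai/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completions request against a custom base URL."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        },
    }

req = build_chat_request("llama-cpp-python", "Summarize this document.")
print(json.dumps(req["body"], indent=2))
```

Since the payload shape is the same one OpenAI clients already emit, tools built for OpenAI/ChatGPT can target an OpenAI-compatible backend like this by changing only the base URL.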
![llava-docker Screenshot](/screenshots_githubs/ashleykleynhans-llava-docker.jpg)
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
![carrot Screenshot](/screenshots_githubs/xx025-carrot.jpg)
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
![TrustLLM Screenshot](/screenshots_githubs/HowieHwong-TrustLLM.jpg)
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions. Specifically, it first proposes a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, it establishes a benchmark across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. It then presents a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The documentation explains how to use the trustllm Python package to assess the trustworthiness of your LLM more quickly. For more details about TrustLLM, please refer to the project website.
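Evaluating a model across several trustworthiness dimensions ultimately reduces to aggregating per-dimension scores into a comparable summary. A toy illustration of that aggregation step — not the trustllm package's actual API:

```python
# The six benchmark dimensions named in the TrustLLM study.
DIMENSIONS = ["truthfulness", "safety", "fairness",
              "robustness", "privacy", "machine_ethics"]

def aggregate(scores: dict) -> float:
    """Unweighted mean across all six dimensions; missing ones count as 0."""
    return sum(scores.get(d, 0.0) for d in DIMENSIONS) / len(DIMENSIONS)

# Hypothetical per-dimension scores for one model, each in [0, 1].
model_scores = {"truthfulness": 0.8, "safety": 0.9, "fairness": 0.7,
                "robustness": 0.6, "privacy": 0.85, "machine_ethics": 0.75}
print(round(aggregate(model_scores), 3))  # → 0.767
```

Real benchmarks typically report per-dimension results rather than a single mean, since a high average can mask a weak dimension.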
![AI-YinMei Screenshot](/screenshots_githubs/worm128-AI-YinMei.jpg)
AI-YinMei
AI-YinMei is a development tool for AI virtual anchors (VTubers), targeting NVIDIA GPUs. It supports knowledge-base chat through a complete LLM stack ([fastgpt] + [one-api] + [Xinference]), can reply to Bilibili live-stream chat messages and greet viewers entering the stream, and offers speech synthesis via Microsoft edge-tts, Bert-VITS2, and GPT-SoVITS. Facial expressions can be driven through VTube Studio, and images generated with stable-diffusion-webui can be output to an OBS live room, with NSFW detection applied to generated images. Web and image search are available via DuckDuckGo (requires a proxy in some regions) or Baidu image search (no proxy needed). Additional features include an AI reply chat-box HTML plug-in, AI singing (Auto-Convert-Music), a playlist HTML plug-in, dancing, expression video playback, head-patting and gift-reaction actions, automatic dancing when a song starts, automatic idle motion during chat and song loops, multi-scene switching with background music changes and automatic day/night scene transitions, and the ability to let the AI decide on its own when to sing or paint.