ai-reference-models
Intel® AI Reference Models: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Intel® Data Center GPUs
Stars: 676
The Intel® AI Reference Models repository contains links to pre-trained models, sample scripts, best practices, and tutorials for popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs. The purpose is to quickly replicate complete software environments showcasing the AI capabilities of Intel platforms. It includes optimizations for popular deep learning frameworks like TensorFlow and PyTorch, with additional plugins/extensions for improved performance. The repository is licensed under Apache License Version 2.0.
README:
This repository contains links to pre-trained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs.
Containers for running the workloads can be found at Intel® AI Containers.
Intel® AI Reference Models in a Jupyter Notebook is also available for the listed workloads.
Intel optimizes popular deep learning frameworks such as TensorFlow* and PyTorch* by contributing to the upstream projects. Additional optimizations are built into plugins/extensions such as the Intel Extension for PyTorch* and the Intel Extension for TensorFlow*. These optimizations are driven by target workloads: popular neural network models running against common datasets.
The purpose of the Intel® AI Reference Models repository (and associated containers) is to quickly replicate the complete software environment that demonstrates the best-known performance of each of these target model/dataset combinations. When executed in optimally configured hardware environments, these software environments showcase the AI capabilities of Intel platforms.
DISCLAIMER: These scripts are not intended for benchmarking Intel platforms. For any performance and/or benchmarking information on specific Intel platforms, visit https://www.intel.ai/blog.
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
The Intel® AI Reference Models repository is licensed under the Apache License, Version 2.0.
To the extent that any public datasets are referenced by Intel or accessed using tools or code on this site those datasets are provided by the third party indicated as the data source. Intel does not create the data, or datasets, and does not warrant their accuracy or quality. By accessing the public dataset(s) you agree to the terms associated with those datasets and that your use complies with the applicable license.
Please check the list of datasets used in Intel® AI Reference Models in the datasets directory.
Intel expressly disclaims the accuracy, adequacy, or completeness of any public datasets, and is not liable for any errors, omissions, or defects in the data, or for any reliance on the data. Intel is not liable for any liability or damages relating to your use of public datasets.
The model documentation in the tables below has information on the prerequisites for running each model. The model scripts run on Linux. Certain models can also run on bare metal on Windows. For more information and a list of models that are supported on Windows, see the documentation here.
Instructions are available for running on Sapphire Rapids.
For best performance on Intel® Data Center GPU Flex and Max Series, please check the list of supported workloads, which provides instructions for running inference and training using Intel® Extension for PyTorch* or Intel® Extension for TensorFlow*.
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
ResNet 50v1.5 Sapphire Rapids | TensorFlow | Inference | Int8 FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50v1.5 Sapphire Rapids | TensorFlow | Training | FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50 | PyTorch | Inference | Int8 FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50 | PyTorch | Training | FP32 BFloat16 BFloat32 | ImageNet 2012 |
Vision Transformer | PyTorch | Inference | FP32 BFloat16 BFloat32 FP16 INT8 | ImageNet 2012 |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
3D U-Net | TensorFlow | Inference | FP32 BFloat16 Int8 | BRATS 2018 |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
BERT | TensorFlow | Inference | FP32 | MRPC |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Mask R-CNN | PyTorch | Inference | FP32 BFloat16 BFloat32 | COCO 2017 |
Mask R-CNN | PyTorch | Training | FP32 BFloat16 BFloat32 | COCO 2017 |
SSD-ResNet34 | PyTorch | Inference | FP32 Int8 BFloat16 BFloat32 | COCO 2017 |
SSD-ResNet34 | PyTorch | Training | FP32 BFloat16 BFloat32 | COCO 2017 |
Yolo V7 | PyTorch | Inference | Int8 FP32 FP16 BFloat16 BFloat32 | COCO 2017 |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Wide & Deep | TensorFlow | Inference | FP32 | Census Income dataset |
DLRM | PyTorch | Inference | FP32 Int8 BFloat16 BFloat32 | Criteo Terabyte |
DLRM | PyTorch | Training | FP32 BFloat16 BFloat32 | Criteo Terabyte |
DLRM v2 | PyTorch | Inference | FP32 FP16 BFloat16 BFloat32 Int8 | Criteo 1TB Click Logs dataset |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Stable Diffusion | TensorFlow | Inference | FP32 BFloat16 FP16 | COCO 2017 validation dataset |
Stable Diffusion | PyTorch | Inference | FP32 BFloat16 FP16 BFloat32 Int8-FP32 Int8-BFloat16 | COCO 2017 validation dataset |
Stable Diffusion | PyTorch | Training | FP32 BFloat16 FP16 BFloat32 | cat images |
Latent Consistency Models (LCM) | PyTorch | Inference | FP32 BFloat16 FP16 BFloat32 Int8-FP32 Int8-BFloat16 | COCO 2017 validation dataset |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
GraphSAGE | TensorFlow | Inference | FP32 BFloat16 FP16 Int8 BFloat32 | Protein Protein Interaction |
*Indicates that the model is an MLPerf model and will be supported long-term.
Model | Framework | Mode | GPU Type | Model Documentation |
---|---|---|---|---|
ResNet 50 v1.5 | TensorFlow | Inference | Flex Series | Float32 TF32 Float16 BFloat16 Int8 |
ResNet 50 v1.5 | TensorFlow | Training | Max Series | BFloat16 FP32 |
ResNet 50 v1.5 | PyTorch | Inference | Flex Series, Max Series, Arc Series | Int8 FP32 FP16 TF32 |
ResNet 50 v1.5 | PyTorch | Training | Max Series, Arc Series | BFloat16 TF32 FP32 |
DistilBERT | PyTorch | Inference | Flex Series, Max Series | FP32 FP16 BF16 TF32 |
DLRM v1 | PyTorch | Inference | Flex Series | FP16 FP32 |
SSD-MobileNet* | PyTorch | Inference | Arc Series | INT8 FP16 FP32 |
EfficientNet | PyTorch | Inference | Flex Series | FP16 BF16 FP32 |
EfficientNet | TensorFlow | Inference | Flex Series | FP16 |
FBNet | PyTorch | Inference | Flex Series | FP16 BF16 FP32 |
Wide Deep Large Dataset | TensorFlow | Inference | Flex Series | FP16 |
YOLO V5 | PyTorch | Inference | Flex Series | FP16 |
BERT large | PyTorch | Inference | Max Series, Arc Series | BFloat16 FP32 FP16 |
BERT large | PyTorch | Training | Max Series, Arc Series | BFloat16 FP32 TF32 |
BERT large | TensorFlow | Training | Max Series | BFloat16 TF32 FP32 |
DLRM v2 | PyTorch | Inference | Max Series | FP32 BF16 |
DLRM v2 | PyTorch | Training | Max Series | FP32 TF32 BF16 |
3D-Unet | PyTorch | Inference | Max Series | FP16 INT8 FP32 |
3D-Unet | TensorFlow | Training | Max Series | BFloat16 FP32 |
Stable Diffusion | PyTorch | Inference | Flex Series, Max Series, Arc Series | FP16 FP32 |
Stable Diffusion | TensorFlow | Inference | Flex Series | FP16 FP32 |
Mask R-CNN | TensorFlow | Inference | Flex Series | FP32 Float16 |
Mask R-CNN | TensorFlow | Training | Max Series | FP32 BFloat16 |
Swin Transformer | PyTorch | Inference | Flex Series | FP16 |
FastPitch | PyTorch | Inference | Flex Series | FP16 |
UNet++ | PyTorch | Inference | Flex Series | FP16 |
RNN-T | PyTorch | Inference | Max Series | FP16 BF16 FP32 |
RNN-T | PyTorch | Training | Max Series | FP32 BF16 TF32 |
IFRNet | PyTorch | Inference | Flex Series | FP16 |
RIFE | PyTorch | Inference | Flex Series | FP16 |
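The GPU table above spells the same precision in several ways (for example `Float16` and `FP16`, or `BFloat16` and `BF16`), which makes it awkward to scan for a given combination. As a rough illustration only, the sketch below parses a few rows copied from that table, normalizes the precision labels, and filters by mode, GPU type, and precision. The names `ALIASES`, `parse_rows`, and `find` are hypothetical helpers written for this example; they are not part of the repository.

```python
# Hypothetical helpers for illustration; not part of the ai-reference-models repository.
# Maps the spelling variants used in the table to one label per precision.
ALIASES = {
    "float32": "FP32", "fp32": "FP32",
    "float16": "FP16", "fp16": "FP16",
    "bfloat16": "BF16", "bf16": "BF16",
    "int8": "INT8", "tf32": "TF32",
}

# A few rows copied verbatim from the GPU table above
# (Model | Framework | Mode | GPU Type | Model Documentation).
GPU_TABLE = """\
ResNet 50 v1.5 | PyTorch | Inference | Flex Series, Max Series, Arc Series | Int8 FP32 FP16 TF32 |
DistilBERT | PyTorch | Inference | Flex Series, Max Series | FP32 FP16 BF16 TF32 |
Mask R-CNN | TensorFlow | Inference | Flex Series | FP32 Float16 |
BERT large | PyTorch | Training | Max Series, Arc Series | BFloat16 FP32 TF32 |
"""


def parse_rows(text):
    """Turn pipe-delimited table rows into dictionaries with normalized fields."""
    rows = []
    for line in text.strip().splitlines():
        # Drop optional leading/trailing pipes, then split into the five cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        model, framework, mode, gpus, precisions = cells
        rows.append({
            "model": model,
            "framework": framework,
            "mode": mode,
            "gpus": {g.strip() for g in gpus.split(",")},
            "precisions": {ALIASES[p.lower()] for p in precisions.split()},
        })
    return rows


def find(rows, *, mode, gpu, precision):
    """Return model names that list the given mode, GPU type, and precision."""
    return [
        r["model"] for r in rows
        if r["mode"] == mode and gpu in r["gpus"] and precision in r["precisions"]
    ]


rows = parse_rows(GPU_TABLE)
print(find(rows, mode="Inference", gpu="Flex Series", precision="FP16"))
# → ['ResNet 50 v1.5', 'DistilBERT', 'Mask R-CNN']
```

A match here only means the table lists documentation for that combination; the per-model documentation remains the place to check prerequisites.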
If you would like to add a new benchmarking script, please use this guide.