kantv
A workbench for learning and practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg
Stars: 75
KanTV is an open-source project that focuses on studying and practicing state-of-the-art AI technology in real applications and scenarios, such as online TV playback, transcription, translation, and video/audio recording. It is derived from the original ijkplayer project and includes many enhancements and new features, including:
* Watching online TV and local media using a customized FFmpeg 6.1.
* Recording online TV to automatically generate videos.
* Studying ASR (Automatic Speech Recognition) using whisper.cpp.
* Studying LLM (Large Language Model) using llama.cpp.
* Studying SD (Text to Image by Stable Diffusion) using stablediffusion.cpp.
* Generating real-time English subtitles for English online TV using whisper.cpp.
* Running/experiencing LLM on Xiaomi 14 using llama.cpp.
* Setting up a customized playlist and using the software to watch the content for R&D activity.
* Refactoring the UI to be closer to a real commercial Android application (currently only supports English).
Some goals of this project are:
* To provide a well-maintained "workbench" for ASR researchers interested in practicing state-of-the-art AI technology in real scenarios on mobile devices (currently focusing on Android).
* To provide a well-maintained "workbench" for LLM researchers interested in practicing state-of-the-art AI technology in real scenarios on mobile devices (currently focusing on Android).
* To create an Android "turn-key project" for AI experts/researchers (who may not be familiar with regular Android software development) to focus on device-side AI R&D activity, where part of the AI R&D activity (algorithm improvement, model training, model generation, algorithm validation, model validation, performance benchmark, etc.) can be done very easily using Android Studio IDE and a powerful Android phone.
README:
KanTV ("Kan", i.e. Chinese PinYin "Kan", Chinese HanZi "看", or English "watch/listen") is an open-source project focused on studying and practising state-of-the-art AI technology in real scenarios on Android phones/devices, such as online-TV playback, online-TV transcription (real-time subtitles), online-TV language translation, and online-TV video & audio recording, all working at the same time. It is derived from the original ijkplayer, with many enhancements and new features:
- Watch online TV and local media with a customized FFmpeg 6.1; the source code of the customized FFmpeg 6.1 can be found in external/ffmpeg, in accordance with FFmpeg's license
- Record online TV to automatically generate videos (useful for short-video creators producing source material, but please respect the IPR of the original content creator/provider); record online TV's video/audio content to gather video/audio data that may be required or useful for AI R&D activity
- AI subtitle (real-time English subtitles for English online TV, aka OTT TV, powered by the great whisper.cpp). Please note that a Xiaomi 14 or another powerful Android phone is highly recommended for the AI subtitle feature; otherwise unexpected behavior may occur
- 2D graphic performance
- Set up a customized playlist and then use this software to watch the content of that playlist for R&D activity
- UI refactor (closer to a real commercial Android application; only English is currently supported as the UI language)
- Well-maintained "workbench" for ASR (Automatic Speech Recognition) researchers/developers/programmers who are interested in practising state-of-the-art AI tech (such as whisper.cpp) in real scenarios on Android phones/devices (PoC: real-time AI subtitle for online TV, aka OTT TV, on Xiaomi 14, completed between 03/05/2024 and 03/16/2024)
- Well-maintained "workbench" for LLM (Large Language Model) researchers/developers who are interested in practising state-of-the-art AI tech (such as llama.cpp) in real scenarios on Android phones/devices, or in running/experiencing LLM models (such as llama-2-7b, baichuan2-7b, qwen1_5-1_8b, gemma-2b) on Android phones/devices using the magic llama.cpp
- Well-maintained "workbench" for GGML beginners to study the internal mechanism of the GGML inference framework on Android phones/devices (PoC: Qualcomm QNN backend for ggml, completed between 03/29/2024 and 04/26/2024)
- Well-maintained "workbench" for NCNN beginners to study and practise the NCNN inference framework on Android phones/devices
- Well-maintained turn-key / self-contained project for AI researchers (who might not be familiar with regular Android software development), developers, and beginners to focus on edge/device-side AI learning and R&D activity. Some AI R&D activities (AI algorithm validation, AI model validation, performance benchmarking in the ASR, LLM, TTS, NLP, CV, ... fields) can be done very easily with the Android Studio IDE and a powerful Android phone
(depend on https://github.com/zhouwg/kantv/issues/121)
git clone https://github.com/zhouwg/kantv.git
cd kantv
git checkout master
cd kantv
- Build docker image
docker build build -t kantv --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) --build-arg USER_NAME=$(whoami)
- Run docker container
# map source code directory into docker container
docker run -it --name=kantv --volume=`pwd`:/home/`whoami`/kantv kantv
# in docker container
. build/envsetup.sh
./build/prebuild-download.sh
- Prerequisites
  - tools & utilities
  - Android Studio: download and install Android Studio manually
  - vim settings
Host OS information:
uname -a
Linux 5.8.0-43-generic #49~20.04.1-Ubuntu SMP Fri Feb 5 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/issue
Ubuntu 20.04.2 LTS \n \l
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install cmake -y
sudo apt-get install curl -y
sudo apt-get install wget -y
sudo apt-get install python -y
sudo apt-get install tcl expect -y
sudo apt-get install nginx -y
sudo apt-get install git -y
sudo apt-get install vim -y
sudo apt-get install spawn-fcgi -y
sudo apt-get install u-boot-tools -y
sudo apt-get install ffmpeg -y
sudo apt-get install openssh-client -y
sudo apt-get install nasm -y
sudo apt-get install yasm -y
sudo apt-get install openjdk-17-jdk -y
sudo dpkg --add-architecture i386
sudo apt-get install lib32z1 -y
sudo apt-get install -y android-tools-adb android-tools-fastboot autoconf \
        automake bc bison build-essential ccache cscope curl device-tree-compiler \
        expect flex ftp-upload gdisk acpica-tools libattr1-dev libcap-dev \
        libfdt-dev libftdi-dev libglib2.0-dev libhidapi-dev libncurses5-dev \
        libpixman-1-dev libssl-dev libtool make \
        mtools netcat python-crypto python3-crypto python-pyelftools \
        python3-pycryptodome python3-pyelftools python3-serial \
        rsync unzip uuid-dev xdg-utils xterm xz-utils zlib1g-dev
sudo apt-get install python3-pip -y
sudo apt-get install indent -y
pip3 install meson ninja
echo "export PATH=/home/`whoami`/.local/bin:\$PATH" >> ~/.bashrc
Alternatively, run the script below after fetching the project's source code:
./build/prebuild.sh
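Before starting a build it can help to confirm that the tools installed above are actually on PATH. The following is a minimal sketch, not part of the KanTV scripts: `missing_tools` is a hypothetical helper, and the tool names passed to it are only an illustrative subset of the package list above.

```shell
#!/usr/bin/env bash
# Sketch: report which build tools are not yet available on PATH.
# missing_tools is a hypothetical helper, not provided by KanTV.

# Print the name of every argument that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: check a handful of the tools installed above.
missing_tools git cmake curl wget nasm yasm
```

An empty result means every listed tool was found; any name printed still needs to be installed, for example with the apt-get commands above or with ./build/prebuild.sh.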
Borrowed from http://ffmpeg.org/developer.html#Editor-configuration
set ai
set nu
set expandtab
set tabstop=4
set shiftwidth=4
set softtabstop=4
set noundofile
set nobackup
set fileformat=unix
set undodir=~/.undodir
set cindent
set cinoptions=(0
" Allow tabs in Makefiles.
autocmd FileType make,automake set noexpandtab shiftwidth=8 softtabstop=8
" Trailing whitespace and tabs are forbidden, so highlight them.
highlight ForbiddenWhitespace ctermbg=red guibg=red
match ForbiddenWhitespace /\s\+$\|\t/
" Do not highlight spaces at the end of line while typing on that line.
autocmd InsertEnter * match ForbiddenWhitespace /\t\|\s\+\%#\@<!$/
- Download android-ndk-r26c to prebuilts/toolchain; skip this step if android-ndk-r26c already exists
. build/envsetup.sh
./build/prebuild-download.sh
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target Android device is a Xiaomi 14 or another Android phone based on the Qualcomm Snapdragon 8 Gen 3 SoC
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target is a Qualcomm SoC based Android phone and the QNN backend for the inference framework should be enabled on it
- Remove the hardcoded debug flag in the Android NDK (see the android-ndk issue)
# open $ANDROID_NDK/build/cmake/android.toolchain.cmake for ndk < r23
# or $ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake for ndk >= r23
# delete "-g" line
list(APPEND ANDROID_COMPILER_FLAGS
  -g
  -DANDROID
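The manual edit described above can also be scripted. The following is a minimal sketch under stated assumptions: `strip_debug_flag` is a hypothetical helper (not part of the NDK or of KanTV), and it assumes the flag sits on a line of its own, as in the excerpt above.

```shell
#!/usr/bin/env bash
# Sketch: delete the standalone "-g" line from an NDK toolchain file.
# strip_debug_flag is a hypothetical helper, not part of the NDK or KanTV.

strip_debug_flag() {
    # $1: path to android.toolchain.cmake or android-legacy.toolchain.cmake
    file=$1
    tmp="$file.tmp.$$"
    # Remove only lines that consist of "-g" plus optional whitespace,
    # so flags such as -DANDROID are left untouched.
    sed '/^[[:space:]]*-g[[:space:]]*$/d' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage (keep a backup of the toolchain file first):
#   cp "$ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake" /tmp/toolchain.bak
#   strip_debug_flag "$ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake"
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed.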
. build/envsetup.sh
- Option 1: Build APK from source code by Android Studio IDE
- Option 2: Build APK from source code by command line
. build/envsetup.sh
lunch 1
./build-all.sh android
This is a learning & research project, so the Android APK does not collect or upload user data from the Android device. The APK should work well on any mainstream Android phone (issue reports from various Android phones are greatly welcomed), and the following four permissions are required:
- Access to storage is required to generate necessary temporary files
- Access to device information is required to obtain current phone network status information, distinguishing whether the current network is Wi-Fi or mobile when playing online TV
- Access to camera is needed for AI Agent
- Access to mic(audio recorder) is needed for AI Agent
Here is a short video demonstrating AI subtitle by running the great whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
https://github.com/zhouwg/kantv/assets/6889919/2fabcb24-c00b-4289-a06e-05b98ecd22b8
Here is a screenshot demonstrating LLM inference by running the magic llama.cpp on a Xiaomi 14 device - fully offline, on-device.
Here is a screenshot demonstrating ASR inference by running the excellent whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
Here are some screenshots demonstrating CV inference by running the excellent ncnn on a Xiaomi 14 device - fully offline, on-device.
- improve the quality of Qualcomm QNN backend for GGML
- improve the performance of edge-AI inference on Android phone
- bugfix in UI layer (Java)
- bugfix in native layer (C/C++)
Be sure to review the open issues before contributing to project KanTV. We use GitHub issues for tracking requests and bugs; please see how to submit an issue in this project.
Issue reports from various Android-based phones, and even PRs to this project, are greatly welcomed.
- How to verify Qualcomm QNN backend for GGML on Qualcomm mobile SoC based android device
- How to setup customized KanTV server in local dev env
- How to create customized playlist for kantv apk
- How to integrate proprietary/open source codes to project KanTV for personal/proprietary/commercial R&D activity
- How to use whisper.cpp and ffmpeg to add subtitle to video
- How to reduce the size of apk
- How to sign apk
- How to validate AI algorithm/model on Android using this project
- Why focus on ggml & ncnn edge-AI inference framework
- What is the most difficult problem for this project
- Acknowledgement
- ChangeLog
- F.A.Q
- AI inference framework
- GGML by Georgi Gerganov
- NCNN by Tencent
- AI application engine
- ASR engine whisper.cpp by Georgi Gerganov
- LLM engine llama.cpp by Georgi Gerganov
Copyright (c) 2021 - 2023 Project KanTV
Copyright (c) 2024 - Authors of Project KanTV
Licensed under Apache v2.0 or later
Alternative AI tools for kantv
Similar Open Source Tools
serverless-rag-demo
The serverless-rag-demo repository showcases a solution for building a Retrieval Augmented Generation (RAG) system using Amazon Opensearch Serverless Vector DB, Amazon Bedrock, Llama2 LLM, and Falcon LLM. The solution leverages generative AI powered by large language models to generate domain-specific text outputs by incorporating external data sources. Users can augment prompts with relevant context from documents within a knowledge library, enabling the creation of AI applications without managing vector database infrastructure. The repository provides detailed instructions on deploying the RAG-based solution, including prerequisites, architecture, and step-by-step deployment process using AWS Cloudshell.
aircrack-ng
Aircrack-ng is a comprehensive suite of tools designed to evaluate the security of WiFi networks. It covers various aspects of WiFi security, including monitoring, attacking (replay attacks, deauthentication, fake access points), testing WiFi cards and driver capabilities, and cracking WEP and WPA PSK. The tools are command line-based, allowing for extensive scripting and have been utilized by many GUIs. Aircrack-ng primarily works on Linux but also supports Windows, macOS, FreeBSD, OpenBSD, NetBSD, Solaris, and eComStation 2.
ktransformers
KTransformers is a flexible Python-centric framework designed to enhance the user's experience with advanced kernel optimizations and placement/parallelism strategies for Transformers. It provides a Transformers-compatible interface, RESTful APIs compliant with OpenAI and Ollama, and a simplified ChatGPT-like web UI. The framework aims to serve as a platform for experimenting with innovative LLM inference optimizations, focusing on local deployments constrained by limited resources and supporting heterogeneous computing opportunities like GPU/CPU offloading of quantized models.
PowerInfer
PowerInfer is a high-speed Large Language Model (LLM) inference engine designed for local deployment on consumer-grade hardware, leveraging activation locality to optimize efficiency. It features a locality-centric design, hybrid CPU/GPU utilization, easy integration with popular ReLU-sparse models, and support for various platforms. PowerInfer achieves high speed with lower resource demands and is flexible for easy deployment and compatibility with existing models like Falcon-40B, Llama2 family, ProSparse Llama2 family, and Bamboo-7B.
TestSpark
TestSpark is a plugin for generating unit tests that integrates AI-based test generation tools. It supports LLM-based test generation using OpenAI, HuggingFace, and JetBrains internal AI Assistant platform, as well as local search-based test generation using EvoSuite. Users can configure test generation settings, interact with test cases, view coverage statistics, and integrate tests into projects. The plugin is designed for experimental use to augment existing test suites, not replace manual test writing.
quark-engine
Quark Engine is an AI-powered tool designed for analyzing Android APK files. It focuses on enhancing the detection process for auto-suggestion, enabling users to create detection workflows without coding. The tool offers an intuitive drag-and-drop interface for workflow adjustments and updates. Quark Agent, the core component, generates Quark Script code based on natural language input and feedback. The project is committed to providing a user-friendly experience for designing detection workflows through textual and visual methods. Various features are still under development and will be rolled out gradually.
Open-LLM-VTuber
Open-LLM-VTuber is a voice-interactive AI companion supporting real-time voice conversations and featuring a Live2D avatar. It can run offline on Windows, macOS, and Linux, offering web and desktop client modes. Users can customize appearance and persona, with rich LLM inference, text-to-speech, and speech recognition support. The project is highly customizable, extensible, and actively developed with exciting features planned. It provides privacy with offline mode, persistent chat logs, and various interaction features like voice interruption, touch feedback, Live2D expressions, pet mode, and more.
Sunshine-AIO
Sunshine-AIO is an all-in-one step-by-step guide to set up Sunshine with all necessary tools for Windows users. It provides a dedicated display for game streaming, virtual monitor switching, automatic resolution adjustment, resource-saving features, game launcher integration, and stream management. The project aims to evolve into an AIO tool as it progresses, welcoming contributions from users.
obs-localvocal
LocalVocal is a Speech AI assistant OBS Plugin that enables users to transcribe speech into text and translate it into any language locally on their machine. The plugin runs OpenAI's Whisper for real-time speech processing and prediction. It supports features like transcribing audio in real-time, displaying captions on screen, sending captions to files, syncing captions with recordings, and translating captions to major languages. Users can bring their own Whisper model, filter or replace captions, and experience partial transcriptions for streaming. The plugin is privacy-focused, requiring no GPU, cloud costs, network, or downtime.
tock
Tock is an open conversational AI platform for building bots. It offers a natural language processing open source stack compatible with various tools, a user interface for building stories and analytics, a conversational DSL for different programming languages, built-in connectors for text/voice channels, toolkits for custom web/mobile integration, and the ability to deploy anywhere in the cloud or on-premise with Docker.
svelte-commerce
Svelte Commerce is an open-source frontend for eCommerce, utilizing a PWA and headless approach with a modern JS stack. It supports integration with various eCommerce backends like MedusaJS, Woocommerce, Bigcommerce, and Shopify. The API flexibility allows seamless connection with third-party tools such as payment gateways, POS systems, and AI services. Svelte Commerce offers essential eCommerce features, is both SSR and SPA, superfast, and free to download and modify. Users can easily deploy it on Netlify or Vercel with zero configuration. The tool provides features like headless commerce, authentication, cart & checkout, TailwindCSS styling, server-side rendering, proxy + API integration, animations, lazy loading, search functionality, faceted filters, and more.
amazon-bedrock-client-for-mac
A sleek and powerful macOS client for Amazon Bedrock, bringing AI models to your desktop. It provides seamless interaction with multiple Amazon Bedrock models, real-time chat interface, easy model switching, support for various AI tasks, and native Dark Mode support. Built with SwiftUI for optimal performance and modern UI.
RWKV-Runner
RWKV Runner is a project designed to simplify the usage of large language models by automating various processes. It provides a lightweight executable program and is compatible with the OpenAI API. Users can deploy the backend on a server and use the program as a client. The project offers features like model management, VRAM configurations, user-friendly chat interface, WebUI option, parameter configuration, model conversion tool, download management, LoRA Finetune, and multilingual localization. It can be used for various tasks such as chat, completion, composition, and model inspection.
llama.vscode
llama.vscode is a local LLM-assisted text completion extension for Visual Studio Code. It provides auto-suggestions on input, allows accepting suggestions with shortcuts, and offers various features to enhance text completion. The extension is designed to be lightweight and efficient, enabling high-quality completions even on low-end hardware. Users can configure the scope of context around the cursor and control text generation time. It supports very large contexts and displays performance statistics for better user experience.
denser-retriever
Denser Retriever is an enterprise-grade AI retriever designed to streamline AI integration into applications, combining keyword-based searches, vector databases, and machine learning rerankers using xgboost. It provides state-of-the-art accuracy on MTEB Retrieval benchmarking and supports various heterogeneous retrievers for end-to-end applications like chatbots and semantic search.
For similar tasks
ai-demos
The 'ai-demos' repository is a collection of example code from presentations focusing on building with AI and LLMs. It serves as a resource for developers looking to explore practical applications of artificial intelligence in their projects. The code snippets showcase various techniques and approaches to leverage AI technologies effectively. The repository aims to inspire and educate developers on integrating AI solutions into their applications.
amazon-sagemaker-generativeai
Repository for training and deploying Generative AI models, including text-text, text-to-image generation, prompt engineering playground and chain of thought examples using SageMaker Studio. The tool provides a platform for users to experiment with generative AI techniques, enabling them to create text and image outputs based on input data. It offers a range of functionalities for training and deploying models, as well as exploring different generative AI applications.
AgentGPT
AgentGPT is a platform that allows users to configure and deploy autonomous AI agents. Users can name their own custom AI and set it on any goal. The AI will think of tasks, execute them, and learn from the results to reach the goal. The platform provides a demo experience, automatic setup CLI, and a tech stack including Next.js, FastAPI, Prisma, TailwindCSS, Zod, and more. AgentGPT is designed to help users easily create and deploy AI agents for various tasks.
openvino_build_deploy
The OpenVINO Build and Deploy repository provides pre-built components and code samples to accelerate the development and deployment of production-grade AI applications across various industries. With the OpenVINO Toolkit from Intel, users can enhance the capabilities of both Intel and non-Intel hardware to meet specific needs. The repository includes AI reference kits, interactive demos, workshops, and step-by-step instructions for building AI applications. Additional resources such as Jupyter notebooks and a Medium blog are also available. The repository is maintained by the AI Evangelist team at Intel, who provide guidance on real-world use cases for the OpenVINO toolkit.
sagentic-af
Sagentic.ai Agent Framework is a tool for creating AI agents with hot reloading dev server. It allows users to spawn agents locally by calling specific endpoint. The framework comes with detailed documentation and supports contributions, issues, and feature requests. It is MIT licensed and maintained by Ahyve Inc.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
* Self-contained, with no need for a DBMS or cloud service.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g. Cloud IDE).
* Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.