Semi-Auto-NovelAI-to-Pixiv
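A NovelAI mass-production tool with a WebUI. It implements batch text-to-image; batch image-to-image; video redrawing; tiled redrawing; batch Vibe; batch inpainting; batch upscaling and denoising; batch automatic mosaic censoring; batch watermarking; batch upload to Pixiv; image filtering; batch erasing, restoring, or exporting of generation metadata; spell (PNG info) parsing; multi-model prompt inference from images; ChatGPT; dynamically loaded plugins; automatic rolling of art-style strings; batch Enhance; a tag selector; doodle redrawing; image compression and organization; batch AI tools; wildcard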
Stars: 192
Semi-Auto-NovelAI-to-Pixiv is a powerful tool that enables batch image generation with NovelAI, along with various other useful features in a super user-friendly interface. It allows users to create images, generate random images, upload images to Pixiv, apply filters, enhance images, add watermarks, and more. The tool also supports video-to-image conversion and various image manipulation tasks. It offers a seamless experience for users looking to automate image processing tasks.
README:
English document: README_EN.md
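- This is a magical project that implements batch image generation, which NovelAI itself cannot do!
- It does more than generate images: it is a super user interface that bundles all kinds of practical features!
- If you run into problems, please ask in the QQ group: 559063963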
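[!TIP] That day the rain poured down amid thunder and lightning, and the wind howled, as if the whole world were being shaken by some unknown force.
✨ Features implemented so far: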
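Feature | Description | Example | Notes |
---|---|---|---|
Tutorial | Introduction to this project and usage tutorial | Please read it carefully | |
Text-to-image | A user interface for NovelAI written with Gradio; apart from the different interface, it is fully equivalent to using the NovelAI website | Generated images are saved to the ./output/t2i folder | |
Random blueprint | Generates one blueprint, or blueprints endlessly, by randomly combining tags from ./files/favorite | For fixed items, the last three text boxes take the corresponding keys from favorite. When a fixed action or fixed character is filled in, the matching action type and source are required; fields left blank are randomized. For random blueprint settings, see the other sections of the WebUI configuration page | |
Random image | Reads the *.txt files in ./file/prompt and appends the entered prompt to build the final prompt, generating images endlessly. The negative prompt is picked at random from the negative prompts under favorite negative belief, and all other parameters come from the env configuration. It stops once every *.txt file in the folder has been used once or when Stop is clicked | For random image settings, see the other sections of the WebUI configuration page | |
Vibe | Equivalent to using the NovelAI website; I added batch processing | Put the images in one folder and rename each to the format (any name without underscores)_(information extracted, a float between 0 and 1)_(reference strength, a float between 0 and 1).png, for example hoshino-hinata_1.0_0.6 | |
Image-to-image | Equivalent to using the NovelAI website and works with any image; I also added batch image-to-image | Generated images are saved to the ./output/i2i folder. A temporary image named temp.png is created in ./output and can be deleted. For batch processing, put the images in one folder, for example ./output/choose_to_i2i | |
Director tools | Edit your images with different AI tools | Fully equivalent to the official website; I added batch processing | |
Video redrawing | Redraws a video in a few steps, used to convert live-action footage into anime style | Experimental feature; suggestions are welcome | |
Tiled redrawing | Splits a large image into 640x640 tiles, then upscales each tile to 1024x1024 via image-to-image. There is no need to worry about harsh seams between tiles: I use rife-ncnn-vulkan, an open-source project from the acknowledgments list, to repair them | Because it is time-consuming, only single-image upscaling is currently available. Provide either an image or an image path | |
Inpainting | Only supports images generated by NovelAI and requires a mask upload; batch operation is supported | The mask should be white in the area to repaint and transparent (not black) elsewhere, with the same resolution as the image being repainted. For batch operation, put the images and masks in two folders and make sure each image and its mask share the same file name, for example ./output/inpaint/img and ./output/inpaint/mask. Generated images are saved to ./output/inpaint | |
Upscaling and denoising | Upscales and denoises images with the open-source projects in the acknowledgments list; supports any image, singly or in batches | Generated images are saved to the ./output/upscale folder. srmd-cuda is not recommended because it is unstable. When waifu2x-caffe or waifu2x-converter is used, a temporary batch file named temp.bat is created in ./output and can be deleted. For batch processing, put the images in one folder, for example ./output/choose_to_upscale | |
Automatic mosaic | Automatically detects sensitive areas in an image and censors them | Detection is not guaranteed to catch everything. Generated images are saved to the ./output/mosaic folder. For batch processing, put the images in one folder, for example ./output/choose_to_mosaic | |
Watermarking | Adds a specified number of random watermarks with random opacity at a random position within the top-left, top-right, bottom-left, or bottom-right region of the image | Before use, put some of your own watermarks in the ./files/water folder. To use it, enter the directory of images to process and click Confirm; processed images are saved to ./output/water | |
Upload to Pixiv | Uploads images to Pixiv in batches | For Pixiv upload settings, see the other sections of the WebUI configuration page | |
Image filtering | A tool for manually filtering images | Enter the image directory and click Confirm first, then enter the output directory. A file named array_data.npy is created in ./output to save the progress of the last session, so you can continue filtering without choosing the image directory again; it is deleted automatically once filtering is finished | |
Metadata erasing | Erases, restores, or exports image generation metadata in batches | To restore metadata, prepare *.png images that contain at least the prompt, or *.txt files whose content is the prompt, and put them in one directory (the metadata source directory). File names (without extension) in the directory of images to restore must match those in the metadata source directory | |
Spell parsing | Reads PNG info using an open-source project from the acknowledgments list | Embedded in this project via an iframe | |
Tagger | Uses the tagging models deployed by SmilingWolf on Hugging Face; I added batch operation | In batch mode, the generated prompt text files are saved in the same directory as the images | |
GPT Free | Free, multi-model GPT, using an open-source project from the acknowledgments list | Embedded in this project via an iframe | |
Plugin store | Shows all plugins in the plugin list (./files/plugins.json) | To install, copy and paste the plugin's name into the name box at the top left and click Install; it takes effect after a restart | |
Settings | Change configuration items in the WebUI | Remember to save your changes; they take effect after a restart | |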
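The Vibe row above describes a file-name convention of the form name_value_value.png. As a minimal sketch of how such names could be parsed and validated before a batch run (the names `VibeRef` and `parse_vibe_filename` are hypothetical and this is not the project's actual code; it also assumes both numbers may take any value from 0 to 1 inclusive, as the example file name suggests):

```python
from pathlib import Path
from typing import NamedTuple


class VibeRef(NamedTuple):
    name: str                     # free-form label, must not contain underscores
    information_extracted: float  # assumed range: 0.0 to 1.0
    reference_strength: float     # assumed range: 0.0 to 1.0


def parse_vibe_filename(path: str) -> VibeRef:
    """Split '<name>_<info>_<strength>.png' into its three parts.

    The path must include the file extension, because Path.stem only
    removes the final suffix.
    """
    stem = Path(path).stem  # drop the directory and the '.png' extension
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two underscores in {stem!r}")
    name, info, strength = parts[0], float(parts[1]), float(parts[2])
    for value in (info, strength):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value {value} in {stem!r} is outside 0..1")
    return VibeRef(name, info, strength)


print(parse_vibe_filename("hoshino-hinata_1.0_0.6.png"))
```

Running a check like this over the folder first makes a misnamed file (for example one whose label contains an underscore) fail early with a clear message instead of partway through a batch.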
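- Currently learning Gradio and trying to write a WebUI for this project
- Dynamic plugin loading is implemented, improving the project's extensibility!
- Plugins submitted to the store: plugin list
[!TIP] I walked alone down the slippery, muddy street, with only a few streetlights flickering forlornly in the dark beside me.
- Extremely low system requirements, the ultimate user experience!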
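Item | Notes |
---|---|
NovelAI subscription | For unlimited image generation, the $25/month subscription is recommended |
Proxy ("magic network") | To send requests successfully, make sure you can access the relevant websites |
1 GB VRAM | At least 1 GB of VRAM is needed to use all upscaling and denoising engines |
2 GB RAM | At least 2 GB of RAM is needed to run this project smoothly |
Windows 10/11 (x64) | 64-bit Windows 10/11 is required to use all features |
Microsoft Visual C++ 2015 | The runtime must be installed to use all upscaling and denoising engines |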
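[!WARNING] From afar came a few cats' screeches, as if they were the night's only notes; dark and desolate, the cold wind biting, alone and forlorn.
- If you like this project, please consider giving it a Star🌟; it is the developer's greatest motivation
- Python 3.10.11 is recommended. During installation, check Add Python to PATH and leave everything else at the defaults
- Installing the latest version is recommended; just click Next all the way through the installer
- Open cmd or powershell and run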
git clone -b main --depth=1 https://github.com/zhulinyv/Semi-Auto-NovelAI-to-Pixiv.git
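- You can now run run.bat in the project root to start the WebUI. The first launch automatically creates a virtual environment and installs the dependencies, which takes a while; go make a coffee or keep reading the documentation below
If you find the steps above hard to follow or run into problems, please ask in the group or download the all-in-one package Semi-Auto-NovelAI-to-Pixiv.7z
It works right after extraction; users of the all-in-one package should run 整合包启动(Modpack launcher).bat
[!TIP] Moonlight filtered through the thin clouds and spilled onto the ground, sketching a ghostly picture.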
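- ⚠️ 1. If you have already started the WebUI but have not completed the required configuration, go to the settings page and do so
- ⚠️ 2. Do not skip this step; it is very important. Make sure you have looked through every configuration item
- ⚠️ 3. You can also configure the project by editing the .env file directly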
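Because the configuration can also be edited in .env, it can be handy to see what the file currently contains as plain KEY=VALUE pairs. The following is only a minimal sketch using the standard library: the function name `parse_env` is hypothetical, the keys in the sample (`token`, `proxy`) are placeholders rather than the project's real configuration names, and the real file may use syntax this simple reader does not handle.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical sample; the real .env uses the project's own key names.
sample = """
# placeholder keys for illustration only
token="xxxx"
proxy=http://127.0.0.1:7890
"""
print(parse_env(sample))
```

To inspect the real file, pass the text of .env (read with UTF-8 encoding) to the function instead of the sample string.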
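[!WARNING] Those cats' screeches drifted away at times and came nearer at others; I did not know where the road beneath my feet led.
- 1. Open https://www.pixiv.net/illustration/create and upload an image manually
- 2. Select the tags, age restriction, AI-generated work, visibility, comment settings, and original work options
- 3. Press F12 to open the developer console and switch to the Network view
- 4. Click Submit
- 5. Find and click illustration, then switch to the Headers tab on the right
- 6. Cookie and X-Csrf-Token can be found in the request headers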
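The two values found in step 6 are ordinary HTTP request headers. As a rough sketch of how they could be assembled in Python (the function name `build_pixiv_headers` and the placeholder values are hypothetical; the project reads these values from its own configuration and may send additional headers), the request below is only constructed, never sent:

```python
import urllib.request


def build_pixiv_headers(cookie: str, csrf_token: str) -> dict[str, str]:
    """Return the two authentication headers copied from the browser."""
    if not cookie or not csrf_token:
        raise ValueError("both Cookie and X-Csrf-Token are required")
    return {"Cookie": cookie.strip(), "X-Csrf-Token": csrf_token.strip()}


# Placeholder values; paste the real ones from the browser's Network view.
headers = build_pixiv_headers("PHPSESSID=placeholder", "placeholder-token")

# Build (but do not send) a request object that carries the headers.
request = urllib.request.Request(
    "https://www.pixiv.net/illustration/create", headers=headers
)
print(request.get_header("Cookie"))
```

Both values act as login credentials for the account, so they should stay in local configuration and never be committed or shared.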
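- Run run.bat; it automatically opens the default browser and navigates to 127.0.0.1:11451
- For users of older versions: running the individual scripts is no longer recommended; please use the WebUI
- If you really need them (for example, the browser has been added to the sleep whitelist but generation still stops when the page is inactive), configure the directories and other parameters in the WebUI and click Generate standalone scripts (you can also read the source code and write your own standalone scripts), then run run_stand_alone_scripts.bat in the project root
- For plugin development, see the Wiki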
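If the browser does not open after run.bat starts, one quick check is whether anything is listening on the WebUI address mentioned above, 127.0.0.1:11451. This is a small sketch with the standard library; the helper name `is_webui_up` is hypothetical, and a successful connection only shows that some process accepts TCP connections on that port, not that it is this WebUI.

```python
import socket


def is_webui_up(host: str = "127.0.0.1", port: int = 11451,
                timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("WebUI reachable:", is_webui_up())
```

During the first launch the check may stay False for a while, since the virtual environment and dependencies are installed before the server starts.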
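[!TIP] Above was endless darkness, below was endless darkness, as if I had fallen into a boundless vortex.
[!TIP] The darkness, like a pair of invisible hands, held me tight and swallowed my thoughts whole.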
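Expand to view the to-do list
- [x] Batch text-to-image
- [x] Batch image-to-image
- [x] Batch upload to Pixiv
- [x] Calculate remaining points
- [x] Batch waifu2x
- [x] Batch inpainting
- [x] Batch vibe
- [x] Batch mosaic censoring
- [x] Write a WebUI with Gradio
- [ ] Run the project persistently in a container
- [x] Restyle the interface
- [x] Add ChatGPT
- [x] Write an image filtering tool
- [ ] Obtain the token from account and password
- [x] Add more upscaling engines
- [x] Add text-to-image modes
- [x] Batch watermarking
- [x] Batch image metadata processing
- [x] Configuration page
- [x] Open related folders
- [x] Merge the random blueprint and similar pages
- [x] Hotkeys for fast image filtering
- [x] Tutorial and instructions page
- [x] Custom plugins
- [x] Automatic generation of standalone scripts
- [x] Specify the number of images for text-to-image
- [x] Click to toggle a random seed for text-to-image
- [x] Configuration option for whether to restore image metadata
- [x] Complete standalone script generation
- [x] Categorized image saving
- [x] Support for non-text-to-image plugins
- [x] Video redrawing
- [x] Prompt inference from images
- [x] Tiled redrawing
- [ ] Add more frame interpolation engines
- [x] Translate the remaining pages
- [x] Automatic updates
- [x] Plugin store
- [x] Custom metadata clearing
- [x] Automatic plugin installation
- [x] Proxy configuration
- [x] Batch Enhance
- [ ] Custom save directory
- [ ] Learn C# and write a launcher with wpfui
- [x] YOLO-based NSFW detection
- [x] Startup logo (even added a notification sound)
- [x] Rename functions and variables
- [ ] Interrupt text-to-image generation
- [x] Read the plugin list from a remote repository
- [x] Plugin update and uninstall
- [x] Add a copy operation to image filtering
- [x] All-in-one package
- [x] New mosaic methods
- [x] Improved mask upload for inpainting
- [x] Doodle redrawing
- [ ] Local upscale redrawing
- [x] Image compression and categorized organization
- [ ] Save vibe styles
- [x] Roll back vibe random images
- [x] Simplify favorite editing
- [ ] Learn JS and write autocompletion
- [ ] Simplify vibe image upload
- [x] Custom resolution
- [ ] Integrate with SD
- [ ] ...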
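This project uses waifu2x-ncnn-vulkan | Anime4KCPP | realcugan-ncnn-vulkan | realesrgan-ncnn-vulkan | realsr-ncnn-vulkan | srmd-cuda | srmd-ncnn-vulkan | waifu2x-caffe | waifu2x-converter to denoise and upscale images
This project uses Genshin-Sync to upload images to Pixiv
This project uses GPT4FREE to provide GPT services
This project uses novelai-image-metadata to erase metadata
This project uses SmilingWolf/wd-tagger to infer prompts from images
This project uses rife-ncnn-vulkan to repair the seams in tiled redraws
This project uses some of the art-style strings provided by 300画风法典
This project uses the various action prompts provided by 涩涩法典梦神版
[!NOTE] Falling, falling.
Disclaimer: This software only provides a technical service. The developer assumes no liability for any legal responsibility or loss that may arise from its use; users bear full responsibility for their use of this software and its results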
Alternative AI tools for Semi-Auto-NovelAI-to-Pixiv
Similar Open Source Tools
Steel-LLM
Steel-LLM is a project to pre-train a large Chinese language model from scratch using over 1T of data to achieve a parameter size of around 1B, similar to TinyLlama. The project aims to share the entire process including data collection, data processing, pre-training framework selection, model design, and open-source all the code. The goal is to enable reproducibility of the work even with limited resources. The name 'Steel' is inspired by a band '万能青年旅店' and signifies the desire to create a strong model despite limited conditions. The project involves continuous data collection of various cultural elements, trivia, lyrics, niche literature, and personal secrets to train the LLM. The ultimate aim is to fill the model with diverse data and leave room for individual input, fostering collaboration among users.
chatluna
Chatluna is a machine learning model plugin that provides chat services with large language models. It is highly extensible, supports multiple output formats, and offers features like custom conversation presets, rate limiting, and context awareness. Users can deploy Chatluna under Koishi without additional configuration. The plugin supports various models/platforms like OpenAI, Azure OpenAI, Google Gemini, and more. It also provides preset customization using YAML files and allows for easy forking and development within Koishi projects. However, the project lacks web UI, HTTP server, and project documentation, inviting contributions from the community.
agentica
Agentica is a human-centric framework for building large language model agents. It provides functionalities for planning, memory management, tool usage, and supports features like reflection, planning and execution, RAG, multi-agent, multi-role, and workflow. The tool allows users to quickly code and orchestrate agents, customize prompts, and make API calls to various services. It supports API calls to OpenAI, Azure, Deepseek, Moonshot, Claude, Ollama, and Together. Agentica aims to simplify the process of building AI agents by providing a user-friendly interface and a range of functionalities for agent development.
99AI
99AI is a commercializable AI web application based on NineAI 2.4.2 (no authorization, no backdoors, no piracy, integrated front-end and back-end integration packages, supports Docker rapid deployment). The uncompiled source code is temporarily closed. Compared with the stable version, the development version is faster.
HivisionIDPhotos
HivisionIDPhoto is a practical algorithm for intelligent ID photo creation. It utilizes a comprehensive model workflow to recognize, cut out, and generate ID photos for various user photo scenarios. The tool offers lightweight cutting, standard ID photo generation based on different size specifications, six-inch layout photo generation, beauty enhancement (waiting), and intelligent outfit swapping (waiting). It aims to solve emergency ID photo creation issues.
ipex-llm
IPEX-LLM is a PyTorch library for running Large Language Models (LLMs) on Intel CPUs and GPUs with very low latency. It provides seamless integration with various LLM frameworks and tools, including llama.cpp, ollama, Text-Generation-WebUI, HuggingFace transformers, and more. IPEX-LLM has been optimized and verified on over 50 LLM models, including LLaMA, Mistral, Mixtral, Gemma, LLaVA, Whisper, ChatGLM, Baichuan, Qwen, and RWKV. It supports a range of low-bit inference formats, including INT4, FP8, FP4, INT8, INT2, FP16, and BF16, as well as finetuning capabilities for LoRA, QLoRA, DPO, QA-LoRA, and ReLoRA. IPEX-LLM is actively maintained and updated with new features and optimizations, making it a valuable tool for researchers, developers, and anyone interested in exploring and utilizing LLMs.
EmoLLM
EmoLLM is a series of large-scale psychological health counseling models that can support **understanding-supporting-helping users** in the psychological health counseling chain, which is fine-tuned from `LLM` instructions. Welcome everyone to star~⭐⭐. The currently open source `LLM` fine-tuning configurations are as follows:
FastGPT
FastGPT is a knowledge base Q&A system based on the LLM large language model, providing out-of-the-box data processing, model calling and other capabilities. At the same time, you can use Flow to visually arrange workflows to achieve complex Q&A scenarios!
Hands-On-Large-Language-Models-CN
Hands-On Large Language Models CN(ZH) is a Chinese version of the book 'Hands-On Large Language Models' by Jay Alammar and Maarten Grootendorst. It provides detailed code annotations and additional insights, offers Notebook versions suitable for Chinese network environments, utilizes openbayes for free GPU access, allows convenient environment setup with vscode, and includes accompanying Chinese language videos on platforms like Bilibili and YouTube. The book covers various chapters on topics like Tokens and Embeddings, Transformer LLMs, Text Classification, Text Clustering, Prompt Engineering, Text Generation, Semantic Search, Multimodal LLMs, Text Embedding Models, Fine-tuning Models, and more.
LLaMA-Factory
LLaMA Factory is a unified framework for fine-tuning 100+ large language models (LLMs) with various methods, including pre-training, supervised fine-tuning, reward modeling, PPO, DPO and ORPO. It features integrated algorithms like GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, LoRA+, LoftQ and Agent tuning, as well as practical tricks like FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA. LLaMA Factory provides experiment monitors like LlamaBoard, TensorBoard, Wandb, MLflow, etc., and supports faster inference with OpenAI-style API, Gradio UI and CLI with vLLM worker. Compared to ChatGLM's P-Tuning, LLaMA Factory's LoRA tuning offers up to 3.7 times faster training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.
GoMaxAI-ChatGPT-Midjourney-Pro
GoMaxAI Pro is an AI-powered application for personal, team, and enterprise private operations. It supports various models like ChatGPT, Claude, Gemini, Kimi, Wenxin Yiyuan, Xunfei Xinghuo, Tsinghua Zhipu, Suno-v3.5, and Luma-video. The Pro version offers a new UI interface, member points system, management backend, homepage features, support for various content formats, AI video capabilities, SAAS multi-opening function, bug fixes, and more. It is built using web frontend with Vue3, mobile frontend with Uniapp, management frontend with Vue3, backend with Nodejs, and uses MySQL5.7(+) + Redis for data support. It can be deployed on Linux, Windows, or MacOS, with data storage options including local storage, Aliyun OSS, Tencent Cloud COS, and Chevereto image bed.
build_MiniLLM_from_scratch
This repository aims to build a low-parameter LLM model through pretraining, fine-tuning, model rewarding, and reinforcement learning stages to create a chat model capable of simple conversation tasks. It features using the bert4torch training framework, seamless integration with transformers package for inference, optimized file reading during training to reduce memory usage, providing complete training logs for reproducibility, and the ability to customize robot attributes. The chat model supports multi-turn conversations. The trained model currently only supports basic chat functionality due to limitations in corpus size, model scale, SFT corpus size, and quality.
Llama-Chinese
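The Llama Chinese Community is an advanced technical community focused on optimizing Llama models for Chinese and building on top of them. **Starting from pre-training on large-scale Chinese data, it has continuously iterated on the Chinese capabilities of the Llama2 model [Done]**. **It is now continuously iterating on the Chinese capabilities of the Llama3 model [Doing]** Developers and researchers who are passionate about LLMs are warmly welcome to join.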
ChatGPT-On-CS
ChatGPT-On-CS is an intelligent chatbot tool based on large models, supporting various platforms like WeChat, Taobao, Bilibili, Douyin, Weibo, and more. It can handle text, voice, and image inputs, access external resources through plugins, and customize enterprise AI applications based on proprietary knowledge bases. Users can set custom replies, utilize ChatGPT interface for intelligent responses, send images and binary files, and create personalized chatbots using knowledge base files. The tool also features platform-specific plugin systems for accessing external resources and supports enterprise AI applications customization.
For similar tasks
gpupixel
GPUPixel is a real-time, high-performance image and video filter library written in C++11 and based on OpenGL/ES. It incorporates a built-in beauty face filter that achieves commercial-grade beauty effects. The library is extremely easy to compile and integrate with a small size, supporting platforms including iOS, Android, Mac, Windows, and Linux. GPUPixel provides various filters like skin smoothing, whitening, face slimming, big eyes, lipstick, and blush. It supports input formats like YUV420P, RGBA, JPEG, PNG, and output formats like RGBA and YUV420P. The library's performance on devices like iPhone and Android is optimized, with low CPU usage and fast processing times. GPUPixel's lib size is compact, making it suitable for mobile and desktop applications.
painting-droid
Painting Droid is an AI-powered cross-platform painting app inspired by MS Paint, expandable with plugins and open. It utilizes various AI models, from paid providers to self-hosted open-source models, as well as some lightweight ones built into the app. Features include regular painting app features, AI-generated content filling and augmentation, filters and effects, image manipulation, plugin support, and cross-platform compatibility.
Topaz-Video-AI
Topaz-Video-AI is a software tool designed to enhance video quality and provide various editing features. Users can utilize this tool to improve the visual appeal of their videos by applying filters, adjusting colors, and enhancing details. The software offers a user-friendly interface and a range of customization options to cater to different editing needs. Despite potential triggers from antivirus programs, Topaz-Video-AI is safe to use and has been tested by numerous users. By following the provided instructions, users can easily download, install, and run the software to enhance their video content.
StableSwarmUI
StableSwarmUI is a modular Stable Diffusion web user interface that emphasizes making power tools easily accessible, high performance, and extensible. It is designed to be a one-stop-shop for all things Stable Diffusion, providing a wide range of features and capabilities to enhance the user experience.
upscayl
Upscayl is a free and open-source AI image upscaler that uses advanced AI algorithms to enlarge and enhance low-resolution images without losing quality. It is a cross-platform application built with the Linux-first philosophy, available on all major desktop operating systems. Upscayl utilizes Real-ESRGAN and Vulkan architecture for image enhancement, and its backend is fully open-source under the AGPLv3 license. It is important to note that a Vulkan compatible GPU is required for Upscayl to function effectively.
ailia-models
The collection of pre-trained, state-of-the-art AI models. ailia SDK is a self-contained, cross-platform, high-speed inference SDK for AI. The ailia SDK provides a consistent C++ API across Windows, Mac, Linux, iOS, Android, Jetson, and Raspberry Pi platforms. It also supports Unity (C#), Python, Rust, Flutter(Dart) and JNI for efficient AI implementation. The ailia SDK makes extensive use of the GPU through Vulkan and Metal to enable accelerated computing. # Supported models 323 models as of April 8th, 2024
models
This repository contains self-trained single image super resolution (SISR) models. The models are trained on various datasets and use different network architectures. They can be used to upscale images by 2x, 4x, or 8x, and can handle various types of degradation, such as JPEG compression, noise, and blur. The models are provided as safetensors files, which can be loaded into a variety of deep learning frameworks, such as PyTorch and TensorFlow. The repository also includes a number of resources, such as examples, results, and a website where you can compare the outputs of different models.
For similar jobs
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
daily-poetry-image
Daily Chinese ancient poetry and AI-generated images powered by Bing DALL-E-3. GitHub Action triggers the process automatically. Poetry is provided by Today's Poem API. The website is built with Astro.
exif-photo-blog
EXIF Photo Blog is a full-stack photo blog application built with Next.js, Vercel, and Postgres. It features built-in authentication, photo upload with EXIF extraction, photo organization by tag, infinite scroll, light/dark mode, automatic OG image generation, a CMD-K menu with photo search, experimental support for AI-generated descriptions, and support for Fujifilm simulations. The application is easy to deploy to Vercel with just a few clicks and can be customized with a variety of environment variables.
SillyTavern
SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.
Twitter-Insight-LLM
This project enables you to fetch liked tweets from Twitter (using Selenium), save it to JSON and Excel files, and perform initial data analysis and image captions. This is part of the initial steps for a larger personal project involving Large Language Models (LLMs).
AISuperDomain
Aila Desktop Application is a powerful tool that integrates multiple leading AI models into a single desktop application. It allows users to interact with various AI models simultaneously, providing diverse responses and insights to their inquiries. With its user-friendly interface and customizable features, Aila empowers users to engage with AI seamlessly and efficiently. Whether you're a researcher, student, or professional, Aila can enhance your AI interactions and streamline your workflow.
ChatGPT-On-CS
This project is an intelligent dialogue customer service tool based on a large model, which supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, Zhihu, etc. You can choose GPT3.5/GPT4.0/ Lazy Treasure Box (more platforms will be supported in the future), which can process text, voice and pictures, and access external resources such as operating systems and the Internet through plug-ins, and support enterprise AI applications customized based on their own knowledge base.
obs-localvocal
LocalVocal is a live-streaming AI assistant plugin for OBS that allows you to transcribe audio speech into text and perform various language processing functions on the text using AI / LLMs (Large Language Models). It's privacy-first, with all data staying on your machine, and requires no GPU, cloud costs, network, or downtime.