
wiseflow
Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.
Stars: 6813

Wiseflow is an agile information mining tool that uses the thinking and analysis capabilities of large models to accurately extract specific information from a variety of given sources, with no manual intervention required. The tool focuses on filtering noise out of massive amounts of information so that valuable insights surface. For information extraction and summarization tasks, ordinary language models are recommended over complex reasoning models to optimize speed and cost. The tool is designed for continuous information gathering from various sources based on user-specified focus points.
README:
🚀 Chief Intelligence Officer (Wiseflow) is an agile information mining tool that relies on the thinking and analysis capabilities of large models to precisely extract specific information from a variety of given sources, with no manual intervention required throughout the process.
What we lack is not information, but the ability to filter the noise out of massive amounts of information so that valuable information can surface.
🌱 See how the AI Intelligence Officer helps you save time, filter out irrelevant information, and organize your points of interest! 🌱
https://github.com/user-attachments/assets/fc328977-2366-4271-9909-a89d9e34a07b
🌟 Reminder: you may have heard many people recently raving about reasoning models such as deepseek R1 (and I don't deny their strength), but information extraction and summarization tasks like wiseflow's do not require complex logical reasoning; using a reasoning model actually increases latency and cost dramatically! If you would like to look into this conclusion further, see the following test report: wiseflow V0.38 with deepseek series report
💢 Starting at midnight on March 14, 2025, the Zhipu platform will officially charge for the web_search_pro API. If you need the search feature, please keep an eye on your account balance 💢 Zhipu platform announcement
For more details on this upgrade, see CHANGELOG.md
Users of V0.3.7 or earlier should run ./pocketbase migrate once in the pb folder after upgrading.
Thanks to the following community members for their PRs in versions V0.3.5~V0.3.9:
- @ourines contributed the install_pocketbase.sh automated installation script
- @ibaoger contributed the automated pocketbase installation script for Windows
- @tusik contributed the async llm wrapper and also discovered the AsyncWebCrawler lifecycle issue
- @c469591 contributed the Windows startup script
- @braumye contributed the Docker deployment solution
- @YikaJ provided optimizations for install_pocketbase.sh
v0.3.9 is the long-term stable release of the 0.3.x series. The next open-source release of wiseflow is expected to take at least two more months, as we move to the brand-new 0.4.x architecture.
I have been thinking about the concrete product roadmap for 0.4.x; at this point I need more feedback from real users, so please share your needs and use cases in the issues section.
Under the latest extraction strategy, we found that models at the 7b scale can already perform link analysis and extraction tasks well; for test results, see the report.
For information summarization tasks, however, we still recommend models of no less than 32b; for specific recommendations, see the latest env_sample.
We continue to welcome more test submissions to jointly explore the best ways to use wiseflow with all kinds of sources.
At this stage, submitting test results counts the same as submitting project code: you will be accepted as a contributor and may even be invited to join the commercialization project! See test/README.md for details.
In short, question-answering tasks are better served by "deep search" style applications; for information-gathering tasks, try wiseflow and you will see its advantages in this area...
Since releasing V0.3.0 at the end of June 2024, wiseflow has received wide attention from the open-source community and has even attracted unsolicited coverage from quite a few independent media channels, for which we are grateful!
However, we have also noticed that some followers misunderstand wiseflow's positioning. The table below, comparing wiseflow with traditional crawler tools, AI search, and knowledge-base (RAG) projects, represents our current thinking on the product's positioning.
|  | Comparison with Chief Intelligence Officer (Wiseflow) |
|---|---|
| Crawler tools | First of all, wiseflow is built on top of crawler tools, but traditional crawler tools require users to manually supply explicit extraction rules such as XPath. This not only blocks ordinary users, it also has no generality: every website (including existing sites after an update) needs a human to redo the analysis and update the program. wiseflow aims to automate web page analysis and extraction with an LLM; the user only needs to tell the program what they care about. Taking Crawl4ai as a point of comparison: Crawl4ai is a crawler that uses an llm for information extraction, while wiseflow is an llm information extractor that uses crawler tools. |
| AI search (including all kinds of "deep search") | The main use case of AI search is instant question answering for concrete questions, e.g. "Who is the founder of company XX?" or "Where can I buy product xx of brand xx?"; the user wants a single answer. The main use case of wiseflow is the continuous collection of information on some topic, e.g. tracking information related to company XX, or continuously tracking the market activities of brand XX. In these scenarios the user can provide focus points (a company, a brand) and even sources (site URLs, etc.), but cannot formulate a concrete search question; what the user wants is a stream of related information. |
| Knowledge-base (RAG) projects | Knowledge-base (RAG) projects are generally downstream tasks over existing information and usually target private knowledge (e.g. internal operation manuals, product manuals, documents of government departments). wiseflow currently does not integrate downstream tasks and targets public information on the internet. Viewed as "agents", the two are built for different purposes: RAG projects are "(internal) knowledge assistant agents", while wiseflow is an "(external) information collection agent". |
🌹 Starring and forking are good habits 🌹
Windows users, please download the git bash tool in advance and run the following commands in bash (bash download link):
git clone https://github.com/TeamWiseFlow/wiseflow.git
Linux/macOS users, please run:
chmod +x install_pocketbase
./install_pocketbase
Windows users, please run the install_pocketbase.ps1 script.
wiseflow 0.3.x uses pocketbase as its database. You can of course also download the pocketbase client manually (remember to use version 0.23.4 and place it in the pb directory) and create the superuser manually (remember to store it in the .env file).
See pb/README.md for details.
🌟 This differs from earlier versions: starting with V0.3.5, the .env file must be placed in the core folder.
wiseflow is an LLM-native application; please make sure to provide the program with a stable LLM service.
🌟 wiseflow does not restrict where the model service comes from: any service compatible with the openAI SDK works, including locally deployed services such as ollama and Xinference.
siliconflow provides online MaaS services for most mainstream open-source models. Thanks to its accumulated work on accelerated inference, its service has significant advantages in both speed and price. When using siliconflow, the .env configuration can follow this example:
LLM_API_KEY=Your_API_KEY
LLM_API_BASE="https://api.siliconflow.cn/v1"
PRIMARY_MODEL="Qwen/Qwen2.5-32B-Instruct"
SECONDARY_MODEL="Qwen/Qwen2.5-14B-Instruct"
VL_MODEL="deepseek-ai/deepseek-vl2"
😄 If you like, you can sign up with my siliconflow referral link, so that I earn a bit more in token rewards 🌹
If most of your sources are non-Chinese pages and you do not require the extracted info to be in Chinese, then overseas closed-source commercial models such as openai, claude, or gemini are recommended. You can try the third-party proxy AiHubMix, which supports direct access from within China's network environment and convenient Alipay payment, and removes the risk of account bans. When using AiHubMix models, the .env configuration can follow this example:
LLM_API_KEY=Your_API_KEY
LLM_API_BASE="https://aihubmix.com/v1" # 具体参考 https://doc.aihubmix.com/
PRIMARY_MODEL="gpt-4o"
SECONDARY_MODEL="gpt-4o-mini"
VL_MODEL="gpt-4o"
😄 You are welcome to register via the AiHubMix referral link 🌹
Taking Xinference as an example, the .env configuration can follow this example:
# LLM_API_KEY='' (not needed for a local service; comment it out or delete it)
LLM_API_BASE='http://127.0.0.1:9997'
PRIMARY_MODEL=<ID of the model you launched>
VL_MODEL=<ID of the model you launched>
PB_API_AUTH="[email protected]|1234567890"
This is the superuser username and password for the pocketbase database, separated by | (if the install_pocketbase.sh script ran successfully, this entry should already exist).
ZHIPU_API_KEY=Your_API_KEY
(Apply at: https://bigmodel.cn/; currently 0.03 yuan per call, so please make sure your account has sufficient balance.)
All of the following settings are optional (a consolidated example follows this list):
- #VERBOSE="true": whether to enable observation mode; when enabled, debug information is written to the logger file (by default it is only printed to the console).
- #PROJECT_DIR="work_dir": the directory for runtime data; if not set, it defaults to core/work_dir. Note: the entire core directory is currently mounted into the container, which means you can access this directory directly.
- #PB_API_BASE="": only needed if your pocketbase does not run on the default IP or port; in the default setup you can simply ignore it.
- #LLM_CONCURRENT_NUMBER=8: controls the number of concurrent LLM requests; if not set, the default is 1 (before enabling this, make sure your llm provider supports the configured concurrency; be careful with locally deployed models unless you are confident in your hardware).
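Putting these together, the optional block of a .env might look like the following (values are purely illustrative; leaving a line commented out keeps its default):
VERBOSE="true"
PROJECT_DIR="work_dir"
#PB_API_BASE=""              # set only if pocketbase is not on the default address/port
LLM_CONCURRENT_NUMBER=8      # make sure your llm provider supports this concurrency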
We recommend using conda to create a virtual environment (you can of course skip this step, or use another python virtual environment solution):
conda create -n wiseflow python=3.10
conda activate wiseflow
Then run:
cd wiseflow
cd core
pip install -r requirements.txt
python -m playwright install --with-deps chromium
Then MacOS & Linux users run:
chmod +x run.sh
./run.sh
Windows users run:
python windows_run.py
The scripts above automatically check whether pocketbase is already running and start it if it is not. Note, however, that when you terminate the process with ctrl+c or ctrl+z, the pocketbase process is not terminated until you close the terminal.
run.sh first runs one crawl task over all activated sources (those with activated set to true), and then runs periodically at the configured frequency, in units of hours.
After starting the program, open the pocketbase Admin dashboard UI (http://127.0.0.1:8090/_/).
Use this form to configure your sources. Note: sources must be selected in the focus_point form in the next step.
sites field descriptions:
- url: the URL of the source. There is no need to point to a specific article page; a page listing articles is enough.
- type: the type, either web or rss.
Use this form to specify your focus points; the LLM will extract, filter, and classify information according to them (a sketch of creating these records programmatically follows the field list below).
Field descriptions:
- focuspoint: a description of the focus point (required), e.g. "Shanghai primary-to-junior-high admission information" or "tender notices".
- explanation: a detailed explanation of, or specific constraints on, the focus point, e.g. "only junior high admission information officially published by Shanghai" or "published after January 1, 2025 and with an amount above 1 million", etc.
- activated: whether the focus point is active. If turned off, the focus point is ignored; it can be turned on again later.
- per_hour: crawl frequency in hours, as an integer (range 1~24; we recommend scanning no more than once a day, i.e. setting it to 24).
- search_engine: whether to use a search engine on each crawl.
- sites: select the corresponding sources.
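For illustration only, the same records could also be created through the pocketbase Python SDK (linked in the data-access section further below) instead of the admin UI. This is a minimal sketch, not wiseflow code: the collection names ("sites", "focus_point") and field names are assumed from the form descriptions above, and the superuser auth call may differ across pocketbase/SDK versions, so verify everything against your actual schema first.
from pocketbase import PocketBase

client = PocketBase("http://127.0.0.1:8090")
# authenticate with the superuser credentials from PB_API_AUTH (adjust if your
# pocketbase/SDK version exposes superuser auth differently)
client.admins.auth_with_password("[email protected]", "1234567890")

# 1) register a source; a list page is enough, no specific article URL needed
site = client.collection("sites").create({
    "url": "https://example.com/news",    # hypothetical source URL
    "type": "web",                        # "web" or "rss"
})

# 2) register a focus point and link it to the source
client.collection("focus_point").create({
    "focuspoint": "tender notices",
    "explanation": "published after 2025-01-01 and above 1,000,000 in value",
    "activated": True,
    "per_hour": 24,          # scan at most once per day, as recommended
    "search_engine": False,
    "sites": [site.id],      # relation to the sites record created above
})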
Note: since V0.3.8, configuration changes do not require a program restart; they take effect automatically on the next run.
If you would like to deploy Wiseflow with Docker, we also provide full containerization support.
1. Make sure Docker is installed on your system.
2. Copy the env_docker file to .env in the project root:
cp env_docker .env
3. Edit the .env file with reference to "Installation and Usage" above.
At minimum, the following environment variables must be adjusted as needed:
LLM_API_KEY=""
LLM_API_BASE="https://api.siliconflow.cn/v1"
PB_SUPERUSER_EMAIL="[email protected]"
PB_SUPERUSER_PASSWORD="1234567890" #no '&' in the password and at least 10 characters
4. In the project root, run:
docker compose up -d
After the services start:
- PocketBase admin UI: http://localhost:8090/_/
- The Wiseflow service runs automatically and connects to PocketBase.
To stop the services:
docker compose down
- The ./pb/pb_data directory stores PocketBase data files.
- The ./docker/pip_cache directory caches Python dependency packages to avoid repeatedly downloading and installing them.
- The ./core/work_dir directory stores wiseflow runtime logs; PROJECT_DIR can be changed in the .env file.
1. Secondary development based on the dashboard source code.
Note that the core part of wiseflow does not need the dashboard, and the current product does not integrate one; if you need a dashboard, please download V0.2.1.
2. Fetch data directly from Pocketbase.
All data scraped by wiseflow is stored in pocketbase immediately, so you can get the data by querying the pocketbase database directly.
PocketBase is a popular lightweight database and already has SDKs for Go, Javascript, Python, and other languages (a Python example follows the links below):
- Go : https://pocketbase.io/docs/go-overview/
- Javascript : https://pocketbase.io/docs/js-overview/
- python : https://github.com/vaphes/pocketbase
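As an example, with the Python SDK above (pip install pocketbase), reading the most recently extracted items could look like the sketch below. This is only a sketch, not wiseflow code: the collection name "infos" and the "content" field are assumptions, so check the actual collection names in the pocketbase admin UI, and note that the superuser auth call may differ across pocketbase/SDK versions.
from pocketbase import PocketBase

client = PocketBase("http://127.0.0.1:8090")
# authenticate with the superuser credentials from PB_API_AUTH
client.admins.auth_with_password("[email protected]", "1234567890")

# fetch page 1 with 20 records, newest first (collection name is an assumption)
result = client.collection("infos").get_list(1, 20, {"sort": "-created"})
for record in result.items:
    print(record.created, getattr(record, "content", record))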
This project is open-sourced under Apache 2.0.
For commercial cooperation, please contact us by email: [email protected]
- Commercial customers, please contact us to register; the product promises to remain free forever.
For any questions or suggestions, feel free to leave a message via an issue.
- crawl4ai(Open-source LLM Friendly Web Crawler & Scraper) https://github.com/unclecode/crawl4ai
- pocketbase (Open Source realtime backend in 1 file) https://github.com/pocketbase/pocketbase
- python-pocketbase (pocketBase client SDK for python) https://github.com/vaphes/pocketbase
- feedparser (Parse feeds in Python) https://github.com/kurtmckee/feedparser
This project was inspired by GNE, AutoCrawler, and SeeAct.
If you reference or cite part or all of this project in your own work, please include the following information:
Author:Wiseflow Team
https://github.com/TeamWiseFlow/wiseflow
Licensed under Apache2.0
Alternative AI tools for wiseflow
Similar Open Source Tools


Code-Interpreter-Api
Code Interpreter API is a project that combines a scheduling center with a sandbox environment, dedicated to creating the world's best code interpreter. It aims to provide a secure, reliable API interface for remotely running code and obtaining execution results, accelerating the development of various AI agents, and being a boon to many AI enthusiasts. The project innovatively combines Docker container technology to achieve secure isolation and execution of Python code. Additionally, the project supports storing generated image data in a PostgreSQL database and accessing it through API endpoints, providing rich data processing and storage capabilities.

prose-polish
prose-polish is a tool for AI interaction through drag-and-drop cards, focusing on editing copy and manuscripts. It can recognize Markdown-formatted documents, automatically breaking them into paragraph cards. Users can create prefabricated prompt cards and quickly connect them to the manuscript for editing. The modified manuscript is still presented in card form, allowing users to drag it out as a new paragraph. To use it smoothly, users just need to remember one rule: 'Plug the plug into the socket!'

MarkMap-OpenAi-ChatGpt
MarkMap-OpenAi-ChatGpt is a Vue.js-based mind map generation tool that allows users to generate mind maps by entering titles or content. The application integrates the markmap-lib and markmap-view libraries, supports visualizing mind maps, and provides functions for zooming and adapting the map to the screen. Users can also export the generated mind map in PNG, SVG, JPEG, and other formats. This project is suitable for quickly organizing ideas, study notes, project planning, etc. By simply entering content, users can get an intuitive mind map that can be continuously expanded, downloaded, and shared.

MoneyPrinterTurbo
MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.

ZcChat
ZcChat is an AI desktop pet suitable for Galgame characters, featuring long-term memory, expressive actions, control over the computer, and voice functions. It utilizes Letta for AI long-term memory, Galgame-style character illustrations for more actions and expressions, and voice interaction with support for various voice synthesis tools like Vits. Users can configure characters, install Letta, set up voice synthesis and input, and control the pet to interact with the computer. The tool enhances visual and auditory experiences for users interested in AI desktop pets.

rime_wanxiang_pro
Rime Wanxiang Pro is an enhanced version of Wanxiang, supporting the 9, 14, and 18-key layouts. It features a pinyin library with optimized word and language models, supporting accurate sentence output with tones. The tool also allows for mixed Chinese and English input, offering various usage scenarios. Users can customize their input method by selecting different decoding and auxiliary code rules, enabling flexible combinations of pinyin and auxiliary codes. The tool simplifies the complex configuration of Rime and provides a unified word library for multiple input methods, enhancing input efficiency and user experience.

Long-Novel-GPT
Long-Novel-GPT is a long novel generator based on large language models like GPT. It utilizes a hierarchical outline/chapter/text structure to maintain the coherence of long novels. It optimizes API calls cost through context management and continuously improves based on self or user feedback until reaching the set goal. The tool aims to continuously refine and build novel content based on user-provided initial ideas, ultimately generating long novels at the level of human writers.

AI-Codereview-Gitlab
AI-Codereview-Gitlab is an automated code review tool based on large models, designed to help development teams conduct intelligent code reviews quickly during code merging or submission. It supports multiple large models including DeepSeek, ZhipuAI, OpenAI, and Ollama. The tool can automatically push review results to DingTalk, WeChat Work, and Feishu, generate daily reports based on GitLab commit records, and provide a visual dashboard to display code review records. The tool works by triggering webhook events on GitLab when users submit code, calling third-party large models to review the code, and recording the review results in corresponding Merge Requests or Commit Notes.

AivisSpeech
AivisSpeech is a Japanese text-to-speech software based on the VOICEVOX editor UI. It incorporates the AivisSpeech Engine for generating emotionally rich voices easily. It supports AIVMX format voice synthesis model files and specific model architectures like Style-Bert-VITS2. Users can download AivisSpeech and AivisSpeech Engine for Windows and macOS PCs, with minimum memory requirements specified. The development follows the latest version of VOICEVOX, focusing on minimal modifications, rebranding only where necessary, and avoiding refactoring. The project does not update documentation, maintain test code, or refactor unused features to prevent conflicts with VOICEVOX.

AI-Drug-Discovery-Design
AI-Drug-Discovery-Design is a repository focused on Artificial Intelligence-assisted Drug Discovery and Design. It explores the use of AI technology to accelerate and optimize the drug development process. The advantages of AI in drug design include speeding up research cycles, improving accuracy through data-driven models, reducing costs by minimizing experimental redundancies, and enabling personalized drug design for specific patients or disease characteristics.

paper-ai
Paper-ai is a tool that helps you write papers using artificial intelligence. It provides features such as AI writing assistance, reference searching, and editing and formatting tools. With Paper-ai, you can quickly and easily create high-quality papers.

MINI_LLM
This project is a personal implementation and reproduction of a small-parameter Chinese LLM. It mainly refers to these two open source projects: https://github.com/charent/Phi2-mini-Chinese and https://github.com/DLLXW/baby-llama2-chinese. It includes the complete process of pre-training, SFT instruction fine-tuning, DPO, and PPO (to be done). I hope to share it with everyone and hope that everyone can work together to improve it!

LabelQuick
LabelQuick_V2.0 is a fast image annotation tool designed and developed by the AI Horizon team. This version has been optimized and improved based on the previous version. It provides an intuitive interface and powerful annotation and segmentation functions to efficiently complete dataset annotation work. The tool supports video object tracking annotation, quick annotation by clicking, and various video operations. It introduces the SAM2 model for accurate and efficient object detection in video frames, reducing manual intervention and improving annotation quality. The tool is designed for Windows systems and requires a minimum of 6GB of memory.

CodeAsk
CodeAsk is a code analysis tool designed to tackle complex issues such as code that seems to self-replicate, cryptic comments left by predecessors, messy and unclear code, and long-lasting temporary solutions. It offers intelligent code organization and analysis, security vulnerability detection, code quality assessment, and other interesting prompts to help users understand and work with legacy code more efficiently. The tool aims to translate 'legacy code mountains' into understandable language, creating an illusion of comprehension and facilitating knowledge transfer to new team members.
For similar tasks

Awesome-Segment-Anything
Awesome-Segment-Anything is a powerful tool for segmenting and extracting information from various types of data. It provides a user-friendly interface to easily define segmentation rules and apply them to text, images, and other data formats. The tool supports both supervised and unsupervised segmentation methods, allowing users to customize the segmentation process based on their specific needs. With its versatile functionality and intuitive design, Awesome-Segment-Anything is ideal for data analysts, researchers, content creators, and anyone looking to efficiently extract valuable insights from complex datasets.

Time-LLM
Time-LLM is a reprogramming framework that repurposes large language models (LLMs) for time series forecasting. It allows users to treat time series analysis as a 'language task' and effectively leverage pre-trained LLMs for forecasting. The framework involves reprogramming time series data into text representations and providing declarative prompts to guide the LLM reasoning process. Time-LLM supports various backbone models such as Llama-7B, GPT-2, and BERT, offering flexibility in model selection. The tool provides a general framework for repurposing language models for time series forecasting tasks.

crewAI
CrewAI is a cutting-edge framework designed to orchestrate role-playing autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. It enables AI agents to assume roles, share goals, and operate in a cohesive unit, much like a well-oiled crew. Whether you're building a smart assistant platform, an automated customer service ensemble, or a multi-agent research team, CrewAI provides the backbone for sophisticated multi-agent interactions. With features like role-based agent design, autonomous inter-agent delegation, flexible task management, and support for various LLMs, CrewAI offers a dynamic and adaptable solution for both development and production workflows.

Transformers_And_LLM_Are_What_You_Dont_Need
Transformers_And_LLM_Are_What_You_Dont_Need is a repository that explores the limitations of transformers in time series forecasting. It contains a collection of papers, articles, and theses discussing the effectiveness of transformers and LLMs in this domain. The repository aims to provide insights into why transformers may not be the best choice for time series forecasting tasks.

pytorch-forecasting
PyTorch Forecasting is a PyTorch-based package for time series forecasting with state-of-the-art network architectures. It offers a high-level API for training networks on pandas data frames and utilizes PyTorch Lightning for scalable training on GPUs and CPUs. The package aims to simplify time series forecasting with neural networks by providing a flexible API for professionals and default settings for beginners. It includes a timeseries dataset class, base model class, multiple neural network architectures, multi-horizon timeseries metrics, and hyperparameter tuning with optuna. PyTorch Forecasting is built on pytorch-lightning for easy training on various hardware configurations.

spider
Spider is a high-performance web crawler and indexer designed to handle data curation workloads efficiently. It offers features such as concurrency, streaming, decentralization, headless Chrome rendering, HTTP proxies, cron jobs, subscriptions, smart mode, blacklisting, whitelisting, budgeting depth, dynamic AI prompt scripting, CSS scraping, and more. Users can easily get started with the Spider Cloud hosted service or set up local installations with spider-cli. The tool supports integration with Node.js and Python for additional flexibility. With a focus on speed and scalability, Spider is ideal for extracting and organizing data from the web.

AI_for_Science_paper_collection
AI for Science paper collection is an initiative by AI for Science Community to collect and categorize papers in AI for Science areas by subjects, years, venues, and keywords. The repository contains `.csv` files with paper lists labeled by keys such as `Title`, `Conference`, `Type`, `Application`, `MLTech`, `OpenReviewLink`. It covers top conferences like ICML, NeurIPS, and ICLR. Volunteers can contribute by updating existing `.csv` files or adding new ones for uncovered conferences/years. The initiative aims to track the increasing trend of AI for Science papers and analyze trends in different applications.

pytorch-forecasting
PyTorch Forecasting is a PyTorch-based package designed for state-of-the-art timeseries forecasting using deep learning architectures. It offers a high-level API and leverages PyTorch Lightning for efficient training on GPU or CPU with automatic logging. The package aims to simplify timeseries forecasting tasks by providing a flexible API for professionals and user-friendly defaults for beginners. It includes features such as a timeseries dataset class for handling data transformations, missing values, and subsampling, various neural network architectures optimized for real-world deployment, multi-horizon timeseries metrics, and hyperparameter tuning with optuna. Built on pytorch-lightning, it supports training on CPUs, single GPUs, and multiple GPUs out-of-the-box.
For similar jobs


lollms-webui
LoLLMs WebUI (Lord of Large Language Multimodal Systems: One tool to rule them all) is a user-friendly interface to access and utilize various LLM (Large Language Models) and other AI models for a wide range of tasks. With over 500 AI expert conditionings across diverse domains and more than 2500 fine tuned models over multiple domains, LoLLMs WebUI provides an immediate resource for any problem, from car repair to coding assistance, legal matters, medical diagnosis, entertainment, and more. The easy-to-use UI with light and dark mode options, integration with GitHub repository, support for different personalities, and features like thumb up/down rating, copy, edit, and remove messages, local database storage, search, export, and delete multiple discussions, make LoLLMs WebUI a powerful and versatile tool.

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.

minio
MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.

mage-ai
Mage is an open-source data pipeline tool for transforming and integrating data. It offers an easy developer experience, engineering best practices built-in, and data as a first-class citizen. Mage makes it easy to build, preview, and launch data pipelines, and provides observability and scaling capabilities. It supports data integrations, streaming pipelines, and dbt integration.

AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.

tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

airbyte
Airbyte is an open-source data integration platform that makes it easy to move data from any source to any destination. With Airbyte, you can build and manage data pipelines without writing any code. Airbyte provides a library of pre-built connectors that make it easy to connect to popular data sources and destinations. You can also create your own connectors using Airbyte's no-code Connector Builder or low-code CDK. Airbyte is used by data engineers and analysts at companies of all sizes to build and manage their data pipelines.