CGraph
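[A commonly used C++ DAG framework] A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies and listed in awesome-cpp. Stars, forks, and discussion are welcome.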
Stars: 1858
CGraph is a cross-platform **D**irected **A**cyclic **G**raph framework written in pure C++ without any third-party dependencies. With it, you can **build your own operators easily and describe any execution schedule** you need, such as dependency, parallelism, and aggregation. Some useful tools and plugins are also provided to improve your project. Tutorials and contact information are shown below. Please **feel free to get in touch with us** if you need to know more about this repository.
README:
CGraph
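CGraph (Chinese name: 色丶图) is a cross-platform graph-flow execution framework with no third-party dependencies. Scheduled underneath by `GPipeline` (pipeline), it provides eDAG scheduling: dependent elements run in order, independent elements run concurrently, and pause, resume, and timeout settings are supported.
To use it, inherit from the `GNode` (node) class, implement the subclass's `run()` method, and set dependencies as needed; your tasks then execute as a graph or as a pipeline. You can also configure various `GGroup` (group) types, each holding information about multiple nodes, to control conditional, loop, and concurrent execution logic in the graph yourself.
The project provides a rich set of `Param` (parameter) types for exchanging data in different application scenarios. In addition, you can add a `GAspect` (aspect) to extend the functionality of any of the elements above horizontally, introduce a `GAdapter` (adapter) to enhance a single node, or add a `GEvent` (signal) to enrich and optimize the execution logic.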
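This project is written using only the C++11 standard library, with no third-party dependencies. It is compatible with `MacOS`, `Linux`, `Windows`, and `Android`, and can be built locally and extended with IDEs such as `CLion`, `VSCode`, `Xcode`, `Visual Studio`, `Code::Blocks`, and `Qt Creator`. For build instructions, see the CGraph build guide (CGraph 编译说明).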
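For detailed feature descriptions and usage, see the articles on 一面之猿网. Related videos are updated regularly on Bilibili; you are welcome to watch and discuss:
- [Bilibili video] CGraph: Getting Started
- [Bilibili video] CGraph: Features
  - A comprehensive introduction to all the terminology and functional modules in the CGraph project
  - Walks through real coding to explain each feature's use cases, usage, and the problems it solves
  - Suitable for anyone who wants a full picture of the features and a quick start with CGraph
  - Suitable for anyone interested in multithreaded programming
- [Bilibili video] CGraph: Applications
- [Bilibili video] CGraph: Sharing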
#include <cstdio>

#include "CGraph.h"

// Node that prints its name, then sleeps for 1 second.
class MyNode1 : public CGraph::GNode {
public:
    CStatus run() override {
        printf("[%s], sleep for 1 second ...\n", this->getName().c_str());
        CGRAPH_SLEEP_SECOND(1)
        return CStatus();
    }
};

// Node that prints its name, then sleeps for 2 seconds.
class MyNode2 : public CGraph::GNode {
public:
    CStatus run() override {
        printf("[%s], sleep for 2 seconds ...\n", this->getName().c_str());
        CGRAPH_SLEEP_SECOND(2)
        return CStatus();
    }
};

#include "MyNode.h"

using namespace CGraph;

int main() {
    /* Create a pipeline, used to configure and run the graph */
    GPipelinePtr pipeline = GPipelineFactory::create();
    GElementPtr a, b, c, d = nullptr;

    /* Register the nodes and the dependencies between them */
    pipeline->registerGElement<MyNode1>(&a, {}, "nodeA");
    pipeline->registerGElement<MyNode2>(&b, {a}, "nodeB");
    pipeline->registerGElement<MyNode1>(&c, {a}, "nodeC");
    pipeline->registerGElement<MyNode2>(&d, {b, c}, "nodeD");

    /* Run the graph */
    pipeline->process();

    /* Release all resources held by the pipeline */
    GPipelineFactory::remove(pipeline);
    return 0;
}
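As shown in the figure above, when the graph runs, node `a` executes first. After `a` finishes, nodes `b` and `c` execute in parallel. Once both `b` and `c` have finished, node `d` executes.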
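The ordering described above can also be sketched with the C++ standard library alone. This is only an illustration of dependency-ordered parallel execution, not CGraph's implementation: `Task` and `runInWaves` are hypothetical names introduced here, and the sketch waits for each whole wave of ready tasks before starting the next one, which is a deliberate simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical task description for this sketch (not a CGraph type).
struct Task {
    std::string name;
    std::vector<std::size_t> deps;  // indices of tasks that must finish first
};

// Runs the tasks wave by wave: every task whose dependencies are all done is
// launched on its own thread, and the next wave starts only after the current
// one has finished. Returns the task names grouped by wave.
std::vector<std::vector<std::string>> runInWaves(const std::vector<Task>& tasks) {
    std::vector<bool> done(tasks.size(), false);
    std::vector<std::vector<std::string>> waves;
    std::size_t finished = 0;

    while (finished < tasks.size()) {
        // Collect every unfinished task whose dependencies are all done.
        std::vector<std::size_t> ready;
        for (std::size_t i = 0; i < tasks.size(); ++i) {
            if (done[i]) continue;
            bool allDepsDone = true;
            for (std::size_t d : tasks[i].deps) {
                assert(d < tasks.size());  // a dependency must refer to a known task
                if (!done[d]) { allDepsDone = false; break; }
            }
            if (allDepsDone) ready.push_back(i);
        }
        if (ready.empty()) break;  // nothing can run: the input contains a cycle

        // Launch the whole wave concurrently; each "task" just returns its name.
        std::vector<std::future<std::string>> futures;
        for (std::size_t i : ready) {
            futures.push_back(std::async(std::launch::async,
                                         [&tasks, i] { return tasks[i].name; }));
        }

        // Wait for every task in the wave, then mark the wave as done.
        std::vector<std::string> wave;
        for (auto& f : futures) wave.push_back(f.get());
        for (std::size_t i : ready) done[i] = true;
        finished += ready.size();
        waves.push_back(wave);
    }
    return waves;
}
```

For the four-node graph above (`a` with no dependencies, `b` and `c` depending on `a`, `d` depending on `b` and `c`), `runInWaves` returns three waves: `a`, then `b` and `c` together, then `d`.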
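- A simple graph-framework implementation, explained by a programmer: execution logic
- A simple graph-framework implementation, explained by a programmer: loop logic
- A simple graph-framework implementation, explained by a programmer: parameter passing
- A simple graph-framework implementation, explained by a programmer: conditional branching
- A simple graph-framework implementation, explained by a programmer: aspect-oriented programming
- A simple graph-framework implementation, explained by a programmer: function injection
- A simple graph-framework implementation, explained by a programmer: message mechanism
- A simple graph-framework implementation, explained by a programmer: event triggering
- A simple graph-framework implementation, explained by a programmer: timeout mechanism
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (1)
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (2)
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (3)
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (4)
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (5)
- A simple graph-framework implementation, explained by a programmer: thread pool optimization (6)
- A simple graph-framework implementation, explained by a programmer: performance optimization (1)
- A simple graph-framework implementation, explained by a programmer: performance optimization (2)
- A simple graph-framework implementation, explained by a programmer: distance calculation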
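- CGraph's theme song: "Listen to the Coder" (《听码农的话》)
- Looking back on my first year of writing CGraph
- What is it like to lead a project from scratch that gets listed in awesome-cpp?
- After CGraph outperformed taskflow across the board, the author says he would rather...
- Optimizing a graph with a graph: how CGraph computes the maximum concurrency of a DAG
- All about CGraph after two and a half years of practice, in one article
- The CGraph author would like to know: do you need an eDAG scheduling framework?
- Fewer edges, more efficiency: how CGraph prunes redundant edges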
- GraphANNS : Graph-based Approximate Nearest Neighbor Search Working off CGraph
- CThreadPool : a simple, easy-to-use, powerful, high-performance, cross-platform C++ thread pool
- CGraph-lite : header-only, simplest CGraph, with a DAG executor and parameter-passing support
- awesome-cpp : A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.
- awesome-workflow-engines : A curated list of awesome open source workflow engines
- taskflow : A General-purpose Parallel and Heterogeneous Task Programming System
- torchpipe : Serving Inside Pytorch
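- nndeploy : an end-to-end model deployment framework. Built around multi-backend inference and DAG-based model deployment, it aims to give users a cross-platform, easy-to-use, high-performance deployment experience.
- KuiperInfer : implement a high-performance deep learning inference library step by step from scratch, with support for inference on models such as llama2, Unet, Yolov5, and Resnet.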
- OGraph : A simple way to build a pipeline with Go.
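Appendix 1. Version History
[2021.05.04 - v1.0.0 - Chunel]
- Provide graph-based execution, with parallel computation of independent nodes
[2021.05.09 - v1.1.0 - Chunel]
- Improve concurrency during graph execution
[2021.05.18 - v1.1.1 - Chunel]
- Add `name` and `session` information to nodes
[2021.05.23 - v1.2.0 - Chunel]
- Provide loop execution for a single node
[2021.05.29 - v1.3.0 - Chunel]
- Provide `cluster` and `region` partitioning and loop execution
- Provide `tutorial` content with a variety of usage examples
[2021.06.14 - v1.4.0 - Chunel]
- Provide the `param` (parameter) passing mechanism
- Provide the `group` feature; all multi-node modules inherit from the `group` module
- Add support for Linux
[2021.06.20 - v1.4.1 - Chunel]
- Provide the `condition` feature
- Add support for Windows
[2021.06.24 - v1.5.0 - Chunel]
- Provide a factory method for creating a `pipeline`
- Update `tutorial` content
[2021.07.07 - v1.5.1 - Chunel]
- Optimize the thread pool: implement task stealing
[2021.07.11 - v1.5.2 - Chunel]
- Optimize the thread pool: implement automatic adjustment of the thread count
[2021.07.31 - v1.5.3 - Chunel]
- Optimize the thread pool: implement batch task fetching and improve task stealing
[2021.08.29 - v1.6.0 - Chunel]
- Provide multi-`pipeline` support and optimize the underlying logic
- Update `tutorial` content
[2021.09.19 - v1.6.1 - Chunel]
- Provide the `Lru` operator, the `Trie` operator, and template nodes; optimize the underlying logic
- Update `tutorial` content
[2021.09.29 - v1.7.0 - Chunel]
- Provide the `aspect` feature for extending `node` or `group` functionality horizontally
- Update `tutorial` content
[2021.10.07 - v1.7.1 - Chunel]
- Optimize the `aspect` implementation; provide aspect parameters and batch addition of aspects
- Update `tutorial` content
[2021.11.01 - v1.8.0 - Chunel]
- Provide the `adapter` feature, including a `singleton` adapter
- Optimize `pipeline` execution logic
- Update `tutorial` content
[2021.12.18 - v1.8.1 - Chunel]
- Improve the information carried by the `CStatus` return value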
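[2022.01.02 - v1.8.2 - Chunel]
- Provide automatic exit when a node's execution times out; provide the `task group` feature
- Provide a way to set thread pool configuration parameters
[2022.01.23 - v1.8.3 - Chunel]
- Provide the `function` adapter for functional-style programming
- Provide thread priority scheduling and binding threads to CPUs
- Update `tutorial` content
[2022.01.31 - v1.8.4 - Chunel]
- Provide asynchronous execution for `node`
[2022.02.03 - v1.8.5 - Chunel]
- Provide the `daemon` feature for periodically running tasks outside the graph
- Update `tutorial` content
[2022.04.03 - v1.8.6 - Chunel]
- Provide the `DistanceCalculator` operator for computing any distance type over any data type
- Update `tutorial` content
[2022.04.05 - v2.0.0 - Chunel]
- Provide the `domain` feature and the `Ann` domain abstraction model; begin supporting selected specialized fields
- Provide the hold execution mechanism
- Update `tutorial` content
[2022.05.01 - v2.0.1 - Chunel]
- Optimize `pipeline` registration; support running init methods in a custom order
- Provide one-step build scripts
[2022.05.29 - v2.1.0 - Chunel]
- Provide a method for writing `element` parameters
- Provide support for C++14
- Update `tutorial` content
[2022.10.03 - v2.1.1 - Chunel]
- Provide task priorities in the thread pool
- Optimize `group` execution logic
[2022.11.03 - v2.2.0 - Chunel]
- Provide the `message` feature, mainly for passing data between different `pipeline` instances
- Update `tutorial` content
[2022.12.24 - v2.2.1 - Chunel]
- Provide the `TemplateNode` feature for improving how parameters are passed
- Update `tutorial` content
[2022.12.25 - v2.2.2 - yeshenyong]
- Optimize graph execution logic
[2022.12.30 - v2.2.3 - Chunel]
- Provide `message` publish/subscribe
- Provide execution engine switching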
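[2023.01.21 - v2.3.0 - Chunel]
- Provide the `event` feature
- Provide the `CGraph Intro.xmind` file, a mind map that introduces CGraph's overall logic
[2023.01.25 - v2.3.1 - Chunel]
- Provide support for C++11. Thanks to MirrorYuChen for the solution
[2023.02.10 - v2.3.2 - Chunel]
- Optimize the scheduling strategy; provide an interface for configuring scheduling parameters
- Provide an English version of readme.md
[2023.02.12 - v2.3.3 - yeshenyong, Chunel]
- Provide graphviz-based graph visualization
- Provide parameter tracing
[2023.02.22 - v2.3.4 - Chunel]
- Optimize scheduling on Windows
- Optimize the `param` and `event` mechanisms
[2023.03.25 - v2.4.0 - woodx, Chunel]
- Provide a runnable docker environment and the dockerfile used to build it
- Provide resource management for `pipeline` scheduling
- Optimize scheduling performance
[2023.05.05 - v2.4.1 - Chunel]
- Provide thread-bound execution
- Provide a method for obtaining the maximum concurrency of a `pipeline`. Thanks to Hanano-Yuuki for the solution
- Provide asynchronous `pipeline` execution and the ability to exit during execution
[2023.06.17 - v2.4.2 - Chunel]
- Provide the `MultiCondition` feature
- Provide pause and resume for `pipeline` execution
[2023.07.12 - v2.4.3 - Chunel]
- Improve `CStatus` by adding exception location information
[2023.09.05 - v2.5.0 - Chunel]
- Provide the perf feature for profiling `pipeline` performance
- Provide a timeout mechanism for `element`
- Provide the `some` (partial) feature; optimize asynchronous `pipeline` execution
[2023.09.15 - v2.5.1 - Chunel]
- Provide the `fence` feature
- Provide the `coordinator` feature
[2023.11.06 - v2.5.2 - Chunel]
- Optimize the `message` feature: the handling of blocked writes can now be configured, and the number of memory copies is reduced
- Add `example` content with simple implementations for different industries
- Optimize scheduling performance
[2023.11.15 - v2.5.3 - Chunel]
- Provide `proto` definition files
- Add the `mutable` feature; provide syntactic sugar for registering dependencies
[2024.01.05 - v2.5.4 - Chunel]
- Provide `test` content, including performance and functional test cases
- Optimize the `event` mechanism; support asynchronous waiting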
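[2024.07.18 - v2.6.0 - PaPaPig-Melody, Chunel]
- Provide topological execution of a `pipeline`
- Provide a method for checking whether a dependency exists between `element` instances
- Provide a bazel build
- Optimize the perf feature
[2024.09.17 - v2.6.1 - Chunel]
- Provide static execution of a `pipeline`, plus a micro-task mechanism based on static execution
- Provide `pipeline` pruning for removing duplicate dependencies between `element` instances
- Provide a method for removing dependencies from an `element`
- Optimize the `event` mechanism; asynchronous events can now be waited on until they finish
- Release the CGraph-lite project, which provides simple DAG construction and parameter passing. Its interface is fully compatible, so you can switch to this project seamlessly
[2024.11.16 - v2.6.2 - Chunel]
- Optimize the parameter mutual-exclusion mechanism and parameter retrieval performance
- Fix abnormal waiting in auxiliary threads
- Update `tutorial` content
[2024.12.24 - v2.7.0 - Chunel]
- Provide the `stage` feature for synchronizing execution between `element` instances
- Update `tutorial` content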
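Appendix 2. Acknowledgments
- Thanks to the Doocs WeChat official account for publishing an introduction to the project. You are welcome to join the Doocs open source community
- Thanks to HelloGithub magazine for the introduction and recommendation: HelloGithub issue 70
- Thanks to Github中文排行榜 (GitHub Chinese ranking) for the introduction and recommendation: overall ranking, C++ category
- Thanks to the recommendation from awesome-cpp, which, as we all know, is the most authoritative recommendation list for C++ projects in the world
- Thanks to the recommendation from `Taskflow Group`: awesome-parallel-computing. We always treat taskflow as a role model
- Thanks to the recommendation from awesome-workflow-engines
- Thanks to all CONTRIBUTORS for their contributions to the project
- Thanks to everyone who has offered comments and suggestions for the `CGraph` project; they are too many to name here. Everyone is welcome to join and build it together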
Appendix 3. Contact
- WeChat: ChunelFeng (scan the QR code above to add the author as a friend; please include a short note about yourself ^_^)
- Email: [email protected]
- Source code: https://github.com/ChunelFeng/CGraph
- Forum: www.chunel.cn