
CGraph
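【A commonly used C++ DAG framework】 A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, and listed in awesome-cpp. Stars, forks, and discussion are welcome.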
Stars: 1921

CGraph is a cross-platform **D**irected **A**cyclic **G**raph framework written in pure C++ without any third-party dependencies. With it, you can **build your own operators easily and describe any execution schedule** you need, such as dependence, parallelism, aggregation, and so on. Some useful tools and plugins are also provided to improve your project. Tutorials and contact information are shown below. Please **feel free to get in touch with us** if you need more information about this repository.
README:
CGraph is a cross-platform Directed Acyclic Graph framework written in pure C++ without any third-party dependencies. With it, you can build your own operators easily and describe any execution schedule you need, such as dependence, parallelism, aggregation, conditional execution, and so on. Python APIs are also supported for building your pipeline. Tutorials and contact information are shown below. Please feel free to get in touch with us if you need more information about this repository.
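`CGraph`, whose Chinese name is 【色丶图】, is a cross-platform graph-flow execution framework with no third-party dependencies. Through the underlying scheduling of `GPipeline` (pipeline), it provides `eDAG` scheduling in which dependent elements execute in order and independent elements execute concurrently, with support for pause, resume, and timeout settings.
Users only need to inherit from the `GNode` (node) class, implement the subclass's `run()` method, and set dependencies as needed to execute tasks as a graph or as a pipeline. By configuring the various `GGroup` (group) types, each of which holds multiple nodes, you can also control the conditional, loop, and concurrent execution logic of the graph yourself.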
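To make the scheduling idea concrete (dependent elements run in order, independent elements run concurrently), here is a minimal sketch in plain Python. It is not CGraph's implementation and does not import CGraph or PyCGraph. The names `run_dag`, `tasks`, and `deps` are hypothetical, and the wave-by-wave strategy is a simplification chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dag(tasks, deps, max_workers=4):
    """Run callables in dependency order and return the executed waves.

    tasks: dict mapping a name to a zero-argument callable (hypothetical API).
    deps:  dict mapping a name to the set of names it depends on.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    done, waves = set(), []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # An element is ready once all of its dependencies have finished.
            ready = sorted(n for n, d in remaining.items() if d <= done)
            if not ready:
                raise ValueError("cycle or unknown dependency detected")
            # Elements in one wave do not depend on each other: run them concurrently.
            futures = [pool.submit(tasks[n]) for n in ready]
            for future in futures:
                future.result()  # re-raises any exception raised by the task
            done.update(ready)
            for n in ready:
                del remaining[n]
            waves.append(ready)
    return waves


if __name__ == "__main__":
    log = []
    tasks = {name: (lambda name=name: log.append(name)) for name in "abcd"}
    deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    print(run_dag(tasks, deps))  # [['a'], ['b', 'c'], ['d']]
```

A production scheduler would typically start an element as soon as its own dependencies finish rather than waiting for the whole wave. The sketch trades that efficiency for readability.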
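The project provides a rich set of `Param` (parameter) types for exchanging data in different application scenarios. In addition, you can add a `GAspect` (aspect) to extend the functionality of any of the elements above horizontally, introduce a `GAdapter` (adapter) to enhance the functionality of a single node, or add a `GEvent` (signal) to enrich and optimize the execution logic.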
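This project is written using only the C++11 standard library, has no third-party dependencies, and also provides a `Python` version. It is compatible with `macOS`, `Linux`, `Windows`, and `Android`, and supports local compilation and secondary development in IDEs such as `CLion`, `VSCode`, `Xcode`, `Visual Studio`, `Code::Blocks`, and `Qt Creator`. For build instructions, see the CGraph build instructions (CGraph 编译说明).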
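For detailed feature descriptions and usage, see the articles on 一面之猿网. Related videos on Bilibili are updated continuously; you are welcome to watch and discuss:
- [Bilibili video] CGraph: Getting Started
- [Bilibili video] CGraph: Features
  - A full introduction to all the terminology and functional modules in the CGraph project
  - Walks through real coding to explain each feature's use cases, usage, and the problems it solves
  - Suitable for anyone who wants a comprehensive view of the features and a quick start with CGraph
  - Suitable for anyone interested in multithreaded programming
- [Bilibili video] CGraph: Applications
- [Bilibili video] CGraph: Sharing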
- C++ version

```cpp
#include "CGraph.h"

using namespace CGraph;

class MyNode1 : public GNode {
public:
    CStatus run() override {
        printf("[%s], sleep for 1 second ...\n", this->getName().c_str());
        CGRAPH_SLEEP_SECOND(1)
        return CStatus();
    }
};

class MyNode2 : public GNode {
public:
    CStatus run() override {
        printf("[%s], sleep for 2 seconds ...\n", this->getName().c_str());
        CGRAPH_SLEEP_SECOND(2)
        return CStatus();
    }
};

int main() {
    /* Create a pipeline, used to configure and run the flow graph */
    GPipelinePtr pipeline = GPipelineFactory::create();
    GElementPtr a, b, c, d = nullptr;

    /* Register the dependencies between nodes */
    pipeline->registerGElement<MyNode1>(&a, {}, "nodeA");
    pipeline->registerGElement<MyNode2>(&b, {a}, "nodeB");
    pipeline->registerGElement<MyNode1>(&c, {a}, "nodeC");
    pipeline->registerGElement<MyNode2>(&d, {b, c}, "nodeD");

    /* Run the flow graph */
    pipeline->process();

    /* Release all resources held by the pipeline */
    GPipelineFactory::remove(pipeline);
    return 0;
}
```
- Python version

```python
import time
from datetime import datetime

from PyCGraph import GNode, GPipeline, CStatus


class MyNode1(GNode):
    def run(self):
        print("[{0}] {1}, enter MyNode1 run function. Sleep for 1 second ... ".format(datetime.now(), self.getName()))
        time.sleep(1)
        return CStatus()


class MyNode2(GNode):
    def run(self):
        print("[{0}] {1}, enter MyNode2 run function. Sleep for 2 seconds ... ".format(datetime.now(), self.getName()))
        time.sleep(2)
        return CStatus()


if __name__ == '__main__':
    pipeline = GPipeline()
    a, b, c, d = MyNode1(), MyNode2(), MyNode1(), MyNode2()

    pipeline.registerGElement(a, set(), "nodeA")
    pipeline.registerGElement(b, {a}, "nodeB")
    pipeline.registerGElement(c, {a}, "nodeC")
    pipeline.registerGElement(d, {b, c}, "nodeD")

    pipeline.process()
```
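As a rough check on what to expect from the example above, the sketch below compares the serial running time with the best-case parallel running time of the four-node graph. The durations come from the sleep calls in the example. `critical_path_seconds` is a hypothetical helper, not a CGraph API. It assumes enough worker threads and ignores scheduling overhead, so real timings may differ slightly.

```python
def critical_path_seconds(durations, deps):
    """Earliest finish time of the graph when independent nodes run in parallel."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            # A node starts once its slowest dependency has finished.
            start = max((finish_time(d) for d in deps[name]), default=0)
            finish[name] = start + durations[name]
        return finish[name]

    return max(finish_time(name) for name in durations)


# Sleep durations (seconds) and dependencies taken from the example above.
durations = {"nodeA": 1, "nodeB": 2, "nodeC": 1, "nodeD": 2}
deps = {
    "nodeA": set(),
    "nodeB": {"nodeA"},
    "nodeC": {"nodeA"},
    "nodeD": {"nodeB", "nodeC"},
}

print(sum(durations.values()))                 # 6: nodes run one at a time
print(critical_path_seconds(durations, deps))  # 5: nodeA -> nodeB -> nodeD
```

Because nodeB and nodeC both depend only on nodeA, they can overlap, so the longest chain (nodeA, nodeB, nodeD) determines the total time under these assumptions.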
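- A programmer introduces a simple implementation of a graph framework: execution logic
- A programmer introduces a simple implementation of a graph framework: loop logic
- A programmer introduces a simple implementation of a graph framework: parameter passing
- A programmer introduces a simple implementation of a graph framework: conditional logic
- A programmer introduces a simple implementation of a graph framework: aspect-oriented design
- A programmer introduces a simple implementation of a graph framework: function injection
- A programmer introduces a simple implementation of a graph framework: message mechanism
- A programmer introduces a simple implementation of a graph framework: event triggering
- A programmer introduces a simple implementation of a graph framework: timeout mechanism
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (1)
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (2)
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (3)
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (4)
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (5)
- A programmer introduces a simple implementation of a graph framework: thread pool optimization (6)
- A programmer introduces a simple implementation of a graph framework: performance optimization (1)
- A programmer introduces a simple implementation of a graph framework: performance optimization (2)
- A programmer introduces a simple implementation of a graph framework: distance calculation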
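- The CGraph theme song: "Listen to the Programmers" (《听码农的话》)
- My year of writing CGraph
- What is it like to lead, from scratch, a project that gets listed in awesome-cpp?
- Explosive! After CGraph fully surpassed taskflow in performance, the author says he would rather...
- Optimizing a graph with a graph: a summary of how CGraph computes the maximum concurrency of a DAG
- One article to introduce CGraph, two and a half years in the making
- The CGraph author would like to know whether you need an eDAG scheduling framework
- Fewer edges, more efficiency: a summary of how CGraph trims redundant edges
- The latest page-turner from the coding world: Reborn, I write CGraph abroad (Python version)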
- GraphANNS : Graph-based Approximate Nearest Neighbor Search Working off CGraph
- CThreadPool : a simple, easy-to-use, powerful, high-performance, cross-platform C++ thread pool
- CGraph-lite : header-only, simplest CGraph, with a DAG executor and parameter passing
- awesome-cpp : A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.
- awesome-workflow-engines : A curated list of awesome open source workflow engines
- taskflow : A General-purpose Parallel and Heterogeneous Task Programming System
- torchpipe : Serving Inside Pytorch
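- nndeploy : nndeploy is an end-to-end model deployment framework. Built around multi-platform inference and model deployment based on directed acyclic graphs, it aims to give users a cross-platform, easy-to-use, high-performance model deployment experience.
- KuiperInfer : Implement a high-performance deep learning inference library step by step, with support for inference on the large model llama2 as well as Unet, Yolov5, Resnet, and other models.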
- OGraph : A simple way to build a pipeline with Go.
- pybind11 : Seamless operability between C++11 and Python
  - The Python bindings for this project are implemented with pybind11
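Appendix 1. Version information
[2021.05.04 - v1.0.0 - Chunel]
- Provide graph-based execution, with parallel computation of independent nodes
[2021.05.09 - v1.1.0 - Chunel]
- Optimize concurrency during graph execution
[2021.05.18 - v1.1.1 - Chunel]
- Add node `name` and `session` information
[2021.05.23 - v1.2.0 - Chunel]
- Provide loop execution for a single node
[2021.05.29 - v1.3.0 - Chunel]
- Provide `cluster` and `region` partitioning and loop execution
- Provide `tutorial` content with a variety of usage examples
[2021.06.14 - v1.4.0 - Chunel]
- Provide the `param` (parameter) passing mechanism
- Provide the `group` feature; all multi-node modules inherit from the `group` module
- Add support for Linux
[2021.06.20 - v1.4.1 - Chunel]
- Provide the `condition` feature
- Add support for Windows
[2021.06.24 - v1.5.0 - Chunel]
- Provide a factory method for creating a `pipeline`
- Update `tutorial` content
[2021.07.07 - v1.5.1 - Chunel]
- Optimize the thread pool: implement task stealing
[2021.07.11 - v1.5.2 - Chunel]
- Optimize the thread pool: implement automatic adjustment of the thread count
[2021.07.31 - v1.5.3 - Chunel]
- Optimize the thread pool: implement batch task fetching and improve task stealing
[2021.08.29 - v1.6.0 - Chunel]
- Provide multi-`pipeline` support; optimize the underlying logic
- Update `tutorial` content
[2021.09.19 - v1.6.1 - Chunel]
- Provide the `Lru` operator, the `Trie` operator, and template nodes; optimize the underlying logic
- Update `tutorial` content
[2021.09.29 - v1.7.0 - Chunel]
- Provide the `aspect` feature for horizontally extending `node` or `group` functionality
- Update `tutorial` content
[2021.10.07 - v1.7.1 - Chunel]
- Optimize the `aspect` implementation; provide aspect parameters and batch addition of aspects
- Update `tutorial` content
[2021.11.01 - v1.8.0 - Chunel]
- Provide the `adapter` feature and the `singleton` adapter
- Optimize `pipeline` execution logic
- Update `tutorial` content
[2021.12.18 - v1.8.1 - Chunel]
- Improve the information carried by the `CStatus` return value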
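[2022.01.02 - v1.8.2 - Chunel]
- Provide automatic exit on node execution timeout; provide the `task group` feature
- Provide a method for setting thread pool configuration parameters
[2022.01.23 - v1.8.3 - Chunel]
- Provide the `function` adapter to support functional programming
- Provide thread priority scheduling and binding threads to CPUs
- Update `tutorial` content
[2022.01.31 - v1.8.4 - Chunel]
- Provide asynchronous execution of a `node`
[2022.02.03 - v1.8.5 - Chunel]
- Provide the `daemon` feature for periodically running tasks outside the flow graph
- Update `tutorial` content
[2022.04.03 - v1.8.6 - Chunel]
- Provide the `DistanceCalculator` operator for computing any distance type over any data type
- Update `tutorial` content
[2022.04.05 - v2.0.0 - Chunel]
- Provide the `domain` feature and the `Ann` domain abstraction model; begin supporting individual specialized fields
- Provide the hold execution mechanism
- Update `tutorial` content
[2022.05.01 - v2.0.1 - Chunel]
- Optimize the `pipeline` registration mechanism; support a custom execution order for init methods
- Provide a one-click build script
[2022.05.29 - v2.1.0 - Chunel]
- Provide a method for writing `element` parameters
- Provide support for C++14
- Update `tutorial` content
[2022.10.03 - v2.1.1 - Chunel]
- Provide task priorities in the thread pool
- Optimize `group` execution logic
[2022.11.03 - v2.2.0 - Chunel]
- Provide the `message` feature, mainly for passing data between different `pipeline` instances
- Update `tutorial` content
[2022.12.24 - v2.2.1 - Chunel]
- Provide the `TemplateNode` feature to improve how parameters are passed
- Update `tutorial` content
[2022.12.25 - v2.2.2 - yeshenyong]
- Optimize graph execution logic
[2022.12.30 - v2.2.3 - Chunel]
- Provide `message` publish/subscribe
- Provide execution engine switching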
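[2023.01.21 - v2.3.0 - Chunel]
- Provide the `event` feature
- Provide the `CGraph Intro.xmind` file, a mind map that introduces the overall logic of CGraph
[2023.01.25 - v2.3.1 - Chunel]
- Provide support for C++11. Thanks to MirrorYuChen for the solution
[2023.02.10 - v2.3.2 - Chunel]
- Optimize the scheduling strategy; provide an interface for configuring scheduling parameters
- Provide an English version of readme.md
[2023.02.12 - v2.3.3 - yeshenyong, Chunel]
- Provide graphviz visualization of the graph
- Provide parameter link tracing
[2023.02.22 - v2.3.4 - Chunel]
- Optimize the scheduling mechanism on Windows
- Optimize the `param` mechanism and the `event` mechanism
[2023.03.25 - v2.4.0 - woodx, Chunel]
- Provide a runnable docker environment and the dockerfile used to build it
- Provide a `pipeline` scheduling resource control mechanism
- Optimize scheduling performance
[2023.05.05 - v2.4.1 - Chunel]
- Provide thread-bound execution
- Provide a method for getting the maximum concurrency of a `pipeline`. Thanks to Hanano-Yuuki for the solution
- Provide asynchronous `pipeline` execution and exit during execution
[2023.06.17 - v2.4.2 - Chunel]
- Provide the `MultiCondition` feature
- Provide pause and resume for `pipeline` execution
[2023.07.12 - v2.4.3 - Chunel]
- Optimize `CStatus`; add exception location information
[2023.09.05 - v2.5.0 - Chunel]
- Provide the perf feature for `pipeline` performance analysis
- Provide a timeout mechanism for `element`
- Provide the `some` feature; optimize asynchronous `pipeline` execution
[2023.09.15 - v2.5.1 - Chunel]
- Provide the `fence` feature
- Provide the `coordinator` feature
[2023.11.06 - v2.5.2 - Chunel]
- Optimize the `message` feature: the handling of a blocked write can now be configured, and the number of memory copies is reduced
- Add `example` content with simple implementations for different industries
- Optimize scheduling performance
[2023.11.15 - v2.5.3 - Chunel]
- Provide `proto` definition files
- Add the `mutable` feature; provide syntactic sugar for registering dependencies
[2024.01.05 - v2.5.4 - Chunel]
- Provide `test` content, including performance and functional test cases
- Optimize the `event` mechanism; support asynchronous waiting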
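[2024.07.18 - v2.6.0 - PaPaPig-Melody, Chunel]
- Provide topological execution of a `pipeline`
- Provide a method for checking whether two `element` instances depend on each other
- Provide bazel builds
- Optimize the perf feature
[2024.09.17 - v2.6.1 - Chunel]
- Provide static execution of a `pipeline`, and a micro-task mechanism based on static execution
- Provide `pipeline` trimming, used to remove duplicate dependencies between `element` instances
- Provide a method for removing a dependency from an `element`
- Optimize the `event` mechanism; asynchronous events can be waited on until they finish
- Release the CGraph-lite project, which provides simple DAG construction and parameter passing. Its interface is fully compatible, so you can switch to this project seamlessly
[2024.11.16 - v2.6.2 - Chunel]
- Optimize the parameter mutex mechanism and retrieval performance
- Fix an abnormal wait in the auxiliary thread
- Update `tutorial` content
[2025.02.08 - v3.0.0 - Chunel]
- Provide the `stage` feature for synchronizing execution between `element` instances
- Provide Python bindings
- Update `tutorial` content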
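Appendix 2. Acknowledgements
- Thanks to the Doocs WeChat official account for publishing an introduction to the project; you are welcome to join the Doocs open source community
- Thanks to the HelloGithub magazine for its introduction and recommendation: HelloGithub issue 70
- Thanks to awesome-cpp for the recommendation; as we all know, it is the most authoritative recommendation list for C++ projects in the world
- Thanks to the `Taskflow Group` for the recommendation in awesome-parallel-computing; we always treat taskflow as a role model
- Thanks to awesome-workflow-engines for the recommendation
- Thanks to all the developers listed in CONTRIBUTORS for their contributions to the project
- Thanks to everyone who has offered feedback and suggestions for the `CGraph` project; there are too many to name here. You are welcome to join at any time and build it together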
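Appendix 3. Contact
- WeChat: ChunelFeng (you are welcome to scan the QR code above and add the author as a friend; please include a short note about yourself ^_^)
- Email: [email protected]
- Source code: https://github.com/ChunelFeng/CGraph
- Forum: www.chunel.cn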