Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache DolphinScheduler is a modern data orchestration platform for agile, low-code creation of high-performance workflows.
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Elyra extends JupyterLab with an AI-centric approach.
A series of DAGs/Workflows to help maintain the operation of Airflow
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
A few projects related to data engineering, including data modeling, infrastructure setup on cloud, data warehousing, and data lake development.
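A simple C++ DAG framework: an easy-to-use, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars and forks are welcome.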
Yet another cron alternative with a Web UI, but with many more capabilities. It aims to solve greater problems.
An example end-to-end data engineering project.
A unified interface for constructing and managing workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.