Apache Airflow
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. Airflow uses directed acyclic graphs (DAGs) to manage workflow orchestration. The Airflow REST API provides programmatic access to DAGs, DAG runs, tasks, connections, variables, pools, and monitoring for both Airflow OSS and cloud-managed deployments.
APIs
Apache Airflow API
The Apache Airflow REST API (v2) provides stable, backward-compatible endpoints for managing workflows (DAGs), DAG runs, task instances, connections, variables, XComs, pools, an...
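As a rough illustration of programmatic access, the sketch below builds (but does not send) a request that would trigger a DAG run. It assumes the v2 path layout `/api/v2/dags/{dag_id}/dagRuns`, bearer-token authentication, and a request body with `logical_date` and `conf` fields; the base URL, DAG ID, token, and the helper name `build_trigger_request` are hypothetical placeholders. Check the API reference of your deployment before relying on any of these details.

```python
import json
import urllib.parse
import urllib.request


def build_trigger_request(base_url, dag_id, token, conf=None, api_version="v2"):
    """Build a POST request for triggering a DAG run, without sending it.

    The path layout and body fields are assumptions about the Airflow REST API;
    managed services may expose a different base URL or auth scheme.
    """
    # Quote the DAG ID so unusual characters cannot break the URL path.
    safe_dag_id = urllib.parse.quote(dag_id, safe="")
    url = f"{base_url.rstrip('/')}/api/{api_version}/dags/{safe_dag_id}/dagRuns"

    # logical_date=None leaves the run's logical date unset (assumed v2 behavior).
    body = {"logical_date": None, "conf": conf or {}}

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values for illustration only; nothing is sent over the network.
request = build_trigger_request(
    "http://localhost:8080", "example_dag", "YOUR_TOKEN", {"run_reason": "manual"}
)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(request)` against a reachable Airflow instance with a valid token.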
Capabilities
Apache Airflow Workflow Orchestration
Unified workflow capability for managing Apache Airflow pipelines — DAGs, DAG runs, task monitoring, variables, and connections. Used by data engineers and platform teams for or...
Features
Define workflows as Python code using Directed Acyclic Graphs (DAGs).
Programmatically generate DAGs and tasks based on configuration or data.
Pre-built operators for databases, cloud services, APIs, and data tools.
Stable REST API for programmatic management of DAGs, runs, tasks, and infrastructure.
Built-in web interface for monitoring, triggering, and debugging workflows.
Scheduler with support for cron expressions and time-based triggers.
Plugin system and provider packages for extending functionality.
Provider packages for AWS, GCP, Azure, and other cloud platforms.
Available as a managed service from AWS (MWAA), Google Cloud (Cloud Composer), and Astronomer.
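The first two features above (workflows as DAGs, and tasks generated from configuration) can be sketched conceptually with the Python standard library alone. This is not Airflow's API: it uses `graphlib.TopologicalSorter` to show how a dependency graph generated from a list of table names yields a valid execution order. The table names and the helpers `build_graph` and `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter


def build_graph(tables):
    """Generate a task graph from configuration.

    Each key is a task name; each value is the set of tasks it depends on.
    """
    graph = {}
    for table in tables:
        graph[f"extract_{table}"] = set()
        graph[f"load_{table}"] = {f"extract_{table}"}
    # A final task that waits for every load task to finish.
    graph["report"] = {f"load_{table}" for table in tables}
    return graph


def execution_order(graph):
    """Return one valid run order; raises graphlib.CycleError if the graph has a cycle."""
    return list(TopologicalSorter(graph).static_order())


# Hypothetical configuration: one extract/load pair per table.
print(execution_order(build_graph(["orders", "users"])))
```

In Airflow itself the same idea is typically expressed by creating operators in a loop inside a DAG definition; the acyclic requirement is what guarantees that an order like the one computed here exists.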
Use Cases
Schedule and monitor extract, transform, load (ETL) data pipelines.
Orchestrate machine learning training, evaluation, and deployment workflows.
Schedule data validation and quality check jobs.
Automate periodic report generation and distribution.
Coordinate calls to multiple APIs in complex workflows.
Schedule database maintenance, migrations, and backup jobs.
Integrations
Run Spark jobs from Airflow DAGs.
Orchestrate dbt model runs via dbt operators, such as those in the dbt Cloud provider package.
Run tasks in Kubernetes pods with the KubernetesPodOperator.
Provider package for S3, Redshift, EMR, Lambda, and other AWS services.
Provider package for BigQuery, Dataflow, GCS, and other GCP services.
Provider package for Azure Data Factory, Blob Storage, and other Azure services.
SnowflakeOperator for running SQL in a Snowflake data warehouse.
Trigger Airbyte syncs from Airflow DAGs.