Apache Airflow

Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. Airflow uses directed acyclic graphs (DAGs) to manage workflow orchestration. The Airflow REST API provides programmatic access to DAGs, DAG runs, tasks, connections, variables, pools, and monitoring for both Airflow OSS and cloud-managed deployments.

1 API · 1 Capability · 9 Features
Tags: Workflow Orchestration, Data Pipeline, Open Source, Apache, DAG, Scheduling, ETL, Data Engineering

APIs

Apache Airflow API

The Apache Airflow REST API (v2) provides stable, backward-compatible endpoints for managing workflows (DAGs), DAG runs, task instances, connections, variables, XComs, pools, an...

Capabilities

Apache Airflow Workflow Orchestration

Unified workflow capability for managing Apache Airflow pipelines: DAGs, DAG runs, task monitoring, variables, and connections. Used by data engineers and platform teams for or...

Features

DAG Authoring

Define workflows as Python code using Directed Acyclic Graphs (DAGs).
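
As a rough illustration of what a DAG encodes, the sketch below uses only Python's standard-library graphlib rather than Airflow's own classes. The task names are hypothetical. In an actual Airflow DAG file, the same dependencies would be declared between task objects, and Airflow would run each task once its upstream tasks had succeeded.

```python
from graphlib import TopologicalSorter

# Map each (hypothetical) task to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"transform", "validate"},
}

# A topological order lists every task after all of its upstream tasks.
# TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
# which is why a workflow graph must be acyclic.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Because each task here has exactly one place it can go, the order is fully determined: extract, transform, validate, load.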

Dynamic DAG Generation

Programmatically generate DAGs and tasks based on configuration or data.
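
The configuration-driven pattern can be sketched without Airflow installed. In the example below, `PipelineSpec`, `SOURCES`, and `build_specs` are hypothetical names introduced for illustration only. The loop stands in for the point where a real DAG file would construct one DAG object per configuration entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSpec:
    """Plain-data stand-in for a generated DAG (not an Airflow class)."""
    dag_id: str
    schedule: str
    task_ids: tuple


# Hypothetical configuration; in practice it might be loaded from YAML or a database.
SOURCES = [
    {"name": "orders", "schedule": "@daily"},
    {"name": "customers", "schedule": "@hourly"},
]


def build_specs(sources):
    """Create one pipeline spec per configured source."""
    specs = {}
    for source in sources:
        name = source["name"]
        dag_id = f"ingest_{name}"
        specs[dag_id] = PipelineSpec(
            dag_id=dag_id,
            schedule=source["schedule"],
            # Each source gets the same three-step task chain.
            task_ids=tuple(f"{step}_{name}" for step in ("extract", "transform", "load")),
        )
    return specs


specs = build_specs(SOURCES)
for spec in specs.values():
    print(spec.dag_id, spec.schedule, spec.task_ids)
```

Adding a new source to the configuration then yields a new pipeline without any change to the generating code.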

Rich Operator Library

Pre-built operators for databases, cloud services, APIs, and data tools.

REST API v2

Stable REST API for programmatic management of DAGs, runs, tasks, and infrastructure.
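
A minimal sketch of preparing a call that triggers a DAG run, using only the Python standard library. The path `/api/v2/dags/{dag_id}/dagRuns`, the `logical_date` and `conf` body fields, and bearer-token authentication are assumptions here and should be checked against the API reference of the target deployment; managed services may authenticate differently. The base URL, DAG id, and token are placeholders, and `build_trigger_request` is a hypothetical helper that only builds the request without sending it.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_trigger_request(base_url, dag_id, token, conf=None, logical_date=None):
    """Build (but do not send) a POST request that would trigger a DAG run."""
    # Assumed endpoint layout; verify against your deployment's API reference.
    url = f"{base_url.rstrip('/')}/api/v2/dags/{quote(dag_id, safe='')}/dagRuns"
    body = {"logical_date": logical_date, "conf": conf or {}}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_trigger_request(
    "http://localhost:8080",       # placeholder base URL
    "example_dag",                 # placeholder DAG id
    "TOKEN",                       # placeholder token
    conf={"run_mode": "manual"},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect and test.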

Web UI

Built-in web interface for monitoring, triggering, and debugging workflows.

Scheduler

Scheduler that triggers DAG runs from cron expressions and time-based intervals.

Extensible

Plugin system and provider packages for extending functionality.

Multi-Cloud Support

Provider packages for AWS, GCP, Azure, and other cloud platforms.

Managed Services

Available as a managed service from AWS (MWAA), Google Cloud (Cloud Composer), and Astronomer.

Use Cases

ETL Pipeline Orchestration

Schedule and monitor extract, transform, load data pipelines.

ML Pipeline Management

Orchestrate machine learning training, evaluation, and deployment workflows.

Data Quality Checks

Schedule data validation and quality check jobs.

Report Generation

Automate periodic report generation and distribution.

API Orchestration

Coordinate calls to multiple APIs in complex workflows.

Database Operations

Schedule database maintenance, migrations, and backup jobs.

Integrations

Apache Spark

Run Spark jobs from Airflow DAGs.

dbt

Orchestrate dbt model runs via the dbt operator.

Kubernetes

Run tasks in Kubernetes pods with the KubernetesPodOperator.

AWS

Provider package for S3, Redshift, EMR, Lambda, and other AWS services.

Google Cloud

Provider package for BigQuery, Dataflow, GCS, and other GCP services.

Azure

Provider package for Azure Data Factory, Blob Storage, and other Azure services.

Snowflake

Run SQL in the Snowflake data warehouse with the SnowflakeOperator.

Airbyte

Trigger Airbyte syncs from Airflow DAGs.

Semantic Vocabularies

Airflow Context

128 classes · 304 properties

JSON-LD

API Governance Rules

Apache Airflow API Rules

24 rules · 8 errors · 7 warnings · 9 info

SPECTRAL

Resources

Portal
Getting Started
GitHub Organization
GitHub Repository
Blog
Stack Overflow
Change Log
Issue Tracker
Docker Image (SDK)
Helm Chart (SDK)
Airflow Spectral Rules (Spectral rules)
Workflow Orchestration (Naftiko capability)
Airflow Vocabulary (Vocabulary)