AWS Data Pipeline
AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it is stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. It supports data-driven workflows with retry, failure handling, and scheduling capabilities.
APIs
AWS Data Pipeline API
The AWS Data Pipeline API provides a web service for processing and moving data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.
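Through the API, a pipeline definition is submitted as a list of pipeline objects, each with an id, a name, and a list of key/value fields, where references to other objects use refValue instead of stringValue. A minimal sketch of building that shape and submitting it with boto3 (the pipeline name, uniqueId, and object properties here are illustrative):

```python
class Ref:
    """Marks a property value as a reference to another pipeline object."""
    def __init__(self, object_id):
        self.object_id = object_id

def to_pipeline_object(obj_id, name, **props):
    """Convert keyword properties into the API's field-list shape:
    {"id", "name", "fields": [{"key", "stringValue" | "refValue"}]}."""
    fields = []
    for key, value in props.items():
        if isinstance(value, Ref):
            fields.append({"key": key, "refValue": value.object_id})
        else:
            fields.append({"key": key, "stringValue": str(value)})
    return {"id": obj_id, "name": name, "fields": fields}

def deploy(client, name, unique_id, pipeline_objects):
    """Create, define, and activate a pipeline.
    Requires AWS credentials; client is boto3.client("datapipeline")."""
    pipeline_id = client.create_pipeline(name=name, uniqueId=unique_id)["pipelineId"]
    client.put_pipeline_definition(pipelineId=pipeline_id,
                                   pipelineObjects=pipeline_objects)
    client.activate_pipeline(pipelineId=pipeline_id)
    return pipeline_id

schedule = to_pipeline_object("DailySchedule", "DailySchedule",
                              type="Schedule", period="1 day",
                              startDateTime="2024-01-01T00:00:00")
copy = to_pipeline_object("CopyToS3", "CopyToS3", type="CopyActivity",
                          schedule=Ref("DailySchedule"))
# deploy(boto3.client("datapipeline"), "daily-etl", "daily-etl-001",
#        [schedule, copy])
```

The uniqueId acts as an idempotency token: retrying create_pipeline with the same name and uniqueId returns the existing pipeline rather than creating a duplicate.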
Capabilities
Features
Define complex data processing workflows with activities, data nodes, schedules, and preconditions using a declarative pipeline definition.
Move and transform data between Amazon S3, Amazon RDS, Amazon DynamoDB, Amazon Redshift, and Amazon EMR in a single pipeline.
Schedule pipeline runs at fixed intervals (hourly, daily, weekly) or trigger them based on data availability preconditions.
Configure automatic retries for failed activities with configurable retry intervals, timeout settings, and failure notifications.
Process data from on-premises databases and file systems using the Data Pipeline Task Runner agent installed locally.
Launch and manage Amazon EMR clusters as pipeline resources to run Hive, Pig, and MapReduce jobs as part of data workflows.
Manage active and latest pipeline definition versions, enabling updates to running pipelines without disrupting current execution.
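Taken together, these features are expressed in a single declarative definition. A sketch of how such a definition might be assembled, combining a fixed schedule, a data-availability precondition, retries, and an SNS failure notification (the bucket name, topic ARN, and IAM role names are placeholders, and a real CopyActivity would also need a runsOn or workerGroup resource):

```python
import json

# Placeholder values; substitute your own bucket, topic, and roles.
BUCKET = "s3://example-bucket"
TOPIC = "arn:aws:sns:us-east-1:111122223333:pipeline-failures"

definition = {"objects": [
    {"id": "Default", "name": "Default", "scheduleType": "cron",
     "failureAndRerunMode": "CASCADE",
     "role": "DataPipelineDefaultRole",
     "resourceRole": "DataPipelineDefaultResourceRole"},
    # Fixed-interval schedule (hourly, daily, weekly periods are supported).
    {"id": "DailySchedule", "name": "DailySchedule", "type": "Schedule",
     "period": "1 day", "startDateTime": "2024-01-01T00:00:00"},
    # Data-availability precondition: run only once the flag object exists.
    {"id": "InputReady", "name": "InputReady", "type": "S3KeyExists",
     "s3Key": f"{BUCKET}/input/ready.flag"},
    # Failure notification via SNS once retries are exhausted.
    {"id": "FailureAlarm", "name": "FailureAlarm", "type": "SnsAlarm",
     "topicArn": TOPIC, "subject": "Pipeline activity failed",
     "message": "An activity failed and exhausted its retries."},
    # Activity with automatic retries, a timeout, and the precondition.
    {"id": "CopyInput", "name": "CopyInput", "type": "CopyActivity",
     "schedule": {"ref": "DailySchedule"},
     "precondition": {"ref": "InputReady"},
     "maximumRetries": "3", "retryDelay": "10 Minutes",
     "attemptTimeout": "1 Hour", "onFail": {"ref": "FailureAlarm"},
     "input": {"ref": "InputNode"}, "output": {"ref": "OutputNode"}},
    {"id": "InputNode", "name": "InputNode", "type": "S3DataNode",
     "schedule": {"ref": "DailySchedule"},
     "directoryPath": f"{BUCKET}/input/"},
    {"id": "OutputNode", "name": "OutputNode", "type": "S3DataNode",
     "schedule": {"ref": "DailySchedule"},
     "directoryPath": f"{BUCKET}/output/"},
]}

# Serialize for `aws datapipeline put-pipeline-definition`.
print(json.dumps(definition, indent=2))
```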
Use Cases
Schedule daily extraction, transformation, and loading of data from relational databases into S3 or Redshift for analytics processing.
Process application and server log files from S3 using EMR activities to generate aggregated reports and analytics datasets.
Migrate data between on-premises databases and AWS managed database services using scheduled pipeline activities.
Automate the ingestion and transformation of raw data into structured formats in S3 data lakes for downstream analytics.
Replicate DynamoDB tables or S3 data across AWS regions using scheduled pipeline copy activities for disaster recovery.
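Because pipelines for use cases like these are purely declarative, a quick local sanity check before submitting a definition can catch dangling object references. A minimal sketch against the API's pipelineObjects shape (the sample S3-to-Redshift objects are illustrative and include one deliberate mistake):

```python
def dangling_refs(pipeline_objects):
    """Return (object id, field key, target) triples whose refValue
    does not match any defined pipeline object id."""
    defined = {obj["id"] for obj in pipeline_objects}
    missing = []
    for obj in pipeline_objects:
        for field in obj.get("fields", []):
            ref = field.get("refValue")
            if ref is not None and ref not in defined:
                missing.append((obj["id"], field["key"], ref))
    return missing

objects = [
    {"id": "DailySchedule", "name": "DailySchedule",
     "fields": [{"key": "type", "stringValue": "Schedule"}]},
    {"id": "LoadToRedshift", "name": "LoadToRedshift",
     "fields": [{"key": "type", "stringValue": "RedshiftCopyActivity"},
                {"key": "schedule", "refValue": "DailySchedule"},
                # Deliberate error: no object with id "S3Input" is defined.
                {"key": "input", "refValue": "S3Input"}]},
]

print(dangling_refs(objects))  # -> [('LoadToRedshift', 'input', 'S3Input')]
```

The service performs its own validation at put-pipeline-definition time, but a local check like this gives faster feedback when definitions are generated programmatically.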
Integrations
Amazon S3: Primary data node type for reading input data and writing output data in pipeline ETL activities using S3DataNode.
Amazon EMR: Managed Hadoop/Spark cluster resource for running large-scale data processing activities including Hive, Pig, and MapReduce jobs.
Amazon RDS: Relational database data node for SQL-based data extraction and loading between RDS instances and S3 or Redshift.
Amazon DynamoDB: NoSQL data node for importing and exporting DynamoDB table data in pipeline activities for batch processing workflows.
Amazon Redshift: Data warehouse target for loading processed pipeline output data for business intelligence and analytics queries.
AWS Glue: Modern alternative managed ETL service that can complement or replace Data Pipeline for serverless data transformation workflows.
Amazon CloudWatch: Monitor pipeline execution status, set up alarms for pipeline failures, and track activity completion metrics.
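For monitoring, the API's describe_pipelines call reports pipeline state as key/value fields in each entry of pipelineDescriptionList. A small sketch of extracting that into a flat summary, assuming the state and health are carried in the @pipelineState and @healthStatus fields (the sample response below is hand-built for illustration; a live call via boto3 requires credentials):

```python
def pipeline_status(description):
    """Flatten one pipelineDescriptionList entry into name/state/health."""
    fields = {f["key"]: f.get("stringValue")
              for f in description.get("fields", [])}
    return {"name": description.get("name"),
            "state": fields.get("@pipelineState"),
            "health": fields.get("@healthStatus")}

# With boto3 (requires AWS credentials), usage would look like:
#   client = boto3.client("datapipeline")
#   ids = [p["id"] for p in client.list_pipelines()["pipelineIdList"]]
#   for desc in client.describe_pipelines(pipelineIds=ids)["pipelineDescriptionList"]:
#       print(pipeline_status(desc))

sample = {"pipelineId": "df-0123456789", "name": "daily-etl",
          "fields": [{"key": "@pipelineState", "stringValue": "SCHEDULED"},
                     {"key": "@healthStatus", "stringValue": "HEALTHY"}]}
print(pipeline_status(sample))
# -> {'name': 'daily-etl', 'state': 'SCHEDULED', 'health': 'HEALTHY'}
```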