Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning applications using open-source analytics frameworks such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto.

Amazon EMR publishes 1 API on the APIs.io network: Clusters API. Tagged areas include Amazon Web Services, Analytics, Apache Spark, Big Data, and Data Processing.

The Amazon EMR catalog on APIs.io includes 1 JSON-LD context and 2 Spectral governance rulesets.

Amazon EMR’s developer surface includes a developer portal, documentation, an engineering blog, a developer console, a signup flow, support, an FAQ, and 25 more developer resources.


Score: 65.2/100 (strong, flat) · Agent readiness: 28/100 (agent aware)
scored 2026-07-28 · rubric v0.6

Access: Freemium

1 API · 5 Features · 4 Use Cases

Tags: Amazon Web Services, Analytics, Apache Spark, Big Data, Data Processing, Hadoop

Kin Score


Composite quality — 65.2/100 · strong

Contract Quality 16.3 / 25

Developer Ergonomics 7.0 / 20

Commercial Clarity 17.9 / 20

Operational Transparency 8.2 / 13

Governance 8.3 / 12

Discoverability 7.6 / 10
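As a quick sanity check on the breakdown above, the six area scores can be added up against their maxima. This is a minimal sketch that hard-codes the figures shown on this page; the variable names are illustrative, and the rubric may apply weighting or rounding that a plain sum does not capture.

```python
# Area scores and maxima copied from the composite quality breakdown above.
# The names `areas`, `total`, and `maximum` are illustrative only.
areas = {
    "Contract Quality": (16.3, 25),
    "Developer Ergonomics": (7.0, 20),
    "Commercial Clarity": (17.9, 20),
    "Operational Transparency": (8.2, 13),
    "Governance": (8.3, 12),
    "Discoverability": (7.6, 10),
}

# Sum the displayed scores (rounded to one decimal) and the per-area maxima.
total = round(sum(score for score, _ in areas.values()), 1)
maximum = sum(cap for _, cap in areas.values())

print(f"{total} / {maximum}")  # prints "65.3 / 100"
```

The displayed area scores add up to 65.3, within 0.1 of the published 65.2. That gap would be consistent with each area being rounded for display, though the page does not say so.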

Agent readiness — 28/100 · agent aware

Machine-Readable Contract 18 / 18

Agentic Access Contract 10 / 10

MCP Server 0 / 12

Machine-Readable Auth 0 / 10

Idempotency 0 / 9

Stable Error Semantics 0 / 8

Request/Response Examples 7 / 7

Rate-Limit Signaling 7 / 7

Typed Event Surface 0 / 6

Agent Skills 0 / 5

Well-Known Catalog 0 / 4

Consent & Bot Identity 0 / 3

A2A Agent Card 0 / 8

Dry-Run / Simulate Mode 0 / 4

Improve this rating by publishing the missing artifacts; every area above can be raised, and the full rubric is at apis.io/rating/. This rating is computed from github.com/api-evangelist/amazon-emr: open an issue to ask a question, or submit a pull request to add artifacts.

APIs 1

Individual APIs this provider publishes, each with its own machine-readable definition.

Rate Limits 1

Documented rate limits and quota policies.

Amazon EMR Rate Limits

5 limits

FinOps 1

Cost, billing, and metering signals for API financial operations.

Amazon EMR FinOps

Features 5

Notable capabilities this provider offers.

Apache Spark Support

Run Apache Spark jobs for large-scale data processing and machine learning

Auto Scaling

Automatically adjust cluster size based on workload demand

Spot Instance Integration

Use EC2 Spot Instances to reduce costs by up to 90%

EMR Serverless

Run analytics without provisioning or managing clusters

Studio Notebooks

Develop and debug jobs using EMR Studio Jupyter notebooks
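To make the Spark and Spot features above more concrete, here is a minimal sketch of the request a client might assemble to launch a Spark cluster with Spot core nodes. The keys follow the parameters of the EMR `RunJobFlow` operation as exposed by boto3's `run_job_flow`. The helper name `build_spark_cluster_request`, the cluster name, the release label, the instance types, and the role names are all placeholder assumptions, not values taken from this page. The code only builds a dictionary and does not call AWS.

```python
from typing import Any


def build_spark_cluster_request(
    name: str, core_count: int = 2, use_spot: bool = True
) -> dict[str, Any]:
    """Build keyword arguments for an EMR RunJobFlow call (hypothetical helper)."""
    # Core nodes can run on Spot capacity; the primary node stays On-Demand here.
    core_market = "SPOT" if use_spot else "ON_DEMAND"
    return {
        "Name": name,
        "ReleaseLabel": "emr-7.1.0",  # placeholder release label
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "Market": "ON_DEMAND",
                    "InstanceType": "m5.xlarge",  # placeholder instance type
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "Market": core_market,
                    "InstanceType": "m5.xlarge",  # placeholder instance type
                    "InstanceCount": core_count,
                },
            ],
            # Terminate the cluster once there are no steps left to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default role names
        "ServiceRole": "EMR_DefaultRole",
    }


request = build_spark_cluster_request("example-spark-cluster")
print(request["Instances"]["InstanceGroups"][1]["Market"])  # prints "SPOT"
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("emr").run_job_flow(**request)`. That call is left out here because it would start billable resources.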

JSON Schema 1

Standalone JSON Schema definitions for this provider's data models.

Examples 1

Example request and response payloads for these APIs.


Use Cases 4

What developers build with this provider.

ETL Data Processing

Extract, transform, and load large datasets across data lakes and warehouses

Machine Learning

Train machine learning models on large datasets using Spark MLlib

Log Analytics

Process and analyze application logs at petabyte scale

Financial Risk Analysis

Run Monte Carlo simulations and risk models on large datasets

Integrations 4

Pre-built integrations with other platforms and tools.

Amazon S3

Use S3 as data lake storage for EMR clusters

AWS Glue

Integrate with Glue Data Catalog for metadata management

Amazon Athena

Query data processed by EMR using Athena SQL

Amazon SageMaker

Hand off processed data to SageMaker for model training

Resources

Get Started 5

Portal, sign-up, and the first successful call

Portal

DeveloperPortal

Console

Signup

Login

Documentation 1

Reference material describing how the API behaves

Documentation

Agent Surfaces 1

MCP servers, agent skills, and machine-readable catalogs

AgenticAccess

Design & Contract 8

Pagination, idempotency, versioning, errors, and events

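Arazzo: Amazon EMR Launch a Cluster With Processing Steps

Arazzo: Amazon EMR Launch a Hadoop and Hive Cluster

Arazzo: Amazon EMR Launch an HBase Cluster

Arazzo: Amazon EMR Launch a Presto Query Cluster

Arazzo: Amazon EMR Launch a Spark Cluster

Arazzo: Amazon EMR Run a Spark ETL Job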

SpectralRules

Vocabulary


Build 2

SDKs, sample code, and the tooling you integrate with

PostmanWorkspace

GitHubOrganization

Access & Security 5

Authentication, authorization, and security posture

TrustCenter

VulnerabilityDisclosure

DomainSecurity

Compliance

Security

Learn 1

Tutorials, courses, talks, and written guidance

YouTube

Operate 5

Status, limits, changes, and where to get help

StatusPage

Support

FAQ

StackOverflow

Contact

Commercial 2

Pricing, plans, and the legal terms of use

TermsOfService

PrivacyPolicy

Company 1

The organization behind the API

Blog

Other 1

Properties that don't map to a standard resource type

KnowledgeCenter

Source (apis.yml)

name: Amazon EMR
description: Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive
  SQL queries, and machine learning applications using open-source analytics frameworks such as Apache Spark, Apache Hive,
  Apache HBase, Apache Flink, Apache Hudi, and Presto.
accessModel:
  pricing: freemium
  onboarding: unknown
  trial: false
  try_now: false
  public: false
  label: Freemium
  confidence: medium
  source:
  - plans
  generated: '2026-07-22'
  method: derived
image: https://a0.awsstatic.com/libra-css/images/logos/aws_logo_smile_1200x630.png
url: https://aws.amazon.com/emr/
created: '2024-01-15'
modified: '2026-05-19'
specificationVersion: '0.19'
tags:
- Amazon Web Services
- Analytics
- Apache Spark
- AWS
- Big Data
- Data Processing
- Hadoop
apis:
- aid: amazon-emr:amazon-emr-clusters-api
  name: Amazon EMR Clusters API
  description: Operations for creating, managing, and terminating EMR clusters
  humanURL: https://aws.amazon.com/emr/
  baseURL: https://elasticmapreduce.amazonaws.com
  tags:
  - Clusters
  properties:
  - type: OpenAPI
    url: openapi/amazon-emr-clusters-api-openapi.yml
  - type: Documentation
    url: https://docs.aws.amazon.com/emr/latest/ManagementGuide/
  - type: APIReference
    url: https://docs.aws.amazon.com/emr/latest/APIReference/
  - type: GettingStarted
    url: https://aws.amazon.com/emr/getting-started/
  - type: Pricing
    url: https://aws.amazon.com/emr/pricing/
  - type: FAQ
    url: https://aws.amazon.com/emr/faqs/
  - type: JSONSchema
    url: json-schema/amazon-emr-schema.json
  - type: JSONLD
    url: json-ld/amazon-emr-context.jsonld
common:
- type: AgenticAccess
  url: agentic-access/amazon-emr-agentic-access.yml
- type: TrustCenter
  url: security/amazon-emr-trust-center.yml
- type: VulnerabilityDisclosure
  url: security/amazon-emr-vulnerability-disclosure.yml
- type: DomainSecurity
  url: security/amazon-emr-domain-security.yml
- type: PostmanWorkspace
  url: https://www.postman.com/kinlaneapi/amazon-emr/overview
- type: Arazzo
  url: arazzo/amazon-emr-run-cluster-with-steps-workflow.yml
  name: Amazon EMR Launch a Cluster With Processing Steps
- type: Arazzo
  url: arazzo/amazon-emr-run-hadoop-hive-cluster-workflow.yml
  name: Amazon EMR Launch a Hadoop and Hive Cluster
- type: Arazzo
  url: arazzo/amazon-emr-run-hbase-cluster-workflow.yml
  name: Amazon EMR Launch an HBase Cluster
- type: Arazzo
  url: arazzo/amazon-emr-run-presto-query-cluster-workflow.yml
  name: Amazon EMR Launch a Presto Query Cluster
- type: Arazzo
  url: arazzo/amazon-emr-run-spark-cluster-workflow.yml
  name: Amazon EMR Launch a Spark Cluster
- type: Arazzo
  url: arazzo/amazon-emr-run-spark-etl-job-workflow.yml
  name: Amazon EMR Run a Spark ETL Job
- type: Portal
  url: https://aws.amazon.com/
- type: DeveloperPortal
  url: https://aws.amazon.com/emr/
- type: Documentation
  url: https://docs.aws.amazon.com/emr/
- type: Blog
  url: https://aws.amazon.com/blogs/
- type: GitHubOrganization
  url: https://github.com/aws
- type: Console
  url: https://console.aws.amazon.com/emr/
- type: Signup
  url: https://portal.aws.amazon.com/billing/signup
- type: Login
  url: https://signin.aws.amazon.com/
- type: StatusPage
  url: https://health.aws.amazon.com/health/status
- type: Support
  url: https://aws.amazon.com/support/
- type: FAQ
  url: https://aws.amazon.com/emr/faqs/
- type: TermsOfService
  url: https://aws.amazon.com/service-terms/
- type: PrivacyPolicy
  url: https://aws.amazon.com/privacy/
- type: Compliance
  url: https://aws.amazon.com/compliance/
- type: Security
  url: https://aws.amazon.com/security/
- type: YouTube
  url: https://www.youtube.com/user/AmazonWebServices
- type: StackOverflow
  url: https://stackoverflow.com/questions/tagged/emr
- type: KnowledgeCenter
  url: https://repost.aws/knowledge-center
- type: Contact
  url: https://aws.amazon.com/contact-us/
- type: SpectralRules
  url: rules/amazon-emr-spectral-rules.yml
- type: Vocabulary
  url: vocabulary/amazon-emr-vocabulary.yaml
- type: Features
  data:
  - name: Apache Spark Support
    description: Run Apache Spark jobs for large-scale data processing and machine learning
  - name: Auto Scaling
    description: Automatically adjust cluster size based on workload demand
  - name: Spot Instance Integration
    description: Use EC2 Spot instances to reduce costs up to 90%
  - name: EMR Serverless
    description: Run analytics without provisioning or managing clusters
  - name: Studio Notebooks
    description: Develop and debug jobs using EMR Studio Jupyter notebooks
- type: UseCases
  data:
  - name: ETL Data Processing
    description: Extract, transform, and load large datasets across data lakes and warehouses
  - name: Machine Learning
    description: Train machine learning models on large datasets using Spark MLlib
  - name: Log Analytics
    description: Process and analyze application logs at petabyte scale
  - name: Financial Risk Analysis
    description: Run Monte Carlo simulations and risk models on large datasets
- type: Integrations
  data:
  - name: Amazon S3
    description: Use S3 as data lake storage for EMR clusters
  - name: AWS Glue
    description: Integrate with Glue Data Catalog for metadata management
  - name: Amazon Athena
    description: Query data processed by EMR using Athena SQL
  - name: Amazon SageMaker
    description: Hand off processed data to SageMaker for model training
maintainers:
- FN: Kin Lane
  email: kin@apievangelist.com
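
The per-category counts on this page (for example, six Arazzo workflows) can be reproduced by tallying the `type` values in the `common` list of the source above. The sketch below works on a short excerpt inlined as a string and uses a regular expression rather than a YAML parser, purely to stay dependency-free. The names `EXCERPT` and `count_types` are illustrative, and a real tool would more likely load the whole file with a YAML library.

```python
import re
from collections import Counter

# A short excerpt of the `common` list from the apis.yml shown above.
EXCERPT = """\
common:
- type: AgenticAccess
  url: agentic-access/amazon-emr-agentic-access.yml
- type: Arazzo
  url: arazzo/amazon-emr-run-cluster-with-steps-workflow.yml
- type: Arazzo
  url: arazzo/amazon-emr-run-spark-cluster-workflow.yml
- type: SpectralRules
  url: rules/amazon-emr-spectral-rules.yml
"""


def count_types(text: str) -> Counter:
    """Count top-level `- type: X` entries (hypothetical helper)."""
    # Anchoring at column 0 skips the indented `- type:` entries that appear
    # under each API's `properties` list in the full file.
    return Counter(re.findall(r"^- type: (\S+)$", text, flags=re.MULTILINE))


counts = count_types(EXCERPT)
print(counts["Arazzo"])  # prints 2 for this excerpt
```

Run against the full `common` list above, the same tally would give 6 for `Arazzo`, matching the six workflows listed.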
