Amazon DataZone
Amazon DataZone is a data management service that helps you catalog, discover, govern, share, and analyze data across your organization and beyond. Data producers and consumers collaborate through a business data catalog with built-in governance, which organizes and shares data across your AWS environment. DataZone provides domain-based governance, project workspaces, subscription-based access control, and integration with AWS analytics services.
APIs
Amazon DataZone API
The Amazon DataZone API provides programmatic access to create and manage data domains, data assets, data catalogs, projects, subscriptions, and governance policies for enterpri...
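As a rough illustration of the programmatic access described above, the following sketch creates a domain and then a project inside it. It assumes a boto3 DataZone client (for example `boto3.client("datazone")`) is passed in. The helper name `bootstrap_domain` and all names and ARNs passed to it are hypothetical placeholders.

```python
def bootstrap_domain(client, domain_name, execution_role_arn, project_name):
    """Create a DataZone domain, then a project inside it.

    `client` is assumed to be a boto3 DataZone client, e.g.
    boto3.client("datazone"). Names and ARNs are placeholders.
    """
    # The domain needs an IAM execution role that DataZone can assume.
    domain = client.create_domain(
        name=domain_name,
        domainExecutionRole=execution_role_arn,
    )
    # Projects always live inside a domain, referenced by its identifier.
    project = client.create_project(
        domainIdentifier=domain["id"],
        name=project_name,
    )
    return domain["id"], project["id"]
```

A new domain may take some time to become available, so a production script would likely poll the domain status before creating the project.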
Capabilities
Features
Central catalog where data producers publish assets and data consumers can discover, understand, and request access to data products.
Organize data assets, users, and governance policies within domains that reflect your organizational structure and data ownership.
Built-in request and approval workflow that lets data consumers request access to data assets with a business justification and an audit trail.
Isolated project containers within domains where teams organize their data assets, environments, and members.
Automatically provision data access environments with Athena, Glue, Redshift, or other tools when subscriptions are approved.
Automatically discover and import tables from AWS Glue Data Catalog into DataZone for cataloging and governance.
Track data lineage across assets to understand data origins, transformations, and dependencies for trust and compliance.
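The catalog discovery and subscription request features above can be sketched as a single consumer-side helper. This is a minimal sketch, assuming a boto3 DataZone client is passed in and that search results carry an `assetListing` entry with a `listingId`. The function name `request_access` and its arguments are hypothetical.

```python
def request_access(client, domain_id, project_id, search_text, reason):
    """Find the first matching asset listing and request a subscription.

    Returns the subscription request id, or None if nothing matched.
    `client` is assumed to be a boto3 DataZone client.
    """
    found = client.search_listings(
        domainIdentifier=domain_id,
        searchText=search_text,
        maxResults=10,
    )
    # Keep only results that are asset listings.
    listings = [
        item["assetListing"]
        for item in found.get("items", [])
        if "assetListing" in item
    ]
    if not listings:
        return None
    # The request names the listing, the subscribing project, and the
    # business justification that an approver will review.
    response = client.create_subscription_request(
        domainIdentifier=domain_id,
        requestReason=reason,
        subscribedListings=[{"identifier": listings[0]["listingId"]}],
        subscribedPrincipals=[{"project": {"identifier": project_id}}],
    )
    return response["id"]
```

The request stays pending until the owning project accepts or rejects it, which is what produces the audit trail mentioned above.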
Use Cases
Build an internal data marketplace where business units publish their data products for discovery and consumption by other teams.
Implement governed data access with approval workflows ensuring data consumers have proper authorization and business justification.
Share data assets across AWS accounts within an organization using DataZone's subscription and access management capabilities.
Enable analysts to discover and access data independently through the DataZone catalog with automatic environment provisioning.
Maintain audit trails of data access, govern sensitive data assets, and enforce data residency policies through domain governance.
Integrations
DataZone integrates with Glue Data Catalog to automatically discover, import, and catalog Glue tables for sharing and governance.
Catalog Redshift tables and views and govern cross-cluster data sharing through DataZone subscription workflows.
DataZone environments provision Athena as a query engine for subscribers accessing S3-based data assets.
Catalog S3-based datasets in DataZone and control access through subscription-based Lake Formation permissions.
DataZone uses Lake Formation to apply fine-grained column-level and row-level access control when subscriptions are approved.
IAM roles provide domain execution context and identity-based access control for DataZone resources and catalog operations.
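To make the Glue Data Catalog integration listed above more concrete, the sketch below builds the arguments for a Glue-backed data source that imports every table from one Glue database. The request shape is an assumption based on the boto3 `create_data_source` call. The helper name `glue_data_source_request` and the generated data source name are hypothetical.

```python
def glue_data_source_request(domain_id, project_id, environment_id, database_name):
    """Build keyword arguments for a Glue-backed DataZone data source.

    The structure is assumed to match boto3's create_data_source; check
    it against the current API reference before relying on it.
    """
    return {
        "domainIdentifier": domain_id,
        "projectIdentifier": project_id,
        "environmentIdentifier": environment_id,
        "name": f"{database_name}-glue-import",  # placeholder naming scheme
        "type": "GLUE",
        "configuration": {
            "glueRunConfiguration": {
                "relationalFilterConfigurations": [
                    {
                        "databaseName": database_name,
                        # "*" includes every table in the database.
                        "filterExpressions": [
                            {"type": "INCLUDE", "expression": "*"}
                        ],
                    }
                ]
            }
        },
    }
```

With a boto3 DataZone client, the result would be passed as `client.create_data_source(**glue_data_source_request(...))`.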