DataHub

DataHub is LinkedIn's generalized metadata search and discovery platform, providing a unified data catalog, lineage graph, governance tooling, and event-driven Actions Framework. It exposes GraphQL, OpenAPI, and Rest.li APIs along with Python and Java SDKs and a CLI for metadata ingestion.

7 APIs 0 Features
Tags: Data Catalog, Data Discovery, Data Governance, Data Lineage, Metadata

APIs

DataHub GraphQL API

Primary API for querying and mutating metadata in DataHub. The GraphQL API serves as the main public API for the platform and can be used to fetch and update metadata programmatically in the language of your choice. It mirrors the capabilities available in the DataHub UI.
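
As a rough illustration of calling the GraphQL API from Python, the sketch below builds a POST request against the `baseURL` given in this listing (`http://localhost:8080/api/graphql`). The query shape (a `search` over `DATASET` entities), the bearer-token header, and the helper names `build_graphql_request` and `run` are assumptions made for this example; check them against the GraphQL reference for your deployment before relying on them.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8080/api/graphql"  # baseURL from this listing

# Assumed query shape: a dataset search that returns only URNs and types.
SEARCH_QUERY = """
query searchDatasets($text: String!) {
  search(input: {type: DATASET, query: $text, start: 0, count: 5}) {
    total
    searchResults { entity { urn type } }
  }
}
"""


def build_graphql_request(query, variables, token=None):
    """Build (but do not send) a POST request for the GraphQL endpoint."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the deployment accepts a personal access token as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(GRAPHQL_URL, data=body, headers=headers, method="POST")


def run(request):
    """Send the request and decode the JSON response (needs a running instance)."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)


request = build_graphql_request(SEARCH_QUERY, {"text": "orders"})
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example runnable without a live DataHub instance; calling `run(request)` is the only step that touches the network.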

DataHub OpenAPI

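RESTful API endpoints documented using the OpenAPI standard for interacting with DataHub metadata. Provides endpoints for entities, relationships, timeline, and platform events. The OpenAPI spec is auto-generated and available via Swagger UI for interactive exploration. Recommended for advanced users who need lower-level access to the metadata graph.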
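
One practical detail when calling entity endpoints is that DataHub URNs contain characters such as `:`, `(`, and `,` that need percent-encoding when the URN is used as a path segment. The sketch below shows one way to do that on top of the `baseURL` from this listing. The `<version>/entity/<type>/<urn>` path layout, the default `v3`, and the helper name `entity_url` are assumptions for illustration; the Swagger UI of your deployment is the authority on the real paths.

```python
from urllib.parse import quote, unquote

OPENAPI_BASE = "http://localhost:8080/openapi/"  # baseURL from this listing


def entity_url(urn, entity_type="dataset", api_version="v3"):
    """Return a URL for one entity, encoding the URN as a single path segment.

    The path layout here is an assumption, not taken from this listing.
    """
    encoded = quote(urn, safe="")  # safe="" also encodes ":" and "/"
    return f"{OPENAPI_BASE}{api_version}/entity/{entity_type}/{encoded}"


# Hypothetical dataset used only for illustration.
urn = "urn:li:dataset:(urn:li:dataPlatform:hive,sales.orders,PROD)"
url = entity_url(urn)
print(url)

# The encoded segment decodes back to the original URN.
assert unquote(url.rsplit("/", 1)[1]) == urn
```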

DataHub REST API

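The Rest.li API represents the underlying persistence layer and exposes the raw PDL models used in storage. It powers the GraphQL API under the hood and is used for system-specific ingestion of metadata by the Metadata Ingestion Framework. This API is considered system-internal and is not recommended for direct external use.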

DataHub Python SDK

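Python client for interacting with DataHub. The acryl-datahub package provides a CLI and SDK for DataHub, including REST and Kafka emitter APIs for pushing metadata programmatically. It is one of the most recommended tools for extending and customizing DataHub behavior, especially for ingestion and bulk metadata operations.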
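
To give a feel for what an emitter pushes, the standard-library sketch below builds a dataset URN and a dictionary shaped like a metadata change proposal for a single aspect. It deliberately does not import `acryl-datahub`; the package ships its own builders and emitters, which should be preferred in real code. The function names, the `hive` platform, the `sales.orders` dataset, and the exact dictionary layout are assumptions for illustration only.

```python
import json


def make_dataset_urn(platform, name, env="PROD"):
    """Build a dataset URN in the urn:li:dataset:(platform,name,env) layout."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def dataset_properties_proposal(urn, description):
    """Return a dict shaped like a change proposal for one aspect (illustrative)."""
    return {
        "entityType": "dataset",
        "entityUrn": urn,
        "changeType": "UPSERT",  # write the aspect, creating or replacing it
        "aspectName": "datasetProperties",
        "aspect": {"description": description},
    }


# Hypothetical platform and dataset names.
urn = make_dataset_urn("hive", "sales.orders")
proposal = dataset_properties_proposal(urn, "Orders placed through the web store")
print(json.dumps(proposal, indent=2))
```

The point of the sketch is the unit of work: one entity URN plus one named aspect per proposal, which is the granularity the upsert and tagging workflows in this listing operate on.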

DataHub Java SDK

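Java client for interacting with DataHub. The io.acryl datahub-client package offers REST emitter APIs that can be used to emit metadata from JVM-based systems. It supports all major DataHub entity types including Dataset, Chart, Dashboard, Container, DataFlow, DataJob, MLModel, and MLModelGroup.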

DataHub CLI

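Command line tool for interacting with DataHub. The datahub CLI allows you to perform common operations including metadata ingestion, entity management, and system administration from the command line. It is installed as part of the acryl-datahub Python package and supports a plugin architecture for different data source connectors.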

DataHub Actions Framework

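Event-driven framework for responding to real-time changes in the DataHub metadata graph. The Actions Framework allows you to configure event sources, transformations, and actions using YAML configuration files. It enables seamless integration of DataHub into a broader event-based architecture by consuming Metadata Change Logs and Platform Events.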

Event Specifications

DataHub Actions Framework Events

Event-driven interface for responding to real-time changes in the DataHub metadata graph. The Actions Framework consumes Metadata Change Log events and Platform Events from Kafk...
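
The filter-then-act pattern behind the Actions Framework can be sketched with plain dictionaries. The example below does not use the `acryl-datahub-actions` package or Kafka; it only mimics the idea of matching an event envelope on its type and a few fields before handing it to an action. The envelope layout (`event_type` plus a nested `event`), the `EntityChangeEvent_v1` type name, the field names, and the sample events are assumptions made for illustration.

```python
EVENT_TYPE = "EntityChangeEvent_v1"  # assumed event type name


def matches(envelope, event_type, **fields):
    """Return True if the envelope has the given type and all field values."""
    if envelope.get("event_type") != event_type:
        return False
    event = envelope.get("event", {})
    return all(event.get(key) == value for key, value in fields.items())


def tag_added_action(envelope):
    """Stand-in action: describe the change instead of calling an external system."""
    event = envelope["event"]
    return f"tag added on {event['entityUrn']}"


# Invented sample events; a real deployment would read these from an event source.
sample_events = [
    {"event_type": EVENT_TYPE,
     "event": {"entityUrn": "urn:li:dataset:(urn:li:dataPlatform:hive,sales.orders,PROD)",
               "category": "TAG", "operation": "ADD"}},
    {"event_type": EVENT_TYPE,
     "event": {"entityUrn": "urn:li:dataset:(urn:li:dataPlatform:hive,sales.orders,PROD)",
               "category": "OWNER", "operation": "ADD"}},
    {"event_type": "MetadataChangeLogEvent_v1", "event": {}},
]

handled = [
    tag_added_action(e)
    for e in sample_events
    if matches(e, EVENT_TYPE, category="TAG", operation="ADD")
]
print(handled)
```

Keeping the filter separate from the action reflects the split the framework description draws between event sources, transformations, and actions: only the first sample event passes the filter here.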

ASYNCAPI

Semantic Vocabularies

DataHub Context

0 classes · 9 properties

JSON-LD

API Governance Rules

DataHub API Rules

5 rules · 1 error · 4 warnings

SPECTRAL

Resources

PostmanWorkspace
ArazzoWorkflows
Website
Portal
Documentation
GettingStarted
Authentication
GitHubRepository
Slack
Blog
Demo
ChangeLog
StatusPage
Community
YouTube
LinkedIn
PrivacyPolicy
Security
JSONLD
Vocabulary
Capabilities
Rules

Sources

aid: datahub
name: DataHub
description: >-
  DataHub is LinkedIn's generalized metadata search and discovery platform, providing a unified data catalog, lineage
  graph, governance tooling, and event-driven Actions Framework. It exposes GraphQL, OpenAPI, and Rest.li APIs along
  with Python and Java SDKs and a CLI for metadata ingestion.
image: https://datahubproject.io/img/datahub-logo.svg
type: Index
tags:
- Data Catalog
- Data Discovery
- Data Governance
- Data Lineage
- Metadata
created: '2024-01-15'
modified: '2026-05-19'
url: https://raw.githubusercontent.com/api-evangelist/datahub/refs/heads/main/apis.yml
specificationVersion: '0.19'
kind: opensource
position: Consumer
access: 3rd-Party
apis:
- aid: datahub:datahub-graphql-api
  name: DataHub GraphQL API
  description: >-
    Primary API for querying and mutating metadata in DataHub. The GraphQL API serves as the main public API for the
    platform and can be used to fetch and update metadata programmatically in the language of your choice. It mirrors
    the capabilities available in the DataHub UI.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/api/graphql/overview
  baseURL: http://localhost:8080/api/graphql
  tags:
  - GraphQL
  - Metadata
  - Queries
  - Search
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/api/graphql/overview
  - type: GettingStarted
    url: https://docs.datahub.com/docs/api/graphql/getting-started
  - type: Reference
    url: https://docs.datahub.com/docs/graphql/queries
  - type: Playground
    url: http://localhost:8080/api/graphiql
  - url: graphql/datahub-graphql.md
    type: GraphQL
- aid: datahub:datahub-openapi
  name: DataHub OpenAPI
  description: >-
    RESTful API endpoints documented using the OpenAPI standard for interacting with DataHub metadata. Provides
    endpoints for entities, relationships, timeline, and platform events. The OpenAPI spec is auto-generated and
    available via Swagger UI for interactive exploration. Recommended for advanced users who need lower-level access
    to the metadata graph.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/api/openapi/openapi-usage-guide
  baseURL: http://localhost:8080/openapi/
  tags:
  - Entities
  - Metadata
  - OpenAPI
  - REST
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/api/openapi/openapi-usage-guide
  - type: OpenAPI
    url: openapi/datahub-openapi-openapi.yml
  - type: JSONSchema
    url: json-schema/datahub-metadata-change-log-event-schema.json
- aid: datahub:datahub-rest-api
  name: DataHub REST API
  description: >-
    The Rest.li API represents the underlying persistence layer and exposes the raw PDL models used in storage. It
    powers the GraphQL API under the hood and is used for system-specific ingestion of metadata by the Metadata
    Ingestion Framework. This API is considered system-internal and is not recommended for direct external use.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/api/datahub-apis
  baseURL: http://localhost:8080/
  tags:
  - Entities
  - Internal
  - Metadata
  - REST
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/api/datahub-apis
- aid: datahub:datahub-python-sdk
  name: DataHub Python SDK
  description: >-
    Python client for interacting with DataHub. The acryl-datahub package provides a CLI and SDK for DataHub,
    including REST and Kafka emitter APIs for pushing metadata programmatically. It is one of the most recommended
    tools for extending and customizing DataHub behavior, especially for ingestion and bulk metadata operations.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/metadata-ingestion/as-a-library
  baseURL: https://pypi.org/project/acryl-datahub/
  tags:
  - Emitter
  - Ingestion
  - Python
  - SDK
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/metadata-ingestion/as-a-library
  - type: GitHubRepository
    url: https://github.com/datahub-project/datahub
  - type: SDKs
    url: https://pypi.org/project/acryl-datahub/
- aid: datahub:datahub-java-sdk
  name: DataHub Java SDK
  description: >-
    Java client for interacting with DataHub. The io.acryl datahub-client package offers REST emitter APIs that can be
    used to emit metadata from JVM-based systems. It supports all major DataHub entity types including Dataset, Chart,
    Dashboard, Container, DataFlow, DataJob, MLModel, and MLModelGroup.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/metadata-integration/java/as-a-library
  baseURL: https://github.com/datahub-project/datahub
  tags:
  - Emitter
  - Java
  - Metadata
  - SDK
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/metadata-integration/java/as-a-library
  - type: GitHubRepository
    url: https://github.com/datahub-project/datahub
- aid: datahub:datahub-cli
  name: DataHub CLI
  description: >-
    Command line tool for interacting with DataHub. The datahub CLI allows you to perform common operations including
    metadata ingestion, entity management, and system administration from the command line. It is installed as part of
    the acryl-datahub Python package and supports a plugin architecture for different data source connectors.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/cli
  baseURL: https://pypi.org/project/acryl-datahub/
  tags:
  - CLI
  - Command Line
  - Ingestion
  - Metadata
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/cli
  - type: GettingStarted
    url: https://docs.datahub.com/docs/metadata-ingestion/cli-ingestion
  - type: SDKs
    url: https://pypi.org/project/acryl-datahub/
- aid: datahub:datahub-actions-framework
  name: DataHub Actions Framework
  description: >-
    Event-driven framework for responding to real-time changes in the DataHub metadata graph. The Actions Framework
    allows you to configure event sources, transformations, and actions using YAML configuration files. It enables
    seamless integration of DataHub into a broader event-based architecture by consuming Metadata Change Logs and
    Platform Events.
  image: https://datahubproject.io/img/datahub-logo.svg
  humanURL: https://docs.datahub.com/docs/actions
  baseURL: https://pypi.org/project/acryl-datahub-actions/
  tags:
  - Actions
  - Automation
  - Events
  - Real-Time
  properties:
  - type: Documentation
    url: https://docs.datahub.com/docs/actions
  - type: GettingStarted
    url: https://docs.datahub.com/docs/actions/quickstart
  - type: SDKs
    url: https://pypi.org/project/acryl-datahub-actions/
  - type: AsyncAPI
    url: asyncapi/datahub-actions-asyncapi.yml
common:
- type: PostmanWorkspace
  url: https://www.postman.com/kinlaneapi/datahub/overview
- type: ArazzoWorkflows
  url: arazzo/
  workflows:
  - url: arazzo/datahub-add-glossary-terms-workflow.yml
    name: DataHub Add Glossary Terms to a Dataset
    summary: >-
      Confirm a dataset, attach glossary terms via its glossaryTerms aspect, then review the change in the entity
      timeline.
  - url: arazzo/datahub-assign-ownership-workflow.yml
    name: DataHub Assign Dataset Ownership
    summary: Write an ownership aspect onto a dataset, then read it back to verify the owners were recorded.
  - url: arazzo/datahub-decommission-dataset-workflow.yml
    name: DataHub Decommission a Dataset
    summary: Confirm a dataset, check it has no downstream dependents, then soft delete it from the metadata graph.
  - url: arazzo/datahub-emit-and-audit-workflow.yml
    name: DataHub Emit Platform Event and Audit
    summary: >-
      Emit a metadata change proposal through the platform ingestion path, then read the entity back and review its
      timeline.
  - url: arazzo/datahub-tag-dataset-workflow.yml
    name: DataHub Tag a Dataset
    summary: Confirm a dataset exists, then write its globalTags aspect to apply governance tags.
  - url: arazzo/datahub-trace-lineage-workflow.yml
    name: DataHub Trace Dataset Lineage
    summary: Confirm a dataset, query its downstream relationships, then batch fetch the related datasets' aspects.
  - url: arazzo/datahub-upsert-dataset-workflow.yml
    name: DataHub Upsert Dataset and Verify
    summary: >-
      Write a dataset's properties aspect into the metadata graph, then read the entity back to confirm the write
      landed.
- type: Website
  url: https://datahub.com
- type: Portal
  url: https://docs.datahub.com
- type: Documentation
  url: https://docs.datahub.com/docs/
- type: GettingStarted
  url: https://docs.datahub.com/docs/quickstart
- type: Authentication
  url: https://docs.datahub.com/docs/authentication
- type: GitHubRepository
  url: https://github.com/datahub-project/datahub
- type: Slack
  url: https://slack.datahubproject.io
- type: Blog
  url: https://datahub.com/blog/
- type: Demo
  url: https://demo.datahubproject.io/
- type: ChangeLog
  url: https://github.com/datahub-project/datahub/releases
- type: StatusPage
  url: https://status.datahub.com
- type: Community
  url: https://forum.datahubproject.io/
- type: YouTube
  url: https://youtube.com/@datahubproject
- type: LinkedIn
  url: https://www.linkedin.com/company/datahub-cloud
- type: PrivacyPolicy
  url: https://datahub.com/privacy-policy/
- type: Security
  url: https://docs.datahub.com/docs/security_stance
- type: JSONLD
  url: json-ld/datahub-context.jsonld
- type: Vocabulary
  url: vocabulary/datahub-vocabulary.yml
- type: Capabilities
  url: capabilities/datahub-capabilities.yml
- type: Rules
  url: rules/datahub-rules.yml
maintainers:
- FN: Kin Lane
  email: kin@apievangelist.com