Agent Skill · MotherDuck

motherduck-build-data-pipeline

Design an end-to-end MotherDuck data pipeline. Use for ETL/ELT workflows -- choosing raw, staging, and analytics boundaries, bulk ingestion paths, transformation sequencing, dlt/dbt integration, publication targets, or whether DuckLake is actually required.

Provider: MotherDuck
Path in repo: skills/motherduck-build-data-pipeline/SKILL.md

Skill body

Build a Data Pipeline with MotherDuck

Use this skill when the user needs an ingestion-to-serving workflow, not just a single load step.

This is a use-case skill. It orchestrates motherduck-connect, motherduck-load-data, motherduck-model-data, motherduck-query, motherduck-share-data, and motherduck-ducklake.

Start Here: Is a MotherDuck Server Active?

Always determine this first.

Use that discovery to decide whether the pipeline is:

If no server is active, ask for source shape and target shape before drafting the pipeline.

Use This Skill When

Pipeline Defaults

Workflow

  1. Confirm whether live MotherDuck discovery is available.
  2. Inspect the current workspace and target data model.
  3. Define raw, staging, and analytics boundaries.
  4. Ingest raw data.
  5. Deduplicate, type, and promote into staging.
  6. Materialize analytics-ready outputs.
  7. Validate counts, freshness, uniqueness, and business metrics before publishing downstream assets.
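
The raw, staging, and analytics steps above can be sketched in a few statements. This is a minimal illustration, not the skill's bundled artifact. The table names (`raw_orders`, `stg_orders`, `analytics_order_totals`), the sample rows, and the `run_stages` helper are all hypothetical. The standard-library `sqlite3` module stands in for the database so the sketch runs without dependencies. The SQL is kept portable on purpose, so the same function should accept any connection exposing a compatible `execute(...).fetchone()` interface, which a DuckDB connection also provides.

```python
import sqlite3


def run_stages(con):
    """Run a raw -> staging -> analytics flow and return validation facts."""
    # Step 4: land source rows as-is (untyped text, duplicates included).
    con.execute(
        "CREATE TABLE raw_orders (order_id TEXT, amount TEXT, loaded_at TEXT)"
    )
    sample_rows = [  # hypothetical sample data
        ("1", "10.50", "2024-01-01"),
        ("1", "10.50", "2024-01-02"),  # re-delivered duplicate of order 1
        ("2", "4.25", "2024-01-02"),
    ]
    for row in sample_rows:
        con.execute("INSERT INTO raw_orders VALUES (?, ?, ?)", row)

    # Step 5: deduplicate (keep the latest load per order), type, and promote.
    con.execute(
        """
        CREATE TABLE stg_orders AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               CAST(amount AS DOUBLE) AS amount,
               loaded_at
        FROM (
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY order_id ORDER BY loaded_at DESC
            ) AS rn
            FROM raw_orders
        ) AS ranked
        WHERE rn = 1
        """
    )

    # Step 6: materialize an analytics-ready output.
    con.execute(
        """
        CREATE TABLE analytics_order_totals AS
        SELECT COUNT(*) AS order_count, SUM(amount) AS revenue
        FROM stg_orders
        """
    )

    # Step 7: validate counts and uniqueness before anything is published.
    staged = con.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
    distinct_ids = con.execute(
        "SELECT COUNT(DISTINCT order_id) FROM stg_orders"
    ).fetchone()[0]
    if staged != distinct_ids:
        raise ValueError("stg_orders contains duplicate order_id values")

    order_count, revenue = con.execute(
        "SELECT order_count, revenue FROM analytics_order_totals"
    ).fetchone()
    return {"staged_rows": staged, "order_count": order_count, "revenue": revenue}


result = run_stages(sqlite3.connect(":memory:"))
print(result)
```

Keeping the raw table untyped means a bad source value fails at the staging cast rather than at ingestion, so the failing row is still available for inspection.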

When this skill produces a native DuckDB (md:) connection, watermark it with custom_user_agent=agent-skills/2.3.0(harness-<harness>;llm-<llm>). If metadata is missing, fall back to harness-unknown and llm-unknown.
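
One way to build the watermark string with the fallback described above is sketched here. The `build_user_agent` helper and the example harness and model names are hypothetical. Only the string format and the `unknown` fallbacks come from the text. The commented-out `duckdb.connect` call is an assumption about how the value would be passed and should be checked against the DuckDB and MotherDuck documentation.

```python
SKILL_VERSION = "2.3.0"  # version taken from the watermark format above


def build_user_agent(harness=None, llm=None):
    """Return the custom_user_agent value, falling back to 'unknown' parts."""
    harness = harness or "unknown"  # missing or empty metadata -> harness-unknown
    llm = llm or "unknown"          # missing or empty metadata -> llm-unknown
    return f"agent-skills/{SKILL_VERSION}(harness-{harness};llm-{llm})"


# Hypothetical metadata values, for illustration only.
print(build_user_agent("example-harness", "example-llm"))
print(build_user_agent())

# Assumed usage with the duckdb package installed (not executed here):
# import duckdb
# con = duckdb.connect("md:", config={"custom_user_agent": build_user_agent()})
```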

Output

The output of this skill should be:

If the caller explicitly asks for structured JSON, return raw JSON only with no Markdown fences or prose before/after it. This is mainly for automated tests, regression checks, or downstream tooling that needs a stable machine-readable shape. Normal human-facing use of the skill can stay in prose unless JSON is explicitly requested.

Use this exact top-level shape when JSON is requested:

{
  "summary": {},
  "assumptions": [],
  "implementation_plan": [],
  "validation_plan": [],
  "risks": []
}
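
A small sketch of how a caller or regression test might emit and check that shape follows. The helper names and the sample field contents are hypothetical. Only the five top-level keys come from the shape above, and the element types inside each list are left open because the source does not specify them.

```python
import json

# Top-level keys, in the order shown in the shape above.
REQUIRED_KEYS = [
    "summary",
    "assumptions",
    "implementation_plan",
    "validation_plan",
    "risks",
]


def render_json(summary, assumptions, implementation_plan, validation_plan, risks):
    """Serialize the skill output as raw JSON text (no fences, no prose)."""
    payload = {
        "summary": summary,
        "assumptions": assumptions,
        "implementation_plan": implementation_plan,
        "validation_plan": validation_plan,
        "risks": risks,
    }
    return json.dumps(payload, indent=2)


def has_expected_shape(text):
    """Check that text is a JSON object with exactly the expected top-level shape."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False  # fenced or prose-wrapped output fails to parse
    if not isinstance(data, dict) or sorted(data) != sorted(REQUIRED_KEYS):
        return False
    if not isinstance(data["summary"], dict):
        return False
    return all(isinstance(data[key], list) for key in REQUIRED_KEYS[1:])


# Hypothetical sample content, for illustration only.
text = render_json(
    summary={"goal": "orders pipeline"},
    assumptions=["source is a daily CSV export"],
    implementation_plan=["ingest raw", "promote to staging", "publish analytics"],
    validation_plan=["row counts", "uniqueness of order_id"],
    risks=["late-arriving rows"],
)
print(has_expected_shape(text))
```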

References

Runnable Artifact

Run it with:

uv run --with duckdb python skills/motherduck-build-data-pipeline/artifacts/pipeline_stage_example.py

Run the same stage pattern against temporary MotherDuck databases:

MOTHERDUCK_ARTIFACT_USE_MOTHERDUCK=1 \
uv run --with duckdb python skills/motherduck-build-data-pipeline/artifacts/pipeline_stage_example.py

Validate the TypeScript companion artifact:

uv run scripts/test_typescript_artifacts.py

For the full MotherDuck project:

cd skills/motherduck-build-data-pipeline/references/dlt-dbt-motherduck-project
export MOTHERDUCK_TOKEN=...
export MOTHERDUCK_PIPELINE_DB=md_skills_pipeline_demo
uv sync --python 3.12
uv run python pipeline/run_all.py
uv run python pipeline/cleanup.py

Verified Notes

Skill frontmatter

license: MIT