Agent Skill · NVIDIA NIM

tao-convert-dataset-format

Run `tao-daft convert` to convert NVIDIA TAO DAFT datasets between supported formats. Do not use for non-DAFT data. Use when the user asks to convert a DAFT dataset, change DAFT format, change a TAO dataset format, or run `tao-daft convert`.

Provider: NVIDIA NIM Path in repo: skills/tao-convert-dataset-format/SKILL.md

Skill body

Convert a TAO DAFT Dataset

Quick start

tao-daft convert <source-format> <target-format> --path <input> --output <output>

Source and target are positional subcommands; --path and --output are flags. Discover the supported formats and per-pair flags from the leaf --help (see “CLI conventions” below).

Preflight

python -c "import nvidia_tao_daft" 2>/dev/null || {
  echo "MISSING: tao-daft not installed. Run:"
  echo "  pip install nvidia-tao-daft"
  exit 1
}

Quick Start

Discover the installed CLI surface before choosing format slugs, then run the leaf conversion command with explicit --path and --output flags:

tao-daft --version
tao-daft convert --help
tao-daft convert <source-format> --help
tao-daft convert <source-format> <target-format> --path /path/to/daft --output /path/to/converted

Purpose

Drives tao-daft convert to transform a DAFT dataset (or a tree of them) between supported formats. The CLI does the real work; the skill picks the right source/target pair and flags, then explains the result.

Trigger on: converting a DAFT dataset, packaging DAFT QA / summarization / temporal tasks for VLM training, producing a meta.json-style training set, or the command tao-daft convert. Do not trigger for non-DAFT → DAFT conversion (COCO, YOLO, Data Factory JSONL) — redirect to the upstream nvidia-tao-daft repo’s converter skills.

If the user opens ambiguously, run a few --help calls first.

Prerequisites

Instructions

CLI conventions

tao-daft is nested argparse subcommands. The conventions below are stable across versions even when format names or flags change, so always discover the current surface from --help rather than relying on names this doc happens to mention.

  1. Source and target are both positional subcommands, not --from/--to: tao-daft convert <source> <target> [flags]. Format slugs are versioned, lowercase, dot-separated (metropolis-v3.0, cosmos-reason-v1.0, …).
  2. Path and output are flags--path PATH (source), --output OUTPUT (destination). Both required at the leaf; passing positionally fails.
  3. --path accepts both granularities — a single scene/dataset or a parent directory; the converter walks the tree.
  4. Per-pair flags live at the leaf — flag sets differ between targets (e.g. media-handling). Always check the leaf --help.

Operating procedure:

  1. tao-daft --version — confirm install, pin version in any report.
  2. tao-daft convert --help — list supported source formats.
  3. tao-daft convert <source> --help — list valid targets for that source.
  4. Infer source from layout (same directory markers as the tao-validate-dataset-format skill’s “Format inference”). If you cannot infer or the target is unspecified, ask.
  5. tao-daft convert <source> <target> --help — pick flags for the user’s intent (task subset, media copy vs reference, metadata).
  6. Execute, then interpret (see below).

Reading output

Per-scene progress prints to stdout; non-zero exit on failure. The converted dataset is written under --output — spot-check it with the tao-validate-dataset-format skill before training. For large trees, capture the full output and partial-read if huge.

Limitations

Troubleshooting

Skill frontmatter

license: Apache-2.0 compatibility: Requires Python 3.10+ and the nvidia-tao-sdk package (pip install nvidia-tao-daft). metadata: {"author" => "NVIDIA Corporation", "version" => "0.1.0"} allowed-tools: Read Bash tags: tao-daftdatasetconversionvlmcosmos-reason