Agent Skill · NVIDIA NIM

vss-generate-video-calibration

Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.

Provider: NVIDIA NIM
Path in repo: skills/vss-generate-video-calibration/SKILL.md

Skill body

Purpose

Run AutoMagicCalib end-to-end on local files, RTSP streams, or the bundled sample dataset and (when needed) deploy the AMC microservice.

Instructions

Follow the routing tables and step-by-step workflows below. Each section that ends in workflow, quick start, or flow is intended to be executed top-to-bottom. Detailed reference material lives in references/; load only the reference needed for the selected input mode.

Examples

Worked end-to-end examples are kept under evals/ (each *.json manifest contains a runnable scenario) and inline in the per-workflow curl blocks below. Run a Tier-3 evaluation with nv-base validate <this-skill-dir> --agent-eval to replay them.

Limitations

Troubleshooting

VSS Generate Video Calibration

Run AutoMagicCalib over one of three input sources and drive the calibration through the microservice REST API. The input-resolution work differs per source; everything from verify_project onward is identical and lives in this file. Pick the right input-mode reference and pair it with the Shared Calibration Tail below.

Shared helper references are loaded only when needed:

Input Routing

Match the user’s request to a mode, then load that mode’s reference for input collection, mode-specific API calls, and the full Python script.

| User says / has | Mode | Reference |
| --- | --- | --- |
| “launch AMC” / “deploy auto-calibration” / “set up auto-magic-calib” / “start AMC microservice” | deploy | references/deploy-auto-calibration-service.md |
| “calibrate my videos” / “calibrate from video files” / local cam_*.mp4 files | videos | references/videos.md |
| “calibrate RTSP streams” / “calibrate from live cameras” / live RTSP URLs | rtsp | references/rtsp.md |
| “test sample dataset” / “verify AMC install” / “launch and test” | sample-dataset | references/sample-dataset.md |

Disambiguation rule: if the user is asking to launch / deploy / set up AMC (no calibration verb) → deploy. If they provide RTSP URLs → rtsp. If they mention local files / a videos directory → videos. If they ask to verify install or test the bundled sample → sample-dataset. Combined intents (e.g. “launch AMC and calibrate my videos”) → walk deploy first, then the calibration mode. When ambiguous, ask via AskUserQuestion.
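The disambiguation rule above can be sketched as a small keyword router. This is only an illustration: `route_request` and its keyword lists are hypothetical, not the skill's real matcher, and the choice to route sample-dataset requests without a separate deploy step is an assumption drawn from the “launch and test” row of the routing table.

```python
# Hypothetical keyword router for the disambiguation rule.
# The keyword lists are illustrative assumptions, not part of the skill.
DEPLOY_VERBS = ("launch", "deploy", "set up", "start")


def route_request(text: str) -> list:
    """Return the ordered modes to walk, or [] when the request is ambiguous."""
    t = text.lower()

    # Pick the calibration mode from the input the user mentions.
    if "rtsp" in t or "live camera" in t:
        calib = "rtsp"
    elif ".mp4" in t or "video" in t:
        calib = "videos"
    elif "sample" in t or "test" in t or ("verify" in t and "install" in t):
        calib = "sample-dataset"
    else:
        calib = None

    wants_deploy = any(verb in t for verb in DEPLOY_VERBS)

    if calib == "sample-dataset":
        # Assumption: the table maps "launch and test" to sample-dataset alone.
        return [calib]
    if wants_deploy and calib:
        return ["deploy", calib]  # combined intent: deploy first
    if wants_deploy:
        return ["deploy"]
    if calib:
        return [calib]
    return []  # ambiguous: the caller should ask via AskUserQuestion
```

An empty list is the signal to fall back to AskUserQuestion rather than guess a mode.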

Prerequisites (shared across calibration modes)

Mode-specific prerequisites (VIOS for rtsp, sample zip for sample-dataset) live in the respective references.

Shared Calibration Tail

The verify → calibrate → poll → results sequence is identical regardless of input mode. After the mode-specific reference has uploaded videos / ingested RTSP clips / uploaded the bundled sample, run this tail. Use references/calibration-tail.md for the shared Python snippet.

Step A — Verify Project

POST /v1/verify_project/<project_id>

Response: {"project_state": "READY"} — must be READY before calibrating. If not READY, re-check that videos + alignment + layout are present (either via API or via UI manual alignment).
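A minimal sketch of Step A using only the Python standard library. The `base_url` value is an assumption (this file does not state the backend port), and the helper names are hypothetical; the shared snippet in references/calibration-tail.md remains the reference.

```python
import json
import urllib.request


def build_verify_request(base_url: str, project_id: str) -> urllib.request.Request:
    """Build the POST /v1/verify_project/<project_id> request (no body)."""
    url = f"{base_url.rstrip('/')}/v1/verify_project/{project_id}"
    return urllib.request.Request(url, data=b"", method="POST")


def is_ready(body: bytes) -> bool:
    """True only when the response reports project_state == READY."""
    return json.loads(body).get("project_state") == "READY"


def verify_project(base_url: str, project_id: str, timeout: float = 30.0) -> bool:
    """Send the request; urlopen raises HTTPError on 4xx/5xx responses."""
    request = build_verify_request(base_url, project_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_ready(response.read())
```

If `verify_project` returns False, re-check videos, alignment, and layout before moving on to Step B.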

Step B — Start Calibration

Confirm the plan before calibrating. Whether the settings file and detector were auto-detected or asked, present a short summary and confirm via AskUserQuestion before the POST /calibrate. The resolved values are the defaults, so confirming is one click — but the user can switch the detector or skip an auto-detected settings file. Summarize:

The sample-dataset install-check run uses a fixed resnet and can proceed without this confirmation.

POST /v1/calibrate/<project_id>
Content-Type: application/json

{"detector_type": "resnet"}   # or "transformer"

detector_type is a separate /calibrate parameter — not consumed by /v1/config/<id>. If the user provided a calibration settings file, parse it for "detector" / "detector_type" and use that value. If the file doesn’t specify one, the default (resnet) is the value shown in the confirmation above — the user can switch it there before calibrating. If there’s no settings file at all, ask the user via AskUserQuestion:

UI Step 3 (Parameters) does NOT cover detector choice; never assume the user picked one in the UI.

Also when there’s no settings file, ask whether to tune the calibration parameters first (AskUserQuestion):

Wait for the user’s choice — and, if they choose to tune, for them to confirm they’ve Saved — before calling /calibrate.
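The detector resolution described in Step B might look like the sketch below. It assumes the "detector" / "detector_type" key sits at the top level of the settings JSON, which this file does not confirm; `resolve_detector` and `calibrate_body` are hypothetical helper names.

```python
import json
from typing import Optional, Tuple

VALID_DETECTORS = ("resnet", "transformer")


def resolve_detector(settings: Optional[dict],
                     default: str = "resnet") -> Tuple[Optional[str], str]:
    """Return (detector, source) where source says how the value was chosen."""
    if settings is None:
        # No settings file at all: the user has to be asked.
        return None, "ask"
    for key in ("detector", "detector_type"):  # assumed top-level keys
        value = settings.get(key)
        if value in VALID_DETECTORS:
            return value, "settings-file"
    # File present but silent on the detector: fall back to the default,
    # which the user can still switch in the confirmation step.
    return default, "default"


def calibrate_body(detector: str) -> bytes:
    """JSON body for POST /v1/calibrate/<project_id>."""
    if detector not in VALID_DETECTORS:
        raise ValueError(f"unknown detector: {detector}")
    return json.dumps({"detector_type": detector}).encode("utf-8")
```

Returning the source alongside the value lets the confirmation summary say whether the detector came from the file, the default, or still needs to be asked.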

Step C — Poll for Completion

GET /v1/get_project_info/<project_id>

Poll every 10 s. project_info.project_state:

| State | Meaning |
| --- | --- |
| RUNNING | Calibration in progress |
| COMPLETED | Finished |
| ERROR | Failed; pull log via GET /v1/amc/calibrate/<id>/log |

When calibration starts, surface the project ID, the UI URL (http://<HOST_IP>:${VSS_AUTO_CALIBRATION_UI_PORT:-5000}), and the log endpoint so the user can watch progress while the run proceeds. During RUNNING, emit a progress line at least once a minute with elapsed time so a long run doesn’t look stalled. On ERROR, fetch and show the last lines of GET /v1/amc/calibrate/<id>/log before stopping. Live logs can also be streamed via GET /v1/calibrate/<project_id>/log/<type>/stream.

Typical time: 10–60 min (your own videos), 10–30 min (bundled sample).
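The polling behaviour in Step C (10 s interval, a progress line at least once a minute) can be sketched as follows. The function names are hypothetical; `fetch_state` stands in for whatever performs GET /v1/get_project_info/<project_id>, and the sleep, clock, and report hooks are injectable only to make the loop easy to test.

```python
import time
from typing import Callable


def extract_state(payload: dict) -> str:
    """Read project_info.project_state from a get_project_info response."""
    return payload["project_info"]["project_state"]


def poll_until_done(fetch_state: Callable[[], str],
                    interval_s: float = 10.0,
                    progress_every_s: float = 60.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic,
                    report: Callable[[str], None] = print) -> str:
    """Poll until COMPLETED or ERROR, reporting elapsed time periodically."""
    start = clock()
    last_report = start
    while True:
        state = fetch_state()
        if state in ("COMPLETED", "ERROR"):
            return state
        now = clock()
        if now - last_report >= progress_every_s:
            report(f"calibration still {state}: {int(now - start)} s elapsed")
            last_report = now
        sleep(interval_s)
```

The sketch has no overall deadline; a caller that wants one could stop after a chosen limit and pull the calibration log, as the troubleshooting entry for long RUNNING states suggests.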

Step D — Results

GET /v1/get_project_info/<project_id>                    # project state
GET /v1/result/<project_id>/evaluation_statistics        # only if GT uploaded
GET /v1/result/<project_id>/overlay_image                # visual overlay (PNG)
GET /v1/amc/calibrate/<project_id>/log                   # calibration log

Evaluation response includes Average L2 distance(m) and Average reprojection error 0(px). Evaluation metrics are produced only when a ground-truth GT.zip was uploaded — a missing evaluation_statistics result is normal otherwise and is not the end of result reporting.
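Because evaluation metrics are optional, result reporting should tolerate their absence. The sketch below assumes the evaluation response is a flat JSON object keyed by the two metric names quoted above; that shape and the `summarize_results` helper are assumptions, not confirmed API behaviour.

```python
from typing import List, Optional

METRIC_KEYS = ("Average L2 distance(m)", "Average reprojection error 0(px)")


def summarize_results(project_id: str, evaluation: Optional[dict]) -> List[str]:
    """Build report lines; pass evaluation=None when no GT.zip was uploaded."""
    lines = [f"project {project_id}: calibration COMPLETED"]
    if evaluation is None:
        # Missing metrics are normal without ground truth; keep reporting.
        lines.append("no evaluation metrics (no GT.zip uploaded)")
    else:
        for key in METRIC_KEYS:
            if key in evaluation:
                lines.append(f"{key}: {evaluation[key]}")
    # The overlay is always available for review, metrics or not.
    lines.append(f"overlay: GET /v1/result/{project_id}/overlay_image")
    return lines
```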

After COMPLETED, always give the user a way to review the result for that exact project, regardless of whether metrics exist:

Step E — VGGT Refinement

After the AMC run completes, always check vggt_state in project info. VGGT model staging is optional during setup and must not block the AMC result, but post-AMC handling follows the state:

POST /v1/vggt/calibrate/<project_id>
GET  /v1/get_project_info/<project_id>                    # poll vggt_state
GET  /v1/vggt_results/<project_id>/evaluation_statistics  # VGGT metrics

Settings File + Detector Pattern

Optional across all three modes. When the user provides a JSON settings file (typically exported from UI Step 3 Download), POST it verbatim:

POST /v1/config/<project_id>
Content-Type: application/json

<file contents, posted as-is>

The file replaces what the user would otherwise tune in UI Step 3 (rectification, bundle-adjustment, evaluation knobs, detector, …). After a successful POST, also parse the file for "detector" / "detector_type" — if it’s "resnet" or "transformer", use that value for the /calibrate call in Step B (detector is a separate API parameter, not consumed by /config).

Non-2xx is surfaced — do not silently fall back. Skip this call entirely if the user chose the UI-fallback path.
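One way to post the settings file verbatim is to send the raw bytes without parsing and re-serialising them. The sketch below uses the standard library; `base_url` and the helper names are assumptions.

```python
import urllib.request
from pathlib import Path


def build_config_request(base_url: str, project_id: str,
                         settings_path: str) -> urllib.request.Request:
    """Build POST /v1/config/<project_id> with the file bytes as the body."""
    body = Path(settings_path).read_bytes()  # posted as-is, no re-encoding
    url = f"{base_url.rstrip('/')}/v1/config/{project_id}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_config(base_url: str, project_id: str, settings_path: str,
                timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a failed
    POST surfaces to the caller instead of being silently ignored.
    """
    request = build_config_request(base_url, project_id, settings_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```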

UI Fallback Pattern

When alignment / layout files aren’t on disk, direct the user to the appropriate AMC UI step:

Wait for user confirmation. For alignment/layout, verify on disk before continuing:

# Project state lives under $VSS_APPS_DIR/services/auto-calibration/projects
# (the path bind-mounted into the MS container in
#  deploy/docker/services/auto-calibration/ms/compose.yml).
HOST_PROJECTS="${VSS_APPS_DIR}/services/auto-calibration/projects"

ls "$HOST_PROJECTS/project_<project_id>/manual_adjustment/"
# Expected: alignment_data.json, layout.png
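The same on-disk check can be scripted. This sketch assumes the projects directory layout shown above; `missing_manual_adjustment_files` is a hypothetical helper.

```python
from pathlib import Path

REQUIRED_FILES = ("alignment_data.json", "layout.png")


def missing_manual_adjustment_files(projects_root: str, project_id: str) -> list:
    """Return the required manual-adjustment files that are not on disk."""
    folder = Path(projects_root) / f"project_{project_id}" / "manual_adjustment"
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means both files are present; `projects_root` would be `${VSS_APPS_DIR}/services/auto-calibration/projects` expanded by the caller.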

Success Criteria

Key Output Files

Under ${VSS_APPS_DIR}/services/auto-calibration/projects/project_<project_id>/:

project_<project_id>/
├── manual_adjustment/
│   ├── alignment_data.json
│   └── layout.png
├── output/
│   ├── single_view_results/cam_XX/
│   │   ├── camInfo_hyper_XX.yaml
│   │   └── trajDump_Stream_0_3d.txt
│   ├── multi_view_results/BA_output/results_ba/
│   │   ├── initial/camInfo_XX.yaml
│   │   └── refined/camInfo_XX.yaml          # ← final calibration
│   └── multi_view_results/BA_output/results_ba_scaled_world/
│       └── overlay_img_XX.png               # ← visual overlay for review
└── calibration.log
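Given the tree above, the final calibration files and overlays could be collected with a short sketch like this one. The glob patterns follow the `camInfo_XX.yaml` and `overlay_img_XX.png` names in the tree; the helper names are hypothetical.

```python
from pathlib import Path


def final_calibration_files(project_dir: str) -> list:
    """Sorted refined camInfo_XX.yaml files (the final calibration)."""
    refined = (Path(project_dir) / "output" / "multi_view_results"
               / "BA_output" / "results_ba" / "refined")
    return sorted(refined.glob("camInfo_*.yaml"))


def overlay_images(project_dir: str) -> list:
    """Sorted overlay_img_XX.png files for visual review."""
    scaled = (Path(project_dir) / "output" / "multi_view_results"
              / "BA_output" / "results_ba_scaled_world")
    return sorted(scaled.glob("overlay_img_*.png"))
```

Both functions return an empty list when the run has not produced output yet.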

Cross-cutting Troubleshooting

Mode-specific issues live in each reference’s own troubleshooting table.

| Issue | Fix |
| --- | --- |
| verify_project state not READY | Confirm videos uploaded/ingested and alignment + layout are present (either via API or via UI manual alignment). Mode-specific upload steps in the reference. |
| Manual alignment files missing after UI step | User didn’t click Save; also verify ${VSS_APPS_DIR}/services/auto-calibration/projects/project_<id>/manual_adjustment/ exists. |
| Calibration stuck RUNNING > 90 min | GET /v1/amc/calibrate/<id>/log; usually insufficient tracklets (scene too static). See “Custom Dataset” guidelines in root README.md. |
| Immediate ERROR state | Check video naming: must be cam_00.mp4, cam_01.mp4, … contiguous (videos mode) / camera_name labels (RTSP mode). |
| Low L2 but high reprojection | Provide explicit focal_length override during input upload (see videos / rtsp references). |
| VGGT INIT, never READY | VGGT model not loaded; see references/deploy-auto-calibration-service.md Step 2. |
| Upload timeout | Large videos: bump timeout=300 to e.g. 600 in the per-mode Python script. |
| Port scan finds no backend | Backend not running; walk references/deploy-auto-calibration-service.md first. |
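The video-naming requirement behind the “Immediate ERROR state” entry can be checked before upload. This sketch assumes two-digit indices starting at 00, as in the cam_00.mp4, cam_01.mp4 examples; `check_video_names` is a hypothetical helper.

```python
import re

# Assumption: indices are exactly two digits, matching cam_00.mp4, cam_01.mp4.
CAM_NAME = re.compile(r"^cam_(\d{2})\.mp4$")


def check_video_names(names: list) -> list:
    """Return a list of naming problems; an empty list means the set is valid."""
    problems = [f"unexpected name: {n}" for n in names if not CAM_NAME.match(n)]
    indices = {int(m.group(1)) for m in map(CAM_NAME.match, names) if m}
    if not indices:
        problems.append("no cam_XX.mp4 videos found")
        return problems
    # Contiguity: every index from 0 up to the highest one must be present.
    missing = sorted(set(range(max(indices) + 1)) - indices)
    problems += [f"missing cam_{i:02d}.mp4" for i in missing]
    return problems
```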

For Downstream Skills — MV3DT Export

Downstream consumers (e.g. a Multi-View 3D Tracking skill owned by another team) fetch the MV3DT-format calibration output directly from the microservice. This skill returns the project_id; the downstream skill calls:

GET /v1/result/{project_id}/mv3dt_result?result_type=amc
# Response: application/zip — mv3dt_output.zip containing transforms.yml

For VGGT-refined output (only available if VGGT ran to COMPLETED, see Step E):

GET /v1/result/{project_id}/mv3dt_result?result_type=vggt
# Response: application/zip — vggt_mv3dt_output.zip

Downstream skill flow:

  1. Call this skill with the user’s inputs; capture the printed project_id.
  2. Wait for the skill to return (it polls until COMPLETED internally).
  3. GET /v1/result/{project_id}/mv3dt_result?result_type=amc — save the ZIP locally.
  4. If VGGT also ran, optionally fetch ?result_type=vggt for the refined MV3DT.
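A downstream skill might implement steps 3 and 4 along these lines. The sketch covers only URL construction and reading transforms.yml from the downloaded ZIP bytes; `base_url` and the helper names are assumptions, and the HTTP download itself is left to the caller.

```python
import io
import zipfile
from urllib.parse import urlencode


def mv3dt_url(base_url: str, project_id: str, result_type: str = "amc") -> str:
    """URL for GET /v1/result/{project_id}/mv3dt_result."""
    if result_type not in ("amc", "vggt"):
        raise ValueError(f"unsupported result_type: {result_type}")
    query = urlencode({"result_type": result_type})
    return f"{base_url.rstrip('/')}/v1/result/{project_id}/mv3dt_result?{query}"


def read_transforms(zip_bytes: bytes) -> str:
    """Return the text of transforms.yml from a downloaded MV3DT ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("transforms.yml"):
                return archive.read(name).decode("utf-8")
    raise FileNotFoundError("transforms.yml not found in archive")
```

This file only states that the AMC ZIP contains transforms.yml; whether the VGGT ZIP uses the same file name is not confirmed here.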

Root README.md “Custom Dataset” and “Calibration Workflow (UI)” sections document input-video guidelines and the UI-driven alternative to this API flow.


Skill frontmatter

license: Apache-2.0
metadata:
  version: "3.2.0"
  github-url: "https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization"
  tags: "nvidia blueprint operational"