Agent Skill · NVIDIA NIM

omniverse-usd-performance-tuning

Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.

Provider: NVIDIA NIM
Path in repo: skills/omniverse-usd-performance-tuning/SKILL.md

Skill body

Omniverse USD Performance Tuning

When to Use

Use this workflow for broad performance asks such as slow loading, high memory, low FPS, GPU crashes, conversion-quality triage, or generic requests to optimize a USD scene.

Instructions

  1. Start from the mandatory runtime context gate before producing tuning output, unless the prompt is only asking for a static classification test.
  2. Classify broad optimization requests as ready_to_plan; reserve approval_required for prompts that explicitly name a destructive operation to execute before planning.
  3. Plan the full canonical chain through optimization-report, preserving the structured milestone order and the profile-stage:baseline / profile-stage:after labels when listing milestones. For broad optimization, default to 3 scoped iterations unless the user opts out, asks for a quick pass, or stop criteria apply.
  4. Invoke downstream skill bodies only when their phase is reached, and keep raw runtime artifacts on disk while reading compact summaries.

Frontmatter keeps version and tools at top level for agentskills.io runtime compatibility. NVCARPS discoverability fields live under metadata.

Output Format

Return a plan or status summary that names the selected entry skill, uses ready_to_plan for generic optimization requests, includes the full milestone chain through optimization-report, and labels profile phases as profile-stage:baseline and profile-stage:after.

For structured outputs, the broad-optimization milestone subsequence is:

omniverse-usd-performance-tuning -> profile-stage:baseline -> usd-structure-assessment -> usd-validation-runner -> restructure-decision -> apply-restructure -> so-run-validators -> so-interpret-validators -> so-run-operations -> profile-stage:after -> compare-profiles -> optimization-report

End-to-end execution should produce an optimized stage when mutation runs and a report conforming to the optimization-report reference’s schema (scripts/optimization-report.schema.json within that reference). Broad optimization should plan 3 scoped iterations by default; each iteration writes an interim report or update, and later passes reuse prior evidence instead of restarting the full workflow.
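The milestone order and the default iteration count described above can be sketched in Python. This is a minimal illustration, not the skill's implementation: the function names, the shape of the plan dictionary, and the choice of a single iteration for a quick pass are all assumptions. Only the milestone names, the ready_to_plan status, and the default of 3 come from the text.

```python
from typing import Sequence

# Milestone order copied from the Output Format text.
MILESTONES = (
    "omniverse-usd-performance-tuning",
    "profile-stage:baseline",
    "usd-structure-assessment",
    "usd-validation-runner",
    "restructure-decision",
    "apply-restructure",
    "so-run-validators",
    "so-interpret-validators",
    "so-run-operations",
    "profile-stage:after",
    "compare-profiles",
    "optimization-report",
)

DEFAULT_ITERATIONS = 3  # default for broad optimization, per the text


def is_ordered_subsequence(candidate: Sequence[str],
                           canonical: Sequence[str] = MILESTONES) -> bool:
    """Return True if `candidate` lists milestones in canonical order (gaps allowed)."""
    remaining = iter(canonical)
    # `step in remaining` consumes the iterator, so a step that appears
    # earlier in the canonical chain cannot match after a later one.
    return all(step in remaining for step in candidate)


def build_plan(quick_pass: bool = False) -> dict:
    """Hypothetical plan shape; the real report schema lives in the optimization-report reference."""
    return {
        "entry_skill": MILESTONES[0],
        "status": "ready_to_plan",
        "milestones": list(MILESTONES),
        # Assumption: a quick pass is modeled here as a single iteration.
        "iterations": 1 if quick_pass else DEFAULT_ITERATIONS,
    }
```

A check such as `is_ordered_subsequence(plan["milestones"])` could run before a plan is returned, so a reordered or mislabeled profile phase is caught early.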


Entry skill rule

This skill is the named entry point for broad performance work whenever the agent has any verified way to do that work. Runtime probing details live in setup-usd-performance-tuning; this rule only decides which skill owns the user-facing performance request.

The decision is about ownership, not order. Setup, authentication, and triage all run in their normal phase order; this rule only fixes which skill the agent names as the entry skill in its response.

Runtime context — session-start gate (mandatory)

Before any other tuning output, follow the mandatory session-start gate in skills/omniverse-usd-performance-tuning/references/setup-usd-performance-tuning/references/runtime-context-header.md. That reference owns output_path, the canonical setup-preflight.json location, Format A/Format B, and the “do not improvise a silent probe” anti-pattern.

Required outcomes:

[Kit: {runtime_context.kit.application} {runtime_context.kit.version}  |  SO: {runtime_context.sceneOptimizer.version}  |  AV: {runtime_context.assetValidator.version}]
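As a rough sketch, the header template above could be filled from a parsed preflight document as follows. The nesting of the dictionary is inferred from the placeholder names only; the actual layout of setup-preflight.json is owned by the runtime-context-header reference, and the function name and sample values are hypothetical.

```python
def render_runtime_header(runtime_context: dict) -> str:
    """Fill the one-line runtime header from a runtime_context mapping.

    Assumes keys mirror the template placeholders:
    kit.application, kit.version, sceneOptimizer.version, assetValidator.version.
    """
    kit = runtime_context["kit"]
    return (
        f"[Kit: {kit['application']} {kit['version']}  |  "
        f"SO: {runtime_context['sceneOptimizer']['version']}  |  "
        f"AV: {runtime_context['assetValidator']['version']}]"
    )


# Placeholder values for illustration only; not real product versions.
sample_context = {
    "kit": {"application": "example-app", "version": "1.2.3"},
    "sceneOptimizer": {"version": "4.5.6"},
    "assetValidator": {"version": "7.8.9"},
}
header = render_runtime_header(sample_context)
```

A missing key raises KeyError here, which keeps a partially probed runtime from producing a header that looks complete.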

Runtime artifact token budget

Before reading Kit logs, Asset Validator CSVs, Scene Optimizer logs, Tracy CSVs, or other runtime output, follow references/runtime-artifact-token-budget.md. Keep raw artifacts on disk, read summary JSON first, and use bounded log snapshots instead of full dumps or live streams.
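One way to picture "summary first, bounded snapshot otherwise" is the sketch below. The function names and the line and character limits are illustrative assumptions; the actual budget is defined in references/runtime-artifact-token-budget.md.

```python
import json
from collections import deque
from pathlib import Path


def bounded_snapshot(log_path, max_lines: int = 50, max_chars: int = 4000) -> str:
    """Return at most the last `max_lines` lines of a log, capped at `max_chars`."""
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the tail while streaming the file,
        # so the full log is never held in memory.
        tail = deque(handle, maxlen=max_lines)
    return "".join(tail)[-max_chars:]


def read_artifact(summary_path, log_path) -> dict:
    """Prefer the compact summary JSON; fall back to a bounded log snapshot."""
    summary = Path(summary_path)
    if summary.is_file():
        data = json.loads(summary.read_text(encoding="utf-8"))
        return {"source": "summary", "data": data}
    return {"source": "snapshot", "data": bounded_snapshot(log_path)}
```

The raw log stays on disk in both branches; only the summary or the bounded tail is read into the conversation.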

Plan-time vs execution-time approval

approval_required at planning time is reserved for requests that explicitly name a destructive operation. Use the following rule when deciding between ready_to_plan and approval_required:

The distinction is between authorizing a plan and authorizing a destructive action. A general optimization request authorizes planning; it does not authorize execution of specific destructive operations.
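The distinction can be illustrated with a deliberately crude classifier. Keyword matching is only a stand-in for whatever rule the skill actually applies, and the phrase list and function name below are hypothetical; only the two status values come from the text.

```python
# Hypothetical phrases that explicitly name a destructive operation.
DESTRUCTIVE_PHRASES = ("overwrite", "delete", "in place", "in-place")


def classify_request(prompt: str) -> str:
    """Return 'approval_required' only when the prompt names a destructive operation."""
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in DESTRUCTIVE_PHRASES):
        return "approval_required"
    # Broad or generic optimization requests authorize planning only.
    return "ready_to_plan"
```

For example, `classify_request("optimize my scene")` yields ready_to_plan, while a prompt asking to overwrite the source stage in place yields approval_required. Execution-time approval for individual destructive steps is a separate decision that this sketch does not model.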

For structured runtime-test responses and similar planning summaries:

Output expectation

End-to-end optimization work should produce both an optimized USD stage, when mutation is executed, and a structured optimization report conforming to the optimization-report reference’s scripts/optimization-report.schema.json. The HTML report must be rendered from references/report-templates/optimization-report.html.template via render_preview.py — never hand-write HTML. Diagnosis-only work should still end with a report or summary that states no optimized stage was written.

Purpose

Route digital twin USD performance requests into the right diagnostic and optimization workflow while preserving evidence before mutation.

Prerequisites

Examples

Triage order

  1. Runtime gate. Follow the mandatory session-start gate above before validation, profiling, or optimization. Do not scan, probe, install, or pick Kit/standalone runtimes directly in this skill; setup-usd-performance-tuning owns probe/chooser/install dispatch and writes the preflight consumed here.

  2. Identify the target problem:
    • Load time.
    • FPS or interactivity.
    • GPU or system memory.
    • Crash or device lost.
    • CAD conversion quality.
    • Validation failure.
  3. Gather minimum context:
    • Stage path and size.
    • Whether the stage is local, mounted, or omniverse:// remote. For remote assets, route through omniverse-authentication before first open.
    • Kit or USD runtime.
    • Whether the workload is CAD, VFI, AIF, Isaac, or generic OpenUSD.
    • Whether in-place mutation is allowed.
    • Whether the user wants diagnosis only or processor execution.
  4. Route:
    • USD composition questions: usd-structure-assessment (composition is now part of the structure-assessment (SA) umbrella; deeper detail in skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/composition-audit.md).
    • Validation and content issues: usd-validation-runner (master router; routes to validate-* family or so-run-validators based on intent).
    • Edit/output decisions: usd-edit-target-planner (also owns variant/payload gates).
    • Repeated copied hierarchy or high mesh count with no instancing: usd-hierarchy-dedupe-candidates.
    • Restructure decision (monolithic stage, asset boundary materialization): restructure-decision.
    • CAD converter settings: read references/cad-conversion/README.md (niche pre-USD concern; see reference for details).
    • Scene Optimizer: so-run-validators, so-interpret-validators, so-run-operations.
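The routing step above amounts to a lookup table plus one precondition for remote stages. The sketch below is illustrative only: the concern keys and function names are hypothetical labels, while the skill and reference names are the ones listed in the triage order.

```python
# Keys are hypothetical labels; values are the skills or references named in the route list.
ROUTES = {
    "composition": ("usd-structure-assessment",),
    "validation": ("usd-validation-runner",),
    "edit_target": ("usd-edit-target-planner",),
    "duplicate_hierarchy": ("usd-hierarchy-dedupe-candidates",),
    "restructure": ("restructure-decision",),
    "cad_conversion": ("references/cad-conversion/README.md",),
    "scene_optimizer": (
        "so-run-validators",
        "so-interpret-validators",
        "so-run-operations",
    ),
}


def needs_authentication(stage_path: str) -> bool:
    """Remote omniverse:// stages go through omniverse-authentication before first open."""
    return stage_path.startswith("omniverse://")


def plan_route(concern: str, stage_path: str) -> list:
    """Return the ordered list of skills or references to consult for a concern."""
    steps = list(ROUTES[concern])  # raises KeyError for an unknown concern
    if needs_authentication(stage_path):
        steps.insert(0, "omniverse-authentication")
    return steps
```

Raising on an unknown concern, rather than guessing a default route, matches the intent of identifying the target problem before routing.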

Optimization ordering

Follow the canonical ordering in workflow.md § Operation ordering invariants. The high-level rule: prototypes first → per-asset validation → stage-level operations last. The workflow reference owns the full invariant list (meshCleanup before decimateMeshes, deduplication before decimation, never merge if instanced, etc.) and the analysis-only ops catalogue.
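A small checker can make the three invariants named above concrete. This is a sketch under stated assumptions: meshCleanup and decimateMeshes appear in the text, but deduplicateGeometry and mergeMeshes are hypothetical operation names chosen for illustration, and the full invariant list remains in workflow.md.

```python
# (earlier, later) pairs: `earlier` must run before `later` when both are present.
MUST_PRECEDE = (
    ("meshCleanup", "decimateMeshes"),
    ("deduplicateGeometry", "decimateMeshes"),  # hypothetical name for the deduplication op
)
MERGE_OP = "mergeMeshes"  # hypothetical name for the merge op


def ordering_violations(operations, instanced: bool = False) -> list:
    """Return human-readable violations for an ordered list of operation names."""
    # If an operation repeats, its last position is the one checked.
    position = {name: index for index, name in enumerate(operations)}
    problems = []
    for earlier, later in MUST_PRECEDE:
        if earlier in position and later in position and position[earlier] > position[later]:
            problems.append(f"{earlier} must run before {later}")
    if instanced and MERGE_OP in position:
        problems.append(f"{MERGE_OP} is not allowed on instanced content")
    return problems
```

An empty result means the proposed operation list passes these three checks; it says nothing about the other invariants the workflow reference owns.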

Rules

Limitations

Troubleshooting

References

Before routing, read:

If you have network access, prefer the live URLs (noted in each reference file) for the most current version.

Required execution flow

Read references/workflow.md for the canonical Phase 0-7 flow, including Kit/standalone branches, validator-stack routing, operation ordering, termination conditions, duration hints, and the default three-pass scoped iteration pattern. The compact root map at references/skill-map.md only routes agents into this workflow.

Do not treat downstream phase names as plain checklist labels. Before executing each step, load that phase’s nested README.md reference and follow its instructions. Claude Code only exposes the public catalog skill; it does not recursively inject profile-stage, usd-structure-assessment, or other nested references.

The final deliverable must come from optimization-report: save both the structured JSON report and the generated Markdown summary. Do not substitute an ad hoc SUMMARY.md or chat-only recap for the optimization report.

For deeper subtopic guidance, consult the references:

For full Kit runtime profiling (FPS, frame time, Hydra/RTX metrics), refer to the external profiling skills at NVIDIA/omniperf.

Skill frontmatter

version: 0.1.0
license: Apache-2.0
tools:
  - Read
  - Shell
  - Write
compatibility: Orchestrator skill. Downstream phases may require Kit, Scene Optimizer, Asset Validator, USD Python, writable output paths, and omniverse:// authentication selected by setup-usd-performance-tuning.
metadata:
  author: NVIDIA Omniverse
  tags: [triage, performance, usd, profiling]
  domain: ai-ml
  languages: [python]