omniverse-usd-performance-tuning
Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.
Skill body
Omniverse USD Performance Tuning
When to Use
Use this workflow for broad performance asks such as slow loading, high memory, low FPS, GPU crashes, conversion-quality triage, or generic requests to optimize a USD scene.
Instructions
- Start from the mandatory runtime context gate before producing tuning output, unless the prompt is only asking for a static classification test.
- Classify broad optimization requests as ready_to_plan; reserve approval_required for prompts that explicitly name a destructive operation to execute before planning.
- Plan the full canonical chain through optimization-report, preserving the structured milestone order and the profile-stage:baseline / profile-stage:after labels when listing milestones. For broad optimization, default to 3 scoped iterations unless the user opts out, asks for a quick pass, or stop criteria apply.
- Invoke downstream skill bodies only when their phase is reached, and keep raw runtime artifacts on disk while reading compact summaries.
Frontmatter keeps version and tools at top level for agentskills.io runtime
compatibility. NVCARPS discoverability fields live under metadata.
Output Format
Return a plan or status summary that names the selected entry skill, uses ready_to_plan for generic optimization requests, includes the full milestone chain through optimization-report, and labels profile phases as profile-stage:baseline and profile-stage:after. For structured outputs, the broad-optimization milestone subsequence is omniverse-usd-performance-tuning -> profile-stage:baseline -> usd-structure-assessment -> usd-validation-runner -> restructure-decision -> apply-restructure -> so-run-validators -> so-interpret-validators -> so-run-operations -> profile-stage:after -> compare-profiles -> optimization-report. End-to-end execution should produce an optimized stage when mutation runs and a report conforming to the optimization-report reference’s schema (scripts/optimization-report.schema.json within that reference). Broad optimization should plan 3 scoped iterations by default; each iteration writes an interim report/update and later passes reuse prior evidence instead of restarting the full workflow.
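The ordering requirement above can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the constant and function names are hypothetical, and only the milestone labels are taken from the text above. It tests whether a proposed milestone list keeps the canonical chain as an ordered subsequence.

```python
from typing import Iterable, Sequence

# Broad-optimization milestone subsequence, copied from the Output Format text.
CANONICAL_CHAIN = (
    "omniverse-usd-performance-tuning",
    "profile-stage:baseline",
    "usd-structure-assessment",
    "usd-validation-runner",
    "restructure-decision",
    "apply-restructure",
    "so-run-validators",
    "so-interpret-validators",
    "so-run-operations",
    "profile-stage:after",
    "compare-profiles",
    "optimization-report",
)


def preserves_chain(milestones: Iterable[str],
                    chain: Sequence[str] = CANONICAL_CHAIN) -> bool:
    """Return True if every step of `chain` appears in `milestones` in order.

    Extra milestones may be interleaved, but canonical steps may not be
    dropped or reordered.
    """
    remaining = iter(milestones)
    # `step in remaining` consumes the iterator up to the first match, so each
    # later step has to be found after the previous one.
    return all(step in remaining for step in chain)
```

A plan that inserts an extra analysis step between canonical milestones still passes; a plan that lists so-run-validators ahead of restructure-decision does not.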
Entry skill rule
This skill is the named entry point for broad performance work whenever the
agent has any verified way to do that work. Runtime probing details live in
setup-usd-performance-tuning; this rule only decides which skill owns the
user-facing performance request.
- If the setup probe shows any verified runtime path (Kit, standalone, or even a partial stack such as Asset Validator only), enter here. If the user’s requested tool is missing, return the specific blocked_code (blocked_missing_scene_optimizer, blocked_missing_so_operation, etc.) instead of substituting another workflow.
- Enter at setup-usd-performance-tuning only when no runtime path is verified and runtime choice/setup is the first unresolved problem.
- For omniverse:// assets, enter at omniverse-authentication first. Authentication precedes setup and triage for remote assets.
The decision is about ownership, not order. Setup, authentication, and triage all run in their normal phase order; this rule only fixes which skill the agent names as the entry skill in its response.
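As an illustration of the entry skill rule, here is a small sketch. The function name and its arguments are hypothetical, and checking the remote-asset case before the runtime case is an assumption, since the rule does not state a precedence between the two.

```python
from typing import Collection


def select_entry_skill(stage_path: str, verified_runtimes: Collection[str]) -> str:
    """Name the entry skill for a broad performance request.

    `verified_runtimes` holds whatever runtime paths the setup probe verified,
    for example {"kit"}, {"standalone"}, or {"asset-validator"} (labels are
    illustrative, not a documented vocabulary).
    """
    # Assumption: remote assets are checked first, because authentication
    # precedes setup and triage for them.
    if stage_path.startswith("omniverse://"):
        return "omniverse-authentication"
    # No verified runtime path at all: setup is the first unresolved problem.
    if not verified_runtimes:
        return "setup-usd-performance-tuning"
    # Any verified path, even a partial stack, enters at the workflow skill.
    return "omniverse-usd-performance-tuning"
```

The function only names the owner of the request; it does not change the order in which setup, authentication, and triage run.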
Runtime context — session-start gate (mandatory)
Before any other tuning output, follow the mandatory session-start gate in
skills/omniverse-usd-performance-tuning/references/setup-usd-performance-tuning/references/runtime-context-header.md.
That reference owns output_path, the canonical setup-preflight.json
location, Format A/Format B, and the “do not improvise a silent probe”
anti-pattern.
Required outcomes:
- Missing or unreadable preflight: invoke setup-usd-performance-tuning.
- Present preflight: print Format A and wait for the user to choose Continue, Change Kit, Switch to standalone, or Re-run probe.
- Confirmed runtime in the same session: use compact Format B for follow-up status.
[Kit: {runtime_context.kit.application} {runtime_context.kit.version} | SO: {runtime_context.sceneOptimizer.version} | AV: {runtime_context.assetValidator.version}]
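The compact status line can be produced from the preflight data with a short helper. This sketch assumes the runtime context is a nested mapping that mirrors the placeholder paths shown above; the authoritative layout is defined in runtime-context-header.md, and the function name is hypothetical.

```python
from typing import Any, Mapping


def format_b(runtime_context: Mapping[str, Any]) -> str:
    """Render the compact follow-up status line from a runtime context.

    Assumes keys named exactly as in the placeholders:
    kit.application, kit.version, sceneOptimizer.version, assetValidator.version.
    """
    kit = runtime_context["kit"]
    return (
        f"[Kit: {kit['application']} {kit['version']}"
        f" | SO: {runtime_context['sceneOptimizer']['version']}"
        f" | AV: {runtime_context['assetValidator']['version']}]"
    )
```

A missing key raises KeyError, which fits the rule that a missing or unreadable preflight should send the agent back to setup-usd-performance-tuning instead of printing a partial line.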
Runtime artifact token budget
Before reading Kit logs, Asset Validator CSVs, Scene Optimizer logs, Tracy CSVs,
or other runtime output, follow
references/runtime-artifact-token-budget.md. Keep raw artifacts on disk, read
summary JSON first, and use bounded log snapshots instead of full dumps or live
streams.
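One way to take a bounded log snapshot is to keep only the last few lines of a file while streaming through it. The sketch below is illustrative: the function name and the default of 50 lines are assumptions, and the actual limits come from runtime-artifact-token-budget.md.

```python
from collections import deque
from pathlib import Path


def tail_snapshot(log_path: Path, max_lines: int = 50) -> list[str]:
    """Return at most `max_lines` trailing lines of a log file.

    The file is streamed line by line, so memory use is bounded by
    `max_lines` rather than by the size of the log.
    """
    with log_path.open("r", encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen discards older lines as newer ones arrive.
        return [line.rstrip("\n") for line in deque(handle, maxlen=max_lines)]
```

The raw log stays on disk untouched; only the snapshot is read into the conversation.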
Plan-time vs execution-time approval
approval_required at planning time is reserved for requests that explicitly name a destructive operation. Use the following rule when deciding between ready_to_plan and approval_required:
- approval_required at planning time: the user’s request itself names a destructive operation: “flatten this stage”, “decimate the meshes”, “merge prototypes”, “delete unused prims”, or any specific named mutation that cannot be undone within the same workflow. In this case the agent’s first response must be an approval prompt that names the operation, before the agent commits to a plan that executes it.
- ready_to_plan at planning time: the user’s request is general: “optimize this scene”, “make it load faster”, “reduce GPU memory”, “improve interactivity”. The agent lays out the full plan, including any destructive operations the plan would invoke (for example so-run-operations with mergeMaterials), without withholding the plan itself. Approval for each destructive operation is requested alongside plan approval.
The distinction is between authorising a plan and authorising a destructive action. A general optimisation request authorises planning; it does not authorise execution of specific destructive operations.
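The plan-time rule can be sketched as a simple classifier. This is a deliberately crude stand-in for the agent's judgment: it only matches the example phrases quoted above, and the names DESTRUCTIVE_PHRASES and plan_time_decision are hypothetical.

```python
# Example phrases quoted in the rule; a real request may name other mutations.
DESTRUCTIVE_PHRASES = (
    "flatten this stage",
    "decimate the meshes",
    "merge prototypes",
    "delete unused prims",
)


def plan_time_decision(request: str) -> str:
    """Return the plan-time decision label for a user request.

    Substring matching is an assumption made for illustration; it does not
    recognise paraphrases of a destructive operation.
    """
    text = request.lower()
    if any(phrase in text for phrase in DESTRUCTIVE_PHRASES):
        return "approval_required"
    return "ready_to_plan"
```

A general request such as "optimize this scene" yields ready_to_plan even though the resulting plan may list destructive operations that still need their own approval before execution.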
For structured runtime-test responses and similar planning summaries:
- A future restructure-decision prompt is a planned user-decision gate, not a reason to set the top-level response decision to approval_required for a generic optimization request.
- For a generic optimization request, set decision: "ready_to_plan" and include the full intended chain in both committed_milestones and planned_phases, through optimization-report.
- It is valid for gates_observed to include asks_user_for_restructure_decision while the top-level decision remains ready_to_plan.
- Whenever a chain names profile phases, use the exact labels profile-stage:baseline and profile-stage:after; do not emit the ambiguous bare profile-stage token.
- Start structured milestone lists with omniverse-usd-performance-tuning as the owning entry skill. Include setup-usd-performance-tuning only as additional Phase 0 context, not as a replacement for the entry skill milestone.
- For broad optimization requests, preserve the milestone subsequence from Output Format above exactly, with optional extra analysis steps inserted only where they do not reorder it.
- Do not list so-run-validators or so-interpret-validators before restructure-decision in broad optimization milestone summaries. Phase-aware validator routing still happens through usd-validation-runner; the SO validator executor/interpreter milestones appear after the restructure decision path in the structured plan contract.
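Several of the rules above lend themselves to a lint pass over a structured response. The following sketch assumes the response is a plain dict using the field names mentioned above (decision, committed_milestones, planned_phases); the helper names are hypothetical and the checks cover only a subset of the rules.

```python
ENTRY_SKILL = "omniverse-usd-performance-tuning"
SO_VALIDATOR_STEPS = ("so-run-validators", "so-interpret-validators")


def check_milestones(milestones: list[str]) -> list[str]:
    """Return rule violations found in one milestone list."""
    problems: list[str] = []
    if not milestones or milestones[0] != ENTRY_SKILL:
        problems.append(f"first milestone must be {ENTRY_SKILL}")
    if "profile-stage" in milestones:
        problems.append("bare profile-stage token; use :baseline or :after")
    if "restructure-decision" in milestones:
        gate = milestones.index("restructure-decision")
        for step in SO_VALIDATOR_STEPS:
            if step in milestones and milestones.index(step) < gate:
                problems.append(f"{step} listed before restructure-decision")
    return problems


def check_response(response: dict) -> list[str]:
    """Lint a structured response to a generic optimization request."""
    problems: list[str] = []
    if response.get("decision") != "ready_to_plan":
        problems.append("decision must be ready_to_plan")
    for key in ("committed_milestones", "planned_phases"):
        for problem in check_milestones(response.get(key, [])):
            problems.append(f"{key}: {problem}")
    return problems
```

An empty result means none of the checked rules were violated; it does not prove that the full milestone subsequence is present.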
Output expectation
End-to-end optimization work should produce both an optimized USD stage, when
mutation is executed, and a structured optimization report conforming to
the optimization-report reference’s scripts/optimization-report.schema.json. The HTML report must be rendered
from references/report-templates/optimization-report.html.template via
render_preview.py — never hand-write HTML. Diagnosis-only work should still
end with a report or summary that states no optimized stage was written.
Purpose
Route digital twin USD performance requests into the right diagnostic and optimization workflow while preserving evidence before mutation.
Prerequisites
- Stage path or enough context to identify the target asset.
- User goal: diagnosis only, validation, profiling, or processor execution.
- Runtime availability status from setup-usd-performance-tuning when not already known.
- Permission status for in-place mutation vs writing a separate optimized output.
Examples
- “This USD loads slowly; triage what to check first.”
- “Route a low-FPS CAD scene through the performance workflow.”
Triage order
- Runtime gate. Follow the mandatory session-start gate above before validation, profiling, or optimization. Do not scan, probe, install, or pick Kit/standalone runtimes directly in this skill; setup-usd-performance-tuning owns probe/chooser/install dispatch and writes the preflight consumed here.
- Identify the target problem:
  - Load time.
  - FPS or interactivity.
  - GPU or system memory.
  - Crash or device lost.
  - CAD conversion quality.
  - Validation failure.
- Gather minimum context:
  - Stage path and size.
  - Whether the stage is local, mounted, or omniverse:// remote. For remote assets, route through omniverse-authentication before first open.
  - Kit or USD runtime.
  - Whether the workload is CAD, VFI, AIF, Isaac, or generic OpenUSD.
  - Whether in-place mutation is allowed.
  - Whether the user wants diagnosis only or processor execution.
- Route:
  - USD composition questions: usd-structure-assessment (composition is now part of the SA umbrella; deeper detail in skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/composition-audit.md).
  - Validation and content issues: usd-validation-runner (master router; routes to validate-* family or so-run-validators based on intent).
  - Edit/output decisions: usd-edit-target-planner (also owns variant/payload gates).
  - Repeated copied hierarchy or high mesh count with no instancing: usd-hierarchy-dedupe-candidates.
  - Restructure decision (monolithic stage, asset boundary materialization): restructure-decision.
  - CAD converter settings: read references/cad-conversion/README.md (niche pre-USD concern; see reference for details).
  - Scene Optimizer: so-run-validators, so-interpret-validators, so-run-operations.
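The routing step of the triage order amounts to a lookup table. In this sketch the dictionary keys and the function name are hypothetical labels invented for illustration; the targets on the right-hand side are the skills and reference named in the routing list.

```python
# Hypothetical concern labels mapped to the routing targets named in the list.
ROUTES: dict[str, tuple[str, ...]] = {
    "composition": ("usd-structure-assessment",),
    "validation": ("usd-validation-runner",),
    "edit_output": ("usd-edit-target-planner",),
    "repeated_hierarchy": ("usd-hierarchy-dedupe-candidates",),
    "restructure": ("restructure-decision",),
    "cad_converter": ("references/cad-conversion/README.md",),
    "scene_optimizer": (
        "so-run-validators",
        "so-interpret-validators",
        "so-run-operations",
    ),
}


def route(concern: str, stage_path: str) -> tuple[str, ...]:
    """Return the ordered routing targets for a concern.

    Remote omniverse:// stages get omniverse-authentication prepended,
    since authentication has to happen before the first open.
    """
    targets = ROUTES[concern]  # raises KeyError for an unknown concern
    if stage_path.startswith("omniverse://"):
        return ("omniverse-authentication",) + targets
    return targets
```

The table does not replace the runtime gate or the context-gathering steps; it only covers the final routing decision.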
Optimization ordering
Follow the canonical ordering in workflow.md § Operation ordering invariants. The high-level rule: prototypes first → per-asset validation → stage-level operations last. The workflow reference owns the full invariant list (meshCleanup before decimateMeshes, deduplication before decimation, never merge if instanced, etc.) and the analysis-only ops catalogue.
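A planned operation list can be checked against pairwise ordering invariants such as the ones named above. This sketch encodes only two of them; treating deduplicateGeometry as the op behind "deduplication before decimation" is an assumption, and the full invariant list remains in workflow.md. The names MUST_PRECEDE and ordering_violations are hypothetical.

```python
# (earlier_op, later_op) pairs. The first pair comes straight from the text;
# the second assumes deduplicateGeometry is the deduplication op in question.
MUST_PRECEDE = (
    ("meshCleanup", "decimateMeshes"),
    ("deduplicateGeometry", "decimateMeshes"),
)


def ordering_violations(ops: list[str]) -> list[tuple[str, str]]:
    """Return the invariant pairs that a planned op order violates.

    Assumes each op appears at most once in `ops`; if an op repeats, only
    its last position is considered.
    """
    position = {op: index for index, op in enumerate(ops)}
    return [
        (first, later)
        for first, later in MUST_PRECEDE
        if first in position
        and later in position
        and position[first] > position[later]
    ]
```

A pair is only checked when both ops are present, so a plan that skips decimation entirely reports no violations.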
Rules
- Always run composition audit before mutation.
- Always validate before and after processor execution.
- Optimize prototypes before per-asset validation.
- Do not run whole-stage mesh deduplication on very large CAD scenes before checking for hierarchy-level reuse.
- Do not recommend a fixed optimization stack without bottleneck evidence.
- Do not invent numeric thresholds or expected percentage wins.
- Prefer canonical SO ops over specialty / documentary ones. The op curation in references/operations/_curation.json classifies every op as canonical, specialty, analysis, documentary, or deprecated. When more than one op could resolve the same finding, recommend the canonical one first and only reach for a specialty op when the user explicitly asks or the rationale warrants it. Specifically:
  - For vertex welding, prefer canonical meshCleanup with explicit flags over the standalone mergeVertices op. The standalone op is a legacy/specialty surface; use upstream usd-optimize for the operation mechanics and local approval policy before mutating.
  - For hierarchy dedupe, recommend usd-hierarchy-dedupe-candidates + apply-restructure (the USD-authored rewrite path).
  - For per-mesh dedupe, recommend deduplicateGeometry (canonical) over findCoincidingGeometry (analysis; produces a report, not a change).
  - Do not recommend documentary-status ops (e.g., boxClip, deletePrims, removeAttributes, removeUntypedPrims, merge outside its narrow non-instanced case) without an explicit user request. Documentary ops survive in the per-op references/operations/<key>.md routing stubs for completeness but are excluded from agent-initiated recommendations.
  - Specialty ≠ documentary. Ops classified as specialty in _curation.json either (a) have validator-finding evidence that wires them into the so-interpret-validators chain (e.g. sparseMeshes, optimizePrimvars), or (b) are load-bearing escape hatches needed for specific downstream contexts (e.g. primitivesToMeshes when output must be UsdGeomMesh, utilityFunction for instancing toggles and material rebinding, pythonScript for so-create-proxy recipes). Recommend specialty ops when their validator fires OR when their downstream context applies; the suppression above only targets documentary ops.
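The curation preference can be sketched as a filter and sort over candidate ops. Everything structural here is assumed: the real _curation.json layout is not shown in this document, so the sketch uses a flat op-to-status mapping with a few sample entries, ranks analysis ops after specialty ones, and drops deprecated ops. The function name recommend is hypothetical.

```python
from typing import Collection

# Sample classification only; the authoritative data lives in _curation.json,
# whose actual layout may differ from this flat mapping.
CURATION = {
    "meshCleanup": "canonical",
    "deduplicateGeometry": "canonical",
    "mergeVertices": "specialty",
    "sparseMeshes": "specialty",
    "findCoincidingGeometry": "analysis",
    "boxClip": "documentary",
    "deletePrims": "documentary",
}

# Assumed ranking: canonical first, then specialty, analysis, documentary.
RANK = {"canonical": 0, "specialty": 1, "analysis": 2, "documentary": 3}


def recommend(candidates: Collection[str],
              user_requested: Collection[str] = ()) -> list[str]:
    """Order candidate ops for recommendation.

    Documentary ops are kept only when the user explicitly asked for them.
    Unknown and deprecated ops are dropped (an assumption of this sketch).
    """
    kept = []
    for op in candidates:
        status = CURATION.get(op)
        if status not in RANK:
            continue
        if status == "documentary" and op not in user_requested:
            continue
        kept.append(op)
    # sorted() is stable, so ops with equal rank keep their input order.
    return sorted(kept, key=lambda op: RANK[CURATION[op]])
```

For vertex welding, for example, passing both meshCleanup and mergeVertices puts meshCleanup first, which matches the preference stated in the rules.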
Limitations
- Does not replace downstream reference instructions; load each required reference before executing it.
- Does not install runtimes directly; follow setup or install references when requirements are missing.
- Does not authorize mutation when the user has not allowed writes.
Troubleshooting
- If runtime status is unclear, run setup-usd-performance-tuning before profiling or validation.
- If the reported problem is vague, gather stage path, workload type, and whether diagnosis or execution is requested.
- If the workflow suggests mutation before evidence, return to baseline profiling and composition audit first.
References
Before routing, read:
- skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/optimization-tradeoffs.md - identify which pipeline phase the scene is in (extraction, structuring, or optimization). The right action depends on the phase.
- skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/factory-level-structuring.md - understand the three pillars (assets, aggregation, animation) and the seven-step structuring pattern.
If you have network access, prefer the live URLs (noted in each reference file) for the most current version.
Required execution flow
Read references/workflow.md for the canonical Phase 0-7 flow, including
Kit/standalone branches, validator-stack routing, operation ordering,
termination conditions, duration hints, and the default three-pass scoped
iteration pattern.
The compact root map at references/skill-map.md only routes agents
into this workflow.
Do not treat downstream phase names as plain checklist labels. Before executing
each step, load that phase’s nested README.md reference and follow its
instructions. Claude Code only exposes the public catalog skill; it does not
recursively inject profile-stage, usd-structure-assessment, or other nested
references.
The final deliverable must come from optimization-report: save both the structured JSON report and the generated Markdown summary. Do not substitute an ad hoc SUMMARY.md or chat-only recap for the optimization report.
For deeper subtopic guidance, consult the references:
- skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/composition-audit.md, skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/layer-health.md - subtopic detail for SA’s Phase 1 checklist.
- skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/instancing-readiness/references/instancing-tradeoffs.md - merge safety, decision tree for instancing choices.
- skills/omniverse-usd-performance-tuning/references/usd-structure-assessment/references/usd-edit-target-planner/references/variants-payloads.md - deeper variant/payload trade-offs (gates are inline in usd-edit-target-planner).
- references/cad-conversion/README.md - CAD converter settings.
- references/upstreams/usd-optimize.md - upstream SO mechanics and prebuilt package resolution.
- skills/omniverse-usd-performance-tuning/references/usd-validation-runner/references/so-run-validators/references/infrastructure.md - local handoff for SO validator infrastructure.
- skills/omniverse-usd-performance-tuning/references/usd-validation-runner/README.md - tier 1/2/3 selected-probe plan, large-stage guardrails, full-sweep approval, and scene-aware adjustment.
- skills/omniverse-usd-performance-tuning/references/optimization-report/references/optimization-report-template.md - the data contract every phase populates.
For full Kit runtime profiling (FPS, frame time, Hydra/RTX metrics), refer to the external profiling skills at NVIDIA/omniperf.