nv-generate-vae-finetune

Finetunes the NV-Generate-CTMR MAISI VAE from CT/MRI NIfTI datalists. Not intended for clinical or production data approval.

Provider: NVIDIA NIM
Path in repo: skills/nv-generate-vae-finetune/SKILL.md

Skill body

NV-Generate-VAE-Finetune

Purpose

Instructions

Examples

Validate and stage a preflight finetune check from an input bundle. This is the recommended first step: it needs no GPU and runs no training. The command below is the single canonical form; replace INPUT_BUNDLE and OUT_DIR with your paths:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-vae-finetune/scripts/run_vae_finetune.py \
  INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir INPUT_BUNDLE/preflight_dataset \
  --output-dir OUT_DIR \
  --modality mri \
  --preflight

For real GPU finetuning and other variations, see Usage below.

Available Scripts

| Script | Purpose | Arguments |
|---|---|---|
| scripts/run_vae_finetune.py | Primary entrypoint declared by skill_manifest.yaml. | DATALIST.json --data-base-dir DATA_DIR --output-dir OUT_DIR [--epochs N] [--modality mri] [--patch-size 64,64,64] [--preflight] |

Prerequisites

1. Config and environment JSON (adapt to your data)

The wrapper copies the upstream VAE config/env JSON from $NV_GENERATE_ROOT/configs, rewrites the fields below, and writes the staged copies under OUT_DIR/workflow/configs/. You normally only set your datalist and data root; the listed CLI flags override individual fields when you need to.

Environment JSON (environment_maisi_vae_train.json):

| Field | Set from | Notes |
|---|---|---|
| model_dir | --output-dir | Where autoencoder.pt/discriminator.pt and best checkpoints are saved. |
| tfevent_path | --output-dir | TensorBoard event directory. |
| finetune | --train-from-scratch | true (default) loads trained_autoencoder_path; the flag sets it false. |
| trained_autoencoder_path | upstream weights / --trained-autoencoder-path | Starting VAE checkpoint when finetuning. |
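
The environment rewrite described above can be pictured with a small sketch. It is not the wrapper's actual code: the function name stage_env_config is hypothetical, the artifacts/models and artifacts/tfevent subdirectories are inferred from the output paths quoted elsewhere in this document, and upstream/autoencoder.pt is a placeholder path.

```python
import copy
from pathlib import PurePosixPath


def stage_env_config(upstream_env, output_dir, train_from_scratch=False,
                     trained_autoencoder_path=None):
    """Return a rewritten copy of the environment JSON (illustrative only)."""
    staged = copy.deepcopy(upstream_env)
    out = PurePosixPath(output_dir)
    # Assumed layout, inferred from the artifact paths quoted in this document.
    staged["model_dir"] = str(out / "artifacts" / "models")
    staged["tfevent_path"] = str(out / "artifacts" / "tfevent")
    # finetune is true by default; --train-from-scratch sets it to false.
    staged["finetune"] = not train_from_scratch
    # Keep the upstream checkpoint unless an explicit override is given.
    if trained_autoencoder_path is not None:
        staged["trained_autoencoder_path"] = trained_autoencoder_path
    return staged


# Placeholder upstream values, not the real upstream file contents.
upstream = {
    "model_dir": "./models",
    "finetune": True,
    "trained_autoencoder_path": "upstream/autoencoder.pt",
}
staged = stage_env_config(upstream, "runs/demo")
print(staged["model_dir"])
print(staged["finetune"])
```

The upstream dict is copied rather than edited in place, which mirrors the documented behaviour of writing staged copies under OUT_DIR/workflow/configs/ instead of modifying the checkout.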

Training fields (config_maisi_vae_train.json):

| Field | Flag | Type | Default | Notes |
|---|---|---|---|---|
| autoencoder_train.n_epochs | --epochs | int | 1 | |
| autoencoder_train.batch_size | --batch-size | int | 1 | Per-GPU (single-GPU runner). |
| autoencoder_train.patch_size | --patch-size | int,int,int | 64,64,64 | Training crop. |
| autoencoder_train.val_batch_size | --val-batch-size | int | 1 | |
| autoencoder_train.val_sliding_window_patch_size | --val-sliding-window-patch-size | int,int,int | 96,96,64 | Sliding-window validation ROI. |
| autoencoder_train.lr | --lr | float | 1e-4 | |
| autoencoder_train.perceptual_weight | --perceptual-weight | float | 0.3 | LPIPS term. |
| autoencoder_train.kl_weight | --kl-weight | float | 1e-7 | KL term. |
| autoencoder_train.adv_weight | --adv-weight | float | 0.1 | Adversarial term. |
| autoencoder_train.recon_loss | --recon-loss | l1 or l2 | l1 | |
| autoencoder_train.val_interval | --val-interval | int | 1 | Epochs between validation passes. |
| autoencoder_train.cache | --cache-rate | float | 0.0 | MONAI CacheDataset fraction. |
| autoencoder_train.amp | --no-amp | flag | on | Mixed precision; flag disables it. |
| data_option.random_aug | --no-random-aug | flag | on | Random augmentation; flag disables it. |
| data_option.spacing_type | --spacing-type | original, fixed, or rand_zoom | original | |
| data_option.spacing | --spacing | float,float,float | unset | Required when spacing_type is fixed/rand_zoom. |
| data_option.select_channel | --select-channel | int | 0 | Channel for multi-channel inputs. |
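
The comma-separated flag values and the spacing rule in the table can be checked before launching a run. The helpers below are a hypothetical sketch (parse_triple and check_spacing are not part of the skill), assuming the wrapper accepts plain comma-separated triples as shown in the Type column.

```python
def parse_triple(text, cast=int):
    """Parse 'a,b,c' into a list of three values, e.g. for --patch-size."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {text!r}")
    return [cast(p) for p in parts]


def check_spacing(spacing_type, spacing=None):
    """Apply the table's rule: spacing is required for fixed/rand_zoom."""
    allowed = ("original", "fixed", "rand_zoom")
    if spacing_type not in allowed:
        raise ValueError(f"spacing_type must be one of {allowed}")
    if spacing_type in ("fixed", "rand_zoom") and spacing is None:
        raise ValueError(f"--spacing is required when spacing_type is {spacing_type}")
    # Return the parsed spacing, or None when it was left unset.
    return parse_triple(spacing, float) if spacing is not None else None


print(parse_triple("64,64,64"))
print(check_spacing("fixed", "1.0,1.0,1.5"))
```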

--modality (ct or mri, default mri) fills the per-entry class for datalist items missing one. Validation/testing entries are required because the training loop runs a validation pass.
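
A rough sketch of those two datalist rules follows. It assumes a MONAI-style layout with training, validation, and testing lists whose entries carry image and class keys; only class, validation[], and testing[] are named in this document, so the other key names, the prepare_datalist helper, and the example paths are assumptions.

```python
import copy


def prepare_datalist(datalist, modality="mri"):
    """Validate a datalist dict and fill in missing per-entry classes."""
    if modality not in ("ct", "mri"):
        raise ValueError("modality must be 'ct' or 'mri'")
    # The training loop runs a validation pass, so one of these must be non-empty.
    if not (datalist.get("validation") or datalist.get("testing")):
        raise ValueError("datalist must include non-empty validation[] or testing[]")
    filled = copy.deepcopy(datalist)
    for split in ("training", "validation", "testing"):
        for entry in filled.get(split, []):
            # Only entries without a class receive the --modality value.
            entry.setdefault("class", modality)
    return filled


# Hypothetical entries with relative image paths.
example = {
    "training": [{"image": "sub-01/t1.nii.gz"}],
    "validation": [{"image": "sub-02/ct.nii.gz", "class": "ct"}],
}
prepared = prepare_datalist(example, modality="mri")
print(prepared["training"][0]["class"])
print(prepared["validation"][0]["class"])
```

Entries that already declare a class keep it, so a mixed CT/MRI datalist is not overwritten by the default.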

For an end-to-end reference including example data download, see the upstream tutorial train_vae_tutorial.ipynb.

2. Usage (one-line training)

Preflight only:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-vae-finetune/scripts/run_vae_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_vae_finetune_preflight \
  --preflight

Preflight bundle input:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-vae-finetune/scripts/run_vae_finetune.py \
  PATH_TO_INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir PATH_TO_INPUT_BUNDLE/preflight_dataset \
  --output-dir runs/nv_generate_vae_finetune_preflight \
  --preflight

GPU finetuning:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python -m pip install -r "$NV_GENERATE_ROOT/requirements.txt" && \
python -m pip install lpips tensorboard && \
python skills/nv-generate-vae-finetune/scripts/run_vae_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_vae_finetune \
  --epochs 1 \
  --modality mri \
  --patch-size 64,64,64 \
  --download-model-data

Replace PATH_TO_DATALIST.json and PATH_TO_DATA_ROOT with the user’s actual paths. Do not use the fixture datalist for real training; it is a preflight-only placeholder.
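
When the commands above are driven from Python rather than typed by hand, the argument list can be assembled programmatically. This is a minimal sketch using only flags that appear in this document; build_command is a hypothetical helper, and it does not set NV_GENERATE_ROOT or install dependencies.

```python
import shlex

SCRIPT = "skills/nv-generate-vae-finetune/scripts/run_vae_finetune.py"


def build_command(datalist, data_base_dir, output_dir, *, preflight=False,
                  epochs=None, modality=None, patch_size=None):
    """Assemble the argv list for run_vae_finetune.py (illustrative only)."""
    cmd = ["python", SCRIPT, datalist,
           "--data-base-dir", data_base_dir,
           "--output-dir", output_dir]
    if epochs is not None:
        cmd += ["--epochs", str(epochs)]
    if modality is not None:
        cmd += ["--modality", modality]
    if patch_size is not None:
        # The flag takes a comma-separated triple such as 64,64,64.
        cmd += ["--patch-size", ",".join(str(v) for v in patch_size)]
    if preflight:
        cmd.append("--preflight")
    return cmd


cmd = build_command("PATH_TO_DATALIST.json", "PATH_TO_DATA_ROOT",
                    "runs/nv_generate_vae_finetune_preflight", preflight=True)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run with NV_GENERATE_ROOT present in the environment; keeping it as a list avoids shell quoting problems with paths that contain spaces.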

3. Monitor training (TensorBoard)

The runner writes TensorBoard scalars (per-iteration and per-epoch recons_loss, kl_loss, p_loss, adversarial/real/fake losses, and a validation scale_factor) under OUT_DIR/artifacts/tfevent/autoencoder. Launch TensorBoard against the output directory:

python -m pip install tensorboard && \
tensorboard --logdir runs/nv_generate_vae_finetune/artifacts/tfevent

The same per-epoch loss history is also captured in OUT_DIR/artifacts/workflow_summary.json and echoed in the JSON the wrapper prints to stdout (loss_history, best-checkpoint paths, exit_code, stderr_tail).
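
For a quick look at the summary without TensorBoard, the loss history can be scanned for its best epoch. The exact schema of workflow_summary.json is not given here, so this sketch assumes loss_history is a list of per-epoch objects with epoch and val_weighted_loss keys; best_epoch is a hypothetical helper and the numbers are made up for illustration.

```python
import json


def best_epoch(summary):
    """Return the history row with the lowest val_weighted_loss, or None."""
    history = summary.get("loss_history") or []
    # Skip rows without a validation score (e.g. epochs with no validation pass).
    scored = [row for row in history if "val_weighted_loss" in row]
    if not scored:
        return None
    return min(scored, key=lambda row: row["val_weighted_loss"])


# Illustrative values only, not real training results.
sample = json.loads("""
{"exit_code": 0,
 "loss_history": [
   {"epoch": 1, "val_weighted_loss": 0.42},
   {"epoch": 2, "val_weighted_loss": 0.35},
   {"epoch": 3, "val_weighted_loss": 0.38}
 ]}
""")
print(best_epoch(sample))
```

For a real run, load the summary with json.load on OUT_DIR/artifacts/workflow_summary.json and adjust the key names to whatever the file actually contains.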

4. Hyperparameter tuning and common pitfalls

5. Evaluate the finetuned VAE

Validation reconstruction loss is tracked automatically, and the autoencoder from the epoch with the lowest val_weighted_loss is saved as autoencoder_epochN.pt under OUT_DIR/artifacts/models. To evaluate downstream:

This skill gates file accounting and reconstruction bookkeeping only; image quality and downstream utility must be judged by a domain expert.

Limitations

Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| VAE configs/helpers were not found | NV_GENERATE_ROOT does not point at a current NV-Generate-CTMR checkout. | Clone or update https://github.com/NVIDIA-Medtech/NV-Generate-CTMR and set NV_GENERATE_ROOT. |
| datalist must include non-empty validation[] or testing[] | VAE training requires validation data for the configured validation loop. | Add validation[] or testing[] entries with relative image paths. |
| CUDA, MONAI, or LPIPS import failure | Runtime environment lacks upstream dependencies. | Install "$NV_GENERATE_ROOT/requirements.txt" plus lpips tensorboard in the selected environment. |
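
The first row can be checked ahead of time by looking for the two config files this document names under $NV_GENERATE_ROOT/configs. This is a minimal sketch with a hypothetical helper, missing_upstream_configs; the wrapper may also look for helper modules that are not listed here, so a clean result does not guarantee the check passes.

```python
import os
from pathlib import Path

REQUIRED_CONFIGS = (
    "config_maisi_vae_train.json",
    "environment_maisi_vae_train.json",
)


def missing_upstream_configs(root):
    """List the expected config files that are absent under root/configs."""
    configs_dir = Path(root) / "configs"
    return [name for name in REQUIRED_CONFIGS
            if not (configs_dir / name).is_file()]


# Same default as the shell commands in this document.
root = os.environ.get("NV_GENERATE_ROOT",
                      ".workbench_data/upstreams/NV-Generate-CTMR")
missing = missing_upstream_configs(root)
print("ok" if not missing else f"missing under {root}/configs: {missing}")
```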

Skill frontmatter

license: Apache-2.0
allowed-tools: Bash
metadata:
  author: NVIDIA MedTech Team
  tags: [MedTech, CT, MRI, VAE, finetune]