Agent Skill · NVIDIA NIM

nv-generate-mr-brain-finetune

Use this skill to finetune the NV-Generate-CTMR MR-brain diffusion UNet from a NIfTI datalist. It is not intended for clinical use or for approving production data.

Provider: NVIDIA NIM
Path in repo: skills/nv-generate-mr-brain-finetune/SKILL.md

Skill body

NV-Generate-MR-Brain-Finetune

Purpose

Instructions

Examples

Validate and stage a preflight finetune check from an input bundle (the recommended first step — no GPU, no training). This is the single canonical command; replace INPUT_BUNDLE and OUT_DIR with your paths:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir INPUT_BUNDLE/preflight_dataset \
  --output-dir OUT_DIR \
  --modality mri_t1 \
  --preflight

For real GPU finetuning and other variations, see Usage below.

Available Scripts

| Script | Purpose | Arguments |
|---|---|---|
| scripts/run_mr_brain_finetune.py | Primary entrypoint declared by skill_manifest.yaml. | DATALIST.json --data-base-dir DATA_DIR --output-dir OUT_DIR [--epochs N] [--modality mri_t1] [--num-gpus N] [--no-amp] [--model-config FILE] [--run-inference] [--preflight] |

Prerequisites

1. Config and environment JSON (adapt to your data)

The skill is a thin wrapper around the upstream train_diff_unet_tutorial.ipynb flow. Each run performs four steps, delegating the heavy lifting to the model author’s scripts:

  1. Stage configs — copy the three config JSONs and rewrite only the run-specific paths and n_epochs (notebook cell 15).
  2. python -m scripts.diff_model_create_training_data → latent *_emb.nii.gz embeddings (cell 17).
  3. Write embedding sidecars — a <emb>.nii.gz.json per embedding with spacing/modality (and body-region indices when the model uses them). This is the one piece of glue that lives in the notebook (cell 19), not in upstream scripts/, and diff_model_train requires it; the skill owns it.
  4. python -m scripts.diff_model_train (cell 21), optionally python -m scripts.diff_model_infer.
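Step 3 above can be sketched as follows. This is a minimal illustration, not the notebook's or the wrapper's actual code: the function name `write_embedding_sidecars` is hypothetical, and the JSON key names `spacing` and `modality` are assumptions drawn from the description above. The real sidecar schema (including any body-region indices) should be taken from cell 19 of the upstream notebook.

```python
import json
from pathlib import Path


def write_embedding_sidecars(embedding_dir, spacing, modality_code):
    """Write one <emb>.nii.gz.json next to every *_emb.nii.gz file.

    The key names ("spacing", "modality") are assumptions; check cell 19
    of the upstream notebook for the schema diff_model_train expects.
    """
    written = []
    for emb_path in sorted(Path(embedding_dir).rglob("*_emb.nii.gz")):
        # Append ".json" to the full file name, e.g. case_emb.nii.gz.json
        sidecar = emb_path.with_name(emb_path.name + ".json")
        payload = {"spacing": list(spacing), "modality": int(modality_code)}
        sidecar.write_text(json.dumps(payload, indent=2))
        written.append(sidecar)
    return written
```

Keeping the sidecar next to its embedding and deriving its name from the embedding file name means the training step can locate it without any extra index file.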

Tune by editing the config JSON, not by adding flags. All training/inference hyperparameters (lr, batch_size, cache_rate, inference dim/spacing/num_inference_steps/cfg_guidance_scale, …) live in config_maisi_diff_model_rflow-mr-brain.json. Edit the upstream copy, or pass your own with --model-config FILE (and --env-config / --model-def for the other two). The wrapper only ever rewrites the fields below.

Environment JSON (environment_maisi_diff_model_rflow-mr-brain.json) — fields the wrapper rewrites per run:

| Field | Set from | Notes |
|---|---|---|
| data_base_dir | --data-base-dir | Root for relative training[].image paths. |
| json_data_list | your datalist | Staged copy with per-entry modality filled in. |
| embedding_base_dir, model_dir, output_dir | --output-dir | Latent embeddings, checkpoints, inference images. |
| modality_mapping_path | upstream | Maps modality name → integer code. |
| model_filename | --model-filename | Output checkpoint name (default diff_unet_3d_rflow-mr-brain_v0.pt). |
| existing_ckpt_filepath | upstream weights / --existing-ckpt-filepath | Starting checkpoint; cleared by --train-from-scratch. |
| trained_autoencoder_path | upstream weights / --trained-autoencoder-path | VAE used to encode/decode latents. |

Model config (config_maisi_diff_model_rflow-mr-brain.json) — the only fields the wrapper touches:

| Field | Set from | Default | Notes |
|---|---|---|---|
| diffusion_unet_train.n_epochs | --epochs | 2 (upstream config ships 1000) | Convenience override (cell 15 does the same); wrapper default is small for verification. |
| diffusion_unet_inference.modality | --modality | from modality_mapping.json | Kept consistent with the training modality for optional --run-inference. |

Everything else in that file (lr, batch_size, cache_rate, the rest of diffusion_unet_inference) is left exactly as written — edit the JSON to change it.
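Editing the JSON can also be scripted when a run needs a reproducible variant of the config. The sketch below copies a model config and applies dotted-key overrides; the helper names `set_dotted` and `write_custom_model_config` are hypothetical, and the exact nesting of any key other than the two fields listed above is an assumption to verify against your copy of the file.

```python
import json
from pathlib import Path


def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"] = value for dotted_key "a.b", creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return config


def write_custom_model_config(src, dst, overrides):
    """Copy the model config at src to dst, applying dotted-key overrides."""
    config = json.loads(Path(src).read_text())
    for dotted_key, value in overrides.items():
        set_dotted(config, dotted_key, value)
    Path(dst).write_text(json.dumps(config, indent=4))
    return config
```

A call such as `write_custom_model_config(upstream_config, "my_model_config.json", {"diffusion_unet_inference.num_inference_steps": 30})` would then be followed by passing `--model-config my_model_config.json` to the wrapper. The upstream copy stays untouched, which makes it easier to compare runs.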

Runtime flags (not config fields): --num-gpus N (>1 launches torch.distributed.run), --no-amp (disable mixed precision, passed through to diff_model_train).
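The effect of the two runtime flags on the training command can be sketched as below. This is an illustration of the launch pattern, not the wrapper's code: `build_train_command` is a hypothetical name, and the underscore-style arguments passed to `scripts.diff_model_train` (`--env_config`, `--model_config`, `--model_def`, `--num_gpus`, `--no_amp`) are assumptions that should be checked against the upstream checkout.

```python
import sys


def build_train_command(env_config, model_config, model_def, num_gpus=1, amp=True):
    """Return the argv list for the training step.

    The arguments given to scripts.diff_model_train are assumed, not
    confirmed by this document; verify them in the upstream script.
    """
    module_args = [
        "--env_config", str(env_config),
        "--model_config", str(model_config),
        "--model_def", str(model_def),
        "--num_gpus", str(num_gpus),
    ]
    if not amp:
        module_args.append("--no_amp")  # mirrors the wrapper's --no-amp
    if num_gpus > 1:
        # Multi-GPU: one process per GPU through torch.distributed.run
        launcher = [
            sys.executable, "-m", "torch.distributed.run",
            "--nproc_per_node", str(num_gpus), "--nnodes", "1",
            "-m", "scripts.diff_model_train",
        ]
    else:
        launcher = [sys.executable, "-m", "scripts.diff_model_train"]
    return launcher + module_args
```

Building the command as a list rather than a shell string avoids quoting problems when config paths contain spaces.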

--modality selects the integer code from configs/modality_mapping.json. Supported brain values: mri (8), mri_t1 (9, default), mri_t2 (10), mri_flair (11), mri_swi (20), and their *_skull_stripped variants (29/30/31/32). Per-case training[].modality overrides --modality. The modality also feeds the step-3 embedding sidecars.
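The override rule can be sketched as a small lookup, assuming a datalist with a top-level `training` list whose entries carry `image` and an optional `modality`. The function name `resolve_modalities` is hypothetical, and the mapping below hard-codes only the codes named in the paragraph above; a real run should read configs/modality_mapping.json instead.

```python
# Codes as listed in the text above. Skull-stripped variants are omitted
# because the text does not map each variant to a specific code.
BRAIN_MODALITY_CODES = {
    "mri": 8,
    "mri_t1": 9,
    "mri_t2": 10,
    "mri_flair": 11,
    "mri_swi": 20,
}


def resolve_modalities(datalist, default_modality="mri_t1", mapping=BRAIN_MODALITY_CODES):
    """Return (image, modality_name, code) per case; per-case modality wins over the default."""
    resolved = []
    for entry in datalist["training"]:
        name = entry.get("modality", default_modality)
        if name not in mapping:
            raise ValueError(f"unsupported modality {name!r} for {entry['image']}")
        resolved.append((entry["image"], name, mapping[name]))
    return resolved
```

Failing on an unknown modality name before any encoding starts is cheaper than discovering the problem when the sidecars are written.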

For an end-to-end reference including example data download and checkpoint loading, see the upstream tutorial train_diff_unet_tutorial.ipynb.

2. Usage (one-line training)

Preflight only:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_mr_brain_finetune_preflight \
  --preflight

Preflight bundle input:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_INPUT_BUNDLE/preflight_datalist.json \
  --data-base-dir PATH_TO_INPUT_BUNDLE/preflight_dataset \
  --output-dir runs/nv_generate_mr_brain_finetune_preflight \
  --preflight

GPU finetuning:

export NV_GENERATE_ROOT="${NV_GENERATE_ROOT:-.workbench_data/upstreams/NV-Generate-CTMR}" && \
python -m pip install -r "$NV_GENERATE_ROOT/requirements.txt" && \
python skills/nv-generate-mr-brain-finetune/scripts/run_mr_brain_finetune.py \
  PATH_TO_DATALIST.json \
  --data-base-dir PATH_TO_DATA_ROOT \
  --output-dir runs/nv_generate_mr_brain_finetune \
  --epochs 2 \
  --modality mri_t1 \
  --run-inference

Replace PATH_TO_DATALIST.json and PATH_TO_DATA_ROOT with the user’s actual paths. Do not use the fixture datalist for real training; it is a preflight-only placeholder.
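Before launching a real run, it can help to confirm that every datalist entry resolves to a file under the data root. The sketch below is an independent check, not part of the wrapper; `find_missing_images` is a hypothetical name, and the datalist layout (a top-level `training` list with an `image` field per entry) is inferred from the field references in this document.

```python
import json
from pathlib import Path


def find_missing_images(datalist_path, data_base_dir):
    """Return training[].image entries that do not resolve to a file under data_base_dir."""
    datalist = json.loads(Path(datalist_path).read_text())
    base = Path(data_base_dir)
    missing = []
    for entry in datalist.get("training", []):
        # Relative paths are resolved against the data root.
        if not (base / entry["image"]).is_file():
            missing.append(entry["image"])
    return missing
```

An empty return value means every listed image was found; anything else is the list of paths to fix in the datalist or to locate under a different data root.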

3. Monitor training (TensorBoard)

scripts.diff_model_train writes TensorBoard event files under the staged model_dir (OUT_DIR/artifacts/models). Launch TensorBoard against the output directory and watch the loss curve:

python -m pip install tensorboard && \
tensorboard --logdir runs/nv_generate_mr_brain_finetune/artifacts

The run summary is written to OUT_DIR/artifacts/workflow_summary.json (checkpoint path, embedding sidecars, inference outputs); the JSON the wrapper prints to stdout mirrors the same paths plus exit_code and a stderr_tail for quick triage.
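For scripted triage, the JSON printed to stdout can be parsed directly. The sketch below assumes the captured stdout is a single JSON document and relies only on the `exit_code` and `stderr_tail` fields named above; the function name `triage` is hypothetical, and whether `stderr_tail` is a string or a list of lines is not stated here, so both are handled.

```python
import json


def triage(result_json_text):
    """Summarize a wrapper result: "ok" on exit_code 0, else the exit code and stderr tail."""
    result = json.loads(result_json_text)
    exit_code = result.get("exit_code")
    if exit_code == 0:
        return "ok"
    tail = result.get("stderr_tail") or "(no stderr_tail in result)"
    if isinstance(tail, list):
        # Accept a list of lines as well as a single string.
        tail = "\n".join(str(line) for line in tail)
    return f"failed with exit_code={exit_code}\n{tail}"
```

Capturing the wrapper's stdout (for example with `subprocess.run(..., capture_output=True, text=True)`) and passing it to `triage` gives a one-line pass/fail signal with the error context attached.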

4. Hyperparameter tuning and common pitfalls

5. Evaluate the finetuned model

Use the staged checkpoint (OUT_DIR/artifacts/models/<model_filename>) as the diffusion UNet for generation, then inspect the synthesized volumes.
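A first-pass inspection can confirm that each generated volume has the expected shape and voxel spacing before anyone opens it in a viewer. The sketch below reads only the NIfTI-1 header using the standard library, so it needs no imaging packages; the function names are hypothetical, and it assumes the outputs are NIfTI-1 files (NIfTI-2 is not handled).

```python
import gzip
import struct
from pathlib import Path


def read_nifti1_shape_and_spacing(path):
    """Return (shape, spacing) from a NIfTI-1 header without loading voxel data."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        header = handle.read(348)
    if len(header) < 348:
        raise ValueError(f"{path}: truncated NIfTI-1 header")
    # sizeof_hdr must equal 348; trying both byte orders detects endianness.
    for order in ("<", ">"):
        if struct.unpack(order + "i", header[0:4])[0] == 348:
            break
    else:
        raise ValueError(f"{path}: not a NIfTI-1 file")
    dim = struct.unpack(order + "8h", header[40:56])      # dim[0] is the number of dimensions
    pixdim = struct.unpack(order + "8f", header[76:108])  # pixdim[1..] are the voxel sizes
    ndim = dim[0]
    if not 1 <= ndim <= 7:
        raise ValueError(f"{path}: invalid dimension count {ndim}")
    return tuple(dim[1:ndim + 1]), tuple(pixdim[1:ndim + 1])


def summarize_volumes(output_dir):
    """Map each *.nii.gz under output_dir to its (shape, spacing)."""
    return {
        str(path): read_nifti1_shape_and_spacing(path)
        for path in sorted(Path(output_dir).rglob("*.nii.gz"))
    }
```

Comparing the reported shape and spacing with the inference `dim` and `spacing` in the model config catches configuration mismatches early.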

This skill gates file accounting and command provenance only — anatomical realism and downstream utility must be judged by a domain expert on the generated images.

Limitations

Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| diffusion training scripts were not found | NV_GENERATE_ROOT does not point at a current NV-Generate-CTMR checkout. | Clone or update https://github.com/NVIDIA-Medtech/NV-Generate-CTMR and set NV_GENERATE_ROOT. |
| missing datalist image | training[].image paths are not relative to --data-base-dir or files are absent. | Fix the datalist or pass the correct data root. |
| CUDA or MONAI import failure | Runtime environment lacks upstream dependencies. | Install "$NV_GENERATE_ROOT/requirements.txt" in the selected environment. |
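The first row can be checked ahead of time with a short script. The sketch below assumes that the three modules invoked with `python -m scripts.<name>` correspond to files `scripts/<name>.py` under the checkout; `check_nv_generate_root` is a hypothetical helper, and the default path mirrors the `export` line used in the commands in this document.

```python
import os
from pathlib import Path

# Module names taken from the commands in this document; the file layout
# (scripts/<name>.py) is an assumption about the upstream checkout.
REQUIRED_MODULES = (
    "diff_model_create_training_data",
    "diff_model_train",
    "diff_model_infer",
)


def check_nv_generate_root(root=None):
    """Return the expected upstream script files that are missing under root."""
    default_root = ".workbench_data/upstreams/NV-Generate-CTMR"
    root = Path(root or os.environ.get("NV_GENERATE_ROOT", default_root))
    missing = []
    for name in REQUIRED_MODULES:
        script = root / "scripts" / f"{name}.py"
        if not script.is_file():
            missing.append(str(script))
    return missing
```

An empty list suggests the checkout is usable; a non-empty list names the files to look for after cloning or updating the repository.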

Skill frontmatter

license: Apache-2.0
allowed-tools: Bash
metadata: {"author": "NVIDIA MedTech Team", "tags": ["MedTech", "MRI", "brain", "finetune"]}