Wan2.1/test2_vae_cpu.sh at fdbc5f0588179d8c8eca4a58c84f096af79de4d2 - Wan2.1 - Zengtudor's git

Zengtudor/Wan2.1

mirror of https://github.com/Wan-Video/Wan2.1.git synced 2025-11-04 06:15:17 +00:00

Stan Campbell fdbc5f0588 feat: add --vae_cpu flag for improved VRAM optimization

Add --vae_cpu argument to enable VAE offloading for consumer GPUs with
limited VRAM. When enabled, VAE initializes on CPU and moves to GPU only
when needed for encoding/decoding operations.

Key changes:
- Add --vae_cpu argument to generate.py (mirrors --t5_cpu pattern)
- Update all 4 pipelines (T2V, I2V, FLF2V, VACE) with conditional VAE offloading
- Fix DiT offloading to free VRAM before T5 loading when offload_model=True
- Handle VAE scale tensors (mean/std) during device transfers

Benefits:
- Saves ~100-200MB VRAM without performance degradation
- Enables T2V-1.3B on more consumer GPUs (tested on 11.49GB GPU)
- Backward compatible (default=False)
- Consistent with existing --t5_cpu flag

Test results on 11.49 GiB VRAM GPU:
- Baseline: OOM (needed 80MB, only 85MB free)
- With --vae_cpu: Success
- With --t5_cpu: Success
- With both flags: Success (maximum VRAM savings)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-17 03:14:28 -07:00

6 lines

362 B

Bash

Executable File

Raw Blame History

 #!/usr/bin/bash
 # Test 2: VAE CPU offloading only
 echo "=== TEST 2: VAE offloading enabled (--vae_cpu) ==="
 echo "Expected: Success - should save 100-200MB VRAM"
 python ../generate.py --task t2v-1.3B --size 480*832 --ckpt_dir ./t2v-1.3b --offload_model True --vae_cpu --prompt "happy the dwarf and sneezy the dwarf wrestle to the death at madison square garden"