11 Commits

Author SHA1 Message Date
Stan Campbell
fdbc5f0588 feat: add --vae_cpu flag for improved VRAM optimization
Add --vae_cpu argument to enable VAE offloading for consumer GPUs with
limited VRAM. When enabled, VAE initializes on CPU and moves to GPU only
when needed for encoding/decoding operations.

Key changes:
- Add --vae_cpu argument to generate.py (mirrors --t5_cpu pattern)
- Update all 4 pipelines (T2V, I2V, FLF2V, VACE) with conditional VAE offloading
- Fix DiT offloading to free VRAM before T5 loading when offload_model=True
- Handle VAE scale tensors (mean/std) during device transfers

Benefits:
- Saves ~100-200MB VRAM without performance degradation
- Enables T2V-1.3B on more consumer GPUs (tested on 11.49GB GPU)
- Backward compatible (default=False)
- Consistent with existing --t5_cpu flag

Test results on 11.49 GiB VRAM GPU:
- Baseline: OOM (needed 80MB, only 85MB free)
- With --vae_cpu: Success
- With --t5_cpu: Success
- With both flags: Success (maximum VRAM savings)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-17 03:14:28 -07:00
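
The offloading pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes `--vae_cpu` is a plain boolean switch like `--t5_cpu`, and the helper names `build_parser`, `initial_vae_device`, `vae_on_gpu`, and the `FakeVAE` stand-in are hypothetical. `FakeVAE` only mimics the `.to(device)` method of a `torch.nn.Module` so the sketch runs without a GPU.

```python
import argparse
from contextlib import contextmanager


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with the two offloading switches (both default to False)."""
    parser = argparse.ArgumentParser(description="VAE offloading sketch")
    parser.add_argument("--t5_cpu", action="store_true", default=False,
                        help="Keep the T5 text encoder on CPU.")
    parser.add_argument("--vae_cpu", action="store_true", default=False,
                        help="Keep the VAE on CPU except while encoding/decoding.")
    return parser


def initial_vae_device(vae_cpu: bool, gpu_device: str = "cuda:0") -> str:
    """Pick the device the VAE is initialized on."""
    return "cpu" if vae_cpu else gpu_device


@contextmanager
def vae_on_gpu(vae, vae_cpu: bool, gpu_device: str = "cuda:0"):
    """Move the VAE to the GPU only for the duration of an encode/decode call."""
    if vae_cpu:
        vae.to(gpu_device)
    try:
        yield vae
    finally:
        if vae_cpu:
            # Return the VAE to CPU even if encode/decode raised.
            vae.to("cpu")


class FakeVAE:
    """Stand-in for a real module; records device moves instead of copying weights."""

    def __init__(self, device: str):
        self.device = device
        self.moves = []

    def to(self, device: str):
        self.device = device
        self.moves.append(device)
        return self


if __name__ == "__main__":
    args = build_parser().parse_args(["--vae_cpu"])
    vae = FakeVAE(initial_vae_device(args.vae_cpu))
    with vae_on_gpu(vae, args.vae_cpu) as active_vae:
        print("during decode:", active_vae.device)
    print("after decode:", vae.device)
```

With a real VAE, any tensors held outside the module's registered parameters (the commit mentions the mean/std scale tensors) would have to be moved alongside the module; this sketch does not model that.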
Ang Wang
76e9427657
Format the code (#402)
* isort the code

* format the code

* Add yapf config file

* Remove torch cuda memory profiler
2025-05-16 12:35:38 +08:00
Zhen Han
c709fcf0e7
fix vace size (#397) 2025-05-14 22:01:45 +08:00
Ang Wang
18d53feb7a
[feature] Add VACE (#389)
* Add VACE

* Support training with multiple gpus

* Update default args for vace task

* vace block update

* Add vace example jpg

* Fix dist vace fwd hook error

* Update vace example

* Update vace args

* Update pipeline name for vace

* vace gradio and Readme

* Update vace snake png

---------

Co-authored-by: hanzhn <han.feng.jason@gmail.com>
2025-05-14 20:44:25 +08:00
yupeng1111
df44622e72
[feature] Wan2.1-FLF2V-14B (#338)
Co-authored-by: 澎鹏 <shiyupeng.syp@taobao.com>
2025-04-17 21:56:46 +08:00
jiangzeyinzi
d6434cf8ef
Support system prompt as an input parameter (#280) 2025-03-28 11:51:31 +08:00
Wenting Shen
bc3249d61c
add free storage api for FSDP (#277)
Signed-off-by: wenting.swt <wenting.swt@alibaba-inc.com>
2025-03-26 15:49:37 +08:00
Yingda Chen
d0a086800a
Update prompt_extend.py 2025-03-06 13:11:22 +08:00
Alan May
82c18d91a1
fix: correct Chinese ISO language code typo (#10)
Fix incorrect Chinese language ISO code abbreviation
from CH to ZH
2025-03-03 17:04:13 +08:00
Adrian Corduneanu
0e3c42a830
Update text2video.py to reduce GPU memory by emptying cache (#44)
* Update text2video.py to reduce GPU memory by emptying cache

If offload_model is set, empty_cache() must be called after the model is moved to CPU to actually free the GPU memory. I verified on an RTX 4090 that without calling empty_cache the model remains in memory and the subsequent VAE decoding never finishes.

* Update text2video.py only one empty_cache needed before vae decode
2025-02-26 18:56:57 +08:00
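
The ordering described in the commit above (move the model to CPU first, then empty the cache) can be sketched as follows. This is a minimal illustration assuming PyTorch is installed; the helper name `offload_to_cpu` is hypothetical and the small `nn.Linear` model is a stand-in, not the repository's model. `torch.cuda.empty_cache()` is only called when CUDA is available, so the sketch also runs on CPU-only machines.

```python
import torch
from torch import nn


def offload_to_cpu(model: nn.Module) -> nn.Module:
    """Move a model to CPU, then release cached GPU memory."""
    model.cpu()
    # After the move, the GPU copies of the weights are freed, but PyTorch's
    # caching allocator can keep those blocks reserved. empty_cache() releases
    # the unused cached blocks; it must come after the move to have an effect.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return model


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Linear(8, 8).to(device)
    offload_to_cpu(model)
    print(next(model.parameters()).device)
```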
WanX-Video-1
65386b2e03 init upload 2025-02-25 22:07:47 +08:00