Mirror of https://github.com/Wan-Video/Wan2.1.git (synced 2025-06-07 07:44:53 +00:00)
Update text2video.py to reduce GPU memory by emptying cache
If offload_model is set, empty_cache() must be called after the model is moved to CPU to actually free the GPU memory. I verified on an RTX 4090 that without calling empty_cache() the model remains in GPU memory and the subsequent VAE decoding never finishes.
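Below is a minimal sketch (not part of the commit; the toy Linear layer and its size are placeholders, not the WanT2V model) illustrating the behaviour the message describes: after a module is moved to CPU, PyTorch's caching allocator still holds the reserved GPU memory until torch.cuda.empty_cache() is called.

import torch

def reserved_mib():
    # Memory currently reserved by PyTorch's caching allocator, in MiB.
    return torch.cuda.memory_reserved() / 1024**2

# Stand-in for the diffusion model; any large module shows the same effect.
model = torch.nn.Linear(8192, 8192).cuda()
print(f"after load:          {reserved_mib():.0f} MiB reserved")

model.cpu()  # offload the weights to CPU
print(f"after .cpu():        {reserved_mib():.0f} MiB reserved")  # still high

torch.cuda.empty_cache()  # hand the cached blocks back to the driver
print(f"after empty_cache(): {reserved_mib():.0f} MiB reserved")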
This commit is contained in:
parent 73648654c5
commit 65819c1d08
text2video.py

@@ -252,6 +252,7 @@ class WanT2V:
             x0 = latents
             if offload_model:
                 self.model.cpu()
+                torch.cuda.empty_cache()
             if self.rank == 0:
                 videos = self.vae.decode(x0)

@@ -260,6 +261,7 @@ class WanT2V:
         if offload_model:
             gc.collect()
             torch.cuda.synchronize()
+            torch.cuda.empty_cache()
         if dist.is_initialized():
             dist.barrier()
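For context, a hedged sketch of how the end of generation reads after this change. The offload_model flag, self.model, self.vae, self.rank, and the dist calls follow the diff above; the standalone function wrapper itself is hypothetical, not the repo's actual method layout.

import gc
import torch
import torch.distributed as dist

def finalize_generation(self, latents, offload_model=True):
    # Hypothetical wrapper around the tail of WanT2V.generate shown in the diff.
    x0 = latents
    if offload_model:
        self.model.cpu()            # move the diffusion model weights off the GPU
        torch.cuda.empty_cache()    # release their cached blocks before VAE decode
    videos = self.vae.decode(x0) if self.rank == 0 else None

    if offload_model:
        gc.collect()                # drop lingering Python references
        torch.cuda.synchronize()    # wait for queued kernels to finish
        torch.cuda.empty_cache()    # then return cached memory to the driver
    if dist.is_initialized():
        dist.barrier()              # keep multi-GPU ranks in step
    return videos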