Mirror of https://github.com/Wan-Video/Wan2.1.git, synced 2025-12-23 15:33:33 +00:00
Compare commits: 1d6ce64db6...6ca4b174b2 (3 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 6ca4b174b2 | | |
| | ae487cc653 | | |
| | c5a6d87db7 | | |
@@ -36,6 +36,7 @@ In this repository, we present **Wan2.1**, a comprehensive and open suite of vid
 
 ## Community Works
 If your work has improved **Wan2.1** and you would like more people to see it, please inform us.
+- [Video-As-Prompt](https://github.com/bytedance/Video-As-Prompt), the first unified semantic-controlled video generation model based on **Wan2.1-14B-I2V** with a Mixture-of-Transformers architecture and in-context controls (e.g., concept, style, motion, camera). Refer to the [project page](https://bytedance.github.io/Video-As-Prompt/) for more examples.
 - [LightX2V](https://github.com/ModelTC/LightX2V), a lightweight and efficient video generation framework that integrates **Wan2.1** and **Wan2.2**, supports multiple engineering acceleration techniques for fast inference, which can run on RTX 5090 and RTX 4060 (8GB VRAM).
 - [DriVerse](https://github.com/shalfun/DriVerse), an autonomous driving world model based on **Wan2.1-14B-I2V**, generates future driving videos conditioned on any scene frame and given trajectory. Refer to the [project page](https://github.com/shalfun/DriVerse/tree/main) for more examples.
 - [Training-Free-WAN-Editing](https://github.com/KyujinHan/Awesome-Training-Free-WAN2.1-Editing), built on **Wan2.1-T2V-1.3B**, allows training-free video editing with image-based training-free methods, such as [FlowEdit](https://arxiv.org/abs/2412.08629) and [FlowAlign](https://arxiv.org/abs/2505.23145).
@@ -13,6 +13,7 @@ import numpy as np
 import torch
 import torch.cuda.amp as amp
 import torch.distributed as dist
+import torchvision
 import torchvision.transforms.functional as TF
 from tqdm import tqdm
 
@@ -211,7 +212,12 @@ class WanFLF2V:
     round(last_frame_size[1] * last_frame_resize_ratio),
 ]
 # 2. center crop
-last_frame = TF.center_crop(last_frame, last_frame_size)
+transform = torchvision.transforms.Compose([
+    torchvision.transforms.Resize((last_frame_size[0], last_frame_size[1])),
+    torchvision.transforms.CenterCrop((first_frame_size[0], first_frame_size[1]))
+])
+
+last_frame = transform(last_frame)
 
 max_seq_len = ((F - 1) // self.vae_stride[0] + 1) * lat_h * lat_w // (
     self.patch_size[1] * self.patch_size[2])
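Note on the `WanFLF2V` hunk: the change replaces a plain center crop with a resize followed by a center crop to the first frame's size, so the last frame should end up with the same dimensions as the first. The dependency-free sketch below only illustrates the size arithmetic behind that idea; it is not the repository's code. The function name `cover_resize_then_center_crop` is hypothetical, sizes are assumed to be `(height, width)` tuples, and the scale factor is assumed to be the larger of the two per-axis ratios (the ratio computation itself is not visible in this diff). The crop offsets use floor division, which may differ by one pixel from what torchvision computes.

```python
def cover_resize_then_center_crop(first_size, last_size):
    """Return (resized_size, crop_box) so the last frame matches first_size.

    Sizes are (height, width) tuples; crop_box is (top, left, height, width).
    Hypothetical helper for illustration only.
    """
    first_h, first_w = first_size
    last_h, last_w = last_size

    # Assumption: scale by the larger ratio so the resized frame
    # fully covers the target in both dimensions.
    ratio = max(first_h / last_h, first_w / last_w)

    # Guard against rounding leaving the resized frame one pixel short.
    resized_h = max(first_h, round(last_h * ratio))
    resized_w = max(first_w, round(last_w * ratio))

    # Center the crop window inside the resized frame.
    top = (resized_h - first_h) // 2
    left = (resized_w - first_w) // 2
    return (resized_h, resized_w), (top, left, first_h, first_w)


# A 720x720 last frame matched to a 480x832 (height x width) first frame.
resized, box = cover_resize_then_center_crop((480, 832), (720, 720))
print(resized, box)
```

When checking the committed code against this sketch, it is worth confirming which axis order the size variables use: PIL's `Image.size` is `(width, height)`, while torchvision's `Resize` and `CenterCrop` take `(height, width)`.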