more spoil

This commit is contained in:
DeepBeepMeep 2025-09-25 23:21:51 +02:00
parent 14d68bbc91
commit ba764e8d60
9 changed files with 130 additions and 66 deletions

View File

@@ -20,21 +20,27 @@ WanGP supports the Wan (and derived models), Hunyuan Video and LTV Video models
 **Follow DeepBeepMeep on Twitter/X to get the Latest News**: https://x.com/deepbeepmeep
 ## 🔥 Latest Updates :
-### September 25 2025: WanGP v8.73 - Here Are ~~Two~~Three New Contenders in the Vace Arena !
+### September 25 2025: WanGP v8.73 - ~~Here Are Two Three New Contenders in the Vace Arena !~~ The Never Ending Release
-So in today's release you will find two Wannabe Vace that covers each only a subset of Vace features but offers some interesting advantages:
+So in ~~today's~~ this release you will find two Vace wannabes, each covering only a subset of Vace features but offering some interesting advantages:
 - **Wan 2.2 Animate**: this model is specialized in *Body Motion* and *Facial Motion* transfers, and it does that very well. You can use it either to *Replace* a person in a Video or to *Animate* the person of your choice using an existing *Pose Video* (remember *Animate Anyone*?). By default it will keep the original soundtrack. Under the hood, *Wan 2.2 Animate* seems to be a derived i2v model and should support the corresponding Loras Accelerators (for instance *FusioniX t2v*). Also, as a WanGP exclusive, you will find support for *Outpainting*.
 In order to use Wan 2.2 Animate you will first need to stop by the embedded *Mat Anyone* tool to extract the *Video Mask* of the person whose motion you want to extract.
+With WanGP 8.74 there is an extra option that lets you apply *Relighting* when Replacing a person. You can also now Animate a person without providing a Video Mask to target the source of the motion (at the risk that it will be less precise).
 - **Lucy Edit**: this one claims to be a *Nano Banana* for Videos. Give it a video and ask it to change it (it is specialized in changing clothes) and voila! The nice thing about it is that it is based on the *Wan 2.2 5B* model and is therefore very fast, especially if you use the *FastWan* finetune that is also part of the package.
 Also, because I wanted to spoil you:
 - **Qwen Edit Plus**: also known as the *Qwen Edit 25th September Update*, which is specialized in combining multiple Objects / People. There is also new support for *Pose transfer* & *Recolorisation*. All of this is made easy to use in WanGP. For now you will only find the quantized version, since HF crashes when uploading the unquantized version.
+- **T2V Video 2 Video Masking**: ever wanted to apply a Lora, a process (for instance Upsampling) or a Text Prompt to only a (moving) part of a Source Video? Look no further: I have added *Masked Video 2 Video* (which also works in image2image) to the *Text 2 Video* models. As usual you just need to use *Matanyone* to create the mask.
 *Update 8.71*: fixed Fast Lucy Edit that didn't contain the lora
 *Update 8.72*: shadow drop of Qwen Edit Plus
 *Update 8.73*: Qwen Preview & InfiniteTalk Start image
+*Update 8.74*: Animate Relighting / No-mask mode, t2v Masked Video to Video
 ### September 15 2025: WanGP v8.6 - Attack of the Clones

View File

@@ -2,12 +2,16 @@
 "model": {
 "name": "Wan2.2 Animate",
 "architecture": "animate",
-"description": "Wan-Animate takes a video and a character image as input, and generates a video in either 'animation' or 'replacement' mode.",
+"description": "Wan-Animate takes a video and a character image as input, and generates a video in either 'Animation' or 'Replacement' mode. Sliding Windows of at least 81 frames are recommended to obtain the best Style continuity.",
 "URLs": [
 "https://huggingface.co/DeepBeepMeep/Wan2.2/resolve/main/wan2.2_animate_14B_bf16.safetensors",
 "https://huggingface.co/DeepBeepMeep/Wan2.2/resolve/main/wan2.2_animate_14B_quanto_fp16_int8.safetensors",
 "https://huggingface.co/DeepBeepMeep/Wan2.2/resolve/main/wan2.2_animate_14B_quanto_bf16_int8.safetensors"
 ],
+"preload_URLs" :
+[
+"https://huggingface.co/DeepBeepMeep/Wan2.2/resolve/main/wan2.2_animate_relighting_lora.safetensors"
+],
 "group": "wan2_2"
 }
 }
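The new preload_URLs entry pairs with the get_loras_transformer method added to WanAny2V later in this commit: when the Relighting variant is selected, the lora is looked up in the local ckpts folder by the basename of its URL. A minimal sketch of that resolution, assuming the JSON above is saved as a model definition file (the path in the usage comment is illustrative):

```python
import json, os

def resolve_preload_loras(model_def_path, ckpts_dir="ckpts"):
    """Map each preload_URLs entry of a model definition to its expected local checkpoint path."""
    with open(model_def_path, "r", encoding="utf-8") as f:
        model_def = json.load(f)["model"]
    urls = model_def.get("preload_URLs", [])
    # Same convention as the commit's get_loras_transformer: ckpts/<basename of the URL>.
    return [os.path.join(ckpts_dir, os.path.basename(url)) for url in urls]

# resolve_preload_loras("defaults/animate.json")   # illustrative path
# -> ['ckpts/wan2.2_animate_relighting_lora.safetensors']
```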

View File

@@ -292,7 +292,7 @@ def denoise(
 callback(-1, None, True)
 original_image_latents = None if img_cond_seq is None else img_cond_seq.clone()
+original_timesteps = timesteps
 morph, first_step = False, 0
 if img_msk_latents is not None:
 randn = torch.randn_like(original_image_latents)
@@ -309,7 +309,7 @@ def denoise(
 updated_num_steps= len(timesteps) -1
 if callback != None:
 from shared.utils.loras_mutipliers import update_loras_slists
-update_loras_slists(model, loras_slists, updated_num_steps)
+update_loras_slists(model, loras_slists, len(original_timesteps))
 callback(-1, None, True, override_num_inference_steps = updated_num_steps)
 from mmgp import offload
 # this is ignored for schnell
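Both here and in the Qwen pipeline below, update_loras_slists is now given the length of the original timestep list rather than updated_num_steps, so that step-indexed lora multiplier schedules stay aligned with the full denoising schedule even after the step list has been trimmed. A minimal sketch of why the count matters, using an illustrative expander rather than the repo's expand_slist:

```python
def expand_multipliers(per_phase, num_steps):
    """Stretch a short list of lora multipliers into one value per denoising step (illustrative)."""
    if num_steps <= 0:
        return []
    return [per_phase[min(int(i * len(per_phase) / num_steps), len(per_phase) - 1)]
            for i in range(num_steps)]

original_timesteps = list(range(1000, 0, -125))   # 8 planned steps (made-up values)
timesteps = original_timesteps[2:]                # schedule later truncated, e.g. image-to-image start

# Expanding over len(original_timesteps) keeps multiplier[i] tied to the absolute step index;
# expanding over the truncated length would silently compress the schedule.
print(expand_multipliers([0.5, 1.0], len(original_timesteps)))  # [0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0]
```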

View File

@@ -825,7 +825,7 @@ class QwenImagePipeline(): #DiffusionPipeline
 )
 num_warmup_steps = max(len(timesteps) - num_inference_steps * self.scheduler.order, 0)
 self._num_timesteps = len(timesteps)
+original_timesteps = timesteps
 # handle guidance
 if self.transformer.guidance_embeds:
 guidance = torch.full([1], guidance_scale, device=device, dtype=torch.float32)
@@ -860,7 +860,7 @@ class QwenImagePipeline(): #DiffusionPipeline
 updated_num_steps= len(timesteps)
 if callback != None:
 from shared.utils.loras_mutipliers import update_loras_slists
-update_loras_slists(self.transformer, loras_slists, updated_num_steps)
+update_loras_slists(self.transformer, loras_slists, len(original_timesteps))
 callback(-1, None, True, override_num_inference_steps = updated_num_steps)

View File

@@ -34,6 +34,7 @@ from shared.utils.vace_preprocessor import VaceVideoProcessor
 from shared.utils.basic_flowmatch import FlowMatchScheduler
 from shared.utils.utils import get_outpainting_frame_location, resize_lanczos, calculate_new_dimensions, convert_image_to_tensor, fit_image_into_canvas
 from .multitalk.multitalk_utils import MomentumBuffer, adaptive_projected_guidance, match_and_blend_colors, match_and_blend_colors_with_mask
+from shared.utils.audio_video import save_video
 from mmgp import safetensors2
 from shared.utils.audio_video import save_video
@@ -434,7 +435,7 @@ class WanAny2V:
 start_step_no = 0
 ref_images_count = 0
 trim_frames = 0
-extended_overlapped_latents = clip_image_start = clip_image_end = None
+extended_overlapped_latents = clip_image_start = clip_image_end = image_mask_latents = None
 no_noise_latents_injection = infinitetalk
 timestep_injection = False
 lat_frames = int((frame_num - 1) // self.vae_stride[0]) + 1
@@ -585,7 +586,7 @@ class WanAny2V:
 kwargs['cam_emb'] = cam_emb
 # Video 2 Video
-if denoising_strength < 1. and input_frames != None:
+if "G" in video_prompt_type and input_frames != None:
 height, width = input_frames.shape[-2:]
 source_latents = self.vae.encode([input_frames])[0].unsqueeze(0)
 injection_denoising_step = 0
@@ -616,6 +617,14 @@ class WanAny2V:
 if hasattr(sample_scheduler, "sigmas"): sample_scheduler.sigmas= sample_scheduler.sigmas[injection_denoising_step:]
 injection_denoising_step = 0
+if input_masks is not None and not "U" in video_prompt_type:
+    image_mask_latents = torch.nn.functional.interpolate(input_masks, size= source_latents.shape[-2:], mode="nearest").unsqueeze(0)
+    if image_mask_latents.shape[2] != 1:
+        image_mask_latents = torch.cat([ image_mask_latents[:,:, :1], torch.nn.functional.interpolate(image_mask_latents, size= (source_latents.shape[-3]-1, *source_latents.shape[-2:]), mode="nearest") ], dim=2)
+    image_mask_latents = torch.where(image_mask_latents>=0.5, 1., 0. )[:1].to(self.device)
+    # save_video(image_mask_latents.squeeze(0), "mama.mp4", value_range=(0,1) )
+    # image_mask_rebuilt = image_mask_latents.repeat_interleave(8, dim=-1).repeat_interleave(8, dim=-2).unsqueeze(0)
 # Phantom
 if phantom:
 lat_input_ref_images_neg = None
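The mask handling added above shrinks the pixel-space video mask onto the latent grid: nearest-neighbour resize to the latent height and width, the first frame kept on its own (the Wan VAE encodes frame 0 separately and then several pixel frames per latent frame), the remaining frames squeezed into the remaining latent slots, and finally a hard 0.5 binarisation. A minimal sketch of that idea, with illustrative shapes and helper name:

```python
import torch
import torch.nn.functional as F

def mask_to_latent_grid(mask, lat_frames, lat_h, lat_w):
    """Downsample a pixel-space video mask of shape (1, frames, H, W) to the latent grid."""
    m = F.interpolate(mask, size=(lat_h, lat_w), mode="nearest").unsqueeze(0)  # (1, 1, frames, lat_h, lat_w)
    if m.shape[2] != 1:
        # First frame stays as-is; the rest are squeezed into lat_frames - 1 temporal slots.
        rest = F.interpolate(m, size=(lat_frames - 1, lat_h, lat_w), mode="nearest")
        m = torch.cat([m[:, :, :1], rest], dim=2)
    return (m >= 0.5).float()  # hard binarisation, as in the commit

# Illustrative: 81 frames at 480x832 map to 21 latent frames at 60x104 with the Wan VAE strides.
mask = (torch.rand(1, 81, 480, 832) > 0.5).float()
latent_mask = mask_to_latent_grid(mask, lat_frames=21, lat_h=60, lat_w=104)
print(latent_mask.shape)  # torch.Size([1, 1, 21, 60, 104])
```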
@@ -737,7 +746,7 @@ class WanAny2V:
 denoising_extra = ""
 from shared.utils.loras_mutipliers import update_loras_slists, get_model_switch_steps
-phase_switch_step, phase_switch_step2, phases_description = get_model_switch_steps(timesteps, updated_num_steps, guide_phases, 0 if self.model2 is None else model_switch_phase, switch_threshold, switch2_threshold )
+phase_switch_step, phase_switch_step2, phases_description = get_model_switch_steps(original_timesteps, guide_phases, 0 if self.model2 is None else model_switch_phase, switch_threshold, switch2_threshold )
 if len(phases_description) > 0: set_header_text(phases_description)
 guidance_switch_done = guidance_switch2_done = False
 if guide_phases > 1: denoising_extra = f"Phase 1/{guide_phases} High Noise" if self.model2 is not None else f"Phase 1/{guide_phases}"
@@ -748,8 +757,8 @@ class WanAny2V:
 denoising_extra = f"Phase {phase_no}/{guide_phases} {'Low Noise' if trans == self.model2 else 'High Noise'}" if self.model2 is not None else f"Phase {phase_no}/{guide_phases}"
 callback(step_no-1, denoising_extra = denoising_extra)
 return guide_scale, guidance_switch_done, trans, denoising_extra
-update_loras_slists(self.model, loras_slists, updated_num_steps, phase_switch_step= phase_switch_step, phase_switch_step2= phase_switch_step2)
+update_loras_slists(self.model, loras_slists, len(original_timesteps), phase_switch_step= phase_switch_step, phase_switch_step2= phase_switch_step2)
-if self.model2 is not None: update_loras_slists(self.model2, loras_slists, updated_num_steps, phase_switch_step= phase_switch_step, phase_switch_step2= phase_switch_step2)
+if self.model2 is not None: update_loras_slists(self.model2, loras_slists, len(original_timesteps), phase_switch_step= phase_switch_step, phase_switch_step2= phase_switch_step2)
 callback(-1, None, True, override_num_inference_steps = updated_num_steps, denoising_extra = denoising_extra)
 def clear():
@@ -762,6 +771,7 @@ class WanAny2V:
 scheduler_kwargs = {} if isinstance(sample_scheduler, FlowMatchScheduler) else {"generator": seed_g}
 # b, c, lat_f, lat_h, lat_w
 latents = torch.randn(batch_size, *target_shape, dtype=torch.float32, device=self.device, generator=seed_g)
+if "G" in video_prompt_type: randn = latents
 if apg_switch != 0:
 apg_momentum = -0.75
 apg_norm_threshold = 55
@@ -784,23 +794,21 @@ class WanAny2V:
 timestep = torch.full((target_shape[-3],), t, dtype=torch.int64, device=latents.device)
 timestep[:source_latents.shape[2]] = 0
-kwargs.update({"t": timestep, "current_step": start_step_no + i})
+kwargs.update({"t": timestep, "current_step_no": i, "real_step_no": start_step_no + i })
 kwargs["slg_layers"] = slg_layers if int(slg_start * sampling_steps) <= i < int(slg_end * sampling_steps) else None
 if denoising_strength < 1 and i <= injection_denoising_step:
 sigma = t / 1000
-noise = torch.randn(batch_size, *target_shape, dtype=torch.float32, device=self.device, generator=seed_g)
 if inject_from_start:
-new_latents = latents.clone()
+noisy_image = latents.clone()
-new_latents[:,:, :source_latents.shape[2] ] = noise[:, :, :source_latents.shape[2] ] * sigma + (1 - sigma) * source_latents
+noisy_image[:,:, :source_latents.shape[2] ] = randn[:, :, :source_latents.shape[2] ] * sigma + (1 - sigma) * source_latents
 for latent_no, keep_latent in enumerate(latent_keep_frames):
 if not keep_latent:
-new_latents[:, :, latent_no:latent_no+1 ] = latents[:, :, latent_no:latent_no+1]
+noisy_image[:, :, latent_no:latent_no+1 ] = latents[:, :, latent_no:latent_no+1]
-latents = new_latents
+latents = noisy_image
-new_latents = None
+noisy_image = None
 else:
-latents = noise * sigma + (1 - sigma) * source_latents
+latents = randn * sigma + (1 - sigma) * source_latents
-noise = None
 if extended_overlapped_latents != None:
 if no_noise_latents_injection:
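In the video-to-video injection path above, the source latents are now re-noised with the same Gaussian tensor that seeded the initial latents (the randn captured when "G" is in video_prompt_type), instead of drawing fresh noise at every injected step. This keeps the injected frames on the same noise trajectory as the rest of the sample and lines up with the masked blending applied after each scheduler step further down. A minimal sketch of the flow-matching style partial noising, with illustrative shapes:

```python
import torch

def renoise_source(source_latents, randn, t):
    """Flow-matching style partial noising: x_sigma = sigma * noise + (1 - sigma) * x0."""
    sigma = t / 1000.0  # Wan timesteps run from ~1000 (pure noise) down to 0 (clean)
    return randn * sigma + (1.0 - sigma) * source_latents

# Illustrative latent shape: (batch, channels, latent_frames, lat_h, lat_w)
x0 = torch.zeros(1, 16, 21, 60, 104)
randn = torch.randn_like(x0)
print(renoise_source(x0, randn, t=750).std())  # ~0.75: still mostly noise at an early step
```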
@@ -940,6 +948,13 @@ class WanAny2V:
 latents,
 **scheduler_kwargs)[0]
+if image_mask_latents is not None:
+    sigma = 0 if i == len(timesteps)-1 else timesteps[i+1]/1000
+    noisy_image = randn * sigma + (1 - sigma) * source_latents
+    latents = noisy_image * (1-image_mask_latents) + image_mask_latents * latents
 if callback is not None:
 latents_preview = latents
 if ref_images_before and ref_images_count > 0: latents_preview = latents_preview[:, :, ref_images_count: ]
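This blend is the core of the new Masked Video 2 Video feature: after every scheduler step, everything outside the latent mask is overwritten with the source latents re-noised to the next step's sigma, so the prompt, loras or upsampling only ever act inside the masked region. A standalone sketch of the blend (names and shapes are illustrative, not the class internals):

```python
import torch

def masked_blend(latents, source_latents, randn, mask, next_sigma):
    """Keep generation only inside the mask; outside it, pin latents to the re-noised source.

    mask is 1 where the edit should apply and 0 where the source video must be preserved,
    mirroring the blending step added in this commit.
    """
    noisy_source = randn * next_sigma + (1.0 - next_sigma) * source_latents
    return noisy_source * (1.0 - mask) + mask * latents

# Illustrative: edit only the left half of the latent width.
lat = torch.randn(1, 16, 21, 60, 104)
src = torch.zeros_like(lat)
rnd = torch.randn_like(lat)
mask = torch.zeros(1, 1, 21, 60, 104)
mask[..., :52] = 1.0  # left half of the 104-wide latent grid
out = masked_blend(lat, src, rnd, mask, next_sigma=0.5)
print(out.shape)  # torch.Size([1, 16, 21, 60, 104])
```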
@@ -999,3 +1014,11 @@ class WanAny2V:
 setattr(target, "face_adapter_fuser_blocks", module )
 delattr(model, "face_adapter")
+def get_loras_transformer(self, get_model_recursive_prop, base_model_type, model_type, video_prompt_type, model_mode, **kwargs):
+    if base_model_type == "animate":
+        if "1" in video_prompt_type:
+            preloadURLs = get_model_recursive_prop(model_type, "preload_URLs")
+            if len(preloadURLs) > 0:
+                return [os.path.join("ckpts", os.path.basename(preloadURLs[0]))] , [1]
+    return [], []

View File

@@ -1220,7 +1220,8 @@ class WanModel(ModelMixin, ConfigMixin):
 y=None,
 freqs = None,
 pipeline = None,
-current_step = 0,
+current_step_no = 0,
+real_step_no = 0,
 x_id= 0,
 max_steps = 0,
 slg_layers=None,
@@ -1310,9 +1311,9 @@ class WanModel(ModelMixin, ConfigMixin):
 del causal_mask
 offload.shared_state["embed_sizes"] = grid_sizes
-offload.shared_state["step_no"] = current_step
+offload.shared_state["step_no"] = real_step_no
 offload.shared_state["max_steps"] = max_steps
-if current_step == 0 and x_id == 0: clear_caches()
+if current_step_no == 0 and x_id == 0: clear_caches()
 # arguments
 kwargs = dict(
@@ -1336,7 +1337,7 @@ class WanModel(ModelMixin, ConfigMixin):
 if standin_ref is not None:
 standin_cache_enabled = False
 kwargs["standin_phase"] = 2
-if current_step == 0 or not standin_cache_enabled :
+if current_step_no == 0 or not standin_cache_enabled :
 standin_x = self.patch_embedding(standin_ref).to(modulation_dtype).flatten(2).transpose(1, 2)
 standin_e = self.time_embedding( sinusoidal_embedding_1d(self.freq_dim, torch.zeros_like(t)).to(modulation_dtype) )
 standin_e0 = self.time_projection(standin_e).unflatten(1, (6, self.dim)).to(e.dtype)
@@ -1401,7 +1402,7 @@ class WanModel(ModelMixin, ConfigMixin):
 skips_steps_cache = self.cache
 if skips_steps_cache != None:
 if skips_steps_cache.cache_type == "mag":
-if current_step <= skips_steps_cache.start_step:
+if real_step_no <= skips_steps_cache.start_step:
 should_calc = True
 elif skips_steps_cache.one_for_all and x_id != 0: # not joint pass, not main pas, one for all
 assert len(x_list) == 1
@@ -1410,7 +1411,7 @@ class WanModel(ModelMixin, ConfigMixin):
 x_should_calc = []
 for i in range(1 if skips_steps_cache.one_for_all else len(x_list)):
 cur_x_id = i if joint_pass else x_id
-cur_mag_ratio = skips_steps_cache.mag_ratios[current_step * 2 + cur_x_id] # conditional and unconditional in one list
+cur_mag_ratio = skips_steps_cache.mag_ratios[real_step_no * 2 + cur_x_id] # conditional and unconditional in one list
 skips_steps_cache.accumulated_ratio[cur_x_id] *= cur_mag_ratio # magnitude ratio between current step and the cached step
 skips_steps_cache.accumulated_steps[cur_x_id] += 1 # skip steps plus 1
 cur_skip_err = np.abs(1-skips_steps_cache.accumulated_ratio[cur_x_id]) # skip error of current steps
@@ -1430,7 +1431,7 @@ class WanModel(ModelMixin, ConfigMixin):
 if x_id != 0:
 should_calc = skips_steps_cache.should_calc
 else:
-if current_step <= skips_steps_cache.start_step or current_step == skips_steps_cache.num_steps-1:
+if real_step_no <= skips_steps_cache.start_step or real_step_no == skips_steps_cache.num_steps-1:
 should_calc = True
 skips_steps_cache.accumulated_rel_l1_distance = 0
 else:
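The renamed arguments separate two step counters: current_step_no is the index within the current denoising run (== 0 triggers clear_caches for a fresh run), while real_step_no also includes the start_step_no offset and is what indexes precomputed tables in the "mag" skip-steps cache. Roughly, that cache tracks how far the output magnitude is expected to drift while steps are being skipped; the sketch below is a simplified illustration of that decision, with made-up threshold and reset values rather than WanGP's actual ones:

```python
def should_compute_step(real_step_no, x_id, cache, start_step=2, err_threshold=0.06, max_skips=3):
    """Simplified magnitude-ratio skip decision (illustrative, not the repo's implementation)."""
    if real_step_no <= start_step:                       # always compute the first steps
        cache["accumulated_ratio"][x_id] = 1.0
        cache["accumulated_steps"][x_id] = 0
        return True
    ratio = cache["mag_ratios"][real_step_no * 2 + x_id]  # cond / uncond interleaved per step
    cache["accumulated_ratio"][x_id] *= ratio
    cache["accumulated_steps"][x_id] += 1
    skip_err = abs(1.0 - cache["accumulated_ratio"][x_id])
    if skip_err > err_threshold or cache["accumulated_steps"][x_id] > max_skips:
        cache["accumulated_ratio"][x_id] = 1.0            # drifted too far: compute and reset
        cache["accumulated_steps"][x_id] = 0
        return True
    return False                                          # safe to reuse the cached residual

cache = {"mag_ratios": [1.0] * 200, "accumulated_ratio": [1.0, 1.0], "accumulated_steps": [0, 0]}
print(should_compute_step(5, 0, cache))  # False here: with ratios ~1.0 the cached residual is reused
```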

View File

@@ -124,12 +124,20 @@ class family_handler():
 if base_model_type in ["t2v"]:
 extra_model_def["guide_custom_choices"] = {
-"choices":[("Use Text Prompt Only", ""),("Video to Video guided by Text Prompt", "GUV")],
+"choices":[("Use Text Prompt Only", ""),
+    ("Video to Video guided by Text Prompt", "GUV"),
+    ("Video to Video guided by Text Prompt and Restricted to the Area of the Video Mask", "GVA")],
 "default": "",
-"letters_filter": "GUV",
+"letters_filter": "GUVA",
 "label": "Video to Video"
 }
+extra_model_def["mask_preprocessing"] = {
+    "selection":[ "", "A"],
+    "visible": False
+}
 if base_model_type in ["infinitetalk"]:
 extra_model_def["no_background_removal"] = True
 extra_model_def["all_image_refs_are_background_ref"] = True
@@ -160,14 +168,22 @@ class family_handler():
 if base_model_type in ["animate"]:
 extra_model_def["guide_custom_choices"] = {
 "choices":[
-("Animate Person in Reference Image using Motion of Person in Control Video", "PVBXAKI"),
-("Replace Person in Control Video Person in Reference Image", "PVBAI"),
+("Animate Person in Reference Image using Motion of Whole Control Video", "PVBKI"),
+("Animate Person in Reference Image using Motion of Targeted Person in Control Video", "PVBXAKI"),
+("Replace Person in Control Video by Person in Reference Image", "PVBAI"),
+("Replace Person in Control Video by Person in Reference Image and Apply Relighting Process", "PVBAI1"),
 ],
-"default": "KI",
+"default": "PVBKI",
-"letters_filter": "PVBXAKI",
+"letters_filter": "PVBXAKI1",
 "label": "Type of Process",
 "show_label" : False,
 }
+extra_model_def["mask_preprocessing"] = {
+    "selection":[ "", "A", "XA"],
+    "visible": False
+}
 extra_model_def["video_guide_outpainting"] = [0,1]
 extra_model_def["keep_frames_video_guide_not_supported"] = True
 extra_model_def["extract_guide_from_window_start"] = True
@@ -480,8 +496,8 @@ class family_handler():
 "video_prompt_type": "UV",
 })
 elif base_model_type in ["animate"]:
 ui_defaults.update({
-"video_prompt_type": "PVBXAKI",
+"video_prompt_type": "PVBKI",
 "mask_expand": 20,
 "audio_prompt_type": "R",
 })
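video_prompt_type is a small string of single-letter flags, and each dropdown only owns the letters listed in its letters_filter, which is why the filters had to grow to "GUVA" and "PVBXAKI1" to cover the new choices (the new "1" letter is what get_loras_transformer checks before loading the relighting lora). A minimal sketch of the letter juggling, with illustrative reimplementations of the del_in_sequence / add_to_sequence helpers used in wgp.py (their exact behaviour in the repo may differ):

```python
def del_in_sequence(sequence, letters):
    """Illustrative: drop every letter of `letters` from the flag string."""
    return "".join(ch for ch in sequence if ch not in letters)

def add_to_sequence(sequence, letters):
    """Illustrative: append the letters that are not already present."""
    return sequence + "".join(ch for ch in letters if ch not in sequence)

# The animate dropdown owns "PVBXAKI1"; switching from the targeted-person Animate choice
# (PVBXAKI) to Replace-with-Relighting (PVBAI1) rewrites only those letters and leaves
# unrelated flags untouched.
video_prompt_type = "PVBXAKI"
video_prompt_type = del_in_sequence(video_prompt_type, "PVBXAKI1")
video_prompt_type = add_to_sequence(video_prompt_type, "PVBAI1")
print(video_prompt_type)  # PVBAI1
```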

View File

@@ -99,7 +99,7 @@ def parse_loras_multipliers(loras_multipliers, nb_loras, num_inference_steps, me
 return loras_list_mult_choices_nums, slists_dict, ""
-def update_loras_slists(trans, slists_dict, num_inference_steps, phase_switch_step = None, phase_switch_step2 = None ):
+def update_loras_slists(trans, slists_dict, num_inference_steps, phase_switch_step = None, phase_switch_step2 = None):
 from mmgp import offload
 sz = len(slists_dict["phase1"])
 slists = [ expand_slist(slists_dict, i, num_inference_steps, phase_switch_step, phase_switch_step2 ) for i in range(sz) ]
@@ -108,7 +108,8 @@ def update_loras_slists(trans, slists_dict, num_inference_steps, phase_switch_st
-def get_model_switch_steps(timesteps, total_num_steps, guide_phases, model_switch_phase, switch_threshold, switch2_threshold ):
+def get_model_switch_steps(timesteps, guide_phases, model_switch_phase, switch_threshold, switch2_threshold ):
+    total_num_steps = len(timesteps)
 model_switch_step = model_switch_step2 = None
 for i, t in enumerate(timesteps):
 if guide_phases >=2 and model_switch_step is None and t <= switch_threshold: model_switch_step = i
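get_model_switch_steps now derives the total step count from the timestep list itself instead of taking a separate total_num_steps argument that could drift out of sync with it. The switch points are simply the first steps whose timestep falls at or below the corresponding threshold; an illustrative version of that scan (not the repo's exact function, which also builds a phase description):

```python
def find_switch_steps(timesteps, guide_phases, switch_threshold, switch2_threshold):
    """With timesteps sorted high -> low, a phase switch happens at the first step
    whose t drops to or below its threshold (illustrative)."""
    switch1 = switch2 = None
    for i, t in enumerate(timesteps):
        if guide_phases >= 2 and switch1 is None and t <= switch_threshold:
            switch1 = i
        if guide_phases >= 3 and switch2 is None and t <= switch2_threshold:
            switch2 = i
    return switch1, switch2

timesteps = [1000, 875, 750, 625, 500, 375, 250, 125]
print(find_switch_steps(timesteps, guide_phases=2, switch_threshold=700, switch2_threshold=0))
# (3, None): the high-noise model handles steps 0-2, the low-noise model takes over at step 3.
```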

wgp.py
View File

@@ -63,7 +63,7 @@ AUTOSAVE_FILENAME = "queue.zip"
 PROMPT_VARS_MAX = 10
 target_mmgp_version = "3.6.0"
-WanGP_version = "8.73"
+WanGP_version = "8.74"
 settings_version = 2.36
 max_source_video_frames = 3000
 prompt_enhancer_image_caption_model, prompt_enhancer_image_caption_processor, prompt_enhancer_llm_model, prompt_enhancer_llm_tokenizer = None, None, None, None
@@ -3699,13 +3699,15 @@ def extract_faces_from_video_with_mask(input_video_path, input_mask_path, max_fr
 any_mask = input_mask_path != None
 video = get_resampled_video(input_video_path, start_frame, max_frames, target_fps)
 if len(video) == 0: return None
+frame_height, frame_width, _ = video[0].shape
+num_frames = len(video)
 if any_mask:
 mask_video = get_resampled_video(input_mask_path, start_frame, max_frames, target_fps)
-frame_height, frame_width, _ = video[0].shape
-num_frames = min(len(video), len(mask_video))
+num_frames = min(num_frames, len(mask_video))
 if num_frames == 0: return None
-video, mask_video = video[:num_frames], mask_video[:num_frames]
+video = video[:num_frames]
+if any_mask:
+mask_video = mask_video[:num_frames]
 from preprocessing.face_preprocessor import FaceProcessor
 face_processor = FaceProcessor()
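The reshuffle above appears to fix the mask-less call path: frame_height, frame_width and num_frames used to be computed only inside the if any_mask: branch, so calling the extractor without a mask video could not work. They are now derived from the video first and only tightened when a mask is present. A standalone sketch of the fixed trimming logic (helper name is illustrative):

```python
def trim_video_and_mask(video, mask_video=None):
    """Trim video (and optional mask video) to their common length; mask is optional."""
    if len(video) == 0:
        return None, None
    num_frames = len(video)
    if mask_video is not None:
        num_frames = min(num_frames, len(mask_video))
    if num_frames == 0:
        return None, None
    video = video[:num_frames]
    if mask_video is not None:
        mask_video = mask_video[:num_frames]
    return video, mask_video

# Without a mask the helper degrades gracefully instead of relying on variables
# that were only assigned inside the `if any_mask:` branch.
frames, masks = trim_video_and_mask([f"frame{i}" for i in range(5)], None)
print(len(frames), masks)  # 5 None
```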
@@ -6970,39 +6972,49 @@ def refresh_video_prompt_type_alignment(state, video_prompt_type, video_prompt_t
 all_guide_processes ="PDESLCMUVB"
-def refresh_video_prompt_type_video_guide(state, video_prompt_type, video_prompt_type_video_guide, image_mode, old_image_mask_guide_value, old_image_guide_value, old_image_mask_value ):
+def refresh_video_prompt_type_video_guide(state, filter_type, video_prompt_type, video_prompt_type_video_guide, image_mode, old_image_mask_guide_value, old_image_guide_value, old_image_mask_value ):
-old_video_prompt_type = video_prompt_type
-video_prompt_type = del_in_sequence(video_prompt_type, all_guide_processes)
-video_prompt_type = add_to_sequence(video_prompt_type, video_prompt_type_video_guide)
-visible = "V" in video_prompt_type
 model_type = state["model_type"]
 model_def = get_model_def(model_type)
+old_video_prompt_type = video_prompt_type
+if filter_type == "alt":
+    guide_custom_choices = model_def.get("guide_custom_choices",{})
+    letter_filter = guide_custom_choices.get("letters_filter","")
+else:
+    letter_filter = all_guide_processes
+video_prompt_type = del_in_sequence(video_prompt_type, letter_filter)
+video_prompt_type = add_to_sequence(video_prompt_type, video_prompt_type_video_guide)
+visible = "V" in video_prompt_type
 any_outpainting= image_mode in model_def.get("video_guide_outpainting", [])
 mask_visible = visible and "A" in video_prompt_type and not "U" in video_prompt_type
 image_outputs = image_mode > 0
 keep_frames_video_guide_visible = not image_outputs and visible and not model_def.get("keep_frames_video_guide_not_supported", False)
 image_mask_guide, image_guide, image_mask = switch_image_guide_editor(image_mode, old_video_prompt_type , video_prompt_type, old_image_mask_guide_value, old_image_guide_value, old_image_mask_value )
+# mask_video_input_visible = image_mode == 0 and mask_visible
 mask_preprocessing = model_def.get("mask_preprocessing", None)
 if mask_preprocessing is not None:
 mask_selector_visible = mask_preprocessing.get("visible", True)
 else:
 mask_selector_visible = True
-return video_prompt_type, gr.update(visible = visible and not image_outputs), image_guide, gr.update(visible = keep_frames_video_guide_visible), gr.update(visible = visible and "G" in video_prompt_type), gr.update(visible= (visible or "F" in video_prompt_type or "K" in video_prompt_type) and any_outpainting), gr.update(visible= visible and mask_selector_visible and not "U" in video_prompt_type ) , gr.update(visible= mask_visible and not image_outputs), image_mask, image_mask_guide, gr.update(visible= mask_visible)
-def refresh_video_prompt_type_video_guide_alt(state, video_prompt_type, video_prompt_type_video_guide_alt, image_mode):
-model_def = get_model_def(state["model_type"])
-guide_custom_choices = model_def.get("guide_custom_choices",{})
-video_prompt_type = del_in_sequence(video_prompt_type, guide_custom_choices.get("letters_filter",""))
-video_prompt_type = add_to_sequence(video_prompt_type, video_prompt_type_video_guide_alt)
-control_video_visible = "V" in video_prompt_type
 ref_images_visible = "I" in video_prompt_type
-denoising_strength_visible = "G" in video_prompt_type
+return video_prompt_type, gr.update(visible = visible and not image_outputs), image_guide, gr.update(visible = keep_frames_video_guide_visible), gr.update(visible = visible and "G" in video_prompt_type), gr.update(visible= (visible or "F" in video_prompt_type or "K" in video_prompt_type) and any_outpainting), gr.update(visible= visible and mask_selector_visible and not "U" in video_prompt_type ) , gr.update(visible= mask_visible and not image_outputs), image_mask, image_mask_guide, gr.update(visible= mask_visible), gr.update(visible = ref_images_visible )
-return video_prompt_type, gr.update(visible = control_video_visible and image_mode ==0), gr.update(visible = control_video_visible and image_mode >=1), gr.update(visible = ref_images_visible ), gr.update(visible = denoising_strength_visible )
-# def refresh_video_prompt_video_guide_trigger(state, video_prompt_type, video_prompt_type_video_guide):
-# video_prompt_type_video_guide = video_prompt_type_video_guide.split("#")[0]
-# return refresh_video_prompt_type_video_guide(state, video_prompt_type, video_prompt_type_video_guide)
+# def refresh_video_prompt_type_video_guide_alt(state, video_prompt_type, video_prompt_type_video_guide_alt, image_mode, old_image_mask_guide_value, old_image_guide_value, old_image_mask_value ):
+# old_video_prompt_type = video_prompt_type
+# model_def = get_model_def(state["model_type"])
+# guide_custom_choices = model_def.get("guide_custom_choices",{})
+# video_prompt_type = del_in_sequence(video_prompt_type, guide_custom_choices.get("letters_filter",""))
+# video_prompt_type = add_to_sequence(video_prompt_type, video_prompt_type_video_guide_alt)
+# image_outputs = image_mode > 0
+# control_video_visible = "V" in video_prompt_type
+# ref_images_visible = "I" in video_prompt_type
+# denoising_strength_visible = "G" in video_prompt_type
+# mask_expand_visible = control_video_visible and "A" in video_prompt_type and not "U" in video_prompt_type
+# mask_video_input_visible = image_mode == 0 and mask_expand_visible
+# image_mask_guide, image_guide, image_mask = switch_image_guide_editor(image_mode, old_video_prompt_type , video_prompt_type, old_image_mask_guide_value, old_image_guide_value, old_image_mask_value )
+# keep_frames_video_guide_visible = not image_outputs and visible and not model_def.get("keep_frames_video_guide_not_supported", False)
+# return video_prompt_type, gr.update(visible = control_video_visible and image_mode ==0), gr.update(visible = control_video_visible and image_mode >=1), gr.update(visible = ref_images_visible ), gr.update(visible = denoising_strength_visible ), gr.update(visible = mask_video_input_visible ), gr.update(visible = mask_expand_visible), image_mask_guide, image_guide, image_mask, gr.update(visible = keep_frames_video_guide_visible)
 def refresh_preview(state):
 gen = get_gen_info(state)
@@ -8294,9 +8306,10 @@ def generate_video_tab(update_form = False, state_dict = None, ui_defaults = Non
 image_prompt_type_radio.change(fn=refresh_image_prompt_type_radio, inputs=[state, image_prompt_type, image_prompt_type_radio], outputs=[image_prompt_type, image_start_row, image_end_row, video_source, keep_frames_video_source, image_prompt_type_endcheckbox], show_progress="hidden" )
 image_prompt_type_endcheckbox.change(fn=refresh_image_prompt_type_endcheckbox, inputs=[state, image_prompt_type, image_prompt_type_radio, image_prompt_type_endcheckbox], outputs=[image_prompt_type, image_end_row] )
 video_prompt_type_image_refs.input(fn=refresh_video_prompt_type_image_refs, inputs = [state, video_prompt_type, video_prompt_type_image_refs,image_mode], outputs = [video_prompt_type, image_refs_row, remove_background_images_ref, image_refs_relative_size, frames_positions,video_guide_outpainting_col], show_progress="hidden")
-video_prompt_type_video_guide.input(fn=refresh_video_prompt_type_video_guide, inputs = [state, video_prompt_type, video_prompt_type_video_guide, image_mode, image_mask_guide, image_guide, image_mask], outputs = [video_prompt_type, video_guide, image_guide, keep_frames_video_guide, denoising_strength, video_guide_outpainting_col, video_prompt_type_video_mask, video_mask, image_mask, image_mask_guide, mask_expand], show_progress="hidden")
+video_prompt_type_video_guide.input(fn=refresh_video_prompt_type_video_guide, inputs = [state, gr.State(""), video_prompt_type, video_prompt_type_video_guide, image_mode, image_mask_guide, image_guide, image_mask], outputs = [video_prompt_type, video_guide, image_guide, keep_frames_video_guide, denoising_strength, video_guide_outpainting_col, video_prompt_type_video_mask, video_mask, image_mask, image_mask_guide, mask_expand, image_refs_row], show_progress="hidden")
-video_prompt_type_video_guide_alt.input(fn=refresh_video_prompt_type_video_guide_alt, inputs = [state, video_prompt_type, video_prompt_type_video_guide_alt, image_mode], outputs = [video_prompt_type, video_guide, image_guide, image_refs_row, denoising_strength ], show_progress="hidden")
+video_prompt_type_video_guide_alt.input(fn=refresh_video_prompt_type_video_guide, inputs = [state, gr.State("alt"), video_prompt_type, video_prompt_type_video_guide_alt, image_mode, image_mask_guide, image_guide, image_mask], outputs = [video_prompt_type, video_guide, image_guide, keep_frames_video_guide, denoising_strength, video_guide_outpainting_col, video_prompt_type_video_mask, video_mask, image_mask, image_mask_guide, mask_expand, image_refs_row], show_progress="hidden")
+# video_prompt_type_video_guide_alt.input(fn=refresh_video_prompt_type_video_guide_alt, inputs = [state, video_prompt_type, video_prompt_type_video_guide_alt, image_mode, image_mask_guide, image_guide, image_mask], outputs = [video_prompt_type, video_guide, image_guide, image_refs_row, denoising_strength, video_mask, mask_expand, image_mask_guide, image_guide, image_mask, keep_frames_video_guide ], show_progress="hidden")
 video_prompt_type_video_mask.input(fn=refresh_video_prompt_type_video_mask, inputs = [state, video_prompt_type, video_prompt_type_video_mask, image_mode, image_mask_guide, image_guide, image_mask], outputs = [video_prompt_type, video_mask, image_mask_guide, image_guide, image_mask, mask_expand], show_progress="hidden")
 video_prompt_type_alignment.input(fn=refresh_video_prompt_type_alignment, inputs = [state, video_prompt_type, video_prompt_type_alignment], outputs = [video_prompt_type])
 multi_prompts_gen_type.select(fn=refresh_prompt_labels, inputs=[multi_prompts_gen_type, image_mode], outputs=[prompt, wizard_prompt, image_end], show_progress="hidden")
 video_guide_outpainting_top.input(fn=update_video_guide_outpainting, inputs=[video_guide_outpainting, video_guide_outpainting_top, gr.State(0)], outputs = [video_guide_outpainting], trigger_mode="multiple" )
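The two dropdown events are now wired to the same refresh_video_prompt_type_video_guide handler, distinguished only by a constant gr.State("") / gr.State("alt") input that selects which letters_filter applies. A minimal, self-contained illustration of that Gradio pattern (the components and choices here are made up, not WanGP's actual UI):

```python
import gradio as gr

def refresh_guide(filter_type, selection):
    """One handler serves both dropdowns; the gr.State constant tells it which one fired."""
    scope = "alt letters_filter" if filter_type == "alt" else "all guide processes"
    return f"{selection!r} applied over {scope}"

with gr.Blocks() as demo:
    status = gr.Textbox(label="status")
    main_choice = gr.Dropdown(["", "GUV", "GVA"], label="Video to Video")
    alt_choice = gr.Dropdown(["PVBKI", "PVBAI1"], label="Type of Process")
    # Same callback, distinguished by a constant gr.State input - the pattern this commit
    # uses to merge refresh_video_prompt_type_video_guide and its _alt variant.
    main_choice.input(refresh_guide, [gr.State(""), main_choice], status)
    alt_choice.input(refresh_guide, [gr.State("alt"), alt_choice], status)

# demo.launch()
```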