Wan 2.1 VACE Plus is a unified video editing model supporting 5 functions: multi-image reference, video repainting, local editing, video extension, and video outpainting. Processing time: 5-10 minutes.
API Key authentication. Format: Bearer YOUR_API_KEY.
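As a minimal sketch of the authentication format above, the helper below builds request headers around the `Bearer YOUR_API_KEY` value. Sending it in the standard `Authorization` header and using a JSON body are assumptions; the text only states the Bearer format. The function name `build_auth_headers` is hypothetical.

```python
def build_auth_headers(api_key: str) -> dict:
    """Build request headers that carry the key as 'Bearer YOUR_API_KEY'."""
    key = api_key.strip()
    if not key:
        raise ValueError("api_key must not be empty")
    return {
        # Assumption: the Bearer value travels in the standard Authorization header.
        "Authorization": f"Bearer {key}",
        # Assumption: the request body is JSON.
        "Content-Type": "application/json",
    }
```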
Video editing function to use. Determines which parameters are required and applicable: (1) image_reference: Generate video from reference images; (2) video_repainting: Apply style transfer to existing video using control conditions; (3) video_edit: Edit specific regions of video using mask; (4) video_extension: Extend video at start/end using frame/clip references; (5) video_outpainting: Expand video canvas boundaries
Allowed values: image_reference, video_repainting, video_edit, video_extension, video_outpainting. Example: "image_reference"
Content description in Chinese or English (1-800 characters). Describes the desired video content or editing effect. Required for all functions
Length: 1 - 800 characters. Example: "A girl walking through a forest"
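The two constraints above (one of five function values, and a prompt of 1 to 800 characters) can be checked on the client before a request is sent. This is a sketch only: the argument names `function` and `prompt` are hypothetical, since the page does not show the actual field names.

```python
VALID_FUNCTIONS = {
    "image_reference",
    "video_repainting",
    "video_edit",
    "video_extension",
    "video_outpainting",
}


def check_function_and_prompt(function: str, prompt: str) -> None:
    """Raise ValueError if the function name or prompt length is out of range."""
    if function not in VALID_FUNCTIONS:
        raise ValueError(f"unknown function: {function!r}")
    # The prompt is required for all functions and limited to 1-800 characters.
    if not 1 <= len(prompt) <= 800:
        raise ValueError("prompt must be 1-800 characters long")
```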
Reference image URLs array. Usage by function: (1) image_reference: 1-3 images required, used as visual references for video generation; (2) video_repainting: max 1 image optional, provides style reference; (3) video_edit: max 1 image optional, provides style reference; (4) other functions: not used. Images must be publicly accessible HTTP/HTTPS URLs
Length: 1 - 3 elements. Example:
[
"https://example.com/ref1.jpg",
"https://example.com/ref2.jpg"
]
Classification for each reference image: 'obj' (object/subject) or 'bg' (background). Only for image_reference function. Array length must match ref_images_url length. Maximum 1 'bg' element allowed. Helps the model distinguish between subject references and background style references
Allowed values: obj, bg. Example: ["obj", "bg"]
Input video URL. Usage by function: (1) video_repainting: required, the video to be repainted; (2) video_edit: required, the video to be edited; (3) video_outpainting: required, the video to expand; (4) video_extension: optional, provides reference style when extending; (5) image_reference: not used. Must be publicly accessible HTTP/HTTPS/OSS URL
"https://example.com/video.mp4"
Control condition type for structure preservation. Usage by function: (1) video_repainting: required, determines how video structure is preserved during repainting; (2) video_extension: required when video_url is present, maintains consistency with reference video; (3) video_edit: optional, provides structural guidance for editing; (4) other functions: not used. Options: 'posebodyface' (pose+body+face detection), 'posebody' (pose+body only), 'depth' (depth map), 'scribble' (edge detection), '' (empty string for no extraction)
Allowed values: posebodyface, posebody, depth, scribble, "" (empty string). Example: "depth"
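The per-function usage rules above for reference images, the input video, and the control condition can be collected into a small table-driven check. The sketch below makes several assumptions: the argument names `function`, `classes`, and `control` are hypothetical; the classification array is treated as optional; and reference images are rejected for functions where the text says "not used", although the service may simply ignore them.

```python
from typing import Optional, Sequence

CONTROL_TYPES = {"posebodyface", "posebody", "depth", "scribble", ""}
# (min, max) number of reference images per function, taken from the text above.
REF_IMAGE_LIMITS = {
    "image_reference": (1, 3),
    "video_repainting": (0, 1),
    "video_edit": (0, 1),
    "video_extension": (0, 0),
    "video_outpainting": (0, 0),
}
VIDEO_REQUIRED = {"video_repainting", "video_edit", "video_outpainting"}


def check_reference_images(function: str, ref_images_url: Sequence[str],
                           classes: Optional[Sequence[str]] = None) -> None:
    """Check image count and the obj/bg classification array."""
    low, high = REF_IMAGE_LIMITS[function]
    if not low <= len(ref_images_url) <= high:
        raise ValueError(f"{function} accepts {low}-{high} reference images")
    if classes is None:
        return
    if function != "image_reference":
        raise ValueError("classification applies only to image_reference")
    if len(classes) != len(ref_images_url):
        raise ValueError("classification length must match ref_images_url")
    if any(c not in ("obj", "bg") for c in classes):
        raise ValueError("each classification must be 'obj' or 'bg'")
    if list(classes).count("bg") > 1:
        raise ValueError("at most one 'bg' element is allowed")


def check_video_and_control(function: str, video_url: Optional[str] = None,
                            control: Optional[str] = None) -> None:
    """Check that video_url and the control condition are present when required."""
    if function in VIDEO_REQUIRED and not video_url:
        raise ValueError(f"{function} requires video_url")
    control_required = function == "video_repainting" or (
        function == "video_extension" and bool(video_url)
    )
    # An empty string is a valid value ("no extraction"), so test against None.
    if control_required and control is None:
        raise ValueError(f"{function} requires a control condition here")
    if control is not None and control not in CONTROL_TYPES:
        raise ValueError(f"unknown control condition: {control!r}")
```

For example, `check_reference_images("image_reference", urls, ["obj", "bg"])` passes for two URLs, while two `'bg'` entries raise a `ValueError`.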
Mask image URL for video_edit function. Defines the region to edit (white=edit, black=keep). Must provide either mask_image_url OR mask_video_url, not both. When using mask_image_url, also specify mask_frame_id to indicate which frame the mask corresponds to. The mask will be propagated to other frames based on mask_type setting
"https://example.com/mask.png"
Mask video URL for video_edit function. Provides frame-by-frame mask for precise control (white=edit, black=keep). Must provide either mask_image_url OR mask_video_url, not both. Mask video must have the same frame count as the input video. Use this for complex editing that requires different masks for different frames
"https://example.com/mask.mp4"
Frame ID (1-based index) indicating which frame the mask_image_url corresponds to. Only applicable when using mask_image_url in video_edit function. The mask will be propagated from this frame to others based on mask_type. For example, mask_frame_id=1 means the mask corresponds to the first frame of the video. Default is 1 (first frame)
Range: x >= 1. Example: 1
Mask propagation type for video_edit function. Options: (1) 'tracking': mask follows object movement across frames (recommended for moving objects); (2) 'fixed': mask stays in same position across all frames (recommended for static scenes or background edits). Default is 'tracking'
Allowed values: tracking, fixed. Example: "tracking"
Mask expansion ratio for video_edit function (0.0-1.0). Expands the mask boundary to include surrounding areas. 0.0 = no expansion (use exact mask), 1.0 = maximum expansion. Default is 0.05 (5% expansion). Useful for ensuring complete coverage of editing region and avoiding edge artifacts
Range: 0 <= x <= 1. Example: 0.1
Mask expansion mode for video_edit function. Determines how the mask is expanded when expand_ratio > 0. Options: (1) 'hull': convex hull expansion (smooth, rounded boundaries); (2) 'bbox': bounding box expansion (rectangular boundaries); (3) 'original': keep original mask shape while expanding. Default is 'hull'. Use 'hull' for natural objects, 'bbox' for rectangular regions
Allowed values: hull, bbox, original. Example: "hull"
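The mask rules for video_edit described above (exactly one mask source, a 1-based frame ID, and the documented defaults) can be sketched as a single normalizing helper. The names `mask_image_url`, `mask_video_url`, `mask_frame_id`, `mask_type`, and `expand_ratio` appear in the text; `expand_mode` and the function name are hypothetical, and omitting `mask_frame_id` when a mask video is used is an assumption.

```python
from typing import Optional


def normalize_mask_options(
    mask_image_url: Optional[str] = None,
    mask_video_url: Optional[str] = None,
    mask_frame_id: int = 1,
    mask_type: str = "tracking",
    expand_ratio: float = 0.05,
    expand_mode: str = "hull",  # hypothetical field name for the expansion mode
) -> dict:
    """Validate video_edit mask settings and return them with defaults applied."""
    # Exactly one of the two mask sources must be given.
    if (mask_image_url is None) == (mask_video_url is None):
        raise ValueError("provide exactly one of mask_image_url or mask_video_url")
    if mask_type not in ("tracking", "fixed"):
        raise ValueError("mask_type must be 'tracking' or 'fixed'")
    if expand_mode not in ("hull", "bbox", "original"):
        raise ValueError("expand_mode must be 'hull', 'bbox' or 'original'")
    if not 0.0 <= expand_ratio <= 1.0:
        raise ValueError("expand_ratio must be between 0.0 and 1.0")

    options = {
        "mask_type": mask_type,
        "expand_ratio": expand_ratio,
        "expand_mode": expand_mode,
    }
    if mask_image_url is not None:
        # mask_frame_id is 1-based and only meaningful with a mask image.
        if mask_frame_id < 1:
            raise ValueError("mask_frame_id must be >= 1")
        options["mask_image_url"] = mask_image_url
        options["mask_frame_id"] = mask_frame_id
    else:
        options["mask_video_url"] = mask_video_url
    return options
```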
First frame image URL for video_extension function. Specifies the starting frame when extending video forward. At least one of first_frame_url/last_frame_url/first_clip_url/last_clip_url must be provided. Use this to define the exact starting point of the extended video. The model will generate smooth transition from this frame
"https://example.com/first_frame.jpg"
Last frame image URL for video_extension function. Specifies the ending frame when extending video backward. At least one of first_frame_url/last_frame_url/first_clip_url/last_clip_url must be provided. Use this to define the exact ending point of the extended video. The model will generate smooth transition to this frame
"https://example.com/last_frame.jpg"
First video clip URL for video_extension function. Provides a video segment to use as the starting portion. At least one of first_frame_url/last_frame_url/first_clip_url/last_clip_url must be provided. Use this when you want more control than a single frame can provide. The model will extend naturally from the end of this clip
"https://example.com/first_clip.mp4"
Last video clip URL for video_extension function. Provides a video segment to use as the ending portion. At least one of first_frame_url/last_frame_url/first_clip_url/last_clip_url must be provided. Use this when you want more control than a single frame can provide. The model will extend naturally to the beginning of this clip
"https://example.com/last_clip.mp4"
Top boundary expansion scale for video_outpainting function (1.0-2.0). 1.0 = no expansion (original height), 2.0 = double the top area. Default is 1.0. Use values > 1.0 to expand the canvas upward. For example, 1.5 means add 50% more canvas space above the original video. The model will generate content to fill the expanded area naturally
Range: 1 <= x <= 2. Example: 1.5
Bottom boundary expansion scale for video_outpainting function (1.0-2.0). 1.0 = no expansion (original height), 2.0 = double the bottom area. Default is 1.0. Use values > 1.0 to expand the canvas downward. For example, 1.5 means add 50% more canvas space below the original video. The model will generate content to fill the expanded area naturally
Range: 1 <= x <= 2. Example: 1.5
Left boundary expansion scale for video_outpainting function (1.0-2.0). 1.0 = no expansion (original width), 2.0 = double the left area. Default is 1.0. Use values > 1.0 to expand the canvas leftward. For example, 1.5 means add 50% more canvas space to the left of the original video. The model will generate content to fill the expanded area naturally
Range: 1 <= x <= 2. Example: 1.5
Right boundary expansion scale for video_outpainting function (1.0-2.0). 1.0 = no expansion (original width), 2.0 = double the right area. Default is 1.0. Use values > 1.0 to expand the canvas rightward. For example, 1.5 means add 50% more canvas space to the right of the original video. The model will generate content to fill the expanded area naturally
Range: 1 <= x <= 2. Example: 1.5
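To make the four expansion scales concrete, the sketch below estimates the relative canvas size they imply. It assumes one reading of the descriptions above: a scale of 1.5 on a side adds 50% of the original height (or width) on that side, and the additions from opposite sides are summed. The argument names `top`, `bottom`, `left`, and `right` are hypothetical, and the actual output resolution is still chosen separately from the fixed list of sizes, so this only illustrates proportions.

```python
def expanded_canvas(width: int, height: int, top: float = 1.0, bottom: float = 1.0,
                    left: float = 1.0, right: float = 1.0) -> tuple:
    """Estimate canvas size after outpainting, under the reading stated above."""
    for name, scale in (("top", top), ("bottom", bottom), ("left", left), ("right", right)):
        if not 1.0 <= scale <= 2.0:
            raise ValueError(f"{name} scale must be between 1.0 and 2.0")
    # Each side contributes (scale - 1) times the original dimension.
    new_width = round(width * (1 + (left - 1) + (right - 1)))
    new_height = round(height * (1 + (top - 1) + (bottom - 1)))
    return new_width, new_height
```

Under this reading, a 1280x720 source with a top scale of 1.5 yields a 1280x1080 canvas before it is fitted to the chosen output resolution.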
Output video resolution in width*height format. Available options: '1280*720' (16:9 landscape, HD), '720*1280' (9:16 portrait, mobile-friendly), '960*960' (1:1 square, social media), '1088*832' (4:3 landscape), '832*1088' (3:4 portrait). Default is '1280*720'. Choose based on your target platform and use case. Applicable to all functions
Allowed values: 1280*720, 720*1280, 960*960, 832*1088, 1088*832. Example: "1280*720"
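Because the resolution is a `width*height` string rather than two numbers, a small parser is handy when the dimensions are needed elsewhere in client code. A minimal sketch, with the hypothetical helper name `parse_size`:

```python
ALLOWED_SIZES = ("1280*720", "720*1280", "960*960", "832*1088", "1088*832")


def parse_size(size: str = "1280*720") -> tuple:
    """Split an allowed 'width*height' string into integer (width, height)."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}")
    width, height = size.split("*")
    return int(width), int(height)
```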
Video duration in seconds. Fixed at 5 seconds and cannot be modified. The model always generates 5-second videos regardless of this parameter value
Allowed value: 5. Example: 5
Enable intelligent prompt rewriting and enhancement. When true (default), the model will automatically optimize and expand your prompt for better results. When false, uses your prompt exactly as provided. Recommended to keep true unless you need precise control over the exact wording. Applicable to all functions
true
Random seed for reproducible results (0-2147483647). Using the same seed with identical parameters will produce similar (though not pixel-perfect identical) results. Useful for A/B testing different prompts while keeping other randomness constant. If not specified, a random seed is used each time. Applicable to all functions
Range: 0 <= x <= 2147483647. Example: 42
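The A/B testing use mentioned above amounts to holding the seed fixed while varying only the prompt. The sketch below builds such a set of request bodies; the field names `prompt` and `seed` and both helper names are hypothetical, since the page does not show the actual parameter names.

```python
import random
from typing import Optional

MAX_SEED = 2147483647


def resolve_seed(seed: Optional[int] = None) -> int:
    """Return the given seed if valid, or draw a random one in the allowed range."""
    if seed is None:
        return random.randint(0, MAX_SEED)  # both ends inclusive
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}")
    return seed


def ab_requests(base_params: dict, prompts: list, seed: Optional[int] = None) -> list:
    """Build one request body per prompt, all sharing the same seed."""
    fixed_seed = resolve_seed(seed)
    return [{**base_params, "prompt": p, "seed": fixed_seed} for p in prompts]
```

Since identical seeds give similar rather than pixel-identical results, differences between the resulting videos are mostly, but not exclusively, attributable to the prompt change.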