The first Kling image-to-video model to natively generate synchronized audio and video in a single pass, supporting dialogue, sound effects, and lip-synced speech alongside motion brush.
API Key authentication. Format: Bearer YOUR_API_KEY.
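As a minimal sketch of the Bearer format described above: the helper below builds request headers from an API key. The `Authorization` header name follows the usual HTTP convention, and the `Content-Type` value is an assumption; neither is confirmed by this page.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build HTTP headers carrying the API key in Bearer format."""
    return {
        # Assumed header name; the page only specifies "Bearer YOUR_API_KEY".
        "Authorization": f"Bearer {api_key}",
        # Assumed: requests are sent as JSON.
        "Content-Type": "application/json",
    }
```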
Starting frame image URL or Base64
"https://example.com/image.jpg"
Ending frame image URL or Base64 (mutually exclusive with motion brush)
"https://example.com/end.jpg"
Video generation prompt
Length: 1 - 2500. Example: "Live concert with crowd cheering"
Negative prompt
Length: 1 - 2500. Example: "silent, empty"
Audio generation switch: on (enable) or off (disable). Only supported by V2.6.
Allowed values: on, off. Example: "on"
Audio configuration list that controls detailed audio generation parameters. Only supported by V2.6.
Generation mode: std or pro
Allowed values: std, pro. Example: "pro"
Static mask image for motion brush (mutually exclusive with image_tail)
"https://example.com/mask.jpg"
Dynamic mask array for motion brush (mutually exclusive with image_tail)
Video duration in seconds
Allowed values: 5, 10. Example: 10
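The constraints listed above can be checked before a request is sent. The following is a rough sketch, not the official client: it validates the documented limits (prompt length 1 - 2500, audio on/off, mode std/pro, duration 5 or 10, and the mutual exclusivity of the ending frame and motion brush masks) and assembles a request body. Every field name except `image_tail` is hypothetical, since this page does not show the actual parameter names. Treating the prompt as required and the duration as an integer are also assumptions.

```python
from typing import Any, Optional


def build_payload(
    image: str,
    prompt: str,
    *,
    image_tail: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    audio: str = "on",
    mode: str = "std",
    duration: int = 5,
    static_mask: Optional[str] = None,
    dynamic_masks: Optional[list] = None,
) -> dict[str, Any]:
    """Validate inputs against the documented limits and build a request body.

    All keys except "image_tail" are assumed names for illustration only.
    """
    if not 1 <= len(prompt) <= 2500:
        raise ValueError("prompt length must be between 1 and 2500")
    if negative_prompt is not None and not 1 <= len(negative_prompt) <= 2500:
        raise ValueError("negative prompt length must be between 1 and 2500")
    if audio not in ("on", "off"):
        raise ValueError("audio must be 'on' or 'off'")
    if mode not in ("std", "pro"):
        raise ValueError("mode must be 'std' or 'pro'")
    if duration not in (5, 10):
        raise ValueError("duration must be 5 or 10 seconds")
    # The ending frame cannot be combined with either motion brush mask.
    if image_tail is not None and (static_mask is not None or dynamic_masks):
        raise ValueError("image_tail is mutually exclusive with motion brush masks")

    payload: dict[str, Any] = {
        "image": image,        # assumed key: starting frame URL or Base64
        "prompt": prompt,      # assumed key
        "audio": audio,        # assumed key: audio switch (V2.6 only)
        "mode": mode,          # assumed key
        "duration": duration,  # assumed key
    }
    optional = {
        "image_tail": image_tail,            # name taken from this page
        "negative_prompt": negative_prompt,  # assumed key
        "static_mask": static_mask,          # assumed key
        "dynamic_masks": dynamic_masks,      # assumed key
    }
    # Include only the optional fields that were actually provided.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("https://example.com/image.jpg", "Live concert with crowd cheering", mode="pro", duration=10)` returns a body without any mask or ending frame fields, while passing both `image_tail` and `static_mask` raises a `ValueError`.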