The first Kling text-to-video model to natively generate synchronized audio and video in one pass, including dialogue, ambient sounds, and lip-synced speech.
API Key authentication. Format: Bearer YOUR_API_KEY.
Text description of the video to generate. Chinese and English are supported.
Length: 1 - 2500. Example: "Forest stream with flowing water and birds chirping"
Negative prompt describing undesired elements.
Length: 1 - 2500. Example: "silent, empty, deserted"
Audio generation switch: on (enable) or off (disable). Only supported by V2.6 and later versions.
Allowed values: on, off. Example: "on"
Generation mode: std (standard) or pro (professional).
Allowed values: std, pro. Example: "pro"
Video aspect ratio.
Allowed values: 16:9, 9:16, 1:1. Example: "16:9"
Video duration in seconds.
Allowed values: 5, 10. Example: 10
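The parameters above could be assembled into a request roughly as follows. This is only a sketch under stated assumptions: the reference does not give the JSON field names, so prompt, negative_prompt, sound, mode, aspect_ratio, and duration are hypothetical placeholders, as is the helper build_request. It is also assumed that the Bearer value is sent in the standard Authorization header and that the 1 - 2500 range counts characters. The code validates inputs and builds the headers and body, but does not send anything, since the endpoint is not stated here.

```python
import json

# Allowed values copied from the parameter descriptions above.
ALLOWED_SOUND = {"on", "off"}
ALLOWED_MODE = {"std", "pro"}
ALLOWED_ASPECT_RATIO = {"16:9", "9:16", "1:1"}
ALLOWED_DURATION = {5, 10}
MAX_PROMPT_LENGTH = 2500  # assumed to be a character count


def _check_choice(name, value, allowed):
    """Raise ValueError if value is not one of the allowed options."""
    if value not in allowed:
        raise ValueError(f"{name} must be one of {sorted(allowed, key=str)}")


def build_request(api_key, prompt, negative_prompt=None, sound="on",
                  mode="pro", aspect_ratio="16:9", duration=10):
    """Validate inputs and return (headers, json_body).

    Every JSON field name used here is a hypothetical placeholder;
    check the actual API reference for the real names.
    """
    if not 1 <= len(prompt) <= MAX_PROMPT_LENGTH:
        raise ValueError("prompt length must be between 1 and 2500")
    if negative_prompt is not None and not 1 <= len(negative_prompt) <= MAX_PROMPT_LENGTH:
        raise ValueError("negative prompt length must be between 1 and 2500")
    _check_choice("sound", sound, ALLOWED_SOUND)
    _check_choice("mode", mode, ALLOWED_MODE)
    _check_choice("aspect_ratio", aspect_ratio, ALLOWED_ASPECT_RATIO)
    _check_choice("duration", duration, ALLOWED_DURATION)

    # Assumption: the Bearer value goes in the Authorization header.
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "prompt": prompt,
        "sound": sound,
        "mode": mode,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt
    return headers, json.dumps(body)


if __name__ == "__main__":
    headers, body = build_request(
        "YOUR_API_KEY",
        "Forest stream with flowing water and birds chirping",
        negative_prompt="silent, empty, deserted",
    )
    print(headers["Authorization"])
    print(body)
```

Validating on the client side before sending keeps obviously invalid combinations (for example a duration of 7) from consuming a request. Note that the audio switch is described as supported only by V2.6 and later, so a request targeting an older version would presumably need to omit it.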