MiniMax’s multimodal vision model that blends text-to-image generation with visual reasoning for seamless cross-modal tasks.
API Key authentication. Format: Bearer YOUR_API_KEY.
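A minimal sketch of how the Bearer format above might be turned into request headers. The function name `build_auth_headers` is hypothetical, and the use of the standard `Authorization` header and a JSON content type are assumptions not stated on this page.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers carrying the API key as a Bearer token."""
    if not api_key or api_key.strip() != api_key:
        raise ValueError("api_key must be non-empty with no surrounding whitespace")
    return {
        # Assumption: the token is sent in the standard Authorization header.
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is JSON.
        "Content-Type": "application/json",
    }
```

The key is rejected early if it is empty or padded with whitespace, since either would produce a malformed `Bearer` value.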
Image description text. Supports Chinese and English.
Length: 1 - 1500
Example: "A girl in futuristic style"
Reference image list for image-to-image generation. Each element contains type and image_file fields
1 - 10 elements
Example:
[
  {
    "type": "character",
    "image_file": "https://example.com/reference.jpg"
  }
]
Image aspect ratio. If both aspect_ratio and width/height are provided, aspect_ratio takes priority
Allowed values: 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16, 21:9
Example: "1:1"
Image width in pixels (must be used with height). Must be divisible by 8. If aspect_ratio is provided, width/height will be ignored
512 <= x <= 2048
Example: 1024
Image height in pixels (must be used with width). Must be divisible by 8. If aspect_ratio is provided, width/height will be ignored
512 <= x <= 2048
Example: 1024
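The interaction between aspect_ratio, width, and height can be checked on the client before sending a request. The sketch below is one possible reading of the rules above; the helper name `resolve_size` is hypothetical, and the choice to raise an error when only one of width/height is given is an assumption about how "must be used with" should be enforced.

```python
ALLOWED_ASPECT_RATIOS = {"1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"}


def resolve_size(aspect_ratio=None, width=None, height=None):
    """Return only the size fields that would take effect."""
    # aspect_ratio takes priority: width/height are ignored when it is set.
    if aspect_ratio is not None:
        if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
        return {"aspect_ratio": aspect_ratio}

    # Nothing specified: leave sizing to the service.
    if width is None and height is None:
        return {}

    # width and height must be used together.
    if width is None or height is None:
        raise ValueError("width and height must be provided together")

    for name, value in (("width", width), ("height", height)):
        if not 512 <= value <= 2048:
            raise ValueError(f"{name} must be between 512 and 2048, got {value}")
        if value % 8 != 0:
            raise ValueError(f"{name} must be divisible by 8, got {value}")

    return {"width": width, "height": height}
```

For instance, `resolve_size("16:9", 1024, 1024)` drops the pixel dimensions and keeps only the aspect ratio, mirroring the priority rule.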
Random seed for reproducible results. Use the same seed with same parameters to generate identical images
x >= 0
Example: 42
Number of images to generate per request
1 <= x <= 9
Example: 1
Enable automatic prompt optimization to improve generation quality
Example: false
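The remaining limits listed above (description length, reference list size, seed, image count) can also be checked before a request is made. This is a rough sketch: the function name `check_request` and its argument names are hypothetical Python names, not the API's field names, and it assumes the 1 - 1500 limit counts characters.

```python
def check_request(prompt, references=None, seed=None, num_images=1):
    """Return a list of human-readable problems; empty means all checks passed."""
    errors = []

    # Assumption: the 1 - 1500 limit is measured in characters.
    if not 1 <= len(prompt) <= 1500:
        errors.append("description length must be between 1 and 1500")

    if references is not None:
        if not 1 <= len(references) <= 10:
            errors.append("reference list must contain 1 to 10 elements")
        for i, ref in enumerate(references):
            # Each element contains type and image_file fields.
            missing = {"type", "image_file"} - ref.keys()
            if missing:
                errors.append(f"reference {i} is missing {sorted(missing)}")

    if seed is not None and seed < 0:
        errors.append("seed must be >= 0")

    if not 1 <= num_images <= 9:
        errors.append("number of images must be between 1 and 9")

    return errors
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.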