Wanxiang Wan 2.7 Image-to-Video
POST
The Wanxiang Wan 2.7 image-to-video model accepts multimodal inputs (text, image, audio, video) and supports three tasks: first-frame video generation, first-and-last-frame video generation, and video continuation. It outputs 720P or 1080P video with a duration of 2 to 15 seconds and is billed by the second. Output includes audio by default.
Request Headers
Enum value: application/json
Bearer authentication format: Bearer {{API Key}}.
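A minimal sketch of building these headers in Python. The header names `Content-Type` and `Authorization` are assumed from standard HTTP conventions, since this page lists only the values, and `build_headers` is a hypothetical helper rather than part of any SDK.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers for a JSON body with Bearer authentication."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Assumed header names; the page documents only the values.
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.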
Request Body
Random seed used to improve the reproducibility of generated results. Value range: [0, 2147483647].
Text prompt used to describe the elements and visual characteristics expected in the generated video. Supports Chinese and English. Length limit: 0 to 5000 characters.
Generated video duration, in seconds, billed by the second. Must be an integer. Value range: [2, 15].
First-frame image URL. Supported formats: JPEG, JPG, PNG (transparent channels are not supported), BMP, WEBP. Resolution width and height range: [240, 8000] pixels; aspect ratio: 1:8 to 8:1; file size must not exceed 20 MB. Choose either this or first_clip_url; at least one must be provided.
Whether to add a watermark identifier. The watermark is located in the lower-right corner of the video.
Output video resolution tier, which affects cost. The video’s aspect ratio remains consistent with the input media. Options: 720P, 1080P.
Whether to enable intelligent prompt rewriting. When enabled, a large model rewrites the input prompt, which can significantly improve generation quality for shorter prompts but increases processing time.
URL of the first video clip, used for video continuation. The model will continue generating based on the video content. Supported formats: mp4, mov; duration: 2 to 10 seconds; resolution width and height range: [240, 4096] pixels; aspect ratio: 1:8 to 8:1; file size must not exceed 100 MB. Choose either this or image_url.
Last-frame image URL. Used together with the first frame to generate a first-and-last-frame video. Format restrictions are the same as for the first frame.
Negative prompt used to describe content that you do not want to see in the video. Supports Chinese and English. Length limit: 0 to 500 characters.
Driving audio URL. When provided, the model uses this audio as the driving source for the video (for example, lip sync or motion beats). If not provided, the model automatically generates matching background music or sound effects. Supported formats: wav, mp3; duration: 2 to 30 seconds; file size must not exceed 15 MB.
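The limits above can be checked on the client before sending a request. The following is a sketch only: `image_url` and `first_clip_url` are the only field names that appear on this page, so every other key (`seed`, `prompt`, `duration`, `resolution`, `last_frame_url`, `negative_prompt`) is a hypothetical placeholder, as is the `validate_request` helper itself. It also reads "choose either" as mutually exclusive, which is an interpretation rather than something the page states outright.

```python
def validate_request(body: dict) -> str:
    """Check a request body against the documented limits.

    Returns the task the inputs imply: "first_frame",
    "first_and_last_frame", or "video_continuation".
    All keys except image_url and first_clip_url are assumed names.
    """
    has_image = bool(body.get("image_url"))
    has_clip = bool(body.get("first_clip_url"))
    # Assumption: exactly one of the two starting inputs is allowed.
    if has_image == has_clip:
        raise ValueError("provide exactly one of image_url or first_clip_url")

    seed = body.get("seed")
    if seed is not None and not (isinstance(seed, int) and 0 <= seed <= 2147483647):
        raise ValueError("seed must be an integer in [0, 2147483647]")

    duration = body.get("duration")
    if duration is not None and not (isinstance(duration, int) and 2 <= duration <= 15):
        raise ValueError("duration must be an integer in [2, 15]")

    if len(body.get("prompt", "")) > 5000:
        raise ValueError("prompt must be at most 5000 characters")
    if len(body.get("negative_prompt", "")) > 500:
        raise ValueError("negative prompt must be at most 500 characters")

    resolution = body.get("resolution")
    if resolution is not None and resolution not in ("720P", "1080P"):
        raise ValueError("resolution must be 720P or 1080P")

    # Infer the task from which inputs are present.
    if has_clip:
        return "video_continuation"
    if body.get("last_frame_url"):
        return "first_and_last_frame"
    return "first_frame"
```

The check covers only values that can be verified without fetching the media; file size, format, pixel dimensions, and clip duration would still be enforced by the server.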
Response
Use the returned task_id to call the Get Task Result API and retrieve the generated output.
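Because generation is asynchronous, a client typically polls until the result is ready. The sketch below shows one way to do that. The endpoint and response shape of the Get Task Result API are not documented on this page, so the helper takes a caller-supplied `fetch_result` function (hypothetical) that wraps that API and returns `None` while the task is still running. The function name, interval, and timeout are illustrative assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_result(
    task_id: str,
    fetch_result: Callable[[str], Optional[dict]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> dict:
    """Poll fetch_result(task_id) until it returns a result or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result(task_id)
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} s")
        # Wait before asking again to avoid hammering the API.
        time.sleep(interval_s)
```

Keeping the HTTP call outside the helper means the polling logic can be tested with a stub and reused unchanged once the real task-result request is known.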