POST /v3/async/wan2.7-r2v
Wanx Wan 2.7 Reference-to-Video
curl --request POST \
  --url https://api.highwayapi.ai/v3/async/wan2.7-r2v \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "seed": 123,
  "size": "<string>",
  "audio": true,
  "media": [
    {
      "url": "<string>",
      "type": "<string>",
      "reference_voice": "<string>"
    }
  ],
  "prompt": "<string>",
  "duration": 123,
  "shot_type": "<string>",
  "watermark": true,
  "negative_prompt": "<string>"
}
'
{
  "task_id": "<string>"
}
The Wanx Wan 2.7 reference-to-video model accepts multimodal input (text, images, and videos). It can take a person or an object as the protagonist and generate single-character performance videos or multi-character interaction videos, and it supports intelligent storyboarding for multi-shot videos. Output is available at 720P and 1080P, with durations from 2 to 10 seconds, billed by the second. The output includes audio by default.
This is an asynchronous API: the response contains only the task_id of the asynchronous task. Pass this task_id to the Get Task Result API to retrieve the generated result.
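As a rough illustration of the submission step, the sketch below builds the same POST request as the curl sample using only the Python standard library. The function name `build_request` is a placeholder chosen for this example, and the request is constructed without being sent, so it can be inspected first.

```python
import json
import urllib.request

# Endpoint taken from the curl sample on this page.
API_URL = "https://api.highwayapi.ai/v3/async/wan2.7-r2v"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a generation task.

    `build_request` is a hypothetical helper name, not part of any SDK.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer format as described under Request Headers.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The JSON body of the response should then carry the `task_id` described under Response.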

Request Headers

Content-Type
string
required
Enumerated value: application/json
Authorization
string
required
Bearer authentication format: Bearer {{API key}}.

Request Body

seed
integer
Random seed, used to improve the reproducibility of generated results. Value range: [0, 2147483647].
size
string
default:"1920*1080"
Output video resolution (width*height), which affects cost. 720P tier: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4). 1080P tier: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4). Available values: 1280*720, 720*1280, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632
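Because the size string must be one of a fixed set of values, it can be convenient to look it up by tier and aspect ratio rather than typing it by hand. The following is a minimal sketch of such a lookup. The table is copied from the values listed above, while the names `SIZES` and `pick_size` are invented for this example.

```python
# Allowed "size" values, keyed by resolution tier and aspect ratio.
SIZES = {
    "720P": {
        "16:9": "1280*720",
        "9:16": "720*1280",
        "1:1": "960*960",
        "4:3": "1088*832",
        "3:4": "832*1088",
    },
    "1080P": {
        "16:9": "1920*1080",
        "9:16": "1080*1920",
        "1:1": "1440*1440",
        "4:3": "1632*1248",
        "3:4": "1248*1632",
    },
}


def pick_size(tier: str = "1080P", aspect: str = "16:9") -> str:
    """Return the size string for a tier and aspect ratio.

    The defaults reproduce the documented default, 1920*1080.
    """
    try:
        return SIZES[tier][aspect]
    except KeyError:
        raise ValueError(f"unsupported tier/aspect: {tier} {aspect}") from None
```

For example, `pick_size("720P", "9:16")` yields the portrait 720P value, and an unknown tier or ratio raises `ValueError` before any request is made.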
audio
boolean
default:true
Whether to generate a video with sound, which affects cost. Default is true (video with sound).
media
array
required
Reference media array, used to extract character appearance, motion, and voice timbre. Entries correspond to character1, character2, etc. in the prompt, in array order. Number of images: 0–5; number of videos: 0–3; the total must not exceed 5. Image formats: JPEG, JPG, PNG, BMP, WEBP; resolution [240, 8000] pixels; no more than 10 MB. Video formats: MP4, MOV; duration 1–30 seconds; no more than 100 MB. Audio formats: MP3, WAV, FLAC; duration 3–30 seconds. Array length: 1 - 5
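The count limits above can be checked on the client before a task is submitted. The sketch below is one possible way to do that, not a definitive implementation. It rests on an assumption: the literal strings accepted in the `type` field are not listed on this page, so `"image"` and `"video"` are placeholders. The function name `check_media` is likewise hypothetical.

```python
from collections import Counter

# ASSUMPTION: the accepted "type" strings are not documented on this page.
# "image" and "video" are placeholders and may differ from the real values.
IMAGE_TYPE = "image"
VIDEO_TYPE = "video"


def check_media(media: list) -> list:
    """Return a list of problems with the media array (empty if none found)."""
    problems = []
    # Array length: 1 - 5.
    if not 1 <= len(media) <= 5:
        problems.append(f"media must contain 1 to 5 entries, got {len(media)}")
    counts = Counter(item.get("type") for item in media)
    # Number of images: 0-5; number of videos: 0-3.
    if counts[IMAGE_TYPE] > 5:
        problems.append(f"at most 5 images allowed, got {counts[IMAGE_TYPE]}")
    if counts[VIDEO_TYPE] > 3:
        problems.append(f"at most 3 videos allowed, got {counts[VIDEO_TYPE]}")
    return problems
```

File-level limits (format, size in MB, duration) are not covered here, since checking them would require downloading or probing each file.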
prompt
string
required
Text prompt, used to describe the elements and visual characteristics expected in the generated video. Use character1, character2, etc. to refer to the reference characters. Each reference (video or image) must contain only a single character. Chinese and English are supported, up to 1500 characters. Length limit: 0 - 1500
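Since characterN in the prompt refers to the N-th media entry, a prompt that mentions a character with no matching entry is probably a mistake. The sketch below shows one way to catch that, together with the length limit. `check_prompt` is a name made up for this example, and it assumes the limit counts what Python's `len` counts (Unicode code points), which the page does not specify.

```python
import re


def check_prompt(prompt: str, media_count: int) -> list:
    """Return a list of problems with the prompt (empty if none found)."""
    problems = []
    # ASSUMPTION: the 1500-character limit is measured like len() in Python.
    if len(prompt) > 1500:
        problems.append("prompt exceeds 1500 characters")
    # Collect every distinct characterN reference, in ascending order.
    referenced = sorted({int(n) for n in re.findall(r"character(\d+)", prompt)})
    for n in referenced:
        # character1 maps to media[0], character2 to media[1], and so on.
        if not 1 <= n <= media_count:
            problems.append(f"character{n} has no matching media entry")
    return problems
```

With two media entries, a prompt such as "character1 hands a cup to character2" passes, while one that mentions character3 is flagged.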
duration
integer
default:5
Duration of the generated video, in seconds, billed by the second. Integer value range: [2, 10].
shot_type
string
default:"single"
Shot type. single indicates a single shot (default), and multi indicates multiple shots. This parameter has higher priority than the prompt. Available values: single, multi
watermark
boolean
default:false
Whether to add a watermark identifier. The watermark is located in the lower-right corner of the video.
negative_prompt
string
Negative prompt, used to describe content you do not want to appear in the video. Chinese and English are supported, up to 500 characters. Length limit: 0 - 500
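The request body fields and their documented defaults can be gathered into a small builder that rejects out-of-range values early. This is a sketch under stated assumptions rather than an official client: `build_payload` is a hypothetical name, and the checks cover only the ranges and enumerations stated above.

```python
def build_payload(prompt, media, *, seed=None, size="1920*1080", audio=True,
                  duration=5, shot_type="single", watermark=False,
                  negative_prompt=None) -> dict:
    """Assemble the JSON body, applying the documented defaults."""
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be in [0, 2147483647]")
    if not 2 <= duration <= 10:
        raise ValueError("duration must be in [2, 10] seconds")
    if shot_type not in ("single", "multi"):
        raise ValueError("shot_type must be 'single' or 'multi'")
    if negative_prompt is not None and len(negative_prompt) > 500:
        raise ValueError("negative_prompt exceeds 500 characters")

    payload = {
        "prompt": prompt,
        "media": media,
        "size": size,
        "audio": audio,
        "duration": duration,
        "shot_type": shot_type,
        "watermark": watermark,
    }
    # Optional fields without a documented default are sent only when set.
    if seed is not None:
        payload["seed"] = seed
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    return payload
```

Sending the defaults explicitly is a design choice made here for readability. Omitting them should be equivalent if the service applies the same defaults.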

Response

task_id
string
Pass task_id to the Get Task Result API to retrieve the generated output.
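A minimal sketch of reading the response is shown below, assuming the body is the JSON object shown in the sample response. `extract_task_id` is a name invented for this example. The Get Task Result call is not shown because its endpoint is documented elsewhere.

```python
import json


def extract_task_id(body) -> str:
    """Parse the response body (str or bytes) and return its task_id."""
    data = json.loads(body)
    task_id = data.get("task_id") if isinstance(data, dict) else None
    # Fail loudly if the field is missing, empty, or not a string.
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id string")
    return task_id
```

The returned string is the value to pass to the Get Task Result API.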