Alibaba WAN 2.7 integration

WAN 2.7 Image-to-Video supports three generation modes: animate from a first frame, control start-to-end animation, or extend existing videos with optional audio guidance.
WAN 2.7 Image-to-Video is an AI video generation API that creates MP4 videos from images or extends existing videos. It supports three distinct generation modes: first-frame animation, first+last frame controlled animation, and video continuation. Output is available at 720P (1280x720) or 1080P (1920x1080) resolution with durations from 2 to 15 seconds. The model also supports optional audio input and automatic prompt expansion.

Key capabilities

  • Three generation modes: First frame only, first+last frame, and video continuation
  • Resolution options: 720P (1280x720) and 1080P (1920x1080) output
  • Flexible durations: 2 to 15 seconds of video output
  • Audio-guided generation: Provide a WAV or MP3 audio file (2-30 seconds, max 15MB) to guide video creation
  • Prompt expansion: AI optimizer expands short prompts into detailed scripts for richer output
  • Video extension: Continue an existing MP4/MOV video (2-10 seconds, max 100MB) with new content
  • Image constraints: Supports JPEG, PNG, BMP, WEBP images (240-8000px per side, max 20MB)
  • Async processing: Webhook notifications or polling for task completion

Generation modes

| Mode | Required inputs | Best for |
| --- | --- | --- |
| First frame | start_image_url | Animating a still image with AI-generated motion |
| First + last frame | start_image_url + end_image_url | Controlled transition between two keyframes |
| Video continuation | video_url (optionally + end_image_url) | Extending an existing video clip with new content |
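
The mapping from inputs to mode can be sketched as a small helper. This is an illustrative sketch, not part of the API: the function name and the returned mode labels are hypothetical, and rejecting a request that carries both start_image_url and video_url is an assumption, since the behaviour for that combination is not documented here.

```python
from typing import Optional


def infer_mode(
    start_image_url: Optional[str] = None,
    end_image_url: Optional[str] = None,
    video_url: Optional[str] = None,
) -> str:
    """Return the generation mode implied by the supplied inputs."""
    if video_url and start_image_url:
        # Assumption: the combination is undocumented, so treat it as ambiguous.
        raise ValueError("provide either start_image_url or video_url, not both")
    if video_url:
        # end_image_url is optional here and does not change the mode.
        return "video_continuation"
    if start_image_url:
        return "first_last_frame" if end_image_url else "first_frame"
    raise ValueError("start_image_url or video_url is required")
```

For example, `infer_mode(start_image_url=a, end_image_url=b)` selects the first + last frame mode, while passing only `video_url` selects video continuation.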

Use cases

  • Product animation: Bring product images to life with smooth motion and camera movements
  • Marketing videos: Animate brand imagery into short-form video content
  • Social media content: Create video posts from static images for TikTok, Instagram, and YouTube
  • Video extension: Extend short clips into longer narratives using video continuation
  • Storyboarding: Animate concept art or wireframes to preview motion sequences
  • Creative exploration: Experiment with first+last frame mode for controlled visual transitions

API operations

Generate videos by submitting an image or video to the API. The service returns a task ID for async polling or webhook notification.

POST /v1/ai/image-to-video/wan-2-7

Create a new image-to-video generation task

GET /v1/ai/image-to-video/wan-2-7

List all WAN 2.7 image-to-video (I2V) tasks and their status

GET /v1/ai/image-to-video/wan-2-7/{task-id}

Get task status and results by ID
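
A minimal sketch of how the three operations map onto HTTP requests, using only the Python standard library. The host in BASE_URL is a placeholder, the JSON request body is an assumption, and authentication is left to a caller-supplied headers dict because this section names neither the host nor the auth scheme. The functions only build request objects; nothing is sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one
ENDPOINT = "/v1/ai/image-to-video/wan-2-7"


def _request(method: str, url: str, headers: Optional[dict] = None,
             payload: Optional[dict] = None) -> urllib.request.Request:
    merged = dict(headers or {})
    data = None
    if payload is not None:
        # Assumption: the API accepts a JSON body.
        data = json.dumps(payload).encode("utf-8")
        merged.setdefault("Content-Type", "application/json")
    return urllib.request.Request(url, data=data, headers=merged, method=method)


def create_task_request(payload: dict, headers: Optional[dict] = None):
    """POST: create a new image-to-video generation task."""
    return _request("POST", BASE_URL + ENDPOINT, headers, payload)


def list_tasks_request(headers: Optional[dict] = None):
    """GET: list tasks with their status."""
    return _request("GET", BASE_URL + ENDPOINT, headers)


def get_task_request(task_id: str, headers: Optional[dict] = None):
    """GET: fetch one task by ID (the ID is percent-encoded into the path)."""
    quoted = urllib.parse.quote(task_id, safe="")
    return _request("GET", f"{BASE_URL}{ENDPOINT}/{quoted}", headers)
```

A request built this way can be sent with `urllib.request.urlopen(req)`. The response schema is not described in this section, so parsing the task ID out of the response is left to the caller.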

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | No | - | Text description to guide video motion and style. Max 5000 characters |
| negative_prompt | string | No | - | Elements to avoid (e.g., “blurry, watermark”). Max 500 characters |
| start_image_url | string | Conditional | - | URL of first-frame image (JPEG/PNG/BMP/WEBP, 240-8000px, max 20MB) |
| end_image_url | string | No | - | URL of last-frame image. Use with start_image_url or video_url |
| video_url | string | Conditional | - | URL of existing video to extend (MP4/MOV, 2-10s, max 100MB) |
| audio_url | string | No | - | URL of audio file (WAV/MP3, 2-30s, max 15MB) to guide generation |
| resolution | string | No | "1080P" | Output resolution: "720P" or "1080P" |
| duration | integer | No | 5 | Video length in seconds: 2 to 15. For video continuation, this is total output length |
| seed | integer | No | Random | Seed for reproducibility (0 to 2147483647) |
| additional_settings.prompt_extend | boolean | No | true | Enable AI prompt expansion for richer output |
| webhook_url | string | No | - | URL for async status notifications |
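
The limits in the table can be checked on the client before a request is submitted. The helper below is a sketch: build_payload is a hypothetical name, and nesting prompt_extend inside an additional_settings object is an assumption inferred from the dotted parameter name. The ranges and defaults are taken from the table.

```python
from typing import Any, Dict, Optional

MAX_SEED = 2147483647


def build_payload(
    *,
    start_image_url: Optional[str] = None,
    end_image_url: Optional[str] = None,
    video_url: Optional[str] = None,
    audio_url: Optional[str] = None,
    prompt: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    resolution: str = "1080P",
    duration: int = 5,
    seed: Optional[int] = None,
    prompt_extend: bool = True,
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the documented limits and return a request body dict."""
    if not (start_image_url or video_url):
        raise ValueError("start_image_url or video_url is required")
    if prompt is not None and len(prompt) > 5000:
        raise ValueError("prompt exceeds 5000 characters")
    if negative_prompt is not None and len(negative_prompt) > 500:
        raise ValueError("negative_prompt exceeds 500 characters")
    if resolution not in ("720P", "1080P"):
        raise ValueError('resolution must be "720P" or "1080P"')
    if not 2 <= duration <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}")

    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "start_image_url": start_image_url,
        "end_image_url": end_image_url,
        "video_url": video_url,
        "audio_url": audio_url,
        "resolution": resolution,
        "duration": duration,
        "seed": seed,
        # Assumption: the dotted name maps to a nested JSON object.
        "additional_settings": {"prompt_extend": prompt_extend},
        "webhook_url": webhook_url,
    }
    # Drop unset optional fields so the server applies its own defaults.
    return {key: value for key, value in payload.items() if value is not None}
```

Calling `build_payload(start_image_url=url, prompt="slow pan", duration=8)` returns a dict containing only the fields that were set, plus the resolution, duration, and prompt_extend defaults.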

Frequently Asked Questions

What is WAN 2.7 Image-to-Video?
WAN 2.7 Image-to-Video is an AI video generation API developed by Alibaba. You submit an image or video URL, receive a task ID immediately, then poll for results or receive a webhook notification when processing completes. The model generates MP4 video at 720P or 1080P resolution in durations from 2 to 15 seconds.

What are the three generation modes?
First frame: Provide start_image_url alone to animate from a starting image. First + last frame: Provide both start_image_url and end_image_url for controlled start-to-end animation. Video continuation: Provide video_url to extend an existing video, optionally with end_image_url as the target ending frame.

What image formats are supported?
WAN 2.7 accepts JPEG, PNG, BMP, and WEBP images via publicly accessible URLs. Images must be 240-8000 pixels per side with an aspect ratio between 1:8 and 8:1, and a maximum file size of 20MB.

What video inputs does video continuation accept?
Video continuation accepts MP4 and MOV files via publicly accessible URLs. Input videos must be 2-10 seconds long, 240-4096 pixels per side, and under 100MB.

How does audio-guided generation work?
Provide a WAV or MP3 audio file URL via the audio_url parameter. The audio must be 2-30 seconds long and under 15MB. WAN 2.7 uses the audio to influence the visual content and motion of the generated video.

What are the rate limits?
Rate limits depend on your subscription tier. See the Rate Limits page for current limits by plan.

How much does it cost?
See the Pricing page for current rates and subscription options.
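
The image constraints stated above (format, 240-8000 pixels per side, aspect ratio between 1:8 and 8:1, 20MB maximum) can be checked locally before an image is uploaded. The sketch below assumes the caller already knows the image's dimensions, byte size, and format. The function name is hypothetical, and treating 1MB as 1024 × 1024 bytes is an assumption.

```python
from typing import List

ALLOWED_FORMATS = {"JPEG", "PNG", "BMP", "WEBP"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes


def check_image(width: int, height: int, size_bytes: int, fmt: str) -> List[str]:
    """Return a list of constraint violations; an empty list means the image passes."""
    problems = []
    normalized = fmt.upper()
    if normalized == "JPG":  # common alias for JPEG
        normalized = "JPEG"
    if normalized not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    for name, side in (("width", width), ("height", height)):
        if not 240 <= side <= 8000:
            problems.append(f"{name} {side}px is outside 240-8000px")
    # Aspect ratio must lie between 1:8 and 8:1 (width:height).
    if width > 0 and height > 0 and not 1 / 8 <= width / height <= 8:
        problems.append("aspect ratio is outside 1:8 to 8:1")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file is larger than 20MB")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem with an image at once.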

Best practices

  • Image quality: Use high-resolution images with clear subjects and balanced lighting. Avoid heavily compressed or noisy inputs.
  • First + last frame: Ensure both images share a similar visual style and subject for smooth transitions.
  • Video continuation: The input video's length (2-10s) counts toward the total output duration. For example, a 6-second input with duration set to 10 leaves about 4 seconds for new content.
  • Prompt writing: Prompts are optional for image-to-video, but adding motion and camera directions improves results.
  • Negative prompts: Always include “blurry, low quality, watermark, text, distortion, extra limbs”.
  • Production integration: Use webhooks for scalable applications instead of polling.
  • Error handling: Implement retry with exponential backoff for 503 errors during high-demand periods.
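
The retry advice in the last bullet can be sketched as a small wrapper. This is a minimal illustration under assumptions: send is a caller-supplied callable that performs one HTTP attempt and returns a (status_code, body) tuple, and the attempt count and delay values are illustrative defaults, not documented limits.

```python
import time
from typing import Callable, Tuple


def send_with_backoff(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() until it returns a non-503 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Return on any non-503 status, or give up after the final attempt.
        if status != 503 or attempt == max_attempts - 1:
            return status, body
        # The delay doubles after each failed attempt: 1s, 2s, 4s, ... capped at max_delay.
        sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")
```

Passing sleep as a parameter keeps the function testable without real waiting. In production, adding a small random jitter to each delay is a common refinement that stops many clients from retrying in lockstep.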