Alibaba WAN 2.7 integration
WAN 2.7 Image-to-Video supports three generation modes: animate from a first frame, control start-to-end animation, or extend existing videos with optional audio guidance.
Key capabilities
- Three generation modes: First frame only, first+last frame, and video continuation
- Resolution options: 720P (1280x720) and 1080P (1920x1080) output
- Flexible durations: 2 to 15 seconds of video output
- Audio-guided generation: Provide a WAV or MP3 audio file (2-30 seconds, max 15MB) to guide video creation
- Prompt expansion: AI optimizer expands short prompts into detailed scripts for richer output
- Video extension: Continue an existing MP4/MOV video (2-10 seconds, max 100MB) with new content
- Image constraints: Supports JPEG, PNG, BMP, WEBP images (240-8000px per side, max 20MB)
- Async processing: Webhook notifications or polling for task completion
Generation modes
| Mode | Required inputs | Best for |
|---|---|---|
| First frame | start_image_url | Animating a still image with AI-generated motion |
| First + last frame | start_image_url + end_image_url | Controlled transition between two keyframes |
| Video continuation | video_url (optionally + end_image_url) | Extending an existing video clip with new content |
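The table above can be read as a small decision rule. The following is a minimal sketch of that rule, not part of the API: the function name and the returned mode labels are hypothetical, and treating `start_image_url` and `video_url` as mutually exclusive is an assumption the source does not confirm.

```python
from typing import Optional


def detect_mode(
    start_image_url: Optional[str] = None,
    end_image_url: Optional[str] = None,
    video_url: Optional[str] = None,
) -> str:
    """Return a label for the generation mode implied by the supplied inputs.

    The labels are illustrative only; the API infers the mode from the
    request fields and is not documented to accept a mode name.
    """
    if video_url and start_image_url:
        # Assumption: the two primary inputs cannot be combined.
        raise ValueError("provide either start_image_url or video_url, not both")
    if video_url:
        # end_image_url is optional here and does not change the mode.
        return "video_continuation"
    if start_image_url:
        return "first_last_frame" if end_image_url else "first_frame"
    raise ValueError("start_image_url or video_url is required")
```

Running a check like this before submitting a task surfaces a missing input locally instead of waiting for an API error.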
Use cases
- Product animation: Bring product images to life with smooth motion and camera movements
- Marketing videos: Animate brand imagery into short-form video content
- Social media content: Create video posts from static images for TikTok, Instagram, and YouTube
- Video extension: Extend short clips into longer narratives using video continuation
- Storyboarding: Animate concept art or wireframes to preview motion sequences
- Creative exploration: Experiment with first+last frame mode for controlled visual transitions
API operations
Generate videos by submitting an image or video to the API. The service returns a task ID for async polling or webhook notification.
POST /v1/ai/image-to-video/wan-2-7
Create a new image-to-video generation task
GET /v1/ai/image-to-video/wan-2-7
List all WAN 2.7 I2V tasks with status
GET /v1/ai/image-to-video/wan-2-7/{task-id}
Get task status and results by ID
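As an illustration of the three operations, here is a sketch that builds the create request and the per-task URL using only the Python standard library. The host (`https://api.example.com`) and the API key header name (`x-api-key`) are placeholders, since the source states neither; substitute the values from your account documentation.

```python
import json
import urllib.request

# Hypothetical values: the source does not state the host or the auth scheme.
BASE_URL = "https://api.example.com"
API_KEY_HEADER = "x-api-key"
ENDPOINT = "/v1/ai/image-to-video/wan-2-7"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
    )


def task_url(task_id: str) -> str:
    """Return the URL used to fetch one task's status and results."""
    return f"{BASE_URL}{ENDPOINT}/{task_id}"
```

Passing the returned object to `urllib.request.urlopen` would send it. Building and sending are kept separate so the request can be inspected or logged first.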
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | No | - | Text description to guide video motion and style. Max 5000 characters |
| negative_prompt | string | No | - | Elements to avoid (e.g., “blurry, watermark”). Max 500 characters |
| start_image_url | string | Conditional | - | URL of first-frame image (JPEG/PNG/BMP/WEBP, 240-8000px, max 20MB) |
| end_image_url | string | No | - | URL of last-frame image. Use with start_image_url or video_url |
| video_url | string | Conditional | - | URL of existing video to extend (MP4/MOV, 2-10s, max 100MB) |
| audio_url | string | No | - | URL of audio file (WAV/MP3, 2-30s, max 15MB) to guide generation |
| resolution | string | No | "1080P" | Output resolution: "720P" or "1080P" |
| duration | integer | No | 5 | Video length in seconds: 2 to 15. For video continuation, this is total output length |
| seed | integer | No | Random | Seed for reproducibility (0 to 2147483647) |
| additional_settings.prompt_extend | boolean | No | true | Enable AI prompt expansion for richer output |
| webhook_url | string | No | - | URL for async status notifications |
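The limits in the table lend themselves to a client-side check before a task is submitted. The sketch below is one way to do that under the documented ranges. The function name and error messages are hypothetical, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, List

MAX_SEED = 2147483647


def validate_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a request payload (empty if none)."""
    errors: List[str] = []
    if not payload.get("start_image_url") and not payload.get("video_url"):
        errors.append("start_image_url or video_url is required")
    if len(payload.get("prompt", "")) > 5000:
        errors.append("prompt exceeds 5000 characters")
    if len(payload.get("negative_prompt", "")) > 500:
        errors.append("negative_prompt exceeds 500 characters")
    # Defaults mirror the table: resolution "1080P", duration 5.
    if payload.get("resolution", "1080P") not in ("720P", "1080P"):
        errors.append('resolution must be "720P" or "1080P"')
    duration = payload.get("duration", 5)
    if not isinstance(duration, int) or not 2 <= duration <= 15:
        errors.append("duration must be an integer from 2 to 15")
    seed = payload.get("seed")
    if seed is not None and not (isinstance(seed, int) and 0 <= seed <= MAX_SEED):
        errors.append(f"seed must be an integer from 0 to {MAX_SEED}")
    return errors
```

Returning every problem at once, rather than raising on the first, lets a caller show the user a complete list to fix.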
Frequently Asked Questions
What is WAN 2.7 Image-to-Video and how does it work?
WAN 2.7 Image-to-Video is an AI video generation API developed by Alibaba. You submit an image or video URL, receive a task ID immediately, then poll for results or receive a webhook notification when processing completes. The model generates MP4 video at 720P or 1080P resolution in durations from 2 to 15 seconds.
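The polling half of that flow might look like the sketch below. The caller supplies `fetch_task`, a function that returns the parsed task response, so the loop is independent of any HTTP library. The `status` field name and the `COMPLETED` and `FAILED` values are assumptions, since the source does not document the response schema.

```python
import time
from typing import Callable, Dict

# Assumption: these status strings are placeholders; check the real task
# response for the actual field name and values.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_task(
    fetch_task: Callable[[], Dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_task until the task reaches a terminal status or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

The `sleep` parameter exists so tests can run without real delays. In production the default `time.sleep` is sufficient.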
What are the three generation modes?
- First frame: Provide start_image_url alone to animate from a starting image.
- First + last frame: Provide both start_image_url and end_image_url for controlled start-to-end animation.
- Video continuation: Provide video_url to extend an existing video, optionally with end_image_url as the target ending frame.
What image formats are supported?
WAN 2.7 accepts JPEG, PNG, BMP, and WEBP images via publicly accessible URLs. Images must be 240-8000 pixels per side with an aspect ratio between 1:8 and 8:1, and a maximum file size of 20MB.
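These image limits can be checked locally once the dimensions and file size are known. The following is a sketch under two assumptions: the function name is hypothetical, and "20MB" is interpreted as 20 × 1024 × 1024 bytes, which the source does not specify.

```python
from typing import List

# Assumption: 1MB is taken as 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def check_image(width: int, height: int, size_bytes: int) -> List[str]:
    """Return the documented image constraints that the input violates."""
    problems: List[str] = []
    for name, side in (("width", width), ("height", height)):
        if not 240 <= side <= 8000:
            problems.append(f"{name} must be between 240 and 8000 pixels")
    if width > 0 and height > 0:
        ratio = width / height
        # Aspect ratio must lie between 1:8 and 8:1 inclusive.
        if not 1 / 8 <= ratio <= 8:
            problems.append("aspect ratio must be between 1:8 and 8:1")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file size must not exceed 20MB")
    return problems
```

The aspect-ratio check is not redundant with the side limits: a 4000x300 image satisfies the per-side range but falls outside 8:1.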
What video formats are supported for video continuation?
Video continuation accepts MP4 and MOV files via publicly accessible URLs. Input videos must be 2-10 seconds long, 240-4096 pixels per side, and under 100MB.
How does audio-guided generation work?
Provide a WAV or MP3 audio file URL via the audio_url parameter. The audio must be 2-30 seconds long and under 15MB. WAN 2.7 uses the audio to influence the visual content and motion of the generated video.
What are the rate limits for WAN 2.7?
Rate limits depend on your subscription tier. See the Rate Limits page for current limits by plan.
How much does WAN 2.7 cost?
See the Pricing page for current rates and subscription options.
Best practices
- Image quality: Use high-resolution images with clear subjects and balanced lighting. Avoid heavily compressed or noisy inputs.
- First + last frame: Ensure both images share a similar visual style and subject for smooth transitions.
- Video continuation: Input video duration (2-10s) counts toward total output duration. Plan accordingly.
- Prompt writing: Even though prompts are optional for I2V, adding motion and camera directions improves results.
- Negative prompts: Always include: “blurry, low quality, watermark, text, distortion, extra limbs”
- Production integration: Use webhooks for scalable applications instead of polling.
- Error handling: Implement retry with exponential backoff for 503 errors during high-demand periods.
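The retry advice in the last item can be sketched as follows. The caller supplies `send`, a function that performs one request and returns `(status_code, body)`. The attempt count, delays, and jitter fraction are illustrative defaults, not values recommended by the API.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    jitter: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send(), retrying on HTTP 503 with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Return on any non-503 response, or when retries are exhausted.
        if status != 503 or attempt == max_attempts - 1:
            return status, body
        # Delay doubles each attempt: base, 2*base, 4*base, ... up to the cap.
        delay = min(max_delay_s, base_delay_s * 2 ** attempt)
        # Small random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay * jitter))
    raise AssertionError("unreachable")
```

Only 503 is retried here because it is the one status the guidance names. Other error statuses are returned to the caller unchanged.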
Related APIs
- WAN 2.7 Text-to-Video: Generate videos purely from text prompts with WAN 2.7
- WAN 2.7 Reference-to-Video: Generate videos featuring characters from reference images or videos
- WAN 2.6 Image-to-Video: Previous WAN generation with multi-shot sequences
- Kling 2.5 Turbo Pro: Alternative I2V model with cinematic quality