Frequently Asked Questions

Everything you need to know about Wan 2.5 and Wan-Animate

What is the maximum video length for Wan 2.5?

Wan 2.5 generates videos up to 10 seconds long per request. For longer videos, generate multiple clips and stitch them together in video editing software.
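As a rough illustration of the multi-clip approach, the Python sketch below splits a target duration into clip lengths that fit the 10-second limit. The names plan_clips and MAX_CLIP_SECONDS are hypothetical, chosen for this example only; they are not part of any Wan 2.5 or platform API.

```python
MAX_CLIP_SECONDS = 10  # per-request limit stated in the answer above


def plan_clips(total_seconds: int, max_clip: int = MAX_CLIP_SECONDS) -> list[int]:
    """Split a target duration into clip lengths of at most max_clip seconds."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    full, remainder = divmod(total_seconds, max_clip)
    clips = [max_clip] * full
    if remainder:
        clips.append(remainder)  # final, shorter clip
    return clips


print(plan_clips(25))  # [10, 10, 5]
```

Each entry in the returned list is one generation request; the resulting clips still need to be joined in an editor afterwards.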

Which resolution should I use?

Start with 720p for testing and refining prompts, as it's faster and cheaper. Once you're satisfied with the result, generate the final version in 1080p. 480p is best for quick previews only.

How does audio-sync work?

When you provide an audio file, Wan 2.5 analyzes the audio and generates lip movements that match the speech or vocals. This works best with clear dialogue or singing. Background music or ambient sounds won't produce synchronized lip movements.

What's the difference between T2V and I2V?

Text-to-Video (T2V) generates videos from text prompts alone. Image-to-Video (I2V) animates a still image you provide, adding movement and (optionally) audio-synced lip movements. I2V is better for character consistency.

Does Wan 2.5 add watermarks?

No. Videos generated with Wan 2.5 through platforms like Fal.ai, WaveSpeed, or AIMLAPI are clean, watermark-free files.

What frame rate does Wan 2.5 use?

Wan 2.5 generates videos at 24 frames per second (FPS), the standard cinematic frame rate. This frame rate is fixed and cannot be changed.
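Because the frame rate is fixed, the number of frames in a clip follows directly from its length. A minimal Python sketch of that arithmetic (frame_count is a hypothetical helper name used only for illustration):

```python
FPS = 24  # fixed frame rate stated in the answer above


def frame_count(seconds: float, fps: int = FPS) -> int:
    """Return the number of frames in a clip of the given length."""
    return round(seconds * fps)


print(frame_count(10))  # 240
```

A maximum-length 10-second clip therefore contains 240 frames.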

Can I use Wan 2.5 for commercial projects?

Usage rights depend on the platform you're using. Output can generally be used commercially, but check the terms of service for Fal.ai, WaveSpeed, or AIMLAPI to confirm.

How much does it cost?

Pricing varies by platform and resolution. Expect roughly $0.05 to $0.15 per second of video, depending on resolution (480p, 720p, or 1080p). At the top of that range, a 10-second 1080p clip costs about $1.50. See our platform comparison for detailed pricing.
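To budget a project, the per-second figures can be turned into a quick estimate. The Python sketch below is illustrative only: the 1080p rate follows from the $1.50 example above, while placing 480p at the bottom of the range and 720p at the midpoint are assumptions. Actual rates depend on the platform. The names RATE_CENTS_PER_SECOND and estimate_cost_usd are hypothetical.

```python
# Cents per second of generated video. The 1080p rate matches the $1.50
# example above; the 480p and 720p rates are assumed, not published figures.
RATE_CENTS_PER_SECOND = {"480p": 5, "720p": 10, "1080p": 15}


def estimate_cost_usd(seconds: int, resolution: str) -> float:
    """Estimate the cost in US dollars of one clip."""
    cents = seconds * RATE_CENTS_PER_SECOND[resolution]  # integer cents avoid float drift
    return cents / 100


print(estimate_cost_usd(10, "1080p"))  # 1.5
```

Replace the rates with your platform's published prices before relying on the result.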

What is Wan-Animate?

Wan-Animate is a complementary tool for character animation and pose transfer. It lets you replace characters in videos or apply specific poses and movements. It's separate from Wan 2.5 but built on similar technology.

Can I generate videos without audio?

Yes, audio input is optional. You can generate videos using only text prompts (T2V) or images (I2V) without audio. However, you'll miss out on the lip-sync feature, which is one of Wan 2.5's key strengths.

What languages does Wan 2.5 support?

Wan 2.5 prompts are typically written in English, but the audio-sync feature works with multiple languages, including English and Chinese. Lip-sync quality may vary by language.

How long does generation take?

Generation time depends on resolution and platform load. Typical times per clip are 1-2 minutes at 480p, 2-4 minutes at 720p, and 3-5 minutes at 1080p. Expect longer waits during peak usage.
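When planning a multi-clip project, the typical times above give a rough total wait. The Python sketch below assumes clips are generated one after another; a platform that runs requests in parallel would finish sooner. The names MINUTES_PER_CLIP and estimate_batch_minutes are hypothetical.

```python
# (min, max) minutes per clip, taken from the typical figures above.
MINUTES_PER_CLIP = {"480p": (1, 2), "720p": (2, 4), "1080p": (3, 5)}


def estimate_batch_minutes(num_clips: int, resolution: str) -> tuple[int, int]:
    """Estimate the total wait for clips generated sequentially."""
    low, high = MINUTES_PER_CLIP[resolution]
    return num_clips * low, num_clips * high


print(estimate_batch_minutes(3, "1080p"))  # (9, 15)
```

Three 1080p clips, for example, would take roughly 9 to 15 minutes under these assumptions.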