Alibaba Cloud Releases WAN 2.6 Video Model With Multi-Shot and Prompt Recipes

Alibaba Cloud launched WAN 2.6 in December 2025, a video generation AI that creates up to 15 seconds of synchronized audiovisual content from text, images, or reference videos. The model introduces multi-shot storytelling, character roleplay with up to two subjects per video, and native audio generation, addressing key limitations in earlier AI video tools. Model Studio now offers structured prompt recipes to help creators control motion, camera angles, and narrative flow across six production formulas.

What’s New in WAN 2.6

WAN 2.6 ships with reference-to-video capabilities that preserve character appearance, motion, and voice across multiple shots. Alibaba confirmed the model supports 15-second HD output at 1080p and 24 fps, a 50% increase over WAN 2.5’s 10-second limit. The release includes Smart Multi-Shot mode, which automatically structures narrative videos from simple prompts without manual shot-by-shot instructions.
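The difference between Smart Multi-Shot and manual shot direction can be illustrated with a pair of prompts. The wording and shot syntax below are assumptions for illustration, not Alibaba's documented prompt format:

```python
# Smart Multi-Shot: the model infers shot structure from a single open-ended line.
smart_prompt = "A barista opens a cafe at dawn, told as a short three-shot story."

# Manual mode: each shot's framing and timing is spelled out explicitly
# (hypothetical syntax -- check Model Studio's prompt recipes for the real form).
manual_prompt = "\n".join([
    "Shot 1 (0-5s, wide shot): the cafe exterior at dawn, lights switching on.",
    "Shot 2 (5-10s, medium shot): the barista grinding beans behind the counter.",
    "Shot 3 (10-15s, close-up): steam rising from a finished cup, logo in focus.",
])

print(manual_prompt.count("Shot"))  # three explicit shots vs. one open-ended line
```

Smart Multi-Shot takes something like the first prompt and produces the structure of the second automatically.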

Model Studio documentation published January 2026 details six prompt formulas: Basic, Advanced, Image-to-Video, Sound, Reference Video, and Multi-Shot. The Sound formula leverages WAN 2.5’s native audio engine with additional controls for voiceovers, sound effects, and background music. Processing speed improved by 30% compared to WAN 2.5, according to third-party benchmarks.

Why It Matters for Creators

WAN 2.6 marks the first open-source model to generate video and audio in a single pass without external stitching tools. This cuts production time for marketing clips, educational content, and social media videos—sectors where speed determines deployment cost. The reference-to-video feature enables consistent character branding across campaigns, a challenge for earlier text-to-video models that struggled with identity preservation.

Multi-lens narrative support allows creators to produce coherent story arcs with controlled shot transitions, closing the gap between AI-generated clips and professional editing workflows. Alibaba positions this for commercial use in advertising and drama production, where 15-second formats dominate mobile platforms.

Prompt Recipe Breakdown

Model Studio provides six formulas optimized for different skill levels and output types:

  • Basic Formula: Short, open-ended prompts for creative exploration by first-time users
  • Advanced Formula: Detailed descriptions with motion, lighting, and storytelling elements for experienced creators
  • Image-to-Video: Focuses on motion and camera movement when source images define subject and style
  • Sound Formula: Adds voice, effects, and music descriptors for WAN 2.5 audio features
  • Reference Video (WAN 2.6 only): Uses up to two character references to maintain appearance and voice consistency
  • Multi-Shot (WAN 2.6 only): Defines shot structure, camera positions, and timing with cross-shot continuity

The Image-to-Video formula works with both WAN 2.5 and 2.6, while Reference Video and Multi-Shot require the newer model.
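The model-compatibility rule above can be sketched as a small validation helper. The function and formula names are illustrative, not part of any official SDK:

```python
# Formulas that require WAN 2.6; the other four run on WAN 2.5 as well.
WAN26_ONLY = {"reference-video", "multi-shot"}
ALL_FORMULAS = {"basic", "advanced", "image-to-video", "sound"} | WAN26_ONLY

def supports(formula: str, model: str) -> bool:
    """Return True if the given prompt formula runs on the given model."""
    if formula not in ALL_FORMULAS:
        raise ValueError(f"unknown formula: {formula}")
    return model == "wan2.6" or formula not in WAN26_ONLY

print(supports("image-to-video", "wan2.5"))  # True: works on both models
print(supports("multi-shot", "wan2.5"))      # False: requires WAN 2.6
```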

Availability and Access

WAN 2.6 is live in Alibaba Cloud Model Studio for global users, with API access also available through third-party platforms including Higgsfield, WaveSpeed AI, and Floyo. Alibaba has not disclosed commercial pricing tiers but confirmed the model is available under open-source licensing, unlike proprietary competitors.

The official announcement notes improved instruction-following accuracy and visual quality but does not provide quantitative benchmarks for either metric. WAN 2.5 remains available for 10-second single-shot use cases where longer narratives are unnecessary.
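For orientation, a generation request to an API like Model Studio's might carry a body along these lines. Every field name and value here is an assumption for illustration only; consult the official API reference for the real schema:

```python
import json

# Hypothetical request body for a WAN 2.6 generation job -- the parameter
# values mirror the specs stated in this article (15 s, 1080p, 24 fps,
# native audio, up to two character references).
payload = {
    "model": "wan2.6",          # assumed model identifier
    "input": {
        "prompt": "A product demo told in three shots with upbeat music.",
        "reference_images": [],  # up to two character references (WAN 2.6 only)
    },
    "parameters": {
        "duration": 15,          # seconds; WAN 2.6 maximum per the announcement
        "resolution": "1080p",
        "fps": 24,
        "audio": True,           # audio generated natively in the same pass
    },
}

print(json.dumps(payload, indent=2))
```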

Frequently Asked Questions

What is WAN 2.6 used for?

WAN 2.6 generates AI videos up to 15 seconds long from text, images, or reference videos with synchronized audio. It’s designed for marketing clips, educational content, social media videos, and commercial productions requiring multi-shot storytelling with consistent characters.

How is WAN 2.6 different from WAN 2.5?

WAN 2.6 adds multi-shot narrative control, reference-to-video with up to two characters, 15-second output (vs. 10 seconds), and Smart Multi-Shot automation. It processes 30% faster and delivers better visual quality and instruction-following than WAN 2.5.

Can WAN 2.6 generate audio automatically?

Yes, WAN 2.6 uses native audio-visual synchronization to generate voice, sound effects, and background music in a single pass without external tools. The Sound prompt formula in Model Studio controls audio elements through text descriptions.

Where can I access WAN 2.6?

WAN 2.6 is available through Alibaba Cloud Model Studio with API access on Higgsfield, WaveSpeed AI, and Floyo. Alibaba released it under open-source licensing in December 2025, though commercial pricing details are not public.

Mohammad Kashif
Senior Technology Analyst and Writer at AdwaitX, specializing in the convergence of Mobile Silicon, Generative AI, and Consumer Hardware. Moving beyond spec sheets, his reviews rigorously test "real-world" metrics analyzing sustained battery efficiency, camera sensor behavior, and long-term software support lifecycles. Kashif’s data-driven approach helps enthusiasts and professionals distinguish between genuine innovation and marketing hype, ensuring they invest in devices that offer lasting value.
