Happy Horse

Video model · Alibaba

Happy Horse Image to Video

Image → Video: generates a video from a start frame or from reference images.

Specifications

Input mode: Image → Video
Accepts: reference image (up to 9), start frame
Aspect ratios: 21:9, 16:10, 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16, 10:16, 9:21
Resolutions: 720p, 1080p
Durations: 3s, 4s, 5s, 6s, 7s, 8s, 9s, 10s, 11s, 12s, 13s, 14s, 15s
Max duration: 15s
Native audio: No
Pricing: 30 credits / second — longer clips and higher resolutions cost more
Typical generation time: ~4 min
Free tier: Yes
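
The limits above can be checked before a job is submitted. The following is a minimal sketch, not an actual API schema: the `ImageToVideoRequest` class, its field names, and the `validate` helper are hypothetical, and the rule that at least one image input is required is an assumption. Only the allowed values come from the specification list.

```python
from dataclasses import dataclass

# Values copied from the Image → Video specification list.
ASPECT_RATIOS = {
    "21:9", "16:10", "16:9", "4:3", "3:2", "1:1",
    "2:3", "3:4", "9:16", "10:16", "9:21",
}
RESOLUTIONS = {"720p", "1080p"}
MIN_DURATION_S, MAX_DURATION_S = 3, 15
MAX_REFERENCE_IMAGES = 9


@dataclass(frozen=True)
class ImageToVideoRequest:
    """Hypothetical request shape, used only for illustration."""
    aspect_ratio: str
    resolution: str
    duration_s: int
    reference_images: int = 0
    has_start_frame: bool = False


def validate(req: ImageToVideoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the spec."""
    problems = []
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    if not isinstance(req.duration_s, int) or not (
        MIN_DURATION_S <= req.duration_s <= MAX_DURATION_S
    ):
        problems.append(f"duration must be a whole number of seconds from 3 to 15: {req.duration_s}")
    if not 0 <= req.reference_images <= MAX_REFERENCE_IMAGES:
        problems.append(f"at most 9 reference images allowed: {req.reference_images}")
    # Assumption: an image-to-video job needs at least one image input.
    if req.reference_images == 0 and not req.has_start_frame:
        problems.append("provide a start frame or at least one reference image")
    return problems


ok = ImageToVideoRequest("16:9", "1080p", 5, reference_images=1)
bad = ImageToVideoRequest("5:4", "4k", 20, reference_images=10)
print(validate(ok))   # no problems reported
print(validate(bad))  # one message per violated limit
```

Returning every problem at once, rather than raising on the first, lets a form or script show all invalid settings in one pass.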

Image → Video examples

Wood Duck on a Rustic Dock — Happy Horse

Mallard Duck Standing on a Ledge — Happy Horse

A medium close-up of a young man with short brown hair and blue eyes speaking in a podcast studio. He wears a dark blue t-shirt. A professional microphone is positioned to his left, with a podcast studio light box in the background. This 1284x718 resolution video was generated using the Happy Horse model.

Man Speaking in Podcast Studio — Happy Horse

A close-up video frame of a bewildered man with a mustache and styled hair wearing a beige suit and tie. In the blurred background, a crowd of men in traditional Middle Eastern attire stands in a sunlit outdoor market with palm trees. Generated by Happy Horse at 1284x718 resolution.

Confused Man in Marketplace — Happy Horse

A video still showing a smiling couple standing in a lush vineyard during a warm golden hour sunset. The woman wears a floral dress, and the man holds a glass of red wine. This video was generated using the Happy Horse model at a resolution of 1284x718.

Couple in a Sunset Vineyard — Happy Horse

A close-up video frame of a rectangular glass perfume bottle with a brass cap, containing a Polaroid photo of a young schoolgirl. The bottle sits on a rustic wooden table under warm sunlight. This 1284x718 video was generated using the Happy Horse model on Hedra.

Vintage Perfume Bottle with Polaroid Photo — Happy Horse

A video still showing a male podcast host with a beard and grey sweater speaking into a black microphone. The background is a softly blurred recording studio with warm lighting and a neon light fixture. Generated using Happy Horse at 1284x718 resolution.

Podcast Host Speaking in Studio — Happy Horse

Happy Horse Text to Video

Text → Video: generates a video from a text prompt.

Specifications

Input mode: Text → Video
Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4
Resolutions: 720p, 1080p
Durations: 3s, 4s, 5s, 6s, 7s, 8s, 9s, 10s, 11s, 12s, 13s, 14s, 15s
Max duration: 15s
Native audio: No
Pricing: 30 credits / second — longer clips and higher resolutions cost more
Typical generation time: ~3 min
Free tier: Yes
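
As a rough budgeting aid, the per-second rate above can be turned into a clip estimate. This is a sketch under stated assumptions: the page gives the 30 credits per second rate and says higher resolutions cost more, but it does not publish a resolution surcharge, so `resolution_multiplier` is a hypothetical placeholder that defaults to 1.0. The function name is also illustrative.

```python
CREDITS_PER_SECOND = 30  # rate stated in the specification list
MIN_DURATION_S, MAX_DURATION_S = 3, 15


def estimate_credits(duration_s: int, resolution_multiplier: float = 1.0) -> float:
    """Estimate the credit cost of one clip.

    resolution_multiplier is a hypothetical stand-in for the unpublished
    surcharge on higher resolutions; 1.0 means the base rate only.
    """
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        raise ValueError("duration must be between 3 and 15 seconds")
    if resolution_multiplier <= 0:
        raise ValueError("resolution_multiplier must be positive")
    return CREDITS_PER_SECOND * duration_s * resolution_multiplier


print(estimate_credits(5))   # 150.0 credits at the base rate
print(estimate_credits(15))  # 450.0 credits at the base rate
```

At the base rate, the shortest clip (3s) comes to 90 credits and the longest (15s) to 450 credits.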

Text → Video examples

A close-up shot of a dark glass perfume bottle labeled Efflux, surrounded by floating lavender stems and citrus slices against a dark, moody background. This 1280x720 text-to-video generation by Happy Horse features a shallow depth of field with soft, cinematic lighting.

Efflux Perfume Bottle with Floating Botanicals — Happy Horse

A wide-angle landscape video frame capturing a calm ocean during sunset, generated by the Happy Horse model at 1280x720 resolution. The sun glows brightly near the horizon, casting orange and yellow light onto the water's surface and scattered clouds in the sky.

Ocean Sunset at Golden Hour — Happy Horse

A wide-angle view of a large ocean wave cresting under a dark, stormy sky. In the background, a warm orange sunset glows along the horizon beneath low-hanging clouds. This text-to-video generation was created using the Happy Horse model at a 1920x1080 resolution.

Stormy Ocean Sunset — Happy Horse

A wide-angle, low-angle shot of a clear stream flowing over moss-covered stones in a dense forest. Lush green ferns line the banks, and tall trees rise into the canopy. Generated at 1920x1080 resolution using the Happy Horse model on Hedra, this 5-second video captures serene natural movement.

Forest Stream Flowing — Happy Horse

What is Happy Horse best for?

Happy Horse is Alibaba's native multimodal video model, built to generate video and matching audio together rather than dubbing sound on afterward. That makes it strong for dialogue-driven and multilingual work: it supports synchronized lip-sync in seven languages (Mandarin, English, Cantonese, Japanese, Korean, German, and French). It produces cinematic 1080p multi-shot clips with consistent characters. On its debut it topped the Artificial Analysis Video Arena, ranking ahead of ByteDance's Seedance 2.0 and Kuaishou's Kling on the no-audio leaderboards.

Who created Happy Horse, and when was it released?

Happy Horse (HappyHorse-1.0) was built by Alibaba's Taotian Group, led by Zhang Di — a former Kuaishou vice president and technical lead on Kling AI who returned to Alibaba in November 2025. The model first appeared anonymously atop the Artificial Analysis Video Arena around April 7, 2026; Alibaba confirmed it was behind the model on April 9–10, and rolled out a public API later in April 2026 through Alibaba Cloud's Bailian platform.

How does Happy Horse fit into Alibaba's model lineup, and is it open?

It is the successor to Alibaba's earlier Wan (Tongyi Wanxiang) video series — which had been sitting mid-pack — and leapfrogged it to #1 on the Artificial Analysis Video Arena, ahead of rivals including ByteDance's Seedance 2.0, Kuaishou's Kling, and OpenAI's Sora 2. Alibaba open-sourced HappyHorse-1.0 under Apache-2.0, with model weights and a public repository released alongside the launch.

What makes Happy Horse technically unusual, and how should you prompt it?

Under the hood it's a 15-billion-parameter single-stream Transformer: text, image, video, and audio tokens are packed into one sequence, which is what lets it generate picture and sound natively in sync. Of its 40 layers, only the first four and the last four use modality-specific projections; the middle 32 share parameters across modalities. DMD-2 distillation trims generation to roughly 8 steps, or about 38 seconds for a 1080p clip on a single H100. For prompting, treat it like a director giving instructions: keep prompts concise and specific rather than piling on buzzwords, and spell out the dialogue or sound you want so the synchronized audio and lip-sync have something to lock onto.
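
The layer split described above can be made concrete with a small sketch. It only illustrates the arithmetic (4 + 32 + 4 = 40) and is one reading of the description, not the model's actual code. The names `layer_kind` and `layout` are hypothetical.

```python
TOTAL_LAYERS = 40
EDGE_LAYERS = 4  # modality-specific layers at each end of the stack


def layer_kind(index: int) -> str:
    """Classify a zero-based layer index as modality-specific or shared."""
    if not 0 <= index < TOTAL_LAYERS:
        raise IndexError(f"layer index out of range: {index}")
    if index < EDGE_LAYERS or index >= TOTAL_LAYERS - EDGE_LAYERS:
        return "modality-specific"
    return "shared"


layout = [layer_kind(i) for i in range(TOTAL_LAYERS)]
print(layout.count("modality-specific"))  # 8
print(layout.count("shared"))             # 32
```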

Prompt tips

Keep it to 20 words: Use a strict formula of [Subject] [does action] in [setting], [time of day], [one atmosphere or camera cue].
Don't over-describe: Skip exhaustive wardrobe details or lighting recipes; extra detail eats into the model's generation budget and degrades biomechanics.
Use character tokens for consistency: In Reference-to-Video mode, map your uploaded images to character1, character2, etc., in the prompt to maintain multi-character stability.
Describe motion explicitly: Use clear camera language (e.g., "slow dolly in," "orbit left," "locked off") rather than vague action words to get the best cinematic movement.
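
The formula and the character-token tip can be sketched as a small helper. This is an illustration only: `build_prompt` and `character_tokens` are hypothetical names, the word count is a simple whitespace split, and the exact way the product binds uploaded images to `character1`, `character2`, and so on is not documented here, so the mapping below is just a bookkeeping aid.

```python
MAX_WORDS = 20  # limit suggested in the prompt tips


def build_prompt(subject: str, action: str, setting: str,
                 time_of_day: str, cue: str) -> str:
    """Assemble '[Subject] [does action] in [setting], [time of day], [cue]'."""
    prompt = f"{subject} {action} in {setting}, {time_of_day}, {cue}"
    word_count = len(prompt.split())
    if word_count > MAX_WORDS:
        raise ValueError(f"prompt has {word_count} words; keep it to {MAX_WORDS}")
    return prompt


def character_tokens(image_names: list[str]) -> dict[str, str]:
    """Map uploaded reference images to character1, character2, ... in order."""
    return {f"character{i}": name for i, name in enumerate(image_names, start=1)}


print(build_prompt("A wood duck", "preens its feathers", "a misty marsh",
                   "early morning", "slow dolly in"))
print(character_tokens(["hero.png", "sidekick.png"]))
```

Raising an error on prompts over the limit makes it obvious when wardrobe or lighting detail has crept back in.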
