CogVideoX

App in the BluixApps catalog

What it is

CogVideoX is THUDM's open-source text-to-video model family, developed by Tsinghua University and Zhipu AI in China. It generates 6-10 second videos at 720×480 with strong temporal consistency and plausible physics, and comes in multiple sizes (2B and 5B parameters) for different VRAM budgets. The code and the 2B weights are Apache 2.0; the 5B weights are published under the separate CogVideoX License.

It is one of the highest-quality open video models and an open alternative to closed-source competitors such as Runway and Pika.

What it's for

  • Text-to-video — high-fidelity short videos from prompts
  • Image-to-video — extend a still image into motion
  • Video-to-video — re-style or extend existing video
  • Marketing content at scale — campaign visuals, social ads
  • Educational visuals — bring concepts to life
  • Game/film previs — high-quality storyboard videos

Who it's for

  • Studios producing high-quality video content
  • AI agencies offering video generation to clients
  • Researchers studying video generation
  • Marketers generating campaign visuals
  • Hosting providers offering a premium video-generation tier

Why teams pick CogVideoX over alternatives

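  • Apache 2.0 code and 2B weights: open and commercial-friendly (the 5B weights use the separate CogVideoX License, so check its terms before commercial use)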
  • High-quality outputs — competitive with closed Pika/Runway for many use cases
  • Multiple sizes: 2B (fast, ~12 GB VRAM), 5B (best quality, ~24 GB VRAM)
  • THUDM backing — top Chinese research institution
  • Active community — Diffusers integration first-class
  • Multi-modal: text2vid, img2vid, vid2vid

Integrations

  • HuggingFace Diffusers — first-class pipeline
  • ComfyUI nodes — community wrappers (very active)
  • Gradio UI included for web access
  • API mode — Diffusers Python wrapper
  • Pair with stills: Flux/SDXL → image → CogVideoX img2vid
  • Pair with prompts: Ollama prompt-enhance → CogVideoX (better outputs)
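
As a rough sketch of the Diffusers route listed above, the snippet below loads a CogVideoX checkpoint and writes an MP4. It assumes `diffusers`, `torch`, `accelerate` and a CUDA GPU are available. The `THUDM/CogVideoX-2b` model ID, 8 fps output (so 6 seconds is 49 frames), 50 inference steps and a guidance scale of 6.0 are commonly used defaults and are assumptions here, not settings confirmed by this page. `generation_settings` and `generate_clip` are illustrative names.

```python
def generation_settings(prompt: str, seconds: int = 6, fps: int = 8) -> dict:
    """Build keyword arguments for the pipeline call (assumed defaults)."""
    return {
        "prompt": prompt,
        "num_frames": seconds * fps + 1,  # e.g. 6 s at 8 fps -> 49 frames
        "num_inference_steps": 50,        # assumption: common default
        "guidance_scale": 6.0,            # assumption: common default
    }


def generate_clip(prompt: str, out_path: str = "output.mp4",
                  model_id: str = "THUDM/CogVideoX-2b", fps: int = 8) -> str:
    """Generate one clip and return the path of the written MP4."""
    # Heavy imports stay local so generation_settings() works without a GPU stack.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe.enable_model_cpu_offload()  # trades speed for lower peak VRAM
    pipe.vae.enable_tiling()         # decodes the video in tiles to save memory

    frames = pipe(**generation_settings(prompt, fps=fps)).frames[0]
    export_to_video(frames, out_path, fps=fps)
    return out_path
```

On a GPU host, `generate_clip("A paper boat drifting down a rain-filled gutter, close-up")` would write `output.mp4` in the working directory. For the stills pairing, Diffusers also provides `CogVideoXImageToVideoPipeline`, which takes an input image alongside the prompt.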

Notable users & community

  • 12k+ GitHub stars
  • Powering production tools at multiple AI startups
  • Featured prominently in HuggingFace video model leaderboards
  • Active Chinese + English community
  • Multiple community-driven LoRAs and fine-tunes available

Tips & operations

  • Model selection by VRAM:
    • 2B: ~12 GB VRAM, ~2 min per video
    • 5B: ~24 GB VRAM, ~6 min per video
  • Prompts: English works best; detailed descriptions improve coherence
  • Length: 6 seconds default, max 10 with current weights
  • Resolution: 720×480 default; community fine-tunes for 1280×720
  • First boot downloads ~20 GB for the 5B model, so pre-cache the weights
  • VAE upsample option for higher quality output
  • GPU memory tricks: enable model CPU offload for 12 GB cards
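
The VRAM guidance above can be turned into a small selection helper. This is a sketch only: the 12 GB and 24 GB thresholds and the per-video timings come from the figures in this list, while `pick_variant`, `precache` and the Hugging Face model IDs are illustrative assumptions. `precache` relies on `huggingface_hub.snapshot_download` to pull the weights ahead of first boot.

```python
from typing import NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    name: str
    repo_id: str           # assumed Hugging Face model ID
    vram_gb: int           # approximate requirement quoted in this guide
    minutes_per_video: int


# Ordered largest first so the best-quality model that fits is chosen.
VARIANTS = [
    Variant("5B", "THUDM/CogVideoX-5b", 24, 6),
    Variant("2B", "THUDM/CogVideoX-2b", 12, 2),
]


def pick_variant(available_vram_gb: float) -> Tuple[Variant, bool]:
    """Return (variant, use_cpu_offload) for a given amount of GPU memory."""
    # Per the tip above, CPU offload is enabled on cards with 12 GB or less.
    use_cpu_offload = available_vram_gb <= 12
    for variant in VARIANTS:
        if available_vram_gb >= variant.vram_gb:
            return variant, use_cpu_offload
    # Below 12 GB: fall back to the smallest model and rely on offload.
    return VARIANTS[-1], True


def precache(variant: Variant, cache_dir: Optional[str] = None) -> str:
    """Download the weights once and return the local snapshot path."""
    from huggingface_hub import snapshot_download  # local import: optional dependency
    return snapshot_download(repo_id=variant.repo_id, cache_dir=cache_dir)
```

For example, `pick_variant(24)` selects the 5B model without offload, while `pick_variant(12)` selects the 2B model with offload enabled.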

What we ship in BluixApps

  • Cloned THUDM/CogVideo repo
  • pytorch/pytorch CUDA 12.4 base
  • gradio_composite_demo launcher (richest UI)
  • Persistent volumes: repo, models (~20 GB), outputs (MP4)
  • Port 7863 mapped, listen 0.0.0.0
  • Install report at /root/bluixapps/cogvideox.txt
  • Notes for switching between 2B and 5B variants
  • GPU pre-flight check via bluixapps_ensure_nvidia_runtime
  • Backup hook covers models + outputs
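
After a deploy, a quick way to confirm that the Gradio UI is listening on the mapped port is a plain TCP check. The sketch below uses only the Python standard library; `ui_reachable` is an illustrative name, and the host you pass in is whatever address your own server uses (hypothetical here).

```python
import socket


def ui_reachable(host: str, port: int = 7863, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

A `True` result only shows that something accepts connections on port 7863; it does not confirm that the model weights have finished downloading.
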

Read this app's deep dive on bluix.app.

Get this app — pick a BluixApps plan

The plans share one catalog and scale by tenant count, white-label options and support tier.

Tier        Tenants    Catalog            Support          White-label  Monthly
Stacks      1          19 curated stacks  Standard         -            $19/mo
Starter     10         Full catalog       Standard         +$15–25/mo   $49/mo
Pro         25         Full catalog       Priority bugfix  +$15–25/mo   $149/mo
Growth      100        Full catalog       Priority bugfix  +$15–25/mo   $349/mo
Scale       500        Full catalog       7-day window     +$15–25/mo   $799/mo
Enterprise  Unlimited  Full catalog       Priority 7-day   Bundled      $1,499/mo
