2024-10-04 - Movie Gen: A Cast of Media Foundation Models

October 4, 2024, updated October 6, 2024

#video-generation

30B-parameter Transformer
- text-to-image and text-to-video generation
- trained on O(100M) videos and O(1B) images
- tuned with supervised fine-tuning (SFT)
13B-parameter video-to-audio and text-to-audio model
- trained on O(1M) hours of audio
Both models are trained with Flow Matching
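To make the Flow Matching note concrete, here is a minimal sketch of the training target on a straight path from Gaussian noise to a data sample. This is not the paper's code: the function names, the `sigma_min` default, and the uniform sampling of `t` are assumptions for illustration, and `predict_velocity` stands in for whatever network is being trained.

```python
import random

def flow_matching_pair(x1, x0, t, sigma_min=1e-5):
    """Return (x_t, target velocity) for one sample.

    x1 is a data sample, x0 is noise, t is in [0, 1].
    The path is linear: x_t = t*x1 + (1 - (1 - sigma_min)*t)*x0,
    so its time derivative is x1 - (1 - sigma_min)*x0.
    sigma_min is an assumed small constant, not taken from the notes.
    """
    xt = [t * a + (1.0 - (1.0 - sigma_min) * t) * b for a, b in zip(x1, x0)]
    vt = [a - (1.0 - sigma_min) * b for a, b in zip(x1, x0)]
    return xt, vt

def flow_matching_loss(predict_velocity, batch, rng, sigma_min=1e-5):
    """Mean squared error between predicted and target velocities."""
    total, count = 0.0, 0
    for x1 in batch:
        x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # fresh noise per sample
        t = rng.random()                        # assumed: t ~ Uniform[0, 1)
        xt, vt = flow_matching_pair(x1, x0, t, sigma_min)
        pred = predict_velocity(xt, t)          # hypothetical model call
        total += sum((p - v) ** 2 for p, v in zip(pred, vt))
        count += len(x1)
    return total / count

# Usage with a placeholder "model" that always predicts zero velocity.
rng = random.Random(0)
batch = [[1.0, 2.0], [0.0, -1.0]]
loss = flow_matching_loss(lambda xt, t: [0.0] * len(xt), batch, rng)
```

At sampling time the learned velocity field would be integrated from noise at t = 0 toward data at t = 1 with an ODE solver; that step is omitted here.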
