2025-02-10
Models
Papers
- [2502.04896] Goku: Flow Based Video Generative Foundation Models
- [2501.13918] Improving Video Generation with Human Feedback
- [2502.04507] Fast Video Generation with Sliding Tile Attention
- [2502.05173] VideoRoPE: What Makes for Good Video Rotary Position Embedding?
- [2502.05179] FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
- [2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
- [2502.04320] ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features
- [2502.05236] Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
- [2502.06788] EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
- [2502.06782] Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT
- [2502.06527] CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers
- [2502.06155] Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
- [2502.07701] Magic 1-For-1: Generating One Minute Video Clips within One Minute
- [2502.07617] Scaling Pre-training to One Hundred Billion Data for Vision Language Models
- [2502.07508] Enhance-A-Video: Better Generated Video for Free
- [2502.07864] TransMLA: Multi-head Latent Attention Is All You Need
- [2502.06145] Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
Code
- [ ]
Articles
- [ ]
Videos
- [ ]
Other
- [ ]