Models
Papers
- [2501.05460v1] Efficiently Serving Large Multimodal Models Using EPD Disaggregation
- [2501.06252] Transformer²: Self-adaptive LLMs
- [2501.08313] MiniMax-01: Scaling Foundation Models with Lightning Attention
- [2411.13055] Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
- [2501.09755] Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
- [2501.09732] Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
- [2501.09223] Foundations of Large Language Models
Code
- [ ]
Articles
Videos
- [ ]