2025-11-24
Models
- GitHub - Tencent-Hunyuan/HunyuanOCR
- black-forest-labs/FLUX.2-dev · Hugging Face
- GitHub - nari-labs/dia2: TTS model capable of streaming conversational audio in realtime.
- Zyphra
- inclusionAI/LLaDA2.0-flash · Hugging Face
- microsoft/Fara-7B · Hugging Face
Papers
- [2511.08704] Rethinking generative image pretraining: How far are we from scaling up next-pixel prediction?
- [2511.19399] DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
- [2511.18890] Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
- [2511.16397] AICC: Parse HTML Finer, Make Models Better — A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser
- [2511.16249] Controllable Layer Decomposition for Reversible Multi-Layer Image Generation
- [2511.19365] DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
- [2511.20635] iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation
- [2511.21631] Qwen3-VL Technical Report
- [2511.21691] Canvas-to-Image: Compositional Image Generation with Multimodal Controls
Code
- GitHub - opendatalab/MinerU: Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
- GitHub - inclusionAI/dFactory: Easy and Efficient dLLM Fine-Tuning
Articles
- BFL - Representation Comparison
- FP8 Reinforcement Learning | Unsloth Documentation
- Continuous batching from first principles
Videos
- [ ]
Other
- [ ]