2025-11-04
Models
- Kimi-Linear-A3B - a moonshotai Collection
- moonshotai/Kimi-Linear-48B-A3B-Instruct · Hugging Face
- GitHub - datalab-to/chandra: OCR model that handles complex tables, forms, handwriting with full layout. #ocr
- BAAI/Emu3.5 · Hugging Face
Papers
- [2510.27688] Continuous Autoregressive Language Models
- [2510.19949] Surfer 2: The Next Generation of Cross-Platform Computer Use Agents #cua
- [2510.15110v1] DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning
- [2511.03276] Diffusion Language Models are Super Data Learners
- [2510.26692] Kimi Linear: An Expressive, Efficient Attention Architecture
- [2510.24699] AgentFold: Long-Horizon Web Agents with Proactive Context Management
- [2408.08125] Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification
Code
Articles
- Introduction to parallelism in PyTorch | George Grigorev Blog
- Inside NVIDIA GPUs: Anatomy of high performance matmul kernels - Aleksa Gordić
- SORA From Scratch: Diffusion Transformers for Video Generation Models
- talks/fine_tuning_with_trl/Fine tuning with TRL (Oct 25).pdf at main · sergiopaniego/talks · GitHub
- Unlocking On-Policy Distillation for Any Model Family - a Hugging Face Space by HuggingFaceH4
- Beyond Standard LLMs - by Sebastian Raschka, PhD
Videos
Other