2024-09-02
Papers
- [2404.16710] LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding #transformers
- [2405.04434] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- ![[Pasted image 20240904210108.png]]
- [2409.02060] OLMoE: Open Mixture-of-Experts Language Models
Models
OLMoE - 1B Mixture of Experts #moe
- [2409.02060] OLMoE: Open Mixture-of-Experts Language Models
- Niklas Muennighoff on X: "Releasing OLMoE - the first good Mixture-of-Experts LLM that's 100% open-source - 1B active, 7B total params for 5T tokens - Best small LLM & matches more costly ones like Gemma, Llama - Open Model/Data/Code/Logs + lots of analysis & experiments https://t.co/Vpac2q90CS 🧵1/9 https://t.co/YOMV5t2Td1" / X
Meet Yi-Coder: A Small but Mighty LLM for Code - 01.AI Blog #code-model
Reflection 70B
- Matt Shumer on X: "The technique that drives Reflection 70B is simple, but very powerful. Current LLMs have a tendency to hallucinate, and can't recognize when they do so. Reflection-Tuning enables LLMs to recognize their mistakes, and then correct them before committing to an answer. https://t.co/pW78iXSwwb" / X
- mattshumer/Reflection-70B · Hugging Face
Salesforce xLAM
Videos
- Lecture 28: Liger Kernel - Efficient Triton Kernels for LLM Training - YouTube #triton
- Note: int64 addressing is slower than int32, but offsets need to be cast to int64 for large tensors (otherwise the index math overflows)
- Cohere For AI - Community Talks: Mostafa Elhoushi & Akshat Shrivastava - YouTube
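
A rough sketch of the int64 note under the Liger Kernel lecture, in plain Python rather than Triton. It only illustrates the arithmetic: when the largest element offset of a tensor exceeds the int32 range, 32-bit index math wraps around, so the offsets have to be widened to int64. The helper names (`max_flat_offset`, `needs_int64_offsets`, `wrap_int32`) are hypothetical and not taken from the lecture or from Liger Kernel.

```python
INT32_MAX = 2**31 - 1


def max_flat_offset(shape, strides):
    """Largest element offset reachable in a non-empty strided tensor."""
    return sum((dim - 1) * stride for dim, stride in zip(shape, strides))


def needs_int64_offsets(shape, strides):
    """True when some element offset does not fit in a signed 32-bit integer."""
    return max_flat_offset(shape, strides) > INT32_MAX


def wrap_int32(value):
    """Simulate two's-complement wraparound of a value stored as int32."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical logits-like shapes (rows x vocab), contiguous row-major layout.
small = (4096, 32000)
large = (131072, 32000)

for shape in (small, large):
    strides = (shape[1], 1)
    offset = max_flat_offset(shape, strides)
    print(shape, offset, needs_int64_offsets(shape, strides), wrap_int32(offset))
```

For the larger shape the last offset wraps to a negative number under int32, which is the failure the cast avoids; the smaller shape can stay on the faster int32 path.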
Dev
- Production-ready Python Docker Containers with uv #python #uv #docker
- CUDA-Free Inference for LLMs | PyTorch #pytorch
- SGLang v0.3 Release: 7x Faster DeepSeek MLA, 1.5x Faster torch.compile, Multi-Image/Video LLaVA-OneVision | LMSYS Org
- Advanced Python: Achieving High Performance with Code Generation | by Yonatan Zunger | Medium
Random
- Ilya Sutskever's SSI Inc raises $1B | Hacker News
- Dylan Freedman on X: "The new Qwen2-VL-7B Instruct model gets 100% accuracy extracting text from this handwritten document. This is the first open weights model (Apache 2.0) that I've seen OCR this accurately. (Thank you @fdaudens for the tip!) https://t.co/AB9r3bKDF0 https://t.co/nAEY7cp1w8" / X