2024-09-02

Papers

[2404.16710] LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding #transformers
[2405.04434] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- ![[Pasted image 20240904210108.png]]
[2409.02060] OLMoE: Open Mixture-of-Experts Language Models

Models

Videos

Lecture 28: Liger Kernel - Efficient Triton Kernels for LLM Training - YouTube #triton
- int64 addressing is slower than int32, so int32 is the faster default; for large tensors the index still has to be cast to int64, otherwise the element offset overflows
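
A rough sketch of why the cast matters, in plain Python rather than Triton. The shapes are hypothetical (32768 rows by a 128256-entry vocabulary, not numbers from the lecture), and `wrap_int32` only simulates two's-complement wraparound. In an actual kernel the fix would presumably be casting the row index to int64 before multiplying by the row stride.

```python
INT32_MAX = 2**31 - 1


def max_flat_offset(rows: int, cols: int) -> int:
    """Largest element offset in a contiguous (rows, cols) tensor."""
    return rows * cols - 1


def needs_int64(rows: int, cols: int) -> bool:
    """True when the largest offset no longer fits in a signed 32-bit int."""
    return max_flat_offset(rows, cols) > INT32_MAX


def wrap_int32(value: int) -> int:
    """Simulate what a signed 32-bit register would hold after overflow."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical logits tensor: 32768 token rows x 128256 vocabulary columns.
rows, cols = 32768, 128256
offset = max_flat_offset(rows, cols)

print(needs_int64(rows, cols))   # offset exceeds the int32 range
print(wrap_int32(offset) < 0)    # in int32 the offset wraps to a negative address
print(needs_int64(1024, 1024))   # small tensors are fine with int32
```

Checking the size up front and only switching to int64 when it is needed keeps the faster int32 path for small tensors.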
Cohere For AI - Community Talks: Mostafa Elhoushi & Akshat Shrivastava - YouTube

Dev

Random
