Optimizations
Quantized Optimizers
Keep optimizer state (e.g. Adam's first and second moments) in 8-bit with blockwise scaling instead of float32, as in bitsandbytes' 8-bit Adam, cutting optimizer-state memory roughly 4x.
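A minimal pure-Python sketch of the core idea (hypothetical function names): each block of state values is stored as int8 plus one absmax scale per block. Real 8-bit optimizers use a non-linear dynamic quantization map rather than this linear one, but the blockwise absmax scaling is the same.

```python
def quantize_blockwise(values, block_size=4):
    """Quantize floats to int8 codes with one absmax scale per block.

    Sketch of how 8-bit optimizers shrink state: int8 codes plus a
    small per-block scale array instead of full float32 tensors.
    """
    quantized, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        scale = max(abs(v) for v in block) or 1.0  # avoid div-by-zero
        scales.append(scale)
        quantized.extend(round(v / scale * 127) for v in block)
    return quantized, scales

def dequantize_blockwise(quantized, scales, block_size=4):
    """Recover approximate floats from int8 codes and per-block scales."""
    return [q / 127 * scales[i // block_size] for i, q in enumerate(quantized)]
```

The round trip is lossy, but the error per value is bounded by the block's scale divided by 127, which is why blockwise (rather than per-tensor) scaling matters when a tensor mixes large and small values.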
Fused Ops
Replace chains of small kernels with single fused kernels to cut memory traffic and kernel-launch overhead:
- unslothai/unsloth: finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
- linkedin/Liger-Kernel: efficient Triton kernels for LLM training
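One of the biggest savings from kernels like Liger's fused linear + cross-entropy is never materializing the full [tokens, vocab] logits matrix. A NumPy sketch of that chunking idea (not the actual Triton kernel; `chunked_linear_cross_entropy` is a hypothetical name):

```python
import numpy as np

def chunked_linear_cross_entropy(hidden, weight, targets, chunk_size=2):
    """Mean cross-entropy over logits = hidden @ weight.T, computed
    chunk by chunk along the token dimension so only a
    [chunk, vocab] slice of logits exists at any time."""
    n_tokens = hidden.shape[0]
    total = 0.0
    for start in range(0, n_tokens, chunk_size):
        h = hidden[start:start + chunk_size]            # [chunk, d]
        logits = h @ weight.T                           # [chunk, vocab]
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        logsumexp = np.log(np.exp(logits).sum(axis=1))
        t = targets[start:start + chunk_size]
        total += (logsumexp - logits[np.arange(len(t)), t]).sum()
    return total / n_tokens
```

For a large vocabulary the peak activation memory drops from n_tokens x vocab to chunk_size x vocab; the fused Triton version additionally computes the gradient in the same pass.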
Compile
torch.compile traces the model and generates fused kernels (TorchInductor by default), usually a one-line change for a training speedup.
FlexAttention
Block-causal attention mask so multiple samples can be packed into one sequence: each token attends only causally and only within its own sample.
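FlexAttention expresses masks as a predicate over (query index, key index); for packed samples the predicate is "same document AND causal". A pure-Python sketch of that logic (with FlexAttention proper, the same predicate would be passed as a `mask_mod` to `torch.nn.attention.flex_attention.create_block_mask`; `make_block_causal_mask` here is a hypothetical helper that just builds a dense boolean mask for illustration):

```python
def make_block_causal_mask(doc_ids):
    """Dense [seq, seq] boolean mask for packed samples: position q may
    attend to position kv only if both tokens belong to the same
    document and kv is not in the future."""
    def mask_mod(q_idx, kv_idx):
        return doc_ids[q_idx] == doc_ids[kv_idx] and kv_idx <= q_idx

    n = len(doc_ids)
    return [[mask_mod(q, kv) for kv in range(n)] for q in range(n)]
```

Because the mask is block-structured (one causal block per document), FlexAttention can skip whole all-masked blocks instead of computing and discarding cross-sample scores.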