- GitHub - bitsandbytes-foundation/bitsandbytes: Accessible large language models via k-bit quantization for PyTorch (see the usage sketch after this list).
- GitHub - pytorch/ao: PyTorch native quantization and sparsity for training and inference
- GitHub - intel/auto-round: Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
- GitHub - ChenMnZ/PrefixQuant: An algorithm for static activation quantization of LLMs
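
As a rough orientation for the first entry, the following is a minimal sketch of loading a causal LM with 4-bit NF4 weight quantization through the Hugging Face transformers integration of bitsandbytes. It assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA device is available; the checkpoint name is a placeholder, not something prescribed by the repositories above.

```python
# Minimal sketch: 4-bit NF4 weight quantization via bitsandbytes through
# the transformers integration. Checkpoint name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B"  # placeholder checkpoint

# Quantize weights to 4-bit NF4, keep compute in bfloat16, and enable
# double quantization of the quantization constants to save extra memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,  # weights are quantized at load time
    device_map="auto",               # requires accelerate
)

inputs = tokenizer("Quantization reduces memory by", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

The other repositories target different points in the quantization pipeline (torchao for PyTorch-native quantization/sparsity, auto-round for weight-rounding optimization, PrefixQuant for static activation quantization), so their APIs differ; consult each repo's README for its own entry points.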