compression

December 9, 2023 (updated October 5, 2024), 1 min read

Model Compression

Quantization

A Visual Guide to Quantization - by Maarten Grootendorst

https://github.com/666DZY666/micronet
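
As a quick reminder of the core idea behind the links above, here is a minimal sketch of symmetric (absmax) quantization to 8-bit integers in plain Python. It is not taken from either linked resource; the function names `quantize_absmax` and `dequantize` and the sample weights are hypothetical, chosen only for illustration. Real libraries typically work on tensors and often use per-channel or per-block scales.

```python
def quantize_absmax(values, num_bits=8):
    """Symmetric (absmax) quantization of a list of floats to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would work.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, for illustration only.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the rounding error this scheme trades for the smaller storage format.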

Lower Precision

GitHub - TimDettmers/bitsandbytes: 8-bit CUDA functions for PyTorch

Knowledge Distillation

For knowledge distillation, see this note.

Lectures

https://efficientml.ai/schedule/

ml