Model Compression

Quantization

https://github.com/666DZY666/micronet

Lower Precision

https://github.com/TimDettmers/bitsandbytes (bitsandbytes: 8-bit CUDA functions for PyTorch)

Knowledge Distillation

For Knowledge Distillation see this note

Lectures

https://efficientml.ai/schedule/

