PyTorch

Performance Optimizations

Model Initialization

Mixed Precision

Quantization

Data Loading

torch.compile

JIT

TorchDynamo

https://github.com/pytorch/torchdynamo

Example: https://github.com/pytorch/torchdynamo/blob/main/benchmarks/training_loss.py

TorchInductor

https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747

AITemplate

https://github.com/facebookincubator/AITemplate

Distributed

Debugging and Profiling

Memory

Utility Libraries

FlexAttention