PyTorch Conference
Activation Checkpointing (SAC: Selective Activation Checkpointing)
- torch.compile automatically does some recomputation
- By default, torch.compile uses the min-cut partitioner, which chooses to recompute certain ops with the objective of minimizing the number of tensors saved across the forward/backward boundary
- Its primary objective is to reduce runtime, however, so it is relatively conservative about recomputation: e.g., it only recomputes fusible, non-compute-intensive ops, and uses a heuristic to avoid long fusible chains
- In 2.5, a new checkpoint API will allow setting a checkpointing policy (a hedged sketch follows below)
- 2.4 has a new compile-only memory-budget API to trade off memory for speed
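A minimal sketch of the 2.5-style policy API (assuming the `torch.utils.checkpoint` selective-checkpointing prototype; the always-save-matmuls policy and the toy block here are illustrations, not a recommended setting):

```python
from functools import partial

import torch
from torch.utils.checkpoint import (
    CheckpointPolicy,
    checkpoint,
    create_selective_checkpoint_contexts,
)

def policy_fn(ctx, op, *args, **kwargs):
    # Save matmul outputs (compute-intensive, expensive to redo);
    # let everything else be recomputed during backward.
    if op == torch.ops.aten.mm.default:
        return CheckpointPolicy.MUST_SAVE
    return CheckpointPolicy.PREFER_RECOMPUTE

def block(x, w1, w2):
    return torch.relu(x @ w1) @ w2

x = torch.randn(64, 128, requires_grad=True)
w1 = torch.randn(128, 256, requires_grad=True)
w2 = torch.randn(256, 128, requires_grad=True)

out = checkpoint(
    block, x, w1, w2,
    use_reentrant=False,
    context_fn=partial(create_selective_checkpoint_contexts, policy_fn),
)
out.sum().backward()
```

The 2.4 memory-budget knob appears to live under the experimental `torch._functorch.config.activation_memory_budget` (a float in [0, 1] used together with `torch.compile`); since it is a private config, treat the exact name as an assumption.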
Timeline of LLMs
PyTorch Conference 2024: Keynote: Navigating the Architectural Ti…
Data Quality - filtering, curriculum, synthetic data
grouped query attention (see the sketches after the litgpt link below)
larger vocabulary
RMSNorm
RoPE encoding (relative position encoding)
mixture of experts (e.g. Mixtral)
sliding window attention
litgpt/litgpt/model.py at main · Lightning-AI/litgpt · GitHub
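The litgpt file linked above implements most of these ingredients; below is a minimal, self-contained sketch of three of them (RMSNorm, RoPE, grouped-query attention). The shapes, the RoPE pairing convention, and the toy attention call are assumptions for illustration, not litgpt's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Normalize by the root-mean-square of the features; no mean
    subtraction and no bias, so it is cheaper than LayerNorm."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x * rms

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position encoding: rotate feature pairs by position-dependent
    angles so attention scores depend on relative positions.
    x: (batch, heads, seq, head_dim) with even head_dim."""
    _, _, seq, dim = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = torch.outer(torch.arange(seq, dtype=torch.float32), inv_freq)  # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    # Pairing/layout conventions differ between implementations; any fixed
    # pairing works as long as queries and keys use the same one.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def grouped_query_attention(q, k, v):
    """GQA: several query heads share each key/value head, shrinking the KV
    cache. q: (B, Hq, T, d); k, v: (B, Hkv, T, d) with Hq divisible by Hkv."""
    groups = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(groups, dim=1)  # expand KV heads to match Q heads
    v = v.repeat_interleave(groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```

For example, with q of shape (2, 8, 16, 64) and k, v of shape (2, 2, 16, 64), each group of four query heads attends through one shared key/value head.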