pytorch

PyTorch Conference


Activation Checkpointing (SAC - Selective Activation Checkpointing)


  • torch.compile automatically does some recomputation
  • By default, torch.compile uses the min-cut partitioner, which chooses to recompute certain ops with the objective of minimizing the number of tensors saved at the forward/backward boundary
  • Its primary objective is to reduce runtime, however, so it is relatively conservative w.r.t. recomputation: e.g. it only recomputes fusible, non-compute-intensive ops, and has a heuristic to avoid long fusible chains.

In 2.5, a new checkpoint API will allow setting the checkpointing policy; 2.4 has a new compile-only memory budget API to trade off memory for speed.

import torch

# Budget is a float in [0, 1]: 1.0 keeps the default partitioner behavior, lower values recompute more to save memory.
torch._dynamo.config.activation_memory_budget = 0.5
out = torch.compile(fn)(inp)
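
A minimal sketch of what setting a per-op checkpointing policy could look like, assuming the torch.utils.checkpoint selective-checkpoint API (create_selective_checkpoint_contexts and CheckpointPolicy); fn, ops_to_save, and the shapes below are illustrative only:

import torch
from functools import partial
from torch.utils.checkpoint import (
    checkpoint,
    create_selective_checkpoint_contexts,
    CheckpointPolicy,
)

# Save the outputs of matmuls for backward; recompute everything else.
ops_to_save = [torch.ops.aten.mm.default]

def policy_fn(ctx, op, *args, **kwargs):
    if op in ops_to_save:
        return CheckpointPolicy.MUST_SAVE
    return CheckpointPolicy.PREFER_RECOMPUTE

def fn(x):
    return torch.mm(x, x).relu().sum()

x = torch.randn(64, 64, requires_grad=True)
out = checkpoint(
    fn,
    x,
    use_reentrant=False,
    context_fn=partial(create_selective_checkpoint_contexts, policy_fn),
)
out.backward()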

Timeline of LLMs

PyTorch Conference 2024: Keynote: Navigating the Architectural Ti…

Data Quality - Filtering, Curriculum, Synthetic

  • grouped query attention (see the sketch below)
  • larger vocab
  • RMS Norm
  • RoPE encoding (relative)
  • mixture of experts (e.g. Mixtral)
  • sliding window attention
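
A compact sketch of the grouped query attention idea from the list above: fewer key/value heads than query heads, with each KV head shared by a group of query heads (the helper name and shapes are illustrative, not from any specific model):

import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group_size = n_q_heads // n_kv_heads
    # Broadcast each key/value head to its group of query heads.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

q = torch.randn(2, 8, 16, 64)   # 8 query heads
k = torch.randn(2, 2, 16, 64)   # 2 shared key/value heads
v = torch.randn(2, 2, 16, 64)
out = grouped_query_attention(q, k, v)   # (2, 8, 16, 64)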

litgpt/litgpt/model.py at main · Lightning-AI/litgpt · GitHub

Better GPU Support in Apache Ray