Multi-Head Latent Attention (MLA)
December 28, 2024, updated December 31, 2024

[2405.04434] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
multi head latent attention (MLA) · GitHub