NeurIPS Tutorial - Opening the Language Model Pipeline: A Tutorial on Data Preparation, Model Training, and Adaptation

NeurIPS Tutorial - Flow Matching for Generative Modeling

NeurIPS Tutorial - Beyond Decoding: Meta-Generation Algorithms for Large Language Models

Lambda - Distributed Training

Shopify RecSys


MATH-AI: The 4th Workshop on Mathematical Reasoning and AI

Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning

The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models

Tri Dao
SSMs and Mamba

Navdeep Jaitly

Sparsified SSMs

Longhorn: State Space Models are Amortized Online Learners

GEAR - KV Cache Compression


OLMoE - AI2



An Evolved Universal Transformer Memory
Machine Learning for Systems
Jeff Dean
Fine-Tuning in Modern Machine Learning: Principles and Scalability

How Transformers learn Causal Structure with Gradient Descent