Papers
- [2408.10189] Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models (tags: ssms, distillation)
- [2408.10012] CLIPCleaner: Cleaning Noisy Labels with CLIP (tags: label-noise)
- [2403.17695] PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
- [2301.07088] Vision Learners Meet Web Image-Text Pairs (tags: selfsup)
- [2308.07545] Vision-Language Dataset Distillation
- [2408.12408] An Evaluation of Deep Learning Models for Stock Market Trend Prediction
- [2407.10240] xLSTMTime: Long-term Time Series Forecasting With xLSTM
Videos
- Aidan Gomez: What No One Understands About Foundation Models | E1191 - YouTube
- Cohere For AI - Community Talks: Charles Hernandez - YouTube (tags: torch, quantization)
- Fast, lazy container loading in modal.com by Jonathon Belotti - YouTube