Models
- [ ]
Papers
- [2410.23262] EMMA: End-to-End Multimodal Model for Autonomous Driving
- [2411.03313] Classification Done Right for Vision-Language Pre-Training
- [2406.06484] Parallelizing Linear Transformers with the Delta Rule over Sequence Length (sequential recurrence sketched after this list)
- [2411.02959] HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
- [2411.04965] BitNet a4.8: 4-bit Activations for 1-bit LLMs
- [2411.04996] Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
- [2411.04905] OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
- [2410.17897] Value Residual Learning For Alleviating Attention Concentration In Transformers
- [2410.21228] LoRA vs Full Fine-tuning: An Illusion of Equivalence
- [2407.10964] No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations
- [2405.17604] LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters
- [2411.02853] ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate (update step sketched after this list)
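Quick note on the delta-rule paper (2406.06484): the loop below is only my sketch of the sequential fast-weight recurrence that the paper shows how to compute in parallel over sequence length; the parallel (chunkwise) algorithm itself is not implemented here, and the function and variable names are mine, not the paper's.

```python
import numpy as np

def delta_rule_recurrence(q, k, v, beta):
    """Naive sequential delta-rule recurrence (reference loop, not the paper's
    parallel algorithm).

    q, k, v: (T, d) arrays of queries, keys, values; beta: (T,) write strengths.
    The fast-weight state S is a (d, d) matrix updated as
        S_t = S_{t-1} + beta_t * (v_t - S_{t-1} k_t) k_t^T,
    i.e. the value stored for key k_t is corrected toward v_t.
    """
    T, d = q.shape
    S = np.zeros((d, d))
    out = np.zeros_like(v)
    for t in range(T):
        pred = S @ k[t]                                 # what the memory currently returns for k_t
        S = S + beta[t] * np.outer(v[t] - pred, k[t])   # delta-rule correction
        out[t] = S @ q[t]                               # read out with the query
    return out
```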
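Quick note on ADOPT (2411.02853): a rough sketch of the modified Adam step as I understand it; the key change is that the current gradient is normalized by the previous second-moment estimate (so it never appears in its own normalizer) and momentum is applied after normalization. Hyperparameter defaults below are assumptions, not necessarily the paper's.

```python
import numpy as np

def adopt_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6):
    """One ADOPT-style parameter update (rough sketch, not the official code).

    Differences from Adam, as I read the paper: g is divided by the *previous*
    second-moment estimate v before entering the momentum term, and v is
    refreshed only afterwards with g**2.
    """
    if t == 0:
        # First step only initializes the second moment.
        return theta, m, g * g
    m = beta1 * m + (1.0 - beta1) * g / np.maximum(np.sqrt(v), eps)
    theta = theta - lr * m
    v = beta2 * v + (1.0 - beta2) * g * g
    return theta, m, v
```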
Code
Articles
- [ ]
Videos
- Stanford Graph Learning Workshop 2024 - YouTube
- https://www.youtube.com/watch?v=0Yi3yUjB-3M&list=PPSV
Other
- [ ]