Models
- GitHub - deepseek-ai/DeepSeek-Coder-V2: DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
- Reader-LM: Small Language Models for Cleaning and Converting HTML to Markdown
- [2409.06666] LLaMA-Omni: Seamless Speech Interaction with Large Language Models
- mistral-community/pixtral-12b-240910 · Hugging Face
- x.com
Papers
- [2404.03085] Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
- [2409.03460] LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones
- [2409.04431] Theory, Analysis, and Best Practices for Sigmoid Self-Attention
- [2409.07431] Synthetic continued pretraining
- [2409.07146] Gated Slot Attention for Efficient Linear-Time Sequence Modeling
- [2409.08239] Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources (tags: synthetic-data)
Code
- GitHub - kvcache-ai/ktransformers: A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations (tags: triton)
- GitHub - sustcsonglin/flash-linear-attention: Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton (tags: linear-attention, ssm, triton)
Articles
- PDF Retrieval with Vision Language Models | Vespa Blog
- ColPali: Efficient Document Retrieval with Vision Language Models 👀