michal.i/o

2024-09-09

Jan 21, 2025

todo
speech
attention
synthetic-data
triton
linear-attention
ssm

Models

GitHub - deepseek-ai/DeepSeek-Coder-V2: DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Reader-LM: Small Language Models for Cleaning and Converting HTML to Markdown
- todo distill to SSM / Mamba model
- todo train OCR model on page image to markdown
- todo train html / image to json schema model
[2409.06666] LLaMA-Omni: Seamless Speech Interaction with Large Language Models
- ICTNLP/Llama-3.1-8B-Omni · Hugging Face #speech
mistral-community/pixtral-12b-240910 · Hugging Face
x.com

Papers

[2404.03085] Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
- Mycelium • Graph visualization library
[2409.03460] LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones
- GitHub - altair199797/LowFormer
[2409.04431] Theory, Analysis, and Best Practices for Sigmoid Self-Attention
- GitHub - apple/ml-sigmoid-attention #attention
[2409.07431] Synthetic continued pretraining
[2409.07146] Gated Slot Attention for Efficient Linear-Time Sequence Modeling
[2409.08239] Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources #synthetic-data

Code

GitHub - kvcache-ai/ktransformers: A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations #triton
GitHub - sustcsonglin/flash-linear-attention: Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton #linear-attention #ssm #triton

Articles

PDF Retrieval with Vision Language Models | Vespa Blog
ColPali: Efficient Document Retrieval with Vision Language Models 👀

Videos

[ ]

Other

[ ]
