Models
- [2409.11402] NVLM: Open Frontier-Class Multimodal LLMs #vlm
- Qwen2.5: A Party of Foundation Models! | Qwen
- GitHub - ictnlp/LLaMA-Omni: LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level. #speech
- GitHub - kyutai-labs/moshi #speech
- GitHub - microsoft/GRIN-MoE
Papers
- EUREKA: Evaluating and Understanding Large Foundation Models - Microsoft Research
- [2409.11321] SOAP: Improving and Stabilizing Shampoo using Adam #optimizers
- [2409.10173] jina-embeddings-v3: Multilingual Embeddings With Task LoRA #text-embeddings
- [2409.12191] Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution
- [2409.11564] Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey
Code
- GitHub - kyleliang919/Online-Subspace-Descent: This repo is based on https://github.com/jiaweizzhao/GaLore, paper coming soon #optimizers #compression
- GitHub - NVIDIA/Megatron-Energon: Megatron’s multi-modal data loader #multimodal #dataloader #pytorch
- GitHub - TorchDR/TorchDR: TorchDR - PyTorch Dimensionality Reduction #pytorch #umap #tsne
- GitHub - modelscope/ms-swift: Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, …) #vision-language #vlm #tuning
- Release v0.3.0 Release Note · linkedin/Liger-Kernel · GitHub
- GitHub - voideditor/void: open-source Cursor alternative
- GitHub - pytorch-labs/LeanRL: LeanRL is a fork of CleanRL where selected PyTorch scripts are optimized for performance using torch.compile and CUDA graphs (minimal sketch below) #pytorch #rl
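A minimal sketch of the optimization LeanRL applies to CleanRL-style scripts, with an assumed toy policy and loss rather than LeanRL's actual code: `torch.compile(mode="reduce-overhead")` captures the compiled region into CUDA graphs, cutting per-step Python and kernel-launch overhead.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 2)).cuda()
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

# mode="reduce-overhead" replays a captured CUDA graph on later calls,
# so per-step launch overhead largely disappears after warmup.
@torch.compile(mode="reduce-overhead")
def compute_loss(obs: torch.Tensor, returns: torch.Tensor) -> torch.Tensor:
    values = policy(obs).sum(-1)
    return ((values - returns) ** 2).mean()  # toy value-regression loss

obs = torch.randn(1024, 8, device="cuda")
returns = torch.randn(1024, device="cuda")
for _ in range(10):  # early iterations compile and warm up the graph
    loss = compute_loss(obs, returns)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
```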
Articles
- Improved Data Loading with Threads | NVIDIA Technical Blog (a threaded-loader sketch follows this list)
  - uses Python's noGIL (free-threaded) mode to evaluate a thread-based torch DataLoader
  - CUDA context switching is a larger cost with process-based workers
  - no difference observed with Pillow
- rerankers: A Lightweight Python Library to Unify Ranking Methods – Answer.AI
- [Distributed w/ TorchTitan] Introducing Async Tensor Parallelism in PyTorch - distributed / torchtitan - PyTorch Forums #pytorch #distributed
- Polars — GPU acceleration with Polars and NVIDIA RAPIDS #tabular
- How to make LLMs go fast
- Fine-tuning LLMs to 1.58bit: extreme quantization made easy (ternary quantization sketch after this list)
- static.sched.com/hosted_files/pytorch2024/8f/Pytorch Conference - Making LLM training faster.pdf
- Inference-Friendly Models with MixAttention | Databricks Blog
- Optimizing AI Inference at Character.AI
- https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
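To ground the NVIDIA threaded data-loading item above: a minimal thread-based loader sketch, assuming a free-threaded (noGIL) CPython build where decode threads run in parallel and avoid the per-process CUDA context cost of multiprocessing DataLoader workers. This is not the blog's benchmark code; `load_sample` is a hypothetical stand-in for the real decode work.

```python
from concurrent.futures import ThreadPoolExecutor

import torch

def load_sample(index: int) -> torch.Tensor:
    # Stand-in for real per-sample work (e.g. JPEG decode + augmentation).
    return torch.randn(3, 224, 224)

class ThreadedLoader:
    """Batches samples using a thread pool instead of worker processes."""

    def __init__(self, dataset_len: int, batch_size: int = 32, num_threads: int = 8):
        self.dataset_len = dataset_len
        self.batch_size = batch_size
        self.pool = ThreadPoolExecutor(max_workers=num_threads)

    def __iter__(self):
        for start in range(0, self.dataset_len, self.batch_size):
            stop = min(start + self.batch_size, self.dataset_len)
            # On a noGIL build these map calls truly run in parallel.
            samples = list(self.pool.map(load_sample, range(start, stop)))
            yield torch.stack(samples)

for batch in ThreadedLoader(dataset_len=256):
    pass  # training step would go here
```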
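For the 1.58-bit article: 1.58 ≈ log2(3), i.e. each weight carries one of three values {-1, 0, +1} plus a shared floating-point scale. A hedged sketch of BitNet-b1.58-style absmean ternary quantization, as an illustration rather than the article's fine-tuning recipe:

```python
import torch

def quantize_ternary(w: torch.Tensor, eps: float = 1e-5):
    # Per-tensor absmean scale, then round each weight to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q, scale

w = torch.randn(4, 4)
w_q, scale = quantize_ternary(w)
w_hat = w_q * scale  # dequantized weights used in the matmul
print((w - w_hat).abs().mean())  # quantization error
```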
Videos
- CS 194/294-196 (LLM Agents) - Lecture 1 - YouTube
- TWIML episode with Simon Willison on LLMs for code - YouTube
- Noam Brown (OpenAI) on test-time compute and planning - YouTube
- Tabular Learning: skrub and Foundation Models with Gaël Varoquaux, PhD - YouTube #tabular
Other
- Illuminate - paper-to-podcast tool from Google