michal.i/o


2024-11-18

Jan 21, 2025 · 2 min read

star

Models

nvidia/Hymba-1.5B-Base · Hugging Face
Cosmos Tokenizer: A suite of image and video neural tokenizers star
- GitHub - NVIDIA/Cosmos-Tokenizer: A suite of image and video neural tokenizers
Nexusflow/Athene-V2-Agent · Hugging Face

Papers

[2411.07975] JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
[2410.08020] Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
[2411.10440] LLaVA-o1: Let Vision Language Models Reason Step-by-Step
[2411.10433] M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation
[2411.13676] Hymba: A Hybrid-head Architecture for Small Language Models
[2411.14402] Multimodal Autoregressive Pre-training of Large Vision Encoders star
[2411.14347] DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding star
[2411.12155] Reinforcement Learning with Action Sequence for Data-Efficient Robot Learning
[2411.14429] Revisiting the Integration of Convolution and Attention for Vision Backbone

Code

GitHub - apple/ml-aim: This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
GitHub - Lightricks/LTX-Video: Official repository for LTX-Video
GitHub - rayleizhu/GLMix: [NeurIPS 2024] official code release for our paper “Revisiting the Integration of Convolution and Attention for Vision Backbone”.

Articles

You could have designed state of the art positional encoding
Extending the Context Length to 1M Tokens! | Qwen

Videos

Tim Dettmers on Open-source AI, LMs, SWE Bench, Agents, Quantization, & Optimization - YouTube
Speculations on Test-Time Scaling (o1) - YouTube
- Speculations on Test-Time Scaling | Richard M. Karp Distinguished Lecture - YouTube
Retrieval augmented generation; Extractive summarization - YouTube
Learning at test time in LLMs - YouTube
QA: Retrieval & Answer extraction - YouTube
Flash Attention derived and coded from first principles with Triton (Python) - YouTube
Guest Lecture 1: Or Patashnik - The Power of Attention Layers (KAIST CS492D, Fall 2024) - YouTube

Other

CS 886: Recent Advances on Foundation Models
Stanford Graph Learning Workshop 2024 | Stanford Engineering Data Science Applications

Tweets


