michal.i/o

2024-12-16

Jan 21, 2025 · 2 min read

Models

Welcome to the Falcon 3 Family of Open Models!
GitHub - foundation-model-stack/bamba: Train, tune, and infer Bamba model
GitHub - AnswerDotAI/ModernBERT: Bringing BERT into modernity via both architecture changes and scaling

Papers

[2412.07752] FlashRNN: Optimizing Traditional RNNs on Modern Hardware
[2412.10360] Apollo: An Exploration of Video Understanding in Large Multimodal Models
[2412.10117] CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models
[2410.18779] A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
[2412.10302] DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[2412.09607] Spectral Image Tokenizer
[2412.12095] Causal Diffusion Transformers for Generative Modeling
[2412.10437] SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
[2410.02899] FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
[2412.01951] Self-Improvement in Language Models: The Sharpening Mechanism
[2412.13061] VidTok: A Versatile and Open-Source Video Tokenizer
[[2412.14164] MetaMorph: Multimodal Understanding and Generation via Instruction Tuning](https://arxiv.org/abs/2412.14164)
[2412.12432] Three Things to Know about Deep Metric Learning
[2412.13303] FastVLM: Efficient Vision Encoding for Vision Language Models
[2412.15115] Qwen2.5 Technical Report
[2412.15213] Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
[2412.13663] Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
[2412.14475] MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Code

GitHub - hao-ai-lab/FastVideo: FastVideo is an open-source framework for accelerating large video diffusion model.
GitHub - Genesis-Embodied-AI/Genesis: A generative world for general-purpose robotics & embodied AI learning.
GitHub - huggingface/picotron: Minimalistic 4D-parallelism distributed training framework for education purpose

Articles

[ ]

Videos

[ ]

Other

[ ]

Tweets
