ml

Notes

2025 12 03 Matryoshka Transformers for Diffusion 0
2025 12 03 LoRAs for diffusion steps 92
2025 11 23 Computer Vision 60
2025 11 21 Low Precision Formats and Mixed Precision 23
2025 11 06 Model Leaderboards 71
2025 11 06 3D Generation and World Models 818
2025 11 06 ML for CAD 5585
2025 10 27 Prompt Optimization 421
2025 10 27 Deep Research 1464
2025 10 17 Diffusion Language Models 3729
2025 10 09 Image Editing 85
2025 10 09 Computer Use Agents 58
2025 02 26 Diffusion for Perception 258
2025 02 03 Video Editing 23
2025 01 30 Multi-Vector Retrieval 460
2025 01 29 Omni Multimodal Models 250
2025 01 29 Pretraining - Large Scale Training Tricks 1116
2025 01 28 Vector Quantization and Compression 32
2025 01 28 PyTorch Performance Guide 104
2025 01 27 Music Generation 39
2025 01 27 Multi Label future token prediction head 54
2025 01 26 Super Fast Decoder Inference 0
2025 01 25 Take all branches in parallel 104
2025 01 25 Latent Generative visual reasoning 30
2025 01 25 Soft Verifiers 22
2025 01 25 GAN + Active Learning on top of Reasoning 106
2025 01 25 User Embedding Conditioned Generative Models 142
2025 01 25 Codebook KV Cache 476
2025 01 25 CLIP in GPT 597
2025 01 23 2025-01-23 RAG Pipelines 2215
2025 01 20 LLM Evaluation 636
2025 01 18 Production Machine Learning Systems 4288
2025 01 17 Information Retrieval - Retrieval, Ranking and Search 1693
2025 01 15 Commander - Super Fast Local Function Calling 413
2025 01 07 Prompt Engineering 652
2025 01 07 ML Systems 181
2024 12 18 Activation Functions 194
2024 12 17 Speedruns 544
2024 12 17 Hallucinations 131
2024 12 16 Document Processing 262
2024 12 13 Pretrain on synthetic conversation data 235
2024 12 13 Predict token from positional embedding 0
2024 12 13 Tokenization 1514
2024 12 11 "World Models" - Modeling the Real World 138
2024 12 11 Autonomous Driving - Self Driving 670
2024 12 10 Neural Architecture Search (NAS) 639
2024 12 10 Neural Architecture Search for SSM Hybrids 185
2024 12 09 Pruning 120
2024 12 08 Decoding and Sampling 3307
2024 12 08 2024 NeurIPS 7553
2024 12 08 Mechanistic Interpretability 596
2024 12 07 Function Calling (with LLMs) 1042
2024 12 07 ML Competitions 44
2024 12 07 AI Web Browser 2244
2024 12 07 Self-Supervised Image Models 938
2024 12 06 Teach VLM to Zoom and Pan 121
2024 12 05 Food Recognition 90
2024 12 05 Image Matching 286
2024 12 03 Token Dropping, Pruning, Merging and Compression 1367
2024 12 03 Generative Models 3259
2024 12 03 Variational Autoencoders (VAE) 122
2024 12 03 Agents 2117
2024 11 30 ML Courses & Books 2655
2024 11 29 Data Curation 338
2024 11 27 Mixture of Modules 300
2024 11 22 Structured Generation with LLMs 571
2024 11 17 2024-11-17 - Mixture-of-Transformers A Sparse and Scalable Architecture for Multi-Modal Foundation Models 176
2024 11 16 SLAM 185
2024 11 08 Sapiens for Robotics 0
2024 11 08 Bad apples for label noise early stopping 0
2024 11 08 Small Proxy model to predict loss for given sample 40
2024 11 03 2024-11-03 - ReMoE FULLY DIFFERENTIABLE MIXTURE-OF-EXPERTS WITH RELU ROUTING 0
2024 11 03 2024-11-03 - GATED DELTA NETWORKS IMPROVING MAMBA2 WITH DELTA RULE 158
2024 11 03 2024-11-03 - On the Efficiency of Convolutional Neural Networks 1206
2024 11 03 2024-11-03 - TokenFormer - RETHINKING TRANSFORMER SCALING WITH TOKENIZED MODEL PARAMETERS 398
2024 10 28 Small Foundational Models 945
2024 10 26 White space separated conv text encoder 0
2024 10 26 Early Fusion Multimodal Encoder Models 338
2024 10 25 Learning Skip Layers 0
2024 10 24 Robotics 3336
2024 10 24 logsumexp 733
2024 10 22 Data Loading 383
2024 10 22 Learn to Initialize from OS Models 62
2024 10 19 Two Stream SSMs 0
2024 10 18 matryoshka embeddings 114
2024 10 17 Tensor Tricks 298
2024 10 16 SSMs 4 Rec 0
2024 10 15 Test Time Compute, LLM Reasoning, Inference Time Scaling 7821
2024 10 15 Normalization 4563
2024 10 14 Computer Graphics 181
2024 10 14 Numerics 318
2024 10 14 Mamba 1727
2024 10 11 Storage 135
2024 10 11 Networking 135
2024 10 11 Universal embedding space for popular foundational models (or adapters) 532
2024 10 10 2024-10-10 - Pixtral 12B 62
2024 10 10 Tiny LLMs with rag in the middle 328
2024 10 10 Flow Matching - Rectified Flows 1588
2024 10 09 Tiny Foundational model by distilling from a lot of SOTA models 0
2024 10 09 Remove all the things 609
2024 10 09 Multi Modal Learning to Rank as a replacement for CLIP 209
2024 10 09 Latent Transformers with small vocabularies 406
2024 10 09 Recurrent Computation with Transformers by repeating layers 269
2024 10 09 Task Routing for Multimodal LLMs 72
2024 10 09 VLMs for better Vision Backbones 578
2024 10 09 Transformer Properties 225
2024 10 09 Model Routing 263
2024 10 09 xformers 174
2024 10 09 FairScale 0
2024 10 09 ML for Math 273
2024 10 08 A glossary of all the ways ML models fail to train 401
2024 10 04 2024-10-04 - Movie Gen A Cast of Media Foundation Models 278
2024 10 04 ML Conferences 537
2024 10 04 Embedding Models 731
2024 10 04 Code LLMs 1760
2024 10 03 torch compile 233
2024 10 03 LLM Training and Tuning 1146
2024 10 03 PrefixLM 0
2024 10 03 Alignment and Post Training 462
2024 10 03 Video Generation 2058
2024 10 03 Parameter Efficient Fine Tuning (PEFT) 208
2024 10 03 Computer Vision Backbones 465
2024 10 03 Deepspeed 0
2024 10 03 GPUs 2934
2024 10 03 CLIP 1261
2024 10 03 RL for LMs 287
2024 10 02 MLX 407
2024 09 27 Retrieval Augmented Generation (RAG) 5334
2024 09 26 Quantization 1612
2024 09 25 jax 204
2024 09 25 Decoder Transformer Inference (LLM Serving) 5343
2024 09 25 Long Context Transformers 2652
2024 09 24 Cloud GPUs 2622
2024 09 23 Softmax 308
2024 09 21 autograd 299
2024 09 19 Model Distillation and Transfer Learning 3006
2024 09 17 triton 904
2024 09 17 Vision Language Models 10139
2024 09 17 xlstm 95
2024 09 17 ocr 2143
2024 09 15 3D Computer Vision 1096
2024 09 10 Mistral7B 116
2024 09 09 Tabular Machine Learning 2733
2024 09 09 State Space Models (SSMs) 1119
2024 09 09 Semantic Search and Ranking 651
2024 09 04 Distributed Training 8177
2024 09 03 text2sql 241
2024 08 28 Approximate Nearest Neighbor Search (ANN) 3085
2024 08 15 Instance Retrieval and Instance Recognition 2491
2024 08 04 Server Inference 1858
2024 04 21 Mixture of Experts 5081
2023 12 17 2023-12-17 - Stable and low-precision training for large-scale vision-language models 1790
2023 12 16 2023 NeurIPS 13272
2023 12 09 2023-12-09 - SILC Improving Vision Language Pretraining with Self-Distillation 656
2023 12 09 2023-12-09 - Text as Image Learning Transferable Adapter for Multi-Label Classification 249
2023 12 09 2023-12-04 - Rejuvenating image-GPT as Strong Visual Representation Learners 901
2023 12 09 2023-12-05 - Mamba Linear-Time Sequence Modeling with Selective State Spaces 328
2023 12 09 2023-04-14 - Combined Scaling for Zero-shot Transfer Learning 775
2023 12 09 Multi Label Classification 1319
2023 12 09 2023-12-04 - MobileCLIP - Fast Image-Text Models through Multi-Modal Reinforced Training 1422
2023 12 09 Feature Stores 155
2023 12 09 Deep Learning Tricks of the Trade 906
2023 12 09 Visual Search 1610
2023 12 09 video 805
2023 12 09 Contrastive Learning 690
2023 12 09 Imitation Learning 20
2023 12 09 Retrieval Augmented Models 1485
2023 12 09 Segmentation 707
2023 12 09 Semi Supervised Learning 208
2023 12 09 Synthetic Data 472
2023 12 09 maes 515
2023 12 09 resources 609
2023 12 09 Label Noise 951
2023 12 09 ML Infrastructure 199
2023 12 09 Multi Task Learning 44
2023 12 09 Multimodal Learning 30
2023 12 09 NeRF - Neural Radiance Fields 450
2023 12 09 paper-params 282
2023 12 09 medical 2893
2023 12 09 Active Learning 274
2023 12 09 Image Recognition 569
2023 12 09 ML Scaling 961
2023 12 09 Machine Learning Tricks and Best Practices 179
2023 12 09 Natural Language Processing 847
2023 12 09 Object Detection 1770
2023 12 09 Text Embeddings 260
2023 12 09 benchmarks 423
2023 12 09 Data Formats for ML 627
2023 12 09 Extreme Classification 150
2023 12 09 CNNs 771
2023 12 09 Diffusion Models 2521
2023 12 09 Evaluation Metrics 580
2023 12 09 Few Shot Learning 249
2023 12 09 Human Pose Estimation and Human Modeling 726
2023 12 09 Learning to Rank 348
2023 12 09 Long Tail Classification and Class Imbalance 705
2023 12 09 Mobile Inference 2621
2023 12 09 Reinforcement Learning (RL) 1313
2023 12 09 Recommendation Systems (RecSys) 4573
2023 12 09 Speech - Speech Recognition and TTS 5115
2023 12 09 Transformer Alternatives (mostly SSMs) 3934
2023 12 09 Transformers 12887
2023 12 09 Vision Transformers 2578
2023 12 09 compilers 1180
2023 12 09 compression 1082
2023 12 09 graphs 914
2023 12 09 fine-tuning 1260