KV Cache Compression
- [2412.00965v1] Token Cropr: Faster ViTs for Quite a Few Tasks

Vision Language Models
- [2412.01818] [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster
- [2411.03312v1] Inference Optimal VLMs Need Only One Visual Token but Larger Models