Teach VLM to Zoom and Pan
Generate a dataset with an existing VLM, then use it to train a model that selects which image regions to add to the input for further processing.
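
One way the dataset generation could look, as a minimal sketch rather than a definitive pipeline: split each image into candidate crops, ask an existing VLM how useful each crop is for the question, and store the best-scoring crop as the training target for the region selector. Everything named here is an assumption for illustration: the grid-based `candidate_regions`, the `label_example` helper, the record layout, and especially `score_region`, a hypothetical callable that would wrap the existing VLM. A stub scorer stands in for it so the sketch runs on its own.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def candidate_regions(width: int, height: int, grid: int = 2) -> List[Box]:
    """Split the image into grid x grid non-overlapping crops (an assumed scheme)."""
    boxes = []
    for row in range(grid):
        for col in range(grid):
            left = col * width // grid
            top = row * height // grid
            right = (col + 1) * width // grid
            bottom = (row + 1) * height // grid
            boxes.append((left, top, right, bottom))
    return boxes


def label_example(
    width: int,
    height: int,
    question: str,
    score_region: Callable[[Box, str], float],
    grid: int = 2,
) -> Dict:
    """Build one training record: candidate crops plus the index of the best one.

    score_region is hypothetical: in practice it would query the existing VLM
    and return how helpful the crop is for answering the question.
    """
    boxes = candidate_regions(width, height, grid)
    scores = [score_region(box, question) for box in boxes]
    best = max(range(len(boxes)), key=scores.__getitem__)
    return {
        "question": question,
        "candidates": boxes,
        "scores": scores,
        "target_index": best,
        "target_box": boxes[best],
    }


def stub_score(box: Box, question: str) -> float:
    """Placeholder for the VLM: favors the crop containing pixel (300, 100)."""
    left, top, right, bottom = box
    return 1.0 if left <= 300 < right and top <= 100 < bottom else 0.0


example = label_example(400, 200, "What does the sign say?", stub_score)
```

A selector trained on records like `example` would learn to predict `target_index` (or regress `target_box`) from the image and question; the chosen crop can then be appended to the VLM input as the "zoom" step.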