Generate a dataset using an existing VLM to train a model that selects which regions to add to the input for further processing
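
One way to read this idea, as a rough sketch only: propose candidate regions for each image, ask the existing VLM whether each region is useful for the question, and store the answers as labels for training the region selector. Everything named here is an assumption made for illustration and is not from the note: the fixed grid of candidate boxes, the `vlm_score` callable (a stand-in for whatever VLM call is actually used), the 0.5 threshold, and the `RegionExample` record.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class RegionExample:
    """One training row for the (hypothetical) region-selection model."""
    image_id: str
    question: str
    box: Box
    label: int  # 1 if the VLM judged the region useful, else 0


def grid_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Split an image of the given size into rows x cols candidate boxes."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((
                c * width // cols,         # left
                r * height // rows,        # top
                (c + 1) * width // cols,   # right
                (r + 1) * height // rows,  # bottom
            ))
    return boxes


def build_region_dataset(
    samples: Iterable[Tuple[str, Tuple[int, int], str]],
    vlm_score: Callable[[str, str, Box], float],
    rows: int = 2,
    cols: int = 2,
    threshold: float = 0.5,  # assumed cut-off, not from the note
) -> List[RegionExample]:
    """Label every candidate region of every sample with the VLM's judgement.

    `samples` yields (image_id, (width, height), question) tuples.
    `vlm_score` is a placeholder for a real VLM call and should return a
    usefulness score in [0, 1] for the given region.
    """
    dataset = []
    for image_id, (width, height), question in samples:
        for box in grid_boxes(width, height, rows, cols):
            score = vlm_score(image_id, question, box)
            dataset.append(
                RegionExample(image_id, question, box, int(score >= threshold))
            )
    return dataset


def toy_vlm_score(image_id: str, question: str, box: Box) -> float:
    """Stand-in scorer: pretends only the top-left region is useful."""
    return 1.0 if box[0] == 0 and box[1] == 0 else 0.0
```

To try it out, pass a list such as `[("img0", (100, 80), "What is written on the sign?")]` together with `toy_vlm_score`. Swapping in a real VLM only requires replacing the scorer, and the resulting rows can then be used as supervised targets for the selector.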