Train a small model that can query external sources

Sequence layout: input query_token query_embeddings query_results_token result_embeddings …
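
A minimal sketch of how one retrieval round of that layout might be assembled. The marker strings, the `Embedding` wrapper, and `build_sequence` are hypothetical names introduced for illustration only; the notes do not say how embeddings are mixed with tokens, so this assumes continuous vectors sit directly in the sequence next to ordinary tokens.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union

# Hypothetical marker names; the notes only call them query_token / query_results_token.
QUERY_TOKEN = "<query>"
QUERY_RESULTS_TOKEN = "<query_results>"


@dataclass
class Embedding:
    """A continuous vector placed in the sequence instead of a vocabulary token."""
    values: List[float]


Slot = Union[str, Embedding]


def build_sequence(
    input_tokens: Sequence[str],
    query_embeddings: Sequence[Sequence[float]],
    result_embeddings: Sequence[Sequence[float]],
) -> List[Slot]:
    """Lay out: input, query marker, query vectors, results marker, result vectors."""
    seq: List[Slot] = list(input_tokens)
    seq.append(QUERY_TOKEN)
    seq.extend(Embedding(list(q)) for q in query_embeddings)
    seq.append(QUERY_RESULTS_TOKEN)
    seq.extend(Embedding(list(r)) for r in result_embeddings)
    return seq


# Usage: three input tokens, one query vector, two retrieved vectors.
seq = build_sequence(
    ["what", "is", "x"],
    [[0.1, 0.2]],
    [[1.0, 0.0], [0.0, 1.0]],
)
```

Whatever follows the result embeddings in the original layout is left open here, as it is in the notes.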

Bootstrap from an existing LLM and an on-device nearest-neighbor index (embedding tables only)
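
A rough sketch of a nearest-neighbor index that stores only an embedding table. It assumes exact brute-force cosine similarity, which is a stand-in chosen for clarity; a real on-device index would more likely use an approximate method. `EmbeddingIndex` and `search` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned as zeros."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0] * len(v)
    return [x / norm for x in v]


class EmbeddingIndex:
    """Brute-force cosine index. Holds embedding rows only, no source text."""

    def __init__(self, table: Sequence[Sequence[float]]) -> None:
        self._rows = [_normalize(row) for row in table]

    def search(self, query: Sequence[float], k: int = 1) -> List[Tuple[int, float]]:
        """Return up to k (row_id, cosine_similarity) pairs, best match first."""
        q = _normalize(query)
        scored = [
            (i, sum(a * b for a, b in zip(row, q)))
            for i, row in enumerate(self._rows)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Usage: the rows returned here would supply the result embeddings for the model.
index = EmbeddingIndex([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
hits = index.search([0.9, 0.1], k=2)
```

Because the index returns row ids and similarities rather than text, the model consumes the stored vectors directly, which matches the "embedding tables only" constraint.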

Curriculum learning: page in different partitions of the data and distill from the large LLM
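
One possible shape for that loop, as a sketch only. It assumes soft-label distillation with a temperature-scaled KL divergence between teacher and student outputs, and it assumes each partition is produced by a loader callable so that only one partition is in memory at a time. The notes specify neither choice. All function names are hypothetical, and the gradient update is omitted: the loop only measures the loss that a real training step would minimize.

```python
import math
from typing import Callable, Iterable, Iterator, List, Sequence

Logits = Sequence[float]


def log_softmax(logits: Logits, temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def distillation_loss(teacher: Logits, student: Logits, temperature: float = 2.0) -> float:
    """KL(teacher || student) over temperature-softened distributions."""
    log_p = log_softmax(teacher, temperature)
    log_q = log_softmax(student, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def page_in(loaders: Iterable[Callable[[], list]]) -> Iterator[list]:
    """Yield partitions one at a time, in curriculum order."""
    for load in loaders:
        yield load()


def mean_loss_per_partition(
    loaders: Iterable[Callable[[], list]],
    teacher: Callable[[object], Logits],
    student: Callable[[object], Logits],
    temperature: float = 2.0,
) -> List[float]:
    """Average distillation loss on each paged-in partition (no weight update)."""
    losses = []
    for examples in page_in(loaders):
        if not examples:
            continue  # skip empty partitions
        total = sum(
            distillation_loss(teacher(x), student(x), temperature) for x in examples
        )
        losses.append(total / len(examples))
    return losses
```

In a real setup, `teacher` would wrap the large LLM and `student` the small model, with an optimizer step taken on each partition before the next one is paged in.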