
  • train adapters that align the embedding spaces of several popular foundation models
  • potentially define one universal embedding space, with encoders and decoders that project into and out of it
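
A minimal sketch of the adapter idea, assuming the simplest possible case: a purely linear adapter fit by gradient descent on paired embeddings of the same items. The data, dimensions, and names (`train_linear_adapter`, `TRUE_MAP`) are illustrative assumptions; real adapters would likely be nonlinear and trained on real paired encoder outputs.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def train_linear_adapter(src, dst, lr=0.3, steps=500):
    """Fit W minimizing the mean of ||W s - d||^2 with full-batch gradient descent."""
    d_in, d_out, n = len(src[0]), len(dst[0]), len(src)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(steps):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for s, d in zip(src, dst):
            pred = matvec(W, s)
            for i in range(d_out):
                err = pred[i] - d[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * s[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W

# Toy paired data (assumption): 50 items embedded by a 3-d "model A",
# and the same items expressed in a 2-d shared space.
rng = random.Random(0)
TRUE_MAP = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]  # unknown in a real setting
model_a_vecs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
universal_vecs = [matvec(TRUE_MAP, v) for v in model_a_vecs]

# The learned adapter projects model A embeddings into the shared space.
adapter = train_linear_adapter(model_a_vecs, universal_vecs)
```

Because the toy targets are an exact linear function of the inputs, the fit recovers the map; with real models the residual error of such an adapter is the thing to measure.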


  • index once in the universal embedding space, then swap models without reindexing
  • swap models per task (model routing, speculative decoding)
  • task-specific routing across encoders for different modalities (e.g. different input sizes and models for OCR, document, image, and segmentation inputs)
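
A rough sketch of how indexing and routing could fit together, assuming each encoder ships with an adapter into the shared space. Everything here is hypothetical: the encoder functions are stand-ins that distort a 2-d feature vector, and the `ROUTES` table and names are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class UniversalIndex:
    """Stores vectors that already live in the shared embedding space."""
    def __init__(self):
        self.items = []

    def add(self, doc_id, vec):
        self.items.append((doc_id, vec))

    def search(self, query_vec, k=1):
        ranked = sorted(self.items, key=lambda it: cosine(query_vec, it[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

# Stand-in encoders (assumption): each has its own native space plus an adapter
# that maps back into the shared space.
def encode_small(features):   # native space: axes swapped
    return [features[1], features[0]]

def adapt_small(vec):
    return [vec[1], vec[0]]

def encode_large(features):   # native space: scaled by 2
    return [2.0 * f for f in features]

def adapt_large(vec):
    return [0.5 * v for v in vec]

ENCODERS = {"small": (encode_small, adapt_small), "large": (encode_large, adapt_large)}
ROUTES = {"ocr": "small", "segmentation": "large"}  # hypothetical task-to-encoder table

def route(task, default="large"):
    """Pick an encoder name for a task, falling back to a default."""
    return ROUTES.get(task, default)

def embed(features, task):
    """Encode with the routed model, then project into the shared space."""
    encode, adapt = ENCODERS[route(task)]
    return adapt(encode(features))

# Index once, using whichever encoder the task routes to.
index = UniversalIndex()
index.add("invoice", embed([1.0, 0.1], "ocr"))
index.add("street-photo", embed([0.1, 1.0], "ocr"))

# Query through either encoder; the index itself is never rebuilt.
query = [0.9, 0.2]
hit_small = index.search(embed(query, "ocr"))
hit_large = index.search(embed(query, "segmentation"))
```

In this toy the adapters invert the encoders exactly, so both queries return the same hit; in practice the adapters are approximate and retrieval quality after a swap would need to be checked.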

Matryoshka-based embeddings
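
A small sketch of what Matryoshka-style embeddings allow, assuming the model was trained so that a prefix of the vector is itself a usable embedding (truncating an ordinary embedding this way generally does not work). The vectors and the helper names `truncate_embedding` and `coarse_to_fine` are illustrative.

```python
import math

def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def truncate_embedding(vec, k):
    """Keep the first k dimensions and re-normalize."""
    return normalize(vec[:k])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coarse_to_fine(query, docs, k_coarse, shortlist):
    """Shortlist with a cheap k_coarse-dim prefix, then rerank with full vectors."""
    q_small = truncate_embedding(query, k_coarse)
    coarse = sorted(
        docs,
        key=lambda d: dot(q_small, truncate_embedding(docs[d], k_coarse)),
        reverse=True,
    )[:shortlist]
    q_full = normalize(query)
    return sorted(coarse, key=lambda d: dot(q_full, normalize(docs[d])), reverse=True)

# Toy 4-d embeddings (made up for illustration).
docs = {
    "a": [0.9, 0.1, 0.3, 0.0],
    "b": [0.8, 0.2, -0.4, 0.1],
    "c": [-0.7, 0.6, 0.2, 0.1],
}
query = [1.0, 0.1, 0.3, 0.0]
ranked = coarse_to_fine(query, docs, k_coarse=2, shortlist=2)
```

The short prefix keeps the first pass cheap in memory and compute, while the full vector is only touched for the shortlist.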