Goals

  • train adapters that align several popular foundation models
  • potentially define one universal embedding space, with encoders and decoders that project into and out of it
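
As a rough illustration of the adapter idea, the sketch below fits a linear map from one model's embedding space into a shared space using paired examples and plain gradient descent. Everything here is an assumption made for illustration: the notes do not say the adapters are linear, the names train_linear_adapter and project are hypothetical, and the toy data is synthetic.

```python
import random


def project(W, x):
    """Apply the linear adapter W (d_out x d_in) to one embedding x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def train_linear_adapter(src, dst, lr=0.1, epochs=500):
    """Fit W by gradient descent on the mean squared error between W @ x and y.

    src: embeddings from one model (list of d_in-dim vectors)
    dst: the matching embeddings in the shared space (list of d_out-dim vectors)
    """
    d_in, d_out = len(src[0]), len(dst[0])
    n = len(src)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(src, dst):
            pred = project(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W


# Synthetic paired data (assumption): a 3-dim "model" space and a 2-dim
# "shared" space related by a known linear map, so the fit can be checked.
random.seed(0)
TRUE_MAP = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
src = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
dst = [project(TRUE_MAP, x) for x in src]

adapter = train_linear_adapter(src, dst)
```

In practice the paired embeddings would come from running the same inputs through two models, and the adapter could be nonlinear; the linear case is only the simplest instance to reason about.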

This would allow task-specific model routing and simplify multimodal model serving.
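
One way task-specific routing could work once everything lives in a shared space is to compare a task embedding against one reference vector per model and pick the closest. This is a minimal sketch under that assumption; the model names, the anchor vectors, and the route function are all hypothetical and not part of the plan above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def route(task_embedding, model_anchors):
    """Return the name of the model whose anchor is most similar to the task."""
    return max(model_anchors, key=lambda name: cosine(task_embedding, model_anchors[name]))


# Hypothetical anchors in a 3-dim shared space, one per served model.
MODEL_ANCHORS = {
    "code_model": [1.0, 0.0, 0.0],
    "vision_model": [0.0, 1.0, 0.0],
    "chat_model": [0.0, 0.0, 1.0],
}

chosen = route([0.9, 0.1, 0.2], MODEL_ANCHORS)
```

Here the task vector lies closest to the first anchor, so the router selects "code_model". A real router would likely need a fallback for tasks that are not close to any anchor.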