- synthetic data
- tool definitions ⇒ LLM ⇒ generated text queries / prompts
- distill into embedding / encoder / cross-encoder models
- SetFit
- LoRA
- transformer
- or tiny decoder LLM
- in-context examples with RAG
- in-context history of previous actions
- fast serving
- trie prefix cache
- structured generation
- small custom tokenizer
- [ ]
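
A minimal sketch of how the trie and structured-generation items could fit together: a trie over tool names restricts each decoding step to tokens that continue a valid name. Everything here is an assumption for illustration: the tool names, the `END` marker, the character-level "tokens", and the `score` callback standing in for model logits.

```python
from typing import Callable, Dict, List

END = "<end>"  # hypothetical end-of-name marker, not a real tokenizer token


class Trie:
    """Prefix tree over token sequences (characters here, for simplicity)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Trie"] = {}

    def insert(self, tokens: List[str]) -> None:
        node = self
        for tok in tokens + [END]:
            node = node.children.setdefault(tok, Trie())

    def allowed_next(self, prefix: List[str]) -> List[str]:
        """Return the tokens that keep `prefix` on a path to a valid name."""
        node = self
        for tok in prefix:
            if tok not in node.children:
                return []  # prefix is not part of any stored name
            node = node.children[tok]
        return sorted(node.children)


def constrained_decode(trie: Trie, score: Callable[[List[str], str], float]) -> str:
    """Greedy decode: at each step pick the best-scoring *allowed* token."""
    out: List[str] = []
    while True:
        options = trie.allowed_next(out)
        if not options:
            raise ValueError("trie is empty or prefix left the trie")
        best = max(options, key=lambda tok: score(out, tok))
        if best == END:
            return "".join(out)
        out.append(best)


# Hypothetical tool names, tokenized per character.
tool_trie = Trie()
for name in ["get_weather", "get_time", "send_email"]:
    tool_trie.insert(list(name))

print(tool_trie.allowed_next(list("get_")))  # ['t', 'w']
```

With a real model, `score` would read the next-token logits, and the trie would be built over tokenizer IDs rather than characters. The output can then never be an invalid tool name, whatever the model prefers.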