Nov 20, 2024 · 1 min read
Use a small vocab (or even character level, e.g. 1024 entries) with a 1D causal conv VAE to reduce both the embedding table size and the sequence length; a sketch follows below.
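
A minimal PyTorch sketch of what that could look like, assuming a 1024-entry embedding table, two stride-2 causal conv layers for 4x sequence compression, and a non-causal transposed-conv decoder. Every name and hyperparameter here (`SeqVAE`, `CausalConv1d`, `dim=256`, `z_dim=64`, the 0.1 KL weight) is an illustrative choice, not a prescribed design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Conv1d):
    """Conv1d that left-pads so the output at position t sees only inputs <= t."""
    def forward(self, x):
        pad = (self.kernel_size[0] - 1) * self.dilation[0]
        return super().forward(F.pad(x, (pad, 0)))

class SeqVAE(nn.Module):
    def __init__(self, vocab=1024, dim=256, z_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)  # small table: 1024 x 256
        # Encoder: two stride-2 causal convs -> latent sequence 4x shorter.
        self.enc = nn.Sequential(
            CausalConv1d(dim, dim, kernel_size=4, stride=2), nn.GELU(),
            CausalConv1d(dim, dim, kernel_size=4, stride=2), nn.GELU(),
        )
        self.to_mu = nn.Conv1d(dim, z_dim, 1)
        self.to_logvar = nn.Conv1d(dim, z_dim, 1)
        # Decoder: transposed convs back up to per-token logits
        # (non-causal here, purely for simplicity of the sketch).
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(z_dim, dim, kernel_size=4, stride=2, padding=1), nn.GELU(),
            nn.ConvTranspose1d(dim, dim, kernel_size=4, stride=2, padding=1), nn.GELU(),
            nn.Conv1d(dim, vocab, 1),
        )

    def forward(self, tokens):  # tokens: (B, L) ints < vocab, L divisible by 4
        h = self.enc(self.emb(tokens).transpose(1, 2))        # (B, dim, L/4)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        logits = self.dec(z)                                  # (B, vocab, L)
        recon = F.cross_entropy(logits, tokens)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon + 0.1 * kl  # 0.1 is an arbitrary beta, not a recommendation

model = SeqVAE()
loss = model(torch.randint(0, 1024, (2, 128)))  # batch of 2, length-128 sequences
```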
Then align from the small token set to existing large tokenizers with CTC? (See Sequence Modeling with CTC.) A sketch of that is below as well.
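
A hedged sketch of the CTC side, following the standard recipe described in Sequence Modeling with CTC rather than any specific implementation: a linear head over the long small-vocab sequence predicts ids from an existing large tokenizer (plus one extra blank class), and the CTC loss marginalizes over monotonic alignments so no position-level labels are needed. `big_vocab`, the head, and all shapes below are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

big_vocab = 50_000                    # e.g. a GPT-2-sized BPE vocabulary
blank = big_vocab                     # CTC blank gets its own extra class
head = nn.Linear(256, big_vocab + 1)  # per-position features -> big-vocab logits

B, L_in, L_tgt = 2, 128, 40           # char-level length vs. BPE target length
feats = torch.randn(B, L_in, 256)     # per-position features, e.g. from the VAE above
targets = torch.randint(0, big_vocab, (B, L_tgt))  # ids from the large tokenizer

log_probs = F.log_softmax(head(feats), dim=-1).transpose(0, 1)  # (T, B, C) for CTC
loss = F.ctc_loss(
    log_probs,
    targets,
    input_lengths=torch.full((B,), L_in, dtype=torch.long),
    target_lengths=torch.full((B,), L_tgt, dtype=torch.long),
    blank=blank,
)
```

One constraint to keep in mind: CTC requires the input sequence to be at least as long as the target, which the character-to-BPE direction satisfies, since a BPE encoding of a string is shorter than its character sequence.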