Oct 11, 2024 · 1 min read
Use a small vocabulary (or even character-level, e.g. 1024 tokens) with a 1D causal convolutional VAE to reduce both the embedding table size and the sequence length, as sketched below.
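
A minimal sketch of how this could look, assuming PyTorch. The vocabulary size of 1024 comes from the note, but the hidden width, latent size, 4x downsampling factor, and every class/parameter name are illustrative assumptions, not a definitive design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Conv1d):
    """1D conv that only sees past positions (left padding only)."""
    def __init__(self, in_ch, out_ch, kernel_size, stride=1):
        super().__init__(in_ch, out_ch, kernel_size, stride=stride)
        self.left_pad = kernel_size - 1

    def forward(self, x):
        return super().forward(F.pad(x, (self.left_pad, 0)))

class SmallVocabConvVAE(nn.Module):
    """Hypothetical VAE: tiny embedding table, 4x shorter latent sequence."""
    def __init__(self, vocab=1024, dim=256, latent=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)   # small table: 1024 x dim
        # Encoder: two strided causal convs halve the length twice (4x total).
        self.enc = nn.Sequential(
            CausalConv1d(dim, dim, 4, stride=2), nn.GELU(),
            CausalConv1d(dim, dim, 4, stride=2), nn.GELU(),
        )
        self.to_mu = nn.Conv1d(dim, latent, 1)
        self.to_logvar = nn.Conv1d(dim, latent, 1)
        # Decoder: transposed convs restore the original length
        # (decoder causality is glossed over in this sketch).
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(latent, dim, 4, stride=2, padding=1), nn.GELU(),
            nn.ConvTranspose1d(dim, dim, 4, stride=2, padding=1), nn.GELU(),
        )
        self.head = nn.Conv1d(dim, vocab, 1)    # per-position token logits

    def forward(self, tokens):                  # tokens: (B, T)
        x = self.embed(tokens).transpose(1, 2)  # (B, dim, T)
        h = self.enc(x)                         # (B, dim, T // 4)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        logits = self.head(self.dec(z))         # (B, vocab, T)
        return logits, mu, logvar

# Usage: reconstruction cross-entropy plus the standard KL term.
tokens = torch.randint(0, 1024, (2, 16))
logits, mu, logvar = SmallVocabConvVAE()(tokens)
recon = F.cross_entropy(logits, tokens)
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
loss = recon + kl
```
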
Could we then align from the small token set to existing large tokenizers with CTC? (see Sequence Modeling with CTC) A sketch follows.
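
A hedged sketch of the CTC half, again assuming PyTorch: treat the small-vocab sequence as the (longer) input frames and the large tokenizer's ids for the same text as the (shorter) target, so `torch.nn.CTCLoss` learns the alignment without explicit supervision. The `proj` aligner, the vocabulary sizes, and the tensor shapes here are all hypothetical stand-ins.

```python
import torch
import torch.nn as nn

LARGE_VOCAB = 50257          # e.g. a GPT-2-sized tokenizer (illustrative)
BLANK = LARGE_VOCAB          # CTC reserves one extra class id for "blank"

proj = nn.Sequential(        # hypothetical aligner: small ids -> large-vocab logits
    nn.Embedding(1024, 256),
    nn.Linear(256, LARGE_VOCAB + 1),
)
ctc = nn.CTCLoss(blank=BLANK)

small_ids = torch.randint(0, 1024, (1, 32))        # (B, T_small)
large_ids = torch.randint(0, LARGE_VOCAB, (1, 8))  # (B, T_large), T_large <= T_small

log_probs = proj(small_ids).log_softmax(-1)     # (B, T_small, V+1)
log_probs = log_probs.transpose(0, 1)           # CTCLoss expects (T, B, V+1)
loss = ctc(
    log_probs,
    large_ids,
    torch.tensor([32]),  # input lengths
    torch.tensor([8]),   # target lengths
)
loss.backward()
```
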