- transformers: quadratic in input length (self-attention scores every pair of tokens; see the attention sketch after this list)
- model interactions within a fixed context window well, but can't model anything beyond the supported window
- mamba
  - linear in sequence length (see the recurrence sketch after this list)
  - 5x higher inference throughput than transformers
  - mamba-3B outperforms transformers of the same size and matches transformers twice its size
- Structured State Space Sequence Models (SSMs) (S4)
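A minimal sketch of why attention cost is quadratic (illustrative NumPy only, not any specific library's implementation): single-head attention materializes an L x L score matrix, so both compute and memory grow with the square of the sequence length.

```python
import numpy as np

def attention(q, k, v):
    """Single-head scaled dot-product attention.

    q, k, v: (L, d) arrays. The score matrix is (L, L), so compute and
    memory grow quadratically with the sequence length L.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (L, L) -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ v                                 # (L, d)

L, d = 1024, 64
x = np.random.randn(L, d)
out = attention(x, x, x)   # materializes a 1024 x 1024 score matrix
```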
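And a minimal sketch of the linear-time alternative: a toy discretized state space recurrence in the spirit of S4/Mamba (Mamba's actual selective, input-dependent, hardware-aware scan is more involved). Each token costs one fixed-size state update, so processing the sequence is O(L), and generation only needs the small state h rather than a growing attention cache.

```python
import numpy as np

def ssm_scan(x, A_bar, B_bar, C):
    """Toy discretized SSM recurrence over a scalar input sequence x:
        h_t = A_bar @ h_{t-1} + B_bar * x_t
        y_t = C @ h_t
    One constant-time state update per token => linear in sequence length.
    """
    N = A_bar.shape[0]
    h = np.zeros(N)            # fixed-size hidden state
    ys = []
    for x_t in x:              # single pass over the sequence
        h = A_bar @ h + B_bar * x_t   # constant-cost state update
        ys.append(C @ h)              # readout
    return np.array(ys)

N, L = 16, 1024
A_bar = np.diag(np.exp(-np.linspace(0.1, 1.0, N)))  # stable toy dynamics
B_bar = np.ones(N)
C = np.random.randn(N)
y = ssm_scan(np.random.randn(L), A_bar, B_bar, C)    # shape (L,)
```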
TODO
- continue