EGGROLL
- No backprop
- Non-differentiable architectures
- No need for sync (RNG state)
- Can train on inference-optimized quantized versions of the model
- Use on prod inference workloads
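
A minimal sketch of how the first three points can fit together, assuming a plain Gaussian evolution-strategies update. This is a generic illustration, not EGGROLL's actual algorithm, and every name in it (`perturbation`, `fitness`, `evaluate`, `es_step`, `train`) is hypothetical. The objective is piecewise constant, so backprop would see a zero gradient almost everywhere. Each perturbation is rebuilt from an integer seed, so replicas only need to exchange seeds and scalar fitness values rather than parameters or gradients.

```python
import random


def perturbation(seed, dim):
    """Rebuild the Gaussian noise vector for a seed (assumed scheme: one seed per member)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def fitness(params):
    """Hypothetical non-differentiable objective: rounding makes it piecewise constant."""
    target = [1, -2, 3]
    return -sum(abs(round(p) - t) for p, t in zip(params, target))


def evaluate(params, seed, sigma):
    """Forward pass only: score the parameters perturbed by the noise for this seed."""
    eps = perturbation(seed, len(params))
    return fitness([p + sigma * e for p, e in zip(params, eps)])


def es_step(params, seeds, fitnesses, sigma, lr):
    """One ES update computed from seeds and scalar fitnesses alone."""
    n = len(seeds)
    mean_f = sum(fitnesses) / n  # baseline to reduce variance
    grad = [0.0] * len(params)
    for seed, f in zip(seeds, fitnesses):
        eps = perturbation(seed, len(params))  # regenerated, never transmitted
        for i, e in enumerate(eps):
            grad[i] += (f - mean_f) * e
    return [p + lr * g / (n * sigma) for p, g in zip(params, grad)]


def train(steps=50, pop=32, sigma=0.5, lr=0.1):
    """Run a few ES steps from a zero initialisation (all hyperparameters are illustrative)."""
    params = [0.0, 0.0, 0.0]
    for step in range(steps):
        seeds = [step * pop + k for k in range(pop)]
        fitnesses = [evaluate(params, s, sigma) for s in seeds]
        params = es_step(params, seeds, fitnesses, sigma, lr)
    return params


if __name__ == "__main__":
    print(train())
```

Because `es_step` depends only on the `(seed, fitness)` pairs, any replica holding the same starting parameters arrives at the same updated parameters without a parameter or gradient exchange. `evaluate` is a forward pass only, so in principle it could run on whatever model variant serves inference, including a quantized one.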