Review of Architectures

  • Network in Network
    • 1x1 convs (sketch below)
  • ResNet
    • Residual Block (sketch below)
    • Bottleneck Block (1x1 reduce → 3x3 → 1x1 expand)
  • SENet
    • channel-wise attention (global pooling over the whole feature map; sketch below)
  • MobileNetV2
    • inverted residual (expand → depthwise → project; sketch below)
    • depthwise convolutions
  • 3x3 convs
    • Winograd convolution (fewer multiplies for small filters; transform below)
  • EfficientNet
    • poor compute efficiency on GPUs
  • reducing activation size reduces latency (memory movement and large intermediate tensors dominate)
  • wider models do more compute per activation (FLOPs scale with width², activation size with width)
  • depth-first execution of kernels to avoid large intermediate activations (sketched after the Contributions note below)
    • can’t be used for global ops like SE attention
  • ConvNeXt
  • ResNeXt
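A minimal sketch of the NiN / 1x1-conv idea (PyTorch is my choice for illustration; channel counts are made up): a 1x1 conv is a per-pixel linear layer that mixes channels without touching spatial structure.

```python
import torch
import torch.nn as nn

# A 1x1 conv mixes channels at each spatial position independently --
# equivalent to applying the same linear layer at every pixel.
mix = nn.Conv2d(in_channels=64, out_channels=32, kernel_size=1)

x = torch.randn(1, 64, 56, 56)  # (N, C, H, W)
y = mix(x)                      # -> (1, 32, 56, 56); spatial dims unchanged
```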
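Sketches of the two ResNet blocks, assuming stride 1 and matching channels so the shortcut is the identity (the 4x reduction follows the paper's convention, the rest is illustrative). The 1x1 convs in the bottleneck are the NiN idea above, used to do the 3x3 work at reduced width.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic block: two 3x3 convs plus an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
    def forward(self, x):
        return torch.relu(x + self.body(x))

class BottleneckBlock(nn.Module):
    """Bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand, shortcut around all three."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = channels // reduction
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
    def forward(self, x):
        return torch.relu(x + self.body(x))
```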
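The SENet bullet, sketched (reduction=16 is the paper's default; shapes are assumptions). Note the squeeze is a global average pool, which is what blocks depth-first/tiled execution: no output tile can finish until the whole feature map has been seen.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global channel-wise attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
    def forward(self, x):
        n, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: global average pool -> (N, C)
        w = self.fc(s).view(n, c, 1, 1)  # excite: per-channel gates in (0, 1)
        return x * w                     # rescale each channel
```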
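MobileNetV2's inverted residual, i.e. the MBConv referenced under Contributions (expansion 6 is the paper's default; stride 1 and equal in/out channels assumed so the shortcut applies). The expanded depthwise activation is 6x wider than the block's input, which is exactly the large intermediate the depth-first kernels avoid writing out.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MBConv: 1x1 expand -> 3x3 depthwise -> 1x1 project, linear bottleneck."""
    def __init__(self, channels, expansion=6):
        super().__init__()
        mid = channels * expansion
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),       # expand (wide)
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1,              # depthwise:
                      groups=mid, bias=False),             # one filter per channel
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),       # project (no activation)
            nn.BatchNorm2d(channels),
        )
    def forward(self, x):
        return x + self.body(x)  # shortcut connects the narrow ends
```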
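The Winograd bullet, in the 1D case F(2,3): two outputs of a 3-tap filter g applied to a 4-element input tile d cost 4 multiplies instead of the naive 6. Matrices follow the standard Lavin & Gray formulation; nesting two of these gives the 2D 3x3 case, 16 multiplies per 2x2 output tile instead of 36.

```latex
Y = A^{\top}\!\left[(Gg) \odot (B^{\top} d)\right],
\quad
B^{\top} = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \end{bmatrix},
\quad
G = \begin{bmatrix} 1 & 0 & 0 \\ \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 & \tfrac12 \\ 0 & 0 & 1 \end{bmatrix},
\quad
A^{\top} = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & -1 \end{bmatrix}
```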

Contributions

  • fused GPU kernels for MBConv and FusedMBConv blocks, exploiting temporal locality and reducing workspace size to avoid spilling to DRAM

Instead, use a fused kernel that computes in a depth-first fashion, getting rid of the large intermediates.
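A toy PyTorch sketch of the idea (not the paper's kernels, which keep the tile in on-chip memory): two stacked 3x3 convs computed one output tile at a time, so the intermediate only ever exists at tile size plus halo. For simplicity the halo comes from zero-padding the input by the combined receptive field; tile size and shapes are made up.

```python
import torch
import torch.nn.functional as F

def depth_first_two_convs(x, w1, w2, tile=16):
    """Two stacked 3x3 convs, computed tile by tile (depth-first).

    The intermediate activation only ever exists at (tile+2)^2 size
    instead of being materialized at the full H x W.
    """
    n, _, h, w = x.shape
    c_out = w2.shape[0]
    xp = F.pad(x, (2, 2, 2, 2))  # halo = combined receptive field of 2 convs
    y = torch.empty(n, c_out, h, w, dtype=x.dtype)
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            th, tw = min(tile, h - i), min(tile, w - j)
            patch = xp[:, :, i : i + th + 4, j : j + tw + 4]  # tile + halo
            mid = F.conv2d(patch, w1)        # small (th+2, tw+2) intermediate
            y[:, :, i : i + th, j : j + tw] = F.conv2d(mid, w2)
    return y

# Matches the breadth-first computation over the same padded input:
x = torch.randn(1, 8, 32, 32)
w1 = torch.randn(16, 8, 3, 3)
w2 = torch.randn(8, 16, 3, 3)
ref = F.conv2d(F.conv2d(F.pad(x, (2, 2, 2, 2)), w1), w2)
assert torch.allclose(depth_first_two_convs(x, w1, w2), ref, atol=1e-4, rtol=1e-4)
```

This only works because both ops are spatially local; a global op like SE attention in the middle would force the full intermediate to exist, per the limitation noted above.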

ConvFirst Block