Mojo module
conv_smem
Shared memory layout for SM100 Conv2D kernel.
This module provides the Conv2dSmem struct, which defines the shared memory organization for the Conv2D fprop (forward propagation) kernel. The layout is similar to B200MatmulSmem but uses conv-specific naming.
SMEM Organization:
- Activation tiles (from im2col): Multi-stage pipelined
- Filter tiles: Multi-stage pipelined
- Output tiles: Double-buffered for epilogue
- Pipeline barriers: For producer-consumer synchronization
- CLC barriers: For work scheduling
- TMEM storage: For accumulator address sharing
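To make the organization above concrete, the following is a minimal back-of-the-envelope sketch in Python, not the Mojo API. It lays the six regions out back to back and sums their sizes. Every number and name here is an assumption chosen for illustration: the tile shape, the stage count, the 8-byte barrier size, the alignments, and the 227 KB per-block limit are not taken from Conv2dSmem, and the real struct may order or size its fields differently.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def plan_smem(regions):
    """Place regions back to back, honoring each region's alignment.

    regions: list of (name, size_bytes, alignment_bytes).
    Returns (offsets, total_bytes), where offsets maps name -> byte offset.
    """
    offsets = {}
    cursor = 0
    for name, size, alignment in regions:
        cursor = align_up(cursor, alignment)
        offsets[name] = cursor
        cursor += size
    return offsets, cursor


# Hypothetical configuration (not read from the real kernel).
BM, BN, BK = 128, 128, 64      # assumed tile shape
ELEM_BYTES = 2                 # assumed 16-bit element type
NUM_STAGES = 4                 # assumed pipeline depth for activation/filter tiles
NUM_OUTPUT_BUFFERS = 2         # "double-buffered" output tiles
OUT_TILE_N = 32                # assumed epilogue tile width
BARRIER_BYTES = 8              # assumed size of one barrier object
NUM_CLC_STAGES = 1             # assumed work-scheduling pipeline depth
SMEM_LIMIT = 227 * 1024        # assumed per-block shared memory limit

act_tile = BM * BK * ELEM_BYTES
filt_tile = BN * BK * ELEM_BYTES
out_tile = BM * OUT_TILE_N * ELEM_BYTES

regions = [
    ("activation_tiles", NUM_STAGES * act_tile, 128),
    ("filter_tiles", NUM_STAGES * filt_tile, 128),
    ("output_tiles", NUM_OUTPUT_BUFFERS * out_tile, 128),
    # One "full" and one "empty" barrier per stage (assumed scheme).
    ("pipeline_barriers", 2 * NUM_STAGES * BARRIER_BYTES, 8),
    ("clc_barriers", 2 * NUM_CLC_STAGES * BARRIER_BYTES, 8),
    # A single 32-bit slot for sharing the accumulator address.
    ("tmem_addr", 4, 4),
]

offsets, total = plan_smem(regions)
for name, _, _ in regions:
    print(f"{name:18s} offset={offsets[name]}")
print(f"total={total} bytes, fits={total <= SMEM_LIMIT}")
```

Under these assumptions the tile buffers dominate the budget and the barriers and TMEM slot are negligible, so the stage count is the main knob for trading pipeline depth against shared memory capacity.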
Structs
- Conv2dSmem: Shared memory layout for SM100 Conv2D fprop kernel.