
Mojo function

make_swizzle

make_swizzle[num_rows: Int, row_size: Int, access_size: Int]() -> Swizzle

Creates a 2D swizzle to avoid bank conflicts when accessing access_size elements at a time within a num_rows x row_size tile in shared memory. num_rows should describe the minimum access pattern. For example, when storing a 16x8 MMA result to a 64x64 tile, the minimum access pattern is an 8x8 sub-matrix, so num_rows = 8 and row_size = 64. The layout is swizzled so that later loads of this data avoid bank conflicts. Those loads are most likely 16 bytes wide, i.e. access_size = 4 for fp32 and 8 for bf16.
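To make the effect concrete, here is a minimal Python sketch of an XOR swizzle for the fp32 example above (num_rows = 8, row_size = 64, access_size = 4). It is an illustration, not the library's implementation: the names make_swizzle_sketch and bank_groups are hypothetical, the way the bits, base, and shift values are derived from the three parameters is an assumption modeled on the common XOR-swizzle scheme, and the bank model (32 banks, 4 bytes each) is an assumption about typical NVIDIA shared memory.

```python
def log2_exact(n):
    """Return log2(n), requiring n to be a power of two."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def make_swizzle_sketch(num_rows, row_size, access_size):
    """Hypothetical stand-in for make_swizzle: returns an offset -> offset map.

    Assumed derivation (not confirmed by the source):
      bits  = log2(num_rows)            how many row bits take part in the XOR
      base  = log2(access_size)         low bits kept intact (one vector access)
      shift = log2(row_size) - base     distance from the row bits to the target bits
    """
    bits = log2_exact(num_rows)
    base = log2_exact(access_size)
    shift = log2_exact(row_size) - base
    assert shift > 0, "row_size must be larger than access_size"
    row_mask = ((1 << bits) - 1) << (base + shift)

    def swizzle(offset):
        # XOR the selected row bits into the vector-column bits of the offset.
        return offset ^ ((offset & row_mask) >> shift)

    return swizzle


def bank_groups(offsets, access_size, elem_bytes=4, num_banks=32, bank_bytes=4):
    """Distinct groups of banks touched by vector accesses at the given element offsets."""
    groups_per_sweep = (num_banks * bank_bytes) // (access_size * elem_bytes)
    return sorted({(o // access_size) % groups_per_sweep for o in offsets})


# fp32 example from the text: 8 rows of a 64-wide tile, 16-byte (4-element) accesses.
swz = make_swizzle_sketch(num_rows=8, row_size=64, access_size=4)
column_access = [row * 64 for row in range(8)]   # same column, 8 consecutive rows
plain_groups = bank_groups(column_access, access_size=4)
swizzled_groups = bank_groups([swz(o) for o in column_access], access_size=4)
print(plain_groups)      # every row lands in the same group of banks
print(swizzled_groups)   # each row lands in a different group of banks
```

Under these assumptions, eight rows reading the same column all hit the same banks without swizzling, while the swizzled offsets spread them over eight distinct bank groups. The mapping only permutes offsets within the tile, so no two elements collide.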