Mojo module
normalization
Functions
- block_reduce
- block_reduce_dual_sum: Combined block reduction for two sums using only 2 barriers.
- group_norm
- group_norm_gpu
- group_norm_gpu_block
- group_norm_gpu_multi_block_norm: Multi-block normalize kernel: reduces partial stats and normalizes.
- group_norm_gpu_multi_block_stats: Multi-block stats kernel: computes partial Welford statistics per split.
- group_norm_gpu_warp_tiling
- group_norm_reshape: Reshapes an input buffer for group normalization by flattening all dimensions except the group dimension. Returns a 2D buffer of shape (num_groups * N, group_size), where group_size is the product of channels_per_group and spatial.
- layer_norm
- layer_norm_cpu: Computes layernorm(elementwise_fn(x)) across the last dimension of x, where layernorm is defined as $\frac{x - \mathrm{mean}(x)}{\sqrt{\mathrm{var}(x) + \epsilon}} \cdot \gamma + \beta$.
- layer_norm_gpu
- layer_norm_gpu_block
- layer_norm_gpu_warp_tiling
- layer_norm_reshape
- layer_norm_shape: Compute the output shape of a layer_norm operation.
- rms_norm
- rms_norm_cpu
- rms_norm_fused_residual_add
- rms_norm_fused_residual_add_cpu
- rms_norm_fused_residual_add_gpu
- rms_norm_fused_residual_add_gpu_block
- rms_norm_fused_residual_add_gpu_block_no_shmem: RMS norm fused with residual add, without shared memory reductions.
- rms_norm_fused_residual_add_gpu_warp_tiling
- rms_norm_gpu
- rms_norm_gpu_block
- rms_norm_gpu_warp_tiling
- rms_norm_gpu_warp_tiling_128
- welford_block_all_reduce
- welford_combine
- welford_update
- welford_warp_reduce
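
The Welford entries in the list, together with the "partial Welford statistics per split" wording of the multi-block stats kernel, suggest the usual streaming mean/variance scheme. Below is a minimal sketch of that scheme in plain Python, not the module's Mojo code. The names `WelfordState`, `update`, `combine`, and `partial_stats` are hypothetical and do not reflect the signatures of `welford_update` or `welford_combine`.

```python
from dataclasses import dataclass


@dataclass
class WelfordState:
    """Running statistics (hypothetical container, for illustration only)."""
    mean: float = 0.0
    m2: float = 0.0   # sum of squared deviations from the running mean
    count: int = 0


def update(state: WelfordState, x: float) -> WelfordState:
    """Fold one new value into the running statistics."""
    count = state.count + 1
    delta = x - state.mean
    mean = state.mean + delta / count
    m2 = state.m2 + delta * (x - mean)
    return WelfordState(mean, m2, count)


def combine(a: WelfordState, b: WelfordState) -> WelfordState:
    """Merge two partial results, e.g. from two splits of one group."""
    count = a.count + b.count
    if count == 0:
        return WelfordState()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.count / count
    m2 = a.m2 + b.m2 + delta * delta * a.count * b.count / count
    return WelfordState(mean, m2, count)


def partial_stats(values) -> WelfordState:
    """Compute statistics for one split by repeated updates."""
    state = WelfordState()
    for v in values:
        state = update(state, v)
    return state


# Two splits are reduced independently, then merged.
left = partial_stats([1.0, 2.0, 3.0, 4.0])
right = partial_stats([5.0, 6.0, 7.0, 8.0])
merged = combine(left, right)
variance = merged.m2 / merged.count  # population variance
```

Because `combine` is associative, partial states can be merged in any grouping, which is what makes a split-then-reduce layout possible in principle.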
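
As a reference for what the layer_norm and rms_norm families compute over the last dimension, here is a small pure-Python sketch of the standard textbook definitions for a single row. It is an illustration only: the function names, the default `eps`, and the use of population variance are assumptions, and the module's own functions take additional parameters not shown here.

```python
import math


def layer_norm_ref(row, gamma, beta, eps=1e-5):
    """Standard layer norm over one row: (x - mean) / sqrt(var + eps) * gamma + beta."""
    n = len(row)
    mean = sum(row) / n
    var = sum((v - mean) ** 2 for v in row) / n  # population variance (assumed)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(row, gamma, beta)]


def rms_norm_ref(row, gamma, eps=1e-5):
    """Standard RMS norm over one row: x / sqrt(mean(x^2) + eps) * gamma."""
    mean_sq = sum(v * v for v in row) / len(row)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(row, gamma)]


ln_out = layer_norm_ref([1.0, 2.0, 3.0], gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
rms_out = rms_norm_ref([3.0, 4.0], gamma=[1.0, 1.0], eps=0.0)
```

With unit `gamma` and zero `beta`, the layer norm output has approximately zero mean and unit variance, while the RMS norm output has unit mean square but keeps its original mean.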
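
The 2D shape described for group_norm_reshape can be worked out by hand. The sketch below only computes that shape in plain Python, assuming an input laid out as (N, C, spatial...) with C divisible by num_groups. The helper name `group_norm_2d_shape` and the layout are assumptions, not part of the module.

```python
from math import prod


def group_norm_2d_shape(shape, num_groups):
    """Return (num_groups * N, group_size) for an (N, C, *spatial) input shape."""
    n, channels, *spatial = shape
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    channels_per_group = channels // num_groups
    # group_size is channels_per_group times the product of the spatial dims
    group_size = channels_per_group * prod(spatial)
    return (num_groups * n, group_size)


shape_2d = group_norm_2d_shape((2, 6, 4, 4), num_groups=3)
```

For a batch of 2 with 6 channels in 3 groups and a 4 x 4 spatial extent, each of the 6 rows holds one group of one sample, with 2 * 16 = 32 elements.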