Mojo module
normalization
Functions
- block_reduce
- block_reduce_dual_sum: Combined block reduction for two sums using only 2 barriers.
- group_norm
- group_norm_gpu
- group_norm_gpu_block
- group_norm_gpu_multi_block_norm: Multi-block normalize kernel: reduces partial stats and normalizes.
- group_norm_gpu_multi_block_stats: Multi-block stats kernel: computes partial Welford statistics per split.
- group_norm_gpu_warp_tiling
- group_norm_reshape: Reshapes an input buffer for group normalization by flattening all dimensions except the group dimension. Returns a 2D buffer of shape (num_groups * N, group_size), where group_size is the product of channels_per_group and spatial.
- layer_norm
- layer_norm_cpu: Computes layernorm(elementwise_fn(x)) across the last dimension of x, where layernorm is defined as $\frac{x - \mathrm{mean}(x)}{\sqrt{\mathrm{var}(x) + \epsilon}} \cdot \gamma + \beta$.
- layer_norm_gpu
- layer_norm_gpu_block
- layer_norm_gpu_warp_tiling
- layer_norm_reshape
- layer_norm_shape: Computes the output shape of a `layer_norm` operation.
- rms_norm
- rms_norm_cpu
- rms_norm_fused_residual_add
- rms_norm_fused_residual_add_cpu
- rms_norm_fused_residual_add_gpu
- rms_norm_fused_residual_add_gpu_block
- rms_norm_fused_residual_add_gpu_block_no_shmem: RMS norm fused with residual add, without shared memory reductions.
- rms_norm_fused_residual_add_gpu_warp_tiling
- rms_norm_gpu
- rms_norm_gpu_block
- rms_norm_gpu_warp_tiling
- rms_norm_gpu_warp_tiling_128
- rms_norm_rope_gpu: Fused RMS normalization followed by Rotary Position Embedding (RoPE) for GPU.
- welford_block_all_reduce
- welford_combine
- welford_update
- welford_warp_reduce
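
The multi-block group norm kernels listed above are described as computing partial Welford statistics per split and then reducing them. As a rough illustration of that idea, here is a minimal reference sketch in Python rather than Mojo. The names welford_update and welford_combine are borrowed from the list, but the signatures, the (mean, m2, count) tuple layout, and the helpers partial_stats and split_mean_var are assumptions made for this example, not the module's actual API.

```python
# Reference sketch only: plain Python, not the Mojo kernels.
# All signatures and the (mean, m2, count) layout are hypothetical.


def welford_update(mean, m2, count, x):
    """Fold one value x into the running (mean, m2, count) statistics."""
    count += 1
    delta = x - mean
    mean += delta / count
    # m2 accumulates the sum of squared deviations from the running mean.
    m2 += delta * (x - mean)
    return mean, m2, count


def welford_combine(a, b):
    """Merge two partial (mean, m2, count) triples into one."""
    mean_a, m2_a, n_a = a
    mean_b, m2_b, n_b = b
    n = n_a + n_b
    if n == 0:
        return 0.0, 0.0, 0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return mean, m2, n


def partial_stats(values):
    """Hypothetical helper: Welford statistics for one split."""
    stats = (0.0, 0.0, 0)
    for x in values:
        stats = welford_update(*stats, x)
    return stats


def split_mean_var(values, num_splits):
    """Hypothetical helper: population mean and variance from per-split stats."""
    chunk = max(1, -(-len(values) // num_splits))  # ceiling division
    total = (0.0, 0.0, 0)
    for start in range(0, len(values), chunk):
        part = partial_stats(values[start:start + chunk])
        total = welford_combine(total, part)
    mean, m2, n = total
    return mean, (m2 / n if n else 0.0)
```

For the values 1.0 through 8.0 split three ways, split_mean_var returns a mean of 4.5 and a population variance of 5.25, up to floating-point rounding, which matches computing the statistics over the whole list at once. The per-split triples are independent, which is presumably why a stats pass and a separate reduce-and-normalize pass can run as two kernels.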