Mojo function
layer_norm_cpu
layer_norm_cpu[dtype: DType, //, input_fn: def[width: Int, alignment: Int](Int, Int) capturing -> SIMD[dtype, width], gamma_fn: def[width: Int, rank: Int, alignment: Int](IndexList[rank]) capturing -> SIMD[dtype, width], output_fn: def[width: Int, alignment: Int](row: Int, col: Int, val: SIMD[dtype, width]) capturing -> None](num_rows: Int, num_cols: Int, beta: TileTensor[dtype, address_space=beta.address_space, linear_idx_type=beta.linear_idx_type, element_size=beta.element_size], epsilon: Scalar[dtype])
Computes layernorm(elementwise_fn(x)) across the last dimension of x, where layernorm is defined as $(x - \mathrm{mean}(x)) / \sqrt{\mathrm{var}(x) + \epsilon} \cdot \gamma + \beta$.
Currently performs 3 passes over the input data. This can be reduced to 2 by fusing the add, mean, and variance loops using Welford's algorithm.
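As a rough illustration of the fusion mentioned above, the following Python sketch computes the mean and variance of one row in a single loop with Welford's update. It is not the Mojo implementation. The function name `welford_mean_var` is hypothetical, and the use of the population (biased) variance is an assumption based on the usual layer norm convention.

```python
def welford_mean_var(values):
    """Return (mean, population variance) of `values` in one pass.

    Hypothetical helper for illustration only; dividing by `count`
    (biased variance) is an assumption, not taken from the Mojo source.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / count     # update the running mean
        m2 += delta * (x - mean)  # uses both the old and the new mean
    if count == 0:
        raise ValueError("need at least one value")
    return mean, m2 / count
```

Because the mean and the sum of squared deviations are updated together, the separate mean and variance loops collapse into one, which leaves a second loop for the normalization itself.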
Parameters:
- dtype (DType): The element dtype of the x and out buffers.
- input_fn (def[width: Int, alignment: Int](Int, Int) capturing -> SIMD[dtype, width]): Function called to generate an input value.
- gamma_fn (def[width: Int, rank: Int, alignment: Int](IndexList[rank]) capturing -> SIMD[dtype, width]): Function called to generate a gamma value.
- output_fn (def[width: Int, alignment: Int](row: Int, col: Int, val: SIMD[dtype, width]) capturing -> None): Function called to store the output value.
Args:
- num_rows (Int): The number of rows in the input tensor.
- num_cols (Int): The number of columns in the input tensor.
- beta (TileTensor[dtype, address_space=beta.address_space, linear_idx_type=beta.linear_idx_type, element_size=beta.element_size]): The beta value to use in the layernorm calculation.
- epsilon (Scalar[dtype]): The eps value to use in the layernorm calculation.
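To make the roles of the callbacks and arguments above more concrete, here is a scalar Python sketch of a row-wise layer norm driven by an input callback, a gamma callback, and an output callback. It is only a model of the call shape, not the Mojo code. The name `layer_norm_rows` is hypothetical, SIMD width and alignment are ignored, and indexing gamma and beta by column with a population variance is an assumption. The separate loops are written for readability and are not claimed to match the passes in the actual implementation.

```python
import math

def layer_norm_rows(num_rows, num_cols, input_fn, gamma_fn, output_fn, beta, epsilon):
    """Normalize each row over its columns (hypothetical reference sketch).

    input_fn(row, col) -> float   produces an input value
    gamma_fn(col) -> float        produces a scale value (assumed per column)
    output_fn(row, col, val)      stores one output value
    beta                          sequence of per-column offsets (assumed)
    """
    for row in range(num_rows):
        # Mean of the row.
        mean = sum(input_fn(row, col) for col in range(num_cols)) / num_cols
        # Population variance of the row (assumed convention).
        var = sum((input_fn(row, col) - mean) ** 2 for col in range(num_cols)) / num_cols
        inv_std = 1.0 / math.sqrt(var + epsilon)
        # Normalize, scale by gamma, shift by beta, and hand off to output_fn.
        for col in range(num_cols):
            val = (input_fn(row, col) - mean) * inv_std * gamma_fn(col) + beta[col]
            output_fn(row, col, val)
```

Passing the input and output as callbacks rather than buffers lets the caller fuse other elementwise work into the load and store steps, which appears to be the motivation for the `input_fn` and `output_fn` parameters.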
layer_norm_cpu[dtype: DType, rank: Int, //, input_fn: def[width: Int, rank: Int, alignment: Int](IndexList[rank]) capturing -> SIMD[dtype, width], gamma_fn: def[width: Int, rank: Int, alignment: Int](IndexList[rank]) capturing -> SIMD[dtype, width], output_fn: def[width: Int, rank: Int, alignment: Int](idx: IndexList[rank], val: SIMD[dtype, width]) capturing -> None](shape: IndexList[rank], beta: TileTensor[dtype, address_space=beta.address_space, linear_idx_type=beta.linear_idx_type, element_size=beta.element_size], epsilon: Scalar[dtype], ctx: Optional[DeviceContext] = None)