Mojo function
softmax_2_pass
softmax_2_pass[simd_width: Int, dtype: DType](output: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], input: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])
Performs an unbatched softmax on an input tensor using the two-pass online algorithm.
The unbatched two-pass online softmax is described in "Online normalizer calculation for softmax" (https://arxiv.org/abs/1805.02867) and "A full-stack search technique for domain optimized deep learning accelerators" (https://dl.acm.org/doi/abs/10.1145/3503222.3507767) and is defined as:
procedure SoftmaxUnbatched(Input)
runningMax = -∞
runningSum = 0
STAGE 1:
for i = 0 to N do
newMax = max(runningMax, Input[i])
runningSum = runningSum*exp(runningMax-newMax) + exp(Input[i]-newMax)
runningMax = newMax
end for
STAGE 2:
for i = 0 to N do
Output[i] = exp(Input[i] - runningMax) / runningSum
end for
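The procedure above can be sketched in plain Python to show why the rescaling step works. This is an illustrative scalar version only, not the Mojo implementation: the name `softmax_two_pass_sketch` is hypothetical, it operates on a Python list rather than a `LayoutTensor`, and it ignores `simd_width` entirely. The invariant it relies on is that after each iteration of the first loop, the running sum equals the sum of `exp(x - running_max)` over all elements seen so far.

```python
import math


def softmax_two_pass_sketch(values):
    """Illustrative two-pass online softmax over a list of finite floats.

    Hypothetical helper for explanation only; not part of the Mojo API.
    """
    running_max = -math.inf
    running_sum = 0.0

    # First loop: track the maximum and the normalizer together.
    # When the maximum grows, the old sum is rescaled by
    # exp(old_max - new_max) so it stays relative to the new maximum.
    for x in values:
        new_max = max(running_max, x)
        running_sum = (
            running_sum * math.exp(running_max - new_max)
            + math.exp(x - new_max)
        )
        running_max = new_max

    # Second loop: normalize each element using the final max and sum.
    return [math.exp(x - running_max) / running_sum for x in values]


if __name__ == "__main__":
    probs = softmax_two_pass_sketch([1.0, 2.0, 3.0])
    print(probs, sum(probs))
```

Because every exponent is taken relative to the running maximum, no call to `exp` receives a positive argument, so large inputs such as `[1000.0, 1001.0]` do not overflow. The sketch assumes finite inputs; an input of negative infinity would need separate handling.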
Parameters:
- simd_width (Int): The simd_width to use in vectorization.
- dtype (DType): The dtype of the input and output buffers.
Args:
- output (LayoutTensor): The output buffer in which to store the softmax values.
- input (LayoutTensor): The input buffer used to compute the softmax.