Mojo function
fused_token_sampling_cpu
fused_token_sampling_cpu[type: DType, rank: Int, out_idx_type: DType](
    max_k: Int,
    input: NDBuffer[type, rank, origin],
    out_idxs: NDBuffer[out_idx_type, rank, origin],
    k: OptionalReg[NDBuffer[int64, 1, MutableAnyOrigin]] = OptionalReg[NDBuffer[int64, 1, MutableAnyOrigin]]({:i1 0, 1}),
    temperature: OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]] = OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]]({:i1 0, 1}),
    top_p: OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]] = OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]]({:i1 0, 1}),
    seed: OptionalReg[NDBuffer[uint64, 1, MutableAnyOrigin]] = OptionalReg[NDBuffer[uint64, 1, MutableAnyOrigin]]({:i1 0, 1})
)
Generalized implementation of the top-k algorithm with sampling. For each row/subvolume of the input tensor, samples one index along the innermost dimension and stores it in out_idxs.
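The "one index per row/subvolume" behavior can be pictured with a small Python sketch. This is only an illustration of the shape relationship described for out_idxs, not the Mojo implementation; the helper names `output_shape` and `innermost_rows` are hypothetical, and a contiguous row-major layout is assumed.

```python
from math import prod

def output_shape(input_shape):
    """Shape of the index output: leading dims kept, innermost dim collapsed to 1."""
    return list(input_shape[:-1]) + [1]

def innermost_rows(flat_values, input_shape):
    """Split a flat buffer (assumed row-major) into rows along the innermost dim."""
    width = input_shape[-1]
    assert len(flat_values) == prod(input_shape)
    return [flat_values[i:i + width] for i in range(0, len(flat_values), width)]

shape = [2, 3, 4]
values = list(range(prod(shape)))
rows = innermost_rows(values, shape)

print(output_shape(shape))  # [2, 3, 1]
print(len(rows))            # 6 rows, each yielding one sampled index
```

Under these assumptions, a rank-3 input of shape [2, 3, 4] has six innermost rows of four candidates each, so six indices are produced.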
Parameters:
- type (DType): Data type of the input buffer.
- rank (Int): Rank of the input.
- out_idx_type (DType): Data type of the output indices.
Args:
- max_k (Int): Largest number of top elements.
- input (NDBuffer[type, rank, origin]): The input tensor, of any shape.
- out_idxs (NDBuffer[out_idx_type, rank, origin]): The output indices, with shape [input_shape[:-1]] + [1].
- k (OptionalReg[NDBuffer[int64, 1, MutableAnyOrigin]]): Optional device buffer holding the number of top elements to keep for each batch element.
- temperature (OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]]): Optional temperature used for temperature-based scaling.
- top_p (OptionalReg[NDBuffer[float32, 1, MutableAnyOrigin]]): Optional top-p threshold. Only the most probable tokens, taken in descending order until their cumulative probability exceeds this threshold, are used.
- seed (OptionalReg[NDBuffer[uint64, 1, MutableAnyOrigin]]): Optional seed for the random number generator.
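To make the interaction of k, temperature, top_p, and seed more concrete, here is a minimal Python sketch of sampling a single row. It is not the Mojo kernel: `sample_token` is a hypothetical helper, the order of the steps (top-k, then temperature-scaled softmax, then top-p) is an assumption that this page does not confirm, and max_k is not modeled.

```python
import math
import random

def sample_token(logits, k, temperature=1.0, top_p=1.0, seed=0):
    """Sample one index from `logits` using top-k, temperature, and top-p.

    Hypothetical reference helper; the step order is an assumption.
    """
    if not logits or k < 1 or temperature <= 0.0:
        raise ValueError("need non-empty logits, k >= 1, and temperature > 0")

    # Keep the indices of the k largest logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature-scaled softmax over the kept logits (max-subtracted for stability).
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = 0, 0.0
    for p in probs:
        kept += 1
        cumulative += p
        if cumulative >= top_p:
            break

    # Draw among the kept candidates; random.choices normalizes the weights itself.
    rng = random.Random(seed)
    choice = rng.choices(range(kept), weights=probs[:kept], k=1)[0]
    return top[choice]

logits = [0.1, 2.0, -1.0, 1.5]
print(sample_token(logits, k=1))          # k=1 always selects the largest logit
print(sample_token(logits, k=2, seed=7))  # one of the two largest logits
```

In this sketch, k=1 reduces to a plain argmax, a very small top_p has the same effect, and reusing a seed reproduces the same draw. Whether the actual kernel behaves identically in these edge cases is not stated on this page.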