Mojo function
min_p_sampling
min_p_sampling[dtype: DType, out_idx_type: DType, //, _test_sort: Bool = False](min_ps: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], input_logits: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], out_token_ids: TileTensor[out_idx_type, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], temperature: Scalar[dtype] = 1)
Naive CPU implementation of Min-P sampling for token selection. This function applies temperature scaling, softmax, and a merge sort, then samples tokens based on the calculated probability threshold (Min-P).
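The steps named above can be illustrated with a minimal sketch in plain Python. This is not the Mojo implementation: the helper name `min_p_sample_sketch`, its list-based arguments, and the `rng` parameter are hypothetical. It assumes the common Min-P convention in which the cutoff is `min_p` times the largest token probability, and it handles a single row of logits. It also uses Python's built-in sort where the description mentions a merge sort.

```python
import math
import random


def min_p_sample_sketch(logits, min_p, temperature=1.0, rng=random):
    """Illustrative Min-P sampling for one row of logits (hypothetical helper).

    Assumes the cutoff is min_p * max_probability; returns a token index.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # 1. Temperature scaling.
    scaled = [x / temperature for x in logits]

    # 2. Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 3. Sort token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # 4. Keep tokens whose probability reaches the Min-P threshold.
    threshold = min_p * probs[order[0]]
    kept = [i for i in order if probs[i] >= threshold]

    # 5. Sample from the kept tokens in proportion to their probabilities.
    kept_mass = sum(probs[i] for i in kept)
    r = rng.random() * kept_mass
    cumulative = 0.0
    for i in kept:
        cumulative += probs[i]
        if r < cumulative:
            return i
    return kept[-1]  # Guard against floating-point round-off.
```

In this sketch, `min_p = 1.0` keeps only the most probable token, so sampling becomes greedy, while `min_p = 0.0` keeps every token and reduces to ordinary temperature sampling. The `min_ps` argument in the signature above is a tensor, which suggests one threshold per batch row. Under that assumption, a batched caller would apply this helper to each row with its own `min_p`.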