Mojo function

min_p_sampling

min_p_sampling[dtype: DType, out_idx_type: DType, //, _test_sort: Bool = False](
    min_ps: TileTensor[dtype, min_ps.LayoutType, min_ps.origin, address_space=min_ps.address_space, linear_idx_type=min_ps.linear_idx_type, element_shape_types=min_ps.element_shape_types],
    input_logits: TileTensor[dtype, input_logits.LayoutType, input_logits.origin, address_space=input_logits.address_space, linear_idx_type=input_logits.linear_idx_type, element_shape_types=input_logits.element_shape_types],
    out_token_ids: TileTensor[out_idx_type, out_token_ids.LayoutType, out_token_ids.origin, address_space=out_token_ids.address_space, linear_idx_type=out_token_ids.linear_idx_type, element_shape_types=out_token_ids.element_shape_types],
    temperature: Scalar[dtype] = 1)

Naive CPU implementation of Min-P sampling for token selection. The function applies temperature scaling and softmax to the input logits, sorts the resulting probabilities with a merge sort, and then samples a token from those that meet the Min-P probability threshold.
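The steps described above can be illustrated with a minimal Python sketch. This is not the Mojo implementation and does not use `TileTensor`. The function name `min_p_sample`, its parameters, and the single-row input are assumptions made for illustration. It also assumes the common definition of Min-P, where a token is kept only if its probability is at least `min_p` times the highest probability in the distribution.

```python
import math
import random


def min_p_sample(logits, min_p, temperature=1.0, rng=random):
    """Hypothetical reference sketch of Min-P sampling for one row of logits."""
    # 1. Temperature scaling.
    scaled = [x / temperature for x in logits]

    # 2. Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 3. Sort token indices by descending probability.
    #    Python's built-in sort stands in for the merge sort here.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # 4. Assumed Min-P rule: keep tokens with prob >= min_p * max prob.
    threshold = min_p * probs[order[0]]
    kept = [i for i in order if probs[i] >= threshold]

    # 5. Sample from the kept tokens in proportion to their probability.
    kept_mass = sum(probs[i] for i in kept)
    draw = rng.random() * kept_mass
    running = 0.0
    for i in kept:
        running += probs[i]
        if draw < running:
            return i
    return kept[-1]  # Guard against floating-point round-off.
```

With `min_p=1.0` only the most likely token survives the threshold, so the sketch reduces to greedy decoding. With `min_p=0.0` every token is kept and the sketch samples from the full softmax distribution.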
