
Mojo function

topk_topp_sampling_from_prob

def topk_topp_sampling_from_prob[
    dtype: DType,
    out_idx_type: DType,
    block_size: Int = Int(1024),
    TopKArrLayoutType: TensorLayout = Layout[*?, *?],
    IndicesLayoutType: TensorLayout = Layout[*?, *?],
    TopPArrLayoutType: TensorLayout = Layout[*?, *?],
    SeedLayoutType: TensorLayout = Layout[*?, *?]
](
    ctx: DeviceContext,
    probs: TileTensor[dtype, Storage=probs.Storage, address_space=probs.address_space, linear_idx_type=probs.linear_idx_type, element_size=probs.element_size],
    output: TileTensor[out_idx_type, Storage=output.Storage, address_space=output.address_space, linear_idx_type=output.linear_idx_type, element_size=output.element_size],
    top_k_val: Int,
    top_p_val: Float32 = 1,
    deterministic: Bool = False,
    rng_seed: Optional[TileTensor[DType.uint64, SeedLayoutType, ImmutAnyOrigin]] = None,
    rng_offset: UInt64 = UInt64(0),
    indices: Optional[TileTensor[out_idx_type, IndicesLayoutType, ImmutAnyOrigin]] = None,
    top_k_arr: Optional[TileTensor[out_idx_type, TopKArrLayoutType, ImmutAnyOrigin]] = None,
    top_p_arr: Optional[TileTensor[DType.float32, TopPArrLayoutType, ImmutAnyOrigin]] = None
)

Joint top-k + top-p sampling from a probability distribution.

Performs stochastic sampling over only those tokens that satisfy both the top-k count constraint and the top-p (nucleus) constraint. When top_p_val is 1.0 (the default), this behaves identically to topk_sampling_from_prob.
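
To make the joint constraint concrete, here is a minimal CPU-side reference sketch in plain Python. It is not the GPU kernel and does not call the Mojo API. The helper names `topk_topp_filter` and `sample` are hypothetical, and two details are assumptions rather than documented behavior: the nucleus is taken to be the smallest highest-probability prefix whose cumulative mass reaches `top_p`, and the surviving probabilities are renormalized before drawing.

```python
import random


def topk_topp_filter(probs, top_k, top_p=1.0):
    """Return {token_index: renormalized_prob} for tokens passing both filters.

    Hypothetical helper for illustration only (not part of the Mojo API).
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Rank token indices by probability, highest first (ties broken by index).
    order = sorted(range(len(probs)), key=lambda i: (-probs[i], i))

    kept = []
    cumulative = 0.0
    for rank, i in enumerate(order):
        # Top-k constraint: keep at most top_k tokens.
        if rank >= top_k:
            break
        # Top-p constraint (assumed convention): stop once the mass of the
        # tokens already kept has reached top_p.
        if cumulative >= top_p:
            break
        kept.append(i)
        cumulative += probs[i]

    # Renormalize over the surviving tokens (assumption).
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample(probs, top_k, top_p=1.0, rng=None):
    """Draw one token index from the filtered, renormalized distribution."""
    rng = rng or random.Random()
    dist = topk_topp_filter(probs, top_k, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

With `probs = [0.5, 0.3, 0.15, 0.05]` and `top_k = 3`, leaving `top_p` at 1.0 keeps tokens 0, 1, and 2, so only the top-k constraint is active. Setting `top_p = 0.7` narrows the candidates to tokens 0 and 1, because their combined mass of 0.8 already reaches the threshold.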

Args:

Raises:

Error: If tensor ranks or shapes are invalid.