
Python function

max_tokens_to_generate

max_tokens_to_generate()

max.pipelines.modeling.dataprocessing.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=-1)


Returns the maximum number of new tokens to generate.

The cap imposed by max_length is max_length minus prompt_size. If max_new_tokens is non-negative, the result is the smaller of that cap and max_new_tokens; otherwise only the max_length cap applies.

Parameters:

  • prompt_size (int) – Current prompt (context) length in tokens.
  • max_length (int) – Maximum total sequence length.
  • max_new_tokens (int) – Cap on new tokens, or -1 to use only max_length.

Returns:

The effective cap on new tokens to generate.

Return type:

int
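
Example:

The following is a minimal sketch of the behavior described above, not the library's source. The name max_tokens_to_generate_sketch is hypothetical and is used so the snippet runs without MAX installed. The description does not say what happens when prompt_size exceeds max_length, so the sketch simply returns the raw difference in that case; that is an assumption.

```python
def max_tokens_to_generate_sketch(
    prompt_size: int, max_length: int, max_new_tokens: int = -1
) -> int:
    """Illustrative re-implementation of the documented rule (hypothetical helper)."""
    # Tokens left in the sequence once the prompt is accounted for.
    remaining = max_length - prompt_size

    # A negative max_new_tokens means "no separate cap": use max_length only.
    if max_new_tokens < 0:
        return remaining

    # Both limits apply, so the tighter one wins.
    return min(max_new_tokens, remaining)


# Only max_length applies: 100 - 10 tokens remain.
print(max_tokens_to_generate_sketch(10, 100))      # 90
# max_new_tokens is the tighter limit.
print(max_tokens_to_generate_sketch(10, 100, 20))  # 20
# max_length is the tighter limit: only 10 tokens remain.
print(max_tokens_to_generate_sketch(90, 100, 20))  # 10
```

Note that a max_new_tokens of 0 is non-negative, so under this reading it caps generation at zero tokens rather than disabling the limit.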