Python function

max_tokens_to_generate

`max_tokens_to_generate()`

max.pipelines.modeling.dataprocessing.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=-1)

Returns the maximum number of new tokens to generate.

The result is capped by the room left in the sequence, max_length - prompt_size. When max_new_tokens is non-negative it acts as a second cap, and the smaller of the two values is returned.

Parameters:

prompt_size (int) – Current prompt (context) length in tokens.
max_length (int) – Maximum total sequence length.
max_new_tokens (int) – Cap on new tokens, or -1 to use only max_length.

Returns:

The effective cap on new tokens to generate.

Return type:

int
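
The rule described above can be sketched in a few lines. This is an illustrative re-implementation based only on the description on this page, not the library's source: the name `max_tokens_to_generate_sketch` is hypothetical, and how the real function handles edge cases (for example a prompt longer than max_length) is not stated here, so the sketch makes no attempt to clamp the result.

```python
def max_tokens_to_generate_sketch(
    prompt_size: int, max_length: int, max_new_tokens: int = -1
) -> int:
    """Hypothetical stand-in that mirrors the documented behavior."""
    # Room left in the sequence once the prompt is accounted for.
    remaining = max_length - prompt_size
    # A negative max_new_tokens means "use only max_length".
    if max_new_tokens < 0:
        return remaining
    # Both caps apply: take the smaller one.
    return min(max_new_tokens, remaining)


# No explicit cap: limited only by max_length.
print(max_tokens_to_generate_sketch(100, 512))       # 412
# Explicit cap smaller than the remaining room.
print(max_tokens_to_generate_sketch(100, 512, 50))   # 50
# Remaining room smaller than the explicit cap.
print(max_tokens_to_generate_sketch(500, 512, 50))   # 12
```

The three calls cover the cases the description distinguishes: max_new_tokens disabled, max_new_tokens as the binding limit, and max_length as the binding limit.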
