
Python function

max_tokens_to_generate

max_tokens_to_generate()

max.pipelines.modeling.dataprocessing.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=-1)


Returns the maximum number of new tokens to generate.

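The budget left by max_length is max_length minus prompt_size. When max_new_tokens is non-negative, it applies as a second limit, and the function returns the smaller of the two. When max_new_tokens is -1, only the max_length budget applies.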

Parameters:

  • prompt_size (int) – Current prompt (context) length in tokens.
  • max_length (int) – Maximum total sequence length.
  • max_new_tokens (int) – Cap on new tokens, or -1 to use only max_length.

Returns:

The effective cap on new tokens to generate.

Return type:

int
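
The rule described above can be sketched in a few lines of plain Python. This is an illustrative reimplementation based only on the description on this page, not the library's source: the name max_tokens_to_generate_sketch is hypothetical, and the case where prompt_size exceeds max_length is not specified here, so the sketch simply returns the raw difference.

```python
def max_tokens_to_generate_sketch(
    prompt_size: int, max_length: int, max_new_tokens: int = -1
) -> int:
    """Hypothetical sketch of the documented rule, not the library's code."""
    # Tokens still available before the total sequence reaches max_length.
    # Assumption: no clamping is applied when prompt_size > max_length.
    remaining = max_length - prompt_size

    # A negative max_new_tokens (the default -1) means "no separate cap".
    if max_new_tokens < 0:
        return remaining

    # Both limits apply: the tighter one wins.
    return min(remaining, max_new_tokens)


# Only max_length applies: 100 - 10 = 90
print(max_tokens_to_generate_sketch(10, 100))
# max_new_tokens is the tighter limit: min(90, 20) = 20
print(max_tokens_to_generate_sketch(10, 100, 20))
# max_length is the tighter limit: min(10, 20) = 10
print(max_tokens_to_generate_sketch(90, 100, 20))
```

Using -1 as the "no cap" sentinel keeps the return type a plain int, at the cost of making 0 a meaningful value: passing max_new_tokens=0 yields 0 in this sketch.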