Python module

tokenizer

Implementations of provided tokenizers.

IdentityPipelineTokenizer

class max.pipelines.lib.tokenizer.IdentityPipelineTokenizer(*args, **kwargs)

decode()

async decode(encoded, **kwargs)

Returns the encoded string unchanged (identity decoding).

Parameters:

encoded (str)

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Returns the prompt unchanged (identity encoding).

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

str

eos

property eos: int

Returns the end-of-sequence token ID (0 for identity).

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.
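
For illustration, a minimal round trip with the identity tokenizer. The no-argument construction is an assumption based on the *args, **kwargs signature above; everything else uses only the methods documented here.

import asyncio

from max.pipelines.lib.tokenizer import IdentityPipelineTokenizer

async def main() -> None:
    tokenizer = IdentityPipelineTokenizer()  # assumed: no-arg construction
    encoded = await tokenizer.encode("hello world")
    decoded = await tokenizer.decode(encoded)
    assert decoded == "hello world"  # identity: text passes through unchanged
    print(tokenizer.eos)  # 0 for the identity tokenizer

asyncio.run(main())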

PreTrainedPipelineTokenizer

class max.pipelines.lib.tokenizer.PreTrainedPipelineTokenizer(delegate)

Parameters:

delegate (PreTrainedTokenizer | PreTrainedTokenizerFast)

apply_chat_template()

apply_chat_template(messages)

Applies the delegate’s chat template to the messages.

Parameters:

messages (list[TextGenerationRequestMessage])

Return type:

str

decode()

async decode(encoded, **kwargs)

Decodes token ids to text via the delegate.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Encodes the prompt to token ids via the delegate.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.
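
A minimal sketch of wrapping a Hugging Face tokenizer. The "gpt2" model id is an arbitrary example, not part of this API; the rest follows the signatures documented above.

import asyncio

from transformers import AutoTokenizer

from max.pipelines.lib.tokenizer import PreTrainedPipelineTokenizer

async def main() -> None:
    # Any PreTrainedTokenizer or PreTrainedTokenizerFast can act as the
    # delegate; "gpt2" is only an example model id.
    delegate = AutoTokenizer.from_pretrained("gpt2")
    tokenizer = PreTrainedPipelineTokenizer(delegate)

    ids = await tokenizer.encode("Hello, world!", add_special_tokens=False)
    text = await tokenizer.decode(ids)
    print(ids.shape, text)

asyncio.run(main())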

TextAndVisionTokenizer

class max.pipelines.lib.tokenizer.TextAndVisionTokenizer(model_path, pipeline_config, *, revision=None, max_length=None, trust_remote_code=False, context_validators=None, **unused_kwargs)

Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.

Parameters:

  • model_path (str) – Path to the model/tokenizer
  • pipeline_config (PipelineConfig) – Pipeline configuration
  • revision (str | None) – Git revision/branch to use
  • max_length (int | None) – Maximum sequence length
  • trust_remote_code (bool) – Whether to trust remote code from the model
  • context_validators – Optional validators applied to each new context

apply_chat_template()

apply_chat_template(messages)

Applies the processor’s chat template to the messages.

Parameters:

messages (list[TextGenerationRequestMessage])

Return type:

str

decode()

async decode(encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.

new_context()

async new_context(request)

Creates a new TextAndVisionContext object, using the necessary information from the TextGenerationRequest.

Parameters:

request (TextGenerationRequest)

Return type:

TextAndVisionContext
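
A sketch of the typical request flow. It assumes a PipelineConfig and a TextGenerationRequest built elsewhere by the serving layer, and the model path is a hypothetical example:

from max.pipelines.lib.tokenizer import TextAndVisionTokenizer

async def build_context(pipeline_config, request):
    # "llava-hf/llava-1.5-7b-hf" is a hypothetical example model path;
    # pipeline_config is a PipelineConfig and request is a
    # TextGenerationRequest supplied by the caller.
    tokenizer = TextAndVisionTokenizer(
        "llava-hf/llava-1.5-7b-hf",
        pipeline_config,
        max_length=4096,
        trust_remote_code=False,
    )
    # new_context tokenizes the request (text plus any images) into a
    # TextAndVisionContext ready for the pipeline.
    return await tokenizer.new_context(request)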

TextTokenizer

class max.pipelines.lib.tokenizer.TextTokenizer(model_path, pipeline_config, *, revision=None, max_length=None, trust_remote_code=False, enable_llama_whitespace_fix=False, chat_template=None, context_validators=None, **unused_kwargs)

Encapsulates creation of TextContext and specific token encode/decode logic.

Parameters:

  • model_path (str) – Path to the model/tokenizer
  • revision (str | None) – Git revision/branch to use
  • max_length (int | None) – Maximum sequence length
  • trust_remote_code (bool) – Whether to trust remote code from the model
  • enable_llama_whitespace_fix (bool) – Enable whitespace fix for Llama tokenizers
  • pipeline_config (PipelineConfig) – Optional pipeline configuration
  • chat_template (str | None) – Optional custom chat template string to override the one shipped with the Hugging Face model config. This allows customizing the prompt formatting for different use cases.
  • context_validators (list[Callable[[TextContext], None]] | None)

apply_chat_template()

apply_chat_template(messages, tools, chat_template_options=None)

Applies the delegate’s chat template to the messages (and optional tools).

Parameters:

  • messages (list[TextGenerationRequestMessage])
  • tools
  • chat_template_options

Return type:

str

decode()

async decode(encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.

new_context()

async new_context(request)

Creates a new TextContext object, using the necessary information from the TextGenerationRequest.

Parameters:

request (TextGenerationRequest)

Return type:

TextContext
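
A sketch of text-only usage under the same assumptions: an existing PipelineConfig and TextGenerationRequest, with an arbitrary example model path.

from max.pipelines.lib.tokenizer import TextTokenizer

async def run(pipeline_config, request):
    # The model path is an arbitrary example id.
    tokenizer = TextTokenizer(
        "meta-llama/Llama-3.1-8B-Instruct",
        pipeline_config,
        max_length=8192,
        enable_llama_whitespace_fix=True,  # useful for Llama-family tokenizers
    )
    ids = await tokenizer.encode("Hello!", add_special_tokens=True)
    print(await tokenizer.decode(ids))
    # Or build a full generation context from the request:
    return await tokenizer.new_context(request)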

max_tokens_to_generate()

max.pipelines.lib.tokenizer.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=None)

Returns the max number of new tokens to generate.

Parameters:

  • prompt_size (int)
  • max_length (int | None)
  • max_new_tokens (int | None)

Return type:

int | None
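
Assuming the helper clamps the new-token budget to the remaining context window (max_length minus prompt_size) and to max_new_tokens when given, the expected behavior would be:

from max.pipelines.lib.tokenizer import max_tokens_to_generate

# A 128-token window minus a 100-token prompt leaves room for 28 new tokens.
print(max_tokens_to_generate(prompt_size=100, max_length=128))  # expected: 28

# An explicit max_new_tokens below the remaining room should win.
print(max_tokens_to_generate(prompt_size=100, max_length=128, max_new_tokens=10))  # expected: 10

# With no max_length there is no window to clamp against.
print(max_tokens_to_generate(prompt_size=100, max_length=None))  # expected: None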

run_with_default_executor()

async max.pipelines.lib.tokenizer.run_with_default_executor(fn, *args, **kwargs)

Runs a callable in the default thread pool executor.

Parameters:

  • fn (Callable[_P, _R]) – Callable to run.
  • *args (_P.args) – Positional arguments for fn.
  • **kwargs (_P.kwargs) – Keyword arguments for fn.

Returns:

The result of fn(*args, **kwargs).

Return type:

_R
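
A minimal example of offloading a blocking call so the event loop stays responsive. The blocking function is a stand-in for CPU-bound work such as a synchronous tokenizer call:

import asyncio

from max.pipelines.lib.tokenizer import run_with_default_executor

def blocking_tokenize(text: str, *, upper: bool = False) -> str:
    # Stand-in for CPU-bound work (e.g. a synchronous Hugging Face call).
    return text.upper() if upper else text.lower()

async def main() -> None:
    result = await run_with_default_executor(blocking_tokenize, "Hello", upper=True)
    print(result)  # HELLO

asyncio.run(main())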
