Python module

tokenizer

Implementations of provided tokenizers.

IdentityPipelineTokenizer

class max.pipelines.lib.tokenizer.IdentityPipelineTokenizer(*args, **kwargs)

decode()

async decode(encoded, **kwargs)

Returns the encoded string unchanged (identity decoding).

Parameters:

encoded (str)

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Returns the prompt unchanged (identity encoding).

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

str

eos

property eos: int

Returns the end-of-sequence token ID (0 for identity).

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.
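
For illustration, a minimal round trip with the identity tokenizer. The no-argument construction is an assumption based on the *args, **kwargs signature above; everything else uses only the methods documented here.

import asyncio

from max.pipelines.lib.tokenizer import IdentityPipelineTokenizer

async def main() -> None:
    tokenizer = IdentityPipelineTokenizer()  # assumed: no-arg construction
    encoded = await tokenizer.encode("hello world")
    decoded = await tokenizer.decode(encoded)
    assert decoded == "hello world"  # identity: text passes through unchanged
    print(tokenizer.eos)  # 0 for the identity tokenizer

asyncio.run(main())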

PreTrainedPipelineTokenizer

class max.pipelines.lib.tokenizer.PreTrainedPipelineTokenizer(delegate)

Parameters:

delegate (PreTrainedTokenizer | PreTrainedTokenizerFast)

apply_chat_template()

apply_chat_template(messages)

Applies the delegate’s chat template to the messages.

Parameters:

messages (list[TextGenerationRequestMessage])

Return type:

str

decode()

async decode(encoded, **kwargs)

Decodes token ids to text via the delegate.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Encodes the prompt to token ids via the delegate.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.
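
A minimal sketch of wrapping a Hugging Face tokenizer. The "gpt2" model id is an arbitrary example, not part of this API; the rest follows the signatures documented above.

import asyncio

from transformers import AutoTokenizer

from max.pipelines.lib.tokenizer import PreTrainedPipelineTokenizer

async def main() -> None:
    # Any PreTrainedTokenizer or PreTrainedTokenizerFast can act as the
    # delegate; "gpt2" is only an example model id.
    delegate = AutoTokenizer.from_pretrained("gpt2")
    tokenizer = PreTrainedPipelineTokenizer(delegate)

    ids = await tokenizer.encode("Hello, world!", add_special_tokens=False)
    text = await tokenizer.decode(ids)
    print(ids.shape, text)

asyncio.run(main())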

TextAndVisionTokenizer

class max.pipelines.lib.tokenizer.TextAndVisionTokenizer(model_path, pipeline_config, *, revision=None, max_length=None, trust_remote_code=False, context_validators=None, **unused_kwargs)

Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.

Parameters:

  • model_path (str) – Path to the model/tokenizer
  • pipeline_config (PipelineConfig) – Pipeline configuration
  • revision (str | None) – Git revision/branch to use
  • max_length (int | None) – Maximum sequence length
  • trust_remote_code (bool) – Whether to trust remote code from the model
  • context_validators – Optional validators applied to each new context

apply_chat_template()

apply_chat_template(messages)

Applies the processor’s chat template to the messages.

Parameters:

messages (list[TextGenerationRequestMessage])

Return type:

str

decode()

async decode(encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.

new_context()

async new_context(request)

Creates a new TextAndVisionContext object, using the necessary information from the TextGenerationRequest.

Parameters:

request (TextGenerationRequest)

Return type:

TextAndVisionContext
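
A sketch of the typical request flow. It assumes a PipelineConfig and a TextGenerationRequest built elsewhere by the serving layer, and the model path is a hypothetical example:

from max.pipelines.lib.tokenizer import TextAndVisionTokenizer

async def build_context(pipeline_config, request):
    # "llava-hf/llava-1.5-7b-hf" is a hypothetical example model path;
    # pipeline_config is a PipelineConfig and request is a
    # TextGenerationRequest supplied by the caller.
    tokenizer = TextAndVisionTokenizer(
        "llava-hf/llava-1.5-7b-hf",
        pipeline_config,
        max_length=4096,
        trust_remote_code=False,
    )
    # new_context tokenizes the request (text plus any images) into a
    # TextAndVisionContext ready for the pipeline.
    return await tokenizer.new_context(request)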

TextTokenizer

class max.pipelines.lib.tokenizer.TextTokenizer(model_path, pipeline_config, *, revision=None, max_length=None, trust_remote_code=False, enable_llama_whitespace_fix=False, chat_template=None, context_validators=None, **unused_kwargs)

Encapsulates creation of TextContext and specific token encode/decode logic.

Parameters:

  • model_path (str) – Path to the model/tokenizer
  • revision (str | None) – Git revision/branch to use
  • max_length (int | None) – Maximum sequence length
  • trust_remote_code (bool) – Whether to trust remote code from the model
  • enable_llama_whitespace_fix (bool) – Enable whitespace fix for Llama tokenizers
  • pipeline_config (PipelineConfig) – Optional pipeline configuration
  • chat_template (str | None) – Optional custom chat template string to override the one shipped with the Hugging Face model config. This allows customizing the prompt formatting for different use cases.
  • context_validators (list[Callable[[TextContext], None]] | None)

apply_chat_template()

apply_chat_template(messages, tools, chat_template_options=None)

Applies the delegate’s chat template to the messages (and optional tools).

Parameters:

  • messages (list[TextGenerationRequestMessage])
  • tools
  • chat_template_options

Return type:

str

decode()

async decode(encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str)
  • add_special_tokens (bool)

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.

new_context()

async new_context(request)

Creates a new TextContext object, using the necessary information from the TextGenerationRequest.

Parameters:

request (TextGenerationRequest)

Return type:

TextContext
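
A sketch of text-only usage under the same assumptions: an existing PipelineConfig and TextGenerationRequest, with an arbitrary example model path.

from max.pipelines.lib.tokenizer import TextTokenizer

async def run(pipeline_config, request):
    # The model path is an arbitrary example id.
    tokenizer = TextTokenizer(
        "meta-llama/Llama-3.1-8B-Instruct",
        pipeline_config,
        max_length=8192,
        enable_llama_whitespace_fix=True,  # useful for Llama-family tokenizers
    )
    ids = await tokenizer.encode("Hello!", add_special_tokens=True)
    print(await tokenizer.decode(ids))
    # Or build a full generation context from the request:
    return await tokenizer.new_context(request)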

max_tokens_to_generate()

max.pipelines.lib.tokenizer.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=None)

Returns the max number of new tokens to generate.

Parameters:

  • prompt_size (int)
  • max_length (int | None)
  • max_new_tokens (int | None)

Return type:

int | None
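
Assuming the helper clamps the new-token budget to the remaining context window (max_length minus prompt_size) and to max_new_tokens when given, the expected behavior would be:

from max.pipelines.lib.tokenizer import max_tokens_to_generate

# A 128-token window minus a 100-token prompt leaves room for 28 new tokens.
print(max_tokens_to_generate(prompt_size=100, max_length=128))  # expected: 28

# An explicit max_new_tokens below the remaining room should win.
print(max_tokens_to_generate(prompt_size=100, max_length=128, max_new_tokens=10))  # expected: 10

# With no max_length there is no window to clamp against.
print(max_tokens_to_generate(prompt_size=100, max_length=None))  # expected: None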

run_with_default_executor()

async max.pipelines.lib.tokenizer.run_with_default_executor(fn, *args, **kwargs)

Runs a callable in the default thread pool executor.

Parameters:

  • fn (Callable[_P, _R]) – Callable to run.
  • *args (_P.args) – Positional arguments for fn.
  • **kwargs (_P.kwargs) – Keyword arguments for fn.

Returns:

The result of fn(*args, **kwargs).

Return type:

_R
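
A minimal example of offloading a blocking call so the event loop stays responsive. The blocking function is a stand-in for CPU-bound work such as a synchronous tokenizer call:

import asyncio

from max.pipelines.lib.tokenizer import run_with_default_executor

def blocking_tokenize(text: str, *, upper: bool = False) -> str:
    # Stand-in for CPU-bound work (e.g. a synchronous Hugging Face call).
    return text.upper() if upper else text.lower()

async def main() -> None:
    result = await run_with_default_executor(blocking_tokenize, "Hello", upper=True)
    print(result)  # HELLO

asyncio.run(main())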
