Python module

tokenizer

Implementations of provided tokenizers.

IdentityPipelineTokenizer

class max.pipelines.lib.tokenizer.IdentityPipelineTokenizer(*args, **kwargs)

decode()

async decode(context, encoded, **kwargs)

Decodes response tokens to text.

Parameters:

  • context (TokenGeneratorContext ) – Current generation context.
  • encoded (TokenizerEncoded ) – Encoded response tokens.

Returns:

Un-encoded response text.

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Encodes text prompts as tokens.

Parameters:

  • prompt (str ) – Un-encoded prompt text.
  • add_special_tokens (bool )

Raises:

ValueError – If the prompt exceeds the configured maximum length.

Return type:

str
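
Taken together, encode() and decode() make this a pass-through tokenizer: text in, text out. A minimal round-trip sketch, assuming the class can be constructed without arguments and that decode() simply returns the encoded string (the None context below is a stand-in for a real generation context):

import asyncio

from max.pipelines.lib.tokenizer import IdentityPipelineTokenizer

async def main() -> None:
    tokenizer = IdentityPipelineTokenizer()
    # encode() returns the prompt itself rather than an array of token IDs.
    encoded = await tokenizer.encode("hello world")
    # For an identity tokenizer the encoded value is already text.
    text = await tokenizer.decode(None, encoded)
    print(text)  # expected: hello world

asyncio.run(main())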

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a content property.

Text messages are formatted as:

{ "type": "text", "content": "text content" }
{ "type": "text", "content": "text content" }

instead of the OpenAI spec:

{ "type": "text", "text": "text content" }
{ "type": "text", "text": "text content" }

NOTE: Multimodal messages omit the content property. Both image_urls and image content parts are converted to:

{ "type": "image" }
{ "type": "image" }

Their content is provided as byte arrays through the top-level property on the request object, i.e., PipelineTokenizerRequest.images.
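
A small helper sketch showing the conversion; the function is illustrative and not part of the library:

def wrap_content(parts: list[dict]) -> list[dict]:
    """Rewrite OpenAI-style message parts into the content-wrapped form."""
    wrapped = []
    for part in parts:
        if part.get("type") == "text":
            wrapped.append({"type": "text", "content": part["text"]})
        elif part.get("type") in ("image_url", "image"):
            # Image payloads are dropped here; the bytes travel on the
            # top-level PipelineTokenizerRequest.images property instead.
            wrapped.append({"type": "image"})
        else:
            wrapped.append(part)
    return wrapped

print(wrap_content([
    {"type": "text", "text": "describe this image"},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
]))
# [{'type': 'text', 'content': 'describe this image'}, {'type': 'image'}]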

PreTrainedPipelineTokenizer

class max.pipelines.lib.tokenizer.PreTrainedPipelineTokenizer(delegate)

Parameters:

delegate (Union [ PreTrainedTokenizer , PreTrainedTokenizerFast ] )
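
A construction sketch, assuming the transformers package is installed and the gpt2 checkpoint is available; the model choice is illustrative:

import asyncio

from transformers import AutoTokenizer
from max.pipelines.lib.tokenizer import PreTrainedPipelineTokenizer

async def main() -> None:
    delegate = AutoTokenizer.from_pretrained("gpt2")
    tokenizer = PreTrainedPipelineTokenizer(delegate)
    tokens = await tokenizer.encode("hello world")
    print(tokens)         # ndarray of token IDs
    print(tokenizer.eos)  # the delegate's end-of-sequence token ID

asyncio.run(main())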

apply_chat_template()

apply_chat_template(messages)

Parameters:

messages (list [ TokenGeneratorRequestMessage ] )

Return type:

str
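
Continuing the sketch above, and assuming TokenGeneratorRequestMessage follows the usual role/content chat shape (an assumption; gpt2 itself ships no chat template, so a chat-tuned checkpoint would be needed in practice):

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a tokenizer?"},
]
prompt = tokenizer.apply_chat_template(messages)
print(prompt)  # single prompt string rendered by the delegate's chat template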

decode()

async decode(context, encoded, **kwargs)

Decodes response tokens to text.

Parameters:

  • context (TokenGeneratorContext ) – Current generation context.
  • encoded (TokenizerEncoded ) – Encoded response tokens.

Returns:

Un-encoded response text.

Return type:

str

encode()

async encode(prompt, add_special_tokens=False)

Encodes text prompts as tokens.

Parameters:

  • prompt (str ) – Un-encoded prompt text.
  • add_special_tokens (bool )

Raises:

ValueError – If the prompt exceeds the configured maximum length.

Return type:

ndarray

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a content property.

Text messages are formatted as:

{ "type": "text", "content": "text content" }
{ "type": "text", "content": "text content" }

instead of the OpenAI spec:

{ "type": "text", "text": "text content" }
{ "type": "text", "text": "text content" }

NOTE: Multimodal messages omit the content property. Both image_urls and image content parts are converted to:

{ "type": "image" }
{ "type": "image" }

Their content is provided as byte arrays through the top-level property on the request object, i.e., PipelineTokenizerRequest.images.

TextAndVisionTokenizer

class max.pipelines.lib.tokenizer.TextAndVisionTokenizer(model_path, *, revision=None, max_length=None, max_new_tokens=None, trust_remote_code=False, **unused_kwargs)

Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.

Parameters:

  • model_path (str )
  • revision (str | None )
  • max_length (int | None )
  • max_new_tokens (int | None )
  • trust_remote_code (bool )
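
A construction sketch; the model path below is illustrative, and any multimodal checkpoint with a Hugging Face tokenizer/processor could stand in:

from max.pipelines.lib.tokenizer import TextAndVisionTokenizer

tokenizer = TextAndVisionTokenizer(
    "llava-hf/llava-1.5-7b-hf",  # illustrative model_path
    max_length=4096,
    max_new_tokens=512,
    trust_remote_code=False,
)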

apply_chat_template()

apply_chat_template(messages)

Parameters:

messages (list [ TokenGeneratorRequestMessage ] )

Return type:

str

decode()

async decode(context, encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

  • context (TokenGeneratorContext ) – Current generation context.
  • encoded (TokenizerEncoded ) – Encoded response tokens.

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str ) – Un-encoded prompt text.
  • add_special_tokens (bool )

Return type:

ndarray

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a content property.

Text messages are formatted as:

{ "type": "text", "content": "text content" }
{ "type": "text", "content": "text content" }

instead of the OpenAI spec:

{ "type": "text", "text": "text content" }
{ "type": "text", "text": "text content" }

NOTE: Multimodal messages omit the content property. Both image_urls and image content parts are converted to:

{ "type": "image" }
{ "type": "image" }

Their content is provided as byte arrays through the top-level property on the request object, i.e., PipelineTokenizerRequest.images.

new_context()

async new_context(request)

Creates a new TextAndVisionContext object, pulling the necessary information, such as cache_seq_id and the prompt, from the TokenGeneratorRequest.

Parameters:

request (TokenGeneratorRequest )

Return type:

TextAndVisionContext

TextTokenizer

class max.pipelines.lib.tokenizer.TextTokenizer(model_path, *, revision=None, max_length=None, max_new_tokens=None, trust_remote_code=False, enable_llama_whitespace_fix=False, **unused_kwargs)

Encapsulates creation of TextContext and specific token encode/decode logic.

Parameters:

  • model_path (str )
  • revision (str | None )
  • max_length (int | None )
  • max_new_tokens (int | None )
  • trust_remote_code (bool )
  • enable_llama_whitespace_fix (bool )
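
A construction sketch; the model path is illustrative:

from max.pipelines.lib.tokenizer import TextTokenizer

tokenizer = TextTokenizer(
    "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model_path
    max_length=8192,
    max_new_tokens=256,
    # Judging by its name, this flag works around Llama-style tokenizers
    # that drop a leading space when decoding tokens one at a time.
    enable_llama_whitespace_fix=True,
)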

apply_chat_template()

apply_chat_template(messages, tools, chat_template_options=None)

Parameters:

  • messages (list [ TokenGeneratorRequestMessage ] )
  • tools
  • chat_template_options

Return type:

str

decode()

async decode(context, encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

  • context (TokenGeneratorContext ) – Current generation context.
  • encoded (TokenizerEncoded ) – Encoded response tokens.

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

  • prompt (str ) – Un-encoded prompt text.
  • add_special_tokens (bool )

Return type:

ndarray

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a content property.

Text messages are formatted as:

{ "type": "text", "content": "text content" }
{ "type": "text", "content": "text content" }

instead of the OpenAI spec:

{ "type": "text", "text": "text content" }
{ "type": "text", "text": "text content" }

NOTE: Multimodal messages omit the content property. Both image_urls and image content parts are converted to:

{ "type": "image" }
{ "type": "image" }

Their content is provided as byte arrays through the top-level property on the request object, i.e., PipelineTokenizerRequest.images.

new_context()

async new_context(request)

Creates a new TextContext object, pulling the necessary information, such as cache_seq_id and the prompt, from the TokenGeneratorRequest.

Parameters:

request (TokenGeneratorRequest )

Return type:

TextContext

max_tokens_to_generate()

max.pipelines.lib.tokenizer.max_tokens_to_generate(prompt_size, max_length, max_new_tokens=None)

Returns the maximum number of new tokens to generate.

Parameters:

  • prompt_size (int )
  • max_length (int | None )
  • max_new_tokens (int | None )

Return type:

int | None
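
The documented signature suggests the following behavior; this sketch is an assumption written out for illustration, not the library source:

def max_tokens_sketch(prompt_size, max_length, max_new_tokens=None):
    # No max_length means no context-window bound to apply.
    if max_length is None:
        return max_new_tokens
    remaining = max_length - prompt_size  # room left in the context window
    if max_new_tokens is None:
        return remaining
    return min(max_new_tokens, remaining)

# A 100-token prompt in a 512-token window, capped at 1000 new tokens:
print(max_tokens_sketch(100, 512, 1000))  # 412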

run_with_default_executor()

async max.pipelines.lib.tokenizer.run_with_default_executor(fn, *args)
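
No description accompanies this function. Judging by its name, it likely awaits fn(*args) on the event loop's default thread-pool executor, which keeps blocking tokenizer work off the loop. A usage sketch under that assumption:

import asyncio

from max.pipelines.lib.tokenizer import run_with_default_executor

def blocking_tokenize(text: str) -> list[str]:
    return text.split()  # stand-in for a CPU-bound tokenizer call

async def main() -> None:
    tokens = await run_with_default_executor(blocking_tokenize, "a b c")
    print(tokens)  # ['a', 'b', 'c']

asyncio.run(main())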