Python class

TextAndVisionTokenizer

class max.pipelines.TextAndVisionTokenizer(model_path, pipeline_config, *, revision=None, max_length=None, trust_remote_code=False, **unused_kwargs)

Bases: PipelineTokenizer[TextAndVisionContext, ndarray[tuple[Any, ...], dtype[integer[Any]]], TextGenerationRequest]

Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.

Parameters:

apply_chat_template()

apply_chat_template(messages, tools=None, **chat_template_options)

Applies the processor’s chat template to the messages.

Parameters:

  • messages (list[TextGenerationRequestMessage]) – List of messages for the chat template.
  • tools (list[TextGenerationRequestTool] | None) – Optional tools available for the model to invoke.
  • **chat_template_options (Any) – Template options forwarded to the Jinja template. These are merged with the default add_generation_prompt=True.

Returns:

The templated chat message as a string.

Return type:

str
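
The way `**chat_template_options` combines with the default can be pictured with the short sketch below. It is not MAX source code: `merge_template_options` is a hypothetical helper, `enable_thinking` is only an illustrative key, and the idea that a caller-supplied value takes precedence over the `add_generation_prompt=True` default is an assumption that this page does not confirm.

```python
from typing import Any


def merge_template_options(**chat_template_options: Any) -> dict[str, Any]:
    """Hypothetical helper: combine caller options with the documented default."""
    # Start from the documented default, then layer the caller's options on top.
    # Assumption: a caller-supplied add_generation_prompt overrides the default.
    return {"add_generation_prompt": True, **chat_template_options}


# With no options, only the default is present.
default_options = merge_template_options()

# Extra options pass through alongside the default. "enable_thinking" is an
# illustrative key, not a documented template option.
custom_options = merge_template_options(enable_thinking=False)
```

Under that assumption, passing `add_generation_prompt=False` would replace the default rather than raise an error.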

create_eos_tracker()

async create_eos_tracker(request)

Builds an EOSTracker from the request’s sampling parameters and the tokenizer’s default EOS token IDs.

Parameters:

request (TextGenerationRequest)

Return type:

EOSTracker
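
As a rough mental model of combining request-level stop tokens with the tokenizer defaults, consider the sketch below. The real `EOSTracker` interface is not shown on this page, so `ToyEOSTracker`, `build_toy_tracker`, and the `is_eos` method are hypothetical names, and taking a plain union of the two ID sets is an assumption about the behavior.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEOSTracker:
    """Hypothetical stand-in; not the real EOSTracker API."""

    eos_token_ids: set[int] = field(default_factory=set)

    def is_eos(self, token_id: int) -> bool:
        # A token ends generation if it appears in the combined stop set.
        return token_id in self.eos_token_ids


def build_toy_tracker(
    default_eos_ids: set[int], request_stop_token_ids: list[int]
) -> ToyEOSTracker:
    # Assumption: request-level stop tokens are added to the tokenizer
    # defaults rather than replacing them.
    return ToyEOSTracker(set(default_eos_ids) | set(request_stop_token_ids))


# Illustrative token IDs only; real values depend on the model's vocabulary.
tracker = build_toy_tracker({2}, [32000])
```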

decode()

async decode(encoded, **kwargs)

Transforms a provided encoded token array back into readable text.

Parameters:

encoded (ndarray[tuple[Any, ...], dtype[integer[Any]]])

Return type:

str

encode()

async encode(prompt, add_special_tokens=True)

Transforms the provided prompt into a token array.

Parameters:

Return type:

ndarray[tuple[Any, ...], dtype[integer[Any]]]
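
Because encode() and decode() are both coroutines, callers must await them. The sketch below shows that calling pattern using only the signatures listed above. `ToyTokenizer` is a hypothetical stand-in so that the example runs without MAX installed: it maps characters to code points and returns a plain list, whereas the real tokenizer returns a NumPy integer array.

```python
import asyncio


class ToyTokenizer:
    """Hypothetical stand-in that maps each character to its code point."""

    async def encode(self, prompt: str, add_special_tokens: bool = True) -> list[int]:
        # Special tokens are not modeled here; the flag is accepted and ignored.
        return [ord(ch) for ch in prompt]

    async def decode(self, encoded: list[int], **kwargs) -> str:
        return "".join(chr(token_id) for token_id in encoded)


async def round_trip(tokenizer, prompt: str) -> str:
    # Both methods are coroutines, so each call is awaited.
    encoded = await tokenizer.encode(prompt, add_special_tokens=False)
    return await tokenizer.decode(encoded)


# Drive the coroutine from synchronous code.
result = asyncio.run(round_trip(ToyTokenizer(), "hello"))
```

Since `round_trip` relies only on the two method signatures, an object exposing the same `encode` and `decode` coroutines should work in place of the toy class, although a real tokenizer is not guaranteed to reproduce the prompt exactly.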

eos

property eos: int

Returns the end-of-sequence token ID from the delegate.

expects_content_wrapping

property expects_content_wrapping: bool

Returns whether this tokenizer expects content wrapping.

new_context()

async new_context(request)

Creates a new TextAndVisionContext from the information in the given TextGenerationRequest.

Parameters:

request (TextGenerationRequest)

Return type:

TextAndVisionContext

tokenizer_vocab_size

property tokenizer_vocab_size: int

Vocabulary size of the Hugging Face tokenizer delegate.