Ragged tensors
A method for batching multiple requests with differing sequence lengths without padding tokens. Ragged tensors let variable-length sequences be processed together efficiently by storing them in a compact, non-uniform format, typically one flat buffer of concatenated tokens plus offsets that mark where each sequence starts.
Also sometimes referred to as "packed tensors."
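As a rough illustration of the idea, the sketch below packs three sequences of different lengths into one flat list plus a list of row offsets, and compares that with a padded layout. The helper names (pack, unpack, pad) and the token IDs are hypothetical, chosen only for this example; real inference engines keep these buffers on the accelerator and their exact layout may differ.

```python
from itertools import accumulate


def pack(sequences):
    """Concatenate variable-length sequences into flat values plus row offsets."""
    values = [tok for seq in sequences for tok in seq]
    # offsets[i] is the start of sequence i; offsets[-1] is the total token count.
    offsets = [0, *accumulate(len(seq) for seq in sequences)]
    return values, offsets


def unpack(values, offsets):
    """Recover the original sequences by slicing between consecutive offsets."""
    return [values[start:end] for start, end in zip(offsets, offsets[1:])]


def pad(sequences, pad_id=0):
    """Padded alternative: every row is extended to the longest sequence."""
    width = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (width - len(seq)) for seq in sequences]


# Hypothetical token IDs for three requests of lengths 3, 1, and 5.
batch = [[11, 12, 13], [21], [31, 32, 33, 34, 35]]

values, offsets = pack(batch)
padded = pad(batch)

print(values)   # flat buffer holding only real tokens
print(offsets)  # boundaries between sequences
print(len(values), "slots packed vs", sum(len(row) for row in padded), "slots padded")
```

In this toy batch the packed form stores 9 tokens, while the padded form needs 15 slots, 6 of which are padding that would still be computed on.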