IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.
Skip to main content
For the complete documentation index, see llms.txt. Markdown versions of all pages are available by appending .md to any URL (e.g. /max/get-started.md).

Mojo function

gather_reduce

def gather_reduce[dtype: DType, gather_axis: Int, reduce_axis: Int, simd_width: Int, reduce_fn: def[dtype: DType, width: Int](SIMD[dtype, width], SIMD[dtype, width]) -> SIMD[dtype, width]](output: TileTensor[dtype, address_space=output.address_space, linear_idx_type=output.linear_idx_type, element_size=output.element_size], input: TileTensor[dtype, address_space=input.address_space, linear_idx_type=input.linear_idx_type, element_size=input.element_size], indices: TileTensor[DType.int32, address_space=indices.address_space, linear_idx_type=indices.linear_idx_type, element_size=indices.element_size], reduce_init: Scalar[dtype], ctx: Optional[DeviceContext] = None)

Computes output[i, j, k] = input[indices[i, j], k] and simultaneously reduces the output across axis 1 to produce output[i, k].

The motivating use-case for this is multi-hot embeddings in recommender models. This provides similar functionality to Torch's EmbeddingBag layer. In that context, i is the batch dimension, j is the multi-hot dimension, and k is the embedding dimension.