Mojo function

avg_pool_gpu

avg_pool_gpu[dtype: DType, int_type: DType, count_boundary: Bool = False](ctx: DeviceContext, input: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], filter: LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], strides: LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], dilations: LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], paddings: LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], output: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], ceil_mode: Bool = False)

Computes 2D average pooling on the GPU.

Params:

count_boundary (Bool): Whether boundary (padding) positions are counted when computing the average.
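As a rough illustration of what counting the boundary can mean, the following plain-Python sketch averages a 1D sequence with zero padding. It is not the Mojo kernel: the helper name `avg_pool_1d` is hypothetical, and the interpretation (divide by the full window size when `count_boundary` is true, otherwise by the number of in-bounds elements) is an assumption based on the common convention for average pooling, not something this page states.

```python
def avg_pool_1d(values, filter_size, stride, pad_before, pad_after,
                count_boundary=False):
    """Hypothetical 1D reference model of average pooling with padding."""
    n = len(values)
    out = []
    start = -pad_before  # the first window begins inside the leading padding
    # Slide the window while it still fits inside the padded extent.
    while start + filter_size <= n + pad_after:
        # Keep only in-bounds elements; padded positions contribute zero.
        window = [values[i] for i in range(start, start + filter_size)
                  if 0 <= i < n]
        if count_boundary:
            divisor = filter_size            # padded positions are counted
        else:
            divisor = max(len(window), 1)    # only real elements are counted
        out.append(sum(window) / divisor)
        start += stride
    return out


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0]
    print(avg_pool_1d(data, 2, 1, 1, 1, count_boundary=False))
    print(avg_pool_1d(data, 2, 1, 1, 1, count_boundary=True))
```

Under this assumption the two settings differ only for windows that overlap the padding; interior windows give the same result either way.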

Args:

ctx (DeviceContext): The DeviceContext to use for GPU execution.
input (LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On device) Batched image input to the pool2d operator.
filter (LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On host) Filter size along the height and width dimensions, in the order (filter_h, filter_w).
strides (LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On host) Strides along the height and width dimensions, in the order (stride_h, stride_w).
dilations (LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On host) Dilations along the height and width dimensions, in the order (dilation_h, dilation_w).
paddings (LayoutTensor[int_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On host) Paddings along the height and width dimensions, in the order (pad_h_before, pad_h_after, pad_w_before, pad_w_after).
output (LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment]): (On device) Pre-allocated output tensor space.
ceil_mode (Bool): Whether the output shape is computed with ceiling rather than floor rounding; this also determines the implicit padding.
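Because output must be pre-allocated, the caller needs to know its spatial size in advance. The sketch below shows the formula commonly used for pooling output dimensions, written in plain Python for clarity. The helper name `pool_output_dim` is hypothetical, and treating `ceil_mode` as a switch between floor and ceiling division is an assumption based on the usual convention, not a statement about the exact rule this kernel applies.

```python
def pool_output_dim(in_dim, filter_size, stride, dilation,
                    pad_before, pad_after, ceil_mode=False):
    """Hypothetical helper: output length along one spatial dimension."""
    # A dilated filter covers dilation * (filter_size - 1) + 1 input positions.
    effective_filter = dilation * (filter_size - 1) + 1
    # Distance the window can travel across the padded input.
    span = in_dim + pad_before + pad_after - effective_filter
    if ceil_mode:
        steps = -(-span // stride)  # ceiling division on integers
    else:
        steps = span // stride      # floor division
    return steps + 1


if __name__ == "__main__":
    # Height 5, filter 2, stride 2, no dilation, no padding.
    print(pool_output_dim(5, 2, 2, 1, 0, 0, ceil_mode=False))
    print(pool_output_dim(5, 2, 2, 1, 0, 0, ceil_mode=True))
```

With ceiling rounding the last window can extend past the explicitly padded input, which is one way to read the "implicit padding" mentioned for ceil_mode. Apply the helper once with the height values and once with the width values to get both output dimensions.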