Mojo function
copy_local_to_dram
copy_local_to_dram[dst_thread_layout: Layout, thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])
copy_local_to_dram[dst_thread_layout: Layout, thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], dst_base: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])
Used to copy data from registers to DRAM for AMD GPUs. It uses buffer_store intrinsic to store data and can check for bounds. In addition to dst and src, it takes dst_base as an argument to construct the buffer descriptor of the dst tensor. dst_base is the original global memory tensor from which dst is derived.
Was this page helpful?
Thank you! We'll create more content like this.
Thank you for helping us improve!