Mojo function
copy_dram_to_local
copy_dram_to_local[src_thread_layout: Layout, thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_base: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])
Copies data from DRAM to registers on AMD GPUs. It uses the buffer_load intrinsic to load the data and can perform bounds checking. In addition to dst and src, it takes src_base, which is used to construct the buffer descriptor for the src tensor. src_base is the original global memory tensor from which src is derived.
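The role of src_base can be illustrated with a small conceptual model in plain Python. This is not Mojo and not the library's implementation. The names BufferDescriptor, make_descriptor, buffer_load, and copy_tile_to_local are hypothetical. The sketch assumes that an out-of-range buffer load returns zero and that the range check applies to the flat offset into the base tensor. The point is that the descriptor has to describe the whole base tensor, so a tile view on its own does not carry enough information for the bounds check.

```python
from dataclasses import dataclass


@dataclass
class BufferDescriptor:
    """Hypothetical stand-in for a buffer descriptor: base storage plus its extent."""
    data: list          # flat, row-major storage of the whole base tensor
    num_elements: int   # valid flat offsets are [0, num_elements)


def make_descriptor(src_base):
    # The descriptor covers the entire base tensor, not just one tile of it.
    return BufferDescriptor(data=src_base, num_elements=len(src_base))


def buffer_load(desc, offset):
    # Assumed behavior: an out-of-range offset yields 0 instead of faulting.
    # Only the flat offset is checked, not the row or column individually.
    if 0 <= offset < desc.num_elements:
        return desc.data[offset]
    return 0


def copy_tile_to_local(desc, tile_row, tile_col, tile_rows, tile_cols, row_stride):
    """Load a tile starting at (tile_row, tile_col) into a flat 'register' list."""
    local = []
    for r in range(tile_rows):
        for c in range(tile_cols):
            # Offsets are computed relative to the base tensor, not the tile.
            offset = (tile_row + r) * row_stride + (tile_col + c)
            local.append(buffer_load(desc, offset))
    return local


# A 3x4 row-major base tensor holding 0..11.
base = list(range(12))
desc = make_descriptor(base)

# A 2x2 tile fully inside the tensor.
inside = copy_tile_to_local(desc, 0, 2, 2, 2, row_stride=4)

# A 2x4 tile whose second row falls past the end of the tensor.
overhang = copy_tile_to_local(desc, 2, 0, 2, 4, row_stride=4)
```

In this model, inside holds the four in-range elements, and the second row of overhang is filled with zeros because its offsets lie beyond the extent recorded in the descriptor.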