Mojo function

copy_dram_to_sram

copy_dram_to_sram[src_thread_layout: Layout, dst_thread_layout: Layout = $0, swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), num_threads: Int = $0.size(), thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])

copy_dram_to_sram[src_thread_layout: Layout, dst_thread_layout: Layout = $0, swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), num_threads: Int = $0.size(), thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_base: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])

Copies data from DRAM to SRAM on AMD GPUs. This overload uses the buffer_load intrinsic to load the data and can perform bounds checking. In addition to dst and src, it takes src_base, which is used to construct the buffer descriptor for the src tensor. src_base is the original global memory tensor from which src is derived.
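The role of src_base can be illustrated with a small model. The sketch below is plain Python, not Mojo, and is not the library's implementation. It only mimics the idea that the descriptor is built from the whole allocation (src_base) while the tile being copied (src) is a window into it. The names `BufferDescriptor`, `buffer_load`, and `copy_tile` are hypothetical, and the rule that an out-of-range load yields 0 is an assumption made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class BufferDescriptor:
    """Hypothetical stand-in for a hardware buffer descriptor."""
    data: list        # the whole allocation (plays the role of src_base)
    num_records: int  # number of valid elements in that allocation


def make_descriptor(src_base):
    # The descriptor covers the ORIGINAL tensor, not the tile, which is
    # why the tile alone is not enough to build it.
    return BufferDescriptor(data=src_base, num_records=len(src_base))


def buffer_load(desc, index):
    # Assumption for this sketch: a load whose flat offset falls outside
    # the descriptor's range returns 0 instead of faulting.
    if 0 <= index < desc.num_records:
        return desc.data[index]
    return 0


def copy_tile(src_base, cols, row0, col0, tile_rows, tile_cols):
    """Copy a tile_rows x tile_cols window of a row-major matrix.

    src_base is the flat row-major storage with `cols` columns;
    (row0, col0) is the tile's top-left corner.
    """
    desc = make_descriptor(src_base)
    tile = []
    for r in range(tile_rows):
        row = []
        for c in range(tile_cols):
            # Offset of the element relative to the start of src_base.
            index = (row0 + r) * cols + (col0 + c)
            row.append(buffer_load(desc, index))
        tile.append(row)
    return tile


# A 3 x 4 matrix holding 1..12 in row-major order.
base = list(range(1, 13))

# The 2 x 2 tile at (2, 2) hangs off the bottom edge of the matrix.
edge_tile = copy_tile(base, cols=4, row0=2, col0=2, tile_rows=2, tile_cols=2)
print(edge_tile)
```

In this model the check is on the flat offset only, so it catches a tile that runs past the end of the allocation but not one that runs past the right edge of a row.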

copy_dram_to_sram[thread_layout: Layout, swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), num_threads: Int = $0.size(), thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_base: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])

copy_dram_to_sram[thread_layout: Layout, swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), num_threads: Int = $0.size(), thread_scope: ThreadScope = ThreadScope(0)](dst: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])