Mojo function

copy_kv_pages_d2h

copy_kv_pages_d2h[dtype: DType](device_kv_blocks: LayoutTensor[dtype, Layout.row_major[6](), device_kv_blocks.origin], host_kv_blocks: LayoutTensor[dtype, Layout.row_major[6](), host_kv_blocks.origin], src_page_ids: LayoutTensor[DType.int64, Layout.row_major[1](), src_page_ids.origin], dst_page_ids: LayoutTensor[DType.int64, Layout.row_major[1](), dst_page_ids.origin], layer_idx: Int, ctx: DeviceContext)

Copy selected pages for a single layer from device to host KV cache.

This function performs an asynchronous GPU-to-CPU copy using enqueue_copy. For each page, it copies only the specified layer. Source and destination page IDs are passed separately, so the device and host caches can use independent page ID spaces.

The 6D tensor layout is: [num_pages, kv_dim, num_layers, page_size, num_heads, head_dim]
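To make the layout concrete, here is a minimal host-side sketch in plain Python (not Mojo, and not the actual implementation) of which elements a single-layer page copy touches. It assumes a dense row-major buffer, as the Layout.row_major[6]() parameters suggest, and assumes the device and host caches share every dimension except num_pages. The helpers layer_chunks and copy_pages_reference are hypothetical names introduced only for illustration. Under these assumptions, one (page, layer) pair maps to kv_dim contiguous runs of page_size * num_heads * head_dim elements each.

```python
def layer_chunks(shape, page_id, layer_idx):
    """Return (offset, length) pairs, in elements, for one layer of one page.

    shape is (num_pages, kv_dim, num_layers, page_size, num_heads, head_dim),
    assumed to be stored densely in row-major order, so the last three
    dimensions form one contiguous run.
    """
    num_pages, kv_dim, num_layers, page_size, num_heads, head_dim = shape
    if not 0 <= page_id < num_pages:
        raise IndexError("page_id out of range")
    if not 0 <= layer_idx < num_layers:
        raise IndexError("layer_idx out of range")
    run = page_size * num_heads * head_dim   # elements in one (page, kv, layer) slab
    kv_stride = num_layers * run             # step between kv indices
    page_stride = kv_dim * kv_stride         # step between pages
    return [
        (page_id * page_stride + kv * kv_stride + layer_idx * run, run)
        for kv in range(kv_dim)
    ]


def copy_pages_reference(device, host, device_shape, host_shape,
                         src_page_ids, dst_page_ids, layer_idx):
    """Synchronous reference model of the copy, operating on flat Python lists."""
    if len(src_page_ids) != len(dst_page_ids):
        raise ValueError("src_page_ids and dst_page_ids must have equal length")
    if tuple(device_shape[1:]) != tuple(host_shape[1:]):
        # Assumption of this sketch: only num_pages may differ.
        raise ValueError("device and host shapes may differ only in num_pages")
    for src, dst in zip(src_page_ids, dst_page_ids):
        src_chunks = layer_chunks(device_shape, src, layer_idx)
        dst_chunks = layer_chunks(host_shape, dst, layer_idx)
        for (s_off, n), (d_off, _) in zip(src_chunks, dst_chunks):
            host[d_off:d_off + n] = device[s_off:s_off + n]


# Small example: 4 device pages, 2 host pages, kv_dim=2, 3 layers.
device_shape = (4, 2, 3, 2, 1, 2)
host_shape = (2, 2, 3, 2, 1, 2)
device = list(range(4 * 2 * 3 * 2 * 1 * 2))   # element value equals its offset
host = [-1] * (2 * 2 * 3 * 2 * 1 * 2)

# Copy layer 1 of device page 3 into host page 0.
copy_pages_reference(device, host, device_shape, host_shape, [3], [0], 1)
```

In this example only two runs of four elements change in host (offsets 4..7 and 16..19), which is why separate source and destination page IDs are enough to describe the transfer: the layer index and the shared trailing dimensions determine everything else.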

Args:

device_kv_blocks (LayoutTensor): Source GPU KV cache blocks.
host_kv_blocks (LayoutTensor): Destination CPU KV cache blocks.
src_page_ids (LayoutTensor): 1D int64 tensor of source page IDs in the GPU cache.
dst_page_ids (LayoutTensor): 1D int64 tensor of destination page IDs in the CPU cache.
layer_idx (Int): Index of the layer to copy.
ctx (DeviceContext): Device context for GPU operations.