class schedule::merge_path::preprocess_t

Merge-path preprocess (host-side helper).

Computes per-block starting coordinates ahead of time so the main kernel can skip its diagonal search. Layout-generic: any layout view satisfying the contract in loops/container/layout.hxx works.
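As a rough illustration of the diagonal search this preprocessing step is meant to hoist out of the main kernel, here is a minimal host-only sketch for a CSR-shaped offsets array. It is not the library's implementation: `coord_t`, `merge_path_search`, and `tile_starts` are hypothetical names, `std::vector` stands in for device storage, and `items_per_tile` is assumed to correspond to THREADS_PER_BLOCK * ITEMS_PER_THREAD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coordinate pair: x = tiles (rows) consumed, y = atoms consumed.
struct coord_t {
  std::size_t x;
  std::size_t y;
};

// Binary search along one diagonal of the merge path between the row-end
// offsets (list A) and the counting sequence 0..num_atoms-1 (list B).
inline coord_t merge_path_search(std::size_t diagonal,
                                 const std::vector<std::size_t>& row_end,
                                 std::size_t num_atoms) {
  std::size_t lo = diagonal > num_atoms ? diagonal - num_atoms : 0;
  std::size_t hi = std::min(diagonal, row_end.size());
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    // B[i] == i, so B[diagonal - mid - 1] is simply diagonal - mid - 1.
    if (row_end[mid] <= diagonal - mid - 1) {
      lo = mid + 1;  // split lies further along A
    } else {
      hi = mid;      // split lies at or before mid
    }
  }
  return {lo, diagonal - lo};
}

// Starting coordinate of every tile, plus one trailing end coordinate.
// Assumption: each tile covers items_per_tile steps of the merge path.
inline std::vector<coord_t> tile_starts(const std::vector<std::size_t>& offsets,
                                        std::size_t items_per_tile) {
  assert(!offsets.empty() && items_per_tile > 0);
  std::vector<std::size_t> row_end(offsets.begin() + 1, offsets.end());
  const std::size_t num_atoms = offsets.back();
  const std::size_t total_work = row_end.size() + num_atoms;
  const std::size_t num_tiles =
      (total_work + items_per_tile - 1) / items_per_tile;

  std::vector<coord_t> coords;
  coords.reserve(num_tiles + 1);
  for (std::size_t t = 0; t <= num_tiles; ++t) {
    const std::size_t diagonal = std::min(t * items_per_tile, total_work);
    coords.push_back(merge_path_search(diagonal, row_end, num_atoms));
  }
  return coords;
}
```

For offsets {0, 2, 2, 5} (three rows, five atoms) and four items per tile, this sketch produces the coordinates (0, 0), (2, 2), and (3, 5). A kernel that reads such a table can start each block at its stored coordinate instead of repeating the binary search.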

Defined in include/loops/schedule/merge_path_flat.hxx

Template Parameters

Parameter            Description
THREADS_PER_BLOCK    Threads per block.
ITEMS_PER_THREAD     Number of items per thread to process.
tiles_type           Tile-id type (used for back-compat ctor).
atoms_type           Atom-id type (used for back-compat ctor).
tile_size_type       Counter type for tiles.
atom_size_type       Counter type for atoms.
layout_type

Members

total_work
num_merge_tiles
d_tile_coordinates
tile_coordinates

Methods

preprocess_t(tiles_iterator_t _tiles, tile_size_t _num_tiles, atom_size_t _num_atoms, cudaStream_t stream=0)
Construct from a CSR-shaped offsets pointer (back-compat shortcut; only valid when layout_type is layout::csr).
preprocess_t(layout_t _layout, cudaStream_t stream=0)
Construct directly from a layout view (any layout type).
__device__ __host__ preprocess_t(preprocess_t const &rhs)
Special copy constructor; we never copy the device_vector as it is not supported within device code.
__device__ __host__ auto data() const
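
The copy-constructor note above describes a common pattern: the owning container stays with the original object, and copies carry only a raw pointer that is cheap to pass by value into a kernel. The following is a minimal host-only sketch of that pattern, not the library's code. `coords_holder_t` and its members are hypothetical, and `std::vector` stands in for the device_vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a type that owns storage and exposes a raw view.
struct coords_holder_t {
  std::vector<int> storage;  // stand-in for the owning device_vector
  int* view = nullptr;       // non-owning pointer into the original storage
  std::size_t count = 0;

  // Members are initialized in declaration order, so storage exists
  // before view takes its address.
  explicit coords_holder_t(std::size_t n)
      : storage(n, 0), view(storage.data()), count(n) {}

  // Copy only the non-owning view; the owning container is left empty.
  coords_holder_t(coords_holder_t const& rhs)
      : storage(), view(rhs.view), count(rhs.count) {}

  // Accessor that works the same on the original and on copies.
  int* data() const { return view; }
};
```

In this sketch a copy is valid only while the original object, which still owns the storage, is alive.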