class
schedule::merge_path::preprocess_t
Merge-path preprocess (host-side helper).
Computes per-block starting coordinates ahead of time so the main kernel can skip its diagonal search. Layout-generic: any layout view satisfying the contract in loops/container/layout.hxx works.
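As a rough illustration of what such a preprocess step computes, the following host-only sketch runs a merge-path diagonal search once per block over a CSR offsets array. It is not the library's implementation: `coordinate_t`, `merge_path_search`, and `precompute_tile_coordinates` are hypothetical names, and the sketch assumes that tiles correspond to rows, atoms to nonzeros, and that each block consumes `THREADS_PER_BLOCK * ITEMS_PER_THREAD` items of the merged path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical coordinate on the merge path: tiles (rows) and atoms (nonzeros) consumed.
struct coordinate_t {
  std::int64_t tile;
  std::int64_t atom;
};

// Binary search along one diagonal of the merge grid.
// List A is the row-end offsets (offsets[i + 1]); list B is the atom indices 0, 1, 2, ...
inline coordinate_t merge_path_search(std::int64_t diagonal,
                                      const std::vector<std::int64_t>& offsets) {
  assert(!offsets.empty());
  const std::int64_t num_tiles = static_cast<std::int64_t>(offsets.size()) - 1;
  const std::int64_t num_atoms = offsets.back();

  std::int64_t lo = std::max(diagonal - num_atoms, std::int64_t{0});
  std::int64_t hi = std::min(diagonal, num_tiles);
  while (lo < hi) {
    const std::int64_t pivot = (lo + hi) / 2;
    // Row `pivot` ends at offsets[pivot + 1]; compare it with atom index (diagonal - pivot - 1).
    if (offsets[pivot + 1] <= diagonal - pivot - 1) {
      lo = pivot + 1;  // this row end lies before the diagonal
    } else {
      hi = pivot;
    }
  }
  return {std::min(lo, num_tiles), diagonal - lo};
}

// One starting coordinate per block, plus a final entry marking the end of the path.
inline std::vector<coordinate_t> precompute_tile_coordinates(
    const std::vector<std::int64_t>& offsets,
    std::int64_t threads_per_block,
    std::int64_t items_per_thread) {
  const std::int64_t num_tiles = static_cast<std::int64_t>(offsets.size()) - 1;
  const std::int64_t total_work = num_tiles + offsets.back();
  const std::int64_t items_per_block = threads_per_block * items_per_thread;
  const std::int64_t num_merge_tiles =
      (total_work + items_per_block - 1) / items_per_block;

  std::vector<coordinate_t> coords(static_cast<std::size_t>(num_merge_tiles + 1));
  for (std::int64_t b = 0; b <= num_merge_tiles; ++b) {
    const std::int64_t diagonal = std::min(b * items_per_block, total_work);
    coords[static_cast<std::size_t>(b)] = merge_path_search(diagonal, offsets);
  }
  return coords;
}
```

With the coordinates stored ahead of time, a kernel block `b` could read entries `b` and `b + 1` to learn where its share of the path starts and ends, instead of searching for them itself.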
Defined in include/loops/schedule/merge_path_flat.hxx
Template Parameters
| Parameter | Description |
|---|---|
| THREADS_PER_BLOCK | Threads per block. |
| ITEMS_PER_THREAD | Number of items per thread to process. |
| tiles_type | Tile-id type (used for back-compat ctor). |
| atoms_type | Atom-id type (used for back-compat ctor). |
| tile_size_type | Counter type for tiles. |
| atom_size_type | Counter type for atoms. |
| layout_type | |
Members
total_work
num_merge_tiles
d_tile_coordinates
tile_coordinates
Methods
preprocess_t(tiles_iterator_t _tiles, tile_size_t _num_tiles, atom_size_t _num_atoms, cudaStream_t stream=0)
Construct from a CSR-shaped offsets pointer (back-compat shortcut; only valid when layout_type is layout::csr).
preprocess_t(layout_t _layout, cudaStream_t stream=0)
Construct directly from a layout view (any layout type).
__device__ __host__ preprocess_t(preprocess_t const &rhs)
Special copy constructor; we never copy the device_vector as it is not supported within device code.
__device__ __host__ auto data() const
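The copy-constructor note above describes a common pattern: one object owns the storage, and copies carry only a raw pointer to it. The following host-only sketch illustrates that pattern under stated assumptions. `preprocess_sketch_t` and its members are hypothetical, and `std::vector` stands in for the device vector so the example compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coordinate type used only for this illustration.
struct coordinate_t {
  int tile;
  int atom;
};

class preprocess_sketch_t {
 public:
  // The original object allocates the storage and caches a raw pointer to it.
  explicit preprocess_sketch_t(std::size_t n)
      : storage_(n), view_(storage_.data()), size_(n) {}

  // Shallow copy: only the raw pointer and size are copied. The owning
  // container is left empty in the copy, so the copy never allocates.
  preprocess_sketch_t(preprocess_sketch_t const& rhs)
      : storage_(), view_(rhs.view_), size_(rhs.size_) {}

  coordinate_t* data() const { return view_; }
  std::size_t size() const { return size_; }
  bool owns_storage() const { return !storage_.empty(); }

 private:
  std::vector<coordinate_t> storage_;  // stand-in for the device vector (owner only)
  coordinate_t* view_;                 // raw pointer that copies share
  std::size_t size_;
};
```

The trade-off in this pattern is lifetime: a copy is valid only while the original owner is alive, so the owner has to outlive every use of the copies.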