Reproducing Results
Pre-generated CSVs and the performance_evaluation.ipynb notebook used to recreate every labeled figure in the paper live in plots/.
End-to-end run
- Update the dataset path in scripts/run.sh:
  - Set DATASET_DIR to the directory containing the MM subdirectory (which itself holds per-matrix subdirectories of .mtx files). See Datasets for how to obtain it.
  - Optionally change DATASET_FILES_NAME; it defaults to datasets/suitesparse.txt, the paper's matrix list.
- Kick off the run from the scripts/ directory:
    cd scripts && ./run.sh

  A complete run can take up to about 3 days because it sweeps the entire SuiteSparse collection four times, once per algorithm. The dominant cost is parsing .mtx files from disk, not GPU time.
- Expect occasional matrix-level failures. Some files in SuiteSparse are mislabeled or otherwise malformed, and a few matrices are larger than the GPU can hold. These produce runtime exceptions or offset_t overflows and can be safely ignored; they don't affect the rest of the sweep.
- To run on a smaller subset, edit the stop condition at scripts/run.sh#L22 (it caps at the first 10 matrices by default). To process every .mtx file under DATASET_DIR, remove the conditional in L22–L26.
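Before committing to a multi-day sweep, it can help to confirm that the dataset directory has the expected shape. The following is a minimal Python sketch and not part of the repository: the helper names find_mtx_files and has_matrix_market_banner are hypothetical. It assumes only that the .mtx files sit somewhere under DATASET_DIR/MM and that a well-formed Matrix Market file begins with the %%MatrixMarket banner line. It will not catch every malformed file, nor matrices that are too large for the GPU.

```python
from pathlib import Path


def find_mtx_files(dataset_dir):
    """Return every .mtx file found under <dataset_dir>/MM, sorted by path."""
    mm_root = Path(dataset_dir) / "MM"
    if not mm_root.is_dir():
        raise FileNotFoundError(f"expected an MM subdirectory in {dataset_dir}")
    # rglob searches recursively, so the nesting depth below MM is not assumed.
    return sorted(mm_root.rglob("*.mtx"))


def has_matrix_market_banner(path):
    """Check that the first line starts with the Matrix Market banner."""
    with open(path, "r", errors="replace") as f:
        return f.readline().startswith("%%MatrixMarket")


def suspect_files(dataset_dir):
    """List .mtx files whose first line is not a Matrix Market banner."""
    return [p for p in find_mtx_files(dataset_dir)
            if not has_matrix_market_banner(p)]
```

Calling suspect_files with the same path you set as DATASET_DIR gives a quick list of files that are likely to fail early; an empty list does not guarantee a clean sweep.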
Output
Each algorithm writes a CSV alongside run.sh. Drop these into plots/data/ to replace the bundled results, then open plots/performance_evaluation.ipynb in Jupyter (install it with pip install jupyter or by following the official instructions) and re-run all cells to regenerate the figures.
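The copy step can also be scripted. This is a small Python sketch under stated assumptions: the function name install_results is hypothetical, the result files are assumed to end in .csv, and their names are assumed to already match what the notebook expects. Same-named files in the destination are overwritten, so back up the bundled results first if you want to keep them.

```python
import shutil
from pathlib import Path


def install_results(scripts_dir, data_dir):
    """Copy every CSV in scripts_dir into data_dir; return the copied names."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for csv_path in sorted(Path(scripts_dir).glob("*.csv")):
        # copy2 overwrites an existing file with the same name.
        shutil.copy2(csv_path, data_dir / csv_path.name)
        copied.append(csv_path.name)
    return copied
```

From the repository root, install_results("scripts", "plots/data") would perform the copy described above.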
A representative excerpt of the CSV output:
kernel,dataset,rows,cols,nnzs,elapsed
merge-path,144,144649,144649,2148786,0.0720215
merge-path,08blocks,300,300,592,0.0170898
merge-path,1138_bus,1138,1138,4054,0.0200195
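For a quick look at results outside the notebook, the rows can be read with Python's standard csv module. This is a minimal sketch, not the notebook's own loading code: the function names load_results and total_elapsed_by_kernel are hypothetical, and the column names are taken from the header shown above. The unit of elapsed is not stated here, so the sketch only sums the values as they appear.

```python
import csv
import io
from collections import defaultdict

# The excerpt shown above, embedded so the sketch runs on its own.
SAMPLE = """kernel,dataset,rows,cols,nnzs,elapsed
merge-path,144,144649,144649,2148786,0.0720215
merge-path,08blocks,300,300,592,0.0170898
merge-path,1138_bus,1138,1138,4054,0.0200195
"""


def load_results(text_stream):
    """Parse result rows, converting the numeric columns."""
    rows = []
    for rec in csv.DictReader(text_stream):
        rows.append({
            "kernel": rec["kernel"],
            "dataset": rec["dataset"],  # kept as text: a name like "144" is not a number
            "rows": int(rec["rows"]),
            "cols": int(rec["cols"]),
            "nnzs": int(rec["nnzs"]),
            "elapsed": float(rec["elapsed"]),
        })
    return rows


def total_elapsed_by_kernel(rows):
    """Sum the elapsed column per kernel name."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["kernel"]] += r["elapsed"]
    return dict(totals)


results = load_results(io.StringIO(SAMPLE))
print(len(results), "rows;", total_elapsed_by_kernel(results))
```

To read a file produced by a run instead of the embedded sample, pass an open file object, for example open(path, newline=""), in place of the io.StringIO wrapper.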