Datasets
loops ships with a single bundled matrix, datasets/chesapeake/chesapeake.mtx, so all examples can be run as smoke tests right out of the box. For benchmarking and the paper's experiments we use the SuiteSparse Matrix Collection.[^1]
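As a quick sanity check before running the examples, the sketch below reads the dimensions of a MatrixMarket file from the shell. It relies only on the MatrixMarket convention that the first non-comment line of a coordinate file holds `rows cols nnz`; the helper name `mtx_dims` is made up for illustration and is not part of loops.

```shell
# Print "rows cols nnz" from the size line of a MatrixMarket coordinate file.
# The size line is the first non-empty line that does not start with '%'.
# (mtx_dims is a hypothetical helper, not something loops provides.)
mtx_dims() {
  awk '!/^%/ && NF { print $1, $2, $3; exit }' "$1"
}

# Only run against the bundled matrix if we are at the repository root.
if [ -f datasets/chesapeake/chesapeake.mtx ]; then
  mtx_dims datasets/chesapeake/chesapeake.mtx
fi
```

If this prints three integers, the bundled file is present and readable, which is all a smoke test needs.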
Downloading SuiteSparse
The full collection is large, so the download takes a long time and a lot of disk space; we recommend running the command below inside a tmux session.
wget --recursive --no-parent --force-directories -l inf -X RB,mat \
--accept "*.tar.gz" \
"https://suitesparse-collection-website.herokuapp.com/"
| Flag | Purpose |
|---|---|
| `--recursive` | Recursively follow links |
| `--no-parent` | Don't fetch links above the starting URL |
| `-l inf` | No depth limit |
| `-X RB,mat` | Skip the RB and mat subdirs (we only want MatrixMarket) |
| `--accept "*.tar.gz"` | Only download tarballs |
| `--force-directories` | Preserve the original site hierarchy |
Total downloaded size: ~887 GB (uncompressed + compressed).
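Given that footprint, it may be worth checking free space before starting the download. The following is a minimal sketch, not part of loops: `has_free_gb` is a hypothetical helper, and it treats the 887 figure as GiB, which is slightly conservative.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB GiB available.
# (has_free_gb is a hypothetical helper for illustration.)
has_free_gb() {
  dir=$1
  need_gb=$2
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # on the second line, column 4 is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

if has_free_gb . 887; then
  echo "enough space for the full collection"
else
  echo "less than 887 GB free; consider a larger volume or a subset" >&2
fi
```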
Uncompressing
After wget completes, decompress every tarball in place:
find . -name '*.tar.gz' -execdir tar -xzvf '{}' \;
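To confirm that nothing was skipped, one option is to list the tarballs that have no extracted matrix next to them. This sketch assumes each `NAME.tar.gz` unpacks into a sibling directory `NAME/` containing `NAME.mtx`; verify that layout on a few matrices before relying on it. The helper name `missing_extractions` is invented for this example.

```shell
# List tarballs under ROOT that have no extracted NAME/NAME.mtx beside them.
# Assumption: NAME.tar.gz unpacks into a sibling directory NAME/ holding NAME.mtx.
missing_extractions() {
  find "$1" -name '*.tar.gz' | while IFS= read -r tarball; do
    name=$(basename "$tarball" .tar.gz)
    dir=$(dirname "$tarball")
    [ -f "$dir/$name/$name.mtx" ] || printf '%s\n' "$tarball"
  done
}

# Run from the download root; empty output means every tarball was extracted.
missing_extractions .
```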
Picking a subset
The list of matrices used by the paper's runs lives in datasets/suitesparse.txt. When invoking scripts/run.sh you can either:
- point `DATASET_FILES_NAME` at this list to reproduce the paper's experiment set, or
- supply your own newline-separated list to run on a custom subset.
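A custom list can be generated rather than written by hand. The sketch below collects the names of all extracted matrices under a size limit, one per line. It assumes that list entries are bare matrix names and that each matrix lives at `NAME/NAME.mtx`; check datasets/suitesparse.txt for the format scripts/run.sh actually expects. The helper `small_matrices` and the file name `my_subset.txt` are placeholders.

```shell
# Print the names of matrices under ROOT whose .mtx file is smaller than
# MAX_MB MiB, one per line, sorted and de-duplicated.
# (small_matrices is a hypothetical helper; the entry format is an assumption.)
small_matrices() {
  root=$1
  max_mb=$2
  # -size -Nc matches files of fewer than N bytes.
  find "$root" -name '*.mtx' -size "-$((max_mb * 1024 * 1024))c" |
    while IFS= read -r path; do
      name=$(basename "$path" .mtx)
      # Keep only NAME/NAME.mtx, skipping auxiliary files in the same directory.
      if [ "$(basename "$(dirname "$path")")" = "$name" ]; then
        printf '%s\n' "$name"
      fi
    done |
    sort -u
}

# Write every matrix under 10 MiB to a newline-separated list.
small_matrices . 10 > my_subset.txt
```

`DATASET_FILES_NAME` can then be pointed at `my_subset.txt` using whatever mechanism scripts/run.sh provides.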
[^1]: Timothy A. Davis and Yifan Hu. 2011. The University of Florida Sparse Matrix Collection. ACM Transactions on Mathematical Software 38, 1, Article 1 (December 2011), 25 pages. DOI: https://doi.org/10.1145/2049662.2049663.