GraphSearch#
The Phase 1 report for GraphSearch can be found here.
The graph search (GS) workflow is a walk-based method that searches a graph for nodes that score highly on some arbitrary indicator of interest.
The use case given by the HIVE government partner was sampling a graph: given some seed nodes, and some model that can score a node as “interesting”, find lots of “interesting” nodes as quickly as possible. Their algorithm attempts to solve this problem by implementing several different strategies for walking the graph.
Scalability Summary#
Bottlenecked by network bandwidth between GPUs
Summary of Results#
We rely on Gunrock's multi-GPU `ForAll` operator to implement GraphSearch, since the entire behavior can be described within a single loop-like structure. The core computation determines which neighbor to visit next using a uniform, greedy, or stochastic selection function. Each GPU is given an equal number of vertices to process. No scaling is observed; in general, performance decreases as we move from 1 to 16 GPUs due to random neighbor accesses across GPU interconnects.
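The three neighbor-selection strategies can be sketched in plain C++ as follows. This is a hypothetical, CPU-side illustration: `WalkMode` and `select_next` are names invented here, not Gunrock API; in the actual implementation this per-walker logic runs inside the `ForAll` operator on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-step neighbor selection for one walker, mirroring the
// uniform / greedy / stochastic strategies described above.
enum class WalkMode { Uniform, Greedy, Stochastic };

// neighbors: candidate next vertices; scores: "interestingness" per neighbor.
int select_next(const std::vector<int>& neighbors,
                const std::vector<float>& scores,
                WalkMode mode, std::mt19937& rng) {
  assert(!neighbors.empty() && neighbors.size() == scores.size());
  switch (mode) {
    case WalkMode::Uniform: {
      // Pick any neighbor with equal probability.
      std::uniform_int_distribution<std::size_t> d(0, neighbors.size() - 1);
      return neighbors[d(rng)];
    }
    case WalkMode::Greedy: {
      // Pick the highest-scoring neighbor.
      std::size_t best = 0;
      for (std::size_t i = 1; i < scores.size(); ++i)
        if (scores[i] > scores[best]) best = i;
      return neighbors[best];
    }
    case WalkMode::Stochastic: {
      // Pick a neighbor with probability proportional to its score.
      std::discrete_distribution<std::size_t> d(scores.begin(), scores.end());
      return neighbors[d(rng)];
    }
  }
  return neighbors[0];  // unreachable with a valid mode
}
```

Note that every strategy reads the scores of all neighbors of the current vertex; on multiple GPUs those neighbors may live in any partition, which is the source of the random remote accesses mentioned above.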
Summary of Gunrock Implementation#
The Phase 1 single-GPU implementation is here.
We parallelize across GPUs by using a multi-GPU `ForAll` operator that splits arrays equally across GPUs. More detail on how `ForAll` was made multi-GPU can be found in the Gunrock `ForAll` Operator section of the report.
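The equal split can be sketched as follows. This is a hypothetical illustration of the contiguous, near-equal ranges such a split produces; `gpu_range` is a name invented here, not a Gunrock function.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Sketch of an equal split of n work items across num_gpus GPUs:
// each GPU owns one contiguous chunk of size ceil(n / num_gpus),
// with the last GPU's chunk clamped to n.
std::pair<std::size_t, std::size_t> gpu_range(std::size_t n,
                                              std::size_t num_gpus,
                                              std::size_t gpu) {
  std::size_t chunk = (n + num_gpus - 1) / num_gpus;  // ceiling division
  std::size_t begin = gpu * chunk;
  std::size_t end = begin + chunk < n ? begin + chunk : n;
  return {begin, end};
}
```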
Differences in implementation from Phase 1#
No change from Phase 1.
How To Run This Application on NVIDIA’s DGX-2#
Prerequisites#
git clone https://github.com/gunrock/gunrock -b multigpu
cd gunrock
mkdir build
cd build/
cmake ..
make -j16 rw
Verify git SHA: commit d70a73c5167c5b59481d8ab07c98b376e77466cc
Partitioning the input dataset#
No explicit partitioning step is required: the multi-GPU `ForAll` operator splits the work arrays equally across GPUs at runtime.
Running the application (default configurations)#
From the `build` directory:
cd ../examples/rw/
./hive-mgpu-run.sh
This will launch jobs that sweep across 1 to 16 GPU configurations per dataset and application option as specified across three different test scripts:
hive-rw-undirected-uniform.sh
hive-rw-directed-uniform.sh
hive-rw-directed-greedy.sh
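The sweep these scripts drive might look roughly like the sketch below. This is illustrative only: `run_sweep` is a name invented here, the `NUM_GPUS` environment variable is an assumption, and whether the sweep visits every GPU count or only powers of two is not specified in this section.

```shell
# Hypothetical sketch of sweeping 1-16 GPU configurations across the
# three test scripts listed above (powers of two assumed here).
run_sweep() {
  for script in hive-rw-undirected-uniform.sh \
                hive-rw-directed-uniform.sh \
                hive-rw-directed-greedy.sh; do
    for num_gpus in 1 2 4 8 16; do
      echo "NUM_GPUS=$num_gpus ./$script"
    done
  done
}
```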
Please see Running the Applications for additional information.
Datasets#
Default Locations:
/home/u00u7u37rw7AjJoA4e357/data/gunrock/hive_datasets/mario-2TB/graphsearch
Names:
dir_gs_twitter
gs_twitter.values
Running the application (alternate configurations)#
hive-mgpu-run.sh#
Modify `OUTPUT_DIR` to store generated output and JSON files in an alternate location.
Additional hive-rw-*.sh scripts#
This application relies on Gunrock's random walk (`rw`) primitive. Modify `WALK_MODE` to control the application's `--walk-mode` parameter, and specify `--undirected` as `true` or `false`. Please see the Phase 1 single-GPU implementation details here for additional parameter information.
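For example, a copy of one of these scripts might set the following variables. Only the variable names `WALK_MODE` and the `--walk-mode`/`--undirected` parameters come from the text; the values and their meanings are illustrative assumptions (see the Phase 1 report for the actual accepted values).

```shell
# Illustrative settings inside a copy of a hive-rw-*.sh script.
WALK_MODE=0      # assumed: selects the uniform walk strategy (--walk-mode)
UNDIRECTED=true  # passed as --undirected true
```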
Output#
No change from Phase 1.
Performance and Analysis#
No change from Phase 1.
Implementation limitations#
No change from Phase 1.
Performance limitations#
Single-GPU: No change from Phase 1.
Multiple GPUs: The performance bottleneck is remote memory accesses from one GPU to another GPU's memory over NVLink.
Scalability behavior#
GraphSearch scales poorly due to low compute intensity (too little computation per memory access) and high communication costs from the random access patterns across multiple GPUs that are characteristic of the underlying random walk algorithm.
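A back-of-the-envelope model illustrates why: with vertices split equally across p GPUs, a uniformly random neighbor access lands in another GPU's memory with probability (p − 1)/p, so at 16 GPUs roughly 94% of neighbor reads cross the interconnect. This is an illustrative model, not a measurement from the report.

```cpp
#include <cassert>

// Back-of-the-envelope model (an illustration, not a measurement):
// with vertices split equally across `gpus`, a uniformly random neighbor
// lands in remote memory with probability (gpus - 1) / gpus.
double remote_fraction(int gpus) {
  return (gpus - 1) / static_cast<double>(gpus);
}
```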