Graph analytics has been routinely used to solve problems in a wide range of real-life
applications. Efficiently processing concurrent graph analytics queries in a multiuser
environment is highly desirable as we enter a world of edge-device-oriented services.
Existing research, however, primarily focuses on analyzing a single, large graph dataset
and leaves the efficient processing of multiple mid-sized graph analytics queries
an intriguing yet challenging open problem. In this work, we investigate the scheduling
of concurrent graph analytics queries on NUMA machines. We analyze the performance
of several graph analytics algorithms and observe that they have diminishing performance
returns as the number of processor cores increases. With concurrent graph analytics,
such diminishing returns translate to no or even negative performance gains because
of increasing contention on shared hardware resources. We also demonstrate that the memory
bandwidth usage of many graph analytics algorithms is unpredictable, which can cause severe
memory bandwidth contention and, in turn, sub-optimal performance.
Motivated by the above observations, we propose CongraPlus, a NUMA-aware scheduler
that intelligently manages concurrent graph analytics queries for better system throughput
and memory bandwidth efficiency. CongraPlus collects the memory bandwidth consumption
characteristics of graph analytics queries via offline profiling and eliminates memory
bandwidth contention by computing the optimal sequence in which to launch queries. It also
avoids contention for compute resources by assigning each query a specific number of processor
cores. We implement CongraPlus in C++ on top of the Ligra
graph processing framework and test it with judiciously selected graph processing
query combinations. Our results reveal that CongraPlus-based schemes improve query
throughput by 30 percent compared to the conventional approach. They also exhibit
much better quality of service and scalability.
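
To make the scheduling idea concrete, the following is a minimal sketch of how profiled
bandwidth demands could drive a launch plan. It is not the CongraPlus algorithm itself, which
the abstract does not specify. It substitutes a simple first-fit-decreasing heuristic, and
every name in it (QueryProfile, plan_waves, the bandwidth budget, the per-query core counts)
is a hypothetical introduced only for illustration. The sketch assumes each query has a single
profiled peak bandwidth figure and a preassigned core count, and it groups queries into launch
"waves" whose combined demand stays within the machine's bandwidth budget and core count.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-query profile, as offline profiling might record it.
struct QueryProfile {
    std::string name;      // illustrative label only
    double bandwidth_gbs;  // assumed peak memory bandwidth demand (GB/s)
    int cores;             // number of cores assigned to this query
};

// Greedily packs queries into launch waves (first-fit decreasing).
// Within a wave, the summed bandwidth demand stays within
// bandwidth_budget_gbs and the summed core count within total_cores.
// A query that exceeds a limit on its own is placed alone in a wave.
std::vector<std::vector<QueryProfile>> plan_waves(
    std::vector<QueryProfile> queries,
    double bandwidth_budget_gbs,
    int total_cores) {
    // Consider the most bandwidth-hungry queries first.
    std::stable_sort(queries.begin(), queries.end(),
                     [](const QueryProfile& a, const QueryProfile& b) {
                         return a.bandwidth_gbs > b.bandwidth_gbs;
                     });

    struct Usage {
        double bandwidth = 0.0;
        int cores = 0;
    };
    std::vector<std::vector<QueryProfile>> waves;
    std::vector<Usage> usage;  // running totals, one entry per wave

    for (const QueryProfile& q : queries) {
        // Find the first existing wave with room for this query.
        std::size_t w = 0;
        for (; w < waves.size(); ++w) {
            if (usage[w].bandwidth + q.bandwidth_gbs <= bandwidth_budget_gbs &&
                usage[w].cores + q.cores <= total_cores) {
                break;
            }
        }
        if (w == waves.size()) {  // no existing wave fits: open a new one
            waves.emplace_back();
            usage.emplace_back();
        }
        waves[w].push_back(q);
        usage[w].bandwidth += q.bandwidth_gbs;
        usage[w].cores += q.cores;
    }
    return waves;
}
```

With a 60 GB/s budget, 32 cores, and four 8-core queries demanding 40, 30, 25, and 10 GB/s,
the sketch pairs the 40 and 10 GB/s queries in one wave and the 30 and 25 GB/s queries in
another, so that neither wave oversubscribes the memory bandwidth. A real scheduler would
also have to account for NUMA placement and for bandwidth demand that varies over a query's
lifetime, both of which this sketch ignores.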