Benchmarking Distributed Graph Databases:

Galaxybase, TigerGraph, Nebula Graph, JanusGraph

Jan. 20, 2022

Executive Summary

As the first domestic native, distributed, MPP (Massively Parallel Processing) graph database for enterprise, Galaxybase provides a one-step solution for large-scale graph data storage and computation. It stores data as vertices, edges, and properties with index-free adjacency for optimal processing of CRUD operations. A distributed, parallel computing framework is built on top of the native graph store for efficient large-scale graph computation.

This benchmark examines the data loading and query performance of Galaxybase Single Server, TigerGraph (abbr. Tiger), Nebula Graph (abbr. Nebula), and JanusGraph (abbr. Janus).
The tests included in this benchmark are as follows:

Data Loading
1. Loading time
2. Storage size of loaded data
Querying
1. Query response time for K-hop Path Queries
2. Query response time for the Shortest Path Queries
3. Query response time for Analytic Queries
Stress Testing
1. Stress testing for read and write performance

1. Benchmark Setup

This section describes the graph systems tested, the hardware platforms, the software environment, and the datasets used.

1.1 Distributed Graph Databases
  • Galaxybase 3.3.0
  • TigerGraph 3.2
  • Nebula Graph 2.5.1
  • JanusGraph 0.6.0
1.2 Hardware Platforms

This benchmark is performed in a multi-machine environment with 3 nodes, and all graph databases are tested on servers of the same configuration. Table 1 lists the details of the configuration.

Table 1. Server Configuration

| Item | Configuration |
|---|---|
| CPU | 12 Core 3.5 GHz |
| Memory | 128 G DDR4 |
| Bandwidth | Gbps |
| Hard Disk | 5.5 T HDD |

1.3 Software Environment
  • OS: Ubuntu 16.04.1 LTS (kernel:4.15.0-72-generic)
  • Java: Java SE 8 Update 201 (build 1.8.0_201-b09)
  • Docker: Docker version 17.06.2
1.4 Datasets

We use three publicly available datasets. The first is synthetic, produced by the graph500.org Kronecker graph generator. The second is a well-known Twitter follower dataset. The third is the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB) Scale-Factor 10K dataset.

| Name | Description | Vertices (Million) | Edges (Million) | Raw Size (GB) |
|---|---|---|---|---|
| Graph 500 | Synthetic Kronecker graph, http://graph500.org | 2.4 | 67 | 1 |
| Twitter-2010 | Twitter user-follower directed graph, http://an.kaist.ac.kr/traces/WWW2010.html | 41.6 | 1470 | 24.6 |
| SF10 | LDBC benchmarks | 29.98 | 170 | 8.3 |

Note: For each graph, the raw data are formatted as separate vertex and edge files.

2. Data Loading Tests

Data loading tests examine the following two areas:

  • Loading time and speed
  • Storage size of loaded data
2.1 Loading Methodology

For each graph database, we select the most favorable method for bulk loading of the initial data.

Loading Methods for Each Database

| Name | Loading API or Method |
|---|---|
| Galaxybase | galaxybase-load tool |
| TigerGraph | GSQL declarative loading job |
| Nebula Graph | Nebula Graph Importer tool |
| JanusGraph | Java program which uses the TinkerPop API to add vertices and edges |

2.2 Methods for Measuring the Storage Size of Loaded Data

Each graph database tested has its own data storage path. In this test we use the du -sh command to obtain the total storage size of the loaded data. Table 5 below lists the storage directory of each graph database.

| Name | Storage Directory |
|---|---|
| Galaxybase | ${galaxybase-home}/db/store/${graphIndex} |
| TigerGraph | ${tigergraph-home}/gstore |
| Nebula Graph | /usr/local/nebula/data/storage/nebula |
| JanusGraph | ${cassandra-home}/data/data/${graph-name} |

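The storage measurement described in 2.2 can also be approximated in a script. The sketch below is only a rough stand-in for du -sh: it sums apparent file sizes under a directory, whereas du reports allocated disk blocks, so the two figures can differ somewhat. The function names directory_size_bytes and human_readable are illustrative and not part of any tested product.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes: int) -> str:
    """Format a byte count with binary units, similar in spirit to du -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Calling human_readable(directory_size_bytes(path)) on one of the storage directories listed above would give a figure comparable to, though not necessarily identical to, the du -sh output.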
2.3 Loading Time and Storage Size of Loaded Data
Note:
Nebula Graph offers a Compaction function that reorganizes data structures and indexes to make the data easier to read. However, Compaction can cause unexpected I/O load, which makes the loading time vary widely and keeps it from objectively reflecting loading performance. Consequently, we load data into Nebula with Compaction turned off and trigger Compaction manually after the dataset is loaded. The final storage size of the loaded data in Nebula is 1.9 G for Graph 500, 42 G for Twitter-2010, and 8.5 G for SF10.
| Dataset | Testing Item | Galaxybase | Tiger | *Nebula | Janus |
|---|---|---|---|---|---|
| Graph 500 (raw size: 1.0 G) | Loading Time (sec) | 67.1 | 62.0 | 170.0 | 6360.0 |
| Graph 500 (raw size: 1.0 G) | Storage Size | 2.4 G | 855.0 M | 1.9 G | 2.5 G |
| Twitter-2010 (raw size: 24.6 G) | Loading Time (sec) | 1311.1 | 1044.0 | 5556.0 | 115060.0 |
| Twitter-2010 (raw size: 24.6 G) | Storage Size | 47.0 G | 20.8 G | 42.0 G | 50.0 G |
| SF10 (raw size: 8.3 G) | Loading Time (sec) | 315.0 | 124.0 | 688.0 | 13317.0 |
| SF10 (raw size: 8.3 G) | Storage Size | 10.6 G | 7.3 G | 8.5 G | 48.0 G |

2.4 Summary
  • Galaxybase needs a longer loading time and more storage space than TigerGraph. Despite its seemingly superior performance, TigerGraph applies to a more limited variety of use cases, since it supports only a single edge of the same type between two vertices. Galaxybase sets no limit on the number of edges of the same type between two vertices and therefore covers a broader variety of use cases. In addition, for accuracy and precision, Galaxybase gives every edge its own ID, which costs loading speed and results in a larger storage footprint.
  • Compared to Nebula Graph, Galaxybase loads data roughly 2 to 4 times faster but needs more storage for the loaded data. The storage gap can be attributed to Nebula's Compaction function, which automatically compresses the data, and to the internal fields (e.g., edge ID) that Galaxybase additionally stores; these cost storage space but benefit later algorithm execution.
  • After Compaction, Nebula's storage size is 1.9 G, 42 G, and 8.5 G for the three datasets, compared with 2.4 G, 47 G, and 10.6 G for Galaxybase.
  • Galaxybase takes only about 1.1% to 2.4% of the loading time that JanusGraph needs, and it requires less storage space than JanusGraph.
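The comparison with JanusGraph can be rechecked from the loading times in the table in 2.3. The snippet below is a small arithmetic sketch; the dictionary layout and the helper name percent_of are illustrative choices, while the numbers are copied from that table.

```python
# Loading times in seconds, copied from the table in section 2.3.
LOADING_TIME_SEC = {
    "Graph 500": {"Galaxybase": 67.1, "Janus": 6360.0},
    "Twitter-2010": {"Galaxybase": 1311.1, "Janus": 115060.0},
    "SF10": {"Galaxybase": 315.0, "Janus": 13317.0},
}


def percent_of(baseline: float, value: float) -> float:
    """Return value expressed as a percentage of baseline."""
    return 100.0 * value / baseline


for dataset, times in LOADING_TIME_SEC.items():
    share = percent_of(times["Janus"], times["Galaxybase"])
    print(f"{dataset}: Galaxybase takes {share:.1f}% of JanusGraph's loading time")
```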

3. Query Performance Tests

The query performance tests examine the following three areas:

  • Query response time for K-hop Path Queries
  • Query response time for Shortest Path Queries
  • Query response time for Analytic Queries
3.1 K-hop Path Queries

The k-hop path query, which asks for the total count of the vertices that have a k-hop path from a starting vertex, is a classic measure of graph traversal performance.
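One way to read this definition is as a breadth-first traversal bounded at depth k. The sketch below is a minimal in-memory illustration, not the implementation of any tested database. It assumes the graph is given as an adjacency-list dict and counts distinct vertices reachable within at most k hops, excluding the seed; whether the benchmark counts "within k" or "exactly k" is not stated here, so that choice is an assumption, as is the name k_hop_neighbor_count.

```python
from collections import deque
from typing import Dict, Hashable, Iterable


def k_hop_neighbor_count(
    adj: Dict[Hashable, Iterable[Hashable]], seed: Hashable, k: int
) -> int:
    """Count distinct vertices reachable from seed in at most k hops (seed excluded)."""
    visited = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for neighbor in adj.get(vertex, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return len(visited) - 1
```

Returning only the count, rather than the visited set, mirrors the benchmark's choice to output the neighborhood size instead of the full vertex list.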

3.1.1 Query Methodology

For each dataset, we measure the query response time for the following queries:

  • Count all 1-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 3 minutes/query.
  • Count all 2-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 3 minutes/query.
  • Count all 3-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 1 hour/query.
  • Count all 4-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 1 hour/query.
  • Count all 5-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 1 hour/query.
  • Count all 6-hop-path endpoint vertices for 100 fixed random seeds, with the timeout set to 1 hour/query.

We implement the query in the query language or API of each database: the Java API (bfsMaster) for Galaxybase, GSQL for TigerGraph, Gremlin for JanusGraph, and nGQL for Nebula Graph.

Note:

1. Two issues should be considered when selecting sample data: 1) the traversal capability cannot be fully demonstrated if the number of out-edges is too small; 2) an unreasonably large difference between the out-edge numbers of the samples leads to huge disparities in execution time, which reduces the significance of the statistical average. To avoid such problems, we choose 100 fixed random seeds, each with an out-edge number of 1000.

2. To focus on graph traversal and to minimize network output time, we output only the size of the k-hop-path neighborhood rather than the complete list of vertices.

3. The query results of the graph database systems have been successfully cross-validated for reliability.

4. Paths longer than 3 hops are much more challenging for most of the databases. Even with the per-query timeout raised from 3 minutes to 1 hour, JanusGraph still runs out of memory or fails to finish within 1 hour. To keep the test manageable, we reduce the total number of trials from 100 to 10, using the top ten samples.

3.1.2 Testing Results

  • Response Time for K-hop Path Query (Graph 500)

| Hops | Galaxybase (ms) | Tiger (ms) | Nebula (ms) | Janus (ms) | Average Neighbor Number |
|---|---|---|---|---|---|
| 1-hop | 5 | 5 | 4 | 26 | 984 |
| 2-hop | 127 | 936 | 3610 | 19970 | 497163 |
| 3-hop | 643 | 2021 | 80444 | 1286613 | 1754778 |
| 4-hop | 1092 | 2466 | 167263 | 2424954 | 1814567 |
| 5-hop | 1120 | 2880 | 251247 | 2482932 | 1819468 |
| 6-hop | 1153 | 3318 | 336603 | 2488557 | 1820878 |

  • Response Time for K-hop Path Query (Twitter-2010)

| Hops | Galaxybase (ms) | Tiger (ms) | Nebula (ms) | Janus (ms) | Average Neighbor Number |
|---|---|---|---|---|---|
| 1-hop | 5 | 5 | 6 | 27 | 1001 |
| 2-hop | 457 | 1458 | 18268 | 74558 | 1990879 |
| 3-hop | 8052 | 12416 | 2142757 | *N/A | 22949457 |
| 4-hop | 18569 | 24326 | *N/A | *N/A | 33384879 |
| 5-hop | 22592 | 26553 | *N/A | *N/A | 34855258 |
| 6-hop | 23291 | 27435 | *N/A | *N/A | 34999822 |

Note: *N/A means the graph database failed the trial (due to timeout or error).

  • Figures: query response time (ms) for 1-hop, 2-hop, 3-hop, 4-hop, 5-hop, and 6-hop path queries.

3.1.3 Conclusions

  • Galaxybase matches or outperforms the other databases in k-hop path queries, and its advantage grows as the size of the data and the number of hops increase.
  • A path length of 3 hops or more can be quite challenging for most databases, especially against a large dataset. On Twitter-2010, JanusGraph fails from 3 hops onward and Nebula from 4 hops onward, due to timeouts or errors, while Galaxybase outputs a 6-hop neighborhood of about 35 million vertices in around 23 seconds.
  • Galaxybase's native storage and its optimization of graph traversal algorithms enable its strong performance in k-hop path queries.
3.2 Shortest Path Queries

The shortest path algorithm computes the shortest path between a pair of vertices. It is useful for user interactions and dynamic workflows because it works in real time.

The 100 sample pairs consist of five groups of 20% each, with path lengths ranging from 1 to 5 hops. The timeout is set to 5 minutes/query.
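For an unweighted graph, a shortest path in hops can be found with a breadth-first search that records each vertex's predecessor. The sketch below illustrates that general technique on an in-memory adjacency list; it is not the query any tested database runs, it assumes vertex identifiers are never None, and the name shortest_path is hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Optional


def shortest_path(
    adj: Dict[Hashable, Iterable[Hashable]], source: Hashable, target: Hashable
) -> Optional[List[Hashable]]:
    """Return one minimum-hop path from source to target, or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}  # predecessor of each discovered vertex
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbor in adj.get(vertex, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = vertex
            if neighbor == target:
                # Walk predecessors back to the source, then reverse.
                path = [target]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None
```

The number of hops is len(path) - 1, which corresponds to the 1-to-5-hop grouping of the sample pairs.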

3.2.1 Testing Results

  • Response Time for Shortest Path Query (ms)

| Dataset | Galaxybase (ms) | Tiger (ms) | Nebula (ms) | Janus (ms) |
|---|---|---|---|---|
| Graph 500 | 36 | 1505 | 13205 | 64244 |
| Twitter-2010 | 65 | 4403 | 99192 | 95568 |

3.2.2 Conclusions

  • Galaxybase is faster than the other graph databases by one to three orders of magnitude (roughly 40 to 1,800 times) across all the shortest path queries. Additionally, it runs stably and is not prone to errors.
3.3 Analytic Queries

In this section we compare the execution time of each database running analytic queries (PageRank, Weakly Connected Components, and Label Propagation Algorithm) on Graph 500 and Twitter-2010.

3.3.1 Testing Results

  • Average Response Time for Analytic Queries (sec)

| Dataset | Algorithm | Galaxybase | Tiger | Nebula | Janus |
|---|---|---|---|---|---|
| Graph 500 | PageRank | 1.25 | 11.49 | unsupported | unsupported |
| Graph 500 | WCC | 0.60 | 10.11 | unsupported | unsupported |
| Graph 500 | LPA | 3.70 | 33.32 | unsupported | unsupported |
| Twitter-2010 | PageRank | 54.21 | 227.65 | unsupported | unsupported |
| Twitter-2010 | WCC | 12.15 | 247.01 | unsupported | unsupported |
| Twitter-2010 | LPA | 164.51 | 728.97 | unsupported | unsupported |

    Note:

    1. PageRank is an iterative algorithm that traverses every edge during every iteration and computes a score for each vertex. After several iterations, the scores converge to steady-state values. For our experiment, we run 10 iterations.

    2. A weakly connected component (WCC) is a maximal set of vertices, together with their connecting edges, in which every vertex can reach every other when the direction of directed edges is ignored. The WCC query finds and labels all the WCCs in a graph. This query requires every vertex and every edge to be traversed.

    3. Label Propagation Algorithm (LPA) is a fast algorithm for finding communities in a graph. In LPA, vertices select their group based on their direct neighbors. This process is well suited to networks where groupings are less clear, and weights can be used to help a vertex determine which community to place itself in.

    4. Unsupported: The algorithm cannot be called directly from the database.
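The first two algorithms in the notes above can be sketched compactly. The code below is a minimal single-machine illustration, not the implementation used by Galaxybase or TigerGraph. It assumes an edge-list input, uses a damping factor of 0.85 (a conventional value that this report does not specify), spreads the rank of vertices without out-edges evenly, and labels components with a union-find structure; all function names are illustrative.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def pagerank(
    edges: List[Edge], iterations: int = 10, damping: float = 0.85
) -> Dict[Hashable, float]:
    """Run a fixed number of PageRank iterations; every edge is visited each time."""
    vertices = {v for edge in edges for v in edge}
    out_degree = {v: 0 for v in vertices}
    for src, _dst in edges:
        out_degree[src] += 1
    n = len(vertices)
    score = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-edges is spread over all vertices.
        dangling = sum(score[v] for v in vertices if out_degree[v] == 0)
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for src, dst in edges:
            nxt[dst] += damping * score[src] / out_degree[src]
        score = nxt
    return score


def weakly_connected_components(edges: List[Edge]) -> Dict[Hashable, Hashable]:
    """Map each vertex to a component label, ignoring edge direction."""
    parent: Dict[Hashable, Hashable] = {}

    def find(v: Hashable) -> Hashable:
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for src, dst in edges:
        root_a, root_b = find(src), find(dst)
        if root_a != root_b:
            parent[root_a] = root_b
    return {v: find(v) for v in list(parent)}
```

The default of 10 iterations matches the setting stated in note 1; a convergence threshold could be used instead, but a fixed count keeps run times comparable across systems.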

  • Figures: average response time (sec) for each algorithm on Graph 500 and on Twitter-2010.

3.3.2 Conclusions

  • Galaxybase and TigerGraph, as graph databases for enterprise, support a larger variety of algorithms than the open-source counterparts tested.
  • Galaxybase is roughly 4 to 9 times faster than TigerGraph in PageRank.
  • Galaxybase is roughly 17 to 20 times faster than TigerGraph in Weakly Connected Components.
  • Galaxybase is roughly 4 to 9 times faster than TigerGraph in Label Propagation.
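The relative speed of Galaxybase and TigerGraph can be recomputed directly from the table in 3.3.1. The snippet below is a small arithmetic sketch; the dictionary layout and the helper name speedup are illustrative, and the timings are copied from that table as (Galaxybase, Tiger) pairs in seconds.

```python
# Average response times in seconds from section 3.3.1: (Galaxybase, Tiger).
ANALYTIC_TIME_SEC = {
    "Graph 500": {
        "PageRank": (1.25, 11.49),
        "WCC": (0.60, 10.11),
        "LPA": (3.70, 33.32),
    },
    "Twitter-2010": {
        "PageRank": (54.21, 227.65),
        "WCC": (12.15, 247.01),
        "LPA": (164.51, 728.97),
    },
}


def speedup(galaxybase_sec: float, tigergraph_sec: float) -> float:
    """Return how many times faster Galaxybase is than TigerGraph."""
    return tigergraph_sec / galaxybase_sec


for dataset, algorithms in ANALYTIC_TIME_SEC.items():
    for algorithm, (galaxybase_sec, tigergraph_sec) in algorithms.items():
        ratio = speedup(galaxybase_sec, tigergraph_sec)
        print(f"{dataset} {algorithm}: {ratio:.1f}x")
```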
3.4 Summary
  • Galaxybase is superior to the other graph databases in k-hop path queries, and the advantage increases as the number of hops and neighbors increases.
  • For analytic queries, Galaxybase supports a greater variety of graph algorithms and is faster in traversal and query response time.

4. Stress Testing

4.1 Stress Testing for Read and Write Performance

We run the stress testing with 100 virtual users using the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB) Scale-Factor 10K (SF10) dataset. To test the concurrent performance of the graph databases, requests of the same type are sent to the database at the same time, which is a practical way to measure concurrent performance.

4.1.1 Methodology

The following three testing areas are covered using the SF10 dataset, with 100 concurrent executions for a duration of 5 minutes:

1. Find vertex, edge

2. Add vertex, edge

3. Modify vertex, edge
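A closed-loop load generator of this kind can be sketched with a thread pool. The code below is an illustrative harness, not the tool used for this benchmark: the request callable is a hypothetical placeholder for a single database operation (for example, one find-vertex call through a client driver), and the defaults of 100 workers and 300 seconds simply mirror the settings stated above.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def measure_throughput(
    request: Callable[[], None], workers: int = 100, duration_sec: float = 300.0
) -> float:
    """Issue request() in a loop on `workers` threads; return requests per second."""
    deadline = time.monotonic() + duration_sec

    def worker() -> int:
        done = 0
        while time.monotonic() < deadline:
            request()  # hypothetical single operation against the database
            done += 1
        return done

    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        total = sum(future.result() for future in futures)
    elapsed = time.monotonic() - started
    return total / elapsed
```

For a quick dry run, measure_throughput(lambda: time.sleep(0.001), workers=4, duration_sec=0.5) exercises the harness without a database. Threads are a reasonable fit here because each virtual user spends most of its time waiting on network I/O.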

4.1.2 Samples

Samples for this stress testing are as follows:

1. Vertex sample: Vertex type is Comment. Sample data are collected from all vertex IDs from the SF10 dataset.

2. Edge sample: Edge type is Person_Likes_Person, directing from Person to Post. Sample data are collected from all vertex IDs from the SF10 dataset.

4.1.3 Results
  • Testing Results Based on SF10: requests processed per second (throughput/s)

| Item | Galaxybase | Tiger | Nebula | Janus |
|---|---|---|---|---|
| Find Vertex | 36736 | 1856 | 5410 | 76 |
| Find Edge | 35356 | 1921 | 5823 | 34 |
| Add Vertex | 35300 | 5234 | 6591 | 47 |
| Add Edge | 16169 | 5066 | 6384 | 41 |
| Modify Vertex | 35589 | 5225 | 6340 | 64 |
| Modify Edge | 6881 | 5077 | 5460 | 27 |

4.2 Summary
  • Galaxybase outperforms the other graph databases under 100 concurrent executions. Taking the add-vertex test as an example, Galaxybase handles 35,300 requests per second, which is roughly 5 to 7 times what Nebula and TigerGraph handle and about 750 times what JanusGraph handles.