MooseFS performance scores high on InfiniBand Network

November 24th, 2017 | MooseFS Team

MooseFS showcases good performance on an InfiniBand network! In this article we describe how we tested MooseFS performance over InfiniBand and present the results MooseFS achieved.

We are excited to announce that Core Technology, in cooperation with the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw, has successfully tested the performance of MooseFS in an IPoIB configuration, measuring throughput in both single-client and distributed setups. The tests were performed with MooseFS 4.0, but the results are also achievable with MooseFS 3.0.92 and newer.

The tests show that the MooseFS distributed file system achieves very good performance over the IPoIB protocol. They also show that the best performance is obtained with at least 4 threads and a block size of at least 64k. Block size is a very important aspect of TCP/IP network communication, especially for random operations.

The gathered data shows that increasing the number of threads does not increase MooseFS client performance in all cases. With block sizes greater than 128k, sequential and random read/write performance no longer increases. However, increasing the number of threads very quickly leads to maximum throughput for sequential read and write. Random read performance increases up to 12 threads for 2048k blocks and grows linearly for 16k blocks over the whole test range of 1 to 16 threads.

All the results were achieved with an IPoIB configuration, and the throughput reached in such a setup is remarkable. The tests proved that storage based on MooseFS over an InfiniBand network is able to provide exceptional performance, which makes MooseFS network-defined storage a very good fit for HPC environments. The full power of MooseFS shows with parallel operations on many distributed MooseFS clients. This is indeed very good news for all MooseFS users!

About ICM UW (Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw)

ICM UW is a leading data science facility in Central Europe. High-performance computers used for processing, analysis, visualization and advanced computing tasks are ICM's specialty. ICM's goal is to understand data and to provide organizations and institutions with innovative solutions built on its data science expertise.

For more information please visit: http://icm.edu.pl

The tests were conducted in single client and distributed client setups. The two sections below provide a detailed analysis of each setup.

Single client test

The following section provides the single client test description and configuration details. A single client test means that in the whole MooseFS cluster setup, only one server was dedicated as a MooseFS client. The benchmark was executed inside the MooseFS client mount point, using IOzone version 3.465.

MooseFS client tests were performed to show the differences between block sizes and numbers of threads. In data transmission and data storage, a block, sometimes called a physical record, is a sequence of bytes or bits with a maximum length, usually containing some whole number of records. The number of threads in the IOzone benchmark is the number of parallel processes executed during the measurement; each thread operates on one file. In the single client test, the maximum number of threads was set to 16, which means that up to 16 files were created in the MooseFS cluster.

To properly measure performance differences between block sizes and thread counts, the test was executed five times for each set of parameters. The maximum and minimum results were removed before averaging. A sketch of such a parameter sweep is shown after the option list below.

IOzone command used in tests:

$ iozone -eI -r {blocksize} -s1g -i0 -i1 -i2 -t {threads}

IOzone benchmark options:

  • e – Include flush (fsync, fflush) in the timing calculations.
  • I – DIRECT I/O for all file operations. Tells the file system that all operations are to bypass the buffer cache and go directly to disk.
  • r – Record/block size
  • s – File size 1GB.
  • i – 0 = write operations, 1 = read operations, 2 = random read and random write operations.
  • t – Allows the user to specify how many threads or processes to be active during the measurement.
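
Putting the command above and the repetition scheme together, a minimal sketch of such a parameter sweep could look like the following. This is only an illustration: the MooseFS mount point (/mnt/moosefs), the results directory and the idea of post-processing the five runs afterwards (dropping the minimum and maximum before averaging) are assumptions about how the runs could be organised, not details taken from the original test scripts.

#!/bin/bash
# Illustrative sweep: run IOzone inside the MooseFS mount for every
# block size / thread count combination, five times each.
# Averages (with min and max discarded) are computed afterwards from the logs.
cd /mnt/moosefs || exit 1                 # assumed MooseFS client mount point
mkdir -p /root/iozone-results             # assumed log directory

for bs in 4k 8k 16k 32k 64k 128k 256k 512k 1024k 2048k; do
  for threads in 1 2 4 6 8 10 12 14 16; do
    for run in 1 2 3 4 5; do
      iozone -eI -r "$bs" -s1g -i0 -i1 -i2 -t "$threads" \
        > "/root/iozone-results/bs${bs}_t${threads}_run${run}.log"
    done
  done
done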

Topology

The single client test cluster consisted of two master servers (leader and follower), seven chunkservers and one client server (Figure 1). MooseFS client software was installed on only one physical server. All servers were connected through a Mellanox FDR switch with a 0.02 ms port-to-port latency declared by the manufacturer. The InfiniBand adapter used in each server was a Mellanox ConnectX-3 card with a maximum throughput of 56 Gbit/s. All connections were made with QSFP+ fiber optic cables.

Figure 1: Single client test topology

Configuration

To eliminate the hard disk bottleneck, a 100 GB RAM disk was created on each chunkserver. Network transport used the IPoIB protocol; no kernel modifications and no additional components were required. MooseFS replication was set to goal 1. The measured average ping between the client server and the other servers in the cluster was 0.022 ms. The operating system was CentOS 7.3 with kernel 3.10.0-514.6.1.el7.x86_64.
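
For orientation, the snippet below sketches how such a RAM disk and replication goal could be set up on a default MooseFS installation. The mount points, the mfs user and the service name are assumptions based on a standard package install, not configuration taken from the ICM cluster.

# On each chunkserver: create a 100 GB tmpfs RAM disk and register it
# as chunkserver storage (mfshdd.cfg is the default storage list).
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=100G tmpfs /mnt/ramdisk
chown mfs:mfs /mnt/ramdisk
echo "/mnt/ramdisk" >> /etc/mfs/mfshdd.cfg
systemctl restart moosefs-chunkserver

# On the client: set replication goal 1 (a single copy) on the test directory.
mfssetgoal -r 1 /mnt/moosefs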

Hardware configuration of all machines:

  • CPU – 2 x Intel Xeon E5-2680 v3 2.5 GHz (12 cores, 24 threads)
  • RAM – 128GB DDR4 2133 MHz
  • NIC – ConnectX-3 Mellanox MT27500 Family (56 Gbit/s)
  • Mellanox FDR switch

Results

The following subsection shows plots with test results for sequential and random read/write operations. Figures 2 and 3 show how performance changes with block size and with the number of processing threads. We chose 4 and 8 threads for the block size plot (Figure 2) and 16k and 2048k blocks for the threads plot (Figure 3). Figures 4 and 5 show performance during random-access read/write operations, analogously to the previous two. The last plot (Figure 6) shows sequential and random read/write IOPS with 16k blocks and thread counts from 1 to 16.

Figure 2: Read/write test results using 4 and 8 threads for block sizes from 4k to 2048k

 

Figure 3: Read/write test results using 16k and 2048k blocks for a number of threads from 1 to 16

 

Figure 4: Random read/write test results with 4, 8 threads for block size from 4k to 2048k

 

Figure 5: Random read/write test results with 16k, 2048k blocks for threads from 1 to 16

 

Figure 6: Sequential and random read/write IOPS with 16k blocks

Distributed client test

This section provides the description and configuration details for the distributed test. In this test, all eight MooseFS servers worked as chunkservers and clients simultaneously. IOzone was executed in its cluster testing mode. Each MooseFS client handled 4 separate IOzone processes, and each IOzone process operated on four files. In total, the test had 32 threads distributed over eight servers. To properly present performance differences between block sizes, the test was executed five times; the maximum and minimum results were removed before averaging.

IOzone command line:

$ iozone -ceIT -i0 -i1 -i2 -+n -r {blocksize} -s1g -+H moosefs -m1 -+m hosts.cfg -t32

IOzone benchmark options:

  • c – Include close() in the timing calculations.
  • e – Include flush (fsync, fflush) in the timing calculations.
  • I – Direct I/O for all file operations. Tells the file system that all operations are to bypass the buffer cache and go directly to disk.
  • T – Use POSIX threads for throughput tests. Available on platforms that have POSIX threads.
  • i – 0 = write operations, 1 = read operations, 2 = random read and random write operations.
  • -+n – No retests selected.
  • r – Record/block size.
  • s – File size 1GB.
  • -+H – Hostname of the PIT server.
  • -+m – The hosts.cfg file contains the configuration information of the clients for cluster testing (see the sketch after this list).
  • t – Allows the user to specify how many threads or processes to be active during the measurement.
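
The client list referenced by -+m contains, one per line, a client hostname, a working directory on that client and the path to the iozone binary there; listing a host several times places several of the threads on it. The loop below is only a sketch of how such a file could be generated for this layout: the node names and paths are assumptions, while the structure (four entries per server across eight servers, giving 32 threads with -t32) follows the test description above.

# Illustrative generation of an IOzone cluster-mode client list (hosts.cfg).
# Each line: <client hostname> <working dir on that client> <path to iozone>.
> hosts.cfg
for host in node01 node02 node03 node04 node05 node06 node07 node08; do
  for i in 1 2 3 4; do
    echo "$host /mnt/moosefs/iozone /usr/bin/iozone" >> hosts.cfg
  done
done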

Distributed client test topology

The distributed client test cluster consisted of two master servers and eight chunkservers that also acted as clients. All hardware components were the same as in the single client test. One additional chunkserver was prepared on the client machine from the previous test. All eight chunkservers used the MooseFS client to run IOzone tests.

distributed schema 1024x546 - MooseFS performance scores high on InfiniBand Network
Figure 7: MooseFS distributed test infrastructure

Distributed test results

Figure 8: Sequential and random read/write distributed test results with 32 threads and block size in the range from 4k to 2048k

This graph shows read, write, random read and random write throughput for different block sizes in the 32-thread distributed test. The X axis shows the block size and the Y axis shows the throughput in gigabytes per second.

Appendix

This section provides detailed results gathered during the single client and distributed IOzone benchmark tests. Table 1 presents single client IOzone test results with thread counts from 1 to 16 and block sizes from 4k to 2048k. Table 2 presents distributed IOzone test results with 32 threads on eight machines.
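
As a quick sanity check on how the throughput and IOPS columns relate, IOPS is roughly throughput divided by block size. For example, the first row of Table 1 reports 213 MB/s of sequential reads with 4k blocks, and 213 × 1024 / 4 ≈ 54 500 operations per second, which matches the reported 54 654 IOPS; the one-liner below simply reproduces that arithmetic.

# 213 MiB/s with 4 KiB blocks is roughly 54 500 read operations per second.
echo $(( 213 * 1024 / 4 ))    # prints 54528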

Table 1: MooseFS single client IOzone test results

Block size | Threads | Sequential read MB/s | Sequential read IOPS | Sequential write MB/s | Sequential write IOPS | Random read MB/s | Random read IOPS | Random write MB/s | Random write IOPS
4k | 1 | 213 | 54 654 | 113 | 28 803 | 25 | 6 296 | 137 | 35 151
4k | 2 | 403 | 103 114 | 205 | 52 590 | 46 | 11 879 | 242 | 61 839
4k | 4 | 396 | 101 352 | 200 | 51 247 | 90 | 23 146 | 360 | 92 234
4k | 6 | 355 | 90 860 | 176 | 44 933 | 132 | 33 697 | 353 | 90 347
4k | 8 | 366 | 93 679 | 190 | 48 594 | 178 | 45 668 | 343 | 87 759
4k | 10 | 379 | 97 018 | 207 | 52 893 | 229 | 58 661 | 294 | 75 150
4k | 12 | 408 | 104 362 | 236 | 60 301 | 278 | 71 150 | 314 | 80 390
4k | 14 | 433 | 110 837 | 256 | 65 526 | 330 | 84 432 | 328 | 83 970
4k | 16 | 429 | 109 837 | 260 | 66 547 | 378 | 96 716 | 300 | 76 921
8k | 1 | 379 | 48 528 | 224 | 28 657 | 47 | 5 960 | 224 | 28 619
8k | 2 | 695 | 88 976 | 376 | 48 101 | 89 | 11 395 | 448 | 57 324
8k | 4 | 953 | 121 935 | 386 | 49 373 | 167 | 21 408 | 663 | 84 917
8k | 6 | 688 | 88 101 | 344 | 44 030 | 246 | 31 479 | 685 | 87 694
8k | 8 | 678 | 86 799 | 361 | 46 152 | 330 | 42 183 | 627 | 80 299
8k | 10 | 683 | 87 478 | 392 | 50 125 | 426 | 54 494 | 557 | 71 307
8k | 12 | 693 | 88 661 | 449 | 57 461 | 512 | 65 597 | 567 | 72 601
8k | 14 | 727 | 93 104 | 478 | 61 248 | 601 | 76 927 | 569 | 72 781
8k | 16 | 761 | 97 395 | 489 | 62 582 | 679 | 86 893 | 547 | 70 006
16k | 1 | 565 | 36 159 | 376 | 24 085 | 85 | 5 430 | 395 | 25 302
16k | 2 | 1 059 | 67 756 | 662 | 42 382 | 162 | 10 354 | 774 | 49 535
16k | 4 | 1 718 | 109 982 | 685 | 43 850 | 299 | 19 128 | 1 116 | 71 438
16k | 6 | 1 493 | 95 564 | 643 | 41 177 | 457 | 29 273 | 1 178 | 75 389
16k | 8 | 1 196 | 76 520 | 663 | 42 414 | 626 | 40 063 | 1 158 | 74 084
16k | 10 | 1 142 | 73 102 | 723 | 46 271 | 800 | 51 219 | 955 | 61 111
16k | 12 | 1 130 | 72 338 | 815 | 52 133 | 970 | 62 097 | 949 | 60 734
16k | 14 | 1 125 | 72 021 | 845 | 54 087 | 1 122 | 71 829 | 945 | 60 450
16k | 16 | 1 147 | 73 416 | 853 | 54 591 | 1 288 | 82 424 | 904 | 57 849
32k | 1 | 806 | 25 781 | 578 | 18 499 | 148 | 4 727 | 599 | 19 166
32k | 2 | 1 384 | 44 303 | 1 107 | 35 414 | 281 | 8 998 | 1 204 | 38 540
32k | 4 | 2 400 | 76 799 | 1 279 | 40 927 | 531 | 17 000 | 1 650 | 52 794
32k | 6 | 2 594 | 82 992 | 1 127 | 36 068 | 800 | 25 588 | 1 606 | 51 403
32k | 8 | 1 936 | 61 944 | 1 163 | 37 230 | 1 095 | 35 055 | 1 839 | 58 860
32k | 10 | 1 797 | 57 509 | 1 235 | 39 517 | 1 410 | 45 129 | 1 382 | 44 226
32k | 12 | 1 713 | 54 822 | 1 352 | 43 270 | 1 706 | 54 596 | 1 452 | 46 452
32k | 14 | 1 688 | 54 031 | 1 367 | 43 747 | 1 986 | 63 548 | 1 432 | 45 812
32k | 16 | 1 707 | 54 627 | 1 380 | 44 166 | 2 252 | 72 064 | 1 423 | 45 540
64k | 1 | 943 | 15 084 | 715 | 11 446 | 229 | 3 659 | 926 | 14 821
64k | 2 | 1 563 | 25 004 | 1 412 | 22 594 | 428 | 6 848 | 1 666 | 26 658
64k | 4 | 2 691 | 43 059 | 2 184 | 34 944 | 838 | 13 408 | 2 709 | 43 345
64k | 6 | 3 009 | 48 147 | 1 771 | 28 339 | 1 267 | 20 266 | 2 694 | 43 102
64k | 8 | 3 244 | 51 909 | 1 803 | 28 846 | 1 737 | 27 798 | 2 579 | 41 265
64k | 10 | 3 347 | 53 550 | 1 810 | 28 967 | 2 187 | 34 986 | 1 956 | 31 299
64k | 12 | 2 511 | 40 173 | 1 949 | 31 185 | 2 603 | 41 654 | 1 998 | 31 973
64k | 14 | 2 557 | 40 918 | 1 970 | 31 518 | 3 017 | 48 264 | 1 981 | 31 695
64k | 16 | 2 658 | 42 525 | 1 965 | 31 444 | 3 311 | 52 970 | 1 971 | 31 543
128k | 1 | 1 026 | 8 206 | 804 | 6 432 | 331 | 2 644 | 903 | 7 221
128k | 2 | 1 779 | 14 231 | 1 613 | 12 901 | 621 | 4 969 | 1 921 | 15 365
128k | 4 | 2 810 | 22 484 | 3 036 | 24 287 | 1 262 | 10 093 | 3 549 | 28 394
128k | 6 | 3 223 | 25 784 | 2 461 | 19 691 | 1 835 | 14 682 | 3 303 | 26 423
128k | 8 | 3 324 | 26 590 | 2 439 | 19 512 | 2 469 | 19 749 | 3 155 | 25 241
128k | 10 | 3 404 | 27 230 | 2 350 | 18 798 | 3 017 | 24 137 | 2 438 | 19 505
128k | 12 | 3 386 | 27 091 | 2 472 | 19 773 | 3 556 | 28 451 | 2 515 | 20 119
128k | 14 | 3 484 | 27 874 | 2 478 | 19 822 | 3 915 | 31 317 | 2 491 | 19 925
128k | 16 | 3 303 | 26 425 | 2 492 | 19 938 | 3 843 | 30 742 | 2 487 | 19 899
256k | 1 | 1 036 | 4 143 | 823 | 3 291 | 334 | 1 338 | 906 | 3 625
256k | 2 | 1 713 | 6 852 | 1 678 | 6 710 | 629 | 2 514 | 1 959 | 7 837
256k | 4 | 2 727 | 10 909 | 3 264 | 13 057 | 1 242 | 4 968 | 3 715 | 14 860
256k | 6 | 3 032 | 12 129 | 2 691 | 10 763 | 1 861 | 7 444 | 3 091 | 12 362
256k | 8 | 3 287 | 13 150 | 2 840 | 11 361 | 2 488 | 9 953 | 3 618 | 14 473
256k | 10 | 3 378 | 13 512 | 2 567 | 10 268 | 3 059 | 12 235 | 2 629 | 10 518
256k | 12 | 3 411 | 13 645 | 2 640 | 10 560 | 3 640 | 14 560 | 2 659 | 10 637
256k | 14 | 3 354 | 13 417 | 2 655 | 10 621 | 3 942 | 15 768 | 2 635 | 10 542
256k | 16 | 3 267 | 13 069 | 2 646 | 10 584 | 3 886 | 15 544 | 2 646 | 10 585
512k | 1 | 1 058 | 2 116 | 829 | 1 659 | 334 | 669 | 960 | 1 919
512k | 2 | 1 689 | 3 377 | 1 636 | 3 272 | 622 | 1 245 | 2 054 | 4 108
512k | 4 | 2 785 | 5 571 | 3 313 | 6 626 | 1 269 | 2 539 | 3 897 | 7 794
512k | 6 | 3 177 | 6 355 | 2 844 | 5 689 | 1 841 | 3 682 | 3 478 | 6 956
512k | 8 | 3 380 | 6 760 | 2 826 | 5 652 | 2 546 | 5 091 | 4 384 | 8 769
512k | 10 | 3 406 | 6 813 | 2 661 | 5 323 | 3 078 | 6 156 | 2 733 | 5 465
512k | 12 | 3 437 | 6 874 | 2 742 | 5 483 | 3 623 | 7 245 | 2 738 | 5 477
512k | 14 | 3 424 | 6 849 | 2 729 | 5 459 | 3 969 | 7 939 | 2 733 | 5 465
512k | 16 | 3 277 | 6 554 | 2 742 | 5 484 | 3 844 | 7 688 | 2 730 | 5 461
1024k | 1 | 1 031 | 1 031 | 841 | 841 | 335 | 335 | 969 | 969
1024k | 2 | 1 648 | 1 648 | 1 607 | 1 607 | 628 | 628 | 2 080 | 2 080
1024k | 4 | 2 774 | 2 774 | 3 330 | 3 330 | 1 258 | 1 258 | 3 966 | 3 966
1024k | 6 | 3 176 | 3 176 | 3 087 | 3 087 | 1 792 | 1 792 | 3 103 | 3 103
1024k | 8 | 3 274 | 3 274 | 2 721 | 2 721 | 2 480 | 2 480 | 3 767 | 3 767
1024k | 10 | 3 442 | 3 442 | 2 698 | 2 698 | 3 118 | 3 118 | 2 777 | 2 777
1024k | 12 | 3 373 | 3 373 | 2 777 | 2 777 | 3 602 | 3 602 | 2 767 | 2 767
1024k | 14 | 3 389 | 3 389 | 2 795 | 2 795 | 3 917 | 3 917 | 2 768 | 2 768
1024k | 16 | 3 353 | 3 353 | 2 805 | 2 805 | 3 838 | 3 838 | 2 797 | 2 797
2048k | 1 | 1 020 | 510 | 815 | 407 | 337 | 169 | 958 | 479
2048k | 2 | 1 714 | 857 | 1 581 | 790 | 629 | 315 | 2 090 | 1 045
2048k | 4 | 2 734 | 1 367 | 3 283 | 1 642 | 1 255 | 627 | 4 009 | 2 005
2048k | 6 | 3 075 | 1 538 | 3 103 | 1 551 | 1 838 | 919 | 3 298 | 1 649
2048k | 8 | 3 248 | 1 624 | 2 686 | 1 343 | 2 507 | 1 253 | 3 837 | 1 919
2048k | 10 | 3 381 | 1 691 | 2 777 | 1 388 | 3 054 | 1 527 | 2 822 | 1 411
2048k | 12 | 3 388 | 1 694 | 2 819 | 1 409 | 3 602 | 1 801 | 2 826 | 1 413
2048k | 14 | 3 397 | 1 699 | 2 864 | 1 432 | 3 869 | 1 934 | 2 823 | 1 411
2048k | 16 | 3 380 | 1 690 | 2 824 | 1 412 | 3 899 | 1 950 | 2 803 | 1 401


Table 2: MooseFS distributed IOzone test with 32 threads


Download pdf report

If you want to download this article as a PDF, please click here: MooseFS performance scores high on InfiniBand network.

See also the results MooseFS achieved during performance tests on Docker.