MooseFS Performance Scores High on InfiniBand Network
IOzone benchmarks on a 56 Gbit/s IPoIB cluster – run jointly by Core Technology and ICM University of Warsaw – show MooseFS hitting over 18 GB/s sequential read throughput in a 32-thread distributed setup. The full results tables, optimal block-size and thread-count findings, and hardware configuration details are all here.
We are excited to announce that Core Technology, in cooperation with the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw, has successfully tested MooseFS performance over an IPoIB configuration, measuring throughput in both single client and distributed setups. The tests were performed with MooseFS 4.0, but the results are also achievable with MooseFS 3.0.92 and later.
The tests show that the MooseFS distributed file system achieves very good performance over the IPoIB protocol. They also show that the best performance is reached with at least 4 threads and a block size of at least 64k. Block size is a very important aspect of TCP/IP network communication, especially for random operations.
The gathered data shows that increasing the number of threads does not always increase MooseFS client performance. With block sizes greater than 128k, sequential and random read/write performance does not increase any further. However, increasing the number of threads very quickly leads to maximum throughput for sequential read and write. Random read performance increases up to 12 threads for 2048k blocks and scales linearly for 16k blocks over the whole test range from 1 to 16 threads.
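The linear-scaling observation for 16k random reads can be checked against the appendix figures. The sketch below is only an illustration: `scaling_efficiency` is a hypothetical helper name, and the throughput values are copied from the 16k random read column of Table 1. It computes parallel efficiency, meaning the speedup over one thread divided by the thread count, so a value close to 1.00 indicates near-linear scaling.

```shell
#!/usr/bin/env bash
# Parallel efficiency = (throughput at N threads / throughput at 1 thread) / N.
# A value near 1.00 means close-to-linear scaling.
# scaling_efficiency is a hypothetical helper name used only for this sketch.
scaling_efficiency() {
  local base="$1" threads="$2" mbps="$3"
  LC_ALL=C awk -v b="$base" -v n="$threads" -v t="$mbps" \
    'BEGIN { printf "%.2f\n", (t / b) / n }'
}

# Random read with 16k blocks, taken from Table 1 as threads:MB/s pairs.
base=85
for pair in 1:85 2:162 4:299 8:626 12:970 16:1288; do
  threads="${pair%%:*}"   # text before the colon
  mbps="${pair##*:}"      # text after the colon
  echo "threads=$threads efficiency=$(scaling_efficiency "$base" "$threads" "$mbps")"
done
```

Running the same helper on the 2048k random read column gives a lower efficiency at 16 threads than at 12, which matches the plateau described above.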
All the results were achieved with an IPoIB configuration. Native IB throughput achieved in such a setup is unparalleled. All tests proved that storage based on MooseFS on an InfiniBand network was able to provide exceptional performance. MooseFS network defined storage is a perfect solution for HPC environments. The full power of MooseFS is most noticeable with parallel operations on many distributed MooseFS clients. This is indeed very good news for all MooseFS users!
About ICM UW
ICM UW (Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw) is a leading data science facility in Central Europe. ICM specializes in high-performance computers used for processing, analysis, visualization and advanced computing tasks. ICM’s goal is to understand data and provide innovative solutions to organizations and institutions, drawing on its data science expertise.
For more information please visit: http://icm.edu.pl
The tests were conducted in single client and distributed client setups. The two sections below provide a detailed analysis of each setup.
Single Client Test
This section describes the single client test and its configuration. A single client test means that only one server in the whole MooseFS cluster was dedicated as a MooseFS client. The benchmark was executed inside the MooseFS client mount point. The benchmark tool used in this test was IOzone, version 3.465.
MooseFS client tests were performed to show how performance differs across block sizes and thread counts. In data transmission and data storage, a block, sometimes called a physical record, is a sequence of bytes or bits, usually containing some whole number of records, with a maximum length. The number of threads in the IOzone benchmark is the number of parallel processes executed during measurement. Each thread operates on one file. In the single client test, the maximum number of threads was set to 16, which means that up to 16 files were created in the MooseFS cluster.
To properly measure performance differences between block sizes and thread counts, the test was executed five times for each set of parameters. The maximum and minimum results were excluded from the average.
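The averaging rule described above (five runs, drop the highest and lowest, average the rest) can be sketched as a small shell function. This is an illustration only: `trimmed_mean` is a hypothetical name, the sample values are made up for the example, and the original tests may have computed the average with different tooling.

```shell
#!/usr/bin/env bash
# Average a set of run results after discarding one minimum and one maximum.
# trimmed_mean is a hypothetical helper; it is not part of IOzone or MooseFS.
trimmed_mean() {
  printf '%s\n' "$@" | LC_ALL=C sort -n | LC_ALL=C awk '
    { v[NR] = $1 }                       # values arrive sorted ascending
    END {
      if (NR < 3) { print "need at least 3 values" > "/dev/stderr"; exit 1 }
      for (i = 2; i < NR; i++) sum += v[i]   # skip first (min) and last (max)
      printf "%.1f\n", sum / (NR - 2)
    }'
}

# Five hypothetical throughput readings in MB/s for one parameter set.
trimmed_mean 3380 3201 3397 3388 3450
```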
IOzone command used in tests:
$ iozone -eI -r {blocksize} -s1g -i0 -i1 -i2 -t {threads}
IOzone benchmark options:
- -e – Include flush (fsync, flush) in the timing calculations.
- -I – DIRECT I/O for all file operations. Tells the file system that all operations are to bypass the buffer cache and go directly to disk.
- -r – Record/block size.
- -s – File size 1 GB.
- -i – 0 = write operations, 1 = read operations, 2 = random read and random write operations.
- -t – Allows the user to specify how many threads or processes to be active during the measurement.
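The parameter sweep implied by the command template can be sketched as a dry run that only prints the IOzone invocations. The block sizes and thread counts below are the ones that appear in Table 1, and the repeat count of five follows the test description. The function name `print_sweep` and the decision to print rather than execute are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Dry run: print every IOzone command of the single client sweep.
# Block sizes and thread counts mirror the rows of Table 1.
block_sizes="4k 8k 16k 32k 64k 128k 256k 512k 1024k 2048k"
thread_counts="1 2 4 6 8 10 12 14 16"
runs=5   # each parameter set is repeated five times

# print_sweep is a hypothetical helper name used only in this sketch.
print_sweep() {
  local bs t run
  for bs in $block_sizes; do
    for t in $thread_counts; do
      for run in $(seq 1 "$runs"); do
        echo "iozone -eI -r $bs -s1g -i0 -i1 -i2 -t $t"
      done
    done
  done
}

print_sweep | sed -n '1,3p'   # show the first three commands
print_sweep | wc -l           # total number of benchmark runs
```

To execute the sweep instead of printing it, the `echo` line could be replaced with the command itself, run from inside the MooseFS mount point.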
Topology
The single client test cluster consisted of two master servers (leader and follower), seven chunk servers and one client server. MooseFS client software was installed on only one physical server. All servers were connected through a Mellanox FDR switch with a port-to-port latency of 0.02 ms, as declared by the manufacturer. The InfiniBand adapter in each server was a Mellanox ConnectX-3 card with a maximum throughput of 56 Gbit/s. All connections were made with QSFP+ fiber optic cables.
Configuration
To eliminate the hard disk bottleneck, a 100 GB RAM disk was created on each chunk server. Network transport used the IPoIB protocol. No kernel modifications and no additional components were required. MooseFS replication was set to goal 1. The measured average ping between the client server and the other servers in the cluster was 0.022 ms. The operating system was CentOS 7.3 with kernel 3.10.0-514.6.1.el7.x86_64.
Hardware configuration of all machines:
- CPU – 2 × Intel Xeon CPU E5-2680 v3 2.5 GHz (12 cores, 24 threads)
- RAM – 128 GB DDR4 2133 MHz
- NIC – ConnectX-3 Mellanox MT27500 Family (56 Gbit/s)
- Mellanox FDR switch
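To put the throughput figures in context, the nominal rate of the 56 Gbit/s adapters listed above can be converted to megabytes per second and compared with a measured value. The sketch below makes two simplifying assumptions: the MB/s values in the tables are treated as decimal megabytes, and link encoding and IPoIB protocol overhead are ignored, so the real ceiling for a single link is somewhat lower. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Nominal line rate in MB/s (decimal units): Gbit/s * 1000 / 8.
# Both helpers are hypothetical names used only for this sketch.
link_mb_per_s() {
  echo $(( $1 * 1000 / 8 ))
}

# Measured throughput as a percentage of the nominal line rate.
# Arguments: measured MB/s, link speed in Gbit/s.
link_utilisation() {
  LC_ALL=C awk -v m="$1" -v l="$(link_mb_per_s "$2")" \
    'BEGIN { printf "%.0f\n", 100 * m / l }'
}

link_mb_per_s 56            # prints 7000
link_utilisation 3484 56    # single client sequential read, 128k blocks, 14 threads
```

In the distributed test the aggregate throughput is spread over eight client links, so it should be compared against the per-node rate multiplied by the number of nodes rather than against a single link.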
Results
The following subsection shows plots with test results for sequential and random read/write operations. Figures 2 and 3 show how performance changes with block size and with the number of processing threads. We chose 4 and 8 threads for the block size plot (Figure 2) and 16k and 2048k blocks for the threads plot (Figure 3). Figures 4 and 5 show performance during random access read/write operations. The last plot (Figure 6) shows sequential and random access read/write IOPS with 16k blocks and thread counts from 1 to 16.
Distributed Client Test
This section describes the distributed test and its configuration. In this test, all eight MooseFS servers worked as chunk server and client simultaneously. The IOzone benchmark was executed in cluster testing mode. Each MooseFS client handled 4 separate IOzone processes, and each IOzone process operated on four files. In total, the test had 32 threads distributed over eight servers. To properly present performance differences between block sizes, the test was executed five times. The maximum and minimum results were excluded from the average.
IOzone command line:
$ iozone -ceIT -i0 -i1 -i2 -+n -r {blocksize} -s1g -+H moosefs -m1 -+m hosts.cfg -t32
IOzone benchmark options:
- -c – Include close() in the timing calculations.
- -e – Include flush (fsync, flush) in the timing calculations.
- -I – Direct I/O for all file operations. Tells the file system that all operations are to bypass the buffer cache and go directly to disk.
- -T – Use POSIX threads for throughput tests.
- -i – 0 = write, 1 = read, 2 = random read and random write operations.
- -+n – No retests selected.
- -r – Record/block size.
- -s1g – File size 1 GB.
- -+H – Hostname of the PIT server.
- -+m – hosts.cfg file contains the configuration information of the clients for cluster testing.
- -t – Allows the user to specify how many threads or processes to have active during the measurement.
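The hosts.cfg file passed with -+m is not reproduced in this article. As a rough sketch, IOzone's cluster mode expects one line per client thread with three space-separated fields: the client name, the working directory on that client, and the path to the iozone executable there. The generator below assumes eight hosts with four lines each to reach 32 threads. The hostnames, the working directory and the binary path are all hypothetical placeholders, not the values used in the original test.

```shell
#!/usr/bin/env bash
# Print an IOzone cluster-mode client file: one line per thread, with
# fields <client name> <working directory> <path to iozone>.
# make_hosts_cfg and every value passed to it below are hypothetical.
make_hosts_cfg() {
  local hosts="$1" per_host="$2" workdir="$3" iozone_bin="$4" h i
  for h in $hosts; do
    for i in $(seq 1 "$per_host"); do
      printf '%s %s %s\n' "$h" "$workdir" "$iozone_bin"
    done
  done
}

hosts="node1 node2 node3 node4 node5 node6 node7 node8"
make_hosts_cfg "$hosts" 4 /mnt/mfs/iozone /usr/bin/iozone
```

Redirecting the output to hosts.cfg would produce a 32-line file matching the -t32 argument in the command line above.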
Distributed Client Test Topology
The distributed client test cluster consisted of two master servers and eight servers acting as both chunk servers and clients. All hardware components were the same as in the single client test. One additional chunk server was set up on the client machine from the previous test. All eight chunk servers used the MooseFS client to run the IOzone tests.
Distributed Test Results
The graph shows the throughput of read, write, random read and random write operations at different block sizes for the 32-thread distributed test. The X-axis shows the block size and the Y-axis shows the throughput in gigabytes per second.
Appendix
This section provides detailed results gathered during single and distributed IOzone benchmark tests.
Table 1: MooseFS Single Client IOzone Test Results
| Block size | Threads | Seq. Read (MB/s) | Seq. Read (IOPS) | Seq. Write (MB/s) | Seq. Write (IOPS) | Rnd Read (MB/s) | Rnd Read (IOPS) | Rnd Write (MB/s) | Rnd Write (IOPS) |
|---|---|---|---|---|---|---|---|---|---|
| 4k | 1 | 213 | 54654 | 113 | 28803 | 25 | 6296 | 137 | 35151 |
| 4k | 2 | 403 | 103114 | 205 | 52590 | 46 | 11879 | 242 | 61839 |
| 4k | 4 | 396 | 101352 | 200 | 51247 | 90 | 23146 | 360 | 92234 |
| 4k | 6 | 355 | 90860 | 176 | 44933 | 132 | 33697 | 353 | 90347 |
| 4k | 8 | 366 | 93679 | 190 | 48594 | 178 | 45668 | 343 | 87759 |
| 4k | 10 | 379 | 97018 | 207 | 52893 | 229 | 58661 | 294 | 75150 |
| 4k | 12 | 408 | 104362 | 236 | 60301 | 278 | 71150 | 314 | 80390 |
| 4k | 14 | 433 | 110837 | 256 | 65526 | 330 | 84432 | 328 | 83970 |
| 4k | 16 | 429 | 109837 | 260 | 66547 | 378 | 96716 | 300 | 76921 |
| 8k | 1 | 379 | 48528 | 224 | 28657 | 47 | 5960 | 224 | 28619 |
| 8k | 2 | 695 | 88976 | 376 | 48101 | 89 | 11395 | 448 | 57324 |
| 8k | 4 | 953 | 121935 | 386 | 49373 | 167 | 21408 | 663 | 84917 |
| 8k | 6 | 688 | 88101 | 344 | 44030 | 246 | 31479 | 685 | 87694 |
| 8k | 8 | 678 | 86799 | 361 | 46152 | 330 | 42183 | 627 | 80299 |
| 8k | 10 | 683 | 87478 | 392 | 50125 | 426 | 54494 | 557 | 71307 |
| 8k | 12 | 693 | 88661 | 449 | 57461 | 512 | 65597 | 567 | 72601 |
| 8k | 14 | 727 | 93104 | 478 | 61248 | 601 | 76927 | 569 | 72781 |
| 8k | 16 | 761 | 97395 | 489 | 62582 | 679 | 86893 | 547 | 70006 |
| 16k | 1 | 565 | 36159 | 376 | 24085 | 85 | 5430 | 395 | 25302 |
| 16k | 2 | 1059 | 67756 | 662 | 42382 | 162 | 10354 | 774 | 49535 |
| 16k | 4 | 1718 | 109982 | 685 | 43850 | 299 | 19128 | 1116 | 71438 |
| 16k | 6 | 1493 | 95564 | 643 | 41177 | 457 | 29273 | 1178 | 75389 |
| 16k | 8 | 1196 | 76520 | 663 | 42414 | 626 | 40063 | 1158 | 74084 |
| 16k | 10 | 1142 | 73102 | 723 | 46271 | 800 | 51219 | 955 | 61111 |
| 16k | 12 | 1130 | 72338 | 815 | 52133 | 970 | 62097 | 949 | 60734 |
| 16k | 14 | 1125 | 72021 | 845 | 54087 | 1122 | 71829 | 945 | 60450 |
| 16k | 16 | 1147 | 73416 | 853 | 54591 | 1288 | 82424 | 904 | 57849 |
| 32k | 1 | 806 | 25781 | 578 | 18499 | 148 | 4727 | 599 | 19166 |
| 32k | 2 | 1384 | 44303 | 1107 | 35414 | 281 | 8998 | 1204 | 38540 |
| 32k | 4 | 2400 | 76799 | 1279 | 40927 | 531 | 17000 | 1650 | 52794 |
| 32k | 6 | 2594 | 82992 | 1127 | 36068 | 800 | 25588 | 1606 | 51403 |
| 32k | 8 | 1936 | 61944 | 1163 | 37230 | 1095 | 35055 | 1839 | 58860 |
| 32k | 10 | 1797 | 57509 | 1235 | 39517 | 1410 | 45129 | 1382 | 44226 |
| 32k | 12 | 1713 | 54822 | 1352 | 43270 | 1706 | 54596 | 1452 | 46452 |
| 32k | 14 | 1688 | 54031 | 1367 | 43747 | 1986 | 63548 | 1432 | 45812 |
| 32k | 16 | 1707 | 54627 | 1380 | 44166 | 2252 | 72064 | 1423 | 45540 |
| 64k | 1 | 943 | 15084 | 715 | 11446 | 229 | 3659 | 926 | 14821 |
| 64k | 2 | 1563 | 25004 | 1412 | 22594 | 428 | 6848 | 1666 | 26658 |
| 64k | 4 | 2691 | 43059 | 2184 | 34944 | 838 | 13408 | 2709 | 43345 |
| 64k | 6 | 3009 | 48147 | 1771 | 28339 | 1267 | 20266 | 2694 | 43102 |
| 64k | 8 | 3244 | 51909 | 1803 | 28846 | 1737 | 27798 | 2579 | 41265 |
| 64k | 10 | 3347 | 53550 | 1810 | 28967 | 2187 | 34986 | 1956 | 31299 |
| 64k | 12 | 2511 | 40173 | 1949 | 31185 | 2603 | 41654 | 1998 | 31973 |
| 64k | 14 | 2557 | 40918 | 1970 | 31518 | 3017 | 48264 | 1981 | 31695 |
| 64k | 16 | 2658 | 42525 | 1965 | 31444 | 3311 | 52970 | 1971 | 31543 |
| 128k | 1 | 1026 | 8206 | 804 | 6432 | 331 | 2644 | 903 | 7221 |
| 128k | 2 | 1779 | 14231 | 1613 | 12901 | 621 | 4969 | 1921 | 15365 |
| 128k | 4 | 2810 | 22484 | 3036 | 24287 | 1262 | 10093 | 3549 | 28394 |
| 128k | 6 | 3223 | 25784 | 2461 | 19691 | 1835 | 14682 | 3303 | 26423 |
| 128k | 8 | 3324 | 26590 | 2439 | 19512 | 2469 | 19749 | 3155 | 25241 |
| 128k | 10 | 3404 | 27230 | 2350 | 18798 | 3017 | 24137 | 2438 | 19505 |
| 128k | 12 | 3386 | 27091 | 2472 | 19773 | 3556 | 28451 | 2515 | 20119 |
| 128k | 14 | 3484 | 27874 | 2478 | 19822 | 3915 | 31317 | 2491 | 19925 |
| 128k | 16 | 3303 | 26425 | 2492 | 19938 | 3843 | 30742 | 2487 | 19899 |
| 256k | 1 | 1036 | 4143 | 823 | 3291 | 334 | 1338 | 906 | 3625 |
| 256k | 2 | 1713 | 6852 | 1678 | 6710 | 629 | 2514 | 1959 | 7837 |
| 256k | 4 | 2727 | 10909 | 3264 | 13057 | 1242 | 4968 | 3715 | 14860 |
| 256k | 6 | 3032 | 12129 | 2691 | 10763 | 1861 | 7444 | 3091 | 12362 |
| 256k | 8 | 3287 | 13150 | 2840 | 11361 | 2488 | 9953 | 3618 | 14473 |
| 256k | 10 | 3378 | 13512 | 2567 | 10268 | 3059 | 12235 | 2629 | 10518 |
| 256k | 12 | 3411 | 13645 | 2640 | 10560 | 3640 | 14560 | 2659 | 10637 |
| 256k | 14 | 3354 | 13417 | 2655 | 10621 | 3942 | 15768 | 2635 | 10542 |
| 256k | 16 | 3267 | 13069 | 2646 | 10584 | 3886 | 15544 | 2646 | 10585 |
| 512k | 1 | 1058 | 2116 | 829 | 1659 | 334 | 669 | 960 | 1919 |
| 512k | 2 | 1689 | 3377 | 1636 | 3272 | 622 | 1245 | 2054 | 4108 |
| 512k | 4 | 2785 | 5571 | 3313 | 6626 | 1269 | 2539 | 3897 | 7794 |
| 512k | 6 | 3177 | 6355 | 2844 | 5689 | 1841 | 3682 | 3478 | 6956 |
| 512k | 8 | 3380 | 6760 | 2826 | 5652 | 2546 | 5091 | 4384 | 8769 |
| 512k | 10 | 3406 | 6813 | 2661 | 5323 | 3078 | 6156 | 2733 | 5465 |
| 512k | 12 | 3437 | 6874 | 2742 | 5483 | 3623 | 7245 | 2738 | 5477 |
| 512k | 14 | 3424 | 6849 | 2729 | 5459 | 3969 | 7939 | 2733 | 5465 |
| 512k | 16 | 3277 | 6554 | 2742 | 5484 | 3844 | 7688 | 2730 | 5461 |
| 1024k | 1 | 1031 | 1031 | 841 | 841 | 335 | 335 | 969 | 969 |
| 1024k | 2 | 1648 | 1648 | 1607 | 1607 | 628 | 628 | 2080 | 2080 |
| 1024k | 4 | 2774 | 2774 | 3330 | 3330 | 1258 | 1258 | 3966 | 3966 |
| 1024k | 6 | 3176 | 3176 | 3087 | 3087 | 1792 | 1792 | 3103 | 3103 |
| 1024k | 8 | 3274 | 3274 | 2721 | 2721 | 2480 | 2480 | 3767 | 3767 |
| 1024k | 10 | 3442 | 3442 | 2698 | 2698 | 3118 | 3118 | 2777 | 2777 |
| 1024k | 12 | 3373 | 3373 | 2777 | 2777 | 3602 | 3602 | 2767 | 2767 |
| 1024k | 14 | 3389 | 3389 | 2795 | 2795 | 3917 | 3917 | 2768 | 2768 |
| 1024k | 16 | 3353 | 3353 | 2805 | 2805 | 3838 | 3838 | 2797 | 2797 |
| 2048k | 1 | 1020 | 510 | 815 | 407 | 337 | 169 | 958 | 479 |
| 2048k | 2 | 1714 | 857 | 1581 | 790 | 629 | 315 | 2090 | 1045 |
| 2048k | 4 | 2734 | 1367 | 3283 | 1642 | 1255 | 627 | 4009 | 2005 |
| 2048k | 6 | 3075 | 1538 | 3103 | 1551 | 1838 | 919 | 3298 | 1649 |
| 2048k | 8 | 3248 | 1624 | 2686 | 1343 | 2507 | 1253 | 3837 | 1919 |
| 2048k | 10 | 3381 | 1691 | 2777 | 1388 | 3054 | 1527 | 2822 | 1411 |
| 2048k | 12 | 3388 | 1694 | 2819 | 1409 | 3602 | 1801 | 2826 | 1413 |
| 2048k | 14 | 3397 | 1699 | 2864 | 1432 | 3869 | 1934 | 2823 | 1411 |
| 2048k | 16 | 3380 | 1690 | 2824 | 1412 | 3899 | 1950 | 2803 | 1401 |
Table 2: MooseFS Distributed IOzone Test with 32 Threads
| Block size | Threads | Seq. Read (MB/s) | Seq. Read (IOPS) | Seq. Write (MB/s) | Seq. Write (IOPS) | Rnd Read (MB/s) | Rnd Read (IOPS) | Rnd Write (MB/s) | Rnd Write (IOPS) |
|---|---|---|---|---|---|---|---|---|---|
| 4k | 32 | 5575 | 1427207 | 5008 | 1281999 | 734 | 187937 | 537 | 137516 |
| 8k | 32 | 10570 | 1352946 | 7602 | 973014 | 1262 | 161583 | 939 | 120175 |
| 16k | 32 | 15724 | 1006352 | 7947 | 508592 | 2309 | 147771 | 1586 | 101510 |
| 32k | 32 | 17581 | 562595 | 7711 | 246761 | 4143 | 132588 | 2408 | 77062 |
| 64k | 32 | 18623 | 297962 | 7853 | 125656 | 6805 | 108881 | 3585 | 57363 |
| 128k | 32 | 18552 | 148417 | 7839 | 62712 | 10144 | 81151 | 4000 | 32001 |
| 256k | 32 | 18590 | 74362 | 7833 | 31332 | 10218 | 40871 | 3872 | 15489 |
| 512k | 32 | 18704 | 37409 | 7878 | 15757 | 10323 | 20646 | 3964 | 7928 |
| 1024k | 32 | 18700 | 18700 | 7802 | 7802 | 10371 | 10371 | 3828 | 3828 |
| 2048k | 32 | 18247 | 9123 | 7565 | 3783 | 10424 | 5212 | 4950 | 2475 |
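The IOPS columns in both tables appear consistent with a simple conversion from throughput: IOPS is roughly MB/s multiplied by 1024 and divided by the block size in kilobytes. The sketch below illustrates that relation under this assumption. `iops_from_mbps` is a hypothetical helper name, and the inputs are values taken from the tables.

```shell
#!/usr/bin/env bash
# Approximate IOPS from throughput: MB/s * 1024 / block size in KB.
# iops_from_mbps is a hypothetical helper; integer arithmetic only.
iops_from_mbps() {
  local mbps="$1" block_kb="$2"
  echo $(( mbps * 1024 / block_kb ))
}

iops_from_mbps 1031 1024    # Table 1, 1024k, 1 thread, sequential read
iops_from_mbps 1020 2048    # Table 1, 2048k, 1 thread, sequential read
iops_from_mbps 18700 1024   # Table 2, 1024k, sequential read
```

For small block sizes the result differs slightly from the tabulated IOPS, presumably because the MB/s columns are rounded to whole numbers.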