mj yu

Reputation: 1

The bandwidth measured by my test code is not the same as that measured by perftest

To learn RDMA, I found an example on the Internet that is similar to the one provided by Mellanox. When I ran it on two machines, I hit the following problems:

1. There is a big gap between the bandwidth measured by the example code and the bandwidth measured by perftest.

2. In addition, using GID index 0 or 2 on either of the two machines significantly reduces the bandwidth.

Machine A:

Configuration:

hca_id: mlx5_bond_0
        transport:                      InfiniBand (0)
        fw_ver:                         20.39.3004
        node_guid:                      1070:fd03:00e5:f118
        sys_image_guid:                 1070:fd03:00e5:f118
        vendor_id:                      0x02c9
        vendor_part_id:                 4123
        hw_ver:                         0x0
        board_id:                       MT_0000000224
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx5_bond_0     1       0       fe80:0000:0000:0000:b0fc:4eff:feb3:1112                 v1      bond0
mlx5_bond_0     1       1       fe80:0000:0000:0000:b0fc:4eff:feb3:1112                 v2      bond0
mlx5_bond_0     1       2       0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61    v1      bond0
mlx5_bond_0     1       3       0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61    v2      bond0

perftest result with GID index 1:

---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
RX depth:               1
post_list:              1
inline_size:            0
 Dual-port       : OFF          Device         : mlx5_bond_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 CQ Moderation   : 1
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 1
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
 GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
 remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
 GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MiB/sec]    BW average[MiB/sec]   MsgRate[Mpps]
 65536      1000             10829.53            10829.17       0.173267
---------------------------------------------------------------------------------------

Machine B:

hca_id: mlx5_bond_0
        transport:                      InfiniBand (0)
        fw_ver:                         20.39.3004
        node_guid:                      e8eb:d303:0032:b212
        sys_image_guid:                 e8eb:d303:0032:b212
        vendor_id:                      0x02c9
        vendor_part_id:                 4123
        hw_ver:                         0x0
        board_id:                       MT_0000000224
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

DEV     PORT    INDEX   GID                                     IPv4              VER     DEV
---     ----    -----   ---                                     ------------      ---     ---
mlx5_bond_0     1       0       fe80:0000:0000:0000:649b:9aff:feac:0929                   v1      bond0
mlx5_bond_0     1       1       fe80:0000:0000:0000:649b:9aff:feac:0929                   v2      bond0
mlx5_bond_0     1       2       0000:0000:0000:0000:0000:ffff:0a77:2e3e   10.119.46.62    v1      bond0
mlx5_bond_0     1       3       0000:0000:0000:0000:0000:ffff:0a77:2e3e   10.119.46.62    v2      bond0
n_gids_found=4

perftest result with GID index 0:

                    RDMA_Read BW Test
RX depth:               1
post_list:              1
inline_size:            0
 Dual-port       : OFF          Device         : mlx5_bond_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 CQ Moderation   : 1
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 1
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
 GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
 remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
 GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MiB/sec]    BW average[MiB/sec]   MsgRate[Mpps]
 65536      1000             10829.53            10829.17       0.173267
---------------------------------------------------------------------------------------

With the example code, the bandwidth is about 0.0124 GB/s when Machine A uses GID 0 and Machine B uses GID 0 or GID 1, and about 6 GB/s when both machines use GID 1. I would like to know what optimizations the perftest code applies, or what deficiencies in the example code above cause such a big difference in the measured bandwidth.

Upvotes: -1

Views: 39

Answers (1)

yupe

Reputation: 127

The reason is that the example code cannot saturate the hardware: the messages are too small. With small messages, the per-transfer preparation time (posting the work request, polling the completion) cannot be ignored, whereas large messages keep the link busy long enough to reach full speed.
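For illustration, here is a minimal sketch of a measurement loop that avoids that overhead: it issues large RDMA reads and keeps several of them outstanding at once, which is roughly what perftest does. It assumes the RC QP, CQ and MR are already set up and the remote address and rkey have been exchanged; the names, sizes and depths below are illustrative, not taken from the asker's example.

/* Sketch only: assumes an already connected RC QP, a CQ, and a local MR
 * of at least MSG_SIZE bytes; the remote MR must allow IBV_ACCESS_REMOTE_READ
 * and also be at least MSG_SIZE bytes. All names here are illustrative. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <time.h>

#define MSG_SIZE (8 * 1024 * 1024)  /* large message: 8 MiB per read      */
#define INFLIGHT 16                 /* keep several reads outstanding     */
#define ITERS    1000

static double now_sec(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Returns measured bandwidth in GB/s, or a negative value on error. */
double read_bw(struct ibv_qp *qp, struct ibv_cq *cq, struct ibv_mr *mr,
               void *buf, uint64_t remote_addr, uint32_t rkey)
{
        struct ibv_sge sge = {
                .addr   = (uintptr_t)buf,
                .length = MSG_SIZE,
                .lkey   = mr->lkey,
        };
        struct ibv_send_wr wr = {
                .sg_list    = &sge,
                .num_sge    = 1,
                .opcode     = IBV_WR_RDMA_READ,
                .send_flags = IBV_SEND_SIGNALED,
        };
        wr.wr.rdma.remote_addr = remote_addr;
        wr.wr.rdma.rkey        = rkey;

        struct ibv_send_wr *bad_wr;
        struct ibv_wc wc;
        int posted = 0, completed = 0;
        double start = now_sec();

        while (completed < ITERS) {
                /* Keep the pipeline full instead of one read at a time. */
                while (posted < ITERS && posted - completed < INFLIGHT) {
                        if (ibv_post_send(qp, &wr, &bad_wr))
                                return -1.0;
                        posted++;
                }
                int n = ibv_poll_cq(cq, 1, &wc);
                if (n < 0 || (n == 1 && wc.status != IBV_WC_SUCCESS))
                        return -1.0;
                completed += n;
        }

        double secs = now_sec() - start;
        return (double)MSG_SIZE * ITERS / secs / 1e9;
}

Note that the QP has to be created with enough send WQEs and a large enough initiator depth (max_rd_atomic in ibv_modify_qp) to actually keep that many reads in flight; the "Outstand reads: 16" line in the perftest output above reflects exactly that setting.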

Try the --all parameter of the perftest programs to see how the speed differs across message sizes from 2 bytes up to 2^23 bytes.
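For example, on these machines, running something like ib_read_bw -d mlx5_bond_0 -x 1 -a on one side and the same command with the other side's IP (for example 10.119.46.61) appended on the client side should show the bandwidth climbing as the message size grows; adjust the device name and GID index to your own setup.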

Upvotes: 0
