Reputation: 1
To learn RDMA, I found an example on the Internet that is similar to the one provided by Mellanox, but when I ran it on two machines I found the following problems:
1. There is a big gap between the bandwidth measured by the example code and the bandwidth measured by perftest.
2. In addition, using GID index 0 or 2 on either of the two machines significantly reduces the bandwidth.
Machine A:
Configuration:
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 20.39.3004
node_guid: 1070:fd03:00e5:f118
sys_image_guid: 1070:fd03:00e5:f118
vendor_id: 0x02c9
vendor_part_id: 4123
hw_ver: 0x0
board_id: MT_0000000224
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
mlx5_bond_0 1 0 fe80:0000:0000:0000:b0fc:4eff:feb3:1112 v1 bond0
mlx5_bond_0 1 1 fe80:0000:0000:0000:b0fc:4eff:feb3:1112 v2 bond0
mlx5_bond_0 1 2 0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61 v1 bond0
mlx5_bond_0 1 3 0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61 v2 bond0
perftest run with GID index 1:
---------------------------------------------------------------------------------------
RDMA_Read BW Test
RX depth: 1
post_list: 1
inline_size: 0
Dual-port : OFF Device : mlx5_bond_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 1
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
65536 1000 10829.53 10829.17 0.173267
---------------------------------------------------------------------------------------
Machine B:
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 20.39.3004
node_guid: e8eb:d303:0032:b212
sys_image_guid: e8eb:d303:0032:b212
vendor_id: 0x02c9
vendor_part_id: 4123
hw_ver: 0x0
board_id: MT_0000000224
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
mlx5_bond_0 1 0 fe80:0000:0000:0000:649b:9aff:feac:0929 v1 bond0
mlx5_bond_0 1 1 fe80:0000:0000:0000:649b:9aff:feac:0929 v2 bond0
mlx5_bond_0 1 2 0000:0000:0000:0000:0000:ffff:0a77:2e3e 10.119.46.62 v1 bond0
mlx5_bond_0 1 3 0000:0000:0000:0000:0000:ffff:0a77:2e3e 10.119.46.62 v2 bond0
n_gids_found=4
perftest run with GID index 0:
RDMA_Read BW Test
RX depth: 1
post_list: 1
inline_size: 0
Dual-port : OFF Device : mlx5_bond_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 1
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
65536 1000 10829.53 10829.17 0.173267
---------------------------------------------------------------------------------------
With the example code, the bandwidth is about 0.0124 GB/s when machine A uses GID index 0 and machine B uses GID index 0 or 1, and about 6 GB/s when both machines use GID index 1. I'd like to know what optimizations the perftest code makes, or what deficiencies in the example code cause such a big difference in the measured bandwidth.
Upvotes: -1
Views: 39
Reputation: 127
The reason is that the example code cannot exercise the hardware's full capability: the messages are too small. With small messages, the per-message preparation overhead cannot be ignored; with large messages, that overhead is amortized and the link can run at full speed.
Try the --all parameter of the perftest programs to see how the speed differs across message sizes from 2 bytes up to 2^23 bytes.
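To illustrate the point, here is a minimal sketch (not the actual perftest source) of the kind of send loop a bandwidth benchmark uses once the connection is up: it posts large RDMA READ work requests and keeps several of them in flight, polling the completion queue in batches. The function name run_read_bw, MSG_SIZE, and QUEUE_DEPTH are illustrative assumptions; the code presumes an already-connected RC QP whose max_rd_atomic allows 16 outstanding READs (matching the "Outstand reads : 16" line in the output above), a registered memory region of at least MSG_SIZE bytes covering local_buf, and the peer's address and rkey exchanged out of band.

/* Minimal sketch, not the actual perftest source: post large RDMA READs
 * and keep several in flight so per-request overhead is amortized.
 * Assumes an already-connected RC QP, a registered MR covering local_buf,
 * and the remote address/rkey exchanged out of band. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <stdio.h>

#define MSG_SIZE    (8u * 1024 * 1024)  /* large messages hide per-request setup cost */
#define QUEUE_DEPTH 16                  /* must not exceed the QP's max_rd_atomic */

static int run_read_bw(struct ibv_qp *qp, struct ibv_cq *cq, struct ibv_mr *mr,
                       void *local_buf, uint64_t remote_addr, uint32_t rkey,
                       int iters)
{
    int posted = 0, completed = 0;

    while (completed < iters) {
        /* Keep the send queue full instead of posting one request at a time. */
        while (posted - completed < QUEUE_DEPTH && posted < iters) {
            struct ibv_sge sge = {
                .addr   = (uintptr_t)local_buf,
                .length = MSG_SIZE,
                .lkey   = mr->lkey,
            };
            struct ibv_send_wr wr = {
                .wr_id               = (uint64_t)posted,
                .sg_list             = &sge,
                .num_sge             = 1,
                .opcode              = IBV_WR_RDMA_READ,
                .send_flags          = IBV_SEND_SIGNALED,
                .wr.rdma.remote_addr = remote_addr,
                .wr.rdma.rkey        = rkey,
            };
            struct ibv_send_wr *bad_wr = NULL;
            int ret = ibv_post_send(qp, &wr, &bad_wr);
            if (ret) {
                fprintf(stderr, "ibv_post_send failed: %d\n", ret);
                return -1;
            }
            posted++;
        }

        /* Drain completions in batches rather than waiting after every post. */
        struct ibv_wc wc[QUEUE_DEPTH];
        int n = ibv_poll_cq(cq, QUEUE_DEPTH, wc);
        if (n < 0) {
            fprintf(stderr, "ibv_poll_cq failed\n");
            return -1;
        }
        for (int i = 0; i < n; i++) {
            if (wc[i].status != IBV_WC_SUCCESS) {
                fprintf(stderr, "completion error: %s\n",
                        ibv_wc_status_str(wc[i].status));
                return -1;
            }
            completed++;
        }
    }
    return 0;
}

To see the same effect with perftest itself, a run such as ib_read_bw --all -d mlx5_bond_0 -x 1 on the server and ib_read_bw --all -d mlx5_bond_0 -x 1 <server-ip> on the client sweeps message sizes from 2 bytes up to 2^23 bytes and shows the bandwidth climbing as the message size grows.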
Upvotes: 0