Reputation: 39065
While writing a simple DPDK packet generator I noticed some additional initialization steps that are required for reliable and successful packet transmission:
calling rte_eth_link_get() or rte_eth_timesync_enable() after rte_eth_dev_start()
waiting 2 seconds before the first call to rte_eth_tx_burst()
So these steps are necessary when I use the ixgbe DPDK vfio driver with an Intel X553 NIC.
When I'm using the AF_PACKET DPDK driver, it works without those extra steps.
Is it expected that one has to call rte_eth_link_get() in order to complete the device initialization for transmission?
Additional information: When I connect the NIC to a mirrored port (which is configured via Mikrotik's mirror-source/mirror-target ethernet switch settings) and the sleep(2) is removed, then I see the first packet transmitted to the mirror target but not to the primary destination. Thus, it seems like the sleep is necessary to give the switch some time after the link is up (after the DPDK program start) to completely initialize its forwarding table or something like that?
Waiting just 1 second before the first transmission works less reliably, i.e. the packet reaches the receiver only every odd time.
My device/port initialization procedure implements the following setup sequence:
rte_eth_dev_count_avail()
rte_eth_dev_is_valid_port()
rte_eth_dev_info_get()
rte_eth_dev_adjust_nb_rx_tx_desc()
rte_eth_dev_configure(port_id, 0 /* rxrings */, 1 /* txrings */, &port_conf)
rte_eth_tx_queue_setup()
rte_eth_dev_start()
rte_eth_macaddr_get()
rte_eth_link_get() // <-- REQUIRED!
rte_eth_dev_get_mtu()
Without rte_eth_link_get() (or rte_eth_timesync_enable()) the first transmitted packet doesn't even show up on the mirrored port.
The above functions (and rte_eth_tx_burst()) complete successfully with or without rte_eth_link_get()/sleep(2) being present. In particular, the MAC address and MTU that are read back have the expected values (MTU -> 1500) and rte_eth_tx_burst() returns 1 for a burst of one UDP packet.
The returned link status is: Link up at 1 Gbps FDX Autoneg
The fact that rte_eth_link_get() can be replaced with rte_eth_timesync_enable() probably can be explained by the latter calling ixgbe_start_timecounters() which calls rte_eth_linkstatus_get() which is also called by rte_eth_link_get().
I've checked the DPDK examples and most of them don't call rte_eth_link_get() before sending something. There is also no sleep after device initialization.
I'm using DPDK 20.11.2.
Even more information - to answer the comments:
I'm running this on Fedora 33 (5.13.12-100.fc33.x86_64).
Ethtool reports: firmware-version: 0x80000877
I had called rte_eth_timesync_enable() in order to work with the transmit timestamps. However, during debugging I removed it to arrive at a minimal reproducer. At that point I noticed that removing it actually made things worse (i.e. no packet transmitted over the mirror port). I thus investigated what part of that function might make the difference and found rte_eth_link_get(), which has similar side-effects.
When switching to AF_PACKET I'm using the stock ixgbe kernel driver, i.e. ixgbe with default settings on a device that is initialized by networkd (dhcp enabled).
My expectation was that when rte_eth_dev_start() returns, the link is up and the device is ready for transmission.
However, it would be nice, I guess, if one could avoid resetting the device after program restarts. I don't know if DPDK supports this.
Regarding delays: I've just tested the following: rte_eth_link_get() can be omitted if I increase the sleep to 6 seconds. Whereas a call to rte_eth_link_get() takes 3.3 s. So yeah, it's probably just helping due to the additional delay.
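A rough way to reproduce such a measurement is to wrap the call in a monotonic-clock timer. The sketch below is plain C and does not link against DPDK: elapsed_ms(), timed_fn and blocking_stub() are hypothetical names used only for illustration, and the stub merely sleeps. In the real program the callback would presumably call rte_eth_link_get() on the port instead.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* clock_gettime(), nanosleep() */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type for the operation being timed. */
typedef void (*timed_fn)(void *ctx);

/* Run fn(ctx) once and return how long it took, in milliseconds. */
static double elapsed_ms(timed_fn fn, void *ctx)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) * 1e3 +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Stand-in for a blocking call (assumption: NOT a DPDK function).
 * ctx points to the number of milliseconds to block. */
static void blocking_stub(void *ctx)
{
    unsigned ms = *(unsigned *)ctx;
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L,
    };

    nanosleep(&ts, NULL);
}
```

Timing the startup with and without the call, while keeping the total delay before the first rte_eth_tx_burst() constant, would help separate the effect of the added delay from any side effect of the call itself.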
Upvotes: 1
Views: 1579
Reputation: 4798
Based on the interaction via comments, the real question can be summarized as: "I'm just asking myself if it's possible to keep the link-up between invocations of a DPDK program (when using a vfio device) to avoid dealing with the relatively long wait times before the first transmit comes through. IOW, is it somehow possible to skip the device reset when starting the program for the 2nd time?"
The short answer is No for the packet-generator program between restarts, because any physical NIC which uses PCIe config space, for both PF (IXGBE for X553) and VF (IXGBE_VF for X553), bound with uio_pci_generic|igb_uio|vfio-pci requires PCIe reset & configuration. But when using the AF_PACKET DPDK PMD (on top of the ixgbe kernel driver), this is a virtual device that does not do any PCIe resets and directly sets dev->data->dev_link.link_status = ETH_LINK_UP; in its eth_dev_start function.
For the second part, "Is the delay for the first TX packets expected?"
[Answer] No, as a few factors contribute to the delay in the first packet transmission
Note: Since the issue is mentioned for VF ports only (and not PF ports), my assumption is
dmesg
Steps to isolate whether the problem is the NIC:
Steps to isolate whether the problem is the switch:
FD auto-neg-disable speed 1Gbps
and check the behaviour
[EDIT-1] I do agree with the workaround suggested by @stackinside using the DPDK primary-secondary process concept, as the primary is responsible for link and port bring-up, while the secondary is used for the actual RX and TX bursts.
Upvotes: 1
Reputation:
The difference between the two attempted approaches
In order to use the af_packet PMD, you first bind the device in question to the kernel driver. At this point, a kernel network interface is spawned for that device. This interface typically has the link active by default. If not, you typically run ip link set dev <interface> up. When you launch your DPDK application, the af_packet driver does not (re-)configure the link. It just unconditionally reports the link to be up on device start (see https://github.com/DPDK/dpdk/blob/main/drivers/net/af_packet/rte_eth_af_packet.c#L266) and vice versa when doing device stop. The link update operation is also a no-op in this driver (see https://github.com/DPDK/dpdk/blob/main/drivers/net/af_packet/rte_eth_af_packet.c#L416).
In fact, with the af_packet approach, the link is already active at the time you launch the application. Hence no need to await the link.
With the VFIO approach, the device in question has its link down, and it's the responsibility of the corresponding PMD to activate it. Hence the need to test link status in the application.
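Rather than a fixed sleep(), the link status can be polled with a timeout, similar in spirit to what several DPDK sample applications do before they start forwarding. The sketch below is plain C and keeps the driver out of the picture: link_is_up_fn, wait_for_link() and the simulated port are hypothetical names for illustration only. In a DPDK 20.11 application the callback would presumably wrap rte_eth_link_get_nowait() and compare link_status against ETH_LINK_UP.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* nanosleep() */
#endif
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true once the port reports link up.
 * Assumption: a real implementation would query the PMD without blocking. */
typedef bool (*link_is_up_fn)(void *ctx);

static void sleep_ms(unsigned ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L,
    };

    nanosleep(&ts, NULL);
}

/* Poll is_up(ctx) every interval_ms until it returns true or roughly
 * timeout_ms have been spent sleeping.
 * Returns 0 when the link came up, -1 on timeout. */
static int wait_for_link(link_is_up_fn is_up, void *ctx,
                         unsigned interval_ms, unsigned timeout_ms)
{
    unsigned waited = 0;

    if (interval_ms == 0)
        interval_ms = 1; /* avoid a busy loop that never times out */

    for (;;) {
        if (is_up(ctx))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(interval_ms);
        waited += interval_ms;
    }
}

/* Simulated port for trying the loop without hardware: it reports
 * "down" for the first polls_until_up polls, then "up". */
struct sim_port {
    unsigned polls_until_up;
};

static bool sim_link_is_up(void *ctx)
{
    struct sim_port *p = ctx;

    if (p->polls_until_up == 0)
        return true;
    p->polls_until_up--;
    return false;
}
```

With a 100 ms interval and a timeout of several seconds, startup would take only as long as the link actually needs. Note that the question reports the link as up while the switch still dropped the first packet, so an additional settle delay after link-up may still be required on some switches.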
Is it possible to avoid waiting on application restarts?
Long story short, yes. Awaiting link status is not the only problem with application restarts. You effectively re-initialise EAL as a whole when you restart, and that procedure is also eye-wateringly time consuming. In order to cope with that, you should probably check out multi-process support available in DPDK (see https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html).
This requires that you re-implement your application to have its control logic in one process (the primary process) and its Rx/Tx datapath logic in another one (the secondary process). This way, you can keep the first one running all the time and restart the second one when you need to change the Rx/Tx logic or re-compile. A restarted secondary process re-attaches to the existing EAL instance every time. Hence no PMD restart is involved, and there is no more need to wait.
Upvotes: 1