Reputation: 383
I have a DPDK 19 application that reads from a NIC (MT27800 Family [ConnectX-5] 100G) with 32 RX queues using RSS.
So there are 32 processes that receive traffic from the NIC with DPDK. Each process reads from a different queue, copies the data from the mbuf into allocated memory, accumulates 6 MB, and sends it to another thread via a lock-free queue; that other thread only writes the data to disk. As a result the I/O writes are cached in Linux memory.
All processes run with CPU affinity, and isolcpus is set in GRUB.
This is a little pseudo code of what happens in each of the 32 processes that read from their queue. I can't put the real code, it is too much:
MainFunction()
{
    char* local_buf = new ...;
    size_t offset = 0;
    int nBufs = rte_eth_rx_burst(pi_nPort, pi_nQNumber, m_mbufs, 216);
    for (mbuf in m_mbufs)
    {
        memcpy(local_buf + offset, GetData(mbuf), len); // accumulate into buf
        offset += len;
        if (offset > MAX) // ~6 MB
        {
            PushToQueue(local_buf);
            local_buf = new ...;
            offset = 0;
        }
        rte_pktmbuf_free(mbuf);
    }
}
WriterThreadMainFunc()
{
    while (QueueNotEmpty)
    {
        buf = PullFromQueue();
        WriteToDisk(buf);
        delete buf;
    }
}
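For reference, the lock-free hand-off between an RX process and its writer thread can be sketched as a fixed-capacity single-producer/single-consumer ring (a minimal illustration, not the poster's actual queue; names like SpscQueue are made up here):

```cpp
#include <atomic>
#include <cstddef>
#include <utility>

// Fixed-capacity single-producer/single-consumer ring. The RX loop is the
// only producer and the writer thread the only consumer, so two atomic
// indices are enough and no locks are needed.
template <typename T, size_t N>
class SpscQueue {
public:
    bool push(T v) {
        size_t head = head_.load(std::memory_order_relaxed);
        size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire))
            return false;                     // queue full, caller must retry
        buf_[head] = std::move(v);
        head_.store(next, std::memory_order_release);
        return true;
    }
    bool pop(T& out) {
        size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false;                     // queue empty
        out = std::move(buf_[tail]);
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }
private:
    T buf_[N];
    std::atomic<size_t> head_{0}, tail_{0};
};
```

Note that push can fail when the consumer lags; in the scenario described below, a slow writer means the 6 MB buffers back up in exactly this way.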
When the server memory is completely used by the page cache (I know it is still available) I start seeing drops at the NIC.
If I delete the data from disk every minute, the cached memory is released back to free and there are no drops at the NIC. So the drops are clearly linked to the cached data. Until the first drops the application can run without drops for 2 hours. The processes don't use much memory; each process is at 500 MB.
How can I avoid the drops at the NIC?
              total        used        free      shared  buff/cache   available
Mem:           125G         77G        325M         29M         47G         47G
Swap:          8.0G        256K        8.0G
I use CentOS 7.9, Linux 3.10.0-1160.49.1.el7.x86_64.
Upvotes: 0
Views: 514
Reputation: 4798
The DPDK API rte_eth_rx_burst uses the mempool (pktmbuf) memory region to hold the metadata and the Ethernet frame. In each rx_burst cycle the ref_cnt of every received mbuf is internally set to 1, to indicate the mbuf is in use and must not be freed. Until tx_burst or rte_pktmbuf_free is invoked, the mbuf is never pushed back to the local cache or the mempool for reuse. Hence, as shared in the code snippet, the performance of WriterThreadMainFunc affects the availability of the mempool: if the rate of rx_burst (millions of packets per second) is greater than the rate of rte_pktmbuf_free, you end up in a scenario where mbuf freeing is slower than rx_burst and the pool runs dry. To validate this, check stats and xstats for the counter rx_nombuf, via rte_eth_stats_get and rte_eth_xstats_get.
Normally, files when opened (especially in RW mode) are cached in 4 KB pages or in transparent huge pages (except when THP is set to never) for performance. Based on the conversation in the comments it looks like, since caching is in effect, disk I/O runs slower, which leads WriterThreadMainFunc to run slower too. To check this behaviour, as suggested in the comments, please run echo 1 | sudo tee /proc/sys/vm/drop_caches. Also try fflush & fsync periodically. Once the problem is isolated you can use setbuf(f, NULL) to disable buffering right at the start.
Note: there are a multitude of other options for the current requirement too, such as creating files per port-queue, per flow, or per flow-port-queue with mmap.
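Polling the rx_nombuf counter mentioned earlier is a one-liner around rte_eth_stats_get. Since this sketch must stand alone, the DPDK struct and call are stubbed here with just the field being polled; in a real application include <rte_ethdev.h> and drop the stub:

```cpp
#include <cstdint>

// --- Stub so the sketch compiles stand-alone; a real DPDK app includes
// <rte_ethdev.h> and links librte_ethdev instead. ---
struct rte_eth_stats { uint64_t rx_nombuf; /* RX mbuf allocation failures */ };
static int rte_eth_stats_get(uint16_t /*port_id*/, struct rte_eth_stats* s) {
    s->rx_nombuf = 0;   // stub value; the real PMD fills in live counters
    return 0;
}

// A rising rx_nombuf means rx_burst could not get mbufs, i.e. the writer
// side frees buffers more slowly than the NIC delivers packets.
uint64_t rx_nombuf_delta(uint16_t port_id, uint64_t& last) {
    rte_eth_stats stats{};
    if (rte_eth_stats_get(port_id, &stats) != 0)
        return 0;
    uint64_t delta = stats.rx_nombuf - last;
    last = stats.rx_nombuf;
    return delta;
}
```

Calling this once per second per port and logging any non-zero delta pinpoints the moment the mempool starts starving, which can then be correlated with the page-cache fill level.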
Upvotes: 0