Reputation: 13116
As is known, there is this document: https://www.kernel.org/doc/Documentation/networking/scaling.txt
Does it mean that:
Is that correct?
Upvotes: 4
Views: 12576
Reputation: 1744
osgx's answer covers the main differences, but it is important to point out that it is also possible to use RSS and RPS in unison.
RSS controls which HW queue receives a given stream of packets. Once certain conditions are met, an interrupt is issued to the software. The interrupt handler, which is defined by the NIC's driver, is the software starting point for processing received packets. The code there polls packets from the relevant receive queue, may perform some initial processing, and then moves the packets on for higher-level protocol processing.
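To make that driver-side step concrete, here is a minimal sketch of a NAPI poll handler; my_ring_get_skb() is a hypothetical stand-in for a real driver's ring-buffer access code:

/* Minimal sketch of a driver's NAPI poll handler. my_ring_get_skb() is a
 * hypothetical stand-in for the driver-specific ring-buffer access code. */
static int my_napi_poll(struct napi_struct *napi, int budget)
{
        int done = 0;
        struct sk_buff *skb;

        /* Pull up to 'budget' packets from the HW queue that RSS selected. */
        while (done < budget && (skb = my_ring_get_skb(napi)) != NULL) {
                /* Hand the packet to the stack; this eventually reaches
                 * netif_receive_skb(), where RPS (if configured) may
                 * re-steer processing to another CPU. */
                napi_gro_receive(napi, skb);
                done++;
        }

        if (done < budget)
                napi_complete_done(napi, done); /* all done, re-enable IRQs */

        return done;
}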
At this point the RPS mechanism may be used, if it is configured. The driver calls netif_receive_skb(), which (eventually) checks for an RPS configuration. If one exists, the SKB is enqueued for further processing on the selected CPU:
int netif_receive_skb(struct sk_buff *skb)
{
        ...
        return netif_receive_skb_internal(skb);
}

static int netif_receive_skb_internal(struct sk_buff *skb)
{
        ...
        /* Ask RPS which CPU should continue processing this skb. */
        int cpu = get_rps_cpu(skb->dev, skb, &rflow);

        if (cpu >= 0) {
                /* Queue the skb on the selected CPU's backlog; that CPU is
                 * woken with an IPI if needed. */
                ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
                rcu_read_unlock();
                return ret;
        }
        ...
}
In some scenarios, it would be smart to use RSS and RPS together in order to avoid CPU utilization bottlenecks on the receiving side. A good example is IPoIB (IP over Infiniband). Without diving into too many details, IPoIB has a mode which can only open a single channel. This means all the incoming traffic would be handled by a single core. By properly configuring RPS, some of the processing load can be shared by multiple cores, which dramatically improves performance for this scenario.
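For reference, RPS is configured per receive queue through the rps_cpus bitmap in sysfs, as described in scaling.txt. Below is a minimal userspace sketch that enables RPS for CPUs 0-3 on queue rx-0 of a hypothetical IPoIB device named ib0 (both the device name and the mask are example values):

/* Minimal sketch: write a CPU bitmap to the per-queue rps_cpus file.
 * Device name "ib0" and mask "f" (CPUs 0-3) are example values only. */
#include <stdio.h>

int main(void)
{
        const char *path = "/sys/class/net/ib0/queues/rx-0/rps_cpus";
        FILE *f = fopen(path, "w");

        if (!f) {
                perror("fopen");
                return 1;
        }
        fputs("f\n", f);        /* hex bitmap: CPUs 0-3 may do RPS processing */
        fclose(f);
        return 0;
}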
Since transmitting was mentioned, it is worth noting that packet transmission triggered by the receive path (ACKs, forwarding) is processed on the same core that netif_receive_skb() selected.
Hope this helps.
Upvotes: 5
Reputation: 94245
Quotes are from https://www.kernel.org/doc/Documentation/networking/scaling.txt.
RSS should be enabled when latency is a concern or whenever receive interrupt processing forms a bottleneck. Spreading load between CPUs decreases queue length.
RPS has some advantages over RSS: 1) it can be used with any NIC, 2) software filters can easily be added to hash over new protocols, 3) it does not increase hardware device interrupt rate (although it does introduce inter-processor interrupts (IPIs)).
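As a rough illustration of the software side of this (not the actual kernel code, which lives in get_rps_cpu()), RPS computes a flow hash over the packet in software and uses it to pick one CPU out of the configured mask:

/* Toy sketch of RPS CPU selection: a software flow hash indexes into the
 * set of CPUs configured in rps_cpus. Names are illustrative only. */
#include <stdint.h>

static int rps_select_cpu(uint32_t flow_hash, const int *allowed_cpus,
                          unsigned int n_allowed)
{
        return allowed_cpus[flow_hash % n_allowed];
}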
The goal of RFS is to increase datacache hitrate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running. RFS relies on the same RPS mechanisms to enqueue packets onto the backlog of another CPU and to wake up that CPU. ... In RFS, packets are not forwarded directly by the value of their hash, but the hash is used as index into a flow lookup table. This table maps flows to the CPUs where those flows are being processed.
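The flow lookup table idea can be sketched like this (a toy model, not the kernel's rps_sock_flow_table implementation): the socket calls record which CPU the application runs on, and the receive path steers packets of the same flow there.

/* Toy model of RFS: map flow hash -> CPU of the consuming application. */
#include <stdint.h>

#define FLOW_TABLE_SIZE 4096    /* power of two, like rps_sock_flow_entries */

static int flow_cpu[FLOW_TABLE_SIZE];

static void flow_table_init(void)
{
        for (int i = 0; i < FLOW_TABLE_SIZE; i++)
                flow_cpu[i] = -1;       /* -1 = flow not seen yet */
}

/* Called from the socket's recvmsg()/sendmsg() path (conceptually). */
static void record_flow_cpu(uint32_t flow_hash, int current_cpu)
{
        flow_cpu[flow_hash & (FLOW_TABLE_SIZE - 1)] = current_cpu;
}

/* Called from the receive path: prefer the application's CPU, otherwise
 * fall back to the CPU plain RPS would have chosen. */
static int rfs_pick_cpu(uint32_t flow_hash, int rps_fallback_cpu)
{
        int cpu = flow_cpu[flow_hash & (FLOW_TABLE_SIZE - 1)];

        return (cpu >= 0) ? cpu : rps_fallback_cpu;
}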
Accelerated RFS (requires hardware/driver support via the ndo_rx_flow_steer callback): "Accelerated RFS is to RFS what RSS is to RPS: a hardware-accelerated load balancing mechanism that uses soft state to steer flows based on where the application thread consuming the packets of each flow is running."
XPS (Transmit Packet Steering) is a similar method for packet transmission: the packet is already generated and ready to be sent, so the mechanism only selects the best queue to send it on (which also eases post-processing such as freeing the skb).
Upvotes: 11