dimba

Reputation: 27641

NIC memory management and RSS queues

I want to understand how a NIC manages memory for ring buffers.

Say I have Q RSS queues of size N. The driver will allocate Q ring buffers in kernel space, each of size N packets.

My question is what happens on the HW side if the OS fails to pull packets, or pulls them slowly, for a particular queue, and there are N packets on the NIC side waiting to be pulled. I can imagine two scenarios:

  1. Packets for the queue will "eat" all the memory of the NIC, thus forcing the NIC to drop packets for other queues
  2. The NIC will stop receiving packets for the queue when it reaches N packets, so the rest of the queues will be left unaffected

Thanks

Upvotes: 4

Views: 1032

Answers (2)

Sreeram Nair

Reputation: 2393

Current network stacks (and commodity OSes in general) have developed from models based on simple NICs that feed single-core CPUs incrementally. When multicore machines became prevalent and the scalability of the software stack became a serious concern, significant efforts were made to adapt these models to take advantage of multiple cores.

As with any other rule hard-coded in NIC hardware, the main drawback of RSS is that the OS has little or no influence over how queues are allocated to flows.

RSS drawbacks can be overcome by using more flexible NIC filters or by trying to assign queues to flows smartly using software built into the operating system.

The following ASCII art describes how the ring might look after the hardware has received two packets and delivered an interrupt to the OS:

    +--------------+ <----- OS Pointer
    | Descriptor 0 |
    +--------------+
    | Descriptor 1 |
    +--------------+ <----- Hardware Pointer
    | Descriptor 2 |
    +--------------+
    |     ...      |
    +--------------+
    | Descriptor n |
    +--------------+

When the OS receives the interrupt, it reads where the hardware pointer is and processes the packets between its own pointer and the hardware's. After that, it has nothing further to do until it has prepared those descriptors with fresh buffers. Once it has, it updates its pointer by writing to the hardware. For example, if the OS has processed those first two descriptors and then updates the hardware, the ring will look something like:

    +--------------+
    | Descriptor 0 |
    +--------------+
    | Descriptor 1 |
    +--------------+ <----- Hardware Pointer, OS Pointer
    | Descriptor 2 |
    +--------------+
    |     ...      |
    +--------------+
    | Descriptor n |
    +--------------+
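
The two diagrams above can be mimicked with a small toy model. This is only a sketch in Python, not a real driver or NIC API: the class name `RxRing`, its methods, and the `filled` counter are made up for illustration, and it assumes that the hardware simply drops a packet when every descriptor in the ring still holds an unprocessed one.

```python
class RxRing:
    """Toy model of one receive descriptor ring (hypothetical, not a driver API)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # one packet buffer per descriptor
        self.hw = 0                 # hardware pointer: next descriptor to fill
        self.os = 0                 # OS pointer: next descriptor to process
        self.filled = 0             # descriptors holding unprocessed packets
        self.dropped = 0            # packets lost because the ring was full

    def hw_receive(self, packet):
        """Hardware side: store a packet, or drop it if no descriptor is free."""
        if self.filled == self.size:
            # Assumption: a full ring means the new packet is dropped.
            self.dropped += 1
            return False
        self.slots[self.hw] = packet
        self.hw = (self.hw + 1) % self.size
        self.filled += 1
        return True

    def os_poll(self):
        """OS side: process everything between the OS and hardware pointers."""
        packets = []
        while self.filled > 0:
            packets.append(self.slots[self.os])
            self.slots[self.os] = None  # descriptor is handed back with a fresh buffer
            self.os = (self.os + 1) % self.size
            self.filled -= 1
        return packets
```

After two calls to `hw_receive` on a fresh ring, `hw` is 2 and `os` is 0, which matches the first diagram. After `os_poll`, both pointers are 2, which matches the second.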

Sending packets works similarly. The OS fills in descriptors and then notifies the hardware. Once the hardware has sent them out on the wire, it raises an interrupt and indicates which descriptors it has written to the network, allowing the OS to free the associated memory.
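
The send path can be sketched the same way. Again this is only a toy model with hypothetical names (`TxRing`, `os_post`, `hw_transmit`, `os_reclaim`); it uses ever-increasing counters and takes them modulo the ring size to find a descriptor, which is one common way to tell a full ring from an empty one, not necessarily what any particular NIC does.

```python
class TxRing:
    """Toy model of one transmit descriptor ring (hypothetical, not a driver API)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.posted = 0     # descriptors the OS has filled so far
        self.sent = 0       # descriptors the hardware has put on the wire so far
        self.reclaimed = 0  # descriptors whose buffers the OS has freed so far

    def os_post(self, buf):
        """OS side: fill the next descriptor, unless the ring is full."""
        if self.posted - self.reclaimed == self.size:
            return False  # every descriptor still owns a buffer
        self.slots[self.posted % self.size] = buf
        self.posted += 1
        return True

    def hw_transmit(self):
        """Hardware side: send everything posted so far and return it."""
        wire = []
        while self.sent < self.posted:
            wire.append(self.slots[self.sent % self.size])
            self.sent += 1
        return wire

    def os_reclaim(self):
        """OS side, after the interrupt: free buffers of completed descriptors."""
        freed = 0
        while self.reclaimed < self.sent:
            self.slots[self.reclaimed % self.size] = None
            self.reclaimed += 1
            freed += 1
        return freed
```

Note that in this sketch a descriptor only becomes reusable after `os_reclaim`, not as soon as the hardware has sent it, which is why the completion interrupt matters.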

Upvotes: 1

Sam Mason

Reputation: 16213

not an expert here, using the opportunity to learn a bit about how higher performance network cards work. this question seems to be dependent on the type of network adaptor you're using and to a lesser extent the kernel (e.g. how it sets up the hardware). the Linux docs I could find seemed to refer to the bnx2x driver, e.g. kernel docs and also RHEL 6 docs. that said, I couldn't find much in the way of technical docs about that NIC, and I had much more luck with Intel and I spent a while going through the X710 docs

as far as I can tell, the queues are just ring-buffers, and the NIC can only write into descriptors the kernel has handed back to it. hence if the kernel doesn't get through packets fast enough, the ring fills up and new packets for that queue are dropped rather than overwriting old ones. I couldn't find this behaviour explicitly documented with respect to RSS, but it seems to make sense

the queues are also basically independent, so if/when this happens it shouldn't affect other queues and hence their flows should be unaffected
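
to make the independence argument concrete, here is a rough simulation of Q rings of N entries where one queue is never drained. it is a sketch under loud assumptions: each ring is fully independent and drops only its own packets when full (real hardware may share an on-chip buffer between queues), and `zlib.crc32` is just a stand-in for the NIC's RSS hash, which is typically a Toeplitz hash. all names (`pick_queue`, `simulate`, the constants) are made up.

```python
import zlib

N_QUEUES = 4   # Q in the question
RING_SIZE = 8  # N in the question


def pick_queue(flow):
    """Map a flow tuple to a queue. Stand-in for the NIC's RSS hash."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % N_QUEUES


def simulate(flows, rounds, stalled_queue):
    """Each round, every flow sends one packet; the OS drains all queues but one."""
    pending = [0] * N_QUEUES  # unprocessed packets sitting in each ring
    dropped = [0] * N_QUEUES  # packets lost per queue
    for _ in range(rounds):
        for flow in flows:
            q = pick_queue(flow)
            if pending[q] == RING_SIZE:
                dropped[q] += 1  # assumption: a full ring drops its own packet only
            else:
                pending[q] += 1
        for q in range(N_QUEUES):
            if q != stalled_queue:
                pending[q] = 0   # the OS keeps up with every other queue
    return pending, dropped
```

with a handful of flows and one stalled queue, only that queue's drop counter grows in this model; whether a given NIC behaves like this is exactly the hardware-dependent part.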

Upvotes: 0
