Have They Measured the Whole Problem?
It's an interesting idea, but I do wonder.
One of the reasons to respond to NIC IRQs is to get each arriving packet into memory quickly, so that further packets coming in off the network can keep making their way through the NIC's limited buffers. If the traffic load is such that the application isn't really keeping up, and the kernel is now waiting for it to poll for available packets, it seems to me there's the potential for packets to get dropped. There is a parameter, irq_suspend_timeout, which appears to be a timeout ensuring that the OS starts paying attention to the NIC again if the application has taken too long to come back for more data. The suggestion is that the app developer tunes this "to cover the processing of an entire application batch"; but what about the NIC's ability to keep absorbing packets in the meantime?
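For concreteness, here's a minimal sketch (Go, using golang.org/x/sys/unix) of the sort of batch-drain loop I understand that timeout is meant to cover. Everything concrete in it - the UDP socket, the port, the buffer sizes - is my own assumption for illustration, and the busy-poll parameters plus the per-NAPI irq_suspend_timeout are configured separately rather than in this code; the point is just to show where the "application batch" sits relative to epoll_wait.

```go
package main

import (
	"log"

	"golang.org/x/sys/unix"
)

func process(pkt []byte) {
	// Placeholder for the real per-packet work.
	_ = pkt
}

func main() {
	// A non-blocking UDP socket on an assumed port; the NAPI instance feeding
	// it is the one whose IRQs would be suspended while we keep coming back
	// for more data.
	fd, err := unix.Socket(unix.AF_INET, unix.SOCK_DGRAM|unix.SOCK_NONBLOCK, 0)
	if err != nil {
		log.Fatal(err)
	}
	defer unix.Close(fd)
	if err := unix.Bind(fd, &unix.SockaddrInet4{Port: 9000}); err != nil {
		log.Fatal(err)
	}

	epfd, err := unix.EpollCreate1(0)
	if err != nil {
		log.Fatal(err)
	}
	defer unix.Close(epfd)
	ev := unix.EpollEvent{Events: unix.EPOLLIN, Fd: int32(fd)}
	if err := unix.EpollCtl(epfd, unix.EPOLL_CTL_ADD, fd, &ev); err != nil {
		log.Fatal(err)
	}

	events := make([]unix.EpollEvent, 64)
	buf := make([]byte, 2048)
	for {
		// While the application is busy between epoll_wait calls, the NIC's
		// IRQs can stay suspended; irq_suspend_timeout (configured elsewhere,
		// per NAPI instance) is the backstop that re-enables them if a batch
		// takes too long.
		n, err := unix.EpollWait(epfd, events, -1)
		if err == unix.EINTR {
			continue
		}
		if err != nil {
			log.Fatal(err)
		}
		for i := 0; i < n; i++ {
			// Drain each ready socket until EAGAIN: this drain-and-process
			// pass is the "application batch" the tuning advice refers to.
			for {
				nr, _, rerr := unix.Recvfrom(int(events[i].Fd), buf, 0)
				if rerr == unix.EAGAIN {
					break
				}
				if rerr != nil {
					log.Fatal(rerr)
				}
				process(buf[:nr])
			}
		}
	}
}
```

My question is what happens to the NIC's queues while that inner drain loop is running slowly, which the code above obviously can't answer.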
The patch documentation doesn't mention packet loss or drops at all, so I'm presuming that's covered off somehow and there was no need to explain the risk.
The thing is, dropped packets start having a big impact on the amount of energy consumed by the network itself. It costs quite a lot of power to fire bits down lengths of fibre or UTP, and a dropped packet wastes that power twice over: the original transmission achieved nothing, the packet has to be sent again, and the recovery generates more traffic than just the re-sent packets themselves. So the energy cost ends up being more than a simple doubling, which is why I'm interested in whether or not they have got packet dropping covered off somehow.
However, on the whole, a clever idea and well worthwhile!
Tuning Architectures?
Ultimately, if more network traffic is being fired at a host than the host can consume, then the architecture is perhaps wrong, or wrongly scaled. Our networks effectively implement Actor Model systems, which are notable in that a lack of performance gets hidden as increased system latency, because the system muddles through the data backlog eventually (or at least, that's the hope). Thus, it's tempting to write off the increased latency as "who cares" and move on. That is often entirely acceptable (which is why all networking, and nearly all software, kinda works that way).
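A toy version of that in Go, purely illustrative and nothing to do with the patch itself: a buffered channel plays the role of the NIC and network buffers, and the only symptom of a too-slow consumer is quietly growing queueing delay.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// A buffered channel stands in for the NIC / network buffers: the
	// producer sails on regardless (until the buffer finally fills), and the
	// only visible symptom of a too-slow consumer is growing queueing delay.
	mailbox := make(chan time.Time, 1024)

	// Fast producer: one message per millisecond.
	go func() {
		for {
			mailbox <- time.Now()
			time.Sleep(time.Millisecond)
		}
	}()

	// Slow consumer: ten milliseconds per message. Watch the delay climb.
	for sent := range mailbox {
		time.Sleep(10 * time.Millisecond)
		fmt.Printf("queueing delay %v, backlog %d\n", time.Since(sent), len(mailbox))
	}
}
```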
However, if one adopts a more Communicating Sequential Processes view of networking (think Golang's goroutines and channels, but across a network instead, or an HTTP PUT), it has the trait that if the recipient of data isn't keeping up, the sender knows all about it: send and receive block until the transfer is complete - an execution rendezvous. There's no hiding a lack of performance in buffers in NICs or networks, because there aren't any (not ones that count, anyway). It sounds like a nightmare, but actually it's quite refreshing; inadequate performance is never hidden, and you know for sure what you have to do to address it. And if you do get the balance of data and processing right, each "read" is started just as the next "send" happens, and the data transport in between shouldn't find any need to buffer or interrupt anything.
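The CSP-flavoured counterpart, again just an illustrative Go sketch rather than anything network-level: with an unbuffered channel every send is a rendezvous, so a receiver that can't keep up shows up immediately as a stalled sender instead of a silently growing backlog.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Unbuffered channel: a send is a rendezvous, completing only when the
	// receiver is actually ready to take the value.
	ch := make(chan int)

	go func() {
		for i := 0; ; i++ {
			start := time.Now()
			ch <- i // blocks here until the receiver accepts it
			if wait := time.Since(start); wait > time.Millisecond {
				fmt.Printf("send %d stalled for %v: the receiver isn't keeping up\n", i, wait)
			}
		}
	}()

	// A deliberately slow receiver; the sender notices immediately.
	for range ch {
		time.Sleep(10 * time.Millisecond)
	}
}
```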
This new mechanism brings the opportunity to kinda blend the two. It's "Actor", in that data could still build up in buffers; but if an application or system developer tuned the scale of their architecture just right relative to its processing performance, the packets would just keep rolling in and be consumed immediately, with barely an interrupt in sight - as if it were a CSP system, but without any explicit synchronisation of sender and receiver over the network.
Of course, achieving that in real life is hard for many applications. But there are some with constant data rates - for example, I/Q data streaming from a software defined radio (well, the ADC part of it, anyway). This new mechanism, paired with the fact that PREEMPT_RT has just become a first-class part of the kernel (another hooray for that!), does some interesting things for the performance that could be achieved.