Mark

Reputation: 6464

ebpf: drop ICMP packet in socket filter program on lo interface

Consider a very simple eBPF program of type BPF_PROG_TYPE_SOCKET_FILTER:

struct bpf_insn prog[] = {
   BPF_MOV64_IMM(BPF_REG_0, -1),
   BPF_EXIT_INSN(),
};

The code snippets below, from net/core/filter.c and net/core/sock.c, show how the filter is invoked:

static inline int pskb_trim(struct sk_buff *skb, unsigned int len)
{
    return (len < skb->len) ? __pskb_trim(skb, len) : 0;
}

...

int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap)
{
        int err;
        ...

        if (filter) {
                pkt_len = bpf_prog_run_save_cb(filter->prog, skb);
                skb->sk = save_sk;
                err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
        }
        ...
        return err;
}
...

static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
{
    return sk_filter_trim_cap(sk, skb, 1);
}

Eventually sk_filter() is called by sock_queue_rcv_skb(), i.e. a packet reaching the socket is run through the filter and queued only if sk_filter() returns 0.

If I understand this code correctly, a return value of 0xffffffff should result in the packet being dropped. However, when my simple eBPF program is attached to an AF_PACKET raw socket bound to the lo interface, it does not drop ICMP packets sent across the loopback interface. Is this related to eBPF itself, or to how ICMP behaves on the loopback interface?

UPDATE: As pchaigno has pointed out, socket filter programs operate on copies of packets. In my case, the eBPF application creates a tap socket (an AF_PACKET raw socket), which is handed ingress packets before they are delivered up to the protocol layer. I investigated the kernel code and found that a received packet eventually lands in __netif_receive_skb_core(), which does the following:

list_for_each_entry_rcu(ptype, &skb->dev->ptype_all, list) {
        if (pt_prev)
            ret = deliver_skb(skb, pt_prev, orig_dev);
        pt_prev = ptype;
}

This passes the packet to the AF_PACKET handler, which eventually runs the eBPF filter.

To confirm that the eBPF filter actually filters on raw AF_PACKET sockets, one can dump the socket statistics as follows:

struct tpacket_stats stats;
...
len = sizeof(stats);
err = getsockopt(sock, SOL_PACKET, PACKET_STATISTICS, &stats, &len);

These statistics indicate how the filter behaves.
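For completeness, here is a minimal sketch of that statistics query wrapped in a helper. The function name read_packet_stats is hypothetical, and sock is assumed to be an AF_PACKET socket that was already created and bound elsewhere. Note that, per man 7 packet, reading PACKET_STATISTICS resets the counters.

```c
#include <assert.h>
#include <linux/if_packet.h>   /* struct tpacket_stats, PACKET_STATISTICS */
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fetch the receive counters of an AF_PACKET socket.
 * Returns 0 on success and -1 on failure (errno is set by getsockopt).
 */
static int read_packet_stats(int sock, struct tpacket_stats *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    if (getsockopt(sock, SOL_PACKET, PACKET_STATISTICS, out, &len) < 0)
        return -1;
    return 0;
}

/* Print both counters; the kernel resets them after each read. */
static int print_packet_stats(int sock)
{
    struct tpacket_stats stats;

    if (read_packet_stats(sock, &stats) < 0) {
        perror("getsockopt(PACKET_STATISTICS)");
        return -1;
    }
    printf("tp_packets=%u tp_drops=%u\n", stats.tp_packets, stats.tp_drops);
    return 0;
}
```

My reading, which is an assumption rather than something verified above, is that tp_packets only counts packets that got past the filter, so a filter that rejects everything should leave it at 0 while traffic is flowing.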

Upvotes: 2

Views: 1929

Answers (3)

A.C.

Reputation: 31

Socket filter BPF programs receive a copy of the packet, so this filter drops or truncates only that copy, not the original packet. The original packet passes through the kernel unaffected by the filter.

The return value of a socket filter eBPF program actually affects only the BPF filters chained or inserted after it.

If you want eBPF functionality and need to drop packets, you should choose a different kernel hook point, e.g. Netfilter hooks or Traffic Control.

Traffic Control works on ingress and egress, attaches filters to individual network interfaces, and supports BPF in the form of Traffic Control BPF (tc-eBPF). Netfilter hooks operate at the IP level, support iptables, nftables and NFQUEUE, and are easy to use from a kernel module. Their BPF support is weaker than that of Traffic Control, which can easily modify or drop a packet at a very early stage of ingress or egress processing. The iptables BPF extension (xt-bpf) can drop packets and store packet information in maps, but its ability to modify packets is limited.

For simply dropping ICMP packets, though, a single iptables rule should suffice.

Upvotes: 3

Qeole

Reputation: 9114

The answer from pchaigno is correct, and the kernel code is always the ultimate source of truth, but in this case I'd also point to the manual page for sockets, man 7 socket. This is where the option for attaching BPF filters to sockets is described, and it says:

SO_ATTACH_FILTER (since Linux 2.2), SO_ATTACH_BPF (since Linux
       3.19)
              Attach a classic BPF (SO_ATTACH_FILTER) or an extended BPF
              (SO_ATTACH_BPF) program to the socket for use as a filter
              of incoming packets.  A packet will be dropped if the
              filter program returns zero.  If the filter program
              returns a nonzero value which is less than the packet's
              data length, the packet will be truncated to the length
              returned.  If the value returned by the filter is greater
              than or equal to the packet's data length, the packet is
              allowed to proceed unmodified.

So:

  • 0 drops the packet (“truncates to length 0”).
  • A nonzero value smaller than the packet's data length truncates the packet to that length.
  • A value greater than or equal to the packet's data length passes the packet unmodified.

This is valid for both cBPF and eBPF.

Upvotes: 1

pchaigno

Reputation: 13063

It's the other way around: it will drop the packet if 0 is returned. From the code:

*   sk_filter_trim_cap - run a packet through a socket filter
*   [...]
*
* Run the eBPF program and then cut skb->data to correct size returned by
* the program. If pkt_len is 0 we toss packet. If skb->len is smaller
* than pkt_len we keep whole skb->data. [...]
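To make the effect of that comment concrete, here is a small user-space model of the decision taken in sk_filter_trim_cap(), based only on the snippet quoted in the question. The function model_sk_filter_len is hypothetical and not kernel code; it returns the length the skb would end up with, or -EPERM when the packet is tossed.

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of: err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
 * skb_len  - current packet length
 * prog_ret - value returned by the filter, as an unsigned 32-bit number
 * cap      - minimum length to keep (sk_filter() passes 1)
 */
static long model_sk_filter_len(unsigned int skb_len, unsigned int prog_ret,
                                unsigned int cap)
{
    unsigned int keep;

    if (prog_ret == 0)
        return -EPERM;                       /* packet is dropped */

    keep = prog_ret > cap ? prog_ret : cap;  /* max(cap, pkt_len) */
    /* pskb_trim() only shortens; a larger value leaves the skb as it is. */
    return keep < skb_len ? keep : skb_len;
}

/* Sanity checks for an 84-byte packet and cap = 1, as used by sk_filter(). */
static void check_model(void)
{
    assert(model_sk_filter_len(84, 0xffffffffu, 1) == 84); /* -1: keep all */
    assert(model_sk_filter_len(84, 20, 1) == 20);          /* truncate     */
    assert(model_sk_filter_len(84, 0, 1) == -EPERM);       /* drop         */
}
```

So the program from the question, which returns -1 (0xffffffff as an unsigned value), keeps the whole packet; returning 0 is what drops it.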

Upvotes: 1
