pilcrow


sockets and ARP (IP neighbor) table entries

On CentOS 6.4 (kernel 2.6.32), why does the second arping invocation below create a new ARP table entry when the first does not? The network behavior is identical and, to my eye, the syscalls are practically equivalent, which is what confuses me. What system behavior am I missing here?

#:- arping -s 10.0.2.15 -f 10.0.2.4
ARPING 10.0.2.4 from 10.0.2.15 eth0
Unicast reply from 10.0.2.4 [52:54:00:01:02:03]  0.681ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
#:- ip neigh show | grep -c '^10\.0\.2\.4 '
0                                                   # <--- no new ARP entry

#:- arping -f 10.0.2.4
ARPING 10.0.2.4 from 10.0.2.15 eth0
Unicast reply from 10.0.2.4 [52:54:00:01:02:03]  0.681ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
#:- ip neigh show | grep -c '^10\.0\.2\.4 '
1                                                   # <--- new ARP entry

10.0.2.15 is the address of eth0 in the above. If you try to reproduce this yourself, make sure your target is not in the ARP table at all: it must be entirely absent, not merely in a STALE state.
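For anyone scripting the reproduction, here is a minimal sketch of the "entirely absent" check. It assumes the usual Linux /proc/net/arp layout (one header line, then whitespace-separated rows with the IP address in the first column) and, as far as I know, that file lists entries whatever their state, so any matching row means the target is not absent. The names neighbor_present, read_arp_table and SAMPLE are made up for illustration.

```python
# Hypothetical helpers, not part of arping or iproute2.

SAMPLE = (
    "IP address       HW type     Flags       HW address            Mask     Device\n"
    "10.0.2.4         0x1         0x2         52:54:00:01:02:03     *        eth0\n"
)


def neighbor_present(ip, arp_table_text):
    """Return True if `ip` has any row in /proc/net/arp-style text."""
    for line in arp_table_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        # Exact match on the first column, so 10.0.2.4 does not match 10.0.2.40.
        if fields and fields[0] == ip:
            return True
    return False


def read_arp_table(path="/proc/net/arp"):
    """Read the kernel's IPv4 neighbor table as text (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

On a live system, neighbor_present("10.0.2.4", read_arp_table()) should be False before each arping run if the experiment is to mean anything.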

Now, if I strace each invocation (and ignore differences in memory locations) the relevant diff is this:

$:- diff -uN /tmp/arp-no.trace /tmp/arp-yes.trace
--- /tmp/arp-no.trace   2014-04-23 20:17:46.301575314 -0500
+++ /tmp/arp-yes.trace  2014-04-23 20:17:48.790575314 -0500
@@ -1,4 +1,4 @@
-execve("/sbin/arping", ["arping", "-s", "10.0.2.15", "-f", "10.0.2.4"], [/* 19 vars */]) = 0
+execve("/sbin/arping", ["arping", "-f", "10.0.2.4"], [/* 19 vars */]) = 0
 brk(0)                                  = 0xMEMLOCATION
 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xMEMLOCATION
 access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
@@ -37,7 +37,9 @@
 ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
 setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
-bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.0.2.15")}, 16) = 0
+setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
+connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.0.2.4")}, 16) = 0
+getsockname(4, {sa_family=AF_INET, sin_port=htons(38207), sin_addr=inet_addr("10.0.2.15")}, [16]) = 0
 close(4)                                = 0
 bind(3, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(0)={0, }, 40) = 0
 getsockname(3, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(6)={1, 080027538173}, [18]) = 0
 ... and then the sendto/recvfrom happen ...

In the first case, where I specify the source IP and no ARP entry is created, arping validates the source address with a short-lived IPPROTO_IP socket that is created, bound, and closed. In the second case, arping guesses the source address with a short-lived IPPROTO_IP socket that is created, connect()ed, getsockname()d, and closed.
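The two short-lived socket sequences can be sketched roughly as follows. This is an illustration based on the strace output above, not arping's actual source: the function names are invented, and the SO_BINDTODEVICE call is left out because it normally needs root. The point of interest is that connect() on a UDP socket sends no packet but does make the kernel resolve a route to the target, whereas bind(), as far as I can tell, only checks that the address is configured locally.

```python
import socket


def validate_source_ip(src_ip):
    """Mimic the '-s' path: bind() a throwaway UDP socket to src_ip.

    Returns True if the kernel accepts src_ip as a local address.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((src_ip, 0))  # port 0, as in the trace
        return True
    except OSError:
        return False
    finally:
        s.close()


def guess_source_ip(target_ip, port=1025, dontroute=True):
    """Mimic the no '-s' path: connect() then getsockname().

    No datagram is sent; connect() only records the peer and lets the
    kernel pick a source address for it.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if dontroute:
            # Same option as in the trace: only directly attached networks.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, 1)
        s.connect((target_ip, port))
        return s.getsockname()[0]
    finally:
        s.close()
```

With the addresses from the question, validate_source_ip("10.0.2.15") would correspond to the first trace and guess_source_ip("10.0.2.4") to the second.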

After that, the program behavior (and network activity) is identical. The system's reaction is not, however, and the only material difference I can see is each invocation's different but seemingly innocuous use of a socket that is already closed by the time the ARP request goes out.


Answers (1)

Thomas


It's curious, isn't it? :-)

I'm pretty sure it's the same kernel bug/feature I investigated in another Arping implementation (mine):

https://blog.habets.se/2012/10/Interesting-Arping-bug-report

No point cut-and-pasting the long explanation, but in short: if you look up a route and then send a raw ARP packet, the kernel will for some reason sniff the ARP reply and populate the ARP table.

I have no idea why, or if it's even intended behaviour. But it is the kernel that does it.

