Reputation: 58589
On CentOS 6.4 (kernel 2.6.32), why does the second arping invocation below create a new ARP table entry while the first does not? The network behavior is identical and, to my eye, the syscalls are practically equivalent, which is the source of my confusion. What system behavior am I missing here?
#:- arping -s 10.0.2.15 -f 10.0.2.4
ARPING 10.0.2.4 from 10.0.2.15 eth0
Unicast reply from 10.0.2.4 [52:54:00:01:02:03] 0.681ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
#:- ip neigh show | grep -c '^10\.0\.2\.4 '
0 # <--- no new ARP entry
#:- arping -f 10.0.2.4
ARPING 10.0.2.4 from 10.0.2.15 eth0
Unicast reply from 10.0.2.4 [52:54:00:01:02:03] 0.681ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
#:- ip neigh show | grep -c '^10\.0\.2\.4 '
1 # <--- new ARP entry
10.0.2.15 is the address of eth0 in the above. If you try to reproduce this yourself, make sure your target is not in the ARP table at all: an entry in the STALE state is not good enough, it must be entirely absent.
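For anyone reproducing this, the "entirely absent" precondition can be checked programmatically. The following is only a sketch: the helper name neigh_state is hypothetical, it assumes the usual `ip neigh show` layout where the address is the first field and the state is the last, and the sample text (including the MAC for 10.0.2.2) is made up for illustration.

```python
from typing import Optional


def neigh_state(ip_neigh_output: str, target: str) -> Optional[str]:
    """Return the neighbour state for target, or None if there is no entry.

    Assumes `ip neigh show` style lines, e.g.
    "10.0.2.4 dev eth0 lladdr 52:54:00:01:02:03 STALE".
    """
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # Compare the whole first field so 10.0.2.4 does not match 10.0.2.40.
        if fields and fields[0] == target:
            return fields[-1]
    return None


# Made-up sample output, for illustration only.
sample = (
    "10.0.2.2 dev eth0 lladdr 52:54:00:12:35:02 REACHABLE\n"
    "10.0.2.4 dev eth0 lladdr 52:54:00:01:02:03 STALE\n"
)

print(neigh_state(sample, "10.0.2.4"))   # a STALE entry still counts as present
print(neigh_state(sample, "10.0.2.40"))  # None means entirely absent
```

A result of None is the state the reproduction needs before each arping run. Any other value, STALE included, means the entry should be deleted first.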
Now, if I strace each invocation (and ignore differences in memory locations) the relevant diff is this:
$:- diff -uN /tmp/arp-no.trace /tmp/arp-yes.trace
--- /tmp/arp-no.trace 2014-04-23 20:17:46.301575314 -0500
+++ /tmp/arp-yes.trace 2014-04-23 20:17:48.790575314 -0500
@@ -1,4 +1,4 @@
-execve("/sbin/arping", ["arping", "-s", "10.0.2.15", "-f", "10.0.2.4"], [/* 19 vars */]) = 0
+execve("/sbin/arping", ["arping", "-f", "10.0.2.4"], [/* 19 vars */]) = 0
brk(0) = 0xMEMLOCATION
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xMEMLOCATION
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
@@ -37,7 +37,9 @@
ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
-bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.0.2.15")}, 16) = 0
+setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
+connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.0.2.4")}, 16) = 0
+getsockname(4, {sa_family=AF_INET, sin_port=htons(38207), sin_addr=inet_addr("10.0.2.15")}, [16]) = 0
close(4) = 0
bind(3, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(0)={0, }, 40) = 0
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(6)={1, 080027538173}, [18]) = 0
... and then the sendto/recvfrom happen ...
In the first case, where I specify the source IP and no ARP entry is created, arping validates the source address with a short-lived IPPROTO_IP socket that it creates, bind()s, and closes. In the second case, arping guesses the source address with a short-lived IPPROTO_IP socket that it creates, connect()s, calls getsockname() on, and closes.
After that, the program behavior (and network activity) is identical. However, the system reaction is not, and the only material difference I see is each invocation's different but seemingly innocuous use of a now-closed socket.
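The two short-lived socket sequences described above can be sketched roughly as follows. This is not arping's actual source, just a minimal Python approximation of the syscalls seen in the trace. The function names validate_source and guess_source are hypothetical, and the SO_BINDTODEVICE call from the trace is left out because it normally needs elevated privileges.

```python
import socket


def validate_source(addr: str) -> bool:
    """First case: bind() a throwaway UDP socket to addr.

    A successful bind() means addr is usable as a local source address.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_IP)
    try:
        s.bind((addr, 0))
        return True
    except OSError:
        return False
    finally:
        s.close()


def guess_source(target: str, dont_route: bool = False) -> str:
    """Second case: connect() a throwaway UDP socket, then getsockname().

    connect() on a UDP socket sends nothing on the wire, but it does make
    the kernel pick a route and a source address for the target.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_IP)
    try:
        if dont_route:
            # Mirrors the SO_DONTROUTE setsockopt() seen in the trace.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, 1)
        s.connect((target, 1025))  # port 1025 as in the trace
        return s.getsockname()[0]
    finally:
        s.close()


print(validate_source("127.0.0.1"))
print(guess_source("127.0.0.1"))
```

The loopback address is used here only so that the sketch runs anywhere. On the machine in the question the calls would be validate_source("10.0.2.15") and guess_source("10.0.2.4", dont_route=True).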
Upvotes: 2
Views: 1518
Reputation: 4255
It's curious, isn't it? :-)
I'm pretty sure it's the same kernel bug/feature I investigated in another Arping implementation (mine):
https://blog.habets.se/2012/10/Interesting-Arping-bug-report
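No point cut-and-pasting the long explanation, but in short: if you look up a route and then send a raw ARP packet, the kernel will for some reason sniff the ARP reply and populate the ARP table. In your second trace, the connect() is what triggers the route lookup; the bind() in the first trace does not.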
I have no idea why, or if it's even intended behaviour. But it is the kernel that does it.
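For reference, the "raw ARP packet" mentioned above is just a 28-byte ARP request. Here is a minimal sketch of building one with the standard RFC 826 field layout for Ethernet and IPv4. The helper name build_arp_request is hypothetical, and the code only constructs the bytes. Actually sending them would need an AF_PACKET socket bound to the interface (as in the question's trace) and root privileges.

```python
import socket
import struct


def build_arp_request(src_mac: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Build the 28-byte ARP request payload (no Ethernet header)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                         # hardware type: Ethernet
        0x0800,                    # protocol type: IPv4
        6,                         # hardware address length
        4,                         # protocol address length
        1,                         # opcode: request
        src_mac,                   # sender hardware address
        socket.inet_aton(src_ip),  # sender protocol address
        b"\x00" * 6,               # target hardware address (unknown)
        socket.inet_aton(dst_ip),  # target protocol address
    )


# MAC and addresses taken from the question's trace.
payload = build_arp_request(bytes.fromhex("080027538173"), "10.0.2.15", "10.0.2.4")
print(len(payload), payload.hex())
```

Both arping invocations in the question put the same kind of request on the wire, which is why the difference has to come from kernel state left behind by the earlier socket calls rather than from the packet itself.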
Upvotes: 2