Multiple network interfaces and ARP flux
Overview
This page discusses how to set up a HN with multiple network interfaces on the same physical network and on the same IP network, and then how to set up multiple VE's so that each one uses only one of these interfaces.
For example, you want some of your VE's to always use eth3, and some to use eth4. But none of the VE traffic should use eth0, which is reserved for use by the HN only. This makes sense if you have VE's that may generate or receive a lot of traffic and you don't want your remote administration of the server over eth0 to degrade or get blocked because of this.
To make this clear we'll use the following HN configuration. We'll also have another system to act as the client.
System | Interface | MAC Address | IP Address |
---|---|---|---|
HN | eth0 | 00:0c:29:b3:a2:54 | 192.168.18.10 |
HN | eth3 | 00:0c:29:b3:a2:68 | 192.168.18.11 |
HN | eth4 | 00:0c:29:b3:a2:5e | 192.168.18.12 |
client | eth0 | 00:0c:29:d2:c7:aa | 192.168.18.129 |
HN ARP Flux
The first issue is ARP flux. Any client on the network broadcasting an ARP "who has" message for any of these addresses will receive replies from all three interfaces. This results in IP addresses that float between three MAC addresses, depending on which response a client accepts first.
For example, the following is a tcpdump capture from executing
ping -c2 192.168.18.10
from another system on the network.
00:0c:29:d2:c7:aa > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.10 tell 192.168.18.129
00:0c:29:b3:a2:5e > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:5e
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:54
00:0c:29:b3:a2:68 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:68
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:5e, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32313, seq 1, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32313, seq 1, length 64
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:5e, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32313, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32313, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp who-has 192.168.18.129 tell 192.168.18.10
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.129 is-at 00:0c:29:d2:c7:aa
The ARP "who has" message generated replies from all three MAC addresses on the HN. In this case the client accepted the MAC address of eth4, so both ICMP echo requests are sent to eth4, but all the replies come from eth0. Normally this behavior isn't a problem, though it may generate false alarms in a network monitor, because it can look like a man-in-the-middle attack.
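One way to spot ARP flux in a capture like the one above is to count the distinct MAC addresses that answer for each IP address. The following is a minimal sketch, assuming tcpdump text output that includes link-level headers (the -e option); arp_reply_macs is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: count how many distinct MAC addresses answered for each IP
# address in tcpdump text output read from stdin.
# arp_reply_macs is a hypothetical helper name used for illustration.
arp_reply_macs() {
    awk '
    {
        # An ARP reply line contains "<ip> is-at <mac>".
        for (i = 2; i < NF; i++) {
            if ($i == "is-at") {
                ip = $(i - 1)
                mac = $(i + 1)
                sub(/,$/, "", mac)   # some tcpdump versions append ", length N"
                if (!((ip, mac) in seen)) {
                    seen[ip, mac] = 1
                    count[ip]++
                }
            }
        }
    }
    END {
        for (ip in count)
            printf "%s %d\n", ip, count[ip]
    }'
}

# Usage (the capture itself needs root):
#   tcpdump -e -n -c 20 arp | arp_reply_macs
```

Fed the capture above, this reports three MAC addresses for 192.168.18.10 and one for 192.168.18.129. Any count above one for a single address indicates ARP flux.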
The following output is from executing this command on the HN.
sysctl -a | grep 'net.ipv4.conf.*.arp'
net.ipv4.conf.venet0.arp_accept = 0
net.ipv4.conf.venet0.arp_ignore = 0
net.ipv4.conf.venet0.arp_announce = 0
net.ipv4.conf.venet0.arp_filter = 0
net.ipv4.conf.venet0.proxy_arp = 0
net.ipv4.conf.eth4.arp_accept = 0
net.ipv4.conf.eth4.arp_ignore = 0
net.ipv4.conf.eth4.arp_announce = 0
net.ipv4.conf.eth4.arp_filter = 0
net.ipv4.conf.eth4.proxy_arp = 0
net.ipv4.conf.eth3.arp_accept = 0
net.ipv4.conf.eth3.arp_ignore = 0
net.ipv4.conf.eth3.arp_announce = 0
net.ipv4.conf.eth3.arp_filter = 0
net.ipv4.conf.eth3.proxy_arp = 0
net.ipv4.conf.eth0.arp_accept = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.lo.arp_accept = 0
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.default.arp_accept = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.all.arp_accept = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.proxy_arp = 0
If all three network interfaces are on different IP networks (such as 10.x.x.x, 172.16.x.x, 192.168.x.x) then executing the following will work:
sysctl -w net.ipv4.conf.all.arp_filter=1
However, if they are all on the same IP network, which is the case here, then the following solution will work. This can be added to your /etc/sysctl.conf file once you've tested it.
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
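To make the two settings survive a reboot, they can be appended to a sysctl configuration file. The following is a minimal sketch; write_arp_sysctl is a hypothetical helper, and the target path is passed as an argument so that it can be tried on a scratch file first. The comments summarize the meaning of each value as described in the kernel's ip-sysctl documentation.

```shell
#!/bin/sh
# Sketch: append the two ARP settings to a sysctl configuration file.
# write_arp_sysctl is a hypothetical helper name used for illustration.
write_arp_sysctl() {
    conf_file=$1
    cat >> "$conf_file" <<'EOF'
# Reply to an ARP request only if the target IP address is configured
# on the interface that received the request.
net.ipv4.conf.all.arp_ignore = 1
# Always use the best local address for the target as the source of
# outgoing ARP requests.
net.ipv4.conf.all.arp_announce = 2
EOF
}

# Usage (as root on the HN, once the settings have been tested):
#   write_arp_sysctl /etc/sysctl.conf
#   sysctl -p /etc/sysctl.conf
```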
The following output is from executing this command on the HN.
sysctl -a | grep 'net.ipv4.conf.*.arp'
net.ipv4.conf.venet0.arp_accept = 0
net.ipv4.conf.venet0.arp_ignore = 0
net.ipv4.conf.venet0.arp_announce = 0
net.ipv4.conf.venet0.arp_filter = 0
net.ipv4.conf.venet0.proxy_arp = 0
net.ipv4.conf.eth4.arp_accept = 0
net.ipv4.conf.eth4.arp_ignore = 0
net.ipv4.conf.eth4.arp_announce = 0
net.ipv4.conf.eth4.arp_filter = 0
net.ipv4.conf.eth4.proxy_arp = 0
net.ipv4.conf.eth3.arp_accept = 0
net.ipv4.conf.eth3.arp_ignore = 0
net.ipv4.conf.eth3.arp_announce = 0
net.ipv4.conf.eth3.arp_filter = 0
net.ipv4.conf.eth3.proxy_arp = 0
net.ipv4.conf.eth0.arp_accept = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.lo.arp_accept = 0
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.default.arp_accept = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.all.arp_accept = 0
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.proxy_arp = 0
Now we repeat the ping command, after the client's ARP cache has been cleared.
00:0c:29:d2:c7:aa > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.10 tell 192.168.18.129
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:54
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32066, seq 1, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32066, seq 1, length 64
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32066, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32066, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp who-has 192.168.18.129 tell 192.168.18.10
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.129 is-at 00:0c:29:d2:c7:aa
The desired effect has been achieved. Only interface eth0 on the HN responds to the ARP message and the other interfaces are silent.
Adding some VE's
Now let's add some VE's to the HN as follows:
VEID | IP |
---|---|
101 | 192.168.18.101 |
102 | 192.168.18.102 |
From another system on the network you should be able to ping both. However, looking at the ARP traffic with tcpdump you'll see that the physical address associated with each VE is once again subject to ARP flux, drifting between all three MAC addresses over time.
00:0c:29:d2:c7:aa > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 tell 192.168.18.129
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:54
00:0c:29:b3:a2:68 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:68
00:0c:29:b3:a2:5e > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:5e
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.101: ICMP echo request, id 43311, seq 1, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.101 > 192.168.18.129: ICMP echo reply, id 43311, seq 1, length 64
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.101: ICMP echo request, id 43311, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.101 > 192.168.18.129: ICMP echo reply, id 43311, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp who-has 192.168.18.129 tell 192.168.18.10
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.129 is-at 00:0c:29:d2:c7:aa
The reasons for this can be found from executing the following command on the HN.
arp -an
? (192.168.18.129) at 00:0C:29:D2:C7:AA [ether] on eth0
? (192.168.18.102) at <from_interface> PERM PUB on eth3
? (192.168.18.102) at <from_interface> PERM PUB on eth4
? (192.168.18.102) at <from_interface> PERM PUB on eth0
? (192.168.18.101) at <from_interface> PERM PUB on eth3
? (192.168.18.101) at <from_interface> PERM PUB on eth4
? (192.168.18.101) at <from_interface> PERM PUB on eth0
Another view is obtained from the following command on the HN.
cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.18.102   0x1         0xc         00:00:00:00:00:00     *        eth3
192.168.18.102   0x1         0xc         00:00:00:00:00:00     *        eth4
192.168.18.102   0x1         0xc         00:00:00:00:00:00     *        eth0
192.168.18.101   0x1         0xc         00:00:00:00:00:00     *        eth3
192.168.18.101   0x1         0xc         00:00:00:00:00:00     *        eth4
192.168.18.101   0x1         0xc         00:00:00:00:00:00     *        eth0
What this shows is that each VE's IP address is associated with every interface on the HN. Therefore each interface will respond to any ARP "who has" query for a VE address.
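The same information can be summarized per IP address. The following is a minimal sketch that reads text in /proc/net/arp format and lists the devices holding a published entry for each address; proxy_arp_devices is a hypothetical helper name. It assumes the flags value 0xc seen above, which combines the permanent (0x4) and published (0x8) flags.

```shell
#!/bin/sh
# Sketch: list which devices hold a published (proxy) ARP entry for each
# IP address. Reads text in /proc/net/arp format from stdin.
# proxy_arp_devices is a hypothetical helper name used for illustration.
proxy_arp_devices() {
    awk '
    $3 == "0xc" {                 # 0xc = permanent (0x4) + published (0x8)
        if ($1 in devs) {
            devs[$1] = devs[$1] " " $6
        } else {
            devs[$1] = $6
            order[++n] = $1       # remember first-seen order of addresses
        }
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%s: %s\n", order[i], devs[order[i]]
    }'
}

# Usage on the HN:
#   proxy_arp_devices < /proc/net/arp
```

An address listed with more than one device will be answered for on each of those devices.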
These entries are created by the vzarp function in the vps_functions script, which is called by vps-net_add, vps-net_del and vps-stop. In our case this function executes the following commands:
/sbin/ip neigh add proxy 192.168.18.101 dev eth0
/sbin/ip neigh add proxy 192.168.18.101 dev eth4
/sbin/ip neigh add proxy 192.168.18.101 dev eth3
/sbin/ip neigh add proxy 192.168.18.102 dev eth0
/sbin/ip neigh add proxy 192.168.18.102 dev eth4
/sbin/ip neigh add proxy 192.168.18.102 dev eth3
In addition, the following ARP messages are sent when VEID 101 is started.
00:0c:29:b3:a2:54 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.10
00:0c:29:b3:a2:5e > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.12
00:0c:29:b3:a2:68 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.11
00:0c:29:b3:a2:54 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.101
00:0c:29:b3:a2:5e > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.101
00:0c:29:b3:a2:68 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.101 (ff:ff:ff:ff:ff:ff) tell 192.168.18.101
00:0c:29:b3:a2:5e > 00:0c:29:b3:a2:68, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:5e
00:0c:29:b3:a2:5e > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:5e
00:0c:29:b3:a2:68 > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:68
00:0c:29:b3:a2:68 > 00:0c:29:b3:a2:5e, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:68
00:0c:29:b3:a2:54 > 00:0c:29:b3:a2:5e, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:54
00:0c:29:b3:a2:54 > 00:0c:29:b3:a2:68, ARP, length 60: arp reply 192.168.18.101 is-at 00:0c:29:b3:a2:54
What we see here is the result of vzarpipdetect, another function in vps_functions called by vps-net_add. An ARP "who has" message is sent by each interface and answered by the other interfaces.
What we want is to only add the IP addresses of our VE's to specific devices, not to all devices. This will prevent the ARP flux problem for our VE's.
Unfortunately this involves editing the OpenVZ scripts. The only case we really care about is vps-net_add, as the others execute ip neigh del proxy.
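Until the script changes are written up, the intended end state can be illustrated by hand. The following is a minimal sketch, not the OpenVZ implementation: pin_proxy_arp is a hypothetical helper, and the mapping of VE 101 to eth3 and VE 102 to eth4 is an assumption chosen for illustration. It only prints the ip neigh del proxy commands that would remove a VE's proxy ARP entry from every device except the chosen one, so the output can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: print (not run) the commands that would leave a VE's proxy ARP
# entry on a single device only.
# pin_proxy_arp and the VE-to-device mapping are assumptions for
# illustration; they are not part of OpenVZ.
pin_proxy_arp() {
    ve_ip=$1        # IP address of the VE
    keep_dev=$2     # the one device that should keep the proxy entry
    shift 2         # remaining arguments: all devices holding an entry
    for dev in "$@"; do
        if [ "$dev" != "$keep_dev" ]; then
            echo "/sbin/ip neigh del proxy $ve_ip dev $dev"
        fi
    done
}

# Usage: review the output, then pipe it to sh as root on the HN.
#   pin_proxy_arp 192.168.18.101 eth3 eth0 eth3 eth4
#   pin_proxy_arp 192.168.18.102 eth4 eth0 eth3 eth4
```

Because vps-net_add creates the entries on all devices, a manual cleanup like this would presumably have to be repeated each time a VE is started, which is why a change to the scripts is the real fix.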
TODO: Discuss changes to scripts.