How to trace the tap interfaces and linux bridges on the hypervisor your OpenStack VM is on…

[Figure: under-the-hood-scenario-1-ovs-compute]

 

 

If you're familiar with OpenStack, you might have seen the above graphic at some point. Although it looks intimidating, it simply explains how the networking of your VMs is plumbed on the hypervisor in a traditional OpenStack setup. By traditional OpenStack, I mean the reference implementation with ML2/OVS and the iptables firewall driver. In other implementations, a few of the pieces in the graphic might be missing, as we will see later in this post.

 

Looking more closely at the above graphic, the NIC of the VM is connected to a tap device on the hypervisor. The tap device is connected to a linux bridge. The only reason the linux bridge is needed here is that iptables rules on this bridge implement the security group rules for the VM. This linux bridge is connected via a veth pair to the Open vSwitch integration bridge br-int. Veth devices are virtual ethernet interfaces that come in pairs and act like a tube: what goes in at one end comes out of the other. The integration bridge br-int on a compute node adds an internal VLAN tag to the packets coming from a VM. This VLAN tag is unique per neutron network on that compute node. When an encapsulation protocol such as VXLAN or GRE is used for the overlay, the integration bridge is connected to a tunnel bridge br-tun (depicted as br-eth in the graphic). The tunnel bridge has several flow entries that encapsulate the packets arriving from the VMs through br-int and add a unique tunnel ID that identifies the neutron network the VM is on. The packets then leave through the tunnel interface (eth1 in the graphic) towards other compute nodes or the controller nodes, depending on the traffic pattern (L2/L3 East-West, L3 North-South) and the neutron L3 routing implementation (Legacy vs DVR).

 

Sometimes SSH to the instances is broken, and an MTU mismatch is one of the most common reasons. Hence, it is worth checking that there is no MTU mismatch in the data path to the VM. We need to make sure that the MTU of the tap interface is the same as the MTU of the linux bridge (the MTU of the OVS bridges doesn't seem to really matter). With VXLAN encapsulation for the overlay, we ideally want the tap interface and the linux bridge to have an MTU of 1450 bytes, since 50 bytes are reserved for the overhead that tunneling adds (1450 + 50 = 1500 bytes, the standard Ethernet MTU). The question now is: how do we identify which tap interface and linux bridge on the compute node belong to the VM of interest?
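The MTU check described above can be sketched in shell. This is only a minimal sketch under stated assumptions: the function names `mtu_of`, `check_mtu_match` and `overlay_mtu` and the `SYSFS_NET` variable are hypothetical names introduced here, and the script assumes a Linux host where each interface exposes its MTU in `/sys/class/net/<device>/mtu`.

```shell
# Assumption: interfaces expose their MTU under /sys/class/net/<dev>/mtu.
# SYSFS_NET is a hypothetical override so the logic can be tried elsewhere.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the MTU of one interface as reported by sysfs.
mtu_of() {
    cat "${SYSFS_NET}/$1/mtu"
}

# Return 0 if all named interfaces share one MTU, 1 on a mismatch,
# 2 if an MTU could not be read.
check_mtu_match() {
    local first="" dev mtu
    for dev in "$@"; do
        mtu="$(mtu_of "$dev")" || return 2
        if [ -z "$first" ]; then
            first="$mtu"
        fi
        if [ "$mtu" != "$first" ]; then
            echo "MTU mismatch: $dev has $mtu, expected $first" >&2
            return 1
        fi
    done
    return 0
}

# MTU left for the VM after subtracting the encapsulation overhead,
# e.g. overlay_mtu 1500 50 for VXLAN on a 1500-byte underlay.
overlay_mtu() {
    echo $(( $1 - $2 ))
}
```

On a compute node, one might call `check_mtu_match tapXXXX qbrXXXX` with the device names of the VM of interest, and compare the result against `overlay_mtu 1500 50`.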

 

In the graphic shown earlier in the post, we see that the linux bridges are named qbrXXXX and the two ends of the veth pair qvbXXXX and qvoXXXX. Although it is not depicted in the picture, let us assume the tap device on the compute node is likewise named tapXXXX. How do we find the value of XXXX, so that we can look at the right interfaces for the VM of interest?

 

A very simple way to do this is to start with the instance ID of the VM. Using the instance ID, one can obtain the neutron port ID.
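With the unified `openstack` client, this lookup might be wrapped as in the sketch below. The function name `port_ids_for_instance` is hypothetical, and the sketch assumes the `openstack` client is installed and that credentials (for example an overcloud rc file) have already been sourced.

```shell
# Hypothetical helper: print the neutron port ID(s) attached to an instance.
# Assumes the openstack CLI is available and credentials are loaded.
port_ids_for_instance() {
    local instance_id="$1"
    # -f value -c ID prints only the ID column, one port per line.
    openstack port list --device-id "$instance_id" -f value -c ID
}
```

For the VM in this post, the call would be `port_ids_for_instance 24e56cb6-ffb1-4bfe-a164-5d27edf3c682`. The transcript that follows performs the same lookup with the older `neutron` client.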

(overcloud) [stack@c09-h04-r630 ansible]$ neutron port-list --device-id=24e56cb6-ffb1-4bfe-a164-5d27edf3c682
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+------+----------------------------------+-------------------+-----------------------------------------------------------------------------------+
| id                                   | name | tenant_id                        | mac_address       | fixed_ips                                                                         |
+--------------------------------------+------+----------------------------------+-------------------+-----------------------------------------------------------------------------------+
| f91776a7-9ce3-4a9b-877a-1be538f185e1 |      | 52cbb98f2e2f4e1fa63b48f2059bd92e | fa:16:3e:b3:f0:e0 | {"subnet_id": "cd0a1c9d-47ba-4de2-842f-3f04b5289d59", "ip_address": "10.2.13.11"} |
+--------------------------------------+------+----------------------------------+-------------------+-----------------------------------------------------------------------------------+
The id column of this output is the neutron port ID of the VM. The first 11 characters of the port ID (hyphen included) become the value of XXXX. In our case, XXXX is f91776a7-9c, which we can look for on the compute node hosting the VM:

[heat-admin@overcloud-compute-2 ~]$ sudo su
[root@overcloud-compute-2 heat-admin]# ip a | grep f91776a7-9c
20: tapf91776a7-9c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UNKNOWN qlen 1000

 

Here, we can see that tapf91776a7-9c is the tap device on the compute node for the VM with UUID 24e56cb6-ffb1-4bfe-a164-5d27edf3c682. Our grep did not find any linux bridges because I was using ML2/ODL as the neutron backend at the time of writing this post. ML2/ODL does not rely on iptables for state tracking, which eliminates the need for the linux bridge altogether. It is interesting to note that even in the reference ML2/OVS implementation, if you choose the openvswitch firewall driver instead of the default iptables firewall driver, there won't be any linux bridges on the compute nodes.
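The naming convention can be turned into a small helper that derives the expected device names from a port ID and reports which of them exist on the host. This is a sketch only: `plumbing_names`, `report_plumbing` and the `SYSFS_NET` variable are hypothetical names introduced here, and the check assumes that network devices show up as entries under `/sys/class/net`.

```shell
# Assumption: every network device appears as an entry in /sys/class/net.
# SYSFS_NET is a hypothetical override for trying the logic elsewhere.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the device names expected for a neutron port ID, one per line.
plumbing_names() {
    local suffix prefix
    # Keep the first 11 characters of the port ID, hyphen included.
    suffix="$(printf '%s' "$1" | cut -c1-11)"
    for prefix in tap qbr qvb qvo; do
        printf '%s%s\n' "$prefix" "$suffix"
    done
}

# Report which of the expected devices are present on this host.
report_plumbing() {
    local dev
    for dev in $(plumbing_names "$1"); do
        if [ -e "${SYSFS_NET}/${dev}" ]; then
            echo "${dev}: present"
        else
            echo "${dev}: absent"
        fi
    done
}
```

For the port in this post, `plumbing_names f91776a7-9ce3-4a9b-877a-1be538f185e1` prints tapf91776a7-9c, qbrf91776a7-9c, qvbf91776a7-9c and qvof91776a7-9c. On a host without the linux bridge, such as the ML2/ODL compute node above, `report_plumbing` would be expected to list the qbr, qvb and qvo devices as absent.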

 

For the curious souls out there, you can switch to the conntrack-based openvswitch firewall driver in ML2/OVS by setting firewall_driver = openvswitch in the [securitygroup] section of /etc/neutron/plugins/ml2/openvswitch_agent.ini on the compute node. For the performance geeks, you will typically see better performance with the openvswitch firewall driver, since the overhead that the linux bridge adds is removed. The firewall_driver is a per-compute-node setting, so you are free to run a hybrid environment with some compute nodes using the iptables firewall driver and others using the openvswitch firewall driver.
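In a hybrid environment it can be handy to check which driver a given compute node is configured with. The helper below is a rough sketch: the function name `firewall_driver_of` is hypothetical, and it assumes the option is set in the `[securitygroup]` section of the agent configuration file and is not overridden by another config file.

```shell
# Hypothetical helper: print the firewall_driver value found in the
# [securitygroup] section of the given agent configuration file.
firewall_driver_of() {
    awk -F '=' '
        # Track whether the current section is [securitygroup].
        /^\[/ { in_sg = ($0 ~ /^\[securitygroup\]/) }
        # Match uncommented firewall_driver lines inside that section.
        in_sg && $1 ~ /^[ \t]*firewall_driver[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
        }
    ' "$1"
}
```

On a compute node, this might be run as `firewall_driver_of /etc/neutron/plugins/ml2/openvswitch_agent.ini`.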
