Friday, May 8, 2015

NORMAL mode in OVS - MAC learning

Open vSwitch (OVS) can run in two modes:
1. NORMAL mode
2. Flow mode

In NORMAL mode, OVS works like any other Layer 2 switch, forwarding on the basis of a MAC-to-port mapping.
To use an OVS bridge in NORMAL mode, simply create the bridge and add the flow entry below:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-ofctl add-flow br0 action=NORMAL
For flows hitting this entry, the MAC table maintained by OVS is consulted for the forwarding decision.

In Flow mode, the regular OpenFlow pipeline is hit and OVS forwards packets on the basis of the flows that have been installed.
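As a quick illustration of Flow mode (these entries are my own example, not part of the original setup; br0 and port numbers 1 and 2 are assumptions), traffic can be forwarded by explicit rules instead of MAC learning:
ovs-ofctl add-flow br0 priority=100,in_port=1,actions=output:2
ovs-ofctl add-flow br0 priority=100,in_port=2,actions=output:1
ovs-ofctl dump-flows br0
The first two entries cross-connect ports 1 and 2, and dump-flows confirms what has been installed.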

In this experiment, I will explain how NORMAL mode works standalone and also how it works in conjunction with Flow mode.
I have used only Mininet and OVS for this experiment, on Ubuntu 14.04 LTS.

Step 1: Create a simple topology with one switch and three hosts.

$ sudo mn --topo=single,3 --controller=none --mac
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2 h3
*** Adding switches:
s1
*** Adding links:
(h1, s1) (h2, s1) (h3, s1)
*** Configuring hosts
h1 h2 h3
*** Starting controller
*** Starting 1 switches
s1
*** Starting CLI:

Step 2: Execute the ovs-appctl command below. It displays the MAC table maintained by OVS. Since we have not done any ping yet, the table does not have the MAC entries for h1, h2 and h3.
mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41    8
mininet>
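As an aside (not part of the original steps), if you want to repeat the experiment from a clean state, the learned entries can be flushed:
mininet> sh ovs-appctl fdb/flush s1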

Step 3: We do not have any flows installed on OVS, so if you try to ping between the hosts, the pings will fail.
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
mininet>
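One optional way to confirm that packets reach the switch but are not forwarded (my own suggestion, not shown in the original run) is to compare the port counters before and after a ping attempt:
mininet> sh ovs-ofctl dump-ports s1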

Step 4: Now, let's add a NORMAL action flow entry on OVS and see what happens.
mininet> sh ovs-ofctl add-flow s1 action=normal
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=12.023s, table=0, n_packets=0, n_bytes=0, idle_age=12, actions=NORMAL
mininet>

The MAC table still has no host entries, as in Step 2.
mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41   56
mininet>

Step 5: Let's ping from h1 to h2.
mininet> h1 ping h2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.587 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.085 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.085/0.336/0.587/0.251 ms

The ping is successful because of the NORMAL flow entry. The MAC addresses of hosts h1 and h2 have been learnt. h3 is still not learnt, as there has been no traffic to or from h3.

mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41   63
    2     0  00:00:00:00:00:02    1
    1     0  00:00:00:00:00:01    1
mininet>
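A short note on the Age column (the value below is illustrative, not from the original post): learned entries expire after the MAC aging time, which can be tuned per bridge if you want them to persist longer during experiments:
mininet> sh ovs-vsctl set bridge s1 other_config:mac-aging-time=300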

The above steps show OVS operation in NORMAL mode.
From Step 6 onwards, I will explain how NORMAL and Flow mode work in conjunction.

Step 6: Now, let's add a flow matching the MAC address of h3, forwarding to h3's port, with a higher priority.
mininet> sh ovs-ofctl add-flow s1 priority=60000,dl_dst=00:00:00:00:00:03,actions=output:3
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=4.242s, table=0, n_packets=0, n_bytes=0, idle_age=4, priority=60000,dl_dst=00:00:00:00:00:03 actions=output:3
 cookie=0x0, duration=71.223s, table=0, n_packets=8, n_bytes=560, idle_age=43, actions=NORMAL
mininet>

Step 7: Let's ping and observe the n_packets/n_bytes counters of the flow entries.

mininet> h1 ping h3
PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
64 bytes from 10.0.0.3: icmp_seq=1 ttl=64 time=0.558 ms
64 bytes from 10.0.0.3: icmp_seq=2 ttl=64 time=0.090 ms
^C
--- 10.0.0.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.090/0.324/0.558/0.234 ms
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=13.124s, table=0, n_packets=2, n_bytes=196, idle_age=2, priority=60000,dl_dst=00:00:00:00:00:03 actions=output:3  --> Flow 1
 cookie=0x0, duration=80.105s, table=0, n_packets=12, n_bytes=840, idle_age=2, actions=NORMAL  --> Flow 2
mininet>

Notice that the packet/byte counters increase for both entries. This is because the ping request from h1 to h3 matches the higher-priority Flow 1, while the ping reply from h3 to h1 does not match Flow 1 and therefore hits Flow 2, the NORMAL entry.
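If you also want the reply traffic to bypass the NORMAL entry (an extension of the experiment, not part of the original post), a matching entry for h1's MAC can be added as well:
mininet> sh ovs-ofctl add-flow s1 priority=60000,dl_dst=00:00:00:00:00:01,actions=output:1
After another ping between h1 and h3, only the two high-priority entries should see their counters increase.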

Journey of a packet when a VM accesses the internet in OpenStack

I am putting down my understanding in this post to explain the journey of a packet from a VM to an external network. As you read on, I will explain the path in detail at the interface and bridge level with the help of some slides.

Step 1: VM to br-int - Packet filtering by Security Group


Each VM that is created is attached to a TAP interface (vnetX). This TAP interface is connected to a Linux bridge qbrXXXX, and a veth pair qvbXXXX - qvoXXXX then connects the Linux bridge to br-int.
Security groups are implemented on the TAP devices using iptables rules. If an instance has multiple ports, the same security groups are applied to all ports of the instance.
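To see this wiring and the security-group rules on a compute node, commands along these lines can be used (the exact bridge, interface and chain names depend on your deployment):
# brctl show
# iptables -S | grep -i neutron
The first command lists the qbrXXXX bridges with their tap and qvb members; the second shows the iptables chains that Neutron installs for the security groups.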

Step 2: br-int to br-tun (Inside the Compute node)


br-int and br-tun are connected via patch ports. Packets destined for the external network (VLAN tagged) reach br-tun via the patch ports. On br-tun, the VLAN tag is stripped and a tunnel id is added, so that the packet can be sent over the tunnel between the compute node and the network node.
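On a running compute node, the patch ports and the VLAN-to-tunnel-id translation can be inspected with commands like these (illustrative, not taken from the original post):
# ovs-vsctl show
# ovs-ofctl dump-flows br-tun | grep set_tunnel
ovs-vsctl show lists the patch ports between br-int and br-tun, and the set_tunnel actions in br-tun's flow table show the VLAN tag being replaced by a tunnel id.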

Step 3: Packet travels to Network Node through GRE Tunnel
At this point, the packet travels over a GRE tunnel and reaches the physical interface (eth1) of the network node.
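If you want to see the encapsulated traffic on the wire, the GRE packets (IP protocol 47) can be captured on the tunnel interface (eth1 is taken from this post; adjust it to your setup):
# tcpdump -n -i eth1 ip proto 47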


Step 4: Packet reaches br-int from br-tun

eth1 of the network node belongs to br-tun, so the packet is received by br-tun. br-tun removes the GRE header and sends the packet to br-int via patch ports, from where it is delivered to the router's qrXXXX interface on br-int. This is done using the GRE-to-VLAN mapping maintained as flow rules on br-tun.
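The reverse mapping can be checked in the same way on the network node (again, an illustrative command): the mod_vlan_vid actions in br-tun's flow table show the tunnel id being translated back into a local VLAN before the packet is handed to br-int.
# ovs-ofctl dump-flows br-tun | grep mod_vlan_vid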

Step 5: Firewall rules on network node

The packet exits br-int via the qrXXXX interface, which exists in the qrouter namespace belonging to the tenant. Both the qrXXXX and qgXXXX interfaces exist in the qrouter namespace. You can check the interface, route and iptables details using the commands below (replace qrouter-<router-uuid> with the actual namespace name of the tenant router).
# ip netns exec qrouter-<router-uuid> ifconfig -a
# ip netns exec qrouter-<router-uuid> route -n
# ip netns exec qrouter-<router-uuid> iptables -L
# ip netns exec qrouter-<router-uuid> iptables -L -t nat
qrXXXX is the interface that serves as the internal gateway for the tenant.
qgXXXX is the interface towards the external network on br-ex.
The tenant's firewall rules are then evaluated, which determine whether the packet going to the external network should be dropped or allowed.
NAT is also performed at this point, so the packet leaving the network node has the source IP of the router's external gateway.
Once allowed, the packet reaches the qgXXXX interface on br-ex and is sent to the external network or the internet.
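To see the NAT in action on the network node, the router's namespace and its NAT table can be inspected (qrouter-<router-uuid> is again a placeholder for the actual namespace name):
# ip netns list
# ip netns exec qrouter-<router-uuid> iptables -t nat -S | grep SNAT
The first command lists the namespaces so you can find the tenant's qrouter namespace; the second shows the source-NAT rule that rewrites the VM's address to the router's external gateway IP.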


The response takes the same path in reverse direction.