Friday, May 8, 2015

NORMAL mode in OVS - MAC learning

Open vSwitch (OVS) can run in two modes:
1. NORMAL mode
2. Flow mode

In NORMAL mode, OVS works like any other Layer 2 switch, forwarding on the basis of a MAC-to-port mapping.
To use an OVS bridge in NORMAL mode, simply create the bridge and add the flow entry below:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-ofctl add-flow br0 action=NORMAL
For flows hitting this entry, the MAC table maintained by OVS is consulted for the forwarding decision.

In Flow mode, the regular OpenFlow pipeline is hit and OVS forwards packets on the basis of the flows that have been installed.
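As a quick illustration of Flow mode (these entries are my own example, not part of the original setup; br0 and port numbers 1 and 2 are assumptions), traffic can be forwarded by explicit rules instead of MAC learning:
ovs-ofctl add-flow br0 priority=100,in_port=1,actions=output:2
ovs-ofctl add-flow br0 priority=100,in_port=2,actions=output:1
ovs-ofctl dump-flows br0
The first two entries cross-connect ports 1 and 2, and dump-flows confirms what has been installed.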

In this experiment, I will explain how NORMAL mode works standalone and also how it works in conjunction with Flow mode.
I have used only Mininet and OVS for this experiment, on Ubuntu 14.04 LTS.

Step 1: Create a simple topology with one switch and three hosts.

$ sudo mn --topo=single,3 --controller=none --mac
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2 h3
*** Adding switches:
s1
*** Adding links:
(h1, s1) (h2, s1) (h3, s1)
*** Configuring hosts
h1 h2 h3
*** Starting controller
*** Starting 1 switches
s1
*** Starting CLI:

Step 2: Execute the ovs-appctl command below. It displays the MAC table maintained by OVS. Since we have not done any ping yet, the table does not have the MAC entries for h1, h2 and h3.
mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41    8
mininet>
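As an aside (not part of the original steps), if you want to repeat the experiment from a clean state, the learned entries can be flushed:
mininet> sh ovs-appctl fdb/flush s1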

Step 3: We do not have any flows installed on OVS, so if you try to ping between the hosts, the pings will fail.
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
mininet>
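One optional way to confirm that packets reach the switch but are not forwarded (my own suggestion, not shown in the original run) is to compare the port counters before and after a ping attempt:
mininet> sh ovs-ofctl dump-ports s1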

Step 4: Now, let's add a NORMAL action flow entry on OVS and see what happens.
mininet> sh ovs-ofctl add-flow s1 action=normal
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=12.023s, table=0, n_packets=0, n_bytes=0, idle_age=12, actions=NORMAL
mininet>

The MAC table still has no host entries, as in Step 2.
mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41   56
mininet>

Step 5: Let's ping from h1 to h2.
mininet> h1 ping h2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.587 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.085 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.085/0.336/0.587/0.251 ms

The ping is successful because of the NORMAL flow entry. The MAC addresses of hosts h1 and h2 have been learnt. h3 is still not learnt, as there has been no traffic to or from h3.

mininet> sh ovs-appctl fdb/show s1
 port  VLAN  MAC                Age
LOCAL     0  b2:c9:fc:3c:1a:41   63
    2     0  00:00:00:00:00:02    1
    1     0  00:00:00:00:00:01    1
mininet>
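A short note on the Age column (the value below is illustrative, not from the original post): learned entries expire after the MAC aging time, which can be tuned per bridge if you want them to persist longer during experiments:
mininet> sh ovs-vsctl set bridge s1 other_config:mac-aging-time=300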

The above steps show OVS operation in NORMAL mode.
From Step 6 onwards, I will explain how NORMAL and Flow mode work in conjunction.

Step 6: Now, let's add a flow matching the MAC address of h3, forwarding to h3's port, with a higher priority.
mininet> sh ovs-ofctl add-flow s1 priority=60000,dl_dst=00:00:00:00:00:03,actions=output:3
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=4.242s, table=0, n_packets=0, n_bytes=0, idle_age=4, priority=60000,dl_dst=00:00:00:00:00:03 actions=output:3
 cookie=0x0, duration=71.223s, table=0, n_packets=8, n_bytes=560, idle_age=43, actions=NORMAL
mininet>

Step 7: Let's ping and observe the n_packets/n_bytes counters of the flow entries.

mininet> h1 ping h3
PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
64 bytes from 10.0.0.3: icmp_seq=1 ttl=64 time=0.558 ms
64 bytes from 10.0.0.3: icmp_seq=2 ttl=64 time=0.090 ms
^C
--- 10.0.0.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.090/0.324/0.558/0.234 ms
mininet> sh ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=13.124s, table=0, n_packets=2, n_bytes=196, idle_age=2, priority=60000,dl_dst=00:00:00:00:00:03 actions=output:3  --> Flow 1
 cookie=0x0, duration=80.105s, table=0, n_packets=12, n_bytes=840, idle_age=2, actions=NORMAL  --> Flow 2
mininet>

Notice that the packet/byte counters increase for both entries. This is because the ping request from h1 to h3 matches the higher-priority Flow 1, while the ping reply from h3 to h1 does not match Flow 1 and therefore hits Flow 2, the NORMAL entry.
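If you also want the reply traffic to bypass the NORMAL entry (an extension of the experiment, not part of the original post), a matching entry for h1's MAC can be added as well:
mininet> sh ovs-ofctl add-flow s1 priority=60000,dl_dst=00:00:00:00:00:01,actions=output:1
After another ping between h1 and h3, only the two high-priority entries should see their counters increase.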

Journey of a packet when a VM accesses the internet in OpenStack

I am putting down my understanding in this post to explain the journey of a packet from a VM to an external network. As you read on, I will explain the path in detail at the interface and bridge level with the help of some slides.

Step 1: VM to br-int - Packet filtering by Security Group


Each VM that is created is attached to a TAP interface (vnetX). This TAP interface is connected to a Linux bridge qbrXXXX, and a veth pair qvbXXXX - qvoXXXX then connects the Linux bridge to br-int.
Security groups are implemented on the TAP devices using iptables rules. If an instance has multiple ports, the same security groups are applied to all ports of the instance.
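To see this wiring and the security-group rules on a compute node, commands along these lines can be used (the exact bridge, interface and chain names depend on your deployment):
# brctl show
# iptables -S | grep -i neutron
The first command lists the qbrXXXX bridges with their tap and qvb members; the second shows the iptables chains that Neutron installs for the security groups.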

Step 2: br-int to br-tun (Inside the Compute node)


br-int and br-tun are connected via patch ports. Packets destined for the external network (VLAN tagged) reach br-tun via the patch ports. On br-tun, the VLAN tag is stripped and a tunnel id is added, so that the packet can be sent over the tunnel between the compute node and the network node.
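On a running compute node, the patch ports and the VLAN-to-tunnel-id translation can be inspected with commands like these (illustrative, not taken from the original post):
# ovs-vsctl show
# ovs-ofctl dump-flows br-tun | grep set_tunnel
ovs-vsctl show lists the patch ports between br-int and br-tun, and the set_tunnel actions in br-tun's flow table show the VLAN tag being replaced by a tunnel id.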

Step 3: Packet travels to Network Node through GRE Tunnel
At this point, the packet travels over a GRE tunnel and reaches the physical interface (eth1) of the network node.
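If you want to see the encapsulated traffic on the wire, the GRE packets (IP protocol 47) can be captured on the tunnel interface (eth1 is taken from this post; adjust it to your setup):
# tcpdump -n -i eth1 ip proto 47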


Step 4: Packet reaches br-int from br-tun

eth1 of the network node belongs to br-tun, so the packet is received by br-tun. br-tun removes the GRE header and sends the packet to br-int via patch ports, from where it is delivered to the router's qrXXXX interface on br-int. This is done using the GRE-to-VLAN mapping maintained as flow rules on br-tun.
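The reverse mapping can be checked in the same way on the network node (again, an illustrative command): the mod_vlan_vid actions in br-tun's flow table show the tunnel id being translated back into a local VLAN before the packet is handed to br-int.
# ovs-ofctl dump-flows br-tun | grep mod_vlan_vid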

Step 5: Firewall rules on network node

The packet exits br-int via the qrXXXX interface, which exists in the qrouter namespace belonging to the tenant. Both the qrXXXX and qgXXXX interfaces exist in the qrouter namespace. You can check the interface, route and iptables details using the commands below (replace qrouter-<router-uuid> with the actual namespace name of the tenant router).
# ip netns exec qrouter-<router-uuid> ifconfig -a
# ip netns exec qrouter-<router-uuid> route -n
# ip netns exec qrouter-<router-uuid> iptables -L
# ip netns exec qrouter-<router-uuid> iptables -L -t nat
qrXXXX is the interface that serves as the internal gateway for the tenant.
qgXXXX is the interface towards the external network on br-ex.
The tenant's firewall rules are then evaluated, which determine whether the packet going to the external network should be dropped or allowed.
NAT is also performed at this point, so the packet leaving the network node has the source IP of the router's external gateway.
Once allowed, the packet reaches the qgXXXX interface on br-ex and is sent to the external network or the internet.
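To see the NAT in action on the network node, the router's namespace and its NAT table can be inspected (qrouter-<router-uuid> is again a placeholder for the actual namespace name):
# ip netns list
# ip netns exec qrouter-<router-uuid> iptables -t nat -S | grep SNAT
The first command lists the namespaces so you can find the tenant's qrouter namespace; the second shows the source-NAT rule that rewrites the VM's address to the router's external gateway IP.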


The response takes the same path in reverse direction.