L2TPv3 Ethernet tunnel between Juniper vMX and Snabb Switch

In today’s post I explore L2TPv3 interoperability between snabbnfv and Juniper’s virtual router vMX. Ethernet traffic from a Linux VM client is encapsulated by snabb into L2TPv3, then sent via IPv6 over a loopback cable to the vMX, which extracts the Ethernet payload from the tunnel and switches it to an Ethernet port. The same applies in the reverse direction.

Contrary to my previous blog post about vMX and snabb, I connect the vMX to one end of the 10GE loopback cable via a Linux bridge (brge0), i.e. without snabb on the vMX side. The goal is to show interoperability between snabb and vMX for L2TPv3-based Ethernet tunnels.

Snabbnfv L2TPv3

L2TPv3 encapsulation and decapsulation in snabb is set up via the configuration file. Its syntax is documented at snabbnfv. Please have a look at one of my previous blog posts about installing snabb and finding the right PCI address for the 10G port.

# cat snabb-l2tpv3.cfg
return {
  { vlan = nil,
    mac_address = "52:54:00:00:00:01",
    port_id = "snabb0",
    tunnel = { type = "L2TPv3",
               remote_ip = "2003:1228:8:3c80::2",
               local_ip  = "2003:1228:28:3c80::2",
               session = 4061,
               local_cookie = "00000000",
               remote_cookie = "00000000",
               next_hop = "2003:1228:28:3c80::a" },
  },
}

Launch snabb in a separate window, referencing the config file above:

# cat launch-snabb.sh
#!/bin/bash
mkdir -p ./vhost-sockets
while true; do
  ~/snabbswitch/src/snabb snabbnfv traffic -k 10 -D 60 0000:04:00.1 ./snabb-l2tpv3.cfg ./vhost-sockets/%s.socket
  echo "CRASH"
done

# ./launch-snabb.sh
snabbnfv traffic starting
Loading ./snabb-l2tpv3.cfg
engine: start app snabb0_Virtio
engine: start app snabb0_ND
engine: start app snabb0_Tunnel
engine: start app snabb0_NIC
Sending neighbor solicitation for next-hop 2003:1228:28:3c80::a
Sending neighbor solicitation for next-hop 2003:1228:28:3c80::a
Resolved next-hop 2003:1228:28:3c80::a to 52:54:00:11:13:aa

(The "Resolved next-hop" message only appears once the vMX is also up and running.)

Now it’s time to launch a small Linux VM on top of snabb0.socket. For simplicity, I picked a Cirros cloud image. During startup it tries DHCP and cloud-init and fails, because there is neither a DHCP server nor a cloud-init web server, but it eventually times out and lets me log in and statically configure the eth0 interface. The following shell script downloads the image and launches it with qemu:

# cat launch-cirros.sh
#!/bin/bash

IMAGE=http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img

if [ ! -f cirros.img ]; then
  wget $IMAGE
  mv cirros*disk.img cirros.img
fi

/usr/local/bin/qemu-system-x86_64 \
  -drive if=ide,file=./cirros.img \
  --enable-kvm  \
  -m 512 -numa node,memdev=mem \
  -smp 1,sockets=1,cores=1,threads=1 \
  -object memory-backend-file,id=mem,size=512M,mem-path=/mnt/huge,share=on \
  -chardev socket,id=char0,path=./vhost-sockets/snabb0.socket,server \
  -netdev type=vhost-user,id=net0,chardev=char0,vhostforce=on -device virtio-net-pci,netdev=net0,mac=52:54:00:00:00:01 \
  -nographic

Run the script in a separate terminal window; you can then interact with the VM directly in that window:

# ./launch-cirros.sh
QEMU waiting for connection on: unix:./vhost-sockets/snabb0.socket,server
qemu-system-x86_64: -netdev type=vhost-user,id=net0,chardev=char0,vhostforce=on: chardev "char0" went up

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.2.0-80-virtual (buildd@batsu) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #116-Ubuntu SMP Mon Mar 23 17:28:52 UTC 2015 (Ubuntu 3.2.0-80.116-virtual 3.2.68)
[    0.000000] Command line: LABEL=cirros-rootfs ro console=tty1 console=ttyS0
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Centaur CentaurHauls
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000001ffe0000 (usable)
... (removed a lot of output) ...
Aug  4 14:09:53 cirros authpriv.info dropbear[295]: Running in background
############ debug end   ##############
  ____               ____  ____
 / __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/  /_/   \____/___/
   http://cirros-cloud.net


login as 'cirros' user. default password: 'cubswin:)'. use 'sudo' for root.
cirros login:

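Log in and configure IP address 1.1.1.1 on eth0. (In the output below the /24 is apparently not applied and the classful mask 255.0.0.0 is used instead, which still covers 1.1.1.2.)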

cirros login: cirros
Password:
$ sudo ifconfig eth0 1.1.1.1/24
$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 52:54:00:00:00:01
          inet addr:1.1.1.1  Bcast:1.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::5054:ff:fe00:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:1132 (1.1 KiB)

$

Now it’s time to get the vMX going. Without going into details (see e.g. my blog post on vMX), I use the following shell script, which takes care of the various virtual bridge and tap interfaces:

# cat launch-vmx.sh
#!/bin/bash

IMAGES=/backup/vmx-14.1R5.4-1/images/
IF=p2p1
# needed to reclaim the interface from snabb
echo -n  "0000:04:00.0" > /sys/bus/pci/drivers/ixgbe/bind

if [ "X" == "X`ifconfig tap-vcp`" ]; then
  ip tuntap add dev tap-vcp mode tap
  ip link set tap-vcp up promisc on
fi

if [ "X" == "X`ifconfig tap-vfp`" ]; then
  ip tuntap add dev tap-vfp mode tap
  ip link set tap-vfp up promisc on
fi

if [ "X" == "X`ifconfig tap-vcpi`" ]; then
  ip tuntap add dev tap-vcpi mode tap
  ip link set tap-vcpi up promisc on
fi

if [ "X" == "X`ifconfig tap-vfpi`" ]; then
  ip tuntap add dev tap-vfpi mode tap
  ip link set tap-vfpi up promisc on
fi

if [ "X" == "X`ifconfig virbr0`" ]; then
  brctl addbr virbr0
  ip link set virbr0 up
  ifconfig virbr0 192.168.122.1/24
fi

if [ "X" == "X`brctl show virbr0|grep tap-vcp`" ]; then
  brctl addif virbr0 tap-vcp
fi

if [ "X" == "X`brctl show virbr0|grep tap-vfp`" ]; then
  brctl addif virbr0 tap-vfp
fi

if [ "X" == "X`ifconfig brint0`" ]; then
  brctl addbr brint0
  ip link set brint0 up
  ifconfig brint0 172.16.0.10/24
fi

if [ "X" == "X`brctl show brint0|grep tap-vcpi`" ]; then
  brctl addif brint0 tap-vcpi
fi

if [ "X" == "X`brctl show brint0|grep tap-vfpi`" ]; then
  brctl addif brint0 tap-vfpi
fi

if [ "X" == "X`ifconfig tap-ge0`" ]; then
  ip tuntap add dev tap-ge0 mode tap
  ip link set tap-ge0 up promisc on
fi

if [ "X" == "X`ifconfig tap-ge1`" ]; then
  ip tuntap add dev tap-ge1 mode tap
  ip link set tap-ge1 up promisc on
fi

if [ "X" == "X`ifconfig brge0`" ]; then
  brctl addbr brge0
  ip link set brge0 up
fi

if [ "X" == "X`ifconfig brge1`" ]; then
  brctl addbr brge1
  ip link set brge1 up
  ifconfig brge1 up
  ifconfig brge1 1.1.1.2/24
fi

if [ "X" == "X`brctl show brge0|grep tap-ge0`" ]; then
  brctl addif brge0 tap-ge0
fi

if [ "X" == "X`brctl show brge0|grep $IF`" ]; then
  ifconfig $IF up
  brctl addif brge0 $IF
fi

if [ "X" == "X`brctl show brge1|grep tap-ge1`" ]; then
  brctl addif brge1 tap-ge1
fi

brctl show virbr0
brctl show brint0
brctl show brge0
brctl show brge1

if [ ! -f vmxhdd.img ]; then
  cp $IMAGES/vmxhdd.img .
fi

if [ ! -f vcp.img ]; then
  cp $IMAGES/jinstall64-vmx*.img vcp.img
fi

if [ ! -f vfp.img ]; then
  cp $IMAGES/vPFE-lite-*.img vfp.img
fi

/usr/local/bin/qemu-system-x86_64 \
  -drive if=ide,file=./vcp.img \
  -drive if=ide,file=./vmxhdd.img \
  -M pc -smp 1 --enable-kvm -cpu host \
  -m 2000 -numa node,memdev=mem \
  -object memory-backend-file,id=mem,size=2000M,mem-path=/mnt/huge,share=on \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
  -netdev tap,id=tc0,ifname=tap-vcp,script=no,downscript=no \
  -device virtio-net-pci,netdev=tc0,mac=52:54:00:45:11:01 \
  -netdev tap,id=tc1,ifname=tap-vcpi,script=no,downscript=no \
  -device virtio-net-pci,netdev=tc1,mac=52:54:00:45:12:01 \
  -chardev socket,id=charserial0,host=127.0.0.1,port=8896,telnet,server,nowait \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -vnc 127.0.0.1:1 -daemonize

/usr/local/bin/qemu-system-x86_64 \
  -drive if=ide,file=./vfp.img,format=raw \
  --enable-kvm  \
  -m 5000 -numa node,memdev=mem \
  -smp 4,sockets=1,cores=4,threads=1 \
  -object memory-backend-file,id=mem,size=5000M,mem-path=/mnt/huge,share=on \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -netdev tap,id=t0,ifname=tap-vfp,script=no,downscript=no \
  -device virtio-net-pci,netdev=t0,mac=52:54:00:11:11:02 \
  -netdev tap,id=t1,ifname=tap-vfpi,script=no,downscript=no \
  -device virtio-net-pci,netdev=t1,mac=52:54:00:11:12:02 \
  -netdev tap,id=t2,ifname=tap-ge0,script=no,downscript=no \
  -device virtio-net-pci,netdev=t2,mac=52:54:00:11:13:AA \
  -netdev tap,id=t3,ifname=tap-ge1,script=no,downscript=no \
  -device virtio-net-pci,netdev=t3,mac=52:54:00:11:13:BB \
  -vnc 127.0.0.1:2 \
  -chardev socket,id=charserial0,host=127.0.0.1,port=8897,telnet,server,nowait \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -daemonize

It assumes an unpacked vMX tar file in the path set in the variable $IMAGES and the 10G interface to be used in $IF (p2p1). Adjust both to match your setup.

The script will also assign IP address 1.1.1.2 to the virtual bridge brge1. This will be used later to ping the Cirros VM.
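The repeated "create it unless it exists" blocks in launch-vmx.sh could be folded into two small helpers. The following is only a sketch: the names run, ensure_tap, ensure_bridge and the DRY_RUN switch are assumptions made for this example and are not part of the original script. Existence is checked via /sys/class/net instead of parsing ifconfig output, and with DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/bash
# Sketch only: DRY_RUN and all helper names are hypothetical.
# DRY_RUN=1 prints the commands, DRY_RUN=0 executes them (needs root).
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Create a tap interface unless it already exists.
ensure_tap() {
  local dev=$1
  if [ ! -e "/sys/class/net/$dev" ]; then
    run ip tuntap add dev "$dev" mode tap
    run ip link set "$dev" up promisc on
  fi
}

# Create a bridge unless it already exists; optionally assign an address.
ensure_bridge() {
  local br=$1 addr=${2:-}
  if [ ! -e "/sys/class/net/$br" ]; then
    run brctl addbr "$br"
    run ip link set "$br" up
    if [ -n "$addr" ]; then
      run ifconfig "$br" "$addr"
    fi
  fi
}
```

With these in place, the interface section would shrink to calls such as `ensure_tap tap-ge0` and `ensure_bridge brge1 1.1.1.2/24`.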

If all is fine, the script prints nothing beyond the bridge listings from brctl show. You can reach the serial console of the VCP via ‘telnet localhost 8896’. Log in, set a root password and apply the following configuration:

root@vMX> show configuration |display set
set version 14.1R5.4
set system host-name vMX
set system root-authentication encrypted-password "$1$rJn2RXG1$kiez5.P.cPeT.Bev6AVAs1"
set system services ssh
set system syslog user * any emergency
set system syslog file messages any notice
set system syslog file messages authorization info
set system syslog file interactive-commands interactive-commands any
set interfaces ge-0/0/0 unit 0 family inet6 filter input decap
set interfaces ge-0/0/0 unit 0 family inet6 address 2003:1228:28:3c80::a/64
set interfaces ge-0/0/1 encapsulation ethernet-ccc
set interfaces ge-0/0/1 unit 0 family ccc filter input l2tpv3-ifl-1
set interfaces em0 unit 0 family inet address 192.168.122.2/24
set firewall family inet6 filter decap term 1 from destination-address 2003:1228:8:3c80::2/128
set firewall family inet6 filter decap term 1 then count traffic-hit-decap-1
set firewall family inet6 filter decap term 1 then decapsulate l2tp cookie 0x00000000
set firewall family inet6 filter decap term 1 then decapsulate l2tp output-interface ge-0/0/1.0
set firewall family inet6 filter decap term 2 then accept
set firewall family ccc filter l2tpv3-ifl-1 term 0 then count traffic-hit-ifl-1
set firewall family ccc filter l2tpv3-ifl-1 term 0 then encapsulate ifl-1
set firewall tunnel-end-point ifl-1 ipv6 source-address 2003:1228:8:3c80::2
set firewall tunnel-end-point ifl-1 ipv6 destination-address 2003:1228:28:3c80::2
set firewall tunnel-end-point ifl-1 l2tp cookie 0x00000000
set firewall tunnel-end-point ifl-1 l2tp session-id 4061
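The tunnel-end-point statements mirror the tunnel block in snabb-l2tpv3.cfg: snabb's remote_ip becomes the vMX source-address and snabb's local_ip becomes the destination-address. A small sketch that makes this mapping explicit by printing the four statements from the snabb values; the function name and its argument order are assumptions made for this example, while the statements themselves are the ones shown above.

```shell
#!/bin/bash
# Sketch: derive the vMX tunnel-end-point statements from the snabbnfv
# tunnel parameters. junos_tunnel_endpoint is a hypothetical helper.
junos_tunnel_endpoint() {
  local name=$1 snabb_local_ip=$2 snabb_remote_ip=$3 cookie=$4 session=$5
  # snabb's remote_ip is the vMX tunnel source, snabb's local_ip its destination.
  echo "set firewall tunnel-end-point $name ipv6 source-address $snabb_remote_ip"
  echo "set firewall tunnel-end-point $name ipv6 destination-address $snabb_local_ip"
  echo "set firewall tunnel-end-point $name l2tp cookie 0x$cookie"
  echo "set firewall tunnel-end-point $name l2tp session-id $session"
}

# Values taken from snabb-l2tpv3.cfg above.
junos_tunnel_endpoint ifl-1 2003:1228:28:3c80::2 2003:1228:8:3c80::2 00000000 4061
```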

Wait until the two ge Ethernet ports are up, then check whether the Cirros VM can be pinged via brge1:

# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=9.14 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=64 time=1.63 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=64 time=11.8 ms
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.639/7.551/11.872/4.327 ms

SUCCESS 😉 !!!

Capture the tunneled traffic via tcpdump on p2p1:

# tcpdump -n -i p2p1 -s 1500 -vv
tcpdump: WARNING: p2p1: no IPv4 address assigned
tcpdump: listening on p2p1, link-type EN10MB (Ethernet), capture size 1500 bytes
15:27:05.618053 IP6 (hlim 255, next-header unknown (115) payload length: 110) 2003:1228:8:3c80::2 > 2003:1228:28:3c80::2: ip-proto-115 110
15:27:05.618509 IP6 (hlim 64, next-header unknown (115) payload length: 110) 2003:1228:28:3c80::2 > 2003:1228:8:3c80::2: ip-proto-115 110
15:27:06.619242 IP6 (hlim 255, next-header unknown (115) payload length: 110) 2003:1228:8:3c80::2 > 2003:1228:28:3c80::2: ip-proto-115 110
15:27:06.619820 IP6 (hlim 64, next-header unknown (115) payload length: 110) 2003:1228:28:3c80::2 > 2003:1228:8:3c80::2: ip-proto-115 110
15:27:07.620026 IP6 (hlim 255, next-header unknown (115) payload length: 110) 2003:1228:8:3c80::2 > 2003:1228:28:3c80::2: ip-proto-115 110
15:27:07.621044 IP6 (hlim 64, next-header unknown (115) payload length: 110) 2003:1228:28:3c80::2 > 2003:1228:8:3c80::2: ip-proto-115 110
15:27:08.620137 IP6 (hlim 255, next-header unknown (115) payload length: 110) 2003:1228:8:3c80::2 > 2003:1228:28:3c80::2: ip-proto-115 110
15:27:08.620805 IP6 (hlim 64, next-header unknown (115) payload length: 110) 2003:1228:28:3c80::2 > 2003:1228:8:3c80::2: ip-proto-115 110
^C
8 packets captured
8 packets received by filter
0 packets dropped by kernel
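The payload length of 110 bytes in the capture fits an 84-byte ICMP echo packet inside an untagged Ethernet frame, preceded by a 12-byte L2TPv3 header. The split of those 12 bytes into a 4-byte session ID and an 8-byte cookie field is my assumption based on these numbers; the capture itself does not decode the header.

```shell
#!/bin/bash
# Sketch: reconstruct the IPv6 payload length seen in tcpdump.
ip_packet=84      # from the ping output: 56(84) bytes of data
eth_header=14     # untagged Ethernet header, no FCS
session_id=4      # assumption: 32-bit L2TPv3 session ID
cookie=8          # assumption: 64-bit cookie field

payload=$(( ip_packet + eth_header + session_id + cookie ))
echo "expected IPv6 payload length: $payload"
```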

Summary

This simple test has shown basic interoperability between snabbnfv L2TPv3 and vMX running FRS code 14.1R5.4. There is (at least) one caveat with respect to the cookie: snabb places the cookie directly after the L2TPv3 header, followed by 0x00000000, whereas the MX places 0x00000000 there, followed by the cookie. To get this demo setup going, I had to set the cookie on both systems to 0x00000000.
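To illustrate the difference, here is a small sketch that prints the session ID and the 8 bytes that follow it for both layouts. It assumes a 32-bit cookie value sitting in a 64-bit cookie field, as described above; the function name and the cookie value deadbeef are made up for the example and were not used in the test.

```shell
#!/bin/bash
# Sketch: hex view of session ID + 64-bit cookie field for both layouts.
# l2tpv3_header_hex and the cookie value below are hypothetical.
l2tpv3_header_hex() {
  local layout=$1 session=$2 cookie=$3
  local sid
  sid=$(printf '%08x' "$session")
  case "$layout" in
    snabb) printf '%s %s %s\n' "$sid" "$cookie" "00000000" ;;  # cookie first
    mx)    printf '%s %s %s\n' "$sid" "00000000" "$cookie" ;;  # zeros first
    *)     return 1 ;;
  esac
}

l2tpv3_header_hex snabb 4061 deadbeef   # 00000fdd deadbeef 00000000
l2tpv3_header_hex mx    4061 deadbeef   # 00000fdd 00000000 deadbeef
```

With an all-zero cookie both layouts produce the same bytes, which is why the demo works with 0x00000000 on both sides.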
