Juniper Networks vMX on Snabb NFV

This is my first attempt at getting the Juniper Networks vMX virtual router running on top of Snabb Switch. It is probably one of the first few attempts to run non-Linux-based VMs on top of Snabb NFV, and more work is needed to get it fully functional. This blog post explains the steps required to get the first few successful ICMP pings between the vMX and a Linux host over a 10 GbE loopback cable.

Requirements

  • Host system

    A single Ubuntu 14.04.2-based Linux server with an Intel 82599ES dual-port 10 GbE card is sufficient for an initial connectivity test. A loopback cable connects both ports. Port p2p1 (PCI address 04:00.0) remains attached to the Linux kernel, while the other port, p2p2 (PCI address 04:00.1), will be consumed by the snabbnfv app to provide all required virtual (virtio) interfaces of the vMX.

  • Snabb and qemu:

    Snabb and qemu must be installed on the host server. Please consult my previous post Simple snabbnfv traffic test between 2 VMs over a 10G loopback cable on how to install them; a minimal build sketch also follows after this list.

  • Virtual Router vMX:

    vMX is available for download at https://www.juniper.net/support/downloads/ for authorized customers and partners. In a nutshell, vMX consists of two virtual machines (VMs): a control plane VM (vcp) and a forwarding VM (vfp). The vcp image runs Junos (FreeBSD based), while the vfp image runs vTrio, the programmable Junos Trio chipset microcode compiled for x86, to handle and forward packets. The vcp and vfp images each have a management (mgmt) ethernet port and must be interconnected via separate virtual ethernet ports. While the vMX supports up to 10 virtual GbE or 10 GbE ports, a single data port (ge-0/0/0) is used in this setup.
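
For reference, a minimal build sketch, assuming the setup from that post: the SnabbCo/snabbswitch repository cloned to ~/snabbswitch (matching the paths in the launch scripts below) and a qemu with vhost-user support built from source and installed to /usr/local/bin:

# git clone https://github.com/SnabbCo/snabbswitch.git ~/snabbswitch
# cd ~/snabbswitch
# make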

Network Diagram:

All required virtual interfaces of the vMX are connected to different VLANs on the snabbnfv-driven 10 GbE port p2p2. Via the loopback cable, the traffic reaches the Linux host on different logical interfaces over the other 10 GbE port, p2p1:

             +-----------------------------
             |
        +----+ p2p1 04:00.0 (tester port)
loopback|    |
  cable |    |  
        +----+ p2p2 04:00.1 (snabbnfv port)
             |
             +-----------------------------

The two VMs, vcp and vfp, have their management ports (em0 and eth0) connected via an untagged virtual network to p2p1 on the Linux host. The internal connection between the VMs sits in VLAN 1; the two ports can communicate with each other thanks to snabb leveraging local switching on the 10 GbE network card.

Optionally, the Linux host is also connected to the internal network via p2p1.1 to test connectivity and access to the VMs (which have factory-assigned, fixed IP addresses that can't be changed).
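
As a quick sanity check, once both VMs are up, the factory-assigned internal addresses shown in the diagram below should answer from the host:

# ping -c 3 172.16.0.1   # vcp internal port em1
# ping -c 3 172.16.0.2   # vfp internal port eth1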

Finally, the third virtual ethernet port of the vfp VM is connected via VLAN 2 to the Linux host at p2p1.2. This virtual ethernet port will be available as ge-0/0/0 on the vMX, once it's up and running.

                                                                 +-----------------
                                                             .10 | 
                         +---------------------------------------+ p2p1.1 internal
                         |                                       |
+---------------+        |        +---------------+ xe0          |   Linux Host
|      vcp      |int1    |    int2|      vfp      |.11        .1 | 
|           em1 +--------+--------+ eth1    0/0/0 +--------------+ p2p1.2 ge-0/0/0  
|      em0      |.1 172.16.0/24 .2|      eth0     |  10.1.1.0/24 |
+-------+-------+                 +-------+-------+              |  
     .11|          192.168.0/24        .12|                   .1 | 
        +---------------------------------+----------------------+ p2p1   mgmt
      mgmt1                           mgmt2                      |
                                                                 +-----------------

Preparing the host interface p2p1

Find the PCI addresses of available 10 GbE ports:

# lspci|grep 10
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

In case the port has been consumed by a prior run of snabb, bind it back to the host:

# echo -n  "0000:04:00.0" > /sys/bus/pci/drivers/ixgbe/bind
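
Conversely, the snabb-facing port p2p2 can be detached from the kernel driver by hand. Snabb normally unbinds the device itself on startup, so this is just a sketch for manual cleanup:

# echo -n "0000:04:00.1" > /sys/bus/pci/drivers/ixgbe/unbind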

Add VLAN support and configure the interface for untagged and tagged (VLAN 1 and 2) traffic:

# IF=p2p1
# apt-get install vlan
# modprobe 8021q
# vconfig add $IF 1
# vconfig add $IF 2
# ifconfig $IF up
# ifconfig $IF.1 up
# ifconfig $IF.2 up
# ip addr add 192.168.0.1/24 dev $IF
# ip addr add 172.16.0.10/24 dev $IF.1
# ip addr add 10.1.1.1/24 dev $IF.2

# ifconfig p2p1
p2p1      Link encap:Ethernet  HWaddr 0c:c4:7a:1f:7e:60
          inet addr:192.168.0.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::ec4:7aff:fe1f:7e60/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:221 errors:0 dropped:0 overruns:0 frame:0
          TX packets:101 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:60642 (60.6 KB)  TX bytes:6522 (6.5 KB)

# ifconfig p2p1.1
p2p1.1    Link encap:Ethernet  HWaddr 0c:c4:7a:1f:7e:60
          inet addr:172.16.0.10  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::ec4:7aff:fe1f:7e60/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:41 errors:0 dropped:0 overruns:0 frame:0
          TX packets:65 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2364 (2.3 KB)  TX bytes:3434 (3.4 KB)

# ifconfig p2p1.2
p2p1.2    Link encap:Ethernet  HWaddr 0c:c4:7a:1f:7e:60
          inet addr:10.1.1.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::ec4:7aff:fe1f:7e60/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:11 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:810 (810.0 B)  TX bytes:2440 (2.4 KB)

Create port config and Launch snabb NFV

Create vmx-ports.cfg file with the following content:

return {
  { vlan = 0,                          -- untagged: mgmt network
    mac_address = "52:54:00:00:00:01",
    port_id = "mgmt1",                 -- vcp em0
  },
  { vlan = 0,
    mac_address = "52:54:00:00:00:02",
    port_id = "mgmt2",                 -- vfp eth0
  },
  { vlan = 1,                          -- internal vcp<->vfp network
    mac_address = "52:54:00:00:01:01",
    port_id = "int1",                  -- vcp em1
  },
  { vlan = 1,
    mac_address = "52:54:00:00:01:02",
    port_id = "int2",                  -- vfp eth1
  },
  { vlan = 2,                          -- data port network
    mac_address = "52:54:00:00:02:01",
    port_id = "xe0",                   -- vfp, exposed as ge-0/0/0
  },
}

The ports “mgmt1” and “mgmt2” will be connected to em0 and eth0 on vcp and vfp, respectively, and provide management access to the VMs. (Note that while vfp will try to get a DHCP lease on this interface, this setup doesn't provide a DHCP server; access to these interfaces isn't really required for this test.)

The ports “int1” and “int2” are in VLAN 1 and will provide the inter-VM connectivity between vcp and vfp.

The port “xe0” is in VLAN 2 and will be connected to ge-0/0/0 on the vMX.

Launch snabb snabbnfv with a shell script as root. The script relaunches snabb automatically in case of an error; the %s in the socket path is substituted with each port_id from the config, yielding e.g. ./vhost-sockets/mgmt1.socket:

# cat launch-vmx-ports.sh
#!/bin/bash
mkdir ./vhost-sockets
# keep snabb running: relaunch it whenever it exits
while true; do
  ~/snabbswitch/src/snabb snabbnfv traffic -k 10 -D 60 0000:04:00.1 ./vmx-ports.cfg ./vhost-sockets/%s.socket
  sleep 10
done
# ./launch-vmx-ports.sh
mkdir: cannot create directory ‘./vhost-sockets’: File exists
snabbnfv traffic starting
Loading ./vmx-ports.cfg
engine: start app int2_NIC
engine: start app mgmt2_Virtio
engine: start app mgmt2_NIC
engine: start app int2_Virtio
engine: start app int1_Virtio
engine: start app xe0_Virtio
engine: start app xe0_NIC
engine: start app int1_NIC
engine: start app mgmt1_NIC
engine: start app mgmt1_Virtio
load: time: 1.00s  fps: 0         fpGbps: 0.000 fpb: 0   bpp: -    sleep: 100 us
load: time: 1.00s  fps: 0         fpGbps: 0.000 fpb: 0   bpp: -    sleep: 100 us
load: time: 1.00s  fps: 0         fpGbps: 0.000 fpb: 0   bpp: -    sleep: 100 us
. . .

Prepare and launch vMX

Extract the two VM images from the vMX tar distribution file; look for *.img files, typically found in the images directory:

vmx_20150508.1 # ls images/
debug-20150508.img                                  vmxhdd.img
jinstall64-vmx-14.1-20150505.0-domestic.img         vPFE-20150508.img
jinstall64-vmx-14.1-20150505.0-domestic-signed.tgz  vPFE-lite-20150508.img

The jinstall64-vmx-14.1-20150505.0-domestic.img is used as the vcp image and vPFE-lite-20150508.img as the vfp image. The lite version indicates virtio (non-SR-IOV) support, which is required to run snabb (TODO: verify this).
The vmxhdd.img can optionally be used as a configuration and log file disk when attached to the vcp VM (not used in this example).

The actual images used in this test are a tad more recent and match the FCS version of the vMX.

To keep things as simple as possible (from a dependency and reproducibility point of view), the VMs are launched directly via qemu-system-x86_64.
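
Note that both launch scripts back the guest memory with hugepages mounted at /mnt/huge. If that mount doesn't exist yet, a sketch, assuming 2 MB pages (the two VMs together need about 7 GB):

# mkdir -p /mnt/huge
# mount -t hugetlbfs hugetlbfs /mnt/huge
# echo 4096 > /proc/sys/vm/nr_hugepages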

Open a separate terminal window to launch vcp via a shell script. Prepare by copying the jinstall image to vcp.img and creating the launch script:

# cp jinstall64-vmx-14.1-20150505.0-domestic.img vcp.img
# cat launch-vcp.sh
#!/bin/bash
# hugepage-backed, shared guest memory (memdev + share=on) is required
# for vhost-user, so snabb can map the guest's virtio rings
sudo /usr/local/bin/qemu-system-x86_64 -drive if=ide,file=./vcp.img \
  -M pc -smp 1 --enable-kvm -cpu host -m 2048 -numa node,memdev=mem \
  -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on \
  -chardev socket,id=char0,path=./vhost-sockets/mgmt1.socket,server \
  -netdev type=vhost-user,id=net0,chardev=char0,vhostforce=on -device virtio-net-pci,netdev=net0,mac=52:54:00:00:00:01 \
  -chardev socket,id=char1,path=./vhost-sockets/int1.socket,server \
  -netdev type=vhost-user,id=net1,chardev=char1,vhostforce=on -device virtio-net-pci,netdev=net1,mac=52:54:00:00:01:01 \
  -nographic

Launch vcp from a dedicated terminal window:

root@sm:~/vmxlab# ./launch-vcp.sh
QEMU waiting for connection on: unix:./vhost-sockets/mgmt1.socket,server
QEMU waiting for connection on: unix:./vhost-sockets/int1.socket,server
qemu-system-x86_64: -netdev type=vhost-user,id=net0,chardev=char0,vhostforce=on: chardev "char0" went up

qemu-system-x86_64: -netdev type=vhost-user,id=net1,chardev=char1,vhostforce=on: chardev "char1" went up

Consoles: serial port
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS 639kB/2096000kB available memory

FreeBSD/i386 bootstrap loader, Revision 1.2
(builder@aliath.juniper.net, Tue May 26 02:33:14 UTC 2015)
Loading /boot/defaults/loader.conf
/kernel text=0x965318 data=0x858f8+0x13d980 syms=[0x8+0xe9868+0x8+0xe5774]
...
(output removed)
...
kern.securelevel: -1 -> 1
starting local daemons:set cores for group access
.
Sun Jul 12 19:53:25 UTC 2015
Jul 12 19:53:26 init: mpls-traceroute (PID 2334) started

Amnesia (ttyd0)

login:

You can initially log in as root without a password, launch ‘cli’, enter configuration mode, set a root password, provision em0 (the mgmt port) and assign an IP address to ge-0/0/0. Until the vfp is also up and running and successfully connected to the vcp, the actual ge interfaces won't show up.
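
A minimal configuration sketch, assuming the addresses from the diagram above (the plain-text password prompt is interactive):

login: root
root@% cli
root> configure
root# set system root-authentication plain-text-password
root# set interfaces em0 unit 0 family inet address 192.168.0.11/24
root# set interfaces ge-0/0/0 unit 0 family inet address 10.1.1.11/24
root# commit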

Prepare the vfp.img image and create a launch shell script:

# cp vPFE-lite-20150508.img vfp.img
# cat launch-vfp.sh
#!/bin/bash
# three vhost-user ports map to mgmt2 (eth0), int2 (eth1) and xe0 (ge-0/0/0)
sudo /usr/local/bin/qemu-system-x86_64 -drive if=ide,file=./vfp.img \
--enable-kvm -cpu SandyBridge \
-m 5120 -numa node,memdev=mem \
-smp 3,sockets=1,cores=3,threads=1 \
-object memory-backend-file,id=mem,size=5120M,mem-path=/mnt/huge,share=on \
-chardev socket,id=char0,path=./vhost-sockets/mgmt2.socket,server \
-netdev type=vhost-user,id=net0,chardev=char0 \
-device virtio-net-pci,netdev=net0,mac=52:54:00:00:00:02 \
-chardev socket,id=char1,path=./vhost-sockets/int2.socket,server \
-netdev type=vhost-user,id=net1,chardev=char1 \
-device virtio-net-pci,netdev=net1,mac=52:54:00:00:01:02 \
-chardev socket,id=char2,path=./vhost-sockets/xe0.socket,server \
-netdev type=vhost-user,id=net2,chardev=char2 \
-device virtio-net-pci,netdev=net2,mac=52:54:00:00:02:01 \
-nographic

Launch vfp from a dedicated terminal window:

# ./launch-vfp.sh
QEMU waiting for connection on: unix:./vhost-sockets/mgmt2.socket,server
QEMU waiting for connection on: unix:./vhost-sockets/int2.socket,server
QEMU waiting for connection on: unix:./vhost-sockets/xe0.socket,server
qemu-system-x86_64: -netdev type=vhost-user,id=net0,chardev=char0: chardev "char0" went up

qemu-system-x86_64: -netdev type=vhost-user,id=net1,chardev=char1: chardev "char1" went up

qemu-system-x86_64: -netdev type=vhost-user,id=net2,chardev=char2: chardev "char2" went up


SYSLINUX 6.01 2013-07-04 Copyright (C) 1994-2013 H. Peter Anvin et al
...
(output removed)
...
EAL:   0000:00:03.0 not managed by UIO driver, skipping
EAL:   0000:00:04.0 not managed by UIO driver, skipping
INIT: Initializing NIC port 0 ...
INIT: Initializing NIC port 0 RX queue 0 ...
INIT: Initializing NIC port 0 TX queue 0 ...
RPIO: Command socket listening on: 0.0.0.0:3000
RPIO: Event socket listening on: 0.0.0.0:3001
LU: Initializing LU
INIT: Initialization completed.
CONFIG: NIC RX ports: CONFIG: 0 (CONFIG: 0 CONFIG: )  CONFIG: ;
CONFIG: I/O lcore 1 (socket 0): CONFIG: RX ports  CONFIG: (0, 0, 0)  CONFIG: ; CONFIG: Output rings  CONFIG:
Priority : HiCONFIG:  0x7f31a55e0000  CONFIG:
Priority : NorCONFIG:  0x7f31a55e2080  CONFIG: ;
CONFIG: Worker lcore 2 (socket 0) ID 0: CONFIG: Input rings  CONFIG:
Priority : HiCONFIG:  0x19c85c0  CONFIG:  0x19c8688  CONFIG:
Priority : NorCONFIG:  0x19c85c0  CONFIG: ;
CONFIG:
CONFIG: NIC TX ports:  CONFIG: 0  CONFIG: ;
CONFIG: I/O lcore 1 (socket 0): CONFIG: Input rings per TX port  CONFIG: 0 (CONFIG: 0x7f31a55e4100  CONFIG: )  CONFIG: ;
CONFIG: Worker lcore 2 (socket 0) ID 0:
CONFIG: Output rings per TX port  CONFIG: 0 (0x7f31a55e4100)  CONFIG: ;
CONFIG: Ring sizes: NIC RX = 1024; Worker in = 1024; Worker out = 1024; NIC TX = 1024;
CONFIG: Burst sizes: I/O RX (rd = 32, wr = 32); Worker (rd = 32, wr = 32); I/O TX (rd = 32, wr = 32)
RUNTIME: Logical core 1 (I/O) main loop.

RUNTIME: Logical core 2 (worker 0) main loop.

RPIO: Accepted connection from 172.16.0.1:58521 <-> 172.16.0.2:3000
RPIO: Accepted connection from 172.16.0.1:61497 <-> 172.16.0.2:3000
RIOT: Received bandwidth config: b/w : 125000

RIOT: Initializing policer for bank 0, bucket : 0 rate: 125000

METER: Low level srTCM config:
    CIR period = 19333, CIR bytes per period = 1
RIOT: New policer index: 0

Look out for the “RPIO: Accepted connection” messages: if present, they indicate a successful connection between vfp and vcp.

Log into the vcp console (using the dedicated terminal window for vcp) and check the progress of the vfp connection:

root@vmx> show chassis fpc
                     Temp  CPU Utilization (%)   Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      DRAM (MB) Heap     Buffer
  0  Present          Testing

root@vmx>

root@vmx> show chassis fpc
                     Temp  CPU Utilization (%)   Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      DRAM (MB) Heap     Buffer
  0  Online           Testing   0         0         0         0          0

root@vmx> show chassis fpc
                     Temp  CPU Utilization (%)   Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      DRAM (MB) Heap     Buffer
  0  Online           Testing 100         0       512        14          0

root@vmx> show interfaces terse ge-0/0/0
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
ge-0/0/0.0              up    up   inet     10.1.1.11/24
                                   multiservice

Assuming the ge-0/0/0 interface is configured, it should be possible to ping the Linux host's logical interface p2p1.2 over the loopback cable:

root@vmx> ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1): 56 data bytes
64 bytes from 10.1.1.1: icmp_seq=0 ttl=64 time=373.171 ms
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=16.625 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=7.220 ms
64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=3.704 ms
64 bytes from 10.1.1.1: icmp_seq=4 ttl=64 time=147.411 ms
64 bytes from 10.1.1.1: icmp_seq=5 ttl=64 time=159.695 ms
64 bytes from 10.1.1.1: icmp_seq=6 ttl=64 time=9.828 ms


qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: chardev "char1" went down
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: chardev "char0" went down

^C
--- 10.1.1.1 ping statistics ---
8 packets transmitted, 7 packets received, 12% packet loss
round-trip min/avg/max/stddev = 3.704/102.522/373.171/127.253 ms

root@vmx> qemu-system-x86_64: chardev "char1" went up

qemu-system-x86_64: chardev "char0" went up
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: chardev "char1" went down
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: Failed to read msg header. Read 0 instead of 12.
qemu-system-x86_64: chardev "char0" went down
qemu-system-x86_64: chardev "char1" went up
qemu-system-x86_64: chardev "char0" went up

While the first few pings worked, snabb ran into an issue in this first test. In the snabbnfv terminal window, I got the following error:

 VIRTIO_F_ANY_LAYOUT VIRTIO_NET_F_MQ VIRTIO_NET_F_CTRL_VQ VIRTIO_NET_F_MRG_RXBUF VIRTIO_RING_F_INDIRECT_DESC VIRTIO_NET_F_CSUM
apps/vhost/vhost_user.lua:161: vhost_user: unrecognized request: 0
stack traceback:
core/main.lua:118: in function <core/main.lua:116>
[C]: in function 'error'
apps/vhost/vhost_user.lua:161: in function 'method'
apps/vhost/vhost_user.lua:142: in function 'process_qemu_requests'
apps/vhost/vhost_user.lua:39: in function 'fn'
core/timer.lua:33: in function 'call_timers'
core/timer.lua:42: in function 'run_to_time'
core/timer.lua:23: in function 'run'
core/app.lua:235: in function 'main'
program/snabbnfv/traffic/traffic.lua:85: in function 'traffic'
program/snabbnfv/traffic/traffic.lua:61: in function 'run'
program/snabbnfv/snabbnfv.lua:15: in function 'run'
core/main.lua:56: in function <core/main.lua:32>
[C]: in function 'xpcall'
core/main.lua:125: in main chunk
[C]: at 0x0044d9e0
[C]: in function 'pcall'
core/startup.lua:1: in main chunk
[C]: in function 'require'
[string "require "core.startup""]:1: in main chunk

This obviously needs further investigation, but it's a good start ;-). Time to learn the Lua programming language.

Update: This issue is being tracked and a workaround is provided at https://github.com/SnabbCo/snabbswitch/issues/560
