BGP over unnumbered interfaces, automated

Configuring BGP peering between many network devices in a datacenter quickly becomes repetitive and boring, and hence prone to human error. Each link requires its own dual-stack IP subnet with unique endpoint addresses. The very same endpoint IP addresses must then be used to configure the BGP peers on either side of the link. The BGP neighbors will only connect and exchange routes when all of the configuration is correct. With all the talk about automation, why not explore it? Tools like Ansible come to mind, which excel at provisioning network devices with the help of playbooks based on templates.

Then I stumbled upon a great blog post by ipSpace.net: BGP Configuration Made Simple with Cumulus Linux.
It takes advantage of RFC 5549 to exchange IPv4 BGP prefixes with IPv6 next hops and combines it with running EBGP over IPv6 link-local addresses. That alone won’t solve the issue of discovering the peer AS#, though. According to the blog post, they use a non-standard method: the peer AS# is learned during the BGP OPEN message exchange instead of being strictly matched against configuration.

Cool, but I mainly work with Junos-based network elements. There is ongoing work in the IETF to augment LLDP with BGP peer information: draft-acee-idr-lldp-peer-discovery. To bridge the gap until this eventually gets implemented in Junos, I built an on-box prototype that simplifies and fully automates the provisioning of EBGP peers. It relies on IPv6 router advertisements, a SLAX-based event script and the ephemeral database.

bgp_unnumbered_spine_leaf

The event script ad_bgp_peers.slax runs periodically. It uses the received IPv6 router advertisement messages to learn, for each link, the local interface and the remote IPv6 link-local address. Learned peers are dynamically provisioned via the Junos ephemeral database.

Similar to what Ivan described in his blog post, a non-standard method is required to learn the remote AS. One option would be to add custom LLDP TLVs and learn the AS# that way, but for simplicity a hack is used that stores the AS# in the ipv6-ra-reachable-time field. This allows all relevant parameters to be learned from the IPv6 RA message. You will probably cry foul here: how can you re-use a timer field? Well, we are talking about sharing information between two peers that use the same automation method, and IPv6 RAs already announce the local IPv6 address (the LLA in our case). It would have been great to add custom TLVs, but I don’t think there is support for that. The I-D mentioned earlier uses LLDP to transport such information. Once that solution is available in Junos, one can simply adjust the script or, even better, the script might become obsolete.
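One consequence of this hack is worth spelling out: the AS# has to fit into whatever range the reachable-time value allows. RFC 4861 caps the advertised reachable time at 3,600,000 ms, so the sketch below assumes the same upper bound applies to the configured value. The function names are made up for illustration and are not part of ad_bgp_peers.slax.

```python
# Upper bound for the advertised reachable time per RFC 4861 (1 hour in ms).
# Assumption: the router enforces this bound on the configured value.
MAX_REACHABLE_TIME_MS = 3_600_000


def as_to_reachable_time(asn: int) -> int:
    """Return the reachable-time value (in 'ms') that carries the given AS#."""
    if not 1 <= asn <= MAX_REACHABLE_TIME_MS:
        raise ValueError(f"AS {asn} does not fit into the reachable-time field")
    return asn  # the AS# is stored as-is, only the unit is "wrong"


def reachable_time_to_as(reachable_ms: int) -> int:
    """Interpret a received reachable-time value as the peer's AS#."""
    if not 1 <= reachable_ms <= MAX_REACHABLE_TIME_MS:
        raise ValueError(f"{reachable_ms} ms cannot be a valid encoded AS#")
    return reachable_ms


if __name__ == "__main__":
    print(as_to_reachable_time(65011))
    print(reachable_time_to_as(65002))
```

Under that assumption all 2-byte AS numbers fit, while the 4-byte private range starting at 4200000000 does not.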

While working on this idea, I picked up some skills in using IPv4 unnumbered links and IPv6 LLAs, and in transporting family inet and inet6 over a single IPv6 EBGP session. Even without a need for the automation script, I’m sure I will look back at my very own blog post to see how to do it ;-).

The prototype was developed using vMX 18.1R1, but should work on Junos 16.1R3 or newer (the ephemeral DB being the gating feature). You can find the files at https://gitlab.com/mwiget/bgp-unnumbered.

Features

  • All connecting links are equally configured with IPv4 unnumbered (loopback donor IP) and IPv6 with just link local addresses (“empty inet6” interface stanza is sufficient).
  • Automatic provisioning of the non-standard transport of the local AS via IPv6 RA messages
  • Automatic provisioning of BGP peers via ephemeral DB
  • Any topology is supported without distinguishing spine from leaf
  • BGP peers are only provisioned when both ends use this automation event script
  • Timeout before BGP peer gets purged can be configured, in minutes since the last IPv6 RA message was received
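The purge timeout from the last bullet boils down to comparing the age of the last received RA against a limit in minutes. A minimal sketch of that check, assuming the age is given as an HH:MM:SS string like the "heard ... ago" values in the router advertisement output (longer ages in other formats are not handled); the function names are hypothetical:

```python
def heard_ago_seconds(heard: str) -> int:
    """Convert an 'HH:MM:SS' age string into seconds."""
    hours, minutes, seconds = (int(part) for part in heard.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def is_stale(heard: str, timeout_minutes: int) -> bool:
    """True when the last RA is older than the configured timeout."""
    return heard_ago_seconds(heard) > timeout_minutes * 60


if __name__ == "__main__":
    for age in ("00:01:02", "00:05:27", "00:08:06"):
        print(age, "stale" if is_stale(age, timeout_minutes=5) else "fresh")
```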

Installation

Repeat these steps for each device participating in the auto provisioning of BGP peering.

Upload the event/op script ad_bgp_peers.slax to /var/db/scripts/op on each Junos device:

scp ad_bgp_peers.slax user@router:/var/db/scripts/op/

Configure unique loopback IP addresses:

set interfaces lo0 unit 0 family inet address 10.0.0.1/32
set interfaces lo0 unit 0 family inet6 address ::ffff:10.0.0.1/128

Configure IPv4 unnumbered and family inet6 on all links. (This could be simplified via wildcards with configuration groups.)

set interfaces xe-0/0/0 unit 0 family inet unnumbered-address lo0.0
set interfaces xe-0/0/0 unit 0 family inet6
set interfaces xe-0/0/1 unit 0 family inet unnumbered-address lo0.0
set interfaces xe-0/0/1 unit 0 family inet6
set interfaces xe-0/0/2 unit 0 family inet unnumbered-address lo0.0
set interfaces xe-0/0/2 unit 0 family inet6
set interfaces xe-0/0/3 unit 0 family inet unnumbered-address lo0.0
set interfaces xe-0/0/3 unit 0 family inet6
set interfaces xe-0/0/4 unit 0 family inet unnumbered-address lo0.0
set interfaces xe-0/0/4 unit 0 family inet6
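Since every link gets the same two statements, the list above is easy to generate. A small sketch that prints the set commands for a list of interfaces; the helper name and its defaults are my own choice, only the generated statements are taken from the configuration above:

```python
def unnumbered_link_config(interfaces, donor="lo0.0", unit=0):
    """Return the 'set' commands for IPv4 unnumbered plus inet6 per interface."""
    lines = []
    for ifd in interfaces:
        base = f"set interfaces {ifd} unit {unit}"
        lines.append(f"{base} family inet unnumbered-address {donor}")
        lines.append(f"{base} family inet6")
    return lines


if __name__ == "__main__":
    ports = [f"xe-0/0/{n}" for n in range(5)]  # xe-0/0/0 .. xe-0/0/4
    print("\n".join(unnumbered_link_config(ports)))
```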

Configure a unique autonomous system number per device:

set routing-options autonomous-system 65001

Configure IPv6 ND and RA for all links on each device:

set protocols neighbor-discovery onlink-subnet-only
set protocols router-advertisement interface xe-0/0/2.0
set protocols router-advertisement interface xe-0/0/0.0
set protocols router-advertisement interface xe-0/0/1.0
set protocols router-advertisement interface xe-0/0/3.0
set protocols router-advertisement interface xe-0/0/4.0

Configure BGP with export policies that set next-hop self and redistribute direct and static routes:

set protocols bgp family inet unicast local-ipv4-address 10.0.0.1
set protocols bgp family inet6 unicast
set protocols bgp export send-direct
set protocols bgp export send-static
set protocols bgp export change-nh
set policy-options policy-statement change-nh from protocol bgp
set policy-options policy-statement change-nh then next-hop self
set policy-options policy-statement change-nh then accept
set policy-options policy-statement send-direct term 1 from protocol direct
set policy-options policy-statement send-direct term 1 then accept
set policy-options policy-statement send-static term 1 from protocol static
set policy-options policy-statement send-static term 1 then accept

Configure OSPF on all peering interfaces (required to resolve IPv4 unnumbered next hops):

set protocols ospf area 0.0.0.0 interface all interface-type p2p
set protocols ospf area 0.0.0.0 interface lo0.0 passive

Configure the ephemeral storage. The name must match the one used in the script:

set system configuration-database ephemeral instance ad_bgp_peers

Configure the op/event script and a periodic trigger (there might be better ways to trigger on specific events, but those might come with their own set of issues). Configuring the event script as an op script allows it to be used interactively too:

set system scripts op file ad_bgp_peers.slax
set event-options generate-event bgp_unnumbered_trigger time-interval 60
set event-options policy bgp_unnumbered_policy events bgp_unnumbered_trigger
set event-options policy bgp_unnumbered_policy then event-script ad_bgp_peers.slax

Verification

The following operational commands are run on leaf1 in a three-vMX topology spun up via Docker Compose.

Verify IPv6 router advertisements:

mwiget@leaf1> show ipv6 router-advertisement
Interface: xe-0/0/0.0
  Advertisements sent: 107, last sent 00:01:34 ago
  Solicits received: 0
  Advertisements received: 104
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe13:2, heard 00:01:02 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65001 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/1.0
  Advertisements sent: 109, last sent 00:00:58 ago
  Solicits received: 0
  Advertisements received: 104
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe14:2, heard 00:02:25 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65001 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/2.0
  Advertisements sent: 110, last sent 00:02:22 ago
  Solicits received: 0
  Advertisements received: 103
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe15:2, heard 00:01:34 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65001 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/3.0
  Advertisements sent: 105, last sent 00:05:41 ago
  Solicits received: 0
  Advertisements received: 107
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe16:2, heard 00:01:04 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65001 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/4.0
  Advertisements sent: 114, last sent 00:02:44 ago
  Solicits received: 0
  Advertisements received: 106
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe17:2, heard 00:00:20 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65001 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/5.0
  Advertisements sent: 94, last sent 00:05:50 ago
  Solicits received: 0
  Advertisements received: 86
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe18:2, heard 00:00:14 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65002 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/6.0
  Advertisements sent: 95, last sent 00:02:05 ago
  Solicits received: 0
  Advertisements received: 92
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe19:2, heard 00:00:18 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65002 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/7.0
  Advertisements sent: 96, last sent 00:08:33 ago
  Solicits received: 0
  Advertisements received: 88
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe1a:2, heard 00:05:27 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65002 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/8.0
  Advertisements sent: 96, last sent 00:02:40 ago
  Solicits received: 0
  Advertisements received: 84
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe1b:2, heard 00:08:06 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65002 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64
Interface: xe-0/0/9.0
  Advertisements sent: 92, last sent 00:05:31 ago
  Solicits received: 0
  Advertisements received: 91
  Solicited router advertisement unicast: Disable
  IPv6 RA Preference: DEFAULT/MEDIUM
  Advertisement from fe80::242:acff:fe1c:2, heard 00:06:08 ago
    Managed: 0
    Other configuration: 0
    Reachable time: 65002 ms [65011 ms]
    Default lifetime: 1800 sec
    Retransmit timer: 0 ms
    Current hop limit: 64

A nice unintentional side effect: the show ipv6 router-advertisement command displays differences between the local and remote configuration, and thereby exposes the local and remote AS numbers as part of the reachable time, though with the “wrong” unit ;-):

Reachable time: 65002 ms [65011 ms]

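Note the Reachable time lines. They contain two numbers, e.g. 65002 and 65011: the first is the value received from the neighbor (the remote AS#), the second, in brackets, is the locally configured value (the local AS#). The mismatch is reported as a conflict.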
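To illustrate how interface, neighbor LLA and AS# all fall out of this one command, here is a rough text-scraping sketch. It assumes the output layout shown above and that the bracketed value only appears when the two sides differ; it is not how ad_bgp_peers.slax itself obtains the data, and the function and key names are hypothetical.

```python
import re

IFACE_RE = re.compile(r"^Interface:\s+(\S+)")
ADV_RE = re.compile(r"^\s+Advertisement from (\S+), heard (\S+) ago")
REACH_RE = re.compile(r"^\s+Reachable time:\s+(\d+) ms(?: \[(\d+) ms\])?")


def parse_router_advertisements(text):
    """Return one dict per received advertisement found in the CLI output."""
    peers = []
    iface = None
    current = None
    for line in text.splitlines():
        m = IFACE_RE.match(line)
        if m:
            iface, current = m.group(1), None
            continue
        m = ADV_RE.match(line)
        if m:
            current = {"interface": iface, "neighbor": m.group(1),
                       "heard": m.group(2), "peer_as": None, "local_as": None}
            peers.append(current)
            continue
        m = REACH_RE.match(line)
        if m and current is not None:
            current["peer_as"] = int(m.group(1))      # value sent by the neighbor
            if m.group(2):
                current["local_as"] = int(m.group(2))  # locally configured value
    return peers


if __name__ == "__main__":
    sample = (
        "Interface: xe-0/0/5.0\n"
        "  Advertisements sent: 94, last sent 00:05:50 ago\n"
        "  Advertisement from fe80::242:acff:fe18:2, heard 00:00:14 ago\n"
        "    Reachable time: 65002 ms [65011 ms]\n"
    )
    for peer in parse_router_advertisements(sample):
        print(peer)
```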

Run the op script manually to verify its operation. Although the script gets executed every minute anyway, it doesn’t hurt to run it manually for documentation purposes:

mwiget@leaf1> op ad_bgp_peers
xe-0/0/0.0 neighbor fe80::242:acff:fe13:2 peer-as 65001 since 00:03:32
xe-0/0/1.0 neighbor fe80::242:acff:fe14:2 peer-as 65001 since 00:04:55
xe-0/0/2.0 neighbor fe80::242:acff:fe15:2 peer-as 65001 since 00:04:04
xe-0/0/3.0 neighbor fe80::242:acff:fe16:2 peer-as 65001 since 00:03:34
xe-0/0/4.0 neighbor fe80::242:acff:fe17:2 peer-as 65001 since 00:02:50
xe-0/0/5.0 neighbor fe80::242:acff:fe18:2 peer-as 65002 since 00:02:44
xe-0/0/6.0 neighbor fe80::242:acff:fe19:2 peer-as 65002 since 00:02:49
xe-0/0/7.0 neighbor fe80::242:acff:fe1a:2 peer-as 65002 since 00:07:57
xe-0/0/8.0 neighbor fe80::242:acff:fe1b:2 peer-as 65002 since 00:01:12
xe-0/0/9.0 neighbor fe80::242:acff:fe1c:2 peer-as 65002 since 00:01:54
Auto discovered BGP peers (show ephemeral-configuration ad_bgp_peers)

This shows the discovered remote IPv6 link-local addresses (LLA) and AS#, which lead to provisioned BGP peers in the ephemeral database. The last line of the output shows the command to display its content:

mwiget@leaf1> show ephemeral-configuration ad_bgp_peers
## Last changed: 2018-06-09 23:32:34 UTC
protocols {
    router-advertisement {
        interface xe-0/0/2.0 {
            reachable-time 65011;
        }
        interface xe-0/0/1.0 {
            reachable-time 65011;
        }
        interface xe-0/0/0.0 {
            reachable-time 65011;
        }
        interface xe-0/0/3.0 {
            reachable-time 65011;
        }
        interface xe-0/0/4.0 {
            reachable-time 65011;
        }
        interface xe-0/0/5.0 {
            reachable-time 65011;
        }
        interface xe-0/0/6.0 {
            reachable-time 65011;
        }
        interface xe-0/0/7.0 {
            reachable-time 65011;
        }
        interface xe-0/0/8.0 {
            reachable-time 65011;
        }
        interface xe-0/0/9.0 {
            reachable-time 65011;
        }
    }
    bgp {
        group ad_bgp_peers {
            type external;
            neighbor fe80::242:acff:fe13:2 {
                local-interface xe-0/0/0.0;
                peer-as 65001;
            }
            neighbor fe80::242:acff:fe14:2 {
                local-interface xe-0/0/1.0;
                peer-as 65001;
            }
            neighbor fe80::242:acff:fe15:2 {
                local-interface xe-0/0/2.0;
                peer-as 65001;
            }
            neighbor fe80::242:acff:fe16:2 {
                local-interface xe-0/0/3.0;
                peer-as 65001;
            }
            neighbor fe80::242:acff:fe17:2 {
                local-interface xe-0/0/4.0;
                peer-as 65001;
            }
            neighbor fe80::242:acff:fe18:2 {
                local-interface xe-0/0/5.0;
                peer-as 65002;
            }
            neighbor fe80::242:acff:fe19:2 {
                local-interface xe-0/0/6.0;
                peer-as 65002;
            }
            neighbor fe80::242:acff:fe1a:2 {
                local-interface xe-0/0/7.0;
                peer-as 65002;
            }
            neighbor fe80::242:acff:fe1b:2 {
                local-interface xe-0/0/8.0;
                peer-as 65002;
            }
            neighbor fe80::242:acff:fe1c:2 {
                local-interface xe-0/0/9.0;
                peer-as 65002;
            }
        }
    }
}
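The BGP part of this ephemeral configuration is a straightforward rendering of the discovered (interface, LLA, AS#) tuples. A minimal sketch of that rendering step, using a hypothetical helper and dict keys of my own choosing; it only produces text and does not talk to the ephemeral database:

```python
def render_bgp_group(peers, group="ad_bgp_peers"):
    """Render discovered peers as a Junos-style 'protocols bgp group' stanza."""
    lines = [
        "protocols {",
        "    bgp {",
        f"        group {group} {{",
        "            type external;",
    ]
    for peer in peers:
        lines += [
            f"            neighbor {peer['neighbor']} {{",
            f"                local-interface {peer['interface']};",
            f"                peer-as {peer['peer_as']};",
            "            }",
        ]
    lines += ["        }", "    }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    discovered = [
        {"interface": "xe-0/0/0.0", "neighbor": "fe80::242:acff:fe13:2", "peer_as": 65001},
        {"interface": "xe-0/0/5.0", "neighbor": "fe80::242:acff:fe18:2", "peer_as": 65002},
    ]
    print(render_bgp_group(discovered))
```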

So far all good. Check the BGP neighbors:

mwiget@leaf1> show bgp group
Group Type: External                               Local AS: 65011
  Name: ad_bgp_peers    Index: 0                   Flags: <>
  Export: [ send-direct send-static change-nh ]
  Holdtime: 0
  Total peers: 10       Established: 10
  fe80::242:acff:fe13:2+53735
  fe80::242:acff:fe14:2+62749
  fe80::242:acff:fe15:2+52357
  fe80::242:acff:fe16:2+55290
  fe80::242:acff:fe17:2+64098
  fe80::242:acff:fe18:2+179
  fe80::242:acff:fe19:2+54499
  fe80::242:acff:fe1a:2+51598
  fe80::242:acff:fe1b:2+61851
  fe80::242:acff:fe1c:2+179
  inet.0: 0/30/0/0
  inet6.0: 2/10/10/0

Groups: 1  Peers: 10   External: 10   Internal: 0    Down peers: 0   Flaps: 8
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0
                      30          0          0          0          0          0
inet6.0
                      10          2          0          0          0          0

And individual neighbors:

mwiget@leaf1> show bgp summary
Groups: 1 Peers: 10 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0
                      30          0          0          0          0          0
inet6.0
                      10          2          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
fe80::242:acff:fe13:2       65001        704        696       0       1     5:18:19 Establ
  inet.0: 0/3/0/0
  inet6.0: 1/1/1/0
fe80::242:acff:fe14:2       65001        704        696       0       2     5:18:15 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe15:2       65001        704        696       0       1     5:18:11 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe16:2       65001        704        696       0       2     5:18:07 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe17:2       65001        704        697       0       2     5:18:03 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe18:2       65002        821        819       0       0     6:12:06 Establ
  inet.0: 0/3/0/0
  inet6.0: 1/1/1/0
fe80::242:acff:fe19:2       65002        822        819       0       0     6:12:03 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe1a:2       65002        821        819       0       0     6:11:59 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe1b:2       65002        821        819       0       0     6:11:55 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
fe80::242:acff:fe1c:2       65002        820        819       0       0     6:11:50 Establ
  inet.0: 0/3/0/0
  inet6.0: 0/1/1/0
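With ten peers this is still easy to eyeball, but the same check can be scripted. The sketch below scans text in the layout of the show bgp summary output above and reports whether every peer line ends in Establ. It assumes exactly this layout (peer address in the first column, state as the last token); other output formats would need different handling. The function names are hypothetical.

```python
import ipaddress


def bgp_peer_states(summary_text):
    """Map each peer address to the last token of its summary line."""
    states = {}
    for line in summary_text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or line[:1].isspace():
            continue  # skip blank lines and indented per-table counters
        try:
            ipaddress.ip_address(tokens[0])
        except ValueError:
            continue  # header or table line, not a peer
        states[tokens[0]] = tokens[-1]
    return states


def all_established(summary_text):
    """True when at least one peer exists and all peers show 'Establ'."""
    states = bgp_peer_states(summary_text)
    return bool(states) and all(state == "Establ" for state in states.values())


if __name__ == "__main__":
    sample = (
        "Groups: 1 Peers: 10 Down peers: 0\n"
        "fe80::242:acff:fe13:2       65001        704        696       0       1     5:18:19 Establ\n"
        "  inet.0: 0/3/0/0\n"
    )
    print(bgp_peer_states(sample), all_established(sample))
```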

I hope this post is useful and triggers other ideas on how to automate and leverage what’s available today.
