CCIE - EI: 3.1 MPLS 📝

2022-04-29 · Topic: CCIE-EI

This is a summary of the notes I’ve written for CCIE-EI section 3.1, MPLS. In other words, it only contains what I felt the need to write down and is not meant as a complete study resource. Please see the study resources I’ve used or related blogs for more coherent writeups.

3.1 - MPLS

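Lower latency and higher throughput were the original benefits of MPLS/label switching over IP routing. With CEF the forwarding-speed gap is small, and services such as L3VPN are the main reason to run MPLS today.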

MPLS Header(4B):

  • 20b - Label
  • 3b - Exp
  • 1b - S-bit
  • 8b - TTL
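
To make the bit layout concrete, here is a small Python sketch that packs and unpacks a single 4-byte header. The function names are my own and only exist for illustration.

```python
import struct

def pack_mpls(label, exp, s, ttl):
    """Pack one 4-byte MPLS header: label(20b) | exp(3b) | s(1b) | ttl(8b)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order, 32 bits

def unpack_mpls(data):
    """Split the first 4 bytes of data back into the header fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }

# Label 16 is the first non-reserved label value, S=1 marks the bottom of the stack.
header = pack_mpls(16, 0, 1, 255)
print(header.hex(), unpack_mpls(header))
```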

LIB - Similar to the RIB, containing all local and remote labels + neighbor RID per associated prefix.
LFIB - Similar to the FIB, containing the local label, outgoing label/action and outgoing interface for the best LSP segment.

! LIB
show mpls ldp bindings {prefix} {length}

! LFIB
show mpls forwarding-table {prefix} {length}

Label switching:

Incoming MPLS labeled packets are looked up in the LFIB based on the label in the MPLS header. The label of the packet is replaced with the remote label listed in the LFIB entry, and the packet is forwarded out the interface in the entry.

Incoming IP packets are looked up in the FIB. If the destination is to be routed into MPLS, the FIB entry lists the label to impose, so the packet is encapsulated & forwarded without any LIB/LFIB lookups.

LSRs do the following:

  • Push, add a label
  • Swap, replace a label
  • Pop, remove labels
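
The three operations are easy to picture as list operations on a label stack. This is only a toy model in Python with the top of the stack at index 0. The label values are made up.

```python
def push(stack, label):
    """Add a label on top of the stack."""
    return [label] + stack

def swap(stack, label):
    """Replace the top label, leaving the rest of the stack untouched."""
    if not stack:
        raise ValueError("cannot swap on an empty label stack")
    return [label] + stack[1:]

def pop(stack):
    """Remove the top label."""
    if not stack:
        raise ValueError("cannot pop an empty label stack")
    return stack[1:]

stack = push([], 200)        # ingress pushes an inner label
stack = push(stack, 100)     # ...and an outer label on top of it
stack = swap(stack, 101)     # a core LSR swaps the outer label
stack = pop(stack)           # the outer label is removed again
print(stack)
```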

3.1.a Operations

Label stack, LSR, LSP

Glossary:

  • LSR - Label Switching Router, any router that forwards based on labels
  • E-LSR - Edge Label Switching Router, any router that receives IP packets and forwards based on labels, or the other way around
  • LSP - Label Switched Path, the chain of labels used to reach a prefix. The LSP is unidirectional.
  • Label stack - Stacking MPLS headers, used in L3VPNs
  • Local label - Label that has been assigned for a prefix by the local router
  • Remote label - Label that has been assigned for a prefix by an LDP peer
  • FEC - Forwarding Equivalence Class, a set of packets receiving the same treatment by a single LSR.

LDP - Label distribution protocol

Peering

Transport:
Hello - UDP/646 -> 224.0.0.2
Update - TCP/646 -> Transport address/LSR ID

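The TCP connection is initiated by the LSR with the highest transport address (the LSR ID by default, can be changed per interface with mpls ldp discovery transport-address).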

LSR ID selection:
Same as OSPF

  1. Configuration: mpls ldp router-id {interface} [force]
  2. Highest UP/UP Loopback
  3. Highest UP/UP Non-loopback
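
The selection order can be sketched in Python. This is a rough model and not how IOS implements it: the configured argument stands in for an explicitly configured ID, and the interface dictionaries are invented for the example.

```python
from ipaddress import IPv4Address

def select_lsr_id(interfaces, configured=None):
    """Pick an LSR ID: explicit config first, then highest up/up loopback,
    then highest up/up non-loopback address."""
    if configured is not None:
        return IPv4Address(configured)
    up = [i for i in interfaces if i["up"]]
    for want_loopback in (True, False):
        candidates = [IPv4Address(i["ip"]) for i in up if i["loopback"] == want_loopback]
        if candidates:
            return max(candidates)
    return None  # no usable address

interfaces = [
    {"name": "Lo0", "ip": "10.0.0.1", "up": True, "loopback": True},
    {"name": "Gi0/0", "ip": "192.168.1.1", "up": True, "loopback": False},
]
print(select_lsr_id(interfaces))
```

Note that the loopback wins even though the physical interface has the numerically higher address.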

Messages
Hello - LSR ID(32b), Label space(16b), Transport address, Hold-time
Update -

Timers
Hello - 5s, 10s(targeted)
Hold - 15s, 90s(targeted)
Timers can be configured with mpls ldp discovery .....

Authentication
MD5 authentication supported, configured as such:

! Specific neighbor
mpls ldp neighbor {IP} password {password}

! All neighbors
mpls ldp password fallback {password}
mpls ldp password required

Minimum viable LDP configuration

! Enabled by default, not required unless manually disabled
ip cef

! LDP is default, not required unless manually set to TDP
mpls label protocol {ldp | tdp | both}

! Repeat the next two lines for each MPLS-enabled interface
interface type x/y/z
 mpls ip
! - OR, under the IGP process (OSPF shown)
router ospf {process-id}
 mpls ldp autoconfig

mpls ldp autoconfig is preferred, as it enables LDP on all interfaces where the IGP is running. This prevents black-holing traffic in MPLS unicast IP forwarding, since each LSR chooses the “best LSP segment” individually based on the best IP unicast route in its local RIB.

Multihop peerings can be achieved as such:

mpls ldp neighbor [vrf {vrf}] {ip-addr} targeted

! Required on the remote LSR unless it has its own targeted neighbor statement
mpls ldp discovery targeted-hello accept

In theory, this will also reduce convergence time for flapping links.

Session protection prevents peerings from having to be re-established when directly connected links flap. Targeted hellos are sent to the transport address of the dynamically discovered neighbor. If the directly connected link fails, the targeted hellos might still succeed through another path. In that case the LDP peering is maintained, so re-establishing it won’t be necessary once the link comes back up. Configured with mpls ldp session protection [duration {infinite|sec}].

Unicast MPLS

All LSRs create an entry in the LIB for each destination prefix in the RIB, except for BGP routes. These bindings are distributed through LDP to create the LSP.

LSP Creation:

  1. Prefix added to local RIB, local-label is selected
  2. Prefix + Label is advertised to all LDP neighbors
  3. Neighboring routers add the new information to the local LIB
  4. Neighboring routers learn the prefix through IGP and allocate a local-label
  5. Neighboring routers advertise the prefix + label to all neighbors, without split-horizon rules.

The end result is a unidirectional LSP towards each prefix.

LSP Segment selection:

! 1. Find prefix and next-hop
show ip route

! 2. Find labels associated with prefix and LSR ID
show mpls ldp bindings

! 3. Find neighbor associated with next-hop of IP route, 
! correlate with LSR ID of label binding to find LSP segment.
show mpls ldp neighbor

! 4. Information is added to the LFIB 
show mpls forwarding-table {prefix} {len}
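
The correlation in steps 1 to 3 can be sketched in Python. The dictionaries below are simplified stand-ins I made up for the RIB, the LIB and the LDP neighbor table; they are not real data structures or show-command output.

```python
def select_lsp_segment(prefix, rib, lib, neighbors):
    """Return the outgoing label for prefix, or None if no binding matches."""
    next_hop = rib[prefix]                        # 1. next-hop from the RIB
    bindings = lib.get(prefix, {})                # 2. {lsr_id: remote_label}
    for lsr_id, addresses in neighbors.items():   # 3. which neighbor owns the next-hop?
        if next_hop in addresses and lsr_id in bindings:
            # 4. this is what would end up in the LFIB
            return {"next_hop": next_hop, "lsr_id": lsr_id, "out_label": bindings[lsr_id]}
    return None

rib = {"10.0.0.4/32": "10.1.12.2"}
lib = {"10.0.0.4/32": {"10.0.0.2": 24, "10.0.0.3": 31}}
neighbors = {
    "10.0.0.2": {"10.0.0.2", "10.1.12.2"},
    "10.0.0.3": {"10.0.0.3", "10.1.13.3"},
}
print(select_lsp_segment("10.0.0.4/32", rib, lib, neighbors))
```

Both neighbors advertise a label for the prefix, but only the label from the neighbor that owns the IGP next-hop address is used.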

Multipathing

The PE routers need to have maximum-paths set above 1 and path-hiding needs to be avoided for multipathing to work.

Path hiding on RRs can be avoided by using unique RDs on the PE routers (typically IP:NUM instead of ASN:NUM). This works because the RR runs best-path calculation per RD + prefix, so routes with different RDs are never compared against each other. Additional-paths can also be used.
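
A toy Python model of why unique RDs help. The best-path logic here is heavily simplified on purpose (highest local-pref wins, first route wins a tie), and the tuples are made-up example routes.

```python
def rr_best_paths(routes):
    """routes: list of (rd, prefix, next_hop, local_pref).
    Keep one best route per (rd, prefix), like an RR without additional-paths."""
    best = {}
    for rd, prefix, next_hop, local_pref in routes:
        key = (rd, prefix)
        if key not in best or local_pref > best[key][1]:
            best[key] = (next_hop, local_pref)
    return best

# Same RD on both PEs: the RR reflects only one of the two paths.
shared_rd = [
    ("65000:1", "192.168.1.0/24", "10.0.0.1", 100),
    ("65000:1", "192.168.1.0/24", "10.0.0.2", 100),
]
# Unique RD per PE: two distinct VPNv4 prefixes, both are reflected.
unique_rd = [
    ("10.0.0.1:1", "192.168.1.0/24", "10.0.0.1", 100),
    ("10.0.0.2:1", "192.168.1.0/24", "10.0.0.2", 100),
]
print(len(rr_best_paths(shared_rd)), len(rr_best_paths(unique_rd)))
```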

Penultimate Hop Popping (PHP)

PHP reduces the load on egress LSRs in MPLS by instructing the second-last LSR in the path to pop the outermost label. Without PHP the egress LSR has to do two LFIB lookups in L3VPN, one for each label in the stack.

PHP is achieved by the egress LSR advertising a “null label” to its neighbors. The null labels are the reserved label values 0 (explicit) and 3 (implicit). By default PHP uses the implicit null label (3), which causes the neighbors to pop the entire outer MPLS header from packets.

An explicit null (0) causes neighbors to replace the label of the outer MPLS header with 0, preserving the rest of the fields. This is useful whenever you want to keep the original EXP field for QoS. MPLS packets with the explicit null label do not require the additional LFIB lookup on the egress LSR. Explicit null labels can be configured by adding mpls ldp explicit-null to the egress PE.

MPLS ping, MPLS traceroute

TTL Propagation

By default the ingress E-LSR will decrease the IP TTL, then copy the TTL of the incoming IP packet into the TTL field of the MPLS packet. LSRs will then decrease the TTL in the MPLS header without touching the IP TTL field. The egress E-LSR will decrease the MPLS header TTL, decapsulate the IP packet and insert the value of the MPLS header TTL into the IP header TTL.

no mpls ip propagate-ttl disables the copying of the TTL field to/from the IP header, which makes the entire MPLS “cloud” look like 1 hop to external devices.
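
The TTL handling can be followed with a small Python model. It is a simplification that follows the description above: PHP and TTL expiry are ignored, and with propagation disabled I assume the MPLS TTL starts at 255 and the egress leaves the IP TTL alone.

```python
def ip_ttl_after_cloud(ip_ttl, core_hops, propagate=True):
    """Return the IP TTL seen after the packet has crossed the MPLS cloud."""
    # Ingress E-LSR: decrease the IP TTL, then copy it into the MPLS header
    ip_ttl -= 1
    mpls_ttl = ip_ttl if propagate else 255  # 255 is an assumption for the disabled case
    # Core LSRs only touch the MPLS TTL
    mpls_ttl -= core_hops
    # Egress E-LSR: decrease the MPLS TTL and copy it back if propagation is on
    mpls_ttl -= 1
    if propagate:
        ip_ttl = mpls_ttl
    return ip_ttl

print(ip_ttl_after_cloud(64, core_hops=2))                   # every LSR counts as a hop
print(ip_ttl_after_cloud(64, core_hops=2, propagate=False))  # the cloud counts as 1 hop
```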

MPLS Ping

MPLS Traceroute

3.1.b L3VPN

L3VPN route advertisement:

  1. A PE learns a route in a VRF
  2. The PE creates a VPN label defining the outgoing VRF and advertises VPN label + prefix in a VPNv4 route
  3. PEs learn the VPNv4 route for the prefix
  4. PEs import/export routes into VRF RIB based on route-targets

Forwarding L3VPN traffic:

  1. PE receives packet in a VRF headed for a remote site
  2. PE does a lookup for the prefix in the VRF FIB and finds the associated VPN label and the MPLS unicast label for the BGP next-hop address.
  3. PE adds both labels and forwards the packet
  4. Packet is forwarded according to MPLS unicast rules until it reaches the penultimate hop (egress router -1).
  5. The penultimate hop pops the MPLS unicast label and forwards the packet with only the VPN label to the egress PE
  6. Egress PE forwards the packet into a VRF according to the received VPN label

The general steps of configuring L3VPNs:

  1. Configure an IGP for the AS
  2. Configure MPLS unicast forwarding on the underlay
! Per interface
interface {type} {x/y/z}
 mpls ip
! OR, under the IGP process
router ospf {process-id}
 mpls ldp autoconfig
  3. Configure VRFs
vrf definition {name}
 rd {IP:NUM}
 address-family ipv4
  route-target both {ASN:NUM}
  4. Configure PE - CE routing and redistribution
  5. Configure MP-BGP between all PEs.
! RR could/should be used
router bgp 65000
 no bgp default ipv4-unicast
 neighbor {} remote-as 65000
 neighbor {} update-source lo0 ! Must be /32
 neighbor {} next-hop-self
 address-family vpnv4
  neighbor {} activate
  neighbor {} send-community extended

PE-CE routing

VRF aware routing into specific VRFs on the PE, with some nice gotchas…

EIGRP

Cost Community
When EIGRP is used for PE-CE routing the EIGRP metrics are sent along the VPNv4 route in the ‘Cost:pre-bestpath:’ extended community attribute. This allows advertising the routes as internal routes upon redistribution into EIGRP on the remote end if the EIGRP ASNs match. The metric specified when redistributing BGP into EIGRP will not be used, but is required in the configuration.

This feature can be disabled with bgp bestpath cost-community ignore, or the values can be set with the set extcommunity cost route-map action. Disabling the cost community entails a risk of any backdoor path being preferred over the MPLS VPN and requires the seed metric specified on the redistribution command to be well thought out.

If there is an EIGRP ASN mismatch between sites the cost community will not be used and the advertised metric will be equal to the metric configured on the redistribution command.

Site of Origin
Site of Origin (SoO) should be used whenever multi-homed sites exist to avoid routing loops. SoO is both a BGP extended community and an EIGRP TLV used for filtering. When routes are redistributed between BGP and EIGRP the SoO is retained, as a TLV or as a community respectively. Once an SoO value has been set it will never be rewritten.

An SoO is associated with an interface: routes sent out the interface are tagged with the SoO value, and received routes matching the local SoO are dropped.
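
The tag-and-filter behaviour as described above can be modelled in a few lines of Python. This only mirrors my notes and is not the actual EIGRP/BGP implementation. The route dictionaries and SoO values are invented.

```python
def send_out(routes, interface_soo):
    """Tag routes leaving the interface. An existing SoO is never rewritten."""
    return [dict(r, soo=r.get("soo") or interface_soo) for r in routes]

def receive_in(routes, interface_soo):
    """Drop received routes whose SoO matches the SoO of the interface."""
    return [r for r in routes if r.get("soo") != interface_soo]

routes = [
    {"prefix": "192.168.1.0/24"},                     # no SoO yet
    {"prefix": "192.168.2.0/24", "soo": "65000:2"},   # already tagged elsewhere
]
tagged = send_out(routes, "65000:1")
print(tagged)
# A route tagged 65000:1 that comes back in on an interface with SoO 65000:1 is dropped.
print(receive_in(tagged, "65000:1"))
```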

SoO can be implemented in one of the following ways:

  1. Only SoO values on the CE pointing interface on all PEs
    • Allows for full redundancy, but can cause transient routing loops.
  2. SoO values on the CE pointing interfaces on all PEs and matching SoO on the LAN facing interface(s) of the CE
    • Reduces the chances of transient routing loops, but can cause parts of the network to become isolated if a link between customer routers on one side of the backdoor link fails. In this case routers with routing only through the backdoor link will be unable to reach the routers on the same side with routing only through the PE, and vice versa.

The SoO values on the PEs can either match or be different. When the SoO matches the MPLS network cannot be used as a backup to the backdoor link, but is simpler to configure.

route-map SoO permit {seq}
 set extcommunity soo {ASN}:{Num}

! PE
interface Gi 0/0 
 ip vrf forwarding {vrf}
 ip vrf sitemap SoO 

! CE, vrf commands with an interface in the GRT is weird...
interface Gi 0/0 
 ip vrf sitemap SoO 

I highly recommend reading through this blog post from INE about EIGRP SoO and the BGP Cost metric.

OSPF

CE routers will see remote site routes as IA routes instead of external routes when OSPF domain-id matches between sites. The domain-id is derived from the OSPF process number by default, which is not ideal and should be configured with domain-id {} under OSPF configuration. The domain-id is carried as an extended-community in BGP VPNV4 updates.

Topology

The MPLS cloud is considered a “super-backbone” in OSPF, a backbone above the area 0. This forces the topology to either have the super-backbone reside “in” area 0(allowing multiple non-backbone areas on sites) or have the entire MPLS L3VPN cloud be in a single area.

Traffic engineering

Differing domain-id between CEs at different sites can be used if having E1/E2 routes is preferable over IA. This is not a good solution.

Sham links can be used to avoid having intra-area routes over a backdoor being preferred over routes through the MPLS network. The sham link is configured between the PE routers (which act as the ABRs here), “over” the MPLS L3VPN network.
See 1.4 OSPF - Sham-links for more details.

BGP

The ASN loop prevention mechanism will cause route-installation to fail when multiple sites use the same ASN. This can be avoided on the PE end with as-override, or on the CE end with allowas-in. Handling it on the PE is preferred.

  • as-override replaces the CE’s ASN in the AS path with the provider ASN on routes the PE advertises to the CE.
  • allowas-in allows the local ASN on inbound routes on the CE router.
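
Both knobs can be illustrated with a toy AS-path check in Python. The ASNs are example values (provider 65000, customer 65001 at both sites) and the functions are a simplification of what BGP does.

```python
def as_override(as_path, ce_asn, pe_asn):
    """PE side: replace the CE's ASN with the provider ASN before advertising to the CE."""
    return [pe_asn if asn == ce_asn else asn for asn in as_path]

def ce_accepts(as_path, local_asn, allowas_in=0):
    """CE side: accept the route if the local ASN appears at most allowas_in times."""
    return as_path.count(local_asn) <= allowas_in

# Route from site A (AS 65001) as the PE is about to send it to the CE at site B (AS 65001).
path = [65000, 65001]
print(ce_accepts(path, 65001))                             # rejected by loop prevention
print(ce_accepts(as_override(path, 65001, 65000), 65001))  # accepted with as-override
print(ce_accepts(path, 65001, allowas_in=1))               # accepted with allowas-in 1
```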

SoO should be used to avoid routing-loops when ASN trickery is in place. SoO works the same way as with EIGRP. Configured with neighbor {} soo {ASN:Site-Num} on the PE router on the peering towards the CE.

MP-BGP VPNv4/VPNv6

The MP-REACH NLRI allows for address-family information to be attached to prefixes.
In VPNv4 and VPNv6 the MP-REACH NLRI contains:

  • RD
  • VPN Label
  • Prefix
  • Length

The VPN label is created by the following logic:

  1. A route is added to the VRF RIB and FIB on a PE
  2. The PE creates an entry in the LFIB for the route
  3. The label in the LIB is correlated to the routes redistributed into VPNv4 and is added to the ….

The route distinguisher (RD) is a 64b value that enables a router to distinguish between routes in different VRFs in the local BGP table. Each VRF has one RD, but can use multiple route-targets. The format of the RD can be one of the following:

  • {2B int}:{4B int}
  • {4B int}:{2B int}
  • {4B int, dotted decimal}:{2B int}

The leftmost integer value is either ASN or IP, the right one can be whatever.
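
As a sanity check of the 64b size, here is a Python sketch that encodes the three formats into 8 bytes (2-byte type field + 6-byte value). The function name is my own, and I assume a dotted-decimal left side is always an IPv4 address.

```python
import struct
from ipaddress import IPv4Address

def encode_rd(text):
    """Encode an RD string such as '65000:1' or '10.0.0.1:5' into 8 bytes."""
    admin, assigned = text.rsplit(":", 1)
    assigned = int(assigned)
    if "." in admin:
        # Type 1: 4-byte IPv4 address : 2-byte number
        return struct.pack("!H4sH", 1, IPv4Address(admin).packed, assigned)
    admin = int(admin)
    if admin < 2**16:
        # Type 0: 2-byte ASN : 4-byte number
        return struct.pack("!HHI", 0, admin, assigned)
    # Type 2: 4-byte ASN : 2-byte number
    return struct.pack("!HIH", 2, admin, assigned)

for rd in ("65000:1", "10.0.0.1:5", "4200000000:7"):
    print(rd, encode_rd(rd).hex())
```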

Route-targets are values following the same format as the RD that are sent as an extended community along with routes in a specific VRF. All VRFs must have at least one RT which is both imported and exported, most easily configured with route-target both {RT}. Exporting can be thought of as redistributing routes from the local VRF RIB into the BGP table with the RT attached, and importing as the reverse.

Routes in different VRFs in BGP VPNv4/6 can be viewed with show ip bgp vpnv{4|6} {all|vrf}

Extranet (route leaking)

Route targets can be used to leak routes between VRFs. This only requires BGP to be running; there is no requirement for VPNv4/6 or any neighbors. It is therefore also useful when VRF-lite is in use.

A VRF can have multiple RTs and multiple route-target import/export statements. An import statement causes the local router to install VPNv4/6 routes with the specific RT in the VRF RIB; an export statement attaches the RT to routes advertised from the VRF.
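
The import side of route leaking boils down to a set intersection between the RTs on a route and the import RTs of each VRF. A minimal Python sketch, with made-up VRF names, RTs and prefixes:

```python
def import_routes(bgp_table, vrfs):
    """Install each route into every VRF that imports at least one of its RTs."""
    ribs = {name: [] for name in vrfs}
    for route in bgp_table:
        for name, cfg in vrfs.items():
            if set(route["rts"]) & set(cfg["import"]):
                ribs[name].append(route["prefix"])
    return ribs

vrfs = {
    "CUST-A": {"import": ["65000:1", "65000:99"]},  # own routes + shared services
    "SHARED": {"import": ["65000:99", "65000:1"]},  # shared services + customer A
    "CUST-B": {"import": ["65000:2"]},              # isolated
}
bgp_table = [
    {"prefix": "10.1.0.0/24", "rts": ["65000:1"]},    # exported by CUST-A
    {"prefix": "10.99.0.0/24", "rts": ["65000:99"]},  # exported by SHARED
    {"prefix": "10.2.0.0/24", "rts": ["65000:2"]},    # exported by CUST-B
]
print(import_routes(bgp_table, vrfs))
```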

Route filtering of imported/exported routes is possible through the use of import map {route-map} and export map {route-map}. This only filters routes that are being imported/exported by the route-target import/export commands.

Verification & Troubleshooting

The following should be checked:

  • IGP + LDP
    • IGP up end-to-end, check route-table
    • LDP on all IGP enabled interfaces, traceroute PE-PE Lo0
    • PHP issues, Occurs when peering with physical interface.
    • LDP allocation/advertisement filters
  • BGP VPNV4
    • Peerings up
    • Route filtering
    • RT Import & export
    • Unique RD if multipathing
  • PE - CE Routing
    • Correct routes in RIB
    • Any routing loops
    • Filtering

Useful if stuck:

debug mpls ldp transport events ! Useful for adjacency issues.
show ip [vrf {}] cef
clear ip cef  ! Changes in redistribution can mess up the RIB/FIB of a VRF.

Study resources

The MPLS Section of the INE CCIE Enterprise infrastructure learning track is a very good starting-point.

Books used, ranked by most value for time spent:

The CCIE Enterprise Infrastructure Foundation book by Narbik Kocharians hasn’t been released at the time of writing this, but I suspect it will also be a very good resource for the EI.

I have also used the IOS XE 16.2.x configuration guide for MPLS. The MPLS documentation for IOS XE isn’t particularly extensive/good IMO. Google is your friend here.

Various links I’ve found useful:


Got feedback or a question?
Feel free to contact me at hello@torbjorn.dev