CCIE - EI: 1.5 BGP 📝

2022-04-29 · Topic: CCIE-EI

This is a summary of the notes I’ve written for CCIE-EI topic 1.5 BGP. In other words, this only contains what I felt the need to write down and is not meant as a complete study resource. Please see the study resources I’ve used or related blogs for more coherent writeups.

1.5 - BGP

1.5a IBGP and EBGP peer relationships

iBGP and eBGP

Defined by the remote-as specified for a neighbor.

Differences between iBGP and eBGP:

  • eBGP defaults to next-hop-self
  • eBGP uses an advertisement interval of 30, while iBGP uses 0 by default.
  • eBGP packets are sent with a TTL of 1, while iBGP uses 255 by default.
  • Some differences in default behaviour for tie-breakers, see 1.5b Path selection
  • Some differences in default convergence optimizations, see 1.5e Convergence and Scalability

Peering

Transport: TCP/179
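Both routers attempt to initiate the peering. If the attempts collide, the connection opened by the router with the highest RID is kept (RFC 4271 collision detection).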
Initiating router can be identified with show tcp brief
Initiating router can be manually configured with neighbor {} transport connection-mode {active | passive}

PMTU discovery is enabled by default. This sets the MSS to the largest value the path supports, to maximise efficiency. It should be disabled with no {neighbor {IP} | bgp} transport path-mtu-discovery if ICMP Destination Unreachable (fragmentation needed) messages aren’t returned along the path to BGP peers.

eBGP packet TTL can be altered with neighbor {} ebgp-multihop, allowing the use of a loopback update-source.

The “opposite” to eBGP multihop is TTL-Security. TTL security limits the number of hops a BGP packet can travel while still being accepted. This is achieved through sending packets with a TTL of 255 and setting a minimum TTL value for incoming BGP packets.
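The TTL check can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and it assumes the commonly documented IOS behaviour where ttl-security hops N accepts packets with a TTL of at least 255 - N.

```python
MAX_TTL = 255  # TTL-security peers send BGP packets with the maximum TTL


def min_accepted_ttl(hops: int) -> int:
    """Lowest incoming TTL still accepted (assumption: 255 - hops)."""
    if not 1 <= hops <= 254:
        raise ValueError("hops must be between 1 and 254")
    return MAX_TTL - hops


def is_accepted(received_ttl: int, hops: int) -> bool:
    """True if a packet arriving with this TTL passes the check."""
    return received_ttl >= min_accepted_ttl(hops)
```

With hops set to 2, a packet arriving with TTL 253 passes while one arriving with TTL 252 is dropped, since it must have travelled further than allowed.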

Default timers

Keepalive/Hold timer: 60/180
timers bgp {keepalive} {holdtime}
neighbor {} timers {keepalive} {holdtime}

Update delay: 120s
bgp update-delay {seconds}

Advertisement interval: 30 eBGP , 0 iBGP
neighbor {} advertisement-interval {s}

BGP Scanner: 60s
bgp scan-time {s}

Next-hop-tracker: 5s
bgp nexthop trigger delay {}

Autonomous system numbers

“Regular” ASNs are 2B, which doesn’t allow for more than 65535 ASNs. 4B ASNs were introduced to avoid running out of ASNs and allows for 4294967295 ASNs to exist.

4B ASNs can be represented in two different ways on IOS-XE:

  • asplain, default - simply the decimal value of the ASN. E.g. 4294967295
  • asdot, representing 4B ASNs in decimal in 2 x 2B parts separated by a dot. E.g. 65535.65535

The representation of 4B ASNs in show command output can be changed with bgp asnotation dot|plain
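The relationship between the two notations is plain arithmetic, sketched below in Python. The function names are hypothetical, and the sketch assumes that ASNs below 65536 are written without a dot in asdot notation.

```python
def asplain_to_asdot(asn: int) -> str:
    """Convert a decimal ASN to asdot notation (high.low)."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("ASN out of the 4-byte range")
    if asn < 65536:
        return str(asn)  # 2-byte ASNs are written as a plain number
    high, low = divmod(asn, 65536)
    return f"{high}.{low}"


def asdot_to_asplain(text: str) -> int:
    """Convert asdot notation back to the decimal ASN."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    if not (0 <= high <= 65535 and 0 <= low <= 65535):
        raise ValueError("each part must fit in 2 bytes")
    return high * 65536 + low
```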

The private ASN ranges are:

  • 2B ASN, 64512 - 65534 (65535 is reserved, per RFC 6996)
  • 4B ASN, 4200000000 - 4294967294

RID Selection:

  1. Configured with bgp router-id
  2. Highest IPv4 address on an UP/UP loopback interface
  3. Highest IPv4 address on an UP/UP non-loopback interface

If there are no IPv4 addresses configured, a RID will have to be statically configured. RID is only chosen on BGP process start and will not change without restarting BGP. It should be unique to avoid issues in path selection tie-breakers, though RFC 6286 only requires it to be unique within an AS.

Authentication: MD5/TCP-AO
MD5 is the traditional way of authenticating BGP sessions.
Configured with neighbor {IP} password {password}.

MD5 authentication is becoming obsolete and is being replaced by TCP-AO (RFC 5925, TCP Authentication Option).

TCP-AO has some major benefits including:

  • Dynamic key rollover/better key management
  • Stronger algorithms
  • Better suited for long-lived TCP sessions

TCP-AO Config example:

key chain {name} tcp
 key {number}
  send-id {id}
  recv-id {id}
  key-string {string}
  cryptographic-algorithm {algorithm}
  include-tcp-options            ! Which headers to include in MAC
  accept-ao-mismatch-connections ! Do not use, allows non TCP-AO peers
router bgp {ASN}
 neighbor {IP} ao {keychain}

BGP Messages

Open - Contains parameters needed for peering establishment.
Keepalive - Verifies whether neighbor is alive
Update - Contains withdrawn routes, a set of PAs and the associated NLRIs.
Notification - Sent upon BGP error, resulting in the peering being brought down

An NLRI is a prefix + length.
An odd thing about the Update message is that the NLRIs in the withdrawn routes field don’t need to be associated with the PAs in the message.

Peering requirements:

  1. TCP connection must be received from an IP address specified in a neighbor command
    • Meaning that the source IP must match the IP of a configured neighbor on at least one of the peering routers.
  2. Configured ASN must match ASN in received messages
    • Can be manipulated with local-as, described under 1.5.d
  3. RID must be different between peering routers
  4. Authentication must pass

Notably there is no requirement for equal timers between peers. After the Open messages have been exchanged the peers will use the lowest of the two hold times, and the keepalive interval is adjusted to fit within it.
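A small Python sketch of the negotiation, to make the outcome concrete. The function name is made up, and the keepalive rule (the lower of the configured keepalive and a third of the negotiated hold time) is an assumption about IOS behaviour rather than something stated in these notes.

```python
def negotiate_timers(local_keepalive: int, local_hold: int, peer_hold: int):
    """Return (keepalive, hold) used by the local router after Open exchange."""
    hold = min(local_hold, peer_hold)  # lowest hold time wins
    # Assumption: keepalive never exceeds a third of the negotiated hold time.
    keepalive = min(local_keepalive, hold // 3)
    return keepalive, hold
```

A router with the default 60/180 peering with a neighbor configured for 10/30 would end up running 10/30 as well.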

Peering states

  • Idle, peering is down or has been “shut”
  • Connect, waiting for the TCP connection to complete
  • Active, the TCP connection attempt failed; listening for and retrying the connection
  • OpenSent, TCP session established and Open message is sent but not received.
  • OpenConfirm, Open message is sent and received - awaiting first keepalive.
  • Established, peering is up and Updates can be exchanged

Once the Open messages have been exchanged, routers enter “read-only” mode until all updates have been received or the read-only mode timer expires. During read-only mode the routes received from the neighbor will not be included in the best-path selection. The read-only mode timer is set to the value of bgp update-delay.

MP-BGP

Multi-protocol functionality is introduced through the MP_REACH_NLRI and MP_UNREACH_NLRI path attributes. The MP_REACH_NLRI PA consists of address-family, next-hop and an NLRI. MP-BGP capability is negotiated in OPEN messages during peering bring-up, the peering must hence be reset when new address-families are introduced.

When MP-BGP is in use it might make sense to disable the IPv4 address family with no bgp default ipv4-unicast. The VPNv4 and VPNv6 address-families require extended communities to be enabled as this is the method used for transmitting route targets and route origin.

Neighbors are configured under the global context and activated under address-families as such:

router bgp 65000
 no bgp default ipv4-unicast
 neighbor 10.0.0.1 remote-as 65001
 address-family ipv6 unicast
  neighbor 10.0.0.1 activate
 address-family vpnv4
  neighbor 10.0.0.1 activate
  neighbor 10.0.0.1 send-community extended

Dynamic peers

Allows for a router to listen for BGP peers without requiring a specific neighbor command.

router bgp 65000
 neighbor {pg-name} peer-group
 neighbor {pg-name} remote-as {asn}
 bgp listen range 10.0.0.0/24 peer-group {pg-name}
 bgp listen limit {n} ! Capping number of peers

Peer groups

A peer-group is a set of peers that share the same outbound policies and will receive the same UPDATE messages. A router can calculate the updates to be sent for all members at the same time.

The lowest numbered peer IP is elected the leader of the peer-group. All outbound updates are processed for this peer and replicated to all others. There is hence no way of overriding outbound policy for a single neighbor. Inbound policy for peer-group members can be overridden per neighbor.

Previously this had to be configured statically with the neighbor {pg-name} peer-group command. This is still useful for dynamic peering but mostly made obsolete due to the introduction of dynamic-peer groups and templates. The major benefit of using templates is greater flexibility in configuration.

There is no need for configuration for dynamic-peer groups to be established. There is also no way to disable dynamic-peer groups.

Peer-groups in use can be viewed with the show bgp replication command.

Templates

Templates simplify configuration and come in two flavours:

  • Policy templates - Settings related to routing policy/address families
  • Session templates - Settings related to peering sessions

Inheritance allows for hierarchical use of templates, increasing complexity a little but reducing the amount of configuration needed.

router bgp 65000
 template peer-session peer-session-base
  remote-as 65001
  local-as 65002 ! Must differ from both the real local ASN and the remote ASN
 template peer-session peer-session-1
  ttl-security hops 2
  inherit peer-session peer-session-base
 template peer-policy 
 neighbor 203.0.113.42 inherit peer-session peer-session-1

Injecting routes

The two (main) ways of injecting routes into BGP are through the network and the redistribute commands.

network adds an NLRI to the BGP table if a route for the specified prefix exists in the RIB. NEXT_HOP is set to the next-hop of the IGP route, but can be overridden in an associated route-map. If no mask is set in the command the classful mask will be used.
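As a reminder of what “classful mask” means for a network statement without a mask, here is a short Python sketch. The function names are made up for illustration.

```python
import ipaddress


def classful_prefix_length(address: str) -> int:
    """Prefix length implied by the address class (A=8, B=16, C=24)."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8
    if first_octet < 192:
        return 16
    if first_octet < 224:
        return 24
    raise ValueError("class D/E addresses have no classful mask")


def classful_network(address: str) -> ipaddress.IPv4Network:
    """Classful network that a mask-less network statement refers to."""
    length = classful_prefix_length(address)
    return ipaddress.IPv4Network(f"{address}/{length}", strict=False)
```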

redistribute redistributes routes from another routing protocol/static routes. Metric(MED) should be set when redistributing into BGP as the default of copying IGP metric shouldn’t be relied on. Connected subnets matched with a network statement in the IGP will also be redistributed into BGP.

Auto-summary only affects locally injected routes.
Auto-summary affects routes added with network and redistribute differently:

  • network injects an NLRI for the classful network in addition to the subnet routes when no mask/classful mask is specified.
  • redistribute Only injects the classful network(no subnets) matched routes would be a part of.

Locally injected routes are given the AD of 200 by default.

Synchronization

When synchronization is enabled a router will not advertise an iBGP learned route to an eBGP neighbor if it doesn’t also exist in an IGP. BGP synchronization has historically been used to avoid black-holing traffic within an AS due to not running iBGP within a transit AS and lacking routes in the IGP.

1.5b Path selection

Best-path selection

  • Next-hop reachable
  • Weight
  • Local pref
  • Locally injected
  • AS_PATH length
  • Origin
  • MED
  • Neighbor type
  • IGP Metric to next hop

Mnemonic for remembering the path-selection steps: N WLLA OMNI
Both the first and the last step are tied to IGP routing. Bigger is better for the first two modifiable steps (weight and local pref), for the rest lower is better.
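The steps above can be sketched as a sort key in Python. This is a simplification under stated assumptions: the Path class and its field names are invented for the example, MED is always compared (as if bgp always-compare-med were set), and the tie-breakers are left out.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # i > e > ?


@dataclass
class Path:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_injected: bool = False
    as_path_len: int = 0
    origin: str = "i"
    med: int = 0
    ebgp: bool = False
    igp_metric: int = 0


def sort_key(p: Path):
    """Lower tuple wins; higher-is-better fields are negated."""
    return (-p.weight, -p.local_pref, not p.locally_injected,
            p.as_path_len, ORIGIN_RANK[p.origin], p.med,
            not p.ebgp, p.igp_metric)


def best_path(paths):
    """Pick the best path among those with a reachable next hop."""
    candidates = [p for p in paths if p.next_hop_reachable]
    return min(candidates, key=sort_key) if candidates else None
```

A path with local pref 200 and a long AS_PATH still beats a path with the default local pref and a short AS_PATH, because local pref is evaluated first.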

Tie-breakers

  1. Lowest RID, eBGP compared first. If existing route is eBGP - don’t replace. (see altering best-path selection)
  2. Shortest cluster list
  3. Lowest configured neighbor IP address

All routes can be elected for each of the tie-breakers. E.g. the lowest RID neighbor isn’t necessarily picked if selection reaches the final tie-breaker.

Load balancing

The maximum-paths [ibgp] command allows the best-path algorithm to choose up to 5 equal-cost best paths for installation in the route table. This does not alter which path(s) is advertised, only which routes are chosen and installed locally. Can be used with additional-paths in RR/DMVPN networks to avoid path-hiding.

Must be configured as two separate commands for non-vpnv4/6 routes. maximum-paths eibgp only works for VPNV4.

Requirements for route installation:

  • All candidate routes must reach the tie-breakers
  • If iBGP, all routes to be installed must have a differing next-hop
  • If eBGP, all routes to be installed must be learned from the same neighboring ASN.

maximum-paths {}
maximum-paths ibgp {}

! Excerpt of output from show ip bgp {prefix},
! all paths marked "multipath" are installed in the IP RIB.
!
! 10.2.3.8 (metric 11) from 10.1.3.4 (100.0.0.5)
!   Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
!   Originator:100.0.0.5, Cluster list:100.0.0.4

! Output copied from IOS XE 16.12 docs.

Altering best-path selection

AS_PATH

bgp bestpath as-path multipath-relax allows eBGP load-balancing with differing neighbor ASNs.
Can be useful for load-balancing internet connections.

bgp bestpath as-path ignore completely disables the AS_PATH length comparison in best-path selection

MED

bgp bestpath med missing-as-worst fixes the fact that the default MED value is “best”.
Must be set for all routers in an AS if used.

bgp always-compare-med allows MED to be taken into account despite different neighboring ASNs.

bgp deterministic-med fixes issues caused by the way MED comparison is done in IOS. This blog entry from INE goes through the issue in detail.

Tie-breakers

bgp bestpath compare-routerid enables the router to use the “compare router-id” tie-breaker even if the paths compared are both eBGP learned.

1.5c Routing policies

PA Types

A path attribute can be:

  • Well-known, All BGP speakers must support it
    • Mandatory, All BGP updates must contain these
    • Discretionary, Can be added but isn’t required (local-pref, atomic aggregate)
  • Optional, Not needed for all BGP implementations
    • Transitive, When the PA isn’t understood - forward it
    • Nontransitive, When the PA isn’t understood - drop it

Route filtering

Filtering can be achieved through:

  • Distribution lists
  • Prefix lists
  • AS_Path filter lists
  • Route-maps in/out
  • aggregate-address through “summary-only” and “suppress-map”

Routes in “rib-failure” state, and routes whose next-hop differs between the RIB and the BGP table, can be filtered with bgp suppress-inactive.

Distribution lists & prefix lists

Both distribution lists and prefix lists achieve the same thing, filtering prefixes. Distribution-lists and Prefix lists are mutually exclusive and prefix lists are preferred.

Distribution-lists match routes against an IP ACL. When using an extended ACL the source field matches the prefix and the destination field matches the prefix length (mask).

Filter lists

Filter lists match the contents of the AS_PATH based on regular expressions.

AS_SET is unordered, but the regex matching is performed based on the order shown in the output of show ip bgp. Matching an AS_SET can be done by matching the curly braces {}, which does not require escaping.

REGEX tidbits:
() can be used to group parts of the regex expression
[] defines a set of characters
^ and $ match the start and end of the string
| functions as a logical OR
. is any character except newline
* matches zero or more of the preceding character/grouping
+ matches one or more of the preceding character/grouping
? matches zero or one of the preceding character/grouping

Pasting a list of AS_PATHs from the output of show ip bgp into Regexr is good for practicing as it gives visual feedback of what is matched.

Regular expressions can be tested through show ip bgp | include {pattern} or show ip bgp regexp {pattern}. The | include filter can also be applied to the output of show ip bgp neighbors {IP} advertised-routes and received-routes, but it matches against the whole output line rather than the AS_PATH, so start and end of string can’t be matched. The show ip bgp regexp command allows for matching start and end of string, but doesn’t allow for the use of the advertised-routes and received-routes parameters.

Regex matching of outbound routes happens before the local AS is prepended to the AS_PATH. Locally originated routes can hence be matched with ‘^$’

ip as-path access-list {n} {permit | deny} {regex}
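For practicing offline, AS_PATH patterns can be approximated with Python’s re module. This is a rough sketch, not the IOS regex engine: the helper names are made up, and the IOS underscore is translated on the assumption that it matches a delimiter (space, comma, braces, parentheses) or the start/end of the string.

```python
import re

# Assumed meaning of the IOS "_" metacharacter
DELIMITER = r"(?:^|$|[ ,{}()])"


def ios_to_python(pattern: str) -> str:
    """Translate an IOS-style AS_PATH regex into a Python regex."""
    return pattern.replace("_", DELIMITER)


def matches(pattern: str, as_path: str) -> bool:
    """True if the AS_PATH string (as shown in show ip bgp) matches."""
    return re.search(ios_to_python(pattern), as_path) is not None
```

Here matches("^$", "") is true for a locally originated route, and matches("_65002_", "65001 65002 65003") is true while the same pattern does not match "65001 650022".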

Route-maps

Route-maps allow for flexible matching and setting path attributes. Multiple match statements using different matching methods(prefix and as_path) results in a logical AND. Multiple match statements using the same matching method results in a logical OR.

The continue command in a route-map allows for execution to continue after a match is found. This can reduce the number and complexity of route-maps when used correctly. A route-map seq number can be specified to have execution continue at a specific entry.

Example route-map entry:

route-map {name} permit 10
 match ip address {ACL}
 set as-path prepend {string}
 continue [seq]

Overview and order of filtering


Image credit: Cisco, I have not found exact original source yet.

Attribute manipulation

AD Trickery

Not a BGP Path attribute, covered here as it is related to modifying something to affect routing decisions. The distance command can be used to set AD globally, for specific neighbors or specific routes from specific neighbors.

! Global
distance bgp {external} {internal} {local}

! Configured neighbor IP's matched will be affected
distance {AD} {prefix} {wildcard} 		

! Routes matched by the ACL from the matched neighbors affected
distance {AD} {prefix} {wildcard} {ACL}		

The BGP backdoor parameter to the network command sets the AD for eBGP learned routes to the AD value of locally injected routes (200 by default). This allows for IGP learned routes to take precedence over eBGP learned routes. A common use case would be to get a direct site-to-site link to take precedence despite having an eBGP peering to an L3 VPN SP cloud.

Weight

The weight PA is Cisco proprietary and only locally significant. Allowed values are 0-65535 with a default value of 0. Locally injected routes are assigned a default value of 32768.

Weight can be set in:

  • Neighbor commands
  • Filter lists
  • Route-maps

Local preference

Local pref has a flooding scope of the local AS. Allowed values are 0 - 4294967295, with a default value of 100.

Local preference can be set globally or in route-maps. Configuring a default is useful whenever a single router is to be used for traffic outbound from the AS.

router bgp 65000
 bgp default local-preference {}
route-map locpref
 set local-preference {}

AS_PATH Length

Covered under 1.5.d

Origin

i > e > ?
Can be set in route-maps, but isn’t a very effective way to influence routing.

MED

Multi-Exit Discriminator/Metric influences routing decisions for a neighboring AS. Allowed values are 0 - 4294967295 with a default of 0.

Best path selection based on MED has several notable flaws and gotchas that are covered under 1.5.b.

Communities

The COMMUNITIES PA can be used to manipulate routing policy upon reception. When MED is too far into the best-path selection a community is a good way to affect routing policies further “up” in the process by setting weight or local preference.

Standard communities consist of a 4B value. Extended communities are 8B and allow for sending more information and a larger number of unique communities. Extended communities are commonly used for route-targets in MPLS L3 VPN.

Four COMMUNITY values are reserved for specific purposes:

  • Internet, advertise to all neighbors
  • No-Advertise, do not advertise
  • No-export, do not advertise outside AS
  • Local-AS, do not advertise outside local sub-AS

Routers must be configured to send communities with a neighbor command.
Community values are set in route maps on the sending router and matched with ip community-lists on the receiving end. Specific communities can also be removed on inbound route-maps.

Community lists come in two flavours, standard and expanded. Standard community lists match specific community numbers while expanded community lists match based on regex. The community PA is unordered, which is important when matching in expanded lists. Both standard and expanded lists can be configured in “named” mode.

Communities can be displayed either in decimal or in “new-format”, this both affects output in show commands and regex matching in community-lists. Decimal is the default, “new-format” can be enabled with ip bgp-community new-format

ip bgp-community new-format
router bgp 65000
 neighbor x.x.x.x remote-as 65001
 neighbor x.x.x.x send-community {standard | extended | both}

! Standard community list
ip community-list {1-99 | standard {name}} {permit | deny} {community number}

! Expanded community list
ip community-list {100-500 | expanded {name}} {permit | deny} {regex}

! Extended communities (e.g. route targets) are matched with their own list type
ip extcommunity-list {{number} | expanded {name}}
     [seq] {deny | permit} [regex]

route-map communities
 match community {list}
 set community {number} [additive] ! additive keeps the existing communities
 set comm-list {list} delete       ! Remove communities matched by the list
 set community none                ! Clear all
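The two display formats are the same 4B value shown differently: “new-format” splits it into the upper and lower 2B halves (AA:NN). A minimal Python sketch of the conversion, with made-up function names:

```python
def decimal_to_new_format(value: int) -> str:
    """Show a 4-byte community as AA:NN."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("community must fit in 4 bytes")
    return f"{value >> 16}:{value & 0xFFFF}"


def new_format_to_decimal(text: str) -> int:
    """Convert AA:NN back to the single decimal value."""
    asn, number = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= number <= 0xFFFF):
        raise ValueError("both halves must fit in 2 bytes")
    return (asn << 16) | number
```

This matters for expanded community lists, since a regex written against 65000:100 will not match the decimal form 4259840100 and vice versa.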

Conditional advertisement

Routes can be advertised based on the existence of other routes in the BGP RIB. The lookup can be done based on whether routes exist (exist-map) or don’t exist (non-exist-map) in the BGP RIB. The exist-map is evaluated as true if there is a permit/match, while the non-exist-map is evaluated as true if there are no matches.

The referenced route-map for the [non-]exist-map can have match statements for standard ACLs, prefix-lists and as-path lists. When prefix-lists are in use only exact matches are supported (no ge/le).

neighbor {name/IP} advertise-map {conditional route-map} [non-]exist-map {route-map}
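The decision logic boils down to a single boolean, sketched here in Python. The function and parameter names are hypothetical, and prefixes are compared as exact strings to mirror the exact-match behaviour of prefix-lists in this feature.

```python
def should_advertise(bgp_table: set, condition_prefixes: set,
                     non_exist: bool = False) -> bool:
    """Decide whether the routes in the advertise-map are advertised.

    bgp_table: prefixes currently in the BGP table.
    condition_prefixes: prefixes permitted by the [non-]exist-map.
    non_exist: True when a non-exist-map is used instead of an exist-map.
    """
    matched = any(prefix in bgp_table for prefix in condition_prefixes)
    return not matched if non_exist else matched
```

A typical use of the non-exist variant is advertising a backup prefix only while the primary prefix is missing from the BGP table.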

Conditional route injections

Another way to conditionally advertise prefixes is through an inject-map. This allows advertising a component route based on the existence of an aggregate route. An aggregate route of the conditionally injected component route must exist in the bgp table(default routes allowed).

It differs from ‘regular’ conditional advertisements in that it affects the prefixes in the local bgp table and is applied under the bgp process. There is also no option for a non-exist-map.

! Prefixes to be injected
ip prefix-list {name} permit {injectable-prefix} ...

! Inject-map must be a route-map
route-map {name} permit
 set ip address prefix-list {prefix-list}

! Regular logic applied to the exist-map
route-map {name} permit
 match ...

router bgp {asn}
 bgp inject-map {route-map} exist-map {exist-map}

1.5.d AS path Manipulations

AS_PATH Prepending

The AS_PATH PA can have ASNs prepended in route-maps, which can be applied both inbound and outbound. When redistributing routes into BGP the inbound route-tag can be prepended to the AS_PATH, though I am not sure why…

set as-path prepend {string}

Neighbor options

local-as

Allows specifying an ASN that is different than the true local ASN. Can only be used with true eBGP neighbors. The configured local-as ASN is used both for peering and in the AS_PATH PA.

The local-as option is useful for migrations and situations where peering on the neighbor’s end can’t easily be reconfigured.

When local-as is configured, the configured local-as ASN is prepended upon reception of update messages, not upon advertising. When routes are advertised towards a neighbor with a local-as statement both the configured local-as and the real as is prepended.

The no-prepend parameter makes the local router not prepend the configured local-as ASN upon reception. This hides the fact that local-as is in use to other peers. Hence this only works in the inbound direction.

The no-prepend replace-as parameter completely hides the real ASN from neighbors, only showing the local-as configured ASN in Open messages and AS_PATH. This will cause routing loops if the network isn’t used in a sensible way. Use with caution!

dual-as allows neighbors to establish peerings to either the real or configured local-as ASN.

neighbor {} local-as {asn} [no-prepend [replace-as [dual-as]]]
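The effect of the parameters on the AS_PATH, as described above, can be modelled in a few lines of Python. This is a sketch of the described behaviour only, with invented function names and example ASNs.

```python
def received_path(path, local_as, no_prepend=False):
    """AS_PATH stored for a route received from the local-as neighbor."""
    return list(path) if no_prepend else [local_as] + list(path)


def advertised_path(path, real_as, local_as, replace_as=False):
    """AS_PATH sent towards the local-as neighbor."""
    prepend = [local_as] if replace_as else [local_as, real_as]
    return prepend + list(path)
```

With a real ASN of 65000 and local-as 65100, a route with AS_PATH 65300 is sent to the neighbor as 65100 65000 65300, or as 65100 65300 when replace-as is used.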

allowas-in

Allows the local ASN in the AS_PATH of incoming routes from specific neighbors. When allowas-in is configured the local ASN is allowed 3 times in the AS_PATH by default. This is useful when an AS becomes partitioned, as it ensures AS_PATH loop prevention doesn’t block route learning from another partition through a third-party router.

neighbor {} allowas-in {count}
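The acceptance check can be sketched like this in Python. The names are made up, and None is used to represent a neighbor without allowas-in configured.

```python
DEFAULT_ALLOWAS_IN = 3  # occurrences allowed when no count is given


def accept_path(as_path, local_as, allowas_in=None):
    """True if the route passes the AS_PATH loop check.

    allowas_in: None when the feature is not configured, otherwise the
    number of times the local ASN may appear in the AS_PATH.
    """
    occurrences = list(as_path).count(local_as)
    if allowas_in is None:
        return occurrences == 0  # regular loop prevention
    return occurrences <= allowas_in
```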

remove-private-as

Removes all private ASNs from the AS_PATH in outbound advertisements. Only allowed for true eBGP neighbors. Private ASNs will not be removed from the AS_PATH if there are non-private ASNs in the AS_PATH or if the receiving router’s ASN is in the AS_PATH.

neighbor {} remove-private-as

Regex

Covered under 1.5.c - Filter lists, only mentioned here for easier lookups against the blueprint.

1.5.e Convergence and scalability

I highly recommend reading through this write-up to brush up on BGP convergence.

Convergence optimizations

Transport

When a peering session is brought up, the devices enter “BGP Read-only mode” until all Updates have been sent or the bgp update-delay has passed. This means no best-path calculation will happen with the new routes, and updates including new routes will not be sent, until all information is known by both routers. A router determines that all Update messages have been received when a keepalive or EoR (End of RIB) has been received.

Optimizing transport by using a high MTU will increase the number of prefixes sent per message, reducing the time spent in read-only mode. This is not likely to make much of a difference in this high-bandwidth day and age, but I still find it fun.

IGP Interaction

BGP verifies next-hops in the IP RIB every 60 seconds by default. The BGP Scanner consumes CPU, and its timer can be configured with bgp scan-time. Next-hop address tracking allows a BGP router to respond to IP RIB changes between BGP Scanner runs. The delay before BGP reacts to such a change can be configured with bgp nexthop trigger delay {}

IGP summarization can hide information from the IP RIB that can result in nexthop trigger or fast session deactivation not being used. The same goes for border networks, if they aren’t in the IGP BGP will have to rely on itself for convergence.

Fast external fall-over

All routes flushed once common connected subnet between eBGP neighbors is lost. Enabled by default since the dawn of time (IOS 10).

Fast external fall-over can be disabled with no bgp fast-external-fallover.

Fast session-deactivation

Neighbor peering immediately torn down when neighbor IP is removed from the IP RIB.

Works for iBGP and eBGP. Must be used instead of fast external fall-over for eBGP when the peering addresses aren’t in the same connected subnet. Requires a fast-converging IGP to avoid session being disconnected.

Configured with neighbor {} fall-over [bfd].
Appropriate BFD configuration must be in place if BFD is used for multi-hop peerings.

Multipath & Additional paths

Path-hiding results in slow convergence when non-full-mesh peerings are in use. It can be resolved with additional paths and maximum-paths.

Route reflectors

Allows for iBGP to iBGP route advertisement without requiring a full peering mesh. This makes configuration far simpler and reduces load on devices. Ideally used with dynamic neighbors assigned to a peer-group.

Rules for advertisement:

  • iBGP client to any is advertised
  • iBGP non-client to iBGP non-client is not advertised.
  • eBGP to any is advertised
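The three rules above collapse into one condition, shown here as a small Python sketch. The function name and the string labels for peer types are made up for the example.

```python
PEER_TYPES = {"client", "non-client", "ebgp"}


def reflects(learned_from: str, advertise_to: str) -> bool:
    """True if an RR advertises a route learned from one peer type to another."""
    if learned_from not in PEER_TYPES or advertise_to not in PEER_TYPES:
        raise ValueError("peer type must be client, non-client or ebgp")
    # Only iBGP non-client to iBGP non-client is blocked.
    return not (learned_from == "non-client" and advertise_to == "non-client")
```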

Loop prevention in RR clusters:

  • CLUSTER_LIST PA, set by RRs - routes will not be considered if the local cluster ID is listed.
  • ORIGINATOR_ID PA, set by the local AS originator for the route - route not considered if local router is the originator.
  • Best route requirement, a route is not likely to be considered best when thrown for a loop.

The next-hop-self parameter will not alter the NEXT_HOP of routes advertised by an RR. An outbound route-map must be used instead.

A single RR cluster can have multiple RRs, configured as non-clients to each other.

Multiple RR clusters can be used when physical redundancy calls for it. When multiple clusters are in use - all RRs must create a full mesh. With enough clusters the RR full mesh peerings between RRs can hinder scale, at which point a hierarchical approach can be taken with “clusters of clusters”.

A single cluster can use multiple cluster-IDs (MCID): one global cluster ID and a cluster ID associated with one or more neighbors. I struggle to see a real use case for this, but it does allow for finer control of BGP route flooding from a single RR. The CLUSTER_LIST PA will be updated with both the global cluster-ID and the neighbor specific cluster ID.

Config example:

! Clients only need a single regular peering towards the RR
    
! Minimum viable RR config
bgp cluster-id {id}
neighbor {IP} route-reflector-client
    
! RR config with dynamic peering and peer-group
bgp cluster-id {}
neighbor {pg-name} peer-group
neighbor {pg-name} remote-as {asn}
neighbor {pg-name} route-reflector-client
bgp listen range {prefix/length} peer-group {pg-name}

Cluster ID must be configured BEFORE any clients are added, if not the command will be rejected.

Route aggregation

Configured with aggregate-address.
Injects an aggregate-route to the local BGP table if a component route exists in the local BGP table(NOT RIB)

The summary-only and suppress-map parameters can be used to affect which component routes will be advertised. suppress-map is odd in that routes matched (permitted) by the associated route-map will be suppressed.

Default routes

A Default route can be added to BGP in a few ways:

  • network, works as long as there is a default route in the local RIB.
  • redistribute, like network but from a specific protocol.
  • default-information originate, should be avoided as it injects and advertises default route regardless of existence in the RIB.
  • neighbor {} default-originate, does not add default route to the local BGP table, but advertises a default route to specified neighbor.

PA considerations for route aggregation

AS_PATH PA contains:

  • AS_SEQ, the “regular” ordered list of ASN’s for an NLRI.
  • AS_SET, an unordered list of ASN’s for an NLRI.
  • AS_CONFED_SEQ, AS_SEQ for BGP confederations. (not in CCIE - EI blueprint)
  • AS_CONFED_SET, AS_SET for BGP confederations. (not in CCIE - EI blueprint)

Whenever the AS_SEQ of the AS_PATH PA differs for routes being aggregated the AS_SEQ for the aggregate address will be set to NULL, which can result in routing loops. In this case the as-set parameter consolidates all unique ASNs in the AS_SEQ of the aggregated routes into an AS_SET. The AS_SET counts as “one hop” during BGP best-path selection. Hence AS_PATH prepending should be used to avoid suboptimal routing.
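A short Python sketch of why an aggregate with as-set can look deceptively short. The segment representation (tuples of a type name and a list of ASNs) and the function names are invented for the example, and confederation segments are left out.

```python
def as_path_length(segments) -> int:
    """AS_PATH length as used in best-path selection.

    segments: list of (segment_type, asns) tuples, where segment_type is
    "AS_SEQ" or "AS_SET".
    """
    length = 0
    for kind, asns in segments:
        if kind == "AS_SEQ":
            length += len(asns)          # every ASN counts
        elif kind == "AS_SET":
            length += 1 if asns else 0   # the whole set counts as one hop
        else:
            raise ValueError(f"unsupported segment type: {kind}")
    return length


def as_set_for_aggregate(component_paths):
    """Unique ASNs from the component routes, as placed in the AS_SET."""
    return sorted(set().union(*component_paths))
```

An aggregate carrying an AS_SET of three ASNs after a two-ASN AS_SEQ has a length of 3, not 5.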

The ORIGIN PA will be set to either ? or i according to the following rules:

  • If as-set is in use, the ORIGIN will be ? if any component routes have an ORIGIN PA of ?
  • Redistributed routes will have an ORIGIN of ?
  • If default-information originate is set globally, the route will get ?
  • If default-information originate is set for a neighbor, the neighbor will see this as i

NEXT-HOP PA will be set to local update-source IP unless changed with an outbound route-map.

1.5.f Other BGP Features

Additional-paths

By default BGP only advertises the single best route for a prefix. In some scenarios this causes “path hiding” leading to suboptimal routing; route reflector clusters are a common example.

Additional-paths allows for advertising multiple routes per prefix, as long as they are accepted as “best”. This is achieved through a path identifier (path ID) carried with each NLRI, which is set uniquely per network per peering, preventing one route advertisement from implicitly withdrawing another.

Configuration consists of 3 parts:

  1. Whether to send, receive or send and receive multiple routes. Either globally or per neighbor
  2. Selecting candidate routes for advertising, configured globally
  3. Advertising all or a subset of candidate routes, configured per neighbor

When configuring additional-path candidate selection or advertising there are three options available: best 2/3, group-best and all. Group-best selects the best routes from each neighbor AS (group). All selects all paths reaching tie-breakers with different next-hop addresses.

router bgp 65000
 ! 1.
 bgp additional-paths {send [receive] | receive}
 ! - OR - 
 neighbor {} additional-paths {send [receive] | receive}
 
 ! 2.
 bgp additional-paths select [best {2|3} | group-best | all ]
 
 ! 3. 
 neighbor {} additional-paths advertise [best {2|3} | group-best | all ]

Additional path route tags can be matched in route-maps and are used for filtering routes to be advertised. A tag is equal to the selection policy used for selecting the route. Filtering is done through a regular outbound neighbor route-map.

route-map {name} {permit | deny}
 match additional-paths advertise-set [best N] [best-range {start} {end} ] [group-best ] [all]
router bgp 65000
 neighbor {} route-map {name} out

Soft reconfiguration & Route-refresh

Three ways of session resetting are supported on IOS-XE 16.12:

  • Hard reset, tearing down the peering
  • Soft reset, storing updates to apply policy without tearing down peering
  • Dynamic inbound soft reset, dynamically refreshing routes.

Soft reconfiguration

Soft reconfiguration stores all inbound update information pre-filtering. This allows re-applying policy without the need of tearing down the peering session. This is disabled by default due to the potentially huge increase in memory consumption. When soft-reconfiguration is enabled, an adj-RIB-in table is created per neighbor peering, containing all received update information.

When soft-reconfiguration is enabled all received updates can be viewed with show ip bgp neighbors {addr} received-routes. This includes rejected routes.

Configuration example:

neighbor {} soft-reconfiguration inbound
clear ip bgp * soft

Route-refresh

Route-refresh enables BGP peers to dynamically request routing-information from neighbors. Hence a full route exchange can be initiated without the need for a hard reset. Route-refresh is enabled by default and can’t be enabled/disabled. Route-refresh capability is exchanged during peering establishment and is required by both routers to function.

bgp soft-reconfig-backup enables automatic soft reconfiguration for peers that don’t support route-refresh.

Troubleshooting

To be populated…

Study resources

The BGP Section of the INE CCIE Enterprise infrastructure learning track is a good starting-point. Though I wouldn’t rely on it as my only study source.

Books used, ranked by most value for time spent:

The CCIE Enterprise Infrastructure Foundation book by Narbik Kocharians hasn’t been released at the time of writing this, but I suspect it will also be a very good resource for the EI.

I have also used the IOS XE 16.2.x configuration guide extensively.

Various links I’ve found useful:


Got feedback or a question?
Feel free to contact me at hello@torbjorn.dev