CCIE - EI: 1.6 Multicast 📝

2022-04-29 · Topic: CCIE-EI

This is a summary of the notes I’ve written for CCIE-EI - CCIE - EI: 1.6 Multicast. In other words, this only contains what I felt the need to write down and is not meant as a complete study resource. Please see the study resources I’ve used or related blogs for more coherent writeups.

1.6 - multicast

The documentation for multicast in IOS-XE 16.12 is not very good IMO. Detailed configuration guides for topics like MLD are nowhere to be found.

Address ranges:

  1. Permanent groups 224.0.0.0/23
    • Local groups 224.0.0.0/24, not to be routed
    • Routed groups 224.0.1.0/24, will be routed
  2. SSM groups 232.0.0.0/8
  3. GLOP addresses 233.0.0.0/8, globally unique experimental groups
    • GLOP address = 233.{first 8b of ASN}.{last 8b of ASN}.X
  4. Private groups 239.0.0.0/8

The remaining address space under 224.0.0.0/4 is “transient groups”: shared global address space for globally routed multicast groups.
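
A minimal Python sketch of the GLOP mapping, assuming a 16-bit ASN. The function name glop_range is my own and only illustrates the arithmetic.

```python
def glop_range(asn: int) -> str:
    """Return the GLOP /24 for a 16-bit ASN: 233.{high byte}.{low byte}.0/24."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("GLOP addressing only covers 16-bit ASNs")
    high, low = asn >> 8, asn & 0xFF  # split the ASN into its two bytes
    return f"233.{high}.{low}.0/24"


print(glop_range(5662))  # 233.22.30.0/24
```

The X in the last octet is left for the ASN owner to assign, which is why the result is a /24.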

IPv6 addresses:
ff00::/8 is reserved for IPv6 multicast
Format: FF{flags}{scope}::{64b prefix}{32b group ID}

Flags:

  • 0 Reserved (Most significant bit)
  • 1 Rendezvous - RP embedded
  • 2 Prefix - Prefix information included
  • 3 Transient - Well-known or not (Least significant bit)
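
To make the flag bits concrete, here is a small Python sketch that pulls the flags and scope nibbles out of an IPv6 multicast address. The function name and the dictionary keys are my own choices.

```python
import ipaddress


def mcast_flags_scope(addr: str) -> dict:
    """Decode the flags and scope nibbles of an IPv6 multicast address."""
    packed = ipaddress.IPv6Address(addr).packed
    if packed[0] != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    flags, scope = packed[1] >> 4, packed[1] & 0x0F
    return {
        "rendezvous": bool(flags & 0b0100),  # R: RP embedded
        "prefix": bool(flags & 0b0010),      # P: prefix information included
        "transient": bool(flags & 0b0001),   # T: not a well-known group
        "scope": scope,
    }


print(mcast_flags_scope("ff7e::1"))
```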

Important groups for multicast:

  • 224.0.0.1, All multicast hosts
  • 224.0.0.2, All multicast routers
  • 224.0.0.13, All PIM routers
  • 224.0.0.22, All IGMPv3 routers
  • 224.0.1.39, All RP-Announce(autoRP)
  • 224.0.1.40, All RP-Discover(autoRP)

L3 to L2 address conversion:
01005E(OUI) + 0(1b) + {last 23b of group address}

Due to 5b “disappearing” in the conversion, 2^5 = 32 different IP addresses will result in the same MAC address. This should not be a problem but should also be avoided.
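
The conversion can be sketched in Python as below. The helper names group_to_mac and colliding_groups are mine. The second function lists every group in 224.0.0.0/4 that shares the same low 23 bits and therefore the same MAC address.

```python
import ipaddress


def group_to_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC (01:00:5e + 0 + low 23 bits)."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError("not a multicast group")
    mac = (0x01005E << 24) | (int(ip) & 0x7FFFFF)
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def colliding_groups(group: str) -> list:
    """All groups in 224.0.0.0/4 that map to the same MAC as `group`."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    base = int(ipaddress.IPv4Address("224.0.0.0"))
    # The 5 bits between the /4 prefix and the low 23 bits are lost in the mapping.
    return [str(ipaddress.IPv4Address(base | (hi << 23) | low23)) for hi in range(32)]


print(group_to_mac("239.255.255.250"))
print(len(colliding_groups("224.1.1.1")))
```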

1.6.a Layer 2 multicast

No multicast address will ever be the source address of a frame, so switches never learn multicast MACs. All multicast frames will hence be flooded without IGMP snooping (enabled by default).

1.6.a i IGMPv2, IGMPv3

Transport:
IP protocol number 2, TTL of 1.

Default timers:

Query interval:
IGMPv1 - 60s
IGMPv2/3 - 125s
Configured with ip igmp query-interval {}

Query response interval(MRT): 10s, 1s for group specific
Configured with ip igmp query-max-response-time {}

Other Querier Present Interval: 255s, 2 x Query interval + 0.5 x Query Response Interval
Configured with ip igmp querier-timeout {}

Last member query interval: 1s(MRT for group specific queries)
Last member query count: 2(not a timer, but closely related.)
Configured with ip igmp last-member-query-count {} and ip igmp last-member-query-interval {}

Group Membership Interval: 260s
The time a router will wait for a membership report before concluding that there are none.
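
The derived timers follow from the query interval and the query response interval. A minimal Python sketch, assuming the factor of 2 is the robustness variable from the IGMPv2 RFC. The function and key names are my own.

```python
def igmp_derived_timers(query_interval=125, query_response_interval=10, robustness=2):
    """Compute the IGMPv2 timers that are derived from the two base intervals (seconds)."""
    return {
        # 2 x query interval + 0.5 x query response interval
        "other_querier_present": robustness * query_interval + query_response_interval / 2,
        # 2 x query interval + 1 x query response interval
        "group_membership": robustness * query_interval + query_response_interval,
    }


print(igmp_derived_timers())
```

With the defaults this gives the 255s and 260s listed above. Changing the query interval shifts both values.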

Version 1 router present: 400s

IGMPv2 message types:

  • Membership query, assesses whether subscribed hosts are on a VLAN.
    • General query: sent to 224.0.0.1 with group address of 0.0.0.0 every query interval.
    • Group specific query: sent to the specific group address upon receiving a leave message.
  • v1 Membership report, backwards compatible membership report
  • v2 Membership report, informs a router that hosts want to subscribe to a group
    • Unsolicited, sent upon initial group join
    • Solicited, sent upon query from a router
  • Leave group, sent to 224.0.0.2 to leave a group.

Report suppression:
Hosts set a random timer up to the MRT specified in the query and wait for the timer to expire before answering. A host will not send a report if a membership report for the group has been seen before its timer runs out.

The MRT is only included in query messages and is not used for any other purpose.
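
A toy Python simulation of report suppression, assuming every host hears every other host's report (no IGMP snooping in between). The function name and the seed parameter are illustrative only.

```python
import random


def suppressed_reports(hosts, mrt=10.0, seed=None):
    """Pick the single host that answers a query; everyone else is suppressed."""
    rng = random.Random(seed)
    # Each host draws a random delay between 0 and the MRT from the query.
    timers = {host: rng.uniform(0, mrt) for host in hosts}
    # The host whose timer expires first reports; the rest see it and stay silent.
    reporter = min(timers, key=timers.get)
    return reporter, timers


reporter, timers = suppressed_reports(["h1", "h2", "h3"], seed=1)
print(reporter, round(timers[reporter], 2))
```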

Leave mechanism
Host sends a leave message to 224.0.0.2 (224.0.0.22 in IGMPv3) with the group address specified, and the router responds with a group-specific query to assess whether to stop forwarding. The router will keep forwarding traffic for last member query interval (= MRT = 1s for group-specific queries) * last member query count (2 by default).

Interoperability
IGMPv2 hosts detect IGMPv1 routers based on the inclusion of MRT in queries. When an IGMPv1 router is detected the host starts the Version 1 router present timer, which resets on every received v1 query. For the duration of this timer the host will switch into IGMPv1 mode.

IGMPv2 routers detect IGMPv1 hosts based on which IGMP membership reports are received. When IGMPv1 hosts are detected the router ceases to respond to leave messages and starts the IGMPv1 host present countdown (equal to the group membership interval, 260s in IGMPv2).

IGMPv3

  • Allows for SSM
  • Filtering on source + group, reduces chance of DoS attacks
  • No report suppression
  • MRT up to 53 minutes
  • Join and leave messages are sent to 224.0.0.22, not all multicast hosts.
  • Backwards compatibility with both v1 and v2
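
The “up to 53 minutes” figure comes from the way IGMPv3 encodes the Max Resp Code: values of 128 and above are read as a small floating-point number in units of 1/10 s. A Python sketch of that decoding as I understand it from RFC 3376 (the function name is mine):

```python
def max_resp_time_seconds(code: int) -> float:
    """Decode an IGMPv3 Max Resp Code (one byte) into seconds."""
    if not 0 <= code <= 255:
        raise ValueError("Max Resp Code is a single byte")
    if code < 128:
        tenths = code  # plain value, same as IGMPv2
    else:
        exp = (code >> 4) & 0x7   # 3-bit exponent
        mant = code & 0xF         # 4-bit mantissa
        tenths = (mant | 0x10) << (exp + 3)
    return tenths / 10


print(max_resp_time_seconds(100))        # 10.0
print(max_resp_time_seconds(255) / 60)   # roughly 52.9 minutes
```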

Basic configuration

IGMPv2 is enabled by default once PIM is configured on the router.

ip igmp version {}

! Local interface statically joins group
ip igmp join-group {}

! Local interface statically forwards group(not a member itself)
ip igmp static-group {}

1.6.a ii IGMP Snooping, PIM Snooping

IGMP Snooping

IGMP snooping works based on creating CAM entries/adding ports to CAM entries for the MAC/GDA observed on membership reports and forwarding GDAs to detected routers.

Routers are detected based on:

  • OSPF messages
  • IGMP queries
  • PIM hellos
  • HSRP hellos
  • DVMRP

IGMP snooping intercepts IGMP leave messages and will only forward them to the router if the host sending the message is the last on the VLAN. Similarly, all membership reports are intercepted by IGMP snooping and will not be flooded (breaking report suppression); the switch will still send a single membership report if any are received from host ports.
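
A simplified Python model of the leave handling described above. It only tracks which ports are members of a group and ignores the last-member query step and all timers, so treat it as a sketch rather than how a real switch is implemented. The class and method names are my own.

```python
class SnoopingTable:
    """Toy IGMP snooping table: (vlan, group) -> set of member ports."""

    def __init__(self):
        self.members = {}

    def report(self, vlan, group, port):
        """Record a membership report seen on a host port."""
        self.members.setdefault((vlan, group), set()).add(port)

    def leave(self, vlan, group, port):
        """Handle a leave; return True if it should be forwarded to the router."""
        ports = self.members.get((vlan, group))
        if not ports or port not in ports:
            return False
        ports.remove(port)
        if ports:
            return False  # other receivers remain, the router never sees the leave
        del self.members[(vlan, group)]
        return True       # last receiver on the VLAN left


table = SnoopingTable()
table.report(10, "239.1.1.1", "Gi1/0/1")
table.report(10, "239.1.1.1", "Gi1/0/2")
print(table.leave(10, "239.1.1.1", "Gi1/0/1"))  # False
print(table.leave(10, "239.1.1.1", "Gi1/0/2"))  # True
```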

IGMPv3 is only supported through BISS (Basic IGMPv3 Snooping Support). Support for SSM membership reports is limited to include mode.

Ports can only be designated as not-router ports through configuring ACLs dropping packets for the protocols used to detect routers.

! Enable IGMP snooping 
ip igmp snooping 
ip igmp snooping vlan {vlan(s)} ! requires global config first

! Statically configure router/querier port
ip igmp snooping vlan {} mrouter interface {}

! Static port membership
ip igmp snooping vlan {} static {group addr} interface {}

! Set last member query interval for "regular" leave mechanism
ip igmp snooping last-member-query-interval {1000ms default}
ip igmp snooping last-member-query-count {2 default}

! Immediate-leave, stop forwarding immediately on received IGMPv2 leave message
ip igmp snooping vlan {} immediate-leave

IGMP snooping and STP interaction
When a TCN is received, the local switch will flush the local CAM table and revert to flooding multicast traffic. By default this goes on until 2 general queries have been received.

The root bridge sends an IGMP global leave message (group 0.0.0.0) when a TCN is received. This makes non-root bridges send general queries immediately to improve the convergence time. Non-root bridges can also be configured to send IGMP global leave messages to improve convergence further.

! Disable flooding upon received TCN
no ip igmp snooping tcn flood

! Number of general queries to flood traffic for after a STP TCN is received
ip igmp snooping tcn flood query count {2 default}

! Send global leave upon TCN regardless of being a root-bridge 
ip igmp snooping tcn query solicit

! Disable report-suppression
no ip igmp snooping report-suppression

IGMP snooping querier

A switch can be configured to handle IGMP querying for v1 and v2 in the absence of a multicast router. This requires a valid IP address for the SVI interface(s). The IGMP snooping querier will always assume the non-querier state if a multicast router is detected.

IGMPv3, IGMP snooping querier and IGMP filtering don’t mix.

! Enable globally
ip igmp snooping querier

ip igmp snooping querier address {}
ip igmp snooping querier version {2 default}
ip igmp snooping querier query-interval {60/125 default}

! Queries in response to received TCN 
ip igmp snooping querier tcn query count {2 default} interval {} 

PIM snooping

The docs for PIM snooping are not great.

Gotchas:

  • S,G routes are processed as *,G mroutes.
  • Does not work with PIMv1
  • PIM snooping without IGMP snooping doesn’t work with local receivers in the VLAN.
  • IPv4 mroutes only

ip pim snooping [vlan {}]

! Disable DR flooding
no ip pim snooping dr-flood

1.6.a iii IGMP Querier

Querier election happens based on source IP of query packets, where the lowest IP address is elected IGMP querier. The router that loses the election stops sending queries and starts monitoring the interval between general queries. The IGMP querier is considered dead after the duration of the “Other Querier Present Interval”(2 x query interval + 0.5 x Query response interval).
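
The election rule itself is a one-liner. The Python sketch below (function name is mine) compares the addresses numerically, which matters because a plain string comparison would rank 10.0.0.10 below 10.0.0.9.

```python
import ipaddress


def elect_querier(query_sources):
    """Return the IGMP querier: the lowest source IP seen in general queries."""
    return str(min(ipaddress.IPv4Address(addr) for addr in query_sources))


print(elect_querier(["10.0.0.10", "10.0.0.9", "10.0.0.200"]))  # 10.0.0.9
```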

When a IGMP router receives a general query an election happens based on the source IP of the packet where the lowest IP wins. If the local router

1.6.a iv IGMP Filter

IGMP filtering can be achieved on router interfaces, or on switches in tandem with IGMP snooping.

Filtering on router:

ip access-list extended {name}
acl> deny igmp {source} {group address} [igmp-type]
int> ip igmp access-group {name}

With IGMP snooping:

! Restrict which groups are allowed
! Configured as allow or deny mode
ip igmp profile {n}
int> ip igmp filter {n}

! Throttling/Max groups on a port
ip igmp max-groups {}
ip igmp max-groups action {deny | replace}

! Minimum version
ip igmp snooping minimum-version {}

1.6.a v MLD

IGMP for IPv6.

Message types:

  • Query
    • General
    • Group specific
  • Report, similar to membership report
  • Done, similar to leave message
    • Only used in MLDv1

MLDv1 is similar to IGMPv1 and MLDv2 is similar to IGMPv3. MLDv1 is the default. I am gambling that I won’t face any detailed questions about this on the exam…

1.6.b Reverse path forwarding check

Only forward multicast traffic if the outgoing interface of the IGP route for the source IP matches the interface where traffic is received. This provides loop avoidance and ensures that the shortest path is used.
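
A minimal Python sketch of the RPF check, assuming the unicast table is given as a plain dict of prefix to outgoing interface. Static mroutes, AD comparison and ECMP are ignored here, and all names are illustrative.

```python
import ipaddress


def rpf_interface(routes, source):
    """Longest-prefix match for the source; returns the RPF interface or None."""
    src = ipaddress.ip_address(source)
    matches = [
        (ipaddress.ip_network(prefix), iface)
        for prefix, iface in routes.items()
        if src in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(routes, source, in_iface):
    """Pass only if the packet arrived on the interface used to reach the source."""
    expected = rpf_interface(routes, source)
    return expected is not None and expected == in_iface


routes = {"10.0.0.0/8": "Gi0/1", "10.1.1.0/24": "Gi0/2"}
print(rpf_check(routes, "10.1.1.5", "Gi0/2"))  # True
print(rpf_check(routes, "10.1.1.5", "Gi0/1"))  # False
```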

PIM routers send join messages out of the RPF interface. This affects the Register process as well.

A good indication that there is an RPF failure is an Incoming interface of Null. Can quickly be verified with show ip rpf {source}.

Having interfaces with IGPs enabled but not PIM often results in RPF failures.

Static mroutes

The RPF check can be overridden using static mroutes. Static mroutes have an AD of 0 unless a distance is specified.

ip mroute {prefix} {mask} {interface} [distance]

1.6.c PIM

Protocol Independent Multicast, named as such due to using unicast routing protocols to do the RPF check.

Terminology

  • SPT - Source-based distribution tree (shortest-path tree), any tree built toward the multicast source
  • RPT - Root-path tree (RP tree), the tree built towards the RP in PIM-SM
    • Can also be called the shared tree
    • Represented as *,G, since the source isn’t used when building the tree
  • Source DR - The directly connected router forwarding traffic from a source

Transport
PIMv1: IP protocol 2 (IGMP) -> 224.0.0.2
PIMv2: IP protocol 103 -> 224.0.0.13

Messages

  • Hello - used to establish adjacencies
    • Holdtime set outbound in the hello message
  • Join/Prune - Used to control which traffic is forwarded by upstream routers
    • A join is a message with a group in the “join” field, while a prune uses the “prune” field
  • Graft, Graft ack - Used to un-prune a S,G pair.
  • State-refresh - Prevents upstream routers changing prune state for a S,G pair.
  • Assert - Negotiates which router should forward into a multiaccess network
    • AD and Metric for the IGP route is sent

Timers
Hello - 30s
ip pim hello-interval {}

Prune timer - 3m
ip pim join-prune-interval {}

State refresh - 60s
ip pim state-refresh origination-interval {}

Register-suppression - 60s

v1 & Compatibility
PIMv2 is used by default, but routers will switch to v1 whenever it is detected on an interface. PIMv1 can be configured statically with ip pim version 1 on an interface.
PIMv1 does not use hello messages to establish adjacencies. It also uses IP protocol 2 -> 224.0.0.2 for transport.

PIMv1 does not support steady-state/state refresh messages.

Multiaccess network mechanisms

Prune override: Join message as a response when another router tries to prune on the same network.

Assert process: Used to elect which router should forward multicast traffic downstream into a common network. Assert messages are sent by routers on a multiaccess network when they observe multicast traffic for the same S,G combination that they themselves are forwarding into the network.

The winner is selected based on:

  1. Lowest AD
  2. Lowest metric
  3. Highest LAN IP address
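
The three comparison steps above can be sketched in Python as a single sort key. The Assert tuple and the function name are my own, and the AD and metric values in the example are just typical numbers.

```python
import ipaddress
from typing import NamedTuple


class Assert(NamedTuple):
    ad: int        # administrative distance of the route to the source
    metric: int    # IGP metric of the route to the source
    ip: str        # sender's IP address on the shared LAN


def assert_winner(candidates):
    """Lowest AD wins, then lowest metric, then highest IP address."""
    return min(
        candidates,
        key=lambda c: (c.ad, c.metric, -int(ipaddress.IPv4Address(c.ip))),
    )


r1 = Assert(ad=110, metric=20, ip="10.0.0.1")
r2 = Assert(ad=90, metric=3072, ip="10.0.0.2")
print(assert_winner([r1, r2]).ip)  # 10.0.0.2, lower AD beats the lower metric
```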

The IGMPv2 querier and forwarder might be two different routers, in which case the IGMP querier is responsible for maintaining group memberships/subscriptions while the forwarder forwards the traffic.

Designated router: Used to elect which router should forward joins and registers upstream to the RP. The highest DR priority wins, with the highest IP address as tie-breaker, much like OSPF DR election (though the PIM DR election is preemptive). The priority can be set on interfaces with ip pim dr-priority {n}.
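
A small Python sketch of the DR election, assuming the rule is highest DR priority first and highest IP address as the tie-breaker, with a default priority of 1. The function name and the tuple layout are mine.

```python
import ipaddress


def elect_dr(neighbors):
    """neighbors: iterable of (ip, dr_priority). Returns the IP of the elected DR."""
    return max(
        neighbors,
        key=lambda n: (n[1], int(ipaddress.IPv4Address(n[0]))),
    )[0]


# Equal priorities: the highest IP wins.
print(elect_dr([("10.0.0.1", 1), ("10.0.0.2", 1)]))   # 10.0.0.2
# A higher dr-priority overrides the IP comparison.
print(elect_dr([("10.0.0.1", 10), ("10.0.0.2", 1)]))  # 10.0.0.1
```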

The “real” interface IP is used when FHRP is in use unless PIM redundancy is configured. For VRRP this is done with ip pim redundancy VRRP1 vrrp dr-priority 90

IPv6 multicast

PIM-SM is enabled for all interfaces when ipv6 multicast-routing is enabled. All PIM adjacencies are formed using link-local addresses.

ipv6 multicast-routing 

int {}
 no ipv6 pim
 ipv6 pim dr-priority {}

Anycast RP, Static RP, BSR and embedded RP are the supported methods for RP discovery.

1.6.c i Sparse Mode

PIM Sparse mode forwards multicast traffic only when requested by downstream routers.

Basic configuration

ip multicast-routing
int> ip pim sparse-mode

Verification

show ip mroute
! If the local router is the root of the tree, the incoming interface is Null

The rendezvous point

Having a router act as a rendezvous point is necessary to ensure all groups can be reached by all routers without flooding all links. All multicast sources are registered to the RP and only forwarded downstream if routers request the traffic.

Source registration

  1. Host starts sending traffic to a multicast address
  2. The local router encapsulates the first multicast packet in a register message and sends it by unicast to the RP
  3. The RP either
    • Responds with a register-stop message for Register suppression
    • Starts forwarding the encapsulated packets and continues with the steps below.
  4. The RP begins building an SPT by sending PIM joins towards the source IP.
  5. The RP sends a register-stop once the SPT is built and the RP starts receiving the multicast traffic.

Register suppression

  1. Router attempts to register, but the RP has no joined routers and responds with a register-stop message
  2. Local router starts the register-suppression timer, 1m by default.
  3. Local router sends an empty register message with the Null-register flag five seconds before the timer runs out.
  4. The RP responds with a register-stop if there still are no receivers, or lets the timer run out so that “normal” registration can occur.

Building the RPT/Shared tree

  1. A router receives an IGMP membership report requesting a group (*,G)
  2. The local router/PIM DR sends a PIM join towards the RP
  3. Upstream routers send PIM joins until the RP is reached
  4. The local router will start the SPT switchover if all goes well(unless configuration disallows it).

The RPF check is done against the RP address while the RPT is in use.

Shortest-Path Tree Switchover

Routers will switch to using an SPT instead of the RPT once a packet has been received through the RPT. This is done to increase efficiency and reduce load on the RP. The local router will send a PIM join towards the source IP as soon as it learns its IP from the RP and send a prune message up the RPT once the SPT has been established.

The mroute table on the RP shows no inbound interface for the *,G and no outbound interface on the S,G

ip pim spt-threshold {kbps}

NBMA Networks

When using PIM over NBMA networks you will run into split-horizon issues with spoke-to-spoke traffic. This can be avoided with ip pim nbma-mode on the hub’s interface.

Spokes should be disallowed from becoming the DR for the segment with ip pim dr-priority 0. This is not strictly necessary though.

1.6.c ii Static RP, BSR, AutoRP

Anycast RP with MSDP or BSR is preferred due to the ability to have redundant RPs.

Static RP

An RP can be assigned statically using ip pim rp-address {}. The RP knows to act as an RP due to seeing a local interface IP configured. The override parameter makes the statically configured RP take precedence over dynamically learned group-to-rp mappings.

This can be combined with anycast routing and MSDP for greater scalability and resiliency.

BSR

Very similar to AutoRP.

Candidate RP = Candidate RP
Bootstrap Router = Mapping-agent

Doesn’t require sparse-dense mode or autorp listener, due to BSR messages being encapsulated in PIM messages and forwarded out all non-RPF interfaces.

The BSR does not perform best RP selection, but rather sends all group-to-rp mappings to 224.0.0.13 and lets routers choose the best RP themselves.

Multiple BSRs can be configured for failover. The BSR with the highest priority will become the active BSR, or the BSR with the highest IP in case of a priority tie.

Should be filtered with bsr-border on the network edge.

!! RP candidate
ip pim rp-candidate {int} [group-list {}] [priority {}] [interval {}]
 
!! BSR candidate
ip pim bsr-candidate {int} {priority}

AutoRP

Cisco-proprietary

Requires sparse-dense mode, ip pim sparse-dense-mode (dense logic for .39 and .40), or ip pim autorp listener (preferred) on all routers.

Candidate RP - Candidate RP
Mapping agent - Advertises group-to-rp mappings to clients

The mapping agent picks which rp-group-mappings will be advertised. Decided by priority, or highest IP address as a tie-breaker.

RP-Announce 224.0.1.39 - “I am an RP and have these groups”
RP-Discovery 224.0.1.40 - Group-to-RP mappings, for the best RPs only.

Auto-RP messages must pass the RPF check.

!! RP config
ip pim send-rp-announce {int} [group-list {ACL}] [scope {TTL}] [interval {sec}]

!! Mapping-agent config
ip pim send-rp-discovery {int} [scope {TTL}] [interval {}]

! Filter which Candidate-RP's can be accepted
ip pim rp-announce-filter rp-list {ACL} group-list {ACL}

Embedded RP

For IPv6 PIM the RP address can be embedded in the multicast group IP.

Embedded RP group address
Scope is indicated by fourth digit of the HEX IPv6 address.

  • 1 Interface-local
  • 2 Link-local
  • 4 Admin-local
  • 5 Site-local
  • 8 Org-local
  • E Global

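Format: FF7{scope}:0{RP interface ID, 4b}{prefix length, 8b}:{64b RP prefix}:{32b group ID}
The RP address is the first {prefix length} bits of the RP prefix, with the RP interface ID as the last 4 bits.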
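
Extracting the RP from an embedded-RP group address can be sketched in Python as below. The bit layout is my reading of RFC 3956 (4 reserved bits, 4-bit RP interface ID, 8-bit prefix length, 64-bit prefix, 32-bit group ID), and the function name is my own.

```python
import ipaddress


def embedded_rp(group: str) -> str:
    """Derive the RP address from an IPv6 embedded-RP group address."""
    g = int(ipaddress.IPv6Address(group))
    top16 = g >> 112
    if top16 >> 8 != 0xFF or (top16 >> 4) & 0x7 != 0x7:
        raise ValueError("not an embedded-RP group (R, P and T flags must be set)")
    riid = (g >> 104) & 0xF               # RP interface ID
    plen = (g >> 96) & 0xFF               # prefix length
    prefix = (g >> 32) & ((1 << 64) - 1)  # 64-bit network prefix field
    if not 1 <= plen <= 64:
        raise ValueError("prefix length must be between 1 and 64")
    mask = ((1 << plen) - 1) << (64 - plen)
    rp = ((prefix & mask) << 64) | riid
    return str(ipaddress.IPv6Address(rp))


print(embedded_rp("ff7e:140:2001:db8:beef:feed:0:1234"))  # 2001:db8:beef:feed::1
```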

1.6.c iii Group to RP Mapping

1.6.c iv Bidirectional PIM

Defined in RFC 5015 - The polar opposite of SSM, there are only *,G routes in bidir PIM. Scales well in “many-to-many” multicast scenarios.

All traffic is forwarded towards the RP without source registration. Hence the “Bidirectional” part of the name. The PIM Designated-forwarder mechanism is used for loop prevention and the RPF check is disabled. Loops are hence likely to occur if a router in the topology isn’t configured for bidir PIM.

! Enable bidir globally
ip pim bidir-enable  

! Configure an RP
ip pim rp-address {} bidir 
! Auto-RP and BSR also supported with the bidir keyword

Phantom RP

Anycast RP + MSDP isn’t possible as there are no S,G pairs or registration process in bidir PIM. Hence the need for phantom RP.

Phantom RP is achieved through advertising subnets attached to loopback interfaces into the IGP with differing prefix-lengths and advertising/configuring an IP address in said subnet as the RP address. It is important that the RP address isn’t the address of the loopback interface. Multicast convergence with a phantom RP is equal to IGP convergence plus DF election.
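
The failover logic is plain longest-prefix matching. A Python sketch with made-up router names and prefixes, where each router advertises the same loopback subnet with a different mask and the RP address belongs to neither loopback:

```python
import ipaddress


def active_rp_router(advertisements, rp_address):
    """Return the router whose advertised prefix is the longest match for the RP."""
    rp = ipaddress.ip_address(rp_address)
    covering = {
        router: ipaddress.ip_network(prefix)
        for router, prefix in advertisements.items()
        if rp in ipaddress.ip_network(prefix)
    }
    if not covering:
        return None
    return max(covering, key=lambda router: covering[router].prefixlen)


# Hypothetical setup: R1 advertises the /30 (primary), R2 the /29 (backup).
ads = {"R1": "10.255.255.0/30", "R2": "10.255.255.0/29"}
print(active_rp_router(ads, "10.255.255.2"))  # R1
del ads["R1"]                                 # R1 fails and its route is withdrawn
print(active_rp_router(ads, "10.255.255.2"))  # R2
```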

BSR cannot be used with phantom RP due to the requirement of using a local interface IP address.

1.6.c v Source-Specific Multicast

Does not use an RP and no *,G are created. This removes the RP as a bottleneck and a potential SPOF. PIM SSM is best for one-to-many multicast routing.

ip pim ssm {default | [range]}
int {}
 ip igmp version 3

1.6.c vi Multicast boundary, RP announcement filter

Administrative scoping

ip multicast boundary

Access lists

Standard access lists allow matching group addresses

To be written...

Extended ACLs allow filtering based on source + group.

To be written... Eventually...

Direction

in - Inbound filtering applies to which S/G pairs will be allowed from “upstream” interfaces.

out - Outbound filters which traffic can be forwarded out non-RPF interfaces

TTL scoping

Setting a maximum TTL for outbound multicast traffic on an interface with

RP announcement filter

Inbound Auto-RP announcement filtering
ip pim rp-announce-filter [rp-list {ACL} | group-list {ACL}] is used to filter inbound RP announcements.

Outbound Auto-RP announcement filtering
The ip multicast boundary ... filter-autorp command drops auto-rp messages denied by the ACL. This option can’t be used with the in/out option.

BSR message filtering
BSR messages can be filtered on an interface with ip pim bsr-border.

1.6.c vii PIMv6 Anycast RP

IPv6 doesn’t rely on MSDP for multicast source advertisement and uses PIM registers instead.

When a source registers to an IPv6 anycast RP the RP will in turn register the source with all other anycast RP routers. All IPv6 anycast RP routers must hence have all other anycast RP peers configured.

ipv6 pim rp-address {local IP}
ipv6 pim anycast-rp {local IP} {peer}

1.6.c viii IPv4 Anycast RP using MSDP

Makes the convergence of multicast the same as convergence of the IGP. RP selection is a result of the IGP routing metric.

MSDP allows advertisement of (S,G) pairs between RPs with MSDP Source Active messages, sent upon a received PIM Register message.

MSDP mesh groups should be used when many MSDP peers are used. This disables the SA-flooding behaviour and reduces load.

ip msdp peer {} connect-source {non-anycast-IP}
ip msdp originator-id {} ! Recommended

! Mesh groups
ip msdp mesh-group {name} {peer-address}

! Filtering locally originated sources
ip msdp redistribute [list {}] [asn {ASN-acl}] [route-map {}]

! Filtering SA Messages
ip msdp sa-filter {in|out} {peer} [list {}] [route-map {}] [rp-list {} | rp-route-map {}]

1.6.c ix Multicast multipath

Multicast ECMP can be enabled with ip multicast multipath. ECMP is done based on source IP, hence not possible with dense-mode or bidir PIM. This makes the query/hello and assert mechanisms weird, but should be a “set and forget”.

Troubleshooting

show ip pim rp mapping
show ip rpf {source-ip}
show ip mroute 
mtrace
debug ip mfib pak

Practice list

Routing

  • PIM-ASM
  • PIM-SSM
  • PIM-Bidir

RP Discovery

  • IPv4 Auto-RP
  • IPv4 Bidir Phantom-RP
  • IPv4/6 Anycast RP
  • IPv4/6 BSR
  • IPv6 Embedded RP

Filtering

  • Multicast forwarding
  • RP Discovery
    • RP candidate filtering
    • Group-rp mappings
    • Scope
  • IGMP
  • MSDP
    • SA filtering
    • “Redistribution”

Snacks

  • Multicast
    • ECMP with static mroutes
  • Static mroutes

Study resources

The Multicast Section of the INE CCIE Enterprise infrastructure learning track is a good starting-point. Though I wouldn’t rely on it as my only study source. Notably it does not cover IPv6 multicast and barely touches L2 multicast.

Books used, ranked by most value for time spent:

The CCIE Enterprise Infrastructure Foundation book by Narbik Kocharians hasn’t been released at the time of writing this, but i suspect it will also be a very good resource for the EI.

I have also used the IOS XE 16.2.x configuration guide extensively.

Got feedback or a question?
Feel free to contact me at hello@torbjorn.dev