
Course Presentation Material


Last updated 8/27/2001

Finally, the IP Multicast Training materials have been updated with the latest information! Many of you have previously downloaded these materials for self-training on IP Multicast. The new course materials cover many topics that were not in the old ones. Here is a sample of the changes and additions:

- Layer 2 Campus Design - Module 2 now contains material on the design issues relating to IP Multicast over campus networks, including CGMP, IGMP Snooping, IGMPv3, multicast over ATM-LANE, and general Layer 2 "gotchas" that need to be avoided.
- Rendezvous Points - Module 6 is devoted to this topic and will help you answer that age-old question, "Where do I put my RP?" This module covers RP techniques such as static RPs, Auto-RP and BSR, as well as how to tune and debug these mechanisms.
- Advanced IP Multicast Features - Module 7 is chock-full of advanced multicast topics, including IP Multicast Helper, dealing with rate limits, administrative scoping and many others. A real "must read" for people who are serious about extracting the absolute maximum performance from their multicast network.
- Multiprotocol BGP (MBGP) - Module 10 covers the use of MBGP for inter-domain IP Multicast. Even if you are not already a BGP guru, you will find this module very helpful: it contains a short overview of BGP that gives the beginner to inter-domain IP Multicasting the necessary background to get started.
- Multicast Source Discovery Protocol (MSDP) - Module 11 highlights MSDP and how it is currently being used in the Internet to interconnect independent PIM Sparse mode domains so that true inter-domain IP Multicast can be accomplished.
- PIM Protocol Extensions - Module 12 is an entirely new module that describes two of the latest extensions to the PIM protocol: Source-Specific Multicast (SSM) and Bidirectional PIM (Bidir). These two extensions allow PIM to scale better in several different ways.

These IP Multicast training modules have previously been used for internal Cisco training only. Due to the tremendous demand for this information, we have made this content available for "self-training" purposes by our customers. All of this material is copyrighted and may not be repackaged or resold for any commercial purposes without written permission from Cisco Systems.

All modules are constantly being updated to improve their content and/or correct any mistakes in the material. You may wish to check this page from time to time to download updated versions of the material.

The following training modules are all in Adobe PDF format. To view them, use Adobe Acrobat Reader. If you don't have Acrobat Reader on your workstation you may download a free version from Adobe.

Module      Title                                                         Updated
Module 1    "Fundamentals of IP Multicasting" (1276 KB)                   8/27/2001
Module 2    "IP Multicasting at Layer 2" (1249 KB)                        8/27/2001
Module 3    "PIM Dense Mode" (936 KB)                                     8/27/2001
Module 4    "Basic Multicast Debugging" (529 KB)                          8/27/2001
Module 5    "PIM Sparse Mode" (1599 KB)                                   8/27/2001
Module 6    "Rendezvous Points" (971 KB)                                  8/27/2001
Module 7    "Advanced IP Multicast Features" (1319 KB)                    8/27/2001
Module 8    "DVMRP" (513 KB)                                              8/27/2001
Module 9    "Interconnecting PIM & DVMRP Multicast Networks" (839 KB)     8/27/2001
Module 10   "Multi-protocol BGP (MBGP)" (1139 KB)                         8/27/2001
Module 11   "Multicast Source Discovery Protocol (MSDP)" (1605 KB)        8/27/2001
Module 12   "PIM Protocol Extensions" (1053 KB)                           8/27/2001

Fundamentals of IP Multicast
Module 1

Module1. ppt

1998 2001, Cisco Systems, Inc. All rights reserved.

8/10/2001 3:45 PM

Copyright 1999-2001, Cisco Systems, Inc.

Module1.ppt

Module Objectives

- Recognize when to use IP Multicast
- Identify the fundamental concepts involved in IP Multicasting
- Characterize the differences in various IP Multicast routing protocols


Agenda

- Why Multicast
- Multicast Applications
- Multicast Service Model
- Multicast Distribution Trees
- Multicast Forwarding
- Multicast Protocol Basics
- Multicast Protocol Review

Why Multicast?

- When sending the same data to multiple receivers
  - Better bandwidth utilization
  - Less host/router processing
- When receivers' addresses are unknown


Unicast vs Multicast

[Figure: a host and router delivering data to receivers, once by unicast and once by multicast]

Unicast transmission sends multiple copies of data, one copy for each receiver
- Ex: host transmits 3 copies of data and the network forwards each to 3 separate receivers
- Ex: host can only send to one receiver at a time

Multicast transmission sends a single copy of data to multiple receivers
- Ex: host transmits 1 copy of data and the network replicates it at the last possible hop for each receiver; each packet exists only once on any given network
- Ex: host can send to multiple receivers simultaneously


Multicast Advantages
- Enhanced Efficiency: Controls network traffic and reduces server and CPU loads
- Optimized Performance: Eliminates traffic redundancy
- Distributed Applications: Makes multipoint applications possible

Example: Audio Streaming. All clients listening to the same 8 Kbps audio

[Figure: traffic in Mbps (0 to 0.8) versus number of clients (1 to 100), with one curve for multicast and one for unicast]

Multicast transmission affords many advantages over unicast transmission in a one-to-many or many-to-many environment
- Enhanced Efficiency: available network bandwidth is utilized more efficiently since multiple streams of data are replaced with a single transmission
- Optimized Performance: fewer copies of data require forwarding and processing
- Distributed Applications: without multicast, multipoint applications will not be possible as demand and usage grow, because unicast transmission will not scale
- Ex: with unicast transmission, the traffic level increases 1:1 with the number of clients
- Ex: with multicast transmission, the traffic level increases at a greatly reduced rate, if at all, as clients are added
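The audio streaming example can be reproduced with a few lines of arithmetic. The sketch below assumes the simplest possible model: every client receives the same 8 Kbps stream, unicast costs one copy per client at the sender, and multicast costs one copy in total. The function names are illustrative only.

```python
STREAM_KBPS = 8  # per-client audio rate used in the slide's example


def unicast_load_mbps(clients: int, stream_kbps: float = STREAM_KBPS) -> float:
    """Sender-side load when every client gets its own copy of the stream."""
    return clients * stream_kbps / 1000.0


def multicast_load_mbps(clients: int, stream_kbps: float = STREAM_KBPS) -> float:
    """Sender-side load when a single copy is sent to the group."""
    return stream_kbps / 1000.0 if clients > 0 else 0.0


if __name__ == "__main__":
    for n in (1, 20, 40, 60, 80, 100):
        print(f"{n:>3} clients: unicast {unicast_load_mbps(n):.3f} Mbps, "
              f"multicast {multicast_load_mbps(n):.3f} Mbps")
```

With 100 clients the unicast load reaches 0.8 Mbps, which matches the top of the slide's traffic axis, while the multicast load stays at 0.008 Mbps.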


Multicast Disadvantages
Multicast is UDP Based!!!
- Best Effort Delivery: Drops are to be expected. Multicast applications should not expect reliable delivery of data and should be designed accordingly. Reliable Multicast is still an area for much research. Expect to see more developments in this area.
- No Congestion Avoidance: Lack of TCP windowing and "slow-start" mechanisms can result in network congestion. If possible, Multicast applications should attempt to detect and avoid congestion conditions.
- Duplicates: Some multicast protocol mechanisms (e.g. Asserts, Registers and Shortest-Path Tree Transitions) result in the occasional generation of duplicate packets. Multicast applications should be designed to expect occasional duplicate packets.
- Out-of-Sequence Packets: Various network events can result in packets arriving out of sequence. Multicast applications should be designed to handle packets that arrive in some other sequence than they were sent by the source.


Multicast Disadvantages
Most multicast applications are UDP based. This results in some undesirable side effects when compared to similar unicast, TCP applications.

Best Effort Delivery results in occasional packet drops. Many multicast applications that operate in real time (e.g. video, audio) can be impacted by these losses. Also, requesting retransmission of the lost data at the application layer is not feasible in these sorts of real-time applications.
- Heavy drops in voice applications result in jerky speech with missing pieces, which can make the content unintelligible when the drop rate gets high enough.
- Moderate to heavy drops in video are sometimes better tolerated by the human eye and appear as unusual artifacts in the picture. However, some compression algorithms can be severely impacted by even low drop rates, causing the picture to become jerky or freeze for several seconds while the decompression algorithm recovers.

No Congestion Control can result in overall network degradation as the popularity of UDP-based multicast applications grows.

Duplicate packets can occasionally be generated as multicast network topologies change. Applications should expect occasional duplicate packets to arrive and should be designed accordingly.
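One way an application can be "designed accordingly" for duplicates and out-of-sequence packets is sketched below. It assumes each packet carries an application-level sequence number, which is not something the slides specify, and the class name `ReorderBuffer` is hypothetical. The sketch ignores sequence-number wraparound and never gives up on a lost packet.

```python
from typing import Dict, List


class ReorderBuffer:
    """Drops duplicate packets and releases payloads in sequence order."""

    def __init__(self, first_seq: int = 0) -> None:
        self.next_seq = first_seq            # next sequence number to deliver
        self.pending: Dict[int, bytes] = {}  # early arrivals, keyed by sequence number

    def receive(self, seq: int, payload: bytes) -> List[bytes]:
        """Accept one packet; return the payloads that are now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []                        # duplicate: already delivered or already queued
        self.pending[seq] = payload
        ready: List[bytes] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


if __name__ == "__main__":
    buf = ReorderBuffer()
    for seq, data in [(0, b"a"), (2, b"c"), (0, b"a"), (1, b"b")]:
        print(seq, buf.receive(seq, data))
```

A real-time receiver would normally add a timeout so that a packet that never arrives is skipped rather than stalling playback, since retransmission is usually not an option.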


IP Multicast Applications
- Live TV and Radio Broadcast to the Desktop
- Multicast File Transfer
- Distance Learning
- Data and File Replication
- Corporate Broadcasts
- Video Conferencing
- Video-On-Demand
- Whiteboard/Collaboration
- Real-Time Data Delivery (Financial)
- Training

Many new multipoint applications are emerging as demand for them grows
- Ex: Real-time applications include live broadcasts, financial data delivery, whiteboard collaboration, and video conferencing
- Ex: Non-real-time applications include file transfer, data and file replication, and video-on-demand
- Note also that the latest version of Novell NetWare uses IP multicast for file and print service announcements. See: http://developer.novell.com/research/appnotes/1999/march/02/index.htm


Example Multicast Applications
Mbone Multicast Applications

sdr - session directory
- Lists advertised sessions
- Launches multicast application(s)

vat - audio conferencing
- PCM, DVI, GSM, and LPC4 compression

vic - video conferencing
- H.261 video compression

wb - white board
- Shared drawing tool
- Can import PostScript images
- Uses Reliable Multicast

Several MBONE multicast applications exist


- Ex: Session Directory is a tool that allows participants to view advertised multicast sessions and launch appropriate multicast applications to join an existing session
- Ex: Audio Conferencing allows multiple participants to share audio interactively
- Ex: Video Conferencing allows multiple participants to share video and audio interactively
- Ex: White Boarding allows multiple participants to collaborate interactively in a text and graphical environment


sdr - Session Directory


SDR - Session Directory (revised)


The SDR tool allows users in the network to create multimedia multicast sessions and to see the sessions created by others. These multimedia sessions (video, audio, etc.) are announced by the SDR application via well-known multicast groups.

The window on the left shows an example of the SDR application in action. Each line is a multimedia session that has been created by some user in the network and is being announced (via multicast) by the creator's SDR application.

By clicking on one of these sessions, the window on the right is brought up. This window displays various information about the multimedia session including:
- General session information
- Session schedule
- Media type (in this case audio and video)
- Media format
- Multicast group and port numbers

Using the window on the right, one can have SDR launch the appropriate multicast application(s) to join the session.


vat - Audio Conferencing


Vat - Audio Conferencing Tool


This is an example of the vat audio conferencing tool. The window on the left is the main window for the session. It contains a speaker gain slider widget and an output VU bar-graph meter along with a microphone gain slider widget and VU meter. When one wishes to address the conference, one usually presses the right mouse button on the workstation.

The window on the right is a menu that can be brought up by pressing the Menu button on the main window. This menu allows various parameters about the session to be adjusted, including encoding format.

Notice that there are several members of this session listed in the main window even though only the second person is talking (indicated by the blackened square next to the name). This points out that all members of the session are multicast sources even though they may never speak and only listen to the session. This is because vat uses the RTP/RTCP model to transport real-time audio data. In this model, all members of the session multicast member information and reception statistics to the entire group in an RTCP "back-channel".

Almost all multimedia multicast applications use the RTP/RTCP model, including:
- vat (and its cousin application rat)
- vic
- wb (Whiteboard)
- IP/TV


vic - Video Conferencing


vic - Video Conferencing Tool


This is an example of the vic video conferencing tool. The window on the right is the main window for the video conferencing session. Notice that multiple video streams are being received, each with its own thumbnail image. The window on the left is a larger version of the thumbnail image from the main window.


wb - White Board


wb - Whiteboard
Just as its name implies, this is a form of electronic Whiteboard that can be shared by members of the multicast group.

White Board uses a form of Reliable Multicast


Reliable Multicasting is necessary to ensure that no critical graphic information is lost. Most multimedia multicast applications simply use UDP, best-effort datagram delivery mechanisms because of the time-critical nature of the media. However, wb needs a reliable method to distribute the graphic images drawn on the electronic Whiteboard.


Downloading MBone Applications


Multimedia conferencing application archive
- Contains sdr, vic, vat, rat, wb, nte, and other applications
- URL: http://www-mice.cs.ucl.ac.uk/multimedia/software/

Multiple platform support
- SunOS, Solaris, HP, Linux, Windows 95/98/2000, Windows NT, etc.

Source code

Several multimedia applications for the MBONE are freely available


- Download the desired application for the appropriate platform
- Source code and binaries are available


IP Multicast Service Model

- RFC 1112 (Host Ext. for Multicast Support)
- Each multicast group identified by a class-D IP address
- Members of the group could be present anywhere in the Internet
- Members join and leave the group and indicate this to the routers
- Senders and receivers are distinct: i.e., a sender need not be a member
- Routers listen to all multicast addresses and use multicast routing protocols to manage groups


RFC 1112 (Host Extensions for IP Multicasting) defines the host service model and specifies the Internet Group Management Protocol (IGMP)
- Allows hosts to join a group that receives multicast packets
- Allows users to dynamically register (join/leave multicast groups) based on the applications they execute
- Uses IP datagrams to transmit data

Addressing
- Class D IP addresses (first octet 224-239) are dynamically allocated
- Multicast IP addresses represent receiver groups, not individual receivers

Group Membership
- Receivers can be densely or sparsely distributed throughout the Internet
- Receivers can dynamically join/leave a multicast session at any time, using IGMP to manage group membership within the routers
- Senders are not necessarily included in the multicast group they are sending to
- In many applications receivers also become senders, e.g. RTCP streams from IP/TV clients and Tibco RV

Multicast Routing
- Group distribution requires packet distribution trees to efficiently forward data to multiple receivers
- Multicast routing protocols effectively direct multicast traffic along network paths
  - Multicast Extension to Open Shortest Path First (MOSPF, RFC 1584)
  - Core Based Tree (CBT)
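From a host's point of view, joining and leaving a group is a socket option; the host's IP stack then handles the IGMP signaling toward the routers. The sketch below uses the standard Python `socket` module. The helper names and any group address or port passed in are hypothetical and chosen only for illustration.

```python
import socket
import struct


def build_mreq(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack the (group address, local interface address) pair used by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))


def open_receiver(group: str, port: int) -> socket.socket:
    """Bind a UDP socket and join `group`; the IP stack reports the membership via IGMP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


def leave(sock: socket.socket, group: str) -> None:
    """Leave the group and close the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, build_mreq(group))
    sock.close()


def send_once(group: str, port: int, payload: bytes, ttl: int = 1) -> None:
    """Send to the group without joining it: a sender need not be a member."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))
```

Note that `send_once` never calls `IP_ADD_MEMBERSHIP`, which mirrors the service model's point that senders and receivers are distinct.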


IP Multicast Service Model


IP group addresses
- Class D address: high-order 4 bits are 1110 (224.0.0.0)
- Range from 224.0.0.0 through 239.255.255.255

Well known addresses designated by IANA
- Reserved use: 224.0.0.0 through 224.0.0.255
  - 224.0.0.1: all multicast systems on subnet
  - 224.0.0.2: all routers on subnet
- See ftp://ftp.isi.edu/in-notes/iana/assignments/multicast-addresses

Transient addresses, assigned and reclaimed dynamically
- Global scope: 224.0.1.0-238.255.255.255
- Limited Scope: 239.0.0.0-239.255.255.255
  - Site-local scope: 239.253.0.0/16
  - Organization-local scope: 239.192.0.0/14


IP Addresses use the Class D address space


Class D addresses are denoted by the high 4 bits set to 1110.

Local Scope Addresses


- Addresses 224.0.0.0 through 224.0.0.255
- Reserved by IANA for network protocol use
- Examples:
    224.0.0.1   All Hosts
    224.0.0.2   All Multicast Routers
    224.0.0.4   All DVMRP Routers
    224.0.0.5   All OSPF Routers
    224.0.0.6   All OSPF DR
- Multicasts in this range are never forwarded off the local network, regardless of TTL
- Multicasts in this range are usually sent link-local with TTL = 1

Global Scope Addresses


- Addresses 224.0.1.0 through 238.255.255.255
- Allocated dynamically throughout the Internet

Administratively Scoped Addresses

- Addresses 239.0.0.0 through 239.255.255.255
- Reserved for use inside of private domains
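The three ranges in these notes can be checked mechanically. The following sketch uses Python's standard `ipaddress` module; the function names and the returned labels are illustrative, not part of any standard API.

```python
import ipaddress

LOCAL_SCOPE = ipaddress.ip_network("224.0.0.0/24")   # reserved, never forwarded off the link
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")   # for use inside private domains


def is_class_d(address: str) -> bool:
    """True when the high-order 4 bits of the IPv4 address are 1110."""
    return int(ipaddress.IPv4Address(address)) >> 28 == 0b1110


def multicast_scope(address: str) -> str:
    """Name the range an IPv4 address falls into, following the ranges in the notes."""
    if not is_class_d(address):
        return "not multicast"
    addr = ipaddress.IPv4Address(address)
    if addr in LOCAL_SCOPE:
        return "local scope (reserved)"
    if addr in ADMIN_SCOPED:
        return "administratively scoped"
    return "global scope"  # everything left over: 224.0.1.0 through 238.255.255.255


if __name__ == "__main__":
    for a in ("224.0.0.5", "224.2.127.254", "239.192.1.1", "192.168.1.1"):
        print(a, "->", multicast_scope(a))
```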


IP Multicast Addressing
Dynamic Group Address Assignment

Historically accomplished using SDR application
- Sessions/groups announced over well-known multicast groups
- Address collisions detected and resolved at session creation time
- Has problems scaling

Future dynamic techniques under consideration
- Multicast Address Set-Claim (MASC)
  - Hierarchical, dynamic address allocation scheme
  - Extremely complex garbage-collection problem
  - A long way off

Dynamic Group Address Assignment


SDR
Dynamic assignment was typically accomplished using the SDR application, which would detect collisions in IP multicast group address assignment at the time new sessions were being created and pick an unused address. While it was sufficient for use on the old MBone, when the total number of multicast sessions in the Internet was quite low, SDR has severe scaling problems that preclude its continued use as the number of sessions increases.

Multicast Address Set-Claim (MASC)
MASC is a new proposal for dynamic multicast address allocation that is being developed by the malloc Working Group of the IETF. This proposal will provide for dynamic allocation of the global IP Multicast address space in a hierarchical manner. In this proposal, domains lease IP multicast group address space from their parent domain. These leases are good for only a set period. It is possible that the parent domain may grant a completely different range at lease renewal time, due to the need to reclaim address space for use elsewhere in the Internet. As one can imagine, this is a very non-trivial mechanism and is a long way from actual implementation.


IP Multicast Addressing

Static Group Address Assignment (new)


- Temporary method to meet immediate needs
- Group range: 233.0.0.0 - 233.255.255.255
  - Your AS number is inserted in middle two octets
  - Remaining low-order octet used for group assignment
- Defined in IETF draft
  - draft-ietf-mboned-glop-addressing-xx.txt


Static Group Address Assignment


Until MASC has been fully specified and deployed, many content providers in the Internet require "something" to get going in terms of address allocation. This is being addressed with a temporary method of static multicast address allocation. This special allocation method is defined in:
  draft-ietf-mboned-glop-addressing-xx.txt

The basic concept behind this methodology is as follows:
- Use the 233/8 address space for static address allocation
- The middle two octets of the group address would contain your AS number
- The final octet is available for group assignment
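The mapping from an AS number to its static block is simple bit arithmetic. The sketch below assumes a 16-bit AS number, since only two octets are available; the function name `glop_block` and the AS number used in the example are chosen purely for illustration.

```python
import ipaddress


def glop_block(as_number: int) -> ipaddress.IPv4Network:
    """Return the 233.x.y.0/24 block whose middle two octets encode a 16-bit AS number."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("only a 16-bit AS number fits in the middle two octets")
    high, low = as_number >> 8, as_number & 0xFF   # split into the two octets
    return ipaddress.ip_network(f"233.{high}.{low}.0/24")


if __name__ == "__main__":
    # AS 5662 is 0x161E, i.e. octets 22 and 30.
    print(glop_block(5662))
```

The final octet of the resulting /24 is what remains available for group assignment, giving 256 statically assigned groups per AS.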


Multicast Protocol Basics

- Multicast Distribution Trees
- Multicast Forwarding
- Types of Multicast Protocols
  - Dense Mode Protocols
  - Sparse Mode Protocols


Multicast Distribution Trees


Defines the path down which traffic flows from source to receiver(s).

Multicast Forwarding
Unlike unicast forwarding which uses the destination address to make its forwarding decision, multicast forwarding uses the source address to make its forwarding decision.

Type of Multicast Protocols


- Dense Mode Protocols
- Sparse Mode Protocols


Multicast Distribution Trees

Shortest Path or Source Distribution Tree


Notation: (S, G)
S = Source
G = Group

[Figure: Source 1 and Source 2 attached to a network of routers (A through F), with Receiver 1 and Receiver 2 at the leaves of the source tree]

Shortest Path Trees aka Source Trees


A shortest path or source distribution tree is a minimal spanning tree with the lowest cost from the source to all leaves of the tree.

We forward packets on the Shortest Path Tree according to both the source address S that the packets originated from and the group address G that the packets are addressed to. For this reason we refer to the forwarding state on the SPT by the notation (S,G) (pronounced "S comma G"), where:
- S is the IP address of the source
- G is the multicast group address

Example 1: The shortest path between Source 1 and Receiver 1 is via Routers A and C, and the shortest path to Receiver 2 is one additional hop via Router E.



Shortest Path Trees aka Source Trees (cont.)


Every SPT is rooted at the source. This means that for every source sending to a group, there is a corresponding SPT.

Example 2: The shortest path between Source 2 and Receiver 1 is via Routers D, F and C, and the shortest path to Receiver 2 is one additional hop via Router E.


Multicast Distribution Trees

Shared Distribution Tree


Notation: (*, G)
* = All Sources
G = Group

[Figure: shared tree rooted at router D, the PIM Rendezvous Point (RP), reaching Receiver 1 and Receiver 2; legend: Shared Tree]

Shared Distribution Trees


A shared distribution tree is a tree whose root is a shared point in the network, down which multicast data flows to reach the receivers in the network. In PIM-SM, this shared point is called the Rendezvous Point (RP).

Multicast traffic is forwarded down the Shared Tree according to just the group address G that the packets are addressed to, regardless of source address. For this reason we refer to the forwarding state on the shared tree by the notation (*,G) (pronounced "star comma G"), where:
- * means any source
- G is the group address


Multicast Distribution Trees

Shared Distribution Tree


Notation: (*, G)
* = All Sources
G = Group

[Figure: Source 1 and Source 2 send along source trees to router D, the PIM Rendezvous Point (RP); from the RP the shared tree reaches Receiver 1 and Receiver 2; legend: Shared Tree, Source Tree]

Shared Distribution Trees (cont.)


Before traffic can be sent down the Shared Tree, it must somehow be sent to the root of the tree. In classic PIM-SM, this is accomplished by the RP joining the Shortest Path Tree back to each source, so that the traffic can flow to the RP and from there down the shared tree.

In order to trigger the RP to take this action, it must somehow be notified when a source goes active in the network. In PIM-SM, this is accomplished by first-hop routers (i.e. the router directly connected to an active source) sending a special Register message to the RP to inform it of the active source.

In the example above, the RP has been informed that Sources 1 and 2 are active and has subsequently joined the SPT to these sources.


Multicast Distribution Trees
Characteristics of Distribution Trees

Source or Shortest Path trees
- Uses more memory, O(S x G), but you get optimal paths from source to all receivers; minimizes delay

Shared trees
- Uses less memory, O(G), but you may get sub-optimal paths from source to all receivers; may introduce extra delay


Source or Shortest Path Tree Characteristics


Provides optimal path (shortest distance and minimized delay) from source to all receivers, but requires more memory to maintain

Shared Tree Characteristics


Provides sub-optimal path (may not be shortest distance and may introduce extra delay) from source to all receivers, but requires less memory to maintain
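The memory trade-off can be made concrete by counting forwarding entries. The sketch below assumes a router that sits on every tree, and that it holds one (S,G) entry per active source per group with source trees but only one (*,G) entry per group with shared trees. The group addresses and source counts are invented for illustration.

```python
from typing import Dict


def source_tree_entries(sources_per_group: Dict[str, int]) -> int:
    """One (S,G) entry per active source in each group: grows as O(S x G)."""
    return sum(sources_per_group.values())


def shared_tree_entries(sources_per_group: Dict[str, int]) -> int:
    """One (*,G) entry per group, however many sources send to it: grows as O(G)."""
    return len(sources_per_group)


if __name__ == "__main__":
    # Hypothetical load: three groups with 50, 3 and 1 active sources.
    groups = {"239.1.1.1": 50, "239.1.1.2": 3, "239.1.1.3": 1}
    print("(S,G) entries:", source_tree_entries(groups))
    print("(*,G) entries:", shared_tree_entries(groups))
```

For the hypothetical load above, source trees need 54 entries where shared trees need 3, at the cost of possibly longer paths through the shared root.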


Multicast Distribution Trees
How are Distribution Trees Built?

PIM
- Uses existing Unicast Routing Table plus Join/Prune/Graft mechanism to build tree.

DVMRP
- Uses DVMRP Routing Table plus special Poison-Reverse mechanism to build tree.

MOSPF
- Uses extension to OSPF's link state mechanism to build tree.

CBT
- Uses existing Unicast Routing Table plus Join/Prune/Graft mechanism to build tree.


Distribution trees are built in a variety of ways, depending upon the multicast routing protocol employed
PIM utilizes the underlying unicast routing table (any unicast routing protocol) plus:
- Join: routers add their interfaces and/or send PIM-JOIN messages upstream to establish themselves as branches of the tree when they have interested receivers attached
- Prune: routers prune their interfaces and/or send PIM-PRUNE messages upstream to remove themselves from the distribution tree when they no longer have interested receivers attached
- Graft: routers send PIM-GRAFT messages upstream when they have a pruned interface and have already sent PIM-PRUNEs upstream, but then receive an IGMP host report for the group that was pruned; routers must reestablish themselves as branches of the distribution tree because of the new interested receivers attached

DVMRP utilizes a special RIP-like multicast routing table plus:
- Poison-Reverse: a special metric of Infinity (32) plus the originally received metric, used to signal that the router should be placed on the distribution tree for the source network
- Prunes & Grafts: routers send Prunes and Grafts up the distribution tree, similar to PIM-DM

MOSPF utilizes the underlying OSPF unicast routing protocol's link state advertisements to build (S,G) trees
- Each router maintains an up-to-date image of the topology of the entire network

CBT utilizes the underlying unicast routing table and the Join/Prune/Graft mechanisms (much like PIM-SM)

Multicast Forwarding

Multicast Routing is backwards from Unicast Routing


Unicast Routing is concerned about where the packet is going. Multicast Routing is concerned about where the packet came from.

Multicast Routing uses Reverse Path Forwarding



Multicast Forwarding
Routers must know packet origin, rather than destination (opposite of unicast)
- the source IP address denotes the known source
- the destination IP address denotes an unknown group of receivers

Multicast routing utilizes Reverse Path Forwarding (RPF)
- Broadcast: floods packets out all interfaces except the incoming interface from the source, initially assuming every host on the network is part of the multicast group
- Prune: eliminates tree branches without multicast group members; cuts off transmission to LANs without interested receivers
- Selective Forwarding: requires its own integrated unicast routing protocol


Reverse Path Forwarding (RPF)


What is RPF?
A router forwards a multicast datagram only if it was received on the upstream interface to the source (i.e. it follows the distribution tree).

The RPF Check

- The routing table used for multicasting is checked against the source address in the multicast datagram.
- If the datagram arrived on the interface specified in the routing table for the source address, then the RPF check succeeds.
- Otherwise, the RPF check fails.


Reverse Path Forwarding


- Routers forward multicast datagrams received on the incoming interface on the distribution tree leading to the source
- Routers check the source IP address against their multicast routing tables (RPF check) to ensure that the multicast datagram was received on the specified incoming interface
- Note that a change in the unicast topology will not necessarily be reflected immediately in RPF; this depends on how frequently the RPF check is performed on an IP multicast stream. Every 5 seconds is the current Cisco default.


Reverse Path Forwarding (RPF)

- If the RPF check succeeds, the datagram is forwarded
- If the RPF check fails, the datagram is typically silently discarded
- When a datagram is forwarded, it is sent out each interface in the outgoing interface list
- Packet is never forwarded back out the RPF interface!


Multicast Forwarding
- Successful RPF Checks allow the datagram to be forwarded
  - Datagram is forwarded out all outgoing interfaces, but not out the RPF interface the datagram was received on
- Unsuccessful RPF Checks cause the datagram to be silently dropped


Reverse Path Forwarding (RPF)

Example: RPF Checking

[Figure: multicast distribution tree from Source 151.10.3.21; legend: Mcast Dist. Tree, Mcast Packets]

RPF Checks Fail: Packets arrived on wrong interface!


Multicast Forwarding: RPF Checking


Source floods network with multicast data Each router has a designated incoming interface (RPF interface) on which multicast data can be received from a given source Each router receives multicast data on one or more interfaces, but performs RPF check to prevent duplicate forwarding

Example: Router receives multicast data on two interfaces


1) The router performs the RPF Check on multicast data received on interface E0. The RPF Check succeeds because the data was received on the specified incoming interface from source 151.10.3.21; the data is forwarded through all outgoing interfaces on the multicast distribution tree.
2) The router performs the RPF Check on multicast data received on interface E1. The RPF Check fails because the data was not received on the specified incoming interface from source 151.10.3.21; the data is silently dropped.


Reverse Path Forwarding (RPF)

A closer look: RPF Check Fails


Multicast Packet from Source 151.10.3.21

RPF Check Fails!

Unicast Route Table
Network          Interface
151.10.0.0/16    S1
198.14.32.0/24   S0
204.1.16.0/24    E0

Figure: router with interfaces S0, S1, S2 and E0; the packet arrives on the wrong interface (marked X).

Packet Arrived on Wrong Interface! Discard Packet!


Multicast Forwarding: RPF Check Fails


Example: The router can only accept multicast data from source 151.10.3.21 on interface S1. The multicast data is silently dropped because it arrived on an interface (S0) other than the one specified by the RPF check.


Reverse Path Forwarding (RPF)

A closer look: RPF Check Succeeds


Multicast Packet from Source 151.10.3.21

RPF Check Succeeds!

Unicast Route Table
Network          Interface
151.10.0.0/16    S1
198.14.32.0/24   S0
204.1.16.0/24    E0

Figure: router with interfaces S0, S1, S2 and E0.

Packet Arrived on Correct Interface! Forward out all outgoing interfaces (i.e. down the distribution tree).


Multicast Forwarding: RPF Check Succeeds


Example: The router can only accept multicast data from source 151.10.3.21 on interface S1. The multicast data is forwarded out all outgoing interfaces on the distribution tree because it arrived on the incoming interface specified in the RPF check (S1).
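
The two RPF examples can be sketched in a few lines of Python. This is an illustration only, not how Cisco IOS implements the check: the function names are made up here, and the route table is the one shown on the slide, searched by longest prefix match.

```python
import ipaddress

# Unicast route table from the slide: network -> interface toward that network.
ROUTES = {
    "151.10.0.0/16": "S1",
    "198.14.32.0/24": "S0",
    "204.1.16.0/24": "E0",
}

def rpf_interface(source, routes=ROUTES):
    """Return the interface of the longest-prefix route covering `source`, or None."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, ifname in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, ifname)
    return best[1] if best else None

def rpf_check(source, arrival_interface, routes=ROUTES):
    """True when the packet arrived on the interface that leads back to the source."""
    expected = rpf_interface(source, routes)
    return expected is not None and expected == arrival_interface

print(rpf_check("151.10.3.21", "S0"))  # False: arrived on the wrong interface
print(rpf_check("151.10.3.21", "S1"))  # True: arrived on the RPF interface
```

A source with no matching route has no RPF interface, so in this sketch its packets always fail the check.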


TTL Thresholds
What is a TTL Threshold?
A TTL Threshold may be set on a multicast router interface to limit the forwarding of multicast traffic to outgoing packets with TTLs greater than the Threshold.

The TTL Threshold Check


1) All incoming IP packets first have their TTL decremented by one. If the result is <= zero, they are dropped.
2) If a multicast packet is to be forwarded out an interface with a non-zero TTL Threshold, its TTL is checked against the TTL Threshold. If the packet's TTL is < the specified threshold, it is not forwarded out the interface.


TTL-Thresholds
Non-zero multicast TTL-Thresholds may be set on any multicast-capable interface. IP multicast packets whose TTLs (after being decremented by one by normal router packet processing) are less than the TTL-Threshold on an outgoing interface will not be forwarded out that interface. A multicast TTL-Threshold of zero implies that NO threshold has been set.

TTL-Threshold Application
Frequently used to set up multicast boundaries to prevent unwanted multicast traffic from entering/exiting the network.


TTL Thresholds

A closer look: TTL-Thresholds


Multicast Packet w/TTL = 24
oilist:
S1: (TTL-Threshold = 16)
E0: (TTL-Threshold = 0)
S2: (TTL-Threshold = 64)


TTL-Threshold Example
In the above example, the interfaces have been configured with the following TTL-Thresholds:
S1: TTL-Threshold = 16
E0: TTL-Threshold = 0 (none)
S2: TTL-Threshold = 64
An incoming multicast packet is received on interface S0 with a TTL of 24. The TTL is decremented to 23 by the normal router IP packet processing. The outgoing interface list for this Group contains interfaces S1, E0 and S2. The TTL-Threshold check is performed on each outgoing interface as follows:
S1: TTL (23) > TTL-Threshold (16). FORWARD
E0: TTL (23) > TTL-Threshold (0). FORWARD
S2: TTL (23) < TTL-Threshold (64). DROP
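
The per-interface check above can be reproduced with a short Python sketch. It follows the two-step rule from the slide (decrement first, then drop on an interface only if the TTL is below a non-zero threshold); the function name and the dictionary layout are assumptions made for illustration.

```python
def forward_decisions(ttl_in, oilist):
    """Return {interface: "FORWARD" or "DROP"} for one multicast packet.

    oilist maps each outgoing interface to its TTL-Threshold (0 = no threshold).
    """
    ttl = ttl_in - 1  # normal router processing decrements the TTL first
    if ttl <= 0:
        return {ifname: "DROP" for ifname in oilist}
    decisions = {}
    for ifname, threshold in oilist.items():
        blocked = threshold != 0 and ttl < threshold
        decisions[ifname] = "DROP" if blocked else "FORWARD"
    return decisions

print(forward_decisions(24, {"S1": 16, "E0": 0, "S2": 64}))
# {'S1': 'FORWARD', 'E0': 'FORWARD', 'S2': 'DROP'}
```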


TTL Threshold Boundaries

Company ABC

Eng

Mkt

TTL-Threshold = 16

TTL-Threshold = 128


TTL-Threshold Boundaries
TTL-Thresholds may be used as boundaries around portions of a network to prevent the entry/exit of unwanted multicast traffic. This requires multicast applications to transmit their multicast traffic with an initial TTL value set so as to not cross the TTL-Threshold boundaries. In the example above, the Engineering or Marketing departments can prevent department-related multicast traffic from leaving their networks by using a TTL of 15 for their multicast sessions. Similarly, Company ABC can prevent private multicast traffic from leaving its network by using a TTL of 127 for its multicast sessions.


Administrative Boundaries

Administrative Boundary = 239.0.0.0/8

239.x.x.x multicasts
Serial0

239.x.x.x multicasts
Serial1

Configured using the ip multicast boundary <acl> interface command


Administrative Boundaries
Administratively-scoped multicast address ranges may also be used as boundaries around portions of a network to prevent the entry/exit of unwanted multicast traffic. This requires multicast applications to transmit their multicast traffic with a group address that falls within the Administrative address range so that it will not cross the Administrative boundaries. In the example above, the entire Administratively-Scoped address range (239.0.0.0/8) is being blocked from entering or leaving the router via interface Serial0. This is often done at the border of a network where it connects to the Internet so that potentially sensitive company Administratively-Scoped multicast traffic cannot leave the network. (Nor can it enter the network from the outside.) Administrative multicast boundaries can be configured in Cisco IOS by the use of the ip multicast boundary <acl> interface command.
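
The effect of such a boundary can be modeled in Python as a simple membership test on the group address. This is a sketch of the filtering decision only, not of the IOS command or its ACL syntax; the table layout and the function name are hypothetical.

```python
import ipaddress

# Assumption: the boundary on Serial0 denies the whole administratively scoped range.
BOUNDARY = {"Serial0": [ipaddress.ip_network("239.0.0.0/8")]}

def may_cross(group, interface, boundary=BOUNDARY):
    """True if traffic for `group` may pass through `interface` in either direction."""
    addr = ipaddress.ip_address(group)
    return not any(addr in net for net in boundary.get(interface, []))

print(may_cross("239.1.2.3", "Serial0"))      # False: inside the blocked range
print(may_cross("224.2.127.254", "Serial0"))  # True: outside the blocked range
```

An interface without a configured boundary passes every group in this model.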


Administrative Boundaries

Company ABC

LA Campus

NYC Campus

239.128.0.0/16

239.129.0.0/16


Administrative Boundaries
Administratively-scoped multicast address ranges are generally used in more than one location. In the example above, the Administratively-Scoped address range (239.128.0.0/16) is being used by both the LA campus and the NYC campus. Multicast traffic originated in this address range will remain within each respective campus and will not flow onto the WAN that exists between the two campuses. This sort of configuration is often used so that each campus can source high-rate multicasts on the local campus and not worry about them being accidentally leaked into the WAN and causing congestion on the slower WAN links. In addition to the 239.128.0.0/16 range, the entire company network has an Administrative boundary for the 239.129.0.0/16 multicast range. This is so that multicasts in these ranges do not leak into the Internet. Note: The Admin.-Scoped address range (239.0.0.0/8) is similar to the 10.0.0.0 unicast address range in that it is reserved and is not assigned for use in the Internet.


Types of Multicast Protocols


Dense-mode
  Uses Push Model
  Traffic Flooded throughout network
  Pruned back where it is unwanted
  Flood & Prune behavior (typically every 3 minutes)

Sparse-mode
  Uses Pull Model
  Traffic sent only to where it is requested
  Explicit Join behavior

Dense-mode multicast protocols


Initially flood/broadcast multicast data to entire network, then prune back paths that don't have interested receivers

Sparse-mode multicast protocols


Assume that no receivers are interested unless they explicitly ask for the traffic


Multicast Protocol Review
Currently, there are 4 multicast routing protocols:

DVMRPv3 (Internet-draft)
  DVMRPv1 (RFC 1075) is obsolete and was never used.
MOSPF (RFC 1584) Proposed Standard
PIM-DM (Internet-draft)
CBT (Internet-draft)
PIM-SM (RFC 2362) Proposed Standard

IETF status of Multicast Protocols


DVMRPv1 is obsolete and was never used. DVMRPv2 is an old Internet-Draft and is the current implementation used throughout the Mbone. DVMRPv3 is the current Internet-Draft, although it has not been completely implemented by most vendors.
MOSPF is currently at Proposed Standard status. However, most members of the IETF IDMR working group doubt that MOSPF will scale to any degree and are therefore uncomfortable with declaring MOSPF a standard for IP Multicasting. (Even the author of MOSPF, J. Moy, has been quoted in an RFC as saying that more work needs to be done to determine the scalability of MOSPF.)
PIM-DM is in Internet-Draft form, and work continues to move it into an RFC.
CBT is also in Internet-Draft form. While it has been through three different and incompatible revisions, it is not enjoying significant usage, nor is it a primary focus of the IETF IDMR working group.
PIM-SM moved to Proposed Standard in early 2000. Much of the effort in the IETF towards a working multicast protocol is focused on PIM-SM.


Dense-Mode Protocols

DVMRP - Distance Vector Multicast Routing Protocol
MOSPF - Multicast OSPF
PIM-DM - Protocol Independent Multicasting (Dense Mode)


DVMRP Overview
Dense Mode Protocol
Distance vector-based
  Similar to RIP
  Infinity = 32 hops
  Subnet masks in route advertisements

DVMRP Routes used:
  For RPF Check
  To build Truncated Broadcast Trees (TBTs)
    Uses special Poison-Reverse mechanism

Uses Flood and Prune operation
  Traffic initially flooded down TBTs
  TBT branches are pruned where traffic is unwanted.
  Prunes periodically time out, causing reflooding.

Distance Vector Multicast Routing Protocol


Builds a distribution tree per source network based on the best metric (hop-count) back towards the source network. Infinity = 32 hops.
A Poison Reverse metric is used by DVMRP routers to signal their upstream neighbor that they are downstream and expect to receive traffic from a source network via the upstream router. Poison Reverse is denoted by adding Infinity (32) to the received metric and then sending it back to the router from which it was originally received.
When a Poison Reverse is received for a source network, the interface over which it was received is placed on the outgoing interface list for the source network.
Multicast traffic is flooded out all interfaces on the outgoing interface list (i.e. down the distribution tree for the source network).
Downstream neighbors send Prunes up the distribution tree for multicast traffic for which they have no group members.


DVMRP Source Trees


Truncated Broadcast Trees are built using the best DVMRP metrics back to the source network.
The lowest IP address is used in case of a tie. (Note: IP Address of D < C < B < A)

Figure: Source Network S1 and the mrouted routers (labels include A, B, E, X and Y), with route metrics (1, 2, 3) and Poison Reverse metrics (33, 34, 35) marked on the links. Legend: n = route for source network of metric n; m = Poison Reverse (metric + infinity) sent to the upstream parent router, on which the router depends to receive traffic for this source; the resulting Truncated Broadcast Tree for the source network is also marked.

DVMRP Source Trees


DVMRP builds its Source Trees utilising the concept of Truncated Broadcast Trees. The basic definition of a Truncated Broadcast Tree (TBT) is as follows: a Truncated Broadcast Tree (TBT) for source subnet S1 represents a shortest path spanning tree rooted at subnet S1 to all other routers in the network.
In DVMRP, the abstract notion of the TBTs for all sub-networks is built by the exchange of periodic DVMRP routing updates between all DVMRP routers in the network. Just like its unicast cousin, RIPv2, DVMRP updates contain network prefixes/masks along with route metrics (in hop-counts) that describe the cost of reaching a particular subnet in the network.
Unlike RIPv2, a downstream DVMRP router makes use of a special Poison-Reverse advertisement to signal an upstream router that this link is on the TBT for source subnet S1. This Poison-Reverse (PR) is created by adding 32 to the advertised metric and sending it back to the upstream router.

Example DVMRP TBT for network S1:


In the above example, DVMRP updates are being exchanged for source network S1. Routers A and B both advertise a metric of 1 (hop) to reach network S1 to routers C and D. In the case of router D, the advertisement from B is the best (only) route to source network S1, which causes router D to send back a PR advertisement (metric = 33) to B. This tells router B that router D is on the TBT for source network S1. In the case of router C, it received an advertisement from both A and B with the same metric. It breaks the tie using the lowest IP address and therefore sends a PR advertisement to router B. B now knows it has two branches of the TBT, one to router C and one to router D. These DVMRP updates flow throughout the entire network, causing each router to send PR advertisements to its upstream DVMRP neighbor on the TBT for source network S1.
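
Router C's decision in this example can be sketched in Python. The sketch only models the selection rule described above (best metric, lowest IP address as tie-breaker) and the Poison-Reverse arithmetic; the IP addresses are invented for illustration, chosen so that B's address is lower than A's.

```python
import ipaddress

INFINITY = 32  # DVMRP infinity, in hops

def choose_upstream(advertisements):
    """Pick the upstream neighbor for one source network.

    advertisements maps neighbor IP address -> advertised metric.
    The best (lowest) metric wins; the lowest IP address breaks ties.
    """
    return min(advertisements,
               key=lambda ip: (advertisements[ip], ipaddress.ip_address(ip)))

def poison_reverse_metric(advertised_metric):
    """Metric sent back to the upstream neighbor: advertised metric + infinity."""
    return advertised_metric + INFINITY

# Router C hears metric 1 from both A and B (hypothetical addresses, B < A).
adverts_at_c = {"10.0.0.2": 1, "10.0.0.1": 1}  # A = 10.0.0.2, B = 10.0.0.1
upstream = choose_upstream(adverts_at_c)
print(upstream, poison_reverse_metric(adverts_at_c[upstream]))  # 10.0.0.1 33
```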


DVMRP Source Trees

Forwarding onto Multi-access Networks


Both B and C have routes to network S1.
To avoid duplicates, only one router can be the Designated Forwarder for network S1.
The router with the best metric is elected as the Designated Forwarder.
The lowest IP address is used as the tie-breaker.
Router C wins in this example. (Note: IP Address of C < B)

Figure: Network S1 and mrouted routers A, B and C, with route metrics 1 and 2 marked on the links. Legend: n = route advertisement for network S1 of metric n.



Forwarding onto Multi-access Networks


When two or more routers share a common Multi-access network, only one can be the Designated Forwarder, which is responsible for forwarding a source network's traffic onto the Multi-access network; otherwise duplicate packets will be generated. The Designated Forwarder is selected based on the best route metric back to the source network (with the lowest IP address used as a tie-breaker). In the example above, both Router B and Router C share a common Multi-access network and each has a route to network S1. Since both have the same metric to network S1, the lowest IP address is used to break the tie (in this case, Router C wins).
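
A minimal Python sketch of this election follows. The tuple layout, the addresses and the equal metrics of 2 are assumptions for illustration; only the ordering rule (best metric first, then lowest IP address) comes from the text.

```python
import ipaddress

def elect_designated_forwarder(candidates):
    """candidates: list of (router_name, ip_on_shared_net, metric_to_source).

    Returns the name of the router with the best metric,
    using the lowest IP address as the tie-breaker.
    """
    best = min(candidates, key=lambda c: (c[2], ipaddress.ip_address(c[1])))
    return best[0]

# Hypothetical addresses, chosen so that C's address is lower than B's.
routers = [("B", "192.0.2.2", 2), ("C", "192.0.2.1", 2)]
print(elect_designated_forwarder(routers))  # C
```

If B had a strictly better metric than C, B would win regardless of the addresses.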


DVMRP Source Trees


Resulting Truncated Broadcast Tree for Source Network S1

Figure: Source Network S1 and the mrouted routers (labels include A, B, X and Y), with the S1 Truncated Broadcast Tree marked.

Example DVMRP TBT for network S1 (cont.)


Once the DVMRP network has converged and all PR advertisements have been sent up the TBT toward source network S1, the S1 TBT has been built. The drawing above shows the S1 TBT that resulted from the DVMRP route update exchanges on the previous page. Notice that this is a shortest-path spanning tree that is rooted at source network S1 and spans all routers in the network. If a multicast source were to now go active in network S1, the DVMRP routers in the network would initially flood this source's traffic down the S1 TBT.


DVMRP Source Trees


Each Source Network has its Own Truncated Broadcast Tree

Figure: Source Network S2 and the same mrouted routers (labels include A, B, X and Y), with the S2 Truncated Broadcast Tree marked. (Note: IP Address of D < C < B < A)

Every source network has its own TBT


In the drawing above, the TBT for network S2 is shown. This TBT would also be created by the exchange of DVMRP route updates and by PR advertisements sent by all routers in the network toward network S2. It is important to remember that these TBTs simply exist in the form of PR advertisements in the DVMRP routing tables of the routers in the network and, as such, there is one TBT for every source network in the DVMRP network.

Advantages of TBTs
The advantage of TBTs is that the initial flooding of multicast traffic throughout the DVMRP network is limited to flowing down the branches of the TBT. This ensures that there are no duplicate packets sent as a result of parallel paths in the network.

Disadvantages of TBTs
The disadvantage of using TBTs is that it requires separate DVMRP routing information to be exchanged throughout the entire network. (Other multicast protocols such as PIM make use of the existing unicast routing table and do not have to exchange additional multicast routing data.) Additionally, because DVMRP is based on a RIP model, it has all of the problems associated with a Distance-Vector protocol, including count-to-infinity, holddown and periodic updates. One has to ask oneself, "Would I recommend someone build a unicast network based on RIP today?" The answer is, of course, no; protocols like OSPF, IS-IS, and EIGRP have long since superseded RIP in robustness and scalability. The same is true of DVMRP.


DVMRP Flood & Prune


Initial Flooding of (S, G) Multicast Packets Down Truncated Broadcast Tree

Figure: Source S, the mrouted routers (labels include A, B, X and Y) and Receiver 1 (Group G). Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.

DVMRP Example
In this example we see source S has begun to transmit multicast traffic to group G. Initially, the traffic (shown by the solid arrows) is flooded to all routers in the network down the Truncated Broadcast Tree (indicated by the dashed arrows) and is reaching Receiver 1.


DVMRP Flood & Prune


Router C is a Leaf Node so it sends an (S, G) Prune Message.
Router B prunes the interface.

Figure: Source S, the mrouted routers (labels include A, B, X and Y) and Receiver 1 (Group G), with the Prune message marked. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.

DVMRP Example (cont.)


At this point, we see that router C is a leaf node on the TBT and has no need for the traffic. Therefore, it sends a DVMRP (S, G) Prune message up the TBT to router B to shut off the unwanted flow of traffic. Router B receives this (S, G) Prune message and shuts off the flow of (S, G) traffic to router C.


DVMRP Flood & Prune


Routers X and Y are also Leaf Nodes so they send (S, G) Prune Messages.
Router E prunes the interface.

Figure: Source S, the mrouted routers (labels include A, B, X and Y) and Receiver 1 (Group G), with the two Prune messages marked. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.

DVMRP Example (cont.)


Both routers X and Y are also leaf nodes that have no need for the (S, G) traffic (i.e. they have no directly connected receivers) and therefore send (S, G) Prunes up the TBT to router E. Once router E has received (S, G) Prune messages from all DVMRP neighbours on the subnet, it prunes the Ethernet interface connecting to routers X and Y.


DVMRP Flood & Prune


Router E is now a Leaf Node; it sends an (S, G) Prune message.
Router D prunes the interface.

Figure: Source S, the mrouted routers (labels include A, B, X and Y) and Receiver 1 (Group G), with the Prune message marked. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.

DVMRP Example (cont.)


At this point, all of router E's downstream interfaces on the TBT have been pruned and it no longer has any need for the (S, G) traffic. As a result, it too sends an (S, G) Prune up the TBT to router D. When router D receives this (S, G) Prune, it prunes the interface and shuts off the flow of (S, G) traffic to router E.


DVMRP Flood & Prune


Final Pruned State

Figure: Source S, the mrouted routers (labels include A, B, X and Y) and Receiver 1 (Group G). Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.

DVMRP Example (cont.)


In the drawing above, we see the final pruned state of the TBT, which leaves traffic flowing to the receiver. However, because DVMRP is a flood and prune protocol, these pruned branches of the TBT will time out (typically after 2 minutes) and (S, G) traffic will once again flood down all branches of the TBT. This will again trigger the sending of (S, G) Prune messages up the TBT to prune off the unwanted traffic.
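
The prune sequence walked through in this example can be approximated with a short recursive sketch in Python. The tree below is a hypothetical reading of the drawings (B feeds C and D, D feeds E, E feeds X and Y, and the receiver sits behind D); the function name and data layout are likewise assumptions. Prunes are listed children-before-parents, which is not necessarily the order in which they would appear on the wire.

```python
def prune_messages(tree, root, has_receivers):
    """Return the (child, parent) Prune messages, children before parents.

    tree maps a router to its downstream neighbors on the TBT.
    A router prunes once it has no local receivers and every
    downstream neighbor has already pruned.
    """
    prunes = []

    def wants_traffic(router):
        children = tree.get(router, [])
        # Visit every child so that each subtree gets a chance to prune.
        child_needs = [wants_traffic(child) for child in children]
        for child, needed in zip(children, child_needs):
            if not needed:
                prunes.append((child, router))
        return router in has_receivers or any(child_needs)

    wants_traffic(root)
    return prunes

# Hypothetical TBT below router B; the receiver is assumed to sit behind D.
TBT = {"B": ["C", "D"], "D": ["E"], "E": ["X", "Y"]}
print(prune_messages(TBT, "B", {"D"}))
# [('X', 'E'), ('Y', 'E'), ('E', 'D'), ('C', 'B')]
```

When the prune state times out, the same computation simply runs again after the next flood.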


DVMRP Evaluation

Widely used on the MBONE (being phased out)
Significant scaling problems
  Slow convergence: RIP-like behavior
  Significant amount of multicast routing state information stored in routers: (S, G) everywhere
  No support for shared trees
  Maximum number of hops < 32

Not appropriate for large scale production networks
  Due to flood and prune behavior
  Due to its poor scalability


Appropriate for a large number of densely distributed receivers located in close proximity to the source
Widely used, oldest multicast routing protocol
Significant scaling problems
  Protocol limits maximum number of hops to 32 and requires a great deal of multicast routing state information to be retained

Not appropriate for...

  Few interested receivers (assumes everyone wants data initially)
  Groups sparsely represented over WAN (floods frequently)


MOSPF (RFC 1584)


Extension to OSPF unicast routing protocol
  OSPF: Routers use link state advertisements to understand all available links in the network (route messages along least-cost paths)
  MOSPF: Includes multicast information in OSPF link state advertisements to construct multicast distribution trees (each router maintains an up-to-date image of the topology of the entire network)

Group membership LSAs are flooded throughout the OSPF routing domain so MOSPF routers can compute outgoing interface lists
Uses Dijkstra algorithm to compute shortest-path tree
  Separate calculation is required for each (S Net, G) pair

Multicast Extension to OSPF (RFC 1584)


Extension to OSPF unicast routing protocol; requires OSPF as underlying unicast routing protocol.

Group Membership LSAs


MOSPF uses a new type of OSPF LSA called Group-Membership LSA to advertise the existence of Group members on networks. Group-Membership LSAs are periodically flooded throughout an area in the same fashion as other OSPF LSAs.

Dijkstra Algorithm
Uses the Dijkstra algorithm to compute a shortest-path tree for every source-network/group pair!


MOSPF Membership LSAs


Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). Membership LSAs are flooded within Area 1 and within Area 2. MA and MB mark members of Group A and Group B.

Membership LSA Flooding Example


In this example, Area 1 has members of both Group A and B while Area 2 has members of Group A only. Routers with directly connected members originate Membership LSAs announcing the existence of these members on their networks. These LSAs are flooded throughout the area. Notice that these Group Membership LSAs do not travel between Area 1 and Area 2. (This will be addressed later.)


MOSPF Intra-Area Traffic


Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). (S1, B) traffic reaches the Group B members (MB) in Area 1; (S2, A) traffic reaches the Group A members (MA) in Area 2. The Group A member in Area 1 is not receiving (S2, A) traffic.

Intra-Area Multicast
Once all routers within the area have learned where all members are in the network topology, it is possible to construct Source-network trees for multicast traffic forwarding.

Example
In the above example, Source S1 in Area 1 begins sending multicast traffic to Group B. As this data reaches the routers in the area, each runs a Dijkstra calculation and computes a Shortest Path Tree that is rooted at the network for S1 and that spans all the members of Group B. The results of these calculations are used to forward the (S1, B) traffic as seen in Area 1 above.
In Area 2, Source S2 begins sending multicast traffic to Group A. Again, the routers in the area use the Group-Membership information in their MOSPF database to run a Dijkstra calculation for the source network where S2 resides and create a Shortest Path Tree for this traffic flow. The results are then used to forward (S2, A) traffic as shown.
Notice that the routers in Area 2 are not aware of the member of Group A in Area 1 because Membership LSAs are not flooded between these two areas. This Inter-Area traffic flow is handled by another mechanism that is described in the next few pages.
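
The per-(source network, group) calculation can be sketched in Python as a Dijkstra run followed by pruning the tree back to the branches that lead to group members. This is a simplified illustration, not the RFC 1584 procedure: links are treated as symmetric, and the router names and costs are invented.

```python
import heapq

def shortest_path_tree(links, source):
    """Dijkstra: return {router: parent} for the tree rooted at `source`."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    dist, parent = {source: 0}, {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], parent[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    return parent

def forwarding_branches(links, source, members):
    """Keep only the tree branches that lead to routers with group members."""
    parent = shortest_path_tree(links, source)
    branches = set()
    for member in members:
        node = member
        while node in parent and parent[node] is not None:
            branches.add((parent[node], node))
            node = parent[node]
    return branches

# Hypothetical area: router names and link costs are made up for illustration.
LINKS = [("R1", "R2", 1), ("R1", "R3", 1), ("R2", "R4", 1), ("R3", "R4", 5)]
print(sorted(forwarding_branches(LINKS, "R1", {"R4"})))
# [('R1', 'R2'), ('R2', 'R4')]
```

Every distinct source network and every distinct group calls for a separate run of `forwarding_branches`, which is the scaling concern raised for MOSPF.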


MOSPF Inter-Area Traffic


Wildcard Receivers pull traffic from all sources in the area.

Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). Each MABR sets the Wildcard Receiver flag (*, *) toward its area. Sources (S1, B) in Area 1 and (S2, A) in Area 2 are shown; MA and MB mark members of Group A and Group B.

Wildcard Receivers
In order to get multicast traffic to flow between Areas, the concept of Wildcard Receivers is used by MOSPF Area Border Routers (MABR). Wildcard Receivers set the Wildcard Receiver flag in the Router LSAs that they inject into the Area. This flag is equivalent to a wildcard Group Membership LSA that effectively says, "I have a directly connected member for every group."


MOSPF Inter-Area Traffic


Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). (S1, B) traffic in Area 1 is pulled to MABR1 and (S2, A) traffic in Area 2 is pulled to MABR2. MA and MB mark members of Group A and Group B.

Multicast Area Border Routers (MABR)


Multicast Area Border Routers (i.e. routers that connect an area to the backbone area, Area 0) always set the Wildcard Receiver flag in the Router LSAs that they inject into a non-backbone area. This causes the MABR to always be added as a branch of the Shortest Path Tree of any active source in the non-backbone area. In the above example, this has resulted in MABR1 being added to the SPT for (S1, B) traffic and MABR2 being added to the SPT for (S2, A) traffic. This pulls the source traffic in the area to the border router so that it can be sent into the backbone area.


MOSPF Inter-Area Traffic


MABR routers inject Summary Membership LSAs into Area 0.

Figure: MABR1 injects a Summarized Membership LSA (GA, GB) for Area 1 into Area 0; MABR2 injects a Summarized Membership LSA (GA) for Area 2. Membership LSAs are flooded within Area 1 and Area 2. Sources (S1, B) and (S2, A) are shown; MA and MB mark members of Group A and Group B.

Summary Membership LSAs


In addition to Group Membership LSAs, MOSPF also defines a new Summary Membership LSA that is used to summarise an area's group membership information. Summary Membership LSAs are injected into the backbone area, Area 0, so that routers in the backbone area are made aware of the existence of members in other areas.

Inter-Area Traffic Example


In the above example, the existence of members of groups A and B in Area 1 is being injected into the backbone area by MABR1 via a Summary Membership LSA. In addition, MABR2 is injecting a Summary Membership LSA into the backbone area that indicates that Area 2 has members of group A. Routers in the backbone area now use the information in these Summary Membership LSAs in their Dijkstra calculations to know which MABRs to include in the backbone SPT for which sources. (See next drawing.)
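
The summarisation step can be pictured in Python as a set union per area, followed by a lookup of which MABRs advertise a given group. This models only the information content described above, not the LSA encoding; the router names inside each area and both function names are hypothetical.

```python
def summary_membership(area_membership_lsas):
    """Collapse per-router Group-Membership LSAs into one set of groups for the area.

    area_membership_lsas maps originating router -> set of groups with local members.
    """
    groups = set()
    for router_groups in area_membership_lsas.values():
        groups |= set(router_groups)
    return groups

def mabrs_for_group(summaries, group):
    """MABRs that backbone routers must include on the tree for `group`."""
    return sorted(mabr for mabr, groups in summaries.items() if group in groups)

# Router names inside each area are hypothetical.
area1 = {"r1": {"A"}, "r2": {"B"}, "r3": {"B"}}
area2 = {"r4": {"A"}, "r5": {"A"}}
summaries = {"MABR1": summary_membership(area1), "MABR2": summary_membership(area2)}
print(mabrs_for_group(summaries, "A"), mabrs_for_group(summaries, "B"))
# ['MABR1', 'MABR2'] ['MABR1']
```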


MOSPF Inter-Area Traffic


Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). Sources (S1, B) in Area 1 and (S2, A) in Area 2 are shown; MA and MB mark members of Group A and Group B.

Inter-Area Traffic Example


The combination of the Wildcard Receiver mechanism and the injection of Summary Membership LSAs into the backbone area permits the SPT for (S2, A) traffic to be extended across the backbone area. (S2, A) traffic is now flowing from Area 2 into the backbone area (Area 0) via MABR2. The routers in the backbone are forwarding this traffic to MABR1, which is sending the traffic into Area 1. Routers inside Area 1 run the Dijkstra calculation on (S2, A) traffic and construct an (S2, A) SPT inside Area 1 to route the traffic to members of group A as shown above.


MOSPF Inter-Area Traffic


Unnecessary traffic still flowing to the MABR Routers!!

Figure: Area 0 with MABR1 (Area 1) and MABR2 (Area 2). Both MABRs set the Wildcard Receiver flag (*, *), so (S1, B) and (S2, A) traffic is pulled to them even though no members are present.

Unnecessary Traffic Flows


In the case where there are no members for a multicast group, traffic is still pulled to the MABRs as a result of the Wildcard Receiver mechanisms. This can result in bandwidth being consumed inside of the area unnecessarily.


MOSPF Inter-Domain Traffic


Figure: Area 0 with an MASBR connecting to an External AS, MABR1 (Area 1) and MABR2 (Area 2). MABR1 injects a Summarized Membership LSA (GA, GB) and MABR2 injects a Summarized Membership LSA (GA) into Area 0. Membership LSAs are flooded within each area; MA and MB mark members of Group A and Group B.

Inter-Domain Traffic
Inter-domain multicast traffic flow basically follows the same mechanisms that were used for Inter-Area traffic flows. Summary Membership LSAs inform the routers in the backbone which MABRs have members of which groups.


MOSPF Inter-Domain Traffic


Figure: (S1, A) and (S2, B) traffic arriving from the External AS through the MASBR into Area 0 and on to MABR1 (Area 1) and MABR2 (Area 2). MA and MB mark members of Group A and Group B.

Inter-domain Traffic (cont.)


When traffic arrives from outside the domain via the Multicast AS Border Router (MASBR), this traffic is forwarded across the backbone to the MABRs as necessary based on the Summary Membership LSAs that they have injected into the area. This causes the multicast traffic for group A and B arriving from outside the AS to be forwarded as shown above.

MOSPF Inter-Domain Traffic

[Figure: Area 0 contains the MASBR, which connects to an External AS. Wildcard Receiver flags (*, *) pull (S1, B) and (S2, A) traffic through MABR1 (Area 1) and MABR2 (Area 2) into Area 0 and on to the MASBR. Callout: "Unnecessary traffic may flow all the way to the MASBR Router!!"]

Inter-Domain Traffic (cont.)


MASBRs also use the Wildcard Receiver mechanism to automatically pull all source traffic in the area to them so that they can forward this traffic as needed to the outside world. In the example above, the Wildcard Receiver mechanism is causing the (S1,B) and (S2,A) traffic to be pulled into the backbone area and from there to the MASBR so that it can be forwarded to the outside world.

MOSPF Evaluation

Does not flood multicast traffic everywhere to create state
  Uses LSAs and the link-state database
Protocol dependent: works only in OSPF-based networks
Significant scaling problems
  Dijkstra algorithm run for EVERY multicast (S Net, G) pair!
  Dijkstra algorithm rerun when:
    Group membership changes
    Line flaps
Does not support shared trees
Not appropriate for
  General-purpose multicast networks where the number of senders may be quite large
  IP/TV (every IP/TV client is a multicast source)

Appropriate for use within a single routing domain
Requires OSPF as the underlying unicast routing protocol
Significant scaling problems
  Frequent flooding of link-state/membership information hinders performance
  Router CPU demands grow rapidly to keep track of current network topology (source-group pairs)
  Dijkstra algorithm must be run for every single multicast source
  Volatility of multicast groups can be lethal

Not appropriate for...
  Networks with unstable links (too much Dijkstra computation is required for each source)
  Many simultaneously active source-network/group pairs (routers must maintain too much information relating to the entire network topology)
  Ubiquitous Multicast Applications permit any user in the network to create a new source-group pair. There is no way for the Network Administrator to control the number of source-network/group pairs in the network. Therefore, the Network Administrator has little control to prevent MOSPF from melting down the network as multicast applications become popular with the users!

PIM-DM

Protocol Independent
  Supports all underlying unicast routing protocols including: static, RIP, IGRP, EIGRP, IS-IS, BGP, and OSPF
Uses reverse path forwarding
  Floods network and prunes back based on multicast group membership
  Assert mechanism used to prune off redundant flows
Appropriate for...
  Smaller implementations and pilot networks

Protocol Independent Multicast (PIM) Dense-mode (Internet-draft)

Uses Reverse Path Forwarding (RPF) to flood the network with multicast data, then prunes back paths based on uninterested receivers
Interoperates with DVMRP

Appropriate for
  Small implementations and pilot networks

PIM-DM Flood & Prune

Initial Flooding

[Figure: Multicast packets from the Source are flooded to every router in the network, including the router serving the Receiver. Callout: "(S, G) State created in every router in the network!"]

PIM-DM Initial Flooding


PIM-DM is similar to DVMRP in that it initially floods multicast traffic to all parts of the network. However, unlike DVMRP, which pre-builds a Truncated Broadcast Tree that is used for initial flooding, PIM-DM initially floods traffic out ALL non-RPF interfaces where there is:
  Another PIM-DM neighbor, or
  A directly connected member of the group
The reason that PIM-DM does not use Truncated Broadcast Trees to pre-build a spanning tree for each source network is that this would require running a separate routing protocol, as DVMRP does. (At the very least, some sort of Poison-Reverse messages would have to be sent to build the TBT.) Instead, PIM-DM uses other mechanisms to prune back the traffic flows and build Source Trees.

Initial Flooding Example


In this example, multicast traffic being sent by the source is flooded throughout the entire network. As each router receives the multicast traffic via its RPF interface (the interface in the direction of the source), it forwards the multicast traffic to all of its PIM-DM neighbors. Note that this results in some traffic arriving via a non-RPF interface, as in the case of the two routers in the center of the drawing. (Packets arriving via the non-RPF interface are discarded.) These non-RPF flows are normal for the initial flooding of data and will be corrected by the normal PIM-DM pruning mechanism.
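The forwarding decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Interface` class, the interface names, and the `flood_targets` function are hypothetical and do not come from any router implementation.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    # Hypothetical per-interface view used only for this sketch.
    name: str
    has_pim_neighbor: bool = False
    has_local_member: bool = False


def flood_targets(arrival_if, rpf_if, interfaces):
    """Return the interface names a packet is flooded out of.

    A packet that arrives on any interface other than the RPF interface
    fails the RPF check and is discarded (empty list).
    """
    if arrival_if != rpf_if:
        return []
    return [
        i.name
        for i in interfaces
        if i.name != rpf_if and (i.has_pim_neighbor or i.has_local_member)
    ]


interfaces = [
    Interface("S0", has_pim_neighbor=True),   # RPF interface toward the source
    Interface("E0", has_local_member=True),   # directly connected group member
    Interface("E1", has_pim_neighbor=True),   # another PIM-DM neighbor
    Interface("E2"),                          # no neighbor, no member
]

print(flood_targets("S0", "S0", interfaces))  # ['E0', 'E1']
print(flood_targets("E1", "S0", interfaces))  # []
```

The second call models the non-RPF arrival mentioned in the example: the packet is dropped rather than forwarded.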

PIM-DM Flood & Prune

Pruning unwanted traffic

[Figure: Multicast packets flow from the Source toward the Receiver while Prune messages are sent on links where the traffic is not wanted.]

Pruning unwanted traffic


In the example above, PIM Prunes (denoted by the dashed arrows) are sent to stop the flow of unwanted traffic. Prunes are sent on the RPF interface when the router has no downstream members that need the multicast traffic. Prunes are also sent on non-RPF interfaces to shut off the flow of multicast traffic that is arriving via the wrong interface (i.e. traffic arriving via an interface that is not on the shortest path to the source). An example of this can be seen at the second router from the receiver near the center of the drawing. Multicast traffic is arriving via a non-RPF interface from the router above (in the center of the network), which results in a Prune message.

PIM-DM Flood & Prune

Results after Pruning

[Figure: Multicast packets now flow only along the path from the Source to the Receiver. Callouts: "(S, G) State still exists in every router in the network!" and "Flood & Prune process repeats every 3 minutes!!!"]

Results after Pruning


In the final drawing in our example shown above, multicast traffic has been pruned off of all links except where it is necessary. This results in a Shortest Path Tree (SPT) being built from the Source to the Receiver. Even though the flow of multicast traffic is no longer reaching most of the routers in the network, (S, G) state still remains in ALL routers in the network. This (S, G) state will remain until the source stops transmitting. In PIM-DM, Prunes expire after three minutes. This causes the multicast traffic to be re-flooded to all routers just as was done in the Initial Flooding drawing. This periodic (every 3 minutes) Flood and Prune behavior is normal and must be taken into account when the network is designed to use PIM-DM.

PIM-DM Assert Mechanism

[Figure: Two routers each receive an incoming multicast packet on S0 (successful RPF check) and have E0 on a shared LAN. Step 1 marks the packet arriving on E0; step 2 marks the Assert <distance, metric> messages each router sends on E0.]

1  Routers receive packet on an interface in their oilist!!
     Only one router should continue sending to avoid duplicate packets.
2  Routers send PIM Assert messages
     Compare distance and metric values
     Router with best route to source wins
     If metric & distance equal, highest IP address wins
     Losing router stops sending (prunes interface)

PIM Assert Mechanism


The PIM Assert mechanism is used to shut off duplicate flows onto the same multiaccess network. Routers detect this condition when they receive an (S, G) packet via a multiaccess interface that is in the (S, G) OIL. This causes the routers to send Assert messages. Assert messages contain the Admin. Distance and metric to the source combined into a single assert value. (The Admin. Distance is the high-order part of this assert value.) Routers compare these values to determine who has the best path (lowest value) to the source. (If both values are the same, the highest IP address is used as the tie breaker.) The losing router (the one with the higher value) prunes its interface while the winning router continues to forward multicast traffic onto the LAN segment.
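The comparison rule described above can be sketched as follows. This is a minimal illustration, not router code: the `assert_winner` function and the sample addresses, distances, and metrics are hypothetical. Comparing the tuple (distance, metric) gives the same ordering as a single assert value with the Admin. Distance in the high-order part.

```python
import ipaddress


def assert_winner(candidates):
    """Pick the Assert winner from (ip, admin_distance, metric) tuples.

    Lower (distance, metric) wins; if both are equal, the highest
    IP address wins the tie.
    """
    def sort_key(candidate):
        ip, distance, metric = candidate
        # Negate the address so that min() prefers the highest IP on a tie.
        return (distance, metric, -int(ipaddress.IPv4Address(ip)))

    return min(candidates, key=sort_key)[0]


# Lower Admin. Distance wins even though its metric is numerically larger.
print(assert_winner([("10.1.1.1", 110, 20), ("10.1.1.2", 90, 3000)]))  # 10.1.1.2

# Equal distance and metric: the highest IP address wins.
print(assert_winner([("10.1.1.1", 110, 20), ("10.1.1.2", 110, 20)]))   # 10.1.1.2
```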

Potential PIM-DM Route Loop

Normal Steady-State Traffic Flow

[Figure: Traffic flows from the Source through redundant routers via their RPF interfaces. One interface is marked "Interface previously Pruned by Assert Process".]

Potential PIM-DM Route Loops


The non-deterministic behavior of PIM-DM along with its flood-and-prune mechanism can sometimes result in serious network outages including blackholes and multicast route loops. The network in the above example is a simplified version of a frequently used network design whereby multiple routers are used to provide redundancy in the network. Under normal steady-state conditions, traffic flows from the source via the RPF interfaces as shown. Note that the routers have performed the Assert process and one interface on one router is in the pruned state.

Potential PIM-DM Route Loop

Interface Fails

[Figure: The forwarding interface of the first-hop router fails (marked X). One router is marked "This Router converges first" and its new RPF interface is shown.]

Potential PIM-DM Route Loops


Now let's assume that the forwarding interface of the first-hop router fails as shown above. Let's also assume that the unicast routing of the router on the left converges first and PIM computes the new RPF interface as shown.

Potential PIM-DM Route Loop

New Traffic Flow

[Figure: Traffic follows the new RPF interface of the converged router. Callouts: "But wait . . . This Router still hasn't converged yet" and "Multicast Route Loop!!"]

Potential PIM-DM Route Loops


Unfortunately, the middle router has not yet converged and is still forwarding multicast traffic using the old RPF interface. At this point, a multicast route loop exists in the network due to the transient condition of the two routers having opposite RPF interfaces. During the time that this route loop exists, virtually all of the bandwidth on the network segments can be consumed. This situation will continue until the router in the middle of the picture finally converges and the new correct RPF interface is calculated. Unfortunately, if the router needs some bandwidth to complete this convergence (as in the case when EIGRP goes active), then this condition will never be resolved!

PIM-DM Assert Problem

Initial Flow

[Figure: Two routers forward multicast packets from the Source onto the same LAN segment, where the two Receivers see duplicate traffic.]

PIM-DM Assert Problem


While the PIM Assert mechanism is effective in pruning off duplicate traffic, it is not without its weaknesses. Consider the above example where duplicate traffic is flowing onto a LAN segment.

PIM-DM Assert Problem

Sending Asserts

[Figure: The two routers exchange Assert messages on the LAN segment shared with the Receivers. One router is marked Winner and the other Loser.]

PIM-DM Assert Problem


The normal PIM Assert mechanism takes place and the two routers exchange routing metrics to determine which one has the best route to the source. In this case, the bottom router has the best metric and is the Assert Winner.

PIM-DM Assert Problem

Assert Loser Prunes Interface

[Figure: The Loser has pruned its LAN interface. Only the Winner forwards multicast packets from the Source to the Receivers.]

PIM-DM Assert Problem


The normal PIM Assert mechanism takes place and the Assert Winner continues forwarding while the Assert Loser prunes its interface and starts its prune timer.

PIM-DM Assert Problem

Assert Winner Fails

[Figure: The Winner fails (marked X) and multicast packets from the Source no longer reach the Receivers. Callout: "Traffic flow is cutoff until Prune times out on Assert Loser."]

PIM-DM Assert Problem


Let's now assume that the Assert Winner fails immediately after winning the Assert process. Unfortunately, the Assert Loser has no way of knowing that the Assert Winner has failed and will wait 3 minutes before timing out its pruned interface. This results in a 3-minute (worst-case) loss of traffic.

PIM-DM Evaluation

Most effective for small pilot networks
Advantages:
  Easy to configure: two commands
  Simple flood and prune mechanism
Potential issues...
  Inefficient flood and prune behavior
  Complex Assert mechanism
  Mixed control and data planes
    Results in (S, G) state in every router in the network
    Can result in non-deterministic topological behaviors
  No support for shared trees

Evaluation: PIM Dense-mode

Most effective for small pilot networks.
Advantages
  Minimal number of commands required for configuration (two)
  Simple mechanism for reaching all possible receivers and eliminating distribution to uninterested receivers
  Simple behavior is easier to understand and therefore easier to debug
  Interoperates with DVMRP
Potential issues
  Necessity to flood frequently because prunes expire after 3 minutes.

Sparse-Mode Protocols

PIM-SM - Protocol Independent Multicast (Sparse Mode)
CBT - Core Based Trees

PIM-SM (RFC 2362)

Supports both source and shared trees
  Assumes no hosts want multicast traffic unless they specifically ask for it
Uses a Rendezvous Point (RP)
  Senders and Receivers rendezvous at this point to learn of each other's existence.
    Senders are registered with RP by their first-hop router.
    Receivers are joined to the Shared Tree (rooted at the RP) by their local Designated Router (DR).
Appropriate for
  Wide-scale deployment for both densely and sparsely populated groups in the enterprise
  Optimal choice for all production networks regardless of size and membership density.

Protocol Independent Multicast (PIM) Sparse-mode (RFC 2362)

Utilizes a rendezvous point (RP) to coordinate forwarding from source to receivers
  Regardless of location/number of receivers, senders register with RP and send a single copy of multicast data through it to registered receivers
  Regardless of location/number of sources, group members register to receive data and always receive it through the RP
Appropriate for
  Wide-scale deployment for both densely and sparsely populated groups in the Enterprise
  Optimal choice for all production networks regardless of size and membership density.

PIM-SM Shared Tree Joins

[Figure: (*, G) Joins travel from the router serving the Receiver toward the RP, building the Shared Tree. Callout: "(*, G) State created only along the Shared Tree."]

PIM-SM Shared Tree Joins


In this example, an active receiver (attached to the leaf router at the bottom of the drawing) has joined multicast group G. The leaf router knows the IP address of the Rendezvous Point (RP) for group G and sends a (*, G) Join for this group towards the RP. This (*, G) Join travels hop-by-hop to the RP, building a branch of the Shared Tree that extends from the RP to the last-hop router directly connected to the receiver. At this point, group G traffic can flow down the Shared Tree to the receiver.
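The hop-by-hop Join described above can be sketched as a small simulation. Everything here is hypothetical and for illustration only: the router names, the `next_hop_to_rp` table, and the `send_star_g_join` function. The sketch also assumes that a Join stops at the first router that already holds (*, G) state, since that router is already on the Shared Tree.

```python
def send_star_g_join(leaf, group, next_hop_to_rp, rp, state):
    """Propagate a (*, G) Join from the leaf router toward the RP.

    state maps router -> {("*", group): set of downstream neighbors}.
    Only routers on the path toward the RP ever get an entry.
    """
    router, downstream = leaf, "receiver LAN"
    while True:
        oil = state.setdefault(router, {}).setdefault(("*", group), set())
        already_on_tree = bool(oil)
        oil.add(downstream)
        # Assumption: a router already on the tree does not re-send the Join.
        if router == rp or already_on_tree:
            break
        downstream, router = router, next_hop_to_rp[router]
    return state


# Hypothetical topology: leaf routers C and D reach the RP through B.
next_hop_to_rp = {"C": "B", "D": "B", "B": "RP"}
state = {}

send_star_g_join("C", "G", next_hop_to_rp, "RP", state)
send_star_g_join("D", "G", next_hop_to_rp, "RP", state)

for router in sorted(state):
    print(router, sorted(state[router][("*", "G")]))
```

After both Joins, router B lists C and D as downstream neighbors while the RP still lists only B, which matches the statement that (*, G) state is created only along the Shared Tree.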

PIM-SM Sender Registration

[Figure: The first-hop router of the Source unicasts an (S, G) Register to the RP. The RP sends (S, G) Joins back toward the Source, building the Source Tree. The Shared Tree extends from the RP to the Receiver. Callout: "(S, G) State created only along the Source Tree."]

PIM-SM Sender Registration


As soon as an active source for group G sends a packet, the leaf router that is attached to this source is responsible for Registering this source with the RP and requesting the RP to build a tree back to that router. The source router encapsulates the multicast data from the source in a special PIM-SM message called the Register message and unicasts that data to the RP. When the RP receives the Register message, it does two things:
  It de-encapsulates the multicast data packet inside of the Register message and forwards it down the Shared Tree.
  It sends an (S, G) Join back towards the source network S to create a branch of an (S, G) Shortest-Path Tree. This results in (S, G) state being created in all the routers along the SPT, including the RP.

PIM-SM Sender Registration

[Figure: With traffic arriving along the Source Tree, the RP unicasts an (S, G) Register-Stop to the first-hop router. Callout: "RP sends Register-Stop back to first-hop router."]

PIM-SM Sender Registration (cont.)


As soon as the SPT is built from the Source router to the RP, multicast traffic begins to flow natively from source S to the RP. Once the RP begins receiving data natively (i.e. down the SPT) from source S, it sends a Register-Stop to the source's first-hop router to inform it that it can stop sending the unicast Register messages.

PIM-SM Sender Registration

[Figure: Traffic flow along the Source Tree and the Shared Tree. Callout: "Source traffic flows natively along SPT to RP. From RP, traffic flows down the Shared Tree to Receivers."]

PIM-SM Sender Registration (cont.)


At this point, multicast traffic from the source is flowing down the SPT to the RP and from there, down the Shared Tree to the receiver.

PIM-SM SPT Switchover

[Figure: The last-hop router joins the SPT by sending (S, G) Joins toward the Source, and (S, G)RP-bit Prunes are sent up the Shared Tree. Callouts: "Additional (S, G) State is created along new part of the Source Tree." and "Additional (S, G) State is created along the Shared Tree to prune off (S, G) traffic."]

PIM-SM Shortest-Path Tree Switchover


PIM-SM has the capability for last-hop routers (i.e. routers with directly connected members) to switch to the Shortest-Path Tree and bypass the RP if the traffic rate is above a set threshold called the SPT-Threshold. The default value of the SPT-Threshold in Cisco routers is zero. This means that the default behaviour for PIM-SM leaf routers attached to active receivers is to immediately join the SPT to the source as soon as the first packet arrives via the (*, G) shared tree.

In the above example, the last-hop router (at the bottom of the drawing) sends an (S, G) Join message toward the source to join the SPT and bypass the RP. This (S, G) Join message travels hop-by-hop to the first-hop router (i.e. the router connected directly to the source), thereby creating another branch of the SPT. This also creates (S, G) state in all the routers along this branch of the SPT. Finally, special (S, G)RP-bit Prune messages are sent up the Shared Tree to prune off this (S, G) traffic from the Shared Tree. If this were not done, (S, G) traffic would continue flowing down the Shared Tree, resulting in duplicate (S, G) packets arriving at the receiver.
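The SPT-Threshold decision described above reduces to a single comparison, sketched here. The function name, the use of kbps as the unit, and the use of `math.inf` to model a last-hop router that never leaves the Shared Tree are assumptions made for illustration only.

```python
import math


def should_join_spt(observed_rate_kbps, spt_threshold_kbps=0.0):
    """Return True when the last-hop router should send an (S, G) Join.

    With the default threshold of zero, any traffic at all (the first
    packet down the shared tree) exceeds the threshold.
    """
    return observed_rate_kbps > spt_threshold_kbps


print(should_join_spt(64))              # True: default threshold of zero
print(should_join_spt(64, 128))         # False: rate is below the threshold
print(should_join_spt(1e9, math.inf))   # False: never switch to the SPT
```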

PIM-SM SPT Switchover

[Figure: Traffic flow along the Source Tree from the Source to the Receiver. Callout: "(S, G) Traffic flow is now pruned off of the Shared Tree and is flowing to the Receiver via the SPT."]

PIM-SM Shortest-Path Tree Switchover


At this point, (S, G) traffic is now flowing directly from the first-hop router to the last-hop router and from there to the receiver. Note: The RP will normally send (S, G) Prunes back toward the source to shut off the flow of now unnecessary (S, G) traffic to the RP if and only if it has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree. (This step has been omitted from the example above.) As a result of this SPT-Switchover mechanism, PIM-SM also supports the construction and use of SPT (S, G) trees but in a much more economical fashion than PIM-DM in terms of forwarding state.

PIM-SM SPT Switchover

[Figure: The RP sends an (S, G) Prune back toward the Source. Callout: "(S, G) traffic flow is no longer needed by the RP so it Prunes the flow of (S, G) traffic."]

PIM-SM Shortest-Path Tree Switchover


At this point, the RP no longer needs the flow of (S, G) traffic since all branches of the Shared Tree (in this case there is only one) have pruned off the flow of (S, G) traffic. As a result, the RP will send (S, G) Prunes back toward the source to shut off the flow of the now unnecessary (S, G) traffic to the RP. Note: This will occur if and only if the RP has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree.

PIM-SM SPT Switchover

[Figure: Final traffic flow from the Source to the Receiver. Callout: "(S, G) Traffic flow is now only flowing to the Receiver via a single branch of the Source Tree."]

PIM-SM Shortest-Path Tree Switchover


As a result of the SPT-Switchover, (S, G) traffic is now only flowing from the first-hop router to the last-hop router and from there to the receiver. Notice that traffic is no longer flowing to the RP. As a result of this SPT-Switchover mechanism, it is clear that PIM-SM also supports the construction and use of SPT (S, G) trees but in a much more economical fashion than PIM-DM in terms of forwarding state.

PIM-SM FFF

PIM-SM Frequently Forgotten Fact

The default behavior of PIM-SM is that routers with directly connected members will join the Shortest Path Tree as soon as they detect a new multicast source.

Frequently Forgotten Fact


Unless configured otherwise, the default behaviour of Cisco routers running PIM-SM is for last-hop routers to immediately switch to the SPT for any new source.

PIM-SM Evaluation

Effective for sparse or dense distribution of multicast receivers
Advantages:
  Traffic only sent down joined branches
  Can switch to optimal source-trees for high traffic sources dynamically
  Unicast routing protocol-independent
  Basis for inter-domain multicast routing
    When used with MBGP and MSDP

Evaluation: PIM Sparse-mode

Can be used for sparse or dense distribution of multicast receivers (no necessity to flood)
Advantages
  Traffic sent only to registered receivers that have explicitly joined the multicast group
  RP can be switched to optimal shortest-path tree when high-traffic sources are forwarding to a sparsely distributed receiver group
  Interoperates with DVMRP
Potential issues
  Requires RP during initial setup of distribution tree (can switch to shortest-path tree once RP is established and determined suboptimal)

CBT Overview

Constructs single, shared delivery tree (not source-based) for multicast group members
  Traffic is sent and received over same tree, regardless of source(s)
  Reduced amount of multicast state information stored in routers
Uses core router to construct shared tree
  Routers send join message to core and form branch of tree, suppressing downstream join messages
  Downstream routers connect to shared tree through on-tree routers
  Source unicasts data to core, then multicasts using group ID
  Aggregates traffic onto smaller subset of links
No commercial implementation available

Core Based Trees (Internet-draft)

Utilizes shared delivery tree constructed around core router (much like PIM's RP)
  Unlike PIM, the Shared tree is bi-directional. If the first-hop router for a source is already on the tree, it forwards the multicast packets out all branches of the tree.
  If the first-hop router for a source is not on the Shared tree, a single copy of multicast data is sent through the core router to receivers.
  Regardless of location/number of sources, group members always receive multicast data through the Shared tree.
Key benefits
  Reduced amount of multicast state information stored in routers (always send and receive over same distribution tree)
  Traffic is aggregated onto smaller subset of links

CBT Evaluation

Academic work-in-progress
Runs primarily on MS PowerPoint

Evaluation: CBT
Appropriate for inter- and intra-domain multicast routing (no necessity to flood)
Current Deployment
  New protocol that is not widely deployed in production environments (no commercial implementation available)
  Improves scalability of some existing multicast algorithms to support sparse distribution of multicast receivers
  Interoperates with DVMRP
Potential issues
  Has no capability to switch to SPT
  Can suffer from latency problems since traffic must flow through the Core router.
  Core routers can become bottlenecks if not selected with great care, especially when senders and receivers are located very far from each other

Protocol Summary

CONCLUSION
Virtually all production networks should be configured to run PIM in Sparse mode!

Protocol Summary
Given the pros and cons of all the multicast routing protocols available, virtually all production networks should be configured to run PIM-SM.


IP Multicasting at Layer 2
Module 2

Module2.ppt, 8/9/2001 4:20 PM
Copyright 1998-2001, Cisco Systems, Inc. All rights reserved.

Module Objectives

Understand Layer 2 Multicast Addressing
Identify the purpose of IGMP
Recognize the difference between v1, v2 & v3 of the IGMP protocol
Identify issues in IGMP v1-v2 Interoperation
Identify potential solutions to L2 Multicast Frame Switching problems

Module Agenda

MAC Layer Multicast Addresses
IGMPv1
IGMPv2
IGMP v1-v2 Interoperability
IGMPv3
L2 Multicast Frame Switching

Layer 3 Multicast Addressing

IP group addresses
  224.0.0.0-239.255.255.255
  Class D addresses = high order bits of 1110
Special reserved group addresses: 224.0.0.0-224.0.0.255:
  224.0.0.1  All systems on this subnet
  224.0.0.2  All routers on this subnet
  224.0.0.4  DVMRP routers

IANA Reserved Addresses


IANA is the responsible authority for the assignment of reserved class D addresses. Other interesting reserved addresses are:
  224.0.0.2   PIMv1 (ALL-ROUTERS, due to transport in IGMPv1)
  224.0.0.5   OSPF ALL ROUTERS (RFC1583)
  224.0.0.6   OSPF DESIGNATED ROUTERS (RFC1583)
  224.0.0.9   RIP2 Routers
  224.0.0.13  PIMv2
  224.0.1.39  CISCO-RP-ANNOUNCE (Auto-RP)
  224.0.1.40  CISCO-RP-DISCOVERY (Auto-RP)

ftp://ftp.isi.edu/in-notes/iana/assignments/multicast-addresses is the authoritative source for reserved multicast addresses.
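As a quick illustration of the address ranges on this slide, the following Python sketch checks whether an address is a Class D address (high-order bits 1110) and whether it falls in the reserved 224.0.0.0-224.0.0.255 block. The function names are hypothetical; only the standard-library `ipaddress` module is used.

```python
import ipaddress

# The reserved block 224.0.0.0-224.0.0.255 written as a /24 prefix.
RESERVED_BLOCK = ipaddress.IPv4Network("224.0.0.0/24")


def is_class_d(addr):
    """True when the four high-order bits of the address are 1110."""
    return int(ipaddress.IPv4Address(addr)) >> 28 == 0b1110


def is_reserved_group(addr):
    """True when the address lies in 224.0.0.0-224.0.0.255."""
    return ipaddress.IPv4Address(addr) in RESERVED_BLOCK


print(is_class_d("239.255.255.255"))    # True
print(is_class_d("240.0.0.0"))          # False
print(is_reserved_group("224.0.0.13"))  # True  (PIMv2)
print(is_reserved_group("224.0.1.39"))  # False (outside 224.0.0.x)
```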

Additional Information
"Administratively Scoped IP Multicast", June 1997, has a good discussion on scoped addresses. This document is available at: draft-ietf-mboned-admin-ip-space-03.txt

Layer 2 Multicast Addressing

IP Multicast MAC Address Mapping (FDDI and Ethernet)

[Figure: The 32-bit IP address 239.255.0.1 consists of the 4-bit 1110 prefix and 28 bits of group address. The low-order 23 bits are copied into the 48-bit MAC address 01-00-5e-7f-00-01 behind its fixed 25-bit prefix. The remaining 5 bits are lost.]

Ethernet & FDDI Multicast Addresses


The low order bit (0x01) in the first octet indicates that this packet is a Layer 2 multicast packet. Furthermore, the 0x01005e prefix has been reserved for use in mapping L3 IP multicast addresses into L2 MAC addresses. When mapping L3 to L2 addresses, the low order 23 bits of the L3 IP multicast address are mapped into the low order 23 bits of the IEEE MAC address. Notice that this results in 5 bits of information being lost.
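The mapping described above can be written out as a short Python sketch. The function name `multicast_mac` is hypothetical; the 0x01005e prefix and the 23-bit mask come from the text.

```python
import ipaddress


def multicast_mac(group):
    """Map an IP multicast group address to its Ethernet/FDDI MAC address."""
    ip = int(ipaddress.IPv4Address(group))
    if ip >> 28 != 0b1110:
        raise ValueError(f"{group} is not a Class D (multicast) address")
    # Keep the low-order 23 bits of the IP address and place them
    # behind the reserved 01-00-5e prefix.
    mac = 0x01005E000000 | (ip & 0x7FFFFF)
    return "-".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


print(multicast_mac("239.255.0.1"))  # 01-00-5e-7f-00-01
print(multicast_mac("224.10.0.1"))   # 01-00-5e-0a-00-01
```

The first call reproduces the example on the slide. Note that 255 in the second octet becomes 7f because its high-order bit is one of the 5 bits that are lost.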

A bit of History
It turns out that this loss of 5 bits worth of information was not originally intended. When Dr. Steve Deering was doing his seminal research on IP Multicast, he approached his advisor with the need for 16 OUIs to map all 28 bits worth of Layer 3 IP Multicast address into unique Layer 2 MAC addresses.

Note: An OUI (Organizationally Unique Identifier) is the high 24 bits of a MAC address that is assigned to an organization by the IEEE. A single OUI therefore provides 24 bits worth of unique MAC addresses to the organization.

Unfortunately, at that time the IEEE charged $1000 for each OUI assigned, which meant that Dr. Deering was requesting that his advisor spend $16,000 so he could continue his research. Due to budget constraints, the advisor agreed to purchase a single OUI for Dr. Deering. However, the advisor also chose to reserve half of the MAC addresses in this OUI for other graduate research projects and granted Dr. Deering the other half. This resulted in Dr. Deering having only 23 bits worth of MAC address space with which to map 28 bits of IP Multicast addresses. (It's too bad that it wasn't known back then how popular IP Multicast would become. If it had been, Dr. Deering might have been able to pass the hat around to interested parties and collect enough money to purchase all 16 OUIs. :-) )

Layer 2 Multicast Addressing

IP Multicast MAC Address Mapping (FDDI & Ethernet)

Be Aware of the 32:1 Address Overlap

32 IP Multicast Addresses: 224.1.1.1, 224.129.1.1, 225.1.1.1, 225.129.1.1, ..., 238.1.1.1, 238.129.1.1, 239.1.1.1, 239.129.1.1
1 Multicast MAC Address (FDDI and Ethernet): 0x0100.5E01.0101

L2/L3 Multicast Address Overlap


An IP multicast address has 28 bits of unique address space (32 minus the first 4 bits containing the 1110 Class D prefix), but only 23 bits are plugged into the IEEE MAC address, so 28 - 23 = 5 bits are lost. Since 2**5 = 32, there is a 32:1 overlap of L3 addresses to L2 addresses, so beware: several L3 addresses can map to the same L2 multicast address!

For example, all of the following IP multicast addresses map to the same L2 multicast address of 01-00-5e-0a-00-01:
  224.10.0.1, 225.10.0.1, 226.10.0.1, 227.10.0.1
  228.10.0.1, 229.10.0.1, 230.10.0.1, 231.10.0.1
  232.10.0.1, 233.10.0.1, 234.10.0.1, 235.10.0.1
  236.10.0.1, 237.10.0.1, 238.10.0.1, 239.10.0.1
  224.138.0.1, 225.138.0.1, 226.138.0.1, 227.138.0.1
  228.138.0.1, 229.138.0.1, 230.138.0.1, 231.138.0.1
  232.138.0.1, 233.138.0.1, 234.138.0.1, 235.138.0.1
  236.138.0.1, 237.138.0.1, 238.138.0.1, 239.138.0.1
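The 32 overlapping addresses can also be generated rather than listed by hand. The sketch below is illustrative only and the function name `overlapping_groups` is hypothetical: it keeps the low-order 23 bits of a group address and walks through all 2**5 values of the 5 lost bits.

```python
import ipaddress


def overlapping_groups(group):
    """Return every IP multicast address sharing a MAC address with group."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    return [
        # 1110 prefix, then the 5 bits that are lost, then the 23 mapped bits.
        str(ipaddress.IPv4Address((0b1110 << 28) | (lost << 23) | low23))
        for lost in range(32)
    ]


groups = overlapping_groups("224.10.0.1")
print(len(groups))   # 32
print(groups[:4])    # ['224.10.0.1', '224.138.0.1', '225.10.0.1', '225.138.0.1']
print(groups[-1])    # 239.138.0.1
```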


Layer 2 Multicast Addressing
IP Multicast MAC Address Mapping (Token Ring)

A Layer 3 IPmc address maps to a single Token Ring Functional Address or to the all-ones Broadcast Address:

224.x.x.x -> c0-00-00-04-00-00 (shown in Token Ring, non-canonical format)
224.x.x.x -> ff-ff-ff-ff-ff-ff

Results in high levels of unwanted interrupts for non-interested hosts



Token Ring MAC Addresses


Because the bit order of bytes transmitted on Token Ring is reversed, it is typical to see Token Ring MAC addresses written in their non-canonical form. For example, when transposed to canonical (Ethernet) form, the 0xc000.0004.0000 MAC address in the above slide would be 0x0300.0020.0000.
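The conversion between the two forms is just a bit reversal within each byte. The following is a small sketch under that reading; the helper names `reverse_bits` and `swap_canonical` are illustrative only.

```python
def reverse_bits(byte: int) -> int:
    """Reverse the bit order within a single byte (0xc0 becomes 0x03)."""
    return int(f"{byte:08b}"[::-1], 2)


def swap_canonical(mac: str) -> str:
    """Convert a dotted-hex MAC between canonical and non-canonical form.

    The operation is its own inverse, so one function covers both directions.
    """
    digits = mac.lower().removeprefix("0x").replace(".", "")
    raw = bytes.fromhex(digits)
    if len(raw) != 6:
        raise ValueError("a MAC address must be exactly 6 bytes")
    flipped = bytes(reverse_bits(b) for b in raw).hex()
    return "0x" + ".".join(flipped[i:i + 4] for i in range(0, 12, 4))


if __name__ == "__main__":
    print(swap_canonical("0xc000.0004.0000"))
```

Because the all-ones broadcast address is symmetric, it reads the same in both forms.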

Token Ring Functional Addresses


Token Ring Functional Addresses use a format of 0xc000.0004.xxxx where the last 2 octets typically have, at most, a single bit set. Many of the Functional Addresses are reserved for well-known Token Ring MAC layer functions such as Ring Error Monitor and others. A bit in the 3rd octet is used to signal that this is a Functional Address. In fact, the 0x5e (canonical form) in the 3rd octet of a normal Ethernet multicast address has a bit pattern that would confuse Token Ring end stations into thinking that the address was a Functional Address. Therefore, IP multicast address to L2 multicast address mapping cannot occur in Token Ring as it does in Ethernet.

Impact on Token-Ring End Stations


Mapping all multicast addresses into a single L2 address forces the main CPU in end systems to filter wanted from unwanted multicast packets, instead of this being handled in hardware by the Token Ring NIC. This creates significant performance issues on Token Ring end systems when multicast traffic is present on the ring. This is a very good reason, among many others, for users considering the Ethernet versus Token Ring debate to strongly consider Ethernet if multimedia applications and IPmc are being deployed or planned.

Layer 2 Multicast Addressing
IP Multicast MAC Address Mapping (Token Ring)

Be Aware of the 268,435,200:1 Address Overlap

ALL 268,435,200 IP Multicast Addresses:
224.0.1.0
224.0.1.1
224.0.1.2
224.0.1.3
. . .
239.255.255.252
239.255.255.253
239.255.255.254
239.255.255.255

map to 1 Multicast MAC Address (Token Ring):
0xFFFF.FFFF.FFFF

RUN AWAY ! ! !

L2/L3 Multicast Address Overlap


Unfortunately, all 28 significant bits of an IP multicast address (32 minus the first 4 bits) map into a single Token Ring MAC address. This has the disastrous result of a 2**28 = 268,435,456 ambiguity (268,435,200 when the 256 addresses of 224.0.0.0 through 224.0.0.255 are left out, as in the slide's list, which starts at 224.0.1.0). Because all L3 addresses map into the same L2 multicast address, constraint of multicast traffic at L2 is impossible on Token Ring networks. A migration from Token Ring to Ethernet should be considered by network administrators contemplating any extensive use of IP multicast.
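The two figures can be reproduced with Python's standard ipaddress module. Treating the slide's 268,435,200 as the full Class D range minus 224.0.0.0/24 is an assumption inferred from the slide's address list starting at 224.0.1.0; the source does not state it.

```python
import ipaddress

# The whole Class D (IP multicast) range: 28 free bits after the 1110 prefix.
class_d = ipaddress.ip_network("224.0.0.0/4")

# Assumption: the slide omits the 256 addresses below 224.0.1.0.
omitted = ipaddress.ip_network("224.0.0.0/24")

all_groups = class_d.num_addresses
slide_count = all_groups - omitted.num_addresses

if __name__ == "__main__":
    print(f"{all_groups:,} groups in 224.0.0.0/4")
    print(f"{slide_count:,} groups from 224.0.1.0 upward")
```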


Layer 2 Multicast Addressing
Token-Ring MAC Addresses

Token Ring interfaces may be configured to use either the Functional Address or the all-ones Broadcast Address

interface token-ring 0
 ip pim sparse
 ip multicast use-functional

Use Functional Address 0xc000.0004.0000

interface token-ring 0
 ip pim sparse

Use Broadcast Address 0xffff.ffff.ffff (Default)


Default Configuration
The default Token Ring interface configuration is to use the broadcast address.

Recommended Configuration
If Functional Address support is available on IP multicast Token Ring end systems, it is recommended that the Functional Address be used since, unlike the broadcast address, it will not affect hosts that are not using IP multicast.


IGMP
How hosts tell routers about group membership
Routers solicit group membership from directly connected hosts
RFC 1112 specifies first version of IGMP
RFC 2236 specifies current version of IGMP
IGMP v3 enhancements
Supported on UNIX systems, PCs, and Macs


IGMP
The primary purpose of IGMP is to permit hosts to communicate their desire to receive multicast traffic to the IP Multicast router(s) on the local network. This, in turn, permits the IP Multicast router(s) to Join the specified multicast group and to begin forwarding the multicast traffic onto the network segment. The initial specification for IGMP (v1) was documented in RFC 1112, Host Extensions for IP Multicasting. Since that time, many problems and limitations with IGMPv1 have been discovered. This has led to the development of the IGMPv2 specification, which was ratified in November 1997 as RFC 2236. Even before IGMPv2 had been ratified, work on the next generation of the IGMP protocol, IGMPv3, had already begun. However, the IGMPv3 specification is still in the working stage and has not been implemented by any vendors.


IGMPv1

RFC 1112 Host extensions for IP Multicasting


Membership Queries
Querier sends IGMP query messages to 224.0.0.1 with ttl = 1
One router on LAN is designated/elected to send queries
Query interval 60-120 seconds

Membership Reports
IGMP report sent by one host suppresses sending by others
Restrict to one report per group per LAN
Unsolicited reports sent by host, when it first joins the group

IGMP Membership Queries


IGMPv1 Membership Queries are sent by the router to the All-Hosts (224.0.0.1) multicast address to solicit what multicast groups have active receivers on the local network.

IGMP Membership Reports


IGMPv1 Membership Reports are sent by hosts wishing to receive traffic for a specific multicast group. Membership Reports are sent (with a TTL of 1) to the multicast address of the group for which the host wishes to receive traffic. Hosts either send reports asynchronously (when they first wish to join a group) or in response to Membership Queries. In the latter case, the response is used to maintain the group in an active state so that traffic for the group continues to be forwarded to the network segment.

Report Suppression
Report suppression is used among group members so that all members do not have to respond to a query. This saves CPU and bandwidth on all systems. The rule in multicast membership is that as long as one member is present, the group must be forwarded onto that segment. Therefore, only one member present is required to keep interest in a given group so report suppression is efficient.

TTL
Since Membership Query and Report packets only have local significance, the TTL of these packets is always set to 1. This also ensures that they will not be accidentally forwarded off the local subnet and cause confusion on other subnets.


IGMPv1 - Packet Format

Bits 0-3     Bits 4-7     Bits 8-15     Bits 16-31
Ver          Type         Unused        Checksum
Group Address (32 bits)

Ver: Code Version = 1
Type: 1 = Host Membership Query
      2 = Host Membership Report
Group Address: Multicast Group Address


Version
is the IGMP version and should be 0x1 in IGMPv1. This field has been merged with the Type field in IGMPv2 and eliminated.

Type
is the IGMP message type.
0x1 = Host Membership Query
0x2 = Host Membership Report
This field has been expanded into an 8 bit field in IGMPv2.

Group
is the Multicast Group address being specified for reports.
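As a rough illustration of the 8-byte layout described above, the sketch below packs and parses an IGMPv1 message with Python's struct module. The function names are invented for this example, and the checksum routine is the usual 16-bit one's-complement Internet checksum, which these notes do not spell out.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv1(msg_type: int, group: str) -> bytes:
    """Pack Ver (4 bits), Type (4 bits), Unused, Checksum and Group Address."""
    ver_type = (1 << 4) | (msg_type & 0x0F)      # version 1 in the high nibble
    group_raw = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", ver_type, 0, 0, group_raw)
    return struct.pack("!BBH4s", ver_type, 0, internet_checksum(unsummed), group_raw)


def parse_igmpv1(packet: bytes) -> dict:
    """Split an 8-byte IGMPv1 message back into its fields."""
    ver_type, _unused, _checksum, group_raw = struct.unpack("!BBH4s", packet[:8])
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "checksum_ok": internet_checksum(packet[:8]) == 0,
        "group": socket.inet_ntoa(group_raw),
    }


if __name__ == "__main__":
    report = build_igmpv1(2, "224.1.1.1")        # Host Membership Report
    print(report.hex(), parse_igmpv1(report))
```

Note that the first byte of a version 1 report comes out as 0x12 and that of a query as 0x11, which is why IGMPv2 could reuse those values as 8-bit type codes.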


IGMPv1 - Joining a Group

[Figure: hosts H1, H2, and H3 on one segment with an IGMPv1 router; a joining host sends a Report for group 224.1.1.1.]

Joining member sends report to 224.1.1.1 immediately upon joining



Asynchronous Joins
Members joining a group do not have to wait for a query to join; they send an unsolicited report indicating their interest. This reduces join latency for the joining end system if no other members are present.


IGMPv1 - General Queries

[Figure: hosts H1, H2, and H3 on one segment; the IGMPv1 multicast router sends a General Query to 224.0.0.1.]

Periodically sends General Queries to 224.0.0.1 to determine memberships



General Queries
General Queries go to the All-Hosts (224.0.0.1) multicast address. One member from each group on the segment will respond with a report. General Queries are sent out periodically based on the setting of the ip igmp query-interval command. (The default setting is 60 seconds.)

IGMP Querier
There is no formal IGMP Query Router election process within IGMPv1 itself. Instead, the election process is left up to the multicast routing protocol and different protocols used different mechanisms. This often results in multiple queriers on a single multi-access network.


IGMPv1 - Maintaining a Group

[Figure: H1, H2, and H3 are all members of 224.1.1.1. #1 The IGMPv1 router sends a Query to 224.0.0.1. #2 H2 sends a Report. #3 H1 and H3 suppress their Reports.]

#1 Router sends periodic queries
#2 One member per group per subnet reports
#3 Other members suppress reports

Query-Response Process
The router multicasts periodic IGMPv1 Membership Queries to the All-Hosts (224.0.0.1) group address. Only one member per group responds with a report to a query. This saves bandwidth on the subnet and processing by the hosts. This process is called Response Suppression. (See below.)

Response Suppression Mechanism


The Report Suppression mechanism is accomplished as follows: When a host receives the Query, it starts a count-down timer for each multicast group of which it is a member. The count-down timers are each initialized to a random count within a given time range. (In IGMPv1 this was a fixed range of 10 seconds. Therefore the count-down timers were randomly set to some value between 0 and 10 seconds.) When a count-down timer reaches zero, the host sends a Membership Report for the group associated with the count-down timer to notify the router that the group is still active. However, if a host receives a Membership Report before its associated count-down timer reaches zero, it cancels the count-down timer associated with the multicast group, thereby suppressing its own report. In the example shown in the slide, H2's timer expired first so it responded with its Membership Report. H1 and H3 cancelled their timers associated with the group, thereby suppressing their reports.
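The timer behaviour can be modelled in a few lines. This is an idealized sketch, assuming every host hears a report the instant it is sent and ignoring packet loss; `simulate_query_response` and its parameters are hypothetical names chosen for this example.

```python
import random


def simulate_query_response(members, max_delay=10.0, rng=None):
    """Return {group: (reporting_host, delay)} for one General Query.

    members maps a host name to the set of groups it has joined.
    max_delay is the upper bound of the random timer range in seconds.
    """
    rng = rng or random.Random()
    # Each host starts one random count-down timer per joined group.
    timers = [
        (rng.uniform(0, max_delay), host, group)
        for host, groups in members.items()
        for group in sorted(groups)
    ]
    reports = {}
    for delay, host, group in sorted(timers):
        if group not in reports:
            reports[group] = (host, delay)   # first timer to expire sends the report
        # Any later timer for the same group has been cancelled (suppressed).
    return reports


if __name__ == "__main__":
    segment = {"H1": {"224.1.1.1"}, "H2": {"224.1.1.1"}, "H3": {"224.1.1.1"}}
    for group, (host, delay) in simulate_query_response(segment).items():
        print(f"{group}: reported by {host} after {delay:.2f}s, others suppressed")
```

Whichever host draws the shortest timer plays the role of H2 in the slide; the result is always one report per group per query.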


IGMPv1 - Leaving a Group

[Figure: hosts H1, H2, and H3 on one segment; the IGMPv1 router sends a Query to 224.0.0.1.]

Router sends periodic queries
Hosts silently leave group
Router continues sending periodic queries
No Reports for group received by router
Group times out

IGMPv1 Leaves
There was no special Leave mechanism defined in Version 1 of IGMP. Instead, IGMPv1 hosts leave a group "passively" or "quietly" at any time without any notification to the router. This is not a problem if there are multiple members present because the multicast flow still must be delivered to the segment. However, when the member leaving is the last member, there will be a period when the router continues to forward the multicast traffic onto the segment needlessly since there are no members left. It was up to the IGMP Query router to time out the Group after several Query Intervals passed without a response for the Group. This is inefficient, especially if the number of groups and/or the traffic in these groups is high.


IGMPv2

RFC 2236
Group-specific query
Router sends Group-specific queries to make sure there are no members present before it stops forwarding data for the group on that subnet

Leave Group message


Host sends leave message if it leaves the group and is the last member (reduces leave latency in comparison to v1)

IGMPv2
As a result of some of the limitations discovered in IGMPv1, work was begun on IGMPv2 in an attempt to remove these limitations. Most of the changes between IGMPv1 and IGMPv2 address the issues of Leave and Join latencies as well as ambiguities in the original protocol specification. (IGMPv2 is close to standard status.) The following sections define some of the more significant changes.

Group Specific Queries


A Group Specific query was added in v2 to allow the router to only query membership in a single group instead of all groups. This is an optimized way to quickly find out if any members are left in a group without asking all groups for a report. The difference between the Group Specific query and the General Query is that a General Query is multicast to the All-Hosts (224.0.0.1) address while a Group Specific query for Group G, is multicast to the Group G multicast address.

Leave Group message


A Leave Group message was also added in IGMPv2. This allows end systems to tell the router they are leaving the group which reduces the leave latency for the group on the segment when the member leaving is the last member of the group. The standard is loosely written on when leave group messages should and must be sent. This is an important consideration when discussing CGMP.


IGMPv2 (cont.)

Querier election mechanism


On multi-access networks, an IGMP Querier router is elected based on lowest IP address. Only the Querier router sends Queries.

Query-Interval Response Time


General Queries specify a Max. Response Time, which informs hosts of the maximum time within which a host must respond to a General Query. (Controls the burstiness of the responses.)

Backward compatible with IGMPv1



Querier Election
Unlike v1, IGMP itself now has a Querier election mechanism. The IGMP-speaking router with the lowest unicast IP address will be elected as the Querier. All IGMP speakers come up thinking they will be the querier but must immediately relinquish that role if a query from a lower IP address is heard on the same segment.

Query-Interval Response Time


The Query-Interval Response time has also been added to control the burstiness of reports. This value is indicated in queries to convey to the membership how much time they have to respond to a query with a report.


IGMPv2 - Packet Format

Bits 0-7     Bits 8-15          Bits 16-31
Type         Max. Resp. Time    Checksum
Group Address (32 bits)

Type:
0x11 = Membership Query
0x12 = Version 1 Membership Report
0x16 = Version 2 Membership Report
0x17 = Leave Group
Max. Resp. Time: max. time before sending a responding report, in 1/10 secs. (Default = 10 secs)
Group Address: Multicast Group Address (0.0.0.0 for General Queries)

Type
In IGMPv2, the old 4 bit Version field was merged with the old 4 bit Type field to create a new 8 bit Type field. By assigning IGMPv2 type codes 0x11 and 0x12 as the Membership Query (V1 & V2) and the V1 Membership Report respectively, backwards compatibility of IGMP v1 and v2 packet formats was maintained.

Max. Response Time


This new field allows the querying router to specify exactly what the Query Interval Response Time is for this Query. The value (in 1/10 seconds) is used by the IGMPv2 hosts as the upper bound when randomly choosing the value of their response timers. This helps to control the burstiness of the responses during the Query-Response interval.

Group Address
This field is identical to the IGMPv1 version of this field with the exception that it is set to 0.0.0.0 for General Queries.
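To make the three fields concrete, here is a hedged sketch that builds IGMPv2 messages and picks the IP destination each one is sent to, following the addresses used in this module (General Query to 224.0.0.1, Group Specific Query and Reports to the group, Leave to 224.0.0.2). The function and constant names are illustrative, and the checksum is the standard Internet checksum, which is assumed rather than taken from these notes.

```python
import socket
import struct

MEMBERSHIP_QUERY = 0x11
V1_MEMBERSHIP_REPORT = 0x12
V2_MEMBERSHIP_REPORT = 0x16
LEAVE_GROUP = 0x17

ALL_HOSTS = "224.0.0.1"
ALL_ROUTERS = "224.0.0.2"
GENERAL = "0.0.0.0"          # Group Address value used in General Queries


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2(msg_type: int, group: str = GENERAL, max_resp_secs: float = 0.0) -> bytes:
    """Pack Type, Max. Resp. Time (in 1/10 second units), Checksum, Group Address."""
    max_resp = round(max_resp_secs * 10)
    if not 0 <= max_resp <= 0xFF:
        raise ValueError("Max. Resp. Time must fit in one byte of 1/10 second units")
    group_raw = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", msg_type, max_resp, 0, group_raw)
    return struct.pack("!BBH4s", msg_type, max_resp, internet_checksum(unsummed), group_raw)


def destination(msg_type: int, group: str = GENERAL) -> str:
    """IP destination address for a message, as described in this module."""
    if msg_type == LEAVE_GROUP:
        return ALL_ROUTERS
    if msg_type == MEMBERSHIP_QUERY and group == GENERAL:
        return ALL_HOSTS
    return group             # Group Specific Queries and Reports go to the group itself


if __name__ == "__main__":
    query = build_igmpv2(MEMBERSHIP_QUERY, max_resp_secs=10)
    print(query.hex(), "->", destination(MEMBERSHIP_QUERY))
```

A 10-second Max. Response Time is carried as the value 100 (0x64), since the field counts tenths of a second.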


IGMPv2 - Joining a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1); H2 sends a Report for 224.1.1.1.]

Joining member sends report to 224.1.1.1 immediately upon joining (same as IGMPv1)

Asynchronous Joins
Members joining a group do not have to wait for a query to join; they send an unsolicited report indicating their interest. This reduces join latency for the joining end system if no other members are present.


IGMPv2 - Joining a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1).]

IGMP State in rtr-a

rtr-a>show ip igmp group
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires     Last Reporter
224.1.1.1        Ethernet0    6d17h     00:02:31    1.1.1.11


IGMP State in rtr-a


Group 224.1.1.1 is active on Ethernet 0 and has been active on this interface for 6 days and 17 hours. It expires (and will be deleted) in 2 minutes and 31 seconds if an IGMP Host Membership report for this group is not heard in that time. The last Host to report membership was 1.1.1.11 (H2).


IGMPv2 - Querier Election

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with two IGMPv2 routers. rtr-a (1.1.1.1) sends a Query and becomes the IGMP Querier; rtr-b (1.1.1.2) sends a Query and becomes an IGMP Non-Querier.]

Initially all routers send out a Query
Router w/lowest IP address elected querier
Other routers become Non-Queriers

Querier Election
In IGMPv1 there was no formal IGMP querying router election process within IGMPv1 itself. It was left up to the multicast routing protocol, and different protocols used different mechanisms. This would often result in multiple queriers on a single multiaccess network. With the definition of IGMPv2, a formal querying router election process was specified within the IGMPv2 protocol itself. In IGMPv2 each router on a multiaccess network will initially assume it is the querier and begin sending queries. Each router will see the queries from the other IGMPv2 routers and will examine the source IP address of these queries. All IGMPv2 routers will then defer to the router with the lowest IP address. In other words, the IGMPv2 router with the lowest IP address will become the querying router. Finally, if the currently elected Query Router fails to issue a query within a specified time limit, a timer in the other IGMPv2 routers will time out and cause them to re-initiate the Query Election process.
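The election rule can be sketched as follows. This is a simplified model, not router code: the class name `ElectionState`, its methods, and the use of plain numbers for time are assumptions made for illustration, and the 120-second default simply mirrors the "IGMP querier timeout" value that appears in the show ip igmp interface output in this module.

```python
import ipaddress


def elect_querier(router_addresses):
    """Return the lowest IP address; addresses are compared numerically, not as text."""
    return str(min(ipaddress.IPv4Address(a) for a in router_addresses))


class ElectionState:
    """Per-interface view of one router taking part in the election."""

    def __init__(self, address, querier_timeout=120.0):
        self.address = ipaddress.IPv4Address(address)
        self.querier_timeout = querier_timeout
        self.other_querier = None     # lower-addressed router we defer to, if any
        self.last_heard = None

    def hear_query(self, source, now):
        """Defer when a Query arrives from a router with a lower address."""
        src = ipaddress.IPv4Address(source)
        if src < self.address:
            self.other_querier = src
            self.last_heard = now

    def is_querier(self, now):
        """Resume the Querier role once the other querier has been silent too long."""
        if self.other_querier is not None and now - self.last_heard > self.querier_timeout:
            self.other_querier = None
        return self.other_querier is None


if __name__ == "__main__":
    print(elect_querier(["1.1.1.2", "1.1.1.1"]))
    rtr_b = ElectionState("1.1.1.2")
    rtr_b.hear_query("1.1.1.1", now=1.0)
    print(rtr_b.is_querier(now=60.0), rtr_b.is_querier(now=122.0))
```

Comparing parsed addresses rather than strings matters: as text, "10.0.0.10" sorts before "10.0.0.9", which would elect the wrong router.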

Group Specific Queries


IGMPv2 also added the concept of Group Specific Queries. This is accomplished by sending the IGMPv2 Membership Query to the group's multicast address as opposed to sending it to the All-Hosts (224.0.0.1) multicast address as is done for IGMPv2 General Queries.

Query Interval
Membership queries are sent every 60 seconds (default).


IGMPv2 - Querier Election
Determining which router is the IGMP Querier

rtr-a>show ip igmp interface e0
Ethernet0 is up, line protocol is up
  Internet address is 1.1.1.1, subnet mask is 255.255.255.0
  IGMP is enabled on interface
  Current IGMP version is 2
  CGMP is disabled on interface
  IGMP query interval is 60 seconds
  IGMP querier timeout is 120 seconds
  IGMP max query response time is 10 seconds
  Inbound IGMP access group is not set
  Multicast routing is enabled on interface
  Multicast TTL threshold is 0
  Multicast designated router (DR) is 1.1.1.1 (this system)
  IGMP querying router is 1.1.1.1 (this system)
  Multicast groups joined: 224.0.1.40 224.2.127.254


Verifying the IGMPv2 Querier


Use the show ip igmp interface command to determine which router is the IGMPv2 Querier on the multiaccess network. Note that the Designated Router is a different function and is listed separately in the display above.


IGMPv2 - Maintaining a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) are all members of 224.1.1.1. The IGMPv2 router (1.1.1.1) sends a Query; H2 sends a Report; H1 and H3 suppress their Reports.]

Router sends periodic queries
One member per group per subnet reports
Other members suppress reports

Query-Response Process
The router multicasts periodic IGMPv2 Membership Queries to the All-Hosts (224.0.0.1) group address. Only one member per group responds with a report to a query. This saves bandwidth on the subnet and processing by the hosts. This process is called Response Suppression. (See section below.)

Response Suppression Mechanism


The Report Suppression mechanism is accomplished as follows: When a host receives the Query, it starts a count-down timer for each multicast group of which it is a member. The count-down timers are each initialized to a random count within a given time range. (In IGMPv1 this was a fixed range of 10 seconds, so the count-down timers were randomly set to some value between 0 and 10 seconds. In IGMPv2 the upper bound is the Max. Response Time carried in the Query.) When a count-down timer reaches zero, the host sends a Membership Report for the group associated with the count-down timer to notify the router that the group is still active. However, if a host receives a Membership Report before its associated count-down timer reaches zero, it cancels the count-down timer associated with the multicast group, thereby suppressing its own report. In the example shown in the slide, H2's timer expired first so it responded with its Membership Report. H1 and H3 cancelled their timers associated with the group, thereby suppressing their reports.


IGMPv2 - Leaving a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1).]

IGMP State in rtr-a before Leave

rtr-a>sh ip igmp group
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires     Last Reporter
224.1.1.1        Ethernet0    6d17h     00:02:31    1.1.1.11


IGMPv2 Leaves
In the above example, notice that the router is aware that there are one or more members of group 224.1.1.1 active on Ethernet0 and that Host 2 responded with a Group Membership Report for this group during the last General Query interval. (Indicated by the IP address of Host 2 in the Last Reporter field.)


IGMPv2 - Leaving a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1); H2 and H3 are members of 224.1.1.1. #1 H2 sends a Leave to 224.0.0.2. #2 rtr-a sends a Group Specific Query to 224.1.1.1. #3 H3 sends a Report to 224.1.1.1.]

H2 leaves group; sends Leave message
Router sends Group specific query
A remaining member host sends report
Group remains active

IGMPv2 Leaves
In IGMPv1, hosts would leave passively, i.e., they do not explicitly say they are leaving; they just stop reporting. However, IGMPv2 has explicit Leave Group messages. When the IGMPv2 Query router receives a Leave Message, it responds by sending a Group Specific Query for the associated group to see if there are still other hosts wishing to receive traffic for the group. This process helps to reduce overall Leave Latency. When CGMP is in use, the IGMPv2 Leave Message mechanism also helps the router to better manage the CGMP state in the switch. This also improves the leave latency for the specific host at layer 2. (Note: Due to the wording of the current IGMPv2 draft specification, hosts may choose NOT to send Leave messages if they are not the last host to leave the group. This can adversely affect CGMP performance.)

Example:
H2 and H3 are members of group 224.1.1.1
#1 - H2 leaves
#2 - Router sends group specific query to see if any other group members are present.
#3 - H3 hasn't left yet so it responds with a Report message.
Router keeps sending multicast for 224.1.1.1 since there is >= 1 member present
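The router's side of this exchange can be summarized in a short sketch. It is a deliberately simplified model: `process_leave` is a made-up helper, the group table is a plain dictionary of group to last reporter, and the wait for the Group Specific Query to time out is reduced to a list of hosts that answered in time.

```python
def process_leave(group_table, group, responders):
    """Handle a Leave for `group` and return the resulting group status.

    group_table maps a group address to its last reporter, loosely mirroring
    the Last Reporter column of "show ip igmp group".
    responders lists the hosts whose Reports answered the Group Specific
    Query before it timed out.
    """
    if group not in group_table:
        return "ignored"                      # nothing known about this group
    if responders:
        group_table[group] = responders[0]    # a member remains; keep forwarding
        return "active"
    del group_table[group]                    # no Report heard; stop forwarding
    return "timed out"


if __name__ == "__main__":
    table = {"224.1.1.1": "1.1.1.11"}                       # H2 reported last
    print(process_leave(table, "224.1.1.1", ["1.1.1.12"]))  # H2 leaves, H3 answers
    print(table)
    print(process_leave(table, "224.1.1.1", []))            # H3 leaves, nobody answers
    print(table)
```

The two calls correspond to the two Leave scenarios in these slides: first a member remains and becomes the last reporter, then the last member leaves and the group entry disappears.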


IGMPv2 - Leaving a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1).]

IGMP State in rtr-a after H2 Leaves

rtr-a>sh ip igmp group
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires     Last Reporter
224.1.1.1        Ethernet0    6d17h     00:01:47    1.1.1.12


IGMPv2 Leaves
At this point, the group is still active. However, the router shows that Host 3 is the last host to send an IGMP Group Membership Report.


IGMPv2 - Leaving a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1). #1 The last member of 224.1.1.1 sends a Leave to 224.0.0.2. #2 rtr-a sends a Group Specific Query to 224.1.1.1.]

Last host leaves group; sends Leave message
Router sends Group specific query
No report is received
Group times out

IGMPv2 Leaves Example (continued):


H3 is the only remaining member of group 224.1.1.1
#1 - H3 leaves
#2 - Router sends group specific query to see if any other group members are present.
H3 was the last remaining member of the group so no IGMP Membership Report for group 224.1.1.1 is received and the group times out. (This typically takes from 1-3 seconds from the time that the Leave message is sent until the Group Specific Query times out and traffic stops flowing.)


IGMPv2 - Leaving a Group

[Figure: H1 (1.1.1.10), H2 (1.1.1.11), and H3 (1.1.1.12) on one segment with rtr-a (1.1.1.1).]

IGMP State in rtr-a after H3 Leaves

rtr-a>show ip igmp group
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires     Last Reporter


IGMPv2 Leaves
At this point, all hosts have left the 224.1.1.1 group on Ethernet0. This is indicated by the empty output of the show ip igmp group command on rtr-a above.


IGMPv2 - Response Tuning

[Figure: timeline of two Queries separated by the Query Interval; Host Membership Reports (assuming 18 active Groups) fall within the Query Response Interval that follows a Query.]

Report suppression mechanism tends to spread Reports out over the entire Query Response Interval

IGMPv2 Query-Response Tuning


Because random report timers are set on all hosts and report suppression is in effect - the reports are randomly distributed over the query response time interval instead of coming all at once. The query response interval is specified by the querying router as a guide for the end systems to set an upper bound on the random timer they will set for a report.


IGMPv2 - Response Tuning

[Figure: two timelines of periodic Queries, with the Query Interval and the Query Resp. Interval marked after each Query.]

Increasing the Query Response Interval will spread out Reports; decreasing Burstiness.

IGMPv2 Query Response Tuning (cont.)


The advantage of increasing the Query Interval and Query Response Interval is less overhead and bandwidth on the segment and less work for the routers and end systems to maintain the groups. The disadvantage of setting these intervals longer is slower detection of router failures in redundant multicast router environments. This is a common tradeoff in most routing protocols. Short "keepalive" intervals mean more overhead and work but allow for faster convergence in failure scenarios.
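A back-of-the-envelope calculation shows the overhead side of this tradeoff. The sketch assumes ideal report suppression (exactly one Report per active group per Query) and rejects a response time that is not shorter than the Query Interval, a constraint taken from RFC 2236 rather than from these notes; `tuning_summary` is a hypothetical helper name.

```python
def tuning_summary(query_interval, max_response_time, active_groups):
    """Estimate IGMP message load on one segment for the given timer settings.

    All times are in seconds. Assumes one Report per group per Query.
    """
    if max_response_time >= query_interval:
        raise ValueError("max response time must be less than the query interval")
    return {
        "queries_per_hour": 3600 / query_interval,
        "reports_per_hour": active_groups * 3600 / query_interval,
        # Average Report rate while hosts are answering a single Query.
        "reports_per_sec_in_window": active_groups / max_response_time,
    }


if __name__ == "__main__":
    print(tuning_summary(60, 10, active_groups=18))    # default timers
    print(tuning_summary(120, 20, active_groups=18))   # tuned timers
```

With the 18 active groups used in the earlier slide, doubling both timers halves the hourly message count and halves the Report rate inside each response window.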


IGMPv2 - Response Tuning

interface Ethernet 0
 ip pim sparse
 ip igmp query-max-response-time 20

Tuning the Query Response Interval (Default = 10 secs)

interface Ethernet 0
 ip pim sparse
 ip igmp query-interval 120

Tuning the Query Interval (Default = 60 secs)


Query Response Tuning (cont.)


Use default settings when possible. Tune with care!


IGMPv2 - Response Tuning
Verifying IGMPv2 Response Tuning Values

jabber>show ip igmp interface e0
Ethernet0 is up, line protocol is up
  Internet address is 10.1.3.1, subnet mask is 255.255.255.0
  IGMP is enabled on interface
  Current IGMP version is 2
  CGMP is disabled on interface
  IGMP query interval is 120 seconds
  IGMP querier timeout is 240 seconds
  IGMP max query response time is 20 seconds
  Inbound IGMP access group is not set
  Multicast routing is enabled on interface
  Multicast TTL threshold is 0
  Multicast designated router (DR) is 10.1.3.1 (this system)
  IGMP querying router is 10.1.3.1 (this system)
  Multicast groups joined: 224.0.1.40 224.2.127.254


Checking IGMP Response Tuning


Use the show ip igmp interface command to verify that the values are correct when you perform IGMP Response Tuning. In the above example, the Query Interval has been set to 120 seconds while the Max. Query Response Time is set to 20 seconds.


IGMP v1-v2 Interoperability


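[Figure: H1 (IGMPv1), H2 (IGMPv2), and H3 (IGMPv1) on one segment with an IGMPv1 router (1.1.1.1). #1 The router sends an IGMPv1 Query. #2 H2 answers with an IGMPv1 Report.]

Host H2 (IGMPv2):
MUST always send IGMPv1 Reports
MAY suppress IGMPv2 Leaves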


IGMP v1-v2 Interoperability


IGMPv1 routers will not recognize IGMPv2 Membership Reports. Therefore, when IGMPv2 hosts are present on the same network as an IGMPv1 router (which is serving as the query router), the IGMPv2 capable hosts MUST send IGMPv1 Membership Reports so the IGMPv1 router will recognize them. In addition, if the router is running IGMPv1, it makes no sense for hosts to send Leave Messages. However, it will not hurt if they do.


IGMP v1-v2 Interoperability


[Figure: hosts H1, H2 and H3 share a segment with IGMPv2 Router A (1.1.1.1). An IGMPv1 Report for group 224.1.1.1 is sent on the segment.]

Router A:
  MUST set a timer noting IGMPv1 member present for Group 224.1.1.1
  MUST ignore any v2 Leaves for Group 224.1.1.1 (until timer expires)

IGMP v1-v2 Interoperability (cont.)


If the query router is running IGMPv2, it must be able to recognize when v1 hosts are present, since v1 hosts are not aware of the v2 query response interval. Furthermore, in this situation an IGMPv2 router must ignore any IGMPv2 Leave Messages for the group, since the v1 hosts present can neither recognize nor respond to IGMPv2 Group-Specific Queries. If the router were to process the Leave Message and send out an IGMPv2 Group-Specific Query while the only remaining member of the group was an IGMPv1 host, the group would be pruned when it should not have been.
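The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not router code: the class name `GroupState`, the method names, and the injectable `clock` are all hypothetical, and the timer length is left as a parameter because the notes do not give a value.

```python
import time


class GroupState:
    """Per-group state kept by a hypothetical IGMPv2 querier (illustration only)."""

    def __init__(self, v1_host_timeout, clock=time.monotonic):
        self.v1_host_timeout = v1_host_timeout  # seconds; value is an assumption
        self.clock = clock
        self.v1_host_expires = None  # no IGMPv1 member seen yet

    def on_report(self, version):
        # Every IGMPv1 Report (re)starts the "IGMPv1 member present" timer.
        if version == 1:
            self.v1_host_expires = self.clock() + self.v1_host_timeout

    def v1_host_present(self):
        return self.v1_host_expires is not None and self.clock() < self.v1_host_expires

    def on_leave(self):
        # While the timer runs, v2 Leaves are ignored so the group is not pruned
        # out from under an IGMPv1 member that cannot answer a Group-Specific Query.
        if self.v1_host_present():
            return "ignore"
        return "send-group-specific-query"
```

Once the timer expires without a fresh IGMPv1 Report, Leaves are processed normally again.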


IGMP v1-v2 Interoperability


[Figure: hosts H1, H2 and H3 share a segment with IGMPv2 Router A (1.1.1.1) and IGMPv1 Router B (1.1.1.2).]

Router A:
  Must be manually configured to use IGMPv1 on this interface.

IGMP v1-v2 Interoperability (cont.)


All routers on a network segment must run the same version of IGMP! By default, IOS will run IGMPv2. If there are other IGMPv1 routers on the network segment, the Cisco router MUST be manually configured to run IGMPv1. The IOS configuration command used to manually configure the IGMP version on an interface is:

  ip igmp version {1 | 2}

Note that in IOS versions prior to 11.1, the router would automatically attempt to ascertain the proper version of IGMP to run on an interface. Unfortunately, there are many corner cases which make this problematic and prone to error. Therefore, as of IOS version 11.1, it is necessary to perform this task manually with the above command.


IGMP v1-v2 Interoperability


Determining which IGMP version is running on an interface
rtr-a> show ip igmp interface e0
Ethernet0 is up, line protocol is up
  Internet address is 1.1.1.1, subnet mask is 255.255.255.0
  IGMP is enabled on interface
  Current IGMP version is 2
  CGMP is disabled on interface
  IGMP query interval is 60 seconds
  IGMP querier timeout is 120 seconds
  IGMP max query response time is 10 seconds
  Inbound IGMP access group is not set
  Multicast routing is enabled on interface
  Multicast TTL threshold is 0
  Multicast designated router (DR) is 1.1.1.1 (this system)
  IGMP querying router is 1.1.1.1 (this system)
  Multicast groups joined: 224.0.1.40 224.2.127.254

Verifying the IGMP Version on an Interface


Use the show ip igmp interface command to determine which version of IGMP is currently active on an interface. This is indicated by the line in the above example that says "Current IGMP version is 2".


IGMPv3
draft-ietf-idmr-igmp-v3-??.txt
In EFT

Adds Include/Exclude Source Lists
  Enables hosts to listen only to a specified subset of the hosts sending to the group
Requires new IPMulticastListen API
  New IGMPv3 stack required in the O/S.
Apps must be rewritten to use IGMPv3 Include/Exclude features

IGMPv3
The IDMR working group is completing work on IGMPv3. The key change in IGMPv3 is the addition of Group Records, each containing a list of sources to Include or Exclude. This permits a host to signal the set of sources from which it wishes to receive group traffic. IGMPv3 requires that the IPMulticastListen API be changed to accommodate the Include/Exclude filter list. This means that the IGMP stack in the OS will have to be updated to support IGMPv3. In order to take advantage of the benefits of IGMPv3, applications must be (re)written to support the new API.


IGMPv3
New Membership Report address
  224.0.0.22 (IGMPv3 Routers)
  All IGMPv3 Hosts send reports to this address
    Instead of the target group address as in IGMPv1/v2
  All IGMPv3 Routers listen to this address
  Hosts do not listen or respond to this address

No Report Suppression
  All Hosts on wire respond to Queries
  Response Interval may be tuned over broad range
    Useful when large numbers of hosts reside on subnet

IGMPv3
IGMPv3 is assigned its own All IGMPv3 Routers link-local multicast group address, 224.0.0.22. IGMPv3 hosts no longer send their reports to the target multicast group address. Instead, they send their IGMPv3 Membership Reports to the All IGMPv3 Routers multicast address. Routers listen to the 224.0.0.22 address in order to receive and maintain IGMP membership state for every member on the subnet! This is a radical change from the behavior in IGMPv1/v2, where the routers only maintained group state on a per-subnet basis. Hosts do not listen to 224.0.0.22 and therefore do not hear other hosts' IGMPv3 Membership Reports.

IGMPv3 drops the Report Suppression mechanism that was used in IGMPv1/v2. All IGMPv3 hosts on the wire respond to Queries by sending IGMPv3 Membership Reports containing their total IGMP state for all groups. In order to prevent huge bursts of IGMPv3 Reports, the Response Interval may now be tuned over a much greater range than before. This permits the network engineer to adjust the burstiness of IGMPv3 Reports when there is a large number of hosts on the subnet.


IGMPv3 Query Packet Format


Packet layout (32 bits per row):
  Type = 0x11 (bits 0-7) | Max. Resp. Code (bits 8-15) | Checksum (bits 16-31)
  Group Address
  S, QRV | QQIC | Number of Sources (N)
  Source Address [1]
  Source Address [2]
  . . .
  Source Address [N]

Fields:
  Type = 0x11: IGMP Query
  Max. Resp. Time: Max. time to send a response
    if < 128, time in 1/10 secs
    if >= 128, FP value (12.8 - 3174.4 secs)
  Group Address: Multicast Group Address (0.0.0.0 for General Queries)
  S Flag: Suppresses processing by routers
  QRV (Querier Robustness Value): Affects timers and # of retries
  QQIC (Querier's Query Interval): Same format as Max. Resp. Time
  Number of Sources (N): Non-zero for Group-and-Source Query
  Source Address: Address of Source

Type
The same IGMPv2 type code 0x11 is used as the IGMPv3 Membership Query Type code.

Max. Response Time (1/10 seconds)


This field has been reformatted to permit longer times to be expressed. If the value is < 128, the time is absolute (0.1 to 12.7 seconds). If the value is >= 128, it is interpreted as a floating-point number as follows:
  +-+-+-+-+-+-+-+-+
  |1| exp | mant  |
  +-+-+-+-+-+-+-+-+
  value = (mant | 0x10) << (exp + 3)
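As a worked illustration of the encoding above, the following Python sketch decodes an 8-bit code into tenths of a second. The function name `decode_max_resp_code` is hypothetical; the bit positions (3-bit exp, 4-bit mant) are read directly from the diagram.

```python
def decode_max_resp_code(code: int) -> int:
    """Return the time, in tenths of a second, encoded in an 8-bit code."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in 8 bits")
    if code < 128:
        # High bit clear: the value is the time itself.
        return code
    exp = (code >> 4) & 0x07   # 3-bit exponent
    mant = code & 0x0F         # 4-bit mantissa
    return (mant | 0x10) << (exp + 3)
```

The smallest floating-point code, 0x80, decodes to 128 tenths (12.8 seconds) and the largest, 0xFF, to 31744 tenths (3174.4 seconds), which matches the range quoted on the slide. Since the QQIC field uses the same format, the same function would apply to it, with the result in whatever unit that field carries.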

Group Address
This field is identical to the IGMPv2 version of this field. It is set to 0.0.0.0 for General Queries.

S Flag
Indicates that the routers that receive this message should not process it.

QRV (Querier Robustness Value)


This value causes all hosts to adjust their Robustness Values which in turn affect various timers and retry counts. Increasing this value provides more protocol robustness at the expense of latency.

QQIC (Querier Query Interval)


This field indicates the Query Interval in use by the Querying router. Its format is the same as the Max. Response Time field.

Number of Sources
The number of Source Addresses in the Group-and-Source-Specific Query.

IGMPv3 Report Packet Format


Report layout (32 bits per row):
  Type = 0x22 (bits 0-7) | Reserved (bits 8-15) | Checksum (bits 16-31)
  Reserved | # of Group Records (M)
  Group Record [1]
  Group Record [2]
  . . .
  Group Record [M]

Group Record layout (32 bits per row):
  Record Type (bits 0-7) | Aux Data Len (bits 8-15) | # of Sources (N) (bits 16-31)
  Multicast Group Address
  Source Address [1]
  Source Address [2]
  . . .
  Source Address [N]
  Auxiliary Data

Fields:
  # of Group Records (M): Number of Group Records in Report
  Group Records 1 - M: Group address plus list of zero or more sources to Include/Exclude (see Group Record layout)
  Record Type: Include, Exclude, Chg-to-Include, Chg-to-Exclude, Add, Remove
  # of Sources (N): Number of Sources in Record
  Source Address 1 - N: Address of Source

# of Group Records
Indicates the number of Group records that are contained in the Membership Report. IGMPv3 Membership Reports can contain IGMP state on a number of Groups and Sources within the group. The source information specifies which Sources to Include or Exclude.

Aux. Data Length (Group Records)


Indicates the size of the Auxiliary Data area.

Multicast Address (Group Records)


The multicast group address of the joined Group.

# of Sources (Group Records)


Indicates the number of Sources in the list.


IGMPv3 Example
[Figure: two sources, 1.1.1.1 and 2.2.2.2, both send to group 224.1.1.1. The diagram shows routers R1, R2 and R3; host H1, a member of 224.1.1.1, is attached to R3 and sends "IGMPv3: Join 224.1.1.1, Include: 1.1.1.1".]

H1 wants to receive only S = 1.1.1.1 and no other. With IGMPv3, specific sources can be joined (S = 1.1.1.1 in this case).

IGMPv3 Example
In this example, host H1 wishes to join group 224.1.1.1 but only wishes to receive traffic from Source 1.1.1.1. The IGMPv3 host can signal the designated router, R3, that it is only interested in multicast traffic from Source 1.1.1.1 for Group 224.1.1.1. Router R3 could then potentially prune the unwanted source, 2.2.2.2.


IGMPv3 - Joining a Group

[Figure: hosts H1 (1.1.1.10), H2 (1.1.1.11) and H3 (1.1.1.12) share a segment with rtr-a (1.1.1.1). H2 sends a v3 Report to 224.0.0.22 containing "Group: 224.1.1.1, Exclude: <empty>".]

Joining member sends IGMPv3 Report to 224.0.0.22 immediately upon joining

Asynchronous Joins
Members joining a group do not have to wait for a query to join; they send an unsolicited IGMPv3 Membership Report indicating their interest. This reduces join latency for the end system joining if no other members are present. In the example above, Host 2 is joining multicast group 224.1.1.1 and is willing to receive any and all sources in this group.


IGMPv3 - Joining specific Source(s)

[Figure: hosts H1 (1.1.1.10), H2 (1.1.1.11) and H3 (1.1.1.12) share a segment with rtr-a (1.1.1.1). H2 sends a v3 Report to 224.0.0.22 containing "Group: 224.1.1.1, Include: 10.0.0.1".]

IGMPv3 Report contains desired source(s) in the Include list.
Only Included source(s) are joined.

Joining only specific Source(s)


A host may signal the router that it wishes to receive only a specific set of sources sending to the group. This is done by using an Include list in the Group record of the Report. When an Include list is in use, only the specific sources listed in the Include list are joined. In the example above, Host 2 is joining multicast group 224.1.1.1 and only wants to receive source 10.0.0.1 sending to the group.


IGMPv3 - Excluding specific Source(s)

[Figure: hosts H1 (1.1.1.10), H2 (1.1.1.11) and H3 (1.1.1.12) share a segment with rtr-a (1.1.1.1). H2 sends a v3 Report to 224.0.0.22 containing "Group: 224.1.1.1, Exclude: 7.7.7.7".]

IGMPv3 Report contains undesired source(s) in the Exclude list.
All sources except Excluded source(s) are joined.

Excluding specific Source(s)


A host may signal the router that it wishes to receive all sources sending to the group except a specific set of undesired sources. This is done by using an Exclude list in the Group record of the Report. When an Exclude list is in use, all sources in the group are joined except the sources listed in the Exclude list. In the example above, Host 2 is joining multicast group 224.1.1.1 and wishes to receive multicast traffic from any source in the group except source 7.7.7.7.
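The Include and Exclude semantics from the last three examples can be summarized in a small Python sketch. It is a simplified model, not protocol code: the function name `source_wanted` and the string mode labels are hypothetical.

```python
def source_wanted(mode: str, sources: set, sender: str) -> bool:
    """Decide whether traffic from `sender` is wanted under a group record's filter."""
    if mode == "include":
        # Only the listed sources are joined.
        return sender in sources
    if mode == "exclude":
        # Every source is joined except the listed ones.
        return sender not in sources
    raise ValueError(f"unknown filter mode: {mode}")
```

Note that an Exclude list that is empty accepts every source, which is how the plain "join the group" case is expressed.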


IGMPv3 - Maintaining State

[Figure: the router (1.1.1.1) sends a Query; H1 (1.1.1.10), H2 (1.1.1.11) and H3 (1.1.1.12) each answer with a v3 Report to 224.0.0.22.]

Router sends periodic queries
All IGMPv3 members respond
  Reports contain multiple Group state records

Query-Response Process
The router multicasts periodic Membership Queries to the All-Hosts (224.0.0.1) group address. All hosts on the wire respond by sending back an IGMPv3 Membership Report that contains their complete IGMP Group state for the interface.


L2 Multicast Frame Switching

Problem: Layer 2 Flooding of Multicast Frames


Typical L2 switches treat multicast traffic as unknown or broadcast and must flood the frame to every port
Static entries can sometimes be set to specify which ports should receive which group(s) of multicast traffic
Dynamic configuration of these entries would cut down on user administration

[Figure labels: PIM, Multicast M]

L2 Multicast Switching
For most L2 switches, multicast traffic is normally treated like an unknown MAC address or broadcast frame, which causes the frame to be flooded out every port within a VLAN at rates of over 1 Mbps. This is fine for unknowns and broadcasts but, as we have seen earlier, IP Multicast hosts may join and be interested in only specific multicast groups. Again, on most L2 switches, all this traffic is forwarded out all ports, resulting in wasted bandwidth on both the segments and the end stations.

One way around this on Catalyst switches is to use the Command Line Interface to program the switch manually to associate a multicast MAC address with, say, ports 5, 6 and 7 so that only those ports receive the multicast traffic destined for the multicast group. This works fine, but again we know IP Multicast hosts dynamically join and leave groups using IGMP to signal to the Multicast Router. This static way of entering the multicast information is not very scalable. Dynamic configuration of the switch's forwarding tables would be a better idea, and would cut down on user administration.


L2 Multicast Frame Switching

Solution 1: IGMP Snooping


Switches become IGMP aware
IGMP packets intercepted by the NMP or by special hardware ASICs
Switch must examine contents of IGMP messages to determine which ports want what traffic
  IGMP membership reports
  IGMP leave messages

Impact on switch:
  Must process ALL Layer 2 multicast packets
  Admin. load increases with multicast traffic load
  Requires special hardware to maintain throughput

[Figure labels: PIM, IGMP]

Solution 1: IGMP Snooping


As the name implies, switches become IGMP aware and listen in on the IGMP conversations between hosts and routers. This requires the processor in the switch to identify and intercept a copy of all IGMP packets flowing between routers and hosts. This includes:
  IGMP Membership Reports
  IGMP Leaves
If care is not taken as to how IGMP Snooping is implemented, a switch may have to intercept ALL Layer 2 multicast packets in order to identify IGMP packets. This can have a significant impact on the switch's performance. Proper designs require special hardware to avoid this problem. This can directly affect the overall cost of the switch.


Typical L2 Switch Architecture


[Figure: a LAN switch made up of a CPU, a Switching Engine and a CAM Table. Router A is on port 1 and Host 1 on port 2; Hosts 2, 3 and 4 are on further ports. Host 4 has MAC address 0000.6503.1d0e.]

CAM Table:
  MAC Address      Port
  0000.6503.1d0e   5

Typical Layer 2 Switch


Most Layer 2 switches consist of the following components:

Switching Engine - Performs the actual switching of packets from the input port to the output port(s) under the control of the Content Addressable Memory (CAM) Table. If there is no entry in the CAM Table that matches the destination MAC address, the Switching Engine floods the packet to all ports in an attempt to ensure that the packet reaches the destination.

CAM Table - The information in this table is used to control the operation of the Switching Engine. Each entry in this table contains a Layer 2 destination MAC address and the output port(s) to which packets addressed to this destination should be switched.

CPU - The switch's main CPU populates the CAM Table with destination MAC addresses so that packets can be switched efficiently by the Switching Engine. The CPU learns the ports associated with a particular MAC address by watching arriving traffic sent by hosts. It then populates the CAM Table with this learned information. (Switches can typically also be instructed to populate the CAM Table with specific MAC address to port mapping information via configuration commands.)

In the example shown above, the switch has learned the port (port 5) associated with Host 4's MAC address (0000.6503.1d0e). This information has been stored by the CPU in the CAM Table. Because of this CAM Table entry, packets arriving with Host 4's MAC address as the destination are switched by the Switching Engine to port 5, as can be seen in the drawing above. In the next few pages, we will see how this simple Layer 2 architecture might be used to implement IGMP Snooping and its potential impact on the switch.


Typical L2 Switch - 1st Join

[Figure: LAN switch with IGMP Snooping enabled. The CPU is on port 0, Router A on port 1 and Host 1 on port 2; Hosts 2, 3 and 4 are on further ports. Host 1 sends an IGMP Report for 224.1.2.3.]

CAM Table:
  MAC Address      Ports
  (empty)

IGMP Snooping in L2 Switches


In the above example, the CPU has been programmed to perform IGMP Snooping. This requires the CPU to listen to all IGMP traffic and then add an appropriate Layer 2 multicast MAC address to the CAM Table in order to constrain the IP Multicast traffic to only those ports that require the traffic.

Initially, when the first host (Host 1) joins group 224.1.2.3, there is no entry in the CAM Table associated with the Layer 2 MAC address equivalent to this group address. Therefore, the initial IGMP Group Membership Report sent by Host 1 is flooded to all ports, including the switch's CPU and the Router. Overhearing this, the CPU populates the CAM Table with an entry of 0x0100.5e01.0203, which is the L2 MAC address equivalent of IP multicast address 224.1.2.3. Additionally, this entry is populated with the port associated with Host 1 (port 2) as well as the Router and the CPU ports (ports 0 and 1). The CPU port must be included in order for the Switching Engine to continue to forward any further IGMP messages addressed to this group to the CPU for processing.
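The group-to-MAC equivalence used above (224.1.2.3 and 0100.5e01.0203) follows the standard IPv4 multicast mapping: the low-order 23 bits of the group address are placed behind the fixed prefix 01-00-5e. A minimal Python sketch, with the hypothetical function name `multicast_mac`, reproduces the value from the example:

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Layer 2 MAC address (dotted hex form)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF       # keep the low-order 23 bits of the group
    mac = 0x01005E000000 | low23       # fixed 01-00-5e prefix, 24th bit always 0
    digits = f"{mac:012x}"
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))
```

Because only 23 of the 28 group bits survive, several groups share one MAC address; for example 225.129.2.3 maps to the same 0100.5e01.0203 entry, so a CAM Table entry keyed on the MAC address alone cannot tell those groups apart.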


Typical L2 Switch - 1st Join

[Figure: the same LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2) after the CPU has processed Host 1's IGMP Report.]

CAM Table:
  MAC Address      Ports
  0100.5e01.0203   0,1,2     (Entry Added)


Typical L2 Switch - 2nd Join

[Figure: the same LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2). A second IGMP Report for 224.1.2.3 arrives.]

CAM Table:
  MAC Address      Ports
  0100.5e01.0203   0,1,2

IGMP Snooping in L2 Switches


Now let's assume that a second host (Host 4) also joins the group by sending an IGMP Report to group 224.1.2.3. Because of the CAM Table entry for 0x0100.5e01.0203, this IGMP Report is constrained to only Host 1, the Router and the CPU. When the CPU receives the IGMP Report, it simply adds the port (port 5) on which Host 4 is connected to the CAM Table entry. This results in ports 0, 1, 2 and 5 being associated with the multicast MAC address 0x0100.5e01.0203.


Typical L2 Switch - 2nd Join

[Figure: the same LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2) after the CPU has processed the second IGMP Report.]

CAM Table:
  MAC Address      Ports
  0100.5e01.0203   0,1,2,5   (Port Added)


Typical L2 Switch - Meltdown!

[Figure: the same LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1). An MPEG server sends a 6Mbps MPEG video stream into the Switching Engine, which copies it to every port in the CAM Table entry, including the CPU port. The CPU is captioned "6Mbps !!! Choke, Gasp, Wheeze!!"]

CAM Table:
  MAC Address      Ports
  0100.5e01.0203   0,1,2,5

IGMP Snooping in L2 Switches


Let us now assume that Host 1 begins transmitting a 1.5Mbps MPEG video stream to multicast group 224.1.2.3. Because the destination MAC address of this stream maps to 0x0100.5e01.0203, the Switching Engine dutifully switches this traffic to Host 4, the Router and the CPU! In most cases, the switch's CPU does not have sufficient horsepower to keep up with this high-rate flow of multicast traffic and switch performance can suffer. In some cases, the switch can actually fail under such loads.

Summary
IGMP Snooping can be (and often is) implemented in low-end, Layer 2 only switches using techniques similar to the above. While this is fine for extremely low data-rate multicast flows or carefully orchestrated vendor demonstrations of their switch's IGMP Snooping feature, it is generally inadequate for real-world use.


L3 Aware Switch
[Figure: LAN switch with IGMP Snooping enabled and a Switching Engine with L3 ASICs. The CPU is on port 0, Router A on port 1 and Host 1 on port 2; Hosts 2, 3 and 4 are on further ports.]

CAM Table:
  MAC Address      L3      Ports
  0100.5exx.xxxx   IGMP    0         (IGMP Processing Entry)

IGMP Snooping in L3-aware Switches


In order to properly implement IGMP Snooping on a switch without suffering performance degradation, it is necessary to make the switch Layer 3 aware. This is typically accomplished by adding Layer 3 ASICs to the Switching Engine and extending the CAM Table so that entries may contain additional Layer 3 information that can be used to make switching decisions. (In case it is not obvious, this means the switch will cost more money.)

In the above example, we have just such a Layer 3 aware switch that has been programmed to perform IGMP Snooping using some of the added Layer 3 capabilities in the switch's architecture. In order to accomplish this, the CPU populates the CAM Table with a special entry to capture any and all IGMP packets. There can be many ways to do this, but in the example above the CAM Table entry contains a wildcard MAC address that will match on any IP multicast address. Furthermore, the Layer 3 part of the packet must contain an IGMP protocol packet in order for the entry to match and cause the packet to be switched to the CPU.


L3 Aware Switch - 1st Join

[Figure: the L3-aware LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2). Host 1 sends an IGMP Report for 224.1.2.3.]

CAM Table:
  MAC Address      L3      Ports
  0100.5exx.xxxx   IGMP    0
  0100.5e01.0203   !IGMP   1,2

IGMP Snooping in L3-aware Switches


Let's assume that the first host (Host 1) now joins group 224.1.2.3 and signals this by sending an IGMP Report. This report matches on the first entry in the CAM Table and is switched to the CPU. The CPU responds by forwarding the packet on to the Router (for normal IGMP processing) and then adds a second entry to the CAM Table to switch 224.1.2.3 group traffic to Host 1 and the Router (ports 1 and 2). This second entry will match if and only if:
  The packet is addressed to multicast MAC address 0x0100.5e01.0203 (the Layer 2 equivalent of group address 224.1.2.3), and
  The packet is not an IGMP packet.
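The two-entry lookup described above can be modeled in Python as follows. This is a simplified software sketch of what the notes attribute to Layer 3 ASICs; the names `CamEntry` and `lookup` are hypothetical, and the wildcard 0100.5exx.xxxx is approximated by a match on the first six hex digits.

```python
from dataclasses import dataclass

IGMP_PROTO = 2  # IP protocol number assigned to IGMP


@dataclass
class CamEntry:
    mac_prefix: str   # hex digits without dots; shorter than 12 digits acts as a wildcard
    match_igmp: bool  # True: match only IGMP packets; False: match only non-IGMP packets
    ports: frozenset


def lookup(cam, dst_mac: str, ip_proto: int):
    """Return the output ports of the first matching entry, or None (flood)."""
    digits = dst_mac.replace(".", "").lower()
    is_igmp = ip_proto == IGMP_PROTO
    for entry in cam:
        if digits.startswith(entry.mac_prefix) and entry.match_igmp == is_igmp:
            return entry.ports
    return None


# CAM Table after the first join, as shown on the slide.
cam = [
    CamEntry("01005e", True, frozenset({0})),            # IGMP processing entry -> CPU
    CamEntry("01005e010203", False, frozenset({1, 2})),  # group data -> Router, Host 1
]
```

With this table, an IGMP Report addressed to 0100.5e01.0203 goes only to the CPU (port 0), while a UDP video packet (IP protocol 17) for the same MAC address goes only to ports 1 and 2 and never reaches the CPU.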


L3 Aware Switch - 2nd Join

[Figure: the L3-aware LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2). A second IGMP Report for 224.1.2.3 arrives.]

CAM Table:
  MAC Address      L3      Ports
  0100.5exx.xxxx   IGMP    0
  0100.5e01.0203   !IGMP   1,2,5     (Port Added)

IGMP Snooping in L3-aware Switches


Now let's assume that Host 4 is again the second host to join 224.1.2.3 and therefore sends an IGMP Report to 224.1.2.3. Once again, the IGMP Report matches on the first entry and is switched to the CPU. The CPU responds by forwarding a copy of the IGMP Report to the Router and by adding the port associated with Host 4 (port 5) to the port list in the second CAM Table entry.


L3 Aware Switch
[Figure: the L3-aware LAN switch (IGMP Snooping enabled; CPU on port 0, Router A on port 1, Host 1 on port 2). A 6Mbps MPEG video stream passes through the Switching Engine without being copied to the CPU, which is captioned "Ahhh, That's more like it!"]

CAM Table:
  MAC Address      L3      Ports
  0100.5exx.xxxx   IGMP    0
  0100.5e01.0203   !IGMP   1,2,5

IGMP Snooping in L3-aware Switches


In the final step of our example, Host 1 once again starts up the 1.5Mbps MPEG video stream to group 224.1.2.3. Packets in this stream will not match on the first CAM Table entry but will instead match on the second entry. Therefore, the video stream is switched only to Host 4 and the Router, and the CPU is not burdened with this unwanted data stream.

Summary
In order to construct a switch that is capable of IGMP Snooping without suffering a performance hit, the switch must use special Layer 3 ASICs or some similar technique. This increases the overall cost of the switch.


L2 Multicast Frame Switching

Solution 2: CGMP - Cisco Group Management Protocol

Runs on both the switches and the router
Router sends CGMP multicast packets to the switches at a well-known multicast MAC address:
  0100.0cdd.dddd
CGMP packet contains:
  Type field - Join or Leave
  MAC address of the IGMP client
  Multicast address of the group
Switch uses CGMP packet info to add or remove an entry for a particular multicast MAC address

[Figure labels: PIM, CGMP Commands, IGMP]

Solution 2: CGMP
CGMP is based on a client-server model in which the router can be considered a CGMP server and the switch takes on the client role. There are software components running on both devices, with the router translating IGMP messages into CGMP commands which are then executed on the Catalyst 5000 NMP and used to program the EARL's forwarding tables with the correct multicast entries. Since the hosts and routers use well-known IP Multicast addresses, the EARL can be preprogrammed to direct IGMP control packets both to the router and to the NMP. We will see the NMP's use of these IGMP control packets in a later slide.

The basis of CGMP is that the IP Multicast router sees all IGMP packets and can therefore inform the switch when specific hosts join or leave multicast groups. The switch then uses this information to program its forwarding table. When the router sees an IGMP control packet, it creates a CGMP packet that contains the request type (Join or Leave), the Layer 2 multicast MAC address, and the actual MAC address of the client. This packet is sent to a well-known address on which all CGMP switches listen. It is then interpreted and the proper entries are created in the switch's CAM Table to constrain the forwarding of multicast traffic for this group.


CGMP Basics

[Figure, panel (a): an IGMP Report is sent through the switch. Panel (b): a CGMP Join is sent to the switch. Both panels label switch ports 1/1 and 5/1.]

IGMP Report (panel a):
  Dst MAC = 0100.5e01.0203
  Src MAC = 0080.c7a2.1093
  Dst IP = 224.1.2.3
  Src IP = 192.1.1.1
  IGMP Group = 224.1.2.3

CGMP Join (panel b):
  USA = 0080.c7a2.1093
  GDA = 0100.5e01.0203

CGMP Example
In this example, the client asynchronously sends an IGMP Membership Report when it wants to join the group. The Router converts this IGMP Membership Report into a CGMP Join containing:
  USA - Unicast Source Address
  GDA - Group Destination Address
The CGMP Join is multicast to a well-known (non-IP) multicast MAC address on which the switch listens.


CGMP Packet Format


Packet layout (32 bits per row):
  Ver (bits 0-3) | Type (bits 4-7) | Reserved (bits 8-23) | Count (bits 24-31)
  GDA
  GDA (cont.) | USA
  USA (cont.)

Fields:
  Ver (4 bits): Only version 1 is currently recognized and supported
  Type (4 bits): 0 = Join, 1 = Leave
  Reserved (2 bytes): Must be set to 0 and ignored
  Count (1 byte): Number of GDA/USA pairs in the packet
  GDA (6 bytes): Group Destination Address - IEEE MAC-level canonical format
  USA (6 bytes): Unicast Source Address - IEEE MAC-level canonical format

CGMP Packet Format


All CGMP packets are encapsulated in SNAP frames using Cisco's ORG ID (0x00000c) with an Ethertype of 0x2001:
  MAC Header | 802.2 Header: aa aa 03 | SNAP Header: 00 00 0c 20 01 | CGMP Header

Most sniffers and software capture programs do not decode CGMP (have fun with the hex decodes)
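Since most capture tools leave the decoding to the reader, a small decoder can help. The following is a minimal sketch of packing and parsing the CGMP payload as laid out on the slide (Ver, Type, Reserved, Count, then GDA/USA pairs). It assumes that Ver occupies the high-order nibble of the first byte and that the fields are in network byte order. The helper names are hypothetical.

```python
import struct

JOIN, LEAVE = 0, 1
LLC_SNAP = bytes.fromhex("aaaa0300000c2001")  # 802.2 + SNAP header bytes from the slide

def mac_to_bytes(mac: str) -> bytes:
    """Convert dotted MAC notation (xxxx.xxxx.xxxx) to 6 raw bytes."""
    return bytes.fromhex(mac.replace(".", ""))

def bytes_to_mac(raw: bytes) -> str:
    """Convert 6 raw bytes back to dotted MAC notation."""
    digits = raw.hex()
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

def build_cgmp(msg_type: int, pairs) -> bytes:
    """Build a CGMP payload from a list of (GDA, USA) pairs."""
    ver_type = (1 << 4) | (msg_type & 0x0F)      # assumption: Ver is the high nibble
    header = struct.pack("!BHB", ver_type, 0, len(pairs))
    body = b"".join(mac_to_bytes(g) + mac_to_bytes(u) for g, u in pairs)
    return header + body

def parse_cgmp(payload: bytes):
    """Return (version, type, [(GDA, USA), ...]) from a CGMP payload."""
    ver_type, _reserved, count = struct.unpack("!BHB", payload[:4])
    pairs = []
    for i in range(count):
        off = 4 + 12 * i
        pairs.append((bytes_to_mac(payload[off:off + 6]),
                      bytes_to_mac(payload[off + 6:off + 12])))
    return ver_type >> 4, ver_type & 0x0F, pairs

payload = build_cgmp(JOIN, [("0100.5e01.0203", "0080.c7a2.1093")])
print((LLC_SNAP + payload).hex())
print(parse_cgmp(payload))
```

When reading a hex dump, the bytes aa aa 03 00 00 0c 20 01 mark the start of the CGMP payload.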

CGMP 1st Join

[Figure: Router A (port 1) and Hosts 1-4 attached to a simple LAN switch (CPU, switching engine, CAM table). Host 1 (0080.c7a2.1093) sends an IGMP Report for 224.1.2.3. The CAM table (MAC Address / Ports) has no entries.]

CGMP Implementation in L2 switches


Because the switch relies on the router to assist in constraining IP multicast traffic at Layer 2, CGMP can be implemented very easily in low-end, Layer 2-only switches. In the above CGMP example, the first host (Host 1) joins multicast group 224.1.2.3 by sending an IGMP Membership Report. Because there is no matching entry in the CAM Table, the IGMP Membership Report is flooded to all ports, including the port of the router, which processes the IGMP Report.

CGMP 1st Join

[Figure: same topology. Router A sends a CGMP Join to the switch with USA 0080.c7a2.1093 and GDA 0100.5e01.0203. The CAM table still has no entries.]

CGMP Implementation in L2 switches


In addition to performing normal IGMP processing of the IGMP Membership Report, the router also converts it into a CGMP Join message containing the MAC address of the host that sent the IGMP Report (Host 1) in the USA field and the Layer 2 MAC address equivalent of group 224.1.2.3 in the GDA field. This CGMP Join message is then multicast back to the switch. When the switch receives the CGMP Join, it uses the host address in the USA field to determine the port where the host resides. This is done by scanning the CAM table for the host's MAC address to obtain the associated port number. (This step is not shown in the example above.) The CPU then populates its CAM Table with an entry containing the multicast MAC address from the GDA field, the port number of the host that joined, and the port numbers of any routers connected to the switch.

Note: The CPU has many ways to determine which ports have routers attached. These include listening for DVMRP Probes, PIM Hellos, and IGMP Queries.

CGMP 1st Join

[Figure: same topology, Host 1 = 0080.c7a2.1093. Entry added to the CAM table:]

MAC Address       Ports
0100.5e01.0203    1,2


CGMP 2nd Join

[Figure: same topology. Host 4 (0080.c7b3.2174) sends an IGMP Report for 224.1.2.3.]

MAC Address       Ports
0100.5e01.0203    1,2

CGMP Implementation in L2 switches


Next, let's assume that (once again) Host 4 is the second host to join group 224.1.2.3 and signals this by sending an IGMP Report to 224.1.2.3. Because the IGMP Report is sent to group 224.1.2.3, its destination MAC address is 0x0100.5e01.0203, which matches the first entry in the CAM Table shown above. As a result, the IGMP Report is forwarded to Host 1 and the router.

CGMP 2nd Join

[Figure: same topology. Router A sends a CGMP Join to the switch with USA 0080.c7b3.2174 (Host 4) and GDA 0100.5e01.0203.]

MAC Address       Ports
0100.5e01.0203    1,2

CGMP Implementation in L2 switches


In addition to performing normal IGMP processing of the IGMP Membership Report, the Router again converts it to a CGMP Join message containing the MAC address of Host 4 in the USA field and the Layer 2 MAC address equivalent of group 224.1.2.3 in the GDA field. The resulting CGMP Join message is then multicast back to the switch. When the switch receives this CGMP Join, it again uses the host address in the USA field to determine the port where the Host resides. (In this case, port 5.) The CPU then adds port 5 to the port list in the existing CAM Table entry associated with the multicast MAC address from the GDA field.
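The two joins walked through above can be reproduced with a toy model of the switch. This is only a sketch of the behavior described in the notes, not the Catalyst implementation. The class and method names are hypothetical, and the port numbers (router on port 1, Host 1 on port 2, Host 4 on port 5) are taken from the example.

```python
class CgmpSwitch:
    """Toy model of a CGMP switch's CAM handling."""

    def __init__(self, router_ports):
        self.router_ports = set(router_ports)
        self.unicast_cam = {}     # host MAC -> port (learned from source addresses)
        self.multicast_cam = {}   # group MAC (GDA) -> set of egress ports

    def learn(self, mac, port):
        self.unicast_cam[mac] = port

    def cgmp_join(self, gda, usa):
        port = self.unicast_cam.get(usa)
        if port is None:
            return                # assumption: ignore Joins for unknown hosts
        # A new entry starts with the router ports; then the host port is added.
        ports = self.multicast_cam.setdefault(gda, set(self.router_ports))
        ports.add(port)

    def egress_ports(self, dst_mac, in_port, all_ports):
        ports = self.multicast_cam.get(dst_mac)
        if ports is None:
            ports = set(all_ports)          # no CAM entry: flood
        return sorted(ports - {in_port})    # never send back out the ingress port

switch = CgmpSwitch(router_ports=[1])
switch.learn("0080.c7a2.1093", 2)                      # Host 1
switch.learn("0080.c7b3.2174", 5)                      # Host 4
switch.cgmp_join("0100.5e01.0203", "0080.c7a2.1093")   # 1st Join
switch.cgmp_join("0100.5e01.0203", "0080.c7b3.2174")   # 2nd Join
print(sorted(switch.multicast_cam["0100.5e01.0203"]))
print(switch.egress_ports("0100.5e01.0203", 2, range(1, 6)))
```

After both Joins the entry lists ports 1, 2 and 5. A frame for the group arriving from Host 1 on port 2 leaves only on ports 1 and 5, and the CPU port never appears in the entry.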

CGMP 2nd Join

[Figure: same topology, Host 4 = 0080.c7b3.2174. Port added to the existing CAM table entry:]

MAC Address       Ports
0100.5e01.0203    1,2,5


CGMP No Load on Switch

[Figure: same topology. Host 1 (MPEG Server) sources a stream labeled "6Mbps MPEG Video" through the switching engine. The CPU is not in the forwarding path.]

MAC Address       Ports
0100.5e01.0203    1,2,5

CGMP Implementation in L2 switches


In our final drawing of the example, Host 1 again begins sourcing its 1.5Mbps MPEG video stream to group 224.1.2.3. When this stream hits the switch, it matches the first entry in the CAM Table and is switched to Host 4 and the router. Note that because the CPU's port is not included in this entry, the high-rate video stream is not sent to the CPU and hence does not impact the performance of the switch.

CGMP Messages

GDA         USA          Join/Leave   Meaning
Mcst MAC    Client MAC   Join         Add USA's port to the Group
Mcst MAC    Client MAC   Leave        Delete USA's port from Group
00000000    Router MAC   Join         Assign Port = Router Port
00000000    Router MAC   Leave        Deassign Port = Router Port
Mcst MAC    00000000     Leave        Delete Group from CAM
00000000    00000000     Leave        Delete ALL Groups from CAM

CGMP Messages
All of these messages are sent by the router (switches do not originate CGMP messages).

All of these messages are contained within a given VLAN.

When a JOIN is sent with a non-zero GDA and a non-zero USA, this adds the switch port where the USA is located to the given group list in the CAM table (normal operation after a router receives an IGMP JOIN).

When a LEAVE is sent with a non-zero GDA and a client's MAC address for the USA, that client's port is deleted from the group (selectively delete a single client based on an IGMP leave).

When a JOIN is sent with a GDA of all zeros and the router's own MAC address as the USA, this is an advertisement that lets the switches detect which incoming switch ports are router ports (occurs every 60 seconds so switches can dynamically find the CGMP-speaking routers).

When a LEAVE is sent with an all-zeros GDA and a USA of the router's MAC, all groups and ports associated with that router port are deleted (the router has withdrawn its CGMP ability).

When a LEAVE is sent with a non-zero GDA and an all-zeros USA, this globally deletes the group in all switches (used to globally delete the group after the last member has left via IGMP state).

When a LEAVE is sent with all zeros in both GDA and USA, all groups are deleted in all switches (occurs when CGMP is disabled on the router or a clear ip cgmp is executed for a given router interface/VLAN).
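The six combinations above amount to a lookup on the message type and on whether each address field is zero. A minimal sketch of that decision table follows. The function name and the returned strings are illustrative only, and the model assumes a non-zero USA paired with a zero GDA is always a router MAC, as in the table.

```python
ZERO = "0000.0000.0000"

# Key: (message type, GDA is non-zero, USA is non-zero)
MEANINGS = {
    ("join",  True,  True):  "add USA's port to the group",
    ("leave", True,  True):  "delete USA's port from the group",
    ("join",  False, True):  "assign USA's port as a router port",
    ("leave", False, True):  "deassign USA's port as a router port",
    ("leave", True,  False): "delete the group from the CAM",
    ("leave", False, False): "delete all groups from the CAM",
}

def cgmp_meaning(msg_type: str, gda: str, usa: str) -> str:
    """Interpret a CGMP message the way the table on the slide does."""
    key = (msg_type.lower(), gda != ZERO, usa != ZERO)
    return MEANINGS.get(key, "combination not defined in the table")

print(cgmp_meaning("join", "0100.5e01.0203", "0080.c7a2.1093"))
print(cgmp_meaning("leave", ZERO, ZERO))
```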

CGMP Router Commands

Command                        Notes
ip cgmp                        Enables CGMP per (sub)interface
ip cgmp proxy                  Enables CGMP and DVMRP Proxy per (sub)interface
debug ip cgmp                  Debugs CGMP activity
show ip igmp interface [int]   Shows if CGMP is enabled or disabled
clear ip cgmp [int]            Clears all CGMP groups

CGMP Router Commands


All you really need to know is the first command for the majority of installations!

CGMP Switch Commands

Command                    Notes
set cgmp enable|disable    Globally enables or disables CGMP
show multicast router      Displays which ports are router ports
show multicast group       Shows which groups are active
show cgmp statistics       Shows CGMP statistics
clear cgmp statistics      Clears CGMP statistics

Note: Cat5000 series switch commands shown.

CGMP Switch Commands


All you really need to know is the enable form of the first command for the majority of installations!

CGMP Switch Commands

Command                             Notes
set multi router <mod/port>         Designates a port as a router port
clear multicast router <mod/port>   Deletes multicast router port information
set cgmp leave <en|dis>             Enables/disables fast leave processing

Note: Cat5000 series switch commands shown.

CGMP Switch Commands


The set multi router <mod/port> command may be used to manually designate a port as having a router attached. This might be necessary if the router connected to this port is running some non-standard multicast protocol that the switch does not recognize.

Summary - Frame Switches

IGMP snooping
  Switches with Layer 3 aware ASICs
    High-throughput performance maintained
    Increases cost of switches
  Switches without Layer 3 aware ASICs
    Suffer serious performance degradation

CGMP
  Requires Cisco routers and switches
  Can be implemented in low-cost switches

Summary
IGMP Snooping can actually provide some performance optimizations over CGMP. However, it requires switches implemented with more costly Layer 3 aware ASICs in order to avoid performance impacts. CGMP is a proprietary protocol that is only implemented on Cisco routers and switches, and it does not offer quite as many performance optimizations as IGMP Snooping. However, it is the ONLY choice if one desires to provide Layer 2 multicast traffic constraint on low-end switches such as the Cisco Catalyst 1900 or other equivalent switches.

Design Issue Server Location

[Figure: a Catalyst 5000 trunked to three Catalyst 29xx closet switches (VLAN1, VLAN2, VLAN3), with the Video Server attached to one of the closet switches. Callouts: "Unnecessary Traffic!!!" and "Multicast Traffic Being Dropped!!!"]

Layer 2 Design Issues Server Location


IGMP Snooping and CGMP do not solve all problems related to constraining multicast traffic in Layer 2 networks. Given a typical Layer 2 switch network where a high-end central switch is trunked to closet switches, unwanted traffic can still wind up flowing over inter-switch trunks.

Example:
In the above drawing, the Video Server is located on one of the ports of the 2900 closet switches. This server is sourcing high-rate video for which there are no receivers in the LAN switching environment. However, the IP Multicast host model defined in RFC 1112 requires that this traffic flow be sent to the router at all times. This results in traffic flowing over the inter-switch trunk that may not be necessary. Certainly, if there are no receivers beyond the router, this traffic flow is just wasting trunk bandwidth.

Design Issue Server Location

[Figure: the same network (Catalyst 5000 trunked to three Catalyst 29xx closet switches on VLAN1, VLAN2, VLAN3) with the Video Server moved close to the router. Callout: "Multicast Traffic Still Being Dropped!!!"]

Keep high B/W sources close to router

Layer 2 Design Issues Server Location


By paying attention to this possibility in the design of the network, the impact can be reduced. In the above example, the high-rate video server has been moved as close as possible to the router. This eliminates the possibility of unnecessary traffic flowing on the inter-switch trunks. There is another way to solve this problem, and that is to replace the switches with routers. It is only at Layer 3 that complete control of multicast flows is possible.

Design Issue - Core Switch

[Figure: Routers A, B and C (7500s) and Router D (a 2500 with a T1 link to the WAN) attached to a Catalyst 5000 core switch. The Video Server behind Router A sources 1.5Mbps MPEG video streams. Receiver Group 1 and Receiver Group 2 sit behind Routers B and C. Callouts: "Unnecessary Multicast Traffic !!!" toward the routers, and at the 2500: "Holy Multicast, Batman!! 3Mbps of unwanted data! (Choke, gasp, wheeze!)"]

Layer 2 Design Issues Core Switch Issues


In the case of a core network composed of several routers on an Ethernet segment, IGMP Snooping and CGMP provide absolutely no help in the constraint of multicast traffic flows. This is because routers do not send IGMP Membership Reports for desired multicast flows. (They use PIM control messages or some other routing protocol control messages instead.)

Example:
Consider the network shown in the drawing above. Three campus routers are connected via 100Mbps Ethernet to a core switch. A video server connected to Router A is sourcing two 1.5Mbps MPEG video multicast streams, one to Group 1 and another to Group 2. Router B has a directly connected member of Group 1 and therefore needs the 1.5Mbps Group 1 video stream. Router C has a directly connected member of Group 2 and therefore needs the 1.5Mbps Group 2 video stream. Because both Routers B & C are on the same Ethernet segment (albeit on different ports on the switch), they each receive both the Group 1 and Group 2 video streams even though they only need one.

Even worse, Router D has been connected to this core backbone Ethernet segment for the purpose of supplying remote sites with unicast connectivity and low-rate multicast (i.e. there is no intention of sending MPEG video to the remote sites). Unfortunately, the little 2500 will also receive both of the high-rate video streams for a total of 3Mbps of unwanted traffic! While the 2500 is capable of fast-dropping the unwanted traffic in the fast-switching path, the traffic still has a significant impact on the performance of the router.
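The arithmetic in this example is simple enough to check in a few lines. The sketch below assumes, as in the notes, that the core switch floods both streams to every router port. The dictionary names are illustrative.

```python
# Stream rates in Mbps, as given in the example.
streams_mbps = {"Group 1": 1.5, "Group 2": 1.5}

# Groups each router actually needs (Router D needs neither).
wanted = {
    "Router B": {"Group 1"},
    "Router C": {"Group 2"},
    "Router D": set(),
}

def unwanted_mbps(wanted_groups):
    """Traffic a router receives but does not need when every stream is flooded."""
    return sum(rate for group, rate in streams_mbps.items()
               if group not in wanted_groups)

for router, groups in wanted.items():
    print(router, unwanted_mbps(groups), "Mbps unwanted")
```

Routers B and C each receive 1.5Mbps they do not need, and Router D receives the full 3Mbps.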

Design Issue Core Switch

[Figure: the same core network. The Video Server behind Router A (7500) sources 1.5Mbps MPEG video streams. Router D (2500, T1 to the WAN) has been moved to its own segment. Routers B and C (7500s), with Receiver Group 1 and Receiver Group 2 behind them, still see "Unnecessary Multicast Traffic !!!"]

Move WAN Router to Another VLAN Segment Inside of Catalyst 5000

Layer 2 Design Issues Core Switch Issues


While today's technology cannot solve this problem (it would basically require the switch to run PIM, which means it would become a router and not a switch), the problem can be addressed by proper network design.

Solution
By connecting Router D to a separate LAN segment off of Router A (this could be accomplished using another port on Router A and a separate VLAN in the Cat5000), Router D is able to prune off any unwanted traffic. Exercise 1: Why is this now possible? (See answer below.)

Unfortunately, we still have unwanted traffic flowing to Routers B & C. One might argue that the same solution used for Router D could be used here, which is true. However, this would require additional Ethernet ports on Router A. The other solution would be to use multiple VLANs on the single Ethernet port on Router A. Unfortunately, this would significantly reduce the overall bandwidth available and is a sub-optimal solution.

Answer to Exercise 1: Because there is no other router on the LAN segment, Router A is able to prune off the traffic flow without the Prune being overridden by another router on the LAN segment.

Design Issue Core Switch

Problem
  Routers send PIM Join/Prunes at Layer 3
    IGMP Join/Leaves not sent by routers
    Other routers on VLAN can override Prune
  Switches operate at Layer 2
    Use IGMP Snooping to constrain multicast
    Must assume routers want all multicast traffic
  Need new Layer 2 Join/Prune mechanism
    Permits routers to send Join/Prunes to switch

Solution: (RGMP)
  Router Group Management Protocol

Design Issue Core Switch


Routers do not send IGMP Membership Reports or IGMP Leaves to communicate their desire to receive multicast traffic. Instead, they communicate at Layer 3 using PIM Join/Prune messages. However, when a router sends a PIM (*,G) Prune message to indicate that it does not want a particular multicast group, another router on the VLAN can override the Prune.

Switches only operate at Layer 2 (otherwise they would be routers). They listen to IGMP messages to constrain the flow of multicast traffic to hosts that wish to receive a particular multicast group. Because routers do not send IGMP Membership Reports, the switches must assume that the routers want all multicast traffic. (This is an assumption of the basic multicast model defined in RFC 1112.)

In order to constrain multicast traffic between routers on a core LAN segment, routers and switches need some form of Layer 2 Join/Prune communication that permits the routers to inform the switch of the groups in which they have interest. Cisco has developed the Router Group Management Protocol (RGMP) to accomplish this communication.

Router Group Management Protocol

Runs on Core Routers and Switches
  Routers
    Identify themselves via RGMP Hello/Bye msg.
    Send special Layer 2 (*,G) Join/Prune messages.
  Switches
    Do not forward multicast traffic to router ports until specifically requested.

Limitations:
  Only works with PIM-SM/PIM-SSM
  No Hosts permitted on VLAN
    Routers cannot detect sources since multicast flooding to routers is off by default.

Router Group Management Protocol


RGMP is enabled on core routers and switches. This enables the routers to inform the switches of which multicast group traffic they need to receive. Routers send special Layer 2 RGMP (*,G) Join/Prune messages to the switches. Switches, by default, do not forward any multicast traffic to the routers. Instead, they wait for RGMP (*,G) Join messages from the routers to tell them when to begin forwarding multicast group traffic.

Limitations:
RGMP requires the use of PIM-SM to operate properly. RGMP is intended for use only on a VLAN used as a router backbone. No multicast hosts should be used on a VLAN configured for RGMP, because any multicast traffic sourced by a host may not be heard by the routers, since RGMP blocks multicast traffic to the routers by default. This could prevent the DR on this VLAN from knowing that there is an active source, which would in turn prevent the source from being registered to the RP.

RGMP

Initially, multicast forwarding to routers is disabled

[Figure: an RGMP switch (CPU, switching engine, CAM table) with Routers A, B, C and D attached to its ports.]

MAC Address       Ports
0100.5exx.xxxx    (none)

RGMP Example:
Consider the LAN segment running through the above switch that has four routers connected in a core router backbone. Initially, RGMP blocks all multicast traffic from reaching the routers by the wildcard CAM table entry.

RGMP

1st Router receives a PIM (*,G) Join from downstream

[Figure: same RGMP switch. Router B receives a PIM (*, 224.1.1.1) Join from downstream and sends an RGMP Join for 0100.5e01.0101 to the switch. Entry added to the CAM table:]

MAC Address       Ports
0100.5e01.0101    2
0100.5exx.xxxx    (none)

RGMP Example:
When router B receives a downstream PIM (*,G) Join for group 224.1.1.1, it sends an RGMP Join for the corresponding MAC address, 0x0100.5e01.0101. When the switch receives the RGMP Join message, it instantiates an entry in its CAM table for 0x0100.5e01.0101 with the port number of the sending router.

RGMP

2nd Router receives a PIM (*,G) Join from downstream

[Figure: same RGMP switch. Router D receives a PIM (*, 224.1.1.1) Join from downstream and sends an RGMP Join for 0100.5e01.0101 to the switch. Port added to the CAM table entry:]

MAC Address       Ports
0100.5e01.0101    2,4
0100.5exx.xxxx    (none)

RGMP Example:
When router D receives a downstream PIM (*,G) Join for group 224.1.1.1, it too sends an RGMP Join for the corresponding MAC address, 0x0100.5e01.0101. When the switch receives the RGMP Join message from router D, it updates the entry in its CAM table for 0x0100.5e01.0101 by adding the port number associated with router D.

RGMP

Multicast is constrained to routers B and D

[Figure: same RGMP switch, with a Source now transmitting to the group.]

MAC Address       Ports
0100.5e01.0101    2,4
0100.5exx.xxxx    (none)

RGMP Example:
When a source behind router B begins transmitting to group 224.1.1.1, router B forwards this traffic to the core router backbone (the switch). The CAM table entry in the switch now effectively constrains the multicast traffic to routers B and D. Routers A and C, which have not joined the multicast group, do not receive the traffic.
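The RGMP walkthrough can be mimicked with a very small model in which the switch forwards a group only to router ports that have sent an RGMP Join. This is a sketch of the behavior described in the notes, not of the protocol's wire format. The class name is hypothetical, and the port numbers (router B on port 2, router D on port 4) follow the CAM table in the example.

```python
class RgmpSwitch:
    """Toy model of the CAM table on an RGMP-enabled backbone switch."""

    def __init__(self):
        self.cam = {}   # group MAC -> set of router ports that joined

    def rgmp_join(self, group_mac, router_port):
        self.cam.setdefault(group_mac, set()).add(router_port)

    def egress_ports(self, group_mac, in_port):
        # Default behavior: no Join received means no router port gets the traffic.
        return sorted(self.cam.get(group_mac, set()) - {in_port})

switch = RgmpSwitch()
switch.rgmp_join("0100.5e01.0101", 2)   # router B joins 224.1.1.1
switch.rgmp_join("0100.5e01.0101", 4)   # router D joins 224.1.1.1
print(sorted(switch.cam["0100.5e01.0101"]))
print(switch.egress_ports("0100.5e01.0101", in_port=2))
print(switch.egress_ports("0100.5e01.0202", in_port=2))
```

Traffic for the joined group that enters on router B's port leaves only on router D's port, and a group for which no router has sent a Join is not forwarded to any router.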

Design Issue 224.0.0.x Flooding

[Figure: Router A (port 1), Host 1 (port 2) and Routers B, C and D attached to a LAN switch (CPU, switching engine, CAM table). An OSPF Hello (224.0.0.5) is flooded to all ports. The CAM table (MAC Address / Ports) has no entries.]

Layer 2 Design Issue 224.0.0.x Flooding


By default, all Cisco switches flood multicast traffic addressed to 224.0.0.x to all ports on the switch. More specifically, any traffic addressed to a MAC address in the 0x0100.5e00.00xx range is flooded. This includes not only 224.0.0.x addresses but also 225.0.0.x, 226.0.0.x, and so on. This is done to avoid problems with protocols such as OSPF, EIGRP, DVMRP, PIM and many others that use link-local multicast addresses in the 224.0.0.x range. If this were not done, traffic in these ranges could be inadvertently constrained, thereby breaking these protocols.

Example:
Consider the OSPF LAN segment running through the above switch that has four OSPF routers plus one host. Because there is no entry in the CAM Table for the MAC address equivalent of 224.0.0.5 (OSPF Router Hello), the OSPF Hello messages are being flooded to all OSPF routers on the segment and OSPF adjacency is being maintained.

Design Issue 224.0.0.x Flooding

[Figure: same topology. Host 1 sends an IGMP Report for 224.0.0.5. Entry added to the CAM table:]

MAC Address       Ports
0100.5e00.0005    2

Layer 2 Design Issue 224.0.0.x Flooding Example (cont.)


Now let's assume that for some (perverse) reason, Host 1 decides to join group 224.0.0.5 and therefore sends an IGMP Membership Report for this group. (This might be caused by a multicast application launched with an incorrect group address on the command line, or simply by a hacker wishing to mess with the network.) Assuming the switch is doing IGMP Snooping or CGMP and does not automatically flood traffic in this range, it might respond to the IGMP Membership Report by instantiating an entry in the CAM Table that constrains the flow of 224.0.0.5 multicast traffic to just Host 1.

Design Issue 224.0.0.x Flooding

[Figure: same topology. The OSPF Hello (224.0.0.5) from Router A now matches the CAM table entry and is forwarded only to port 2 (Host 1).]

MAC Address       Ports
0100.5e00.0005    2

Layer 2 Design Issue 224.0.0.x Flooding Example (cont.)


As a result of blindly instantiating this CAM Table entry, further OSPF Hello traffic is constrained to Host 1. This results in OSPF adjacency being lost between all OSPF routers on this segment. Care should be taken when purchasing non-Cisco switches, as some vendors' switches behave as shown in this example, which can cause problems.

Note:
This is a hotly debated issue in the IETF. It stems from the fact that the IGMPv2 spec states that, with the exception of 224.0.0.2, all devices MUST join a multicast group in order to receive traffic from the group. Unfortunately, this implies that all router vendors must rewrite their existing routing protocols (such as OSPF) so that the router sends IGMP Membership Reports for such groups as All OSPF Routers (224.0.0.5), All OSPF DRs (224.0.0.6), All EIGRP Routers (224.0.0.10), and so on. This is clearly an absurd idea: even if all vendors did rewrite their implementations to comply with the new IGMP spec, the customer would have to upgrade all routers in the network wholesale on a flag day. This might even require changing out router hardware in order to run the latest code. For this reason, Cisco has chosen to address this problem by having all switches flood any IP multicast traffic with a destination MAC address falling in the range of 0x0100.5e00.00xx. This guarantees that protocols that use link-local multicast will continue to function properly.

Design Issue Address Overlap

Try to Avoid Addresses that Must Be Flooded

32 IP Multicast Addresses     1 Multicast MAC Address
224.0.0.x                     0x0100.5E00.00xx
224.128.0.x
225.0.0.x
225.128.0.x
...
238.0.0.x
238.128.0.x
239.0.0.x
239.128.0.x

Layer 2 Design Issue 224.0.0.x Flooding Example (cont.)


The requirement to flood all traffic addressed to a destination MAC address in the 0x0100.5e00.00xx range means that a large range of Layer 3 IP multicast addresses falls into the flooded range, as shown in the drawing above. This problem particularly affects low-end switches that cannot differentiate between link-local traffic flows addressed to 224.0.0.xx and traffic flows addressed to valid global multicast addresses such as 225.0.0.xx.
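Whether a candidate group address falls into the flooded range can be checked from its low-order 23 bits alone. The sketch below is one way to express that test. The function name `is_flooded` is illustrative, and the check simply encodes the 0x0100.5e00.00xx rule stated above.

```python
import ipaddress

def is_flooded(group: str) -> bool:
    """True if the group maps to a MAC address in the 0100.5e00.00xx range."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF      # bits copied into the MAC address
    return (low23 >> 8) == 0          # upper 15 of those bits are all zero

for group in ("224.0.0.5", "225.0.0.9", "238.128.0.1", "224.1.2.3"):
    print(group, is_flooded(group))
```

Of the four sample addresses, only 224.1.2.3 escapes the flooded range.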

Design Issue Address Overlap

Don't forget about 32:1 overlap when selecting group addresses

32 IP Multicast Addresses     1 Multicast MAC Address
224.1.1.1                     0x0100.5E01.0101
224.129.1.1
225.1.1.1
225.129.1.1
...
238.1.1.1
238.129.1.1
239.1.1.1
239.129.1.1

Layer 2 Design Issue Address Overlap


Remember to factor in the overlap of Layer 3 addresses into Layer 2 addresses when selecting multicast addresses. Failure to do so can result in hosts receiving unwanted multicast traffic that the switch is unable to differentiate.
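To see which other groups collide with a candidate address, one can enumerate the 32 IP group addresses that share its Layer 2 MAC address. The following sketch varies the five address bits that are not copied into the MAC address. The function name is illustrative.

```python
import ipaddress

def overlapping_groups(group: str):
    """List the 32 IPv4 groups that map to the same multicast MAC as `group`."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    base = 0xE0000000                              # 224.0.0.0, the 1110 prefix
    return [str(ipaddress.IPv4Address(base | (high5 << 23) | low23))
            for high5 in range(32)]                # the 5 bits lost in the mapping

groups = overlapping_groups("224.1.1.1")
print(len(groups))
print(groups[:4], "...", groups[-4:])
```

For 224.1.1.1 the list starts with 224.1.1.1, 224.129.1.1, 225.1.1.1, 225.129.1.1 and ends with 239.129.1.1, matching the slide.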

Summary - Design Issues

Pay attention to campus topology
  Be aware of unwanted flooding over trunks

Use IGMP snooping and/or CGMP
  Neither can solve all L2 flooding issues
  To solve all problems requires multicast routing
    Or a more robust CGMP-like protocol (hint)

224.0.0.x flooding
  Watch out for switches that don't flood 224.0.0.x traffic

Address overlap
  Select group addresses to avoid L2 overlap
  Avoid x.0.0.x group addresses when possible

Design Issues Summary


Topology
Watch your campus topology when designing your network for multicast, and beware of the possibility of unwanted traffic over inter-switch trunks.

Use IGMP Snooping and/or CGMP
This will help constrain multicast traffic to hosts that have requested it. Keep in mind that IGMP Snooping and CGMP do not cover all situations, and that under certain conditions traffic is not constrained.

224.0.0.x Flooding
Watch out for vendor switches that do not flood multicast traffic in these ranges. Misbehaved or misconfigured hosts can cause this critical traffic to be shut off in switches that do not flood it.

Address Overlap
Try to select multicast addresses so that different applications don't map their multicast streams into the same L2 MAC address due to the 32:1 overlap of IP group addresses at Layer 2. Avoid *.0.0.* and *.128.0.* multicast addresses when possible, as these ranges are flooded by Cisco switches.

Multicast over ATM LANE Core

[Figure: routers attached to an ATM LANE core with a BUS. Traffic from the Source is distributed by the BUS to the routers, including routers with no Member behind them. Callout: "Unwanted Data!!!"]

Multicast over an ATM Lane Core Network


The nature of ATM LANE hides the underlying ATM topology from the routers which can result in the inefficient use of bandwidth in the core.

Multicast flow over ATM LANE


All multicast traffic flows through the Broadcast/Unknown Server (BUS) in the multicast ELAN, as shown in the drawing above. Each LAN Emulation Client (LEC) in the network (in this case, the routers) has a p2p VC to the BUS. Any broadcast, multicast or unknown traffic is sent by the router to the BUS for distribution to all other routers in the ELAN. The BUS has a p2mp VC that connects the BUS device to all other routers in the network. In the above example, multicast traffic flows from the source, through the router, to the BUS via the p2p VC. From there, the BUS sends the traffic to all routers in the core ELAN via the p2mp VC. Note that in the above example, this traffic is being sent to routers that have no need for it and is therefore wasting bandwidth in the ATM core.
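A back-of-the-envelope calculation shows how quickly this adds up. The sketch below assumes the simplified model from the notes, in which the BUS delivers every multicast stream to all other routers in the ELAN. The router names, the member set and the function name are hypothetical illustration values.

```python
def lane_multicast_load(rate_mbps, routers, source_router, member_routers):
    """Estimate delivered and wasted bandwidth for one stream in a LANE core."""
    others = [r for r in routers if r != source_router]
    not_interested = [r for r in others if r not in member_routers]
    return {
        "bus_input_mbps": rate_mbps,                       # p2p VC from source router to BUS
        "delivered_mbps": rate_mbps * len(others),         # p2mp VC leaves to every other router
        "wasted_mbps": rate_mbps * len(not_interested),    # leaves with no member behind them
    }

routers = ["R1", "R2", "R3", "R4", "R5", "R6"]   # hypothetical six-router ELAN
load = lane_multicast_load(1.5, routers, source_router="R1", member_routers={"R4"})
print(load)
```

With one 1.5Mbps stream, six routers and a single member, 6 of the 7.5Mbps delivered across the core is unwanted.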


Multicast over ATM LANE Core


[Figure: a Source sends 1.5 Mbit MPEG video through the BUS to a Member across the ATM LANE core.]

Make sure your BUS device can handle the maximum expected load!
Make sure your ATM switches can replicate cells at this rate!

Multicast flow over ATM LANE


The higher the rate of traffic being sourced, the greater the amount of bandwidth being wasted in the ATM core. Care must be taken to ensure that the BUS device selected can handle the total flow of multicast traffic. Care must also be taken to ensure that the ATM switches in the network are capable of replicating cells at the rates necessary so that traffic is not lost along the p2mp VC.
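A rough back-of-the-envelope calculation makes the scaling visible. The Python sketch below assumes, as the text describes, that the BUS delivers one copy of the stream to every other router in the ELAN; the function name and the router counts are hypothetical, and cell overhead is ignored.

```python
def lane_core_load(stream_mbps: float, routers_in_elan: int,
                   routers_with_members: int) -> dict:
    """Estimate the load one multicast stream places on a LANE core."""
    receivers = routers_in_elan - 1        # every router except the sender gets a copy
    if not 0 <= routers_with_members <= receivers:
        raise ValueError("routers_with_members must be between 0 and routers_in_elan - 1")
    return {
        "bus_input_mbps": stream_mbps,                      # p2p VC into the BUS
        "p2mp_output_mbps": stream_mbps * receivers,        # copies replicated in the core
        "wasted_mbps": stream_mbps * (receivers - routers_with_members),
    }

# One 1.5 Mbit video stream, six routers in the ELAN, one of them with a member.
print(lane_core_load(1.5, routers_in_elan=6, routers_with_members=1))
```

Both the replicated and the wasted figures grow linearly with the stream rate and with the number of routers in the ELAN, which is why BUS capacity and switch replication rates need to be sized for the worst case.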


Multicast over ATM LANE Core


[Figure: a single BUS device serving ELAN 1, ELAN 2, and ELAN 3, captioned "Why do I have to do all the work?!"]

Avoid using a single device as BUS for multiple ELANs

Multicast over an ATM Lane Core Network


A frequently made mistake is to assign the duties of the BUS device of several ELANs to a single physical device. This can often result in an overloaded BUS device.


Multicast over ATM LANE Core


Design issues
  BUS horsepower is critical
    Use separate BUS device per ELAN to reduce load
    Overloaded BUS = cell/packet loss and jitter/delay
      Can cause problems on multimedia conferences
  ATM switch cell replication rate is critical
    Switches that replicate cells in hardware are best
  Add lots of bandwidth to ATM fabric
    Traffic will frequently be sent where it's unwanted
    ATM core bandwidth will be wasted
  P2MP VCs may be a better solution

Multicast over an ATM Design Issues


BUS Horsepower
Make sure that the BUS device selected has sufficient horsepower to forward the expected multicast traffic flows. Use separate physical devices for BUS devices on different ELANs.

ATM Switch Horsepower
Make sure that the ATM switches are capable of replicating cells at the expected multicast traffic rates.

Bandwidth
Account for the inefficient use of ATM bandwidth when multicasting over an ATM core network. Remember that traffic is often sent where it is unwanted, thereby wasting bandwidth. Consider alternatives to using ATM LANE.


PIM Dense Mode


Module 3

Module3.ppt

Copyright © 1998-2001, Cisco Systems, Inc. All rights reserved.

8/10/2001 11:41 AM

Module Objectives

Identify and explain the basic mechanisms of PIM Dense Mode.
Identify the packet types used by PIM Dense Mode.
Configure and verify normal PIM DM operation.


Module Agenda

PIM Dense Mode Overview
PIM Packet Formats
PIM Dense Mode Concepts
PIM Dense Mode Review
Configuring PIM Dense-Mode


PIM Dense Mode Overview

Uses Push Model
  Traffic is initially flooded to all PIM neighbors
  Branches that don't want data are pruned
Multicast forwarding state is created by the arrival of data
If the source goes inactive, the tree is torn down


The PIM Dense Mode Push Model


This model assumes that there are members at all points in the network, hence the concept of a dense distribution of receivers. Routers initially flood multicast traffic out all interfaces where there is:
  Another PIM-DM neighbor, or
  A directly connected member, or
  An interface that has been manually configured to join the group.

Branches that do not have members send Prune messages toward the source to prune off the unwanted traffic. These pruned branches time out after 3 minutes and traffic is re-flooded down the branch. Because this periodic re-flooding wastes bandwidth, dense mode is more applicable when bandwidth is plentiful.

In Dense mode, multicast state is created by data arrival


In PIM Dense mode, the control plane and the data plane are one and the same. This implies the following:
  (S, G) state, and hence the Source Tree, is created on the fly by the arrival of (S, G) multicast traffic.
  (S, G) state, and hence the Source Tree, is deleted when the source goes inactive and no multicast traffic is received by the router for 3 minutes.

Because the control plane and data plane are mixed in PIM Dense mode, its maintenance of the Source Tree is considerably less deterministic than in Sparse mode. This can sometimes result in instabilities and temporary loss of data during some network topology changes.

Dense mode only has source trees; no shared trees are used.

PIM Dense Mode Overview

Grafts are used to join existing source tree
Asserts are used to determine forwarder for multi-access LAN
Prunes are sent on non-RPF P2P links
  Asserts are sent on non-RPF multi-access links
Rate-limited prunes are sent on all P2P links


PIM Dense Mode Grafting


PIM Graft messages are used to reduce the join latency when a previously pruned branch of the Source Tree must be grafted back. This can be the case when a member joins the group after the router has sent a Prune message to prune off unwanted traffic. If Grafts were not used, the member would have to wait up to three minutes for the periodic re-flooding to occur to begin receiving the multicast traffic. By using Grafts, the Prune can be reversed almost immediately.

PIM Dense Mode Asserts


When two routers both forward the same (S, G) multicast traffic onto a common multi-access LAN, duplicate traffic is generated. When this occurs, Assert messages are generated by both routers to determine which router should continue forwarding on the LAN and which router(s) should stop (prune).

PIM Dense Mode Pruning


When data arrives on a non-RPF interface (i.e. an interface that is not used to reach the source) and the interface is point-to-point (P2P), a Prune is immediately sent to the upstream neighbor in an attempt to shut off the flow of traffic. Note that when data arrives on a non-RPF interface that is not P2P (i.e. a multi-access interface), an Assert is triggered instead of a Prune.

Rate-limited Prunes are sent on all P2P interfaces. This means that if data continues to arrive on a non-RPF, P2P interface, rate-limited Prunes are sent. Rate-limited Prunes are also sent on the RPF interface of P2P links when it is necessary to prune the flow of traffic.
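The data-arrival rules described above can be summarized in a small Python sketch. It is a simplification for illustration only: the names are hypothetical, rate limiting is not modeled, and the multi-access case follows the text as written (any non-RPF arrival triggers an Assert).

```python
from enum import Enum

class Action(Enum):
    FORWARD = "passes the RPF check; forward out the outgoing interface list"
    PRUNE = "discard; send a (rate-limited) Prune to the upstream neighbor"
    ASSERT = "discard; trigger the Assert mechanism"

def on_data_arrival(arrival_if: str, rpf_if: str, is_p2p: bool) -> Action:
    """Decide what a PIM-DM router does when (S, G) data arrives on an interface."""
    if arrival_if == rpf_if:
        return Action.FORWARD
    # Non-RPF arrival: the reaction depends on the link type.
    return Action.PRUNE if is_p2p else Action.ASSERT

print(on_data_arrival("Serial1", rpf_if="Serial0", is_p2p=True).name)
print(on_data_arrival("Ethernet0", rpf_if="Serial0", is_p2p=False).name)
```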

PIM Dense Mode Overview

Initial Flooding

[Figure: multicast packets from the Source are flooded to every router in the network, including the router serving the Receiver.]

(S, G) State created in every router in the network!

Initial Flooding
In this example, multicast traffic being sent by the source is flooded throughout the entire network. As each router receives the multicast traffic via its RPF interface (the interface in the direction of the source), it forwards the multicast traffic to all of its PIM-DM neighbors. Note that this results in some traffic arriving via a non-RPF interface such as the case of the two routers in the center of the drawing. (Packets arriving via the non-RPF interface are discarded.) These non-RPF flows are normal for the initial flooding of data and will be corrected by the normal PIM-DM pruning mechanism.


PIM Dense Mode Overview

Pruning unwanted traffic

[Figure: multicast packets flow from the Source toward the Receiver while Prune messages are sent back upstream on branches that do not need the traffic.]

Pruning unwanted traffic


In the example above, PIM Prunes (denoted by the dashed arrows) are sent to stop the flow of unwanted traffic. Prunes are sent on the RPF interface when the router has no downstream members that need the multicast traffic. Prunes are also sent on non-RPF interfaces to shut off the flow of multicast traffic that is arriving via the wrong interface (i.e. traffic arriving via an interface that is not on the shortest path to the source). An example of this can be seen at the second router from the receiver, near the center of the drawing. Multicast traffic is arriving via a non-RPF interface from the router above (in the center of the network), which results in a Prune message.


PIM Dense Mode Overview

Results after Pruning

[Figure: multicast packets now flow only along the path from the Source to the Receiver.]

(S, G) State still exists in every router in the network!
Flood & Prune process repeats every 3 minutes!!!

Results after Pruning


In the final drawing in our example shown above, multicast traffic has been pruned off of all links except where it is necessary. This results in a Shortest Path Tree (SPT) being built from the Source to the Receiver.

Even though the flow of multicast traffic is no longer reaching most of the routers in the network, (S, G) state still remains in ALL routers in the network. This (S, G) state will remain until the source stops transmitting.

In PIM-DM, Prunes expire after three minutes. This causes the multicast traffic to be re-flooded to all routers, just as was done in the Initial Flooding drawing. This periodic (every 3 minutes) Flood and Prune behavior is normal and must be taken into account when the network is designed to use PIM-DM.


PIM Packet Formats

PIM Packet Headers
PIM Hello Messages
PIM Join/Prunes
PIM Grafts/Graft Acks
PIM Asserts


PIM Packet Formats (SM Only)

PIM Registers
PIM Register-Stop


PIMv1 Packet Header


PIMv1 used IGMP Packets

Header fields:
  Type       0x14 = PIM Message
  Code       Type of PIM message (see below)
  Checksum
  Ver.       PIM Version = 1
  Reserved

Code:
  0 = Router-Query
  1 = Register (SM only)
  2 = Register-Stop (SM only)
  3 = Join/Prune
  4 = RP-Reachability (SM only)
  5 = Assert
  6 = Graft
  7 = Graft-ACK

PIMv1 messages are multicast to the ALL-Routers (224.0.0.2) group with a TTL of 1.
Note: PIMv1 packet formats are not shown. Only PIMv2 packets will be given.

PIM (v1) Packet Headers


An IGMP type code of 0x14 indicates that the frame is carrying a PIMv1 message; the Code field then determines the type of PIM message. PIMv1 messages are multicast to the ALL-Routers (224.0.0.2) multicast group address with a TTL of 1. This means that these control messages are link-local in scope.


PIMv2 Packet Header


PIMv2 is assigned protocol number 103

Header fields:
  Bits 0-3     Ver. (PIM Version = 2)
  Bits 4-7     Type
  Bits 8-15    Reserved
  Bits 16-31   Checksum

Type:
  0 = Hello
  1 = Register (SM only)
  2 = Register-Stop (SM only)
  3 = Join/Prune
  4 = Bootstrap (SM BSR only)
  5 = Assert
  6 = Graft (DM only)
  7 = Graft-Ack (DM only)
  8 = C-RP-Announcement (SM BSR only)

PIMv2 messages are multicast to the ALL-PIM-Routers (224.0.0.13) group with a TTL of 1.

PIM (v2) Packet Headers


PIMv2 messages are carried directly in IP using PIM's own assigned protocol number, 103. The Type field then determines the type of PIMv2 message. PIMv2 messages are multicast to the ALL-PIM-Routers (224.0.0.13) multicast group address with a TTL of 1. This means that these control messages are link-local in scope.
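The PIMv2 header layout can be illustrated with a minimal Python sketch that packs and unpacks the four header bytes. It assumes the Checksum field is the standard 16-bit one's-complement Internet checksum over the message, which comes from the PIMv2 specification rather than from these slides, and it ignores any per-message-type special cases. All function names are hypothetical.

```python
import struct

PIM_VERSION = 2
PIM_TYPES = {0: "Hello", 1: "Register", 2: "Register-Stop", 3: "Join/Prune",
             4: "Bootstrap", 5: "Assert", 6: "Graft", 7: "Graft-Ack",
             8: "C-RP-Announcement"}

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data (assumed checksum algorithm)."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_pim_message(msg_type: int, body: bytes = b"") -> bytes:
    """Ver (4 bits) | Type (4 bits) | Reserved (8 bits) | Checksum (16 bits) | body."""
    first = (PIM_VERSION << 4) | (msg_type & 0x0F)
    unchecked = struct.pack("!BBH", first, 0, 0) + body
    return struct.pack("!BBH", first, 0, internet_checksum(unchecked)) + body

def parse_pim_header(packet: bytes):
    """Return (version, type name, checksum) from the first four bytes."""
    first, _reserved, checksum = struct.unpack("!BBH", packet[:4])
    return first >> 4, PIM_TYPES.get(first & 0x0F, "Unknown"), checksum

hello = build_pim_message(0)
print(hello.hex(), parse_pim_header(hello))
```

A receiver can validate a message by running the same checksum over the whole packet; a correct message yields zero.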


PIM Hello Messages


Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  Option Type | Option Length
  Option Value
  ...
  Option Type | Option Length
  Option Value

Option Types:
  1 = Holdtime (Period of time in seconds before this PIM neighbor times out.)
  19 = DR Priority
  20 = Generation ID

PIMv2 Hello Messages


PIMv2 Hello messages are used to form and maintain neighbor adjacencies. They are sent periodically to indicate to the other PIM routers on the network that this PIM router is still present. The PIMv2 Hello message format defines numerous Option TLVs, which include:

Holdtime: This specifies the time in seconds that this neighbor is reachable. A value of 0xffff indicates the neighbor never times out. A value of 0x0000 means the neighbor is immediately timed out.
DR Priority: This value can be used in the election of the DR for the subnet.
Generation ID: This is a random 32-bit value that is sent whenever the neighbor activates PIM on the interface. It can be used to determine when the neighbor has been reactivated after a failure.
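As an illustration of the Option TLV structure, the following Python sketch walks the option list of a Hello message body. It assumes 16-bit Option Type and Option Length fields followed by a value of that length, as in the PIMv2 specification; the function names and the sample values (including the 105-second holdtime) are hypothetical.

```python
import struct

OPTION_NAMES = {1: "Holdtime", 19: "DR Priority", 20: "Generation ID"}

def parse_hello_options(body: bytes) -> dict:
    """Parse the TLV list that follows the 4-byte PIMv2 header of a Hello."""
    options = {}
    offset = 0
    while offset + 4 <= len(body):
        opt_type, opt_len = struct.unpack_from("!HH", body, offset)
        offset += 4
        value = body[offset:offset + opt_len]
        if len(value) < opt_len:
            raise ValueError("truncated Hello option")
        offset += opt_len
        # Unknown option types are kept under their numeric type.
        options[OPTION_NAMES.get(opt_type, opt_type)] = int.from_bytes(value, "big")
    return options

def describe_holdtime(seconds: int) -> str:
    """Interpret the two special Holdtime values described in the notes."""
    if seconds == 0xFFFF:
        return "neighbor never times out"
    if seconds == 0:
        return "neighbor is timed out immediately"
    return f"neighbor times out after {seconds} seconds"

sample = (struct.pack("!HHH", 1, 2, 105)             # Holdtime
          + struct.pack("!HHI", 19, 4, 1)            # DR Priority
          + struct.pack("!HHI", 20, 4, 0xDEADBEEF))  # Generation ID
parsed = parse_hello_options(sample)
print(parsed, describe_holdtime(parsed["Holdtime"]))
```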


PIM Join/Prune Packets


Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  Upstream Neighbor Address (Encoded-Unicast)
  Reserved | Num. Groups | Holdtime
  Group List

Upstream Neighbor Address: IP address of the RPF (upstream) neighbor
Holdtime: Period of time in seconds before this join/prune times out.
Num. Groups: # of Groups in Group List
Group List: List (by group) of sources to Join and/or Prune

PIM Join/Prune Packets


The JOIN/PRUNE is a single packet format that contains a list of Joins and a list of Prunes. Either list may be empty (although not both).


PIM Group Lists


Group List layout (repeated for each group):
  Group-1 (Encoded-Group)
  Number of Join Sources | Number of Prune Sources
  Join Source-1 (Encoded-Source)
  ...
  Join Source-n (Encoded-Source)
  Prune Source-1 (Encoded-Source)
  ...
  Prune Source-n (Encoded-Source)
  Group-2 (Encoded-Group)
  Number of Join Sources | Number of Prune Sources
  ...

Group-x: Group IP Address
Number of Join Sources: # of Joins for Group-x
Number of Prune Sources: # of Prunes for Group-x
Join/Prune Source-x: Encoded Source address to be Joined/Pruned

Group Lists are used in Join/Prune and Graft/Graft-Ack messages.

Group Lists
Group Lists are used in Join/Prune messages as well as Graft and Graft-Ack messages. A Group List is a list of Group entries, each beginning with a Group IP Address and Group mask to identify the Multicast Group. Each Group List entry contains a list of zero or more sources to Join followed by a list of zero or more sources to Prune:
  Group IP Address
  Number of Join Sources
  Number of Prune Sources
  Join List
  Prune List

The addresses used in Join and Prune lists use a special encoded format that allows for other protocols besides IPv4. (See next slides.)


Encoded Unicast Addresses


Fields:
  Addr Family | Encoding Type | Unicast Address ...

Addr Family: IANA Address Family Identifier (1=IPv4)
Encoding Type: Type of encoding within Address Family
Unicast Address: Unicast Address of the target device.

Encoded Unicast Addresses


The Unicast Addresses contained in the Join and Prune Lists of a Group List entry are encoded in a special format as shown in the slide above.

Address Family: Indicates the IANA Address Family Identifier. For IPv4, this value is 1.
Encoding Type: Indicates the encoding type within the Address Family.
Unicast Address: IP unicast address of the target device.
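A minimal Python sketch of the Encoded-Unicast format for IPv4 follows. It assumes one-octet Address Family and Encoding Type fields followed by the 4-byte address, as in the PIMv2 specification, and an Encoding Type of 0 for the native encoding; the helper names are hypothetical.

```python
import socket
import struct

AF_IPV4 = 1   # IANA Address Family Identifier for IPv4

def encode_unicast(address: str, encoding_type: int = 0) -> bytes:
    """Addr Family (1 byte) | Encoding Type (1 byte) | IPv4 address (4 bytes)."""
    return struct.pack("!BB4s", AF_IPV4, encoding_type, socket.inet_aton(address))

def decode_unicast(data: bytes):
    """Return (address family, encoding type, dotted-quad address)."""
    family, encoding_type, raw = struct.unpack("!BB4s", data[:6])
    if family != AF_IPV4:
        raise ValueError(f"unsupported address family {family}")
    return family, encoding_type, socket.inet_ntoa(raw)

encoded = encode_unicast("171.68.37.2")
print(encoded.hex(), decode_unicast(encoded))
```

Carrying the address family in front of every address is what lets the same message formats be reused for protocols other than IPv4.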


Encoded Source Addresses


Fields:
  Addr. Family | Encoding Type | Rsvd | S W R | Mask Len
  Source Address

Addr Family: IANA Address Family Identifier (1=IPv4)
Encoding Type: Type of encoding within Address Family
S = Sparse Mode bit: Indicates sparse mode group. Rcvrs must send periodic joins.
W = Wildcard bit: Indicates join/prune applies to (*, G) entry.
R = RP bit: Indicates this join/prune should be sent up the Shared Tree towards the RP.
Mask Length: Number of bits in the prefix of the Source Address.
Source Address: Address of Multicast Source.

Encoded Source Addresses


The Source Addresses contained in the Join and Prune Lists of a Group List entry are encoded in a special format as shown in the slide above.

Address Family: Indicates the IANA Address Family Identifier. For IPv4, this value is 1.
Encoding Type: Indicates the encoding type within the Address Family.
S = Sparse Mode bit: Used by routers on the Shortest-Path Tree (SPT) to indicate that the group is a sparse mode group, which tells the receiver of this join that it must send periodic Joins toward the source.
W = Wildcard bit: Indicates that the Join/Prune applies to the (*, G) entry. If this bit is cleared, it indicates that this applies to an (S, G) entry. Joins and Prunes sent to the RP should have this bit set.
R = RP bit: Indicates that this information should be sent up the Shared Tree towards the RP. If this bit is clear, the information should be sent up the Shortest-Path Tree toward the source.
M. Len: Mask length in bits.
Source Address: IP address of the Source.
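The S, W and R flags can be illustrated with a short Python sketch of the Encoded-Source format. The bit positions used here (S = 0x04, W = 0x02, R = 0x01 in the octet preceding the mask length) follow the PIMv2 specification and are an assumption relative to the slide, which does not give exact bit offsets. The names are hypothetical.

```python
import socket
import struct
from typing import NamedTuple

S_BIT, W_BIT, R_BIT = 0x04, 0x02, 0x01   # assumed positions within the flags octet

class EncodedSource(NamedTuple):
    address: str
    mask_len: int
    sparse: bool     # S bit
    wildcard: bool   # W bit
    rp: bool         # R bit

def encode_source(src: EncodedSource) -> bytes:
    """Addr Family | Encoding Type | Rsvd+SWR | Mask Len | IPv4 source address."""
    flags = ((S_BIT if src.sparse else 0)
             | (W_BIT if src.wildcard else 0)
             | (R_BIT if src.rp else 0))
    return struct.pack("!BBBB4s", 1, 0, flags, src.mask_len,
                       socket.inet_aton(src.address))

def decode_source(data: bytes) -> EncodedSource:
    _family, _enc, flags, mask_len, raw = struct.unpack("!BBBB4s", data[:8])
    return EncodedSource(socket.inet_ntoa(raw), mask_len,
                         bool(flags & S_BIT), bool(flags & W_BIT), bool(flags & R_BIT))

# An (S, G) entry in dense mode: all three flags clear.
sg = EncodedSource("128.9.160.43", 32, sparse=False, wildcard=False, rp=False)
print(encode_source(sg).hex(), decode_source(encode_source(sg)))
```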


Encoded Group Addresses


Fields:
  Addr. Family | Encoding Type | Reserved | Mask Len
  Group Address

Addr Family: IANA Address Family Identifier (1=IPv4)
Encoding Type: Type of encoding within Address Family
Mask Length: Number of bits in the prefix of the Group Address.
Group Address: Multicast Group Address.

Encoded Group Addresses


Group Addresses contained in the Join and Prune Lists of a Group List entry are encoded in a special format as shown in the slide above.

Address Family: Indicates the IANA Address Family Identifier. For IPv4, this value is 1.
Encoding Type: Indicates the encoding type within the Address Family.
M. Len: Mask length in bits.
Group Address: IP multicast group address.


PIM Graft/Graft-Ack Packets


Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  Upstream Neighbor Address (Encoded-Unicast)
  Reserved | Num. Groups | Holdtime
  Group List

Upstream Neighbor Address: IP address of the RPF (upstream) neighbor
Holdtime: Period of time in seconds before this join/prune times out.
Num. Groups: # of Groups in Group List
Group List: List (by group) of sources to Graft or Graft-Ack

PIM Graft/Graft-Ack Packets


Graft and Graft-Ack messages are used in dense mode for grafting onto the tree. These are the only PIM messages that are sent reliably (i.e. they are acknowledged).


PIM Assert Packets


Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  Group Address (Encoded-Group)
  Source Address (Encoded-Source)
  R | Metric Preference
  Metric

Group Address: Identifies Group of the Assert
Source Address: Identifies Source of the Assert
R: (Sparse Mode) 1 = Assert down RP Tree; 0 = Assert down SPT
Metric Preference: Admin. Distance of unicast routing protocol
Metric: Unicast routing protocol metric

PIM Assert Packets


Assert messages determine who will be the active forwarder when there is redundancy in the network toward the source.

If the same routing protocol is used between the redundant neighbors, the metric is compared and the best metric wins.
In the case of an equal-cost metric with the same routing protocol, the neighbor with the highest IP address breaks the tie.
In the case where dissimilar unicast routing protocols are used, a metric preference is used to weight the preferred order of the routing information of each unicast routing protocol (like administrative distance).
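The comparison order can be sketched in Python as follows. The sketch assumes that lower values are better for both Metric Preference and Metric (as with administrative distance and most routing metrics) and that the highest IP address breaks a remaining tie; the class and function names, addresses, and metric values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class AssertInfo:
    router: str              # IP address of the router on the shared LAN
    metric_preference: int   # e.g. administrative distance of the unicast protocol
    metric: int              # unicast routing metric toward the source

def assert_winner(candidates: Iterable[AssertInfo]) -> AssertInfo:
    """Pick the forwarder: best preference, then best metric, then highest IP."""
    def rank(c: AssertInfo):
        # Negating the address makes min() prefer the HIGHEST IP on a tie.
        return (c.metric_preference, c.metric, -int(ipaddress.IPv4Address(c.router)))
    return min(candidates, key=rank)

a = AssertInfo("10.0.0.1", metric_preference=110, metric=20)
b = AssertInfo("10.0.0.2", metric_preference=110, metric=20)
c = AssertInfo("10.0.0.3", metric_preference=120, metric=1)
print(assert_winner([a, b, c]).router)
```

Here routers a and b tie on preference and metric, so b wins on address; c loses despite its low metric because its routing protocol is less preferred.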


PIM Register Packets


Sparse Mode Only

Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  B | N | Reserved
  Multicast Data Packet

B = Border Bit: Indicates DR is a border router performing a proxy-register
N = Null Register Bit: Indicates DR is sending a Null-Register before expiring its register-suppression timer.
Multicast Data Packet: The original packet sent by the source. For periodic sending of registers, this part is null.

PIM Register Packets


Used in SM by the DR to encapsulate multicast packets and send them to the RP so they may be forwarded down the shared tree. Register messages with encapsulated multicast packets continue to be sent to the RP by the DR until a Register-Stop message is received from the RP.


PIM Register-Stop Packets


Sparse Mode Only

Message fields:
  Ver. | Type | Reserved | Checksum      (PIMv2 header)
  Group Address (Encoded-Group)
  Source Address (Encoded-Source)

Group Address: The group address from the register message.
Source Address: IP host address of source from multicast data packet in register.

PIM Register-Stop Packets


Used in SM by the RP to inform the DR to stop sending Register messages. This message is sent after the RP has joined the source tree to the DR and is receiving the multicast traffic natively via the SPT.


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Forwarding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM Neighbor Discovery


[Figure: PIM-DM Router 1 (171.68.37.1) and PIM-DM Router 2 (171.68.37.2) exchange PIM Queries on a shared LAN. Router 2, with the highest IP address, is elected DR (Designated Router).]

PIM Queries are multicast periodically to the All-Routers (224.0.0.2) multicast group address with a TTL of 1. (Default = 30 seconds)
If the DR times out, a new DR is elected.
In PIM DM, interface is added to outgoing interface list for all groups when first neighbor is heard.

PIM Neighbor Discovery


PIM Queries are sent periodically to discover the existence of other PIM routers on the network. For Multi-Access networks (e.g. Ethernet), the PIM Query message is multicast to the All-Routers (224.0.0.2) multicast group address.

Designated Router (DR)


For Multi-Access networks, a Designated Router (DR) is elected. In PIM Sparse mode networks, the DR is responsible for sending Joins to the RP for hosts on the Multi-Access network. For Dense mode, the DR has no meaning. The exception to this is when IGMPv1 is in use. In this case, the DR also functions as the IGMP Querier for the Multi-Access network.

Designated Router (DR) Election


To elect the DR, each PIM node on a Multi-Access network examines the PIM Query messages received from its neighbors and compares the IP Address of its own interface with the IP Addresses of its PIM Neighbors. The PIM Neighbor with the highest IP Address is elected the DR. If no PIM Queries have been received from the elected DR after some (configurable) period, the DR Election mechanism is run again to elect a new DR.
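The election rule amounts to a maximum over addresses, as the following Python sketch shows. The function name is hypothetical; the addresses are the two routers from the neighbor discovery slide.

```python
import ipaddress
from typing import Iterable

def elect_dr(own_address: str, neighbor_addresses: Iterable[str]) -> str:
    """Return the DR for the segment: the highest IP address among all PIM routers."""
    candidates = [own_address, *neighbor_addresses]
    # Compare numerically; comparing the dotted strings would sort
    # "171.68.0.9" above "171.68.0.80".
    return max(candidates, key=lambda a: int(ipaddress.IPv4Address(a)))

print(elect_dr("171.68.37.1", ["171.68.37.2"]))
```

Re-running the same function over the remaining neighbors after the DR's entry has timed out models the re-election step.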


PIM Neighbor Discovery


wan-gw8> show ip pim neighbor
PIM Neighbor Table
Neighbor Address  Interface        Uptime    Expires   Mode
171.68.0.70       FastEthernet0    2w1d      00:01:24  Dense
171.68.0.91       FastEthernet0    2w6d      00:01:01  Dense (DR)
171.68.0.82       FastEthernet0    7w0d      00:01:14  Dense
171.68.0.86       FastEthernet0    7w0d      00:01:13  Dense
171.68.0.80       FastEthernet0    7w0d      00:01:02  Dense
171.68.28.70      Serial2.31       22:47:11  00:01:16  Dense
171.68.28.50      Serial2.33       22:47:22  00:01:08  Dense
171.68.27.74      Serial2.36       22:47:07  00:01:21  Dense
171.68.28.170     Serial0.70       1d04h     00:01:06  Dense
171.68.27.2       Serial1.51       1w4d      00:01:25  Dense
171.68.28.110     Serial3.56       1d04h     00:01:20  Dense
171.68.28.58      Serial3.102      12:53:25  00:01:03  Dense

show ip pim neighbor command output


Neighbor Address - the IP address of the PIM Neighbor.
Interface - the interface where the PIM Query of this neighbor was received.
Uptime - the period of time that this PIM Neighbor has been active.
Expires - the period of time after which this PIM Neighbor will no longer be considered active. (Reset by the receipt of another PIM Query.)
Mode - PIM mode (Sparse, Dense, Sparse/Dense) that the PIM Neighbor is using.
(DR) - Indicates that this PIM Neighbor is the Designated Router for the network.


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM State

Describes the state of the multicast distribution trees as understood by the router at this point in the network.
Represented by entries in the multicast routing (mroute) table
  Used to make multicast traffic forwarding decisions
  Composed of (*, G) and (S, G) entries
  Each entry contains RPF information
    Incoming (i.e. RPF) interface
    RPF Neighbor (upstream)
  Each entry contains an Outgoing Interface List (OIL)
    OIL may be NULL

PIM State
In general, Multicast State basically describes the multicast distribution tree as it is understood by the router at this point in the network. However to be completely correct, Multicast State describes the multicast traffic forwarding state that is used by the router to forward multicast traffic.

Multicast Routing (mroute) Table


Multicast state is stored in the multicast routing (mroute) table, which can be displayed using the show ip mroute command. Entries in the mroute table are composed of (*, G) and (S, G) entries, each of which contains:

RPF Information, consisting of an Incoming (or RPF) interface and the IP address of the RPF (i.e. upstream) neighbor router in the direction of the source. (In the case of PIM-SM, this information in a (*, G) entry points toward the RP. PIM-SM will be discussed in a later module.)

Outgoing Interface List (OIL), which contains a list of interfaces out which the multicast traffic is to be forwarded. (Multicast traffic must arrive on the Incoming interface before it will be forwarded out these interfaces. If multicast traffic does not arrive on the Incoming interface, it is simply discarded.)


PIM-DM State Example


sj-mbone> show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
       M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:00:10/00:00:00, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:10/00:00:00
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

(128.9.160.43/32, 224.1.1.1), 00:00:10/00:02:49, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Prune/Dense, 00:00:05/00:02:55

PIM-DM State Example


(*, G) Entry - The (*, 224.1.1.1) entry shown in the sample output of the show ip mroute command is the (*, G) entry. These entries are not directly used for multicast traffic forwarding in PIM-DM. However, in Cisco IOS, all (S, G) entries always have a parent (*, G) entry, and in the case of PIM-DM the OIL of these entries reflects interfaces that:
  Have PIM-DM neighbors, or
  Have directly connected members, or
  Have been manually configured to join the group.

(S, G) Entry - The (128.9.160.43/32, 224.1.1.1) entry is an example of an (S, G) entry in the mroute table. This entry is used to forward any multicast traffic sent by source 128.9.160.43 to group 224.1.1.1. Notice the following:
  The Expires countdown timer in the first line of the (S, G) entry shows when the entry will expire and be deleted. This timer is reset to 3 minutes whenever an (S, G) multicast packet is forwarded.
  The Incoming interface information is used to RPF check arriving (S, G) multicast traffic. If a packet does not arrive via this interface, the packet is discarded.
  The Outgoing Interface list reflects the interfaces where (S, G) packets are to be forwarded. Note that Serial3 has been pruned and traffic is not being forwarded out this interface. Also note that the prune status of this interface will expire in 00:02:55, at which time the interface will return to Forward status. (This is how the flood and prune mechanism is accomplished.)


PIM-DM (*,G) State Rules


(*,G) created automatically
  When 1st (S,G) for group is created
    (S,G)s must always have a parent (*,G)
  When a directly connected member joins the group
(*,G) reflect PIM neighbor adjacency
  IIF = NULL
  OIL = all interfaces
    with PIM-DM neighbors or
    with directly connected hosts or
    manually configured

PIM-DM (*,G) State Rules


All (S, G) entries must always have a parent (*, G) entry. Therefore, (*, G) entries are automatically created whenever an (S, G) entry for the group must be created. The (*, G) entry is created first and then the (S, G) entry. The reason for this will become clear shortly.

PIM-DM (*, G) entries are also created as a result of a directly connected member joining the group. This can result in a (*, G) entry without a corresponding (S, G) entry if there are no sources currently sending traffic to group G.

PIM-DM (*, G) entries reflect PIM neighbor/member adjacency. In PIM-DM, the (*, G) entry is not used for actual traffic forwarding. Therefore, the Incoming interface information is meaningless and is always Null. The OIL of a PIM-DM (*, G) entry reflects PIM-DM neighbor and/or member adjacency, so any interface with a PIM-DM neighbor or a directly connected member of the group will be reflected in the OIL of the (*, G) entry. (Note: It is also possible to force the router into thinking that there is a directly connected member of the group on the interface using the ip igmp static-group <group> command.)


PIM-DM (S,G) State Rules


(S,G) created by multicast data arrival
    Parent (*,G) created (if it doesn't exist)
    IIF = RPF Interface in direction of source
    OIL = Copy of OIL from (*,G) minus IIF

Interfaces in OIL initially Forward
    Go to Pruned state when Prune rcvd
    Forward intfc timers never expire
    Pruned intfc timers expire in 3 minutes

PIM-DM (S, G) Rules


In PIM-DM, (S, G) state is created on the fly as the result of the arrival of multicast data. When an (S, G) packet arrives at a router and a corresponding (S, G) entry does not exist, one is created as follows:
    If a corresponding (*, G) entry does not exist, it is created first and its Outgoing Interface list is populated using the rules previously described.
    The RPF information is computed for the source S. This information is stored in the (S, G) entry as the Incoming interface and the RPF neighbor (i.e. the PIM-DM neighbor in the direction of the source).
    The OIL of the (S, G) entry is populated with a copy of the OIL from the parent (*, G) entry less the Incoming interface. (The Incoming interface must not appear in the OIL; otherwise a multicast route loop could occur.)

The interfaces in the (S, G) OIL are initially placed in Forward/Dense status so that (S, G) traffic arriving on the RPF interface is forwarded out these interfaces. An interface remains in this status until a Prune message is received via that interface. At that point, the status of the interface switches to Pruned/Dense, which stops the forwarding of traffic out this interface. When an interface changes status to Pruned/Dense, the interface prune timer is started, which causes the interface to switch back to Forward/Dense status after 3 minutes have elapsed.
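As a rough sketch of the (S, G) creation rules above, the following Python models an entry as a plain dictionary. The function names and the dictionary layout are hypothetical, and the RPF lookup result is simply passed in rather than computed from a unicast routing table.

```python
FORWARD = "Forward/Dense"
PRUNED = "Prune/Dense"


def create_s_g_entry(star_g_oil, rpf_interface, rpf_neighbor):
    """Build (S, G) state from the parent (*, G) OIL and the RPF lookup result."""
    # Copy the parent OIL, leaving out the Incoming (RPF) interface.
    oil = {name: FORWARD for name in star_g_oil if name != rpf_interface}
    return {
        "incoming_interface": rpf_interface,
        "rpf_neighbor": rpf_neighbor,
        "oil": oil,
    }


def receive_prune(entry, interface):
    """Switch an OIL interface to the pruned state when a Prune arrives on it."""
    if interface in entry["oil"]:
        entry["oil"][interface] = PRUNED


s_g_entry = create_s_g_entry(
    ["Serial0", "Serial1", "Serial3"], "Serial0", "198.92.1.129"
)
receive_prune(s_g_entry, "Serial3")
print(s_g_entry["oil"])
```

Serial0 never appears in the (S, G) OIL because it is the Incoming interface, which is the loop-prevention rule stated in the notes.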


PIM-DM State Flags


D = Dense Mode
C = Directly Connected Host
L = Local (Router is member)
P = Pruned (All intfcs in OIL = Prune)
T = Forwarding via SPT
    Indicates at least one packet was forwarded
J = Join SPT
    Always on in (*,G) entry in PIM-DM
    Basically meaningless in PIM-DM

PIM-DM State Flags


D Flag ((*, G) entries only)
    Indicates the group is operating in Dense mode. (Appears only on (*, G) entries.)
C Flag
    Indicates that there is a member of the group directly connected to the router.
L Flag
    Indicates the router itself is a member of this group and is receiving the traffic. (This would be the case for the Auto-RP Discovery group 224.0.1.40, which all Cisco routers join automatically.)
P Flag
    Set whenever all interfaces in the outgoing interface list of an entry are Pruned (or the list is Null). This generally means that the router will send Prune messages to the RPF neighbor to try to shut off this traffic.
T Flag ((S, G) entries only)
    Indicates that at least one packet was received via the SPT.
J Flag
    Always on in (*, G) entries in PIM-DM. Used by the internal code. Basically meaningless in PIM-DM.
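A small Python sketch can make the flag list easier to apply when reading mroute output. The lookup table covers only the six flags described here, and both function names are invented for the illustration.

```python
# Only the six flags discussed in this module; other flags are out of scope.
FLAG_MEANINGS = {
    "D": "Dense mode",
    "C": "Directly connected member",
    "L": "Local (router is a member)",
    "P": "Pruned",
    "T": "Forwarding via SPT",
    "J": "Join SPT",
}


def describe_flags(flags):
    """Translate a flags string such as 'PT' into readable descriptions."""
    return [FLAG_MEANINGS.get(flag, "not covered here") for flag in flags]


def p_flag_should_be_set(oil_states):
    """P is set when every OIL interface is pruned, or when the OIL is Null (empty)."""
    return all(state == "Prune/Dense" for state in oil_states)


print(describe_flags("PT"))
print(p_flag_should_be_set([]))
print(p_flag_should_be_set(["Forward/Dense", "Prune/Dense"]))
```

An empty list stands in for a Null OIL, so the P-flag check returns True for it, matching the "or the list is Null" clause.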


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM DM Flooding
[Figure: rtr-a (interfaces S0, S1, S3) connected to rtr-b (interfaces S0, E1)]

PIM Dense mode interfaces are placed on the (*,G) oilist for a Multicast Group IF:
    PIM neighbor heard on interface
    Host on this interface has joined the group
    Interface has been manually configured to join group

(S,G) entries inherit a copy of the (*,G) oilist
    Minus the Incoming Interface

PIM DM Forwarding
If a PIM neighbor is present, dense mode assumes EVERYONE wants to receive the group, so traffic is flooded to that link by definition. This is accomplished as follows:
    The (*, G) OIL is populated with interfaces that have PIM-DM neighbors or a directly connected member of the group. (This can also be simulated via manual configuration of the interface.)
    When the (S, G) entry associated with the traffic flow is created, its OIL is populated with a copy of the interfaces in the (*, G) OIL less the Incoming interface.

This results in arriving (S, G) traffic being initially flooded to all PIM-DM neighbors and/or directly connected members of the group. The next few slides/pages demonstrate this process step by step.

PIM DM Flooding
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

Arriving data causes rtr-a to create state

(*, 224.2.127.254), 00:00:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:10/00:00:00
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:10/00:02:49, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

Arriving data causes rtr-a to create state

A parent (*, G) entry must first be created before the (S, G) entry can be created.
The (*, 224.2.127.254) entry is created and the outgoing interface list is populated with interfaces that:
    Have a PIM-DM neighbor, or
    Have a directly connected member, or
    Have been manually configured to join the group.
(Note: In this example, PIM-DM neighbors are assumed to be connected to rtr-a via S0 and S1.)

The (S, G) entry is then created.
    The RPF information for source 128.9.160.43 is computed, which results in the Incoming interface being Serial0 and an RPF neighbor of 198.92.1.129.
    The (S, G) Outgoing interface list (oil) is populated with a copy of the (*, G) oil minus the Incoming interface Serial0. This results in Serial1 and Serial3 being in the (S, G) oil. The status of these interfaces is initially Forward/Dense, which results in the data being flooded out these interfaces.
    Note that the Expiration timers on the interfaces in the oilist are both 00:00:00. This means that traffic will continue to be forwarded out these interfaces until a prune is received.
    The Expiration countdown timer of the (S, G) entry indicates 00:02:49. This timer is reset to 00:03:00 whenever an (S, G) packet is forwarded. If the counter reaches zero, the (S, G) entry will be deleted. If it is the last (S, G) entry, the (*, G) entry will also be deleted.

PIM DM Flooding
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

Packets are flooded out all interfaces in (S, G) oilist.

(*, 224.2.127.254), 00:00:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:10/00:00:00
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:10/00:02:49, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

Initial Flooding
Now that the (*, G) and (S, G) entries have been created, the router begins to forward all (S, G) multicast traffic based on the (S, G) entry. Traffic arriving at rtr-a will be RPF checked against the Incoming interface, Serial0. Any (S, G) packets that do not arrive via this interface will fail the RPF check and be discarded. Traffic arriving via this interface will pass the RPF check and be forwarded based on the (S, G) OIL. At this point in the example, the status of both Serial1 and Serial3 in the OIL is Forward/Dense. This causes arriving (S, G) traffic (that passes the RPF check) to initially be flooded out these two interfaces.
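The forwarding decision described above amounts to an RPF check followed by a walk of the OIL. The Python below is a minimal sketch of that decision only; the dictionary layout and the forward_packet name are assumptions for illustration, and the entry mirrors the (S, G) state shown for rtr-a.

```python
def forward_packet(entry, arrival_interface):
    """Return the interfaces an arriving (S, G) packet is sent out of."""
    # RPF check: the packet must arrive on the Incoming interface.
    if arrival_interface != entry["incoming_interface"]:
        return []  # RPF failure, packet is discarded
    # Forward only out of OIL interfaces that are not pruned.
    return [
        name for name, state in entry["oil"].items() if state == "Forward/Dense"
    ]


rtr_a_entry = {
    "incoming_interface": "Serial0",
    "oil": {"Serial1": "Forward/Dense", "Serial3": "Forward/Dense"},
}

print(forward_packet(rtr_a_entry, "Serial0"))  # arrives on the RPF interface
print(forward_packet(rtr_a_entry, "Serial1"))  # arrives on the wrong interface
```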


PIM DM Flooding
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

Arriving data causes rtr-b to create state

(*, 224.2.127.254), 00:00:12/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:12/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:12/00:02:48, flags: PT
  Incoming interface: Serial0, RPF nbr 198.92.2.31
  Outgoing interface list: Null

Arriving data causes rtr-b to create state

A parent (*, G) entry must first be created before the (S, G) entry can be created.
The (*, 224.2.127.254) entry is created and the outgoing interface list is populated with interfaces that:
    Have a PIM-DM neighbor, or
    Have a directly connected member, or
    Have been manually configured to join the group.
This results in only Serial0 being placed in the (*, G) oil.

The (S, G) entry is then created.
    The RPF information for source 128.9.160.43 is computed, which results in the Incoming interface being Serial0 and an RPF neighbor of 198.92.2.31.
    The (S, G) Outgoing interface list (oil) is populated with a copy of the (*, G) oil minus the Incoming interface Serial0. The removal of the Incoming interface results in the (S, G) oil being Null.
    The P flag is set on the (S, G) entry, which indicates that rtr-b will send a Prune message to rtr-a. (See the section on Pruning.)
    The Expires countdown timer of the (S, G) entry indicates 00:02:48. This timer would normally be reset to 00:03:00 whenever an (S, G) packet is forwarded. However, since the (S, G) entry has a Null oil, the counter will reach zero in 2 minutes and 48 seconds, at which time the (S, G) entry will be deleted. If it is the last (S, G) entry, the (*, G) entry will also be deleted. (The arrival of another (S, G) packet from rtr-a will recreate the state as described above.)

Potential PIM-DM Route Loop


[Figure: normal steady-state traffic flow from the Source; labels mark the RPF interface and the interface previously pruned by the Assert process]

Potential PIM-DM Route Loops


The non-deterministic behavior of PIM-DM along with its flood-and-prune mechanism can sometimes result in serious network outages including blackholes and multicast route loops. The network in the above example is a simplified version of a frequently used network design whereby multiple routers are used to provide redundancy in the network. Under normal steady-state conditions, traffic flows from the source via the RPF interfaces as shown. Note that the routers have performed the Assert process and one interface on one router is in the pruned state.

Potential PIM-DM Route Loop


[Figure: an interface near the Source fails (marked X); one router is labeled "This router converges first" and its new RPF interface is shown]

Potential PIM-DM Route Loops


Now let's assume that the forwarding interface of the first-hop router fails as shown above. Let's also assume that the unicast routing of the router on the left converges first and PIM computes the new RPF interface as shown.

Potential PIM-DM Route Loop


[Figure: new traffic flow from the Source after the failure (marked X); one router is labeled "But wait... this router still hasn't converged yet"; the RPF interfaces shown form a multicast route loop]

Potential PIM-DM Route Loops


Unfortunately, the middle router has not yet converged and is still forwarding multicast traffic using the old RPF interface. At this point, a multicast route loop exists in the network due to the transient condition of the two routers having opposite RPF interfaces. During the time that this route loop exists, virtually all of the bandwidth on the network segments can be consumed. This situation will continue until the router in the middle of the picture finally converges and the new correct RPF interface is calculated. Unfortunately, if the router needs some bandwidth to complete this convergence (as in the case when EIGRP goes active), then this condition will never be resolved!


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM DM Pruning
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

Initial Flooding State in rtr-a

(*, 224.2.127.254), 00:00:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:10/00:00:00
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:10/00:02:49, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Serial1, Forward/Dense, 00:00:10/00:00:00
    Serial3, Forward/Dense, 00:00:10/00:00:00

Initial Flooding State in rtr-a


Let us again review the state in the router that resulted in the initial flooding of (S, G) traffic. Pay particular attention to the following:
    Both a (*, G) and an (S, G) entry exist. In PIM-DM, (*, G) entries are created automatically as soon as the first packet (from any source) arrives for group G or when a locally connected host has joined the group via IGMP.
    Both Serial1 and Serial3 are in the Outgoing interface list (oilist) for the (S, G) entry. This is because there is a PIM neighbor on each of these interfaces in this example.
    The Expires times on the interfaces in the oilist are both 00:00:00. This is because in PIM-DM only pruned interfaces time out, since we are using a flood and prune model.

PIM DM Pruning
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0 and are flooded out S1 and S3 (step 1); rtr-b sends a Prune from its S0 toward rtr-a (step 2); rtr-a prunes the interface (step 3); rtr-b also has interface E1]

1  rtr-a initially floods (S, G) traffic out all interfaces in oilist.
2  rtr-b is a leaf node w/o receivers. Sends Prune for (S,G).
3  rtr-a Prunes interface for (S,G).

Step 1
As a result of the initial flooding state (shown in the previous slide), rtr-a is flooding (S, G) traffic out interfaces Serial3 and Serial1.

Step 2
rtr-b is a leaf node without any downstream PIM-DM neighbors or directly connected members of the group. This is reflected in rtr-b's (S, G) entry (shown in the previous slide) by the Null OIL and the corresponding P flag being set. As a result, rtr-b sends an (S, G) Prune message to rtr-a to shut off the flow of unwanted traffic.

Step 3
rtr-a responds by pruning interface Serial3. (This is reflected in the (S, G) state shown in the next slide.)

PIM DM Pruning
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

State in rtr-a after Pruning

(*, 224.2.127.254), 00:00:12/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:12/00:00:00
    Serial1, Forward/Dense, 00:00:12/00:00:00
    Serial3, Forward/Dense, 00:00:12/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:12/00:02:48, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Serial1, Forward/Dense, 00:00:12/00:00:00
    Serial3, Prune/Dense, 00:00:12/00:02:56

State in rtr-a after Pruning


Pay particular attention to the following: Serial3 in the Outgoing interface list (oilist) for the (S, G) entry is now in the Pruned state. The Expires time on interface Serial3 now shows 00:02:56 which indicates that the Prune state will expire in 2 minutes and 56 seconds. At that time, the interface will return to the Forward state and (S, G) traffic will once again be flooded to rtr-b. When this happens, rtr-b will have to send another Prune to rtr-a to shutoff the unwanted (S, G) traffic. This periodic flood and prune behavior is normal for PIM Dense mode.
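The prune timer behavior can be illustrated with a small Python model of a single OIL interface. The class, its method names, and the one-second tick granularity are assumptions for the sketch; only the 3-minute timeout comes from the text.

```python
PRUNE_TIMEOUT_SECONDS = 180  # the 3-minute prune timer described in the text


class OilInterface:
    def __init__(self, name):
        self.name = name
        self.state = "Forward/Dense"
        self.expires = 0  # shown as 00:00:00: forwarding interfaces do not expire

    def receive_prune(self):
        self.state = "Prune/Dense"
        self.expires = PRUNE_TIMEOUT_SECONDS

    def tick(self, seconds):
        """Advance time; a pruned interface returns to Forward when its timer runs out."""
        if self.state != "Prune/Dense":
            return
        self.expires = max(0, self.expires - seconds)
        if self.expires == 0:
            self.state = "Forward/Dense"


serial3 = OilInterface("Serial3")
serial3.receive_prune()
serial3.tick(4)
state_after_4s = (serial3.state, serial3.expires)    # 176 s left, i.e. 00:02:56
serial3.tick(176)
state_after_180s = (serial3.state, serial3.expires)  # back to flooding
print(state_after_4s, state_after_180s)
```

Once the interface returns to Forward/Dense, traffic is flooded again and the downstream router has to send a fresh Prune, which is the periodic behavior the notes describe.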


PIM DM Pruning
[Figure: multicast packets (128.9.160.43, 224.2.127.254) arrive at rtr-a on S0; rtr-a also has interfaces S1 and S3; rtr-b has interfaces S0 and E1]

State in rtr-b before/after Pruning

(*, 224.2.127.254), 00:00:12/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:00:12/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:00:12/00:02:48, flags: PT
  Incoming interface: Serial0, RPF nbr 198.92.2.31
  Outgoing interface list: Null

State in rtr-b after Pruning


Pay particular attention to the following: The Outgoing interface list (oilist) for the (S, G) entry is still Null and the P flag is still set. This indicates that rtr-b will send (S, G) Prunes out the Incoming interface to rtr-a which is the RPF neighbor in the direction of the source.


PIM Prune Delay on Multiaccess Networks


[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a (interfaces S0, S1, E0) shares an Ethernet via E0 with rtr-b and rtr-c (each with interfaces E0 and E1); a Receiver sits behind rtr-c's E1. rtr-b sends a Prune (1), rtr-c sends a Join (3). rtr-a: "I'll wait 3 secs to see if someone else wants (S,G) before I Prune Interface E0." (2)]

1  rtr-b is a leaf node w/o receivers. Sends Prune for (S,G).
2  rtr-a schedules a Prune for (S,G) to occur in 3 seconds.
3  rtr-c hears Prune from rtr-b. Overrides with a Join.
4  rtr-a hears Join and cancels Prune for (S,G).

PIM Prune Delay on Multi-access Networks


rtr-a schedules a prune when asked but doesn't perform it right away, because it received the (S, G) Prune on a multi-access interface. This gives any other router on the LAN the chance to override the (S, G) Prune if it still needs the (S, G) traffic. In the above example, this process occurs as follows:
    rtr-b is a leaf node with no downstream neighbors or directly connected members, so it sends an (S, G) Prune.
    rtr-a receives this (S, G) Prune and schedules the prune of interface Ethernet0 to occur in 3 seconds.
    rtr-c overhears the (S, G) Prune sent to rtr-a. (It overhears this because all PIM control messages are multicast on the local wire.) Because rtr-c has a directly connected member, it overrides the (S, G) Prune by sending an (S, G) Join to rtr-a.
    When rtr-a hears this (S, G) Join, it cancels the Prune scheduled for interface Ethernet0.

If there were local IGMP state for this group on rtr-a (i.e. a directly connected member on the LAN) and neither rtr-b nor rtr-c had downstream members (and therefore did not override the (S, G) Prune), the (S, G) Prune would be ignored by rtr-a.
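The override window can be modeled as a simple decision over the messages heard on the LAN. The sketch below is a simplification under stated assumptions: the event list format, the function name, and the treatment of only the first Prune are all invented for illustration, and real PIM timers are more involved.

```python
PRUNE_DELAY_SECONDS = 3  # prune delay on multi-access interfaces, per the text


def lan_interface_outcome(events, upstream_has_local_member=False):
    """Decide whether the upstream router keeps forwarding onto the LAN.

    events is a list of (time_in_seconds, message) tuples heard on the LAN,
    where message is "prune" or "join".
    """
    if upstream_has_local_member:
        return "forward"  # Prune is ignored: a member sits on this LAN
    prune_times = [t for t, msg in events if msg == "prune"]
    if not prune_times:
        return "forward"
    prune_at = min(prune_times)
    deadline = prune_at + PRUNE_DELAY_SECONDS
    # A Join heard before the scheduled prune fires cancels it.
    overridden = any(
        msg == "join" and prune_at <= t < deadline for t, msg in events
    )
    return "forward" if overridden else "pruned"


print(lan_interface_outcome([(0.0, "prune"), (1.0, "join")]))  # rtr-c overrides
print(lan_interface_outcome([(0.0, "prune")]))                 # nobody overrides
print(lan_interface_outcome([(0.0, "prune")], upstream_has_local_member=True))
```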


PIM Prune Delay on Multiaccess Networks


Watch out for the ripple effect of this delay!

[Figure: a Source followed by a chain of routers joined by multi-access links; a Prune travels upstream hop by hop, with a 3 sec delay at each of the four hops]

Source begins sending traffic which is flooded everywhere.
Leaf router has no receivers; sends prune which ripples up the tree.
Total time to prune back to source = 12 seconds!
Process repeats 3 minutes later when prunes time out!

Cumulative Effect of Prune Delays


The 3 second prune delay on multi-access networks is cumulative and should be taken into account during PIM-DM network design. In the above example, a source is transmitting to a multicast group for which there are no members. The normal prune process is initiated by the router on the far right. However, due to the normal 3 second Prune Delay on multi-access links, the upstream router does not prune its interface for three seconds. When this does happen, the upstream router itself then triggers a Prune to its own upstream router. Because this router is also connected via a multi-access network, this prune is also delayed by three seconds. This process continues hop by hop until it reaches the first-hop router directly connected to the source. In this example, a total of 12 seconds was required to completely shut off the flow of unwanted traffic. Unfortunately, this process is repeated three minutes later when the prunes time out and re-flooding occurs.
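The 12-second figure is simply the per-hop delay multiplied by the number of multi-access hops between the leaf router and the source. A short Python sketch of that arithmetic follows; the function name is made up, and the four-hop count is taken from the example.

```python
PRUNE_DELAY_SECONDS = 3  # prune delay on each multi-access link


def prune_completion_times(multiaccess_hops, delay=PRUNE_DELAY_SECONDS):
    """Seconds after the first Prune at which each upstream hop has pruned."""
    return [delay * hop for hop in range(1, multiaccess_hops + 1)]


times = prune_completion_times(4)
print(times)      # cumulative time at each hop: [3, 6, 9, 12]
print(times[-1])  # total time to prune back to the source: 12
```

Because the delay grows linearly with hop count, a deeper tree of multi-access links carries unwanted traffic for proportionally longer after every re-flood.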


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

Beginning State
    rtr-b and rtr-c have previously Pruned (S,G) traffic.
    rtr-a is still forwarding traffic downstream via S1.

PIM DM Grafting Example


Initially, rtr-b and rtr-c have previously Pruned (S, G) traffic. rtr-a is still forwarding traffic downstream via S1 which, for this example, we'll assume hasn't been pruned.

PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

Beginning State in rtr-a

(*, 224.2.127.254), 00:04:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:04:10/00:00:00
    Serial1, Forward/Dense, 00:04:10/00:00:00
    Ethernet0, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Ethernet0, Prune/Dense, 00:01:29/00:01:30
    Serial1, Forward/Dense, 00:04:10/00:00:00

Beginning State in rtr-a


Pay particular attention to the following: Ethernet0 in the Outgoing interface list (oilist) for the (S, G) entry is in the Pruned state.


PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

Beginning State in rtr-b

(*, 224.2.127.254), 00:04:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.1
  Outgoing interface list: Null

Beginning State in rtr-b


Pay particular attention to the following: The Incoming interface for the (S, G) entry is Ethernet0. The Outgoing interface list for the (S, G) entry is Null. The P flag is set in the (S, G) entry which indicates the entry is in the Pruned state.


PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

Beginning State in rtr-c

(*, 224.2.127.254), 00:04:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.1
  Outgoing interface list: Null

Beginning State in rtr-c


Pay particular attention to the following: The Incoming interface for the (S, G) entry is Ethernet0. The Outgoing interface list for the (S, G) entry is Null. The P flag is set in the (S, G) entry which indicates the entry is in the Pruned state.


PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces. Rcvr A on rtr-b's E1 sends an IGMP Join (1); rtr-b sends a PIM Graft to rtr-a (2); rtr-a replies with a PIM Graft-Ack (3); traffic flows out rtr-a's E0 (4)]

1  Rcvr A wishes to receive group G traffic. Sends IGMP Join for G.
2  rtr-b sends PIM Graft for Group (S,G).
3  rtr-a acknowledges with a PIM Graft-Ack.
4  rtr-a begins forwarding traffic for (S,G).

PIM DM Grafting Example (cont.)


1) Rcvr A wishes to receive group G traffic. Therefore, it sends an IGMP Host Membership Report to rtr-b. rtr-b receives the IGMP Host Membership Report and creates IGMP Group state on the interface toward Rcvr A.
2) rtr-b already has (*, G) and (S, G) state. However, the interface towards Rcvr A is in the Prune state. Therefore rtr-b sends a PIM Graft to its upstream neighbor rtr-a, in the direction of the source.
3) rtr-a receives the Graft and acknowledges with a Graft-Ack.
4) rtr-a begins forwarding traffic for (S, G).
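The four grafting steps can be traced with a toy Python model of the two routers' (S, G) entries. Everything here is a sketch under assumptions: the dictionaries, the function names, and the returned "Graft-Ack" string are illustrative stand-ins, not PIM message handling.

```python
def receiver_joins(entry, member_interface):
    """Downstream router: add the member interface; return True if a Graft is needed."""
    was_pruned = "P" in entry["flags"]
    entry["oil"][member_interface] = "Forward/Dense"
    entry["flags"].discard("P")  # the OIL is no longer Null
    entry["flags"].add("C")      # a directly connected member now exists
    return was_pruned


def receive_graft(entry, interface):
    """Upstream router: return the interface to Forward and acknowledge."""
    entry["oil"][interface] = "Forward/Dense"
    return "Graft-Ack"


# (S, G) state before grafting, following the beginning state in the example.
rtr_a = {"flags": {"T"}, "oil": {"Ethernet0": "Prune/Dense", "Serial1": "Forward/Dense"}}
rtr_b = {"flags": {"P", "T"}, "oil": {}}

reply = None
if receiver_joins(rtr_b, "Ethernet1"):          # steps 1 and 2
    reply = receive_graft(rtr_a, "Ethernet0")   # steps 3 and 4

print(reply, rtr_a["oil"], sorted(rtr_b["flags"]))
```

A Graft is only sent when the entry was previously pruned; if rtr-b were already forwarding the (S, G) traffic, adding the member interface to its OIL would be enough.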


PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

State in rtr-a after Grafting

(*, 224.2.127.254), 00:04:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Dense, 00:04:10/00:00:00
    Serial1, Forward/Dense, 00:04:10/00:00:00
    Ethernet0, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: T
  Incoming interface: Serial0, RPF nbr 198.92.1.129
  Outgoing interface list:
    Ethernet0, Forward/Dense, 00:00:25/00:00:00
    Serial1, Forward/Dense, 00:04:10/00:00:00

State in rtr-a after Grafting


Pay particular attention to the following: Both Ethernet0 and Serial1 in the Outgoing interface list (oilist) for the (S, G) entry are now in the Forward state.


PIM DM Grafting
[Figure: (S,G) packets arrive at rtr-a on S0; rtr-a also has interfaces S1 and E0; rtr-b and rtr-c share rtr-a's Ethernet via their E0 interfaces and each has an E1 interface]

State in rtr-b after Grafting

(*, 224.2.127.254), 00:04:10/00:00:00, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0, Forward/Dense, 00:04:10/00:00:00
    Ethernet1, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: CT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.1
  Outgoing interface list:
    Ethernet1, Forward/Dense, 00:00:26/00:00:00

State in rtr-b after Grafting


Pay particular attention to the following: Ethernet1 is now in the (S, G) Outgoing interface list and is in the Forward state. The P flag has been cleared in the (S, G) entry The T flag is set indicating that traffic is successfully flowing down the Shortest-Path Tree.


PIM DM Grafting
[Figure: network diagram. (S,G) packets arrive at rtr-a (interfaces S0, S1, E0); rtr-b and rtr-c (interfaces E0, E1) are downstream.]

(*, 224.2.127.254), 00:04:10/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0, Forward/Dense, 00:04:10/00:00:00

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.1
  Outgoing interface list: Null

State in rtr-c after Grafting




Notice there has been no change in the state:
The Incoming interface for the (S, G) entry is Ethernet0.
The Outgoing interface list for the (S, G) entry is Null.
The P flag is set in the (S, G) entry, which indicates the entry is in the Pruned state.

The obvious question at this point is: will rtr-c begin sending (S, G) Prunes in an attempt to shut off this unwanted traffic? The answer is no, because rate-limited Prunes are only sent on point-to-point (P2P) interfaces. This implies the following: when (S, G) traffic is received on an RPF interface that is a non-P2P link (in this case an Ethernet) and the OIL is Null, an (S, G) Prune message is sent only once, at the time the (S, G) entry transitions to a Null OIL. It also implies that when the (S, G) entry expires and is deleted, the next arriving (S, G) packet will recreate the (S, G) entry, and another single (S, G) Prune will be triggered by rtr-c, which will be overridden by an (S, G) Join from rtr-b. As long as the source remains active, this periodic Prune and Join override will occur every three minutes.


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM Assert Mechanism


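[Figure: Step 1. An incoming multicast packet (successful RPF check) arrives on S0 of Routers A and B. A (E0 .1) and B (E0 .2) both forward it onto LAN 192.168.1.0/24, where downstream Router C has a Receiver.]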

Routers A & B receive packet on an interface in their oilist !!


Only one router should continue sending to avoid duplicate packets.



The PIM Assert mechanism is used to shut off duplicate flows onto the same multi-access network. Routers detect this condition when they receive an (S, G) packet via a multi-access interface that is in the (S, G) OIL. This is explained in the example presented in the next few slides.

Step 1
Routers A & B both receive the same (S, G) traffic via their proper RPF interfaces (Serial0) and forward the packet onto the common Ethernet segment. Routers A & B will therefore each receive an (S, G) packet via a multi-access interface that is in the Outgoing Interface list of their (S, G) entry. This triggers the Assert mechanism.


PIM Assert Mechanism

[Figure: Step 2. Routers A (E0 .1) and B (E0 .2) each send an Assert <distance, metric> onto LAN 192.168.1.0/24. B is the Winner; A is the Loser and its E0 is Pruned. Downstream Router C has a Receiver.]

Routers send PIM Assert messages
  Compare distance and metric values
  Router with best route to source wins
  If metric & distance equal, highest IP adr wins
  Losing router stops sending (prunes interface)


Step 2
Routers A & B both send PIM Assert messages that contain their Administrative Distance and route Metric back to the source.
Note: The Administrative Distance and route Metric are treated as a single combined numeric value in which the Administrative Distance is the high-order part. Therefore, even though different routing protocols use different metrics, the lower Administrative Distance will take precedence.
Each router compares the received Administrative Distance/Metric value with its own, and the router with the best (lowest) value wins the Assert. In case of a tie, the highest IP address is used as the tie breaker.
The losing router will Prune its interface just as if it had received a Prune on this interface.
Note: This prune will time out in 3 minutes and cause the router to begin forwarding on this interface again. This triggers another Assert process. By the same token, if the winning router were to crash, the loser would take over the job of forwarding onto this LAN segment after its prune timed out.
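
The comparison described in Step 2 can be sketched in a few lines of Python. This is only an illustration of the rule as stated above (lowest distance, then lowest metric, then highest IP address), not router code; the `AssertInfo` type, the function names, and the distance and metric values are assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class AssertInfo:
    """Hypothetical container for the values one router advertises in an Assert."""
    router: str
    admin_distance: int
    metric: int
    address: IPv4Address


def preference(info):
    # Distance is the high-order part and metric the low-order part, so a
    # tuple comparison gives the "single combined value" behaviour.
    # The address is negated so that, on a tie, the HIGHER IP address sorts first.
    return (info.admin_distance, info.metric, -int(info.address))


def assert_winner(a, b):
    """Return the AssertInfo of the router that wins the Assert."""
    return min(a, b, key=preference)


# Illustrative values: equal distance and metric, so the tie breaker decides.
router_a = AssertInfo("A", 90, 2221056, IPv4Address("192.168.1.1"))
router_b = AssertInfo("B", 90, 2221056, IPv4Address("192.168.1.2"))
print(assert_winner(router_a, router_b).router)
```

With these made-up values the two routers tie on distance and metric, so Router B wins on the higher address (.2), which matches the Winner label in the slide.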


PIM Assert Mechanism


[Figure: Steps 3 and 4. Same topology on LAN 192.168.1.0/24: Router A (E0 .1) is the Loser, Router B (E0 .2) is the Winner. B sends a Prune (3); downstream Router C, which has a Receiver, answers with a Join (4).]

If there are no directly connected members on the interface, the winning router sends a Prune and waits 3 seconds for a Join override. This will shut off traffic if it is not needed somewhere downstream.
Router C does need traffic. Sends Join to override.

Step 3
In the case where there are no directly connected members on the LAN segment (as is the case in our example), the winning router will send an (S, G) Prune message and schedule its interface to be pruned after the normal 3 second prune delay. This mechanism allows traffic to be shut off if there are no members of the group further downstream of the LAN segment (which is not the case in the figure above).

Step 4
In this example, downstream Router C does need the traffic (it has a directly connected receiver), so it responds to the (S, G) Prune sent by the winning router by sending an overriding (S, G) Join. This cancels the scheduled Prune in Router B and thereby continues the flow of (S, G) traffic onto the transit LAN.


PIM Assert Mechanism


[Figure: Steps 5 and 6. Same topology on LAN 192.168.1.0/24: Router A (E0 .1) is the Loser, Router B (E0 .2) is the Winner. Router C has no receiver and sends no Join (5); Router B's E0 becomes Pruned (6).]

If Router C doesn't need the traffic, no Join override is sent.
Router B prunes its interface after the 3 second prune delay. This stops the flow of traffic onto the transit LAN.

Step 5
On the other hand, if none of the downstream routers need the traffic (as is the case in the example shown above), no (S, G) Join is sent to override the (S, G) Prune sent by the winning router, Router B.

Step 6
After the normal 3 second Prune delay expires without Router B having received an (S, G) Join to override the prune, it goes ahead and Prunes its interface. This shuts off the flow of traffic onto the transit LAN segment.
Note: The prune of Router B's Ethernet interface will time out after 3 minutes, just as if it had received an (S, G) prune on this interface. This means that traffic will start to flow via this interface again after 3 minutes, which will trigger Router A to start the Assert process all over again.
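
Steps 3 through 6 amount to a small timer rule, sketched below in Python. It is a simplified model under stated assumptions: times are plain seconds, the 3 minute prune timeout is counted from the moment the interface is actually pruned, and the function name and arguments are hypothetical.

```python
PRUNE_DELAY = 3    # seconds the winner waits for a Join override
PRUNE_HOLD = 180   # seconds the interface stays pruned (3 minutes)


def interface_state(prune_sent_at, join_times, now):
    """Return 'Forward' or 'Prune' for the winner's LAN interface at time `now`.

    prune_sent_at: time the winning router sent its (S, G) Prune
    join_times:    times at which overriding (S, G) Joins were heard
    """
    deadline = prune_sent_at + PRUNE_DELAY
    overridden = any(prune_sent_at <= t <= deadline for t in join_times)
    if overridden or now < deadline:
        return "Forward"          # Join override heard, or still waiting
    if now < deadline + PRUNE_HOLD:
        return "Prune"            # no override: interface is pruned
    return "Forward"              # prune timed out, traffic floods again


print(interface_state(0, [1], 10))   # Router C sent a Join (Step 4)
print(interface_state(0, [], 10))    # nobody overrode the Prune (Step 6)
print(interface_state(0, [], 200))   # prune state has timed out
```

The third call corresponds to the note in Step 6: once the prune times out, traffic flows again and the Assert process restarts.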


PIM Assert Mechanism


Downstream routers must listen for the assert winner to know which router to send prunes and grafts

[Figure: Routers A (E0 .1) and B (E0 .2) share LAN 198.92.2.0/24 with downstream Router C. A routing protocol boundary separates Router C from A and B. An RPF Neighbor label marks Router C's current upstream neighbor.]

State in Router C before Assert

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.2
  Outgoing interface list: Null

Downstream routers on the Assert LAN


It is important for downstream routers that are on the Assert LAN to note who wins the Assert process. This is because they must address any PIM (S, G) control messages (Joins, Prunes, Grafts) to the RPF neighbor (i.e. the upstream neighbor) in the direction of the source. In this example, assume that Router A has a better metric to the source. However, because there is a routing protocol boundary between Router C and the other two routers, Router C's unicast routing table does not know that Router A has a better metric to the source. As a result, the unicast routing table in Router C indicates that the best route back to the source is via Router B. This is reflected by the RPF nbr field of the (S, G) entry.


PIM Assert Mechanism


[Figure: An incoming multicast packet (successful RPF check) triggers Routers A (E0 .1) and B (E0 .2) to send Asserts onto LAN 198.92.2.0/24. A is the Assert Winner; B, Router C's RPF Neighbor, is the Assert Loser. A routing protocol boundary separates Router C from A and B.]

State in Router C:

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.2
  Outgoing interface list: Null

Downstream routers on the Assert LAN (cont.)


When traffic begins to flow, it triggers Routers A & B to send Assert messages. Because Router A has a better (lower) metric to the source than Router B, Router A wins the Assert.


PIM Assert Mechanism


[Figure: Same topology on LAN 198.92.2.0/24. Router A (E0 .1) is the Assert Winner and keeps forwarding; Router B (E0 .2), the Assert Loser and Router C's RPF Neighbor, has Pruned its E0 interface.]

State in Router C:

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.2
  Outgoing interface list: Null

Downstream routers on the Assert LAN (cont.)


Because Router B is the Assert Loser, it Prunes its interface. Traffic now flows through Router A, the Assert Winner.


PIM Assert Mechanism


[Figure: Same topology on LAN 198.92.2.0/24. Router A (E0 .1), the Assert Winner, is now Router C's RPF Neighbor; Router B (E0 .2) is the Assert Loser.]

State in Router C after Assert

(128.9.160.43/32, 224.2.127.254), 00:04:10/00:02:39, flags: PT
  Incoming interface: Ethernet0, RPF nbr 198.92.2.1
  Outgoing interface list: Null

Downstream routers on the Assert LAN (cont.)


Because Router C has overheard the Assert process (PIM control messages are multicast onto the local link), it was able to determine who has won the Assert process. Router C now updates its RPF nbr information to reflect that Router A is now the correct upstream neighbor in the direction of the source. This will result in any (S, G) PIM control traffic (Joins, Prunes, Grafts) being sent with the IP address of Router A in the Upstream Neighbor Address field of the PIM control message. If Router C didn't update its RPF nbr information and continued to send PIM control traffic (Joins, Prunes, Grafts) to Router B (the old RPF nbr), it would not be able to properly control the flow of multicast traffic, since the control messages would be going to the wrong upstream neighbor.
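
As a rough illustration of the behaviour just described, the following Python sketch models a downstream router that listens to Asserts on the LAN and points its RPF neighbor at the winner. The class, its methods, and the distance and metric values are hypothetical; only the addresses 198.92.2.1 and 198.92.2.2 come from the slides.

```python
from ipaddress import IPv4Address


class DownstreamEntry:
    """Hypothetical (S, G) state kept by a downstream router such as Router C."""

    def __init__(self, rpf_nbr):
        self.rpf_nbr = IPv4Address(rpf_nbr)   # initially from the unicast table
        self._best = None                     # best Assert preference heard so far

    def on_assert(self, sender, admin_distance, metric):
        """Process an Assert overheard on the LAN and track the winner."""
        sender = IPv4Address(sender)
        # Lower distance/metric is better; higher address wins a tie.
        candidate = (admin_distance, metric, -int(sender))
        if self._best is None or candidate < self._best:
            self._best = candidate
            self.rpf_nbr = sender

    def upstream_neighbor(self):
        """Address to place in the Upstream Neighbor Address field."""
        return self.rpf_nbr


entry = DownstreamEntry("198.92.2.2")   # Router B, per Router C's unicast table
entry.on_assert("198.92.2.2", 90, 20)   # Router B's Assert (assumed values)
entry.on_assert("198.92.2.1", 90, 10)   # Router A's Assert (assumed better metric)
print(entry.upstream_neighbor())
```

After both Asserts are heard, the entry points at Router A (198.92.2.1), which matches the RPF nbr change shown in Router C's state after the Assert.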


PIM-DM Assert Problem

[Figure: Initial Flow. Multicast packets from the Source reach a LAN segment with two Receivers by two paths, producing duplicate traffic.]


While the PIM Assert mechanism is effective in pruning off duplicate traffic, it is not without its weaknesses. Consider the above example where duplicate traffic is flowing onto a LAN segment.


PIM-DM Assert Problem

[Figure: Sending Asserts. The two routers on the Receivers' LAN exchange Assert messages; one is marked Winner and the other Loser.]


The normal PIM Assert mechanism takes place and the two routers exchange routing metrics to determine which one has the best route to the source. In this case, the bottom router has the best metric and is the Assert Winner.


PIM-DM Assert Problem

[Figure: Assert Loser Prunes Interface. Multicast packets from the Source now reach the Receivers only through the Winner.]


The normal PIM Assert mechanism takes place and the Assert Winner continues forwarding while the Assert Loser prunes its interface and starts its prune timer.


PIM-DM Assert Problem

[Figure: Assert Winner Fails. The Winner is marked as failed (X), and no multicast packets reach the Receivers.]

Traffic flow is cut off until the Prune times out on the Assert Loser.


Let's now assume that the Assert Winner fails immediately after winning the Assert process. Unfortunately, the Assert Loser has no way of knowing that the Assert Winner has failed and will wait 3 minutes before timing out its pruned interface. This results in a 3 minute (worst-case) loss of traffic.


PIM DM Concepts

PIM Neighbor Discovery
PIM DM State
PIM DM Flooding
PIM DM Pruning
PIM DM Grafting
PIM Assert Mechanism
PIM DM State Maintenance

PIM DM State Maintenance


State is maintained by the flood and prune behavior of Dense mode.
Received Multicast packets reset (S,G) entry expiration timers. When (S, G) entry expiration timers count down to zero, the entry is deleted.

Interface prune state times out every 3 minutes causing periodic reflooding and pruning.


In PIM DM, all (S, G) entries have an expiration countdown timer which is reset to 3 minutes by the receipt of an (S, G) packet received via the Shortest-Path Tree (SPT). If no further packets are received from the source, this expiration timer goes to zero and the (S, G) entry is deleted.
When a Prune message is received in PIM Dense mode, the interface on which the Prune was received is marked as Prune/Dense and a prune countdown timer is set to 3 minutes. When this timer expires, the interface is set back to Forward/Dense and traffic is again flooded out the interface. The downstream router will again send another (S, G) Prune to stop the unwanted traffic; therefore the Flood and Prune behavior occurs every 3 minutes.
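
The two 3 minute timers can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: time is measured in whole seconds, the downstream router re-prunes immediately after each reflood, and the function names are made up for the example.

```python
ENTRY_LIFETIME = 180   # (S, G) entry expiration timer, reset by each packet
PRUNE_LIFETIME = 180   # interface prune timer


def entry_alive(last_packet_at, now):
    """True while the (S, G) entry has not yet expired."""
    return now - last_packet_at < ENTRY_LIFETIME


def reflood_times(first_prune_at, until):
    """Times at which a pruned interface returns to Forward/Dense.

    Assumes the downstream router sends a new Prune right after each
    reflood, so the cycle repeats every PRUNE_LIFETIME seconds.
    """
    return list(range(first_prune_at + PRUNE_LIFETIME, until + 1, PRUNE_LIFETIME))


print(entry_alive(last_packet_at=0, now=179))
print(entry_alive(last_packet_at=0, now=180))
print(reflood_times(first_prune_at=0, until=600))
```

Over a ten minute window the interface refloods three times, which is the periodic Flood and Prune behavior the notes describe.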


PIM Dense Mode Review


[Figure: review topology with a Source, routers A through I, Receiver 1 and Receiver 2. Legend: Link, Data, Control.]


The following slides review all the major concepts previously presented, using a sample network.


PIM Dense Mode Review


[Figure: Initial Flood of Data and Creation of State. Same review topology: Source, routers A through I, Receiver 1 and Receiver 2.]

PIM Dense Mode Review (cont.)


The source starts sending and the (S, G) traffic is initially flooded everywhere.


PIM Dense Mode Review


[Figure: Prune to Non-RPF Neighbor. Same review topology; a Prune is sent on the link between routers B and C.]

PIM Dense Mode Review (cont.)


The link between B & C is not C's RPF interface for this group, so a prune is immediately sent to B and this link is removed from the tree at B.


PIM Dense Mode Review


[Figure: C and D Assert to Determine Forwarder for the LAN, C Wins. Same review topology; Asserts are exchanged between routers C and D.]

PIM Dense Mode Review (cont.)


C & D are redundant forwarders for their common Ethernet, so they assert. C wins (assume it has a better metric to the source, or that C has a higher IP address if the metrics are equal).


PIM Dense Mode Review


[Figure: I Gets Pruned. Same review topology; router I sends a Prune toward router E.]

PIM Dense Mode Review (cont.)


I prunes off - it has no need to receive the group


PIM Dense Mode Review


[Figure: E's Prune is Ignored. Same review topology; router E sends a Prune toward router C.]

PIM Dense Mode Review (cont.)


E prunes but it is ignored since C knows there is a locally attached host via IGMP state


PIM Dense Mode Review


[Figure: G's Prune is Overridden. Same review topology; router G sends a Prune and router H sends a Join Override on their common link toward router F.]

PIM Dense Mode Review (cont.)


G prunes since it doesn't need the group.
H overrides the prune since it does need it, so F continues to forward on this link.
G will continue to receive the group on the common Ethernet, but since its oif is NULL, the packets are fast switched to the bit bucket.


PIM Dense Mode Review


[Figure: New Receiver, I Sends Graft. Same review topology with a new Receiver 3 behind router I; router I sends a Graft toward router E.]

PIM Dense Mode Review (cont.)


Assume a new receiver (#3) comes on behind rtr I.
I grafts onto E.
E already had state for this group since it was still being received on the C,D,E Ethernet, so it starts sending the group to I.


PIM Dense Mode Review


[Figure: I is Grafted back onto tree. Same review topology; traffic now reaches Receiver 1 behind E and Receivers 2 and 3 behind I.]

PIM Dense Mode Review (cont.)


E already had state for this group since it was still being received on the C,D,E Ethernet, so it starts sending the group to I.
This is the final state given all events in the network.


Configuring PIM Dense Mode


ip multicast-routing

interface Ethernet 0
 ip address 1.1.1.1 255.255.0.0
 ip pim dense-mode

interface Ethernet 1
 ip address 2.2.2.2 255.255.0.0
 ip pim dense-mode

interface Serial 0
 ip address 192.1.1.1 255.255.255.252
 ip pim dense-mode

[Figure: a router with interfaces S0, E1 and E0.]

Simple to configure
  One global command
  One command per interface


Configuring PIM Dense Mode multicasting is very simple. The following commands are the only configuration commands necessary:
Add the ip multicast-routing global command to the configuration.
Add the ip pim dense-mode interface command to each interface in the router configuration to enable IP multicasting using PIM Dense mode.
(Warning: Use caution if you do not add the above command to all interfaces in the router. Problems can occur if some interfaces in the router are not running multicast. This is because the RPF check mechanism uses the unicast route table to compute the RPF interface. If the RPF interface maps to an interface that is not running multicast, RPF failures can occur.)
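
One way to act on the warning above is to scan a saved configuration for interfaces that have an IP address but no PIM command. The Python sketch below is a minimal, assumption-laden helper (the function name is made up, and it relies on interface sub-commands being indented, as they are in the configuration text a router displays); it is not an IOS feature.

```python
def interfaces_without_pim(config):
    """Return names of interfaces that have 'ip address' but no 'ip pim'."""
    blocks = {}
    current = None
    for raw in config.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            # Indented line: a sub-command of the current interface, if any.
            if current is not None:
                blocks[current].append(raw.strip())
        elif raw.startswith("interface "):
            current = raw[len("interface "):].strip()
            blocks[current] = []
        else:
            current = None   # some other global command ends the block
    return [
        name for name, lines in blocks.items()
        if any(l.startswith("ip address ") for l in lines)
        and not any(l.startswith("ip pim ") for l in lines)
    ]


# Illustrative configuration: Ethernet1 is deliberately missing PIM.
SAMPLE = """\
ip multicast-routing
interface Ethernet0
 ip address 1.1.1.1 255.255.0.0
 ip pim dense-mode
interface Ethernet1
 ip address 2.2.2.2 255.255.0.0
interface Serial0
 ip address 192.1.1.1 255.255.255.252
 ip pim dense-mode
"""
print(interfaces_without_pim(SAMPLE))
```

Any interface the helper reports is a candidate for the RPF failures described in the warning, if the unicast route back to a source points out that interface.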


Basic Multicast Debugging


Module 4

Module4. ppt

1998 2001, Cisco Systems, Inc. All rights reserved.

8/10/2001 1:55 PM

Copyright ? ?1998-2001, Cisco Systems, Inc.

Module4.ppt

Module Objectives

Introduction to IOS Command Line Interface (CLI) tools
Understand usage and key information fields for IOS CLI tools in troubleshooting and monitoring the router and network
Develop a strategy for debugging multicast networks


Module Agenda

Router Command Review
Debugging Strategies


Router Command Review

Show commands
Debug commands
Other useful commands


show ip igmp groups

R4#show ip igmp group
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires     Last Reporter
224.1.1.1        Ethernet1    3d16h     00:01:59    172.16.7.2
224.0.1.40       Ethernet0    4d15h     never       172.16.6.2

Uptime - shows how long there has been membership for the listed group on that interface
Expires - shows when membership interest will end. IGMP reports from client members of this group are what keep this timer from expiring; you should see this value reset and not time out as long as there are members present. When this timer expires, the multicast routing protocol is notified to stop delivery of that group onto this interface
Only the last IGMP reporter is listed - this is due to report suppression


show ip igmp interface


R4#show ip igmp interface
Ethernet1 is up, line protocol is up
  Internet address is 172.16.7.1, subnet mask is 255.255.255.0
  IGMP is enabled on interface
  Current IGMP version is 2
  CGMP is disabled on interface
  IGMP query interval is 60 seconds
  IGMP querier timeout is 120 seconds
  IGMP max query response time is 10 seconds
  Inbound IGMP access group is not set
  Multicast routing is enabled on interface
  Multicast TTL threshold is 0
  Multicast designated router (DR) is 172.16.7.1 (this system)
  IGMP querying router is 172.16.7.1 (this system)
  No multicast groups joined

This is the command to verify whether IGMP and CGMP are enabled or disabled on the interface
IGMP version can be verified with this command. This is important if you have a mixed environment of multicast routing protocols running, or other routers that support different versions of IGMP; some IGMP configuration may be required
IGMP timers can be verified here for tuning purposes
The multicast designated router (DR) and IGMP querier for this link can also be determined with this command


show ip pim neighbor

R6#show ip pim neighbor
PIM Neighbor Table
Neighbor Address    Interface    Uptime    Expires     Mode
172.16.10.2         Serial0      4d15h     00:01:19    Dense
172.16.11.2         Serial1      4d15h     00:01:00    Dense
172.16.9.1          Ethernet0    4d15h     00:01:00    Dense

Uptime - indicates how long the neighbor adjacency has existed
Expires - indicates when the adjacency will time out and be removed. PIM hellos maintain this adjacency
Mode - indicates what mode the interface is running in


show ip pim interface

R6#show ip pim interface
Address        Interface    Mode     Nbr Count    Query Intvl    DR
172.16.10.1    Serial0      Dense    1            30             0.0.0.0
172.16.11.1    Serial1      Dense    1            30             0.0.0.0
172.16.9.2     Ethernet0    Dense    1            30             172.16.9.2

Nbr Count = number of neighbors on this link
DR = 0.0.0.0 on the serial links in this example because p2p links do not have DRs


show ip rpf
R4#show ip rpf 172.16.8.1
RPF information for R1 (172.16.8.1)
  RPF interface: Ethernet0
  RPF neighbor: R3 (172.16.6.1)
  RPF route/mask: 172.16.8.0/255.255.255.0
  RPF type: unicast

R4#sh ip rpf 172.16.12.2
RPF information for Source1 (172.16.12.2)
  RPF interface: Ethernet0
  RPF neighbor: R6 (172.16.11.1)
  RPF route/mask: 172.16.12.0/255.255.255.0
  RPF type: unicast

The top example obtains RPF information for the RP (on R1)
The RPF interface is the interface used to reach the target address (the RP itself in this example)
Also shown are the RPF neighbor on the RPF interface and the route and mask used to reach the target address
The second example is the RPF information for the source of the multicast group
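
The idea behind the RPF fields can be illustrated with a small longest-prefix-match lookup in Python. This is a sketch, not how IOS implements it: the route table is hypothetical (the 172.16.8.0/24 entry mirrors the first example; the 172.16.0.0/16 entry via Serial0 is invented to show longest-match selection), and the function names are assumptions.

```python
from ipaddress import ip_address, ip_network

# Hypothetical unicast routes: (prefix, outgoing interface, next hop).
ROUTES = [
    (ip_network("172.16.8.0/24"), "Ethernet0", "172.16.6.1"),
    (ip_network("172.16.0.0/16"), "Serial0", "172.16.10.2"),  # invented entry
]


def rpf_lookup(source, routes):
    """Return the most specific route covering `source`, or None."""
    addr = ip_address(source)
    matches = [r for r in routes if addr in r[0]]
    if not matches:
        return None
    return max(matches, key=lambda r: r[0].prefixlen)


def rpf_check(source, arrival_interface, routes):
    """A packet passes the RPF check only if it arrived on the RPF interface."""
    route = rpf_lookup(source, routes)
    return route is not None and route[1] == arrival_interface


print(rpf_lookup("172.16.8.1", ROUTES))
print(rpf_check("172.16.8.1", "Ethernet0", ROUTES))
print(rpf_check("172.16.8.1", "Serial0", ROUTES))
```

The route returned by `rpf_lookup` corresponds to the RPF route/mask, RPF interface and RPF neighbor lines in the command output.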


show ip route
R4#show ip route
Gateway of last resort is not set

     172.16.0.0/24 is subnetted, 7 subnets
D       172.16.2.0 [90/2354611] via 172.16.6.1, 4d15h, Ethernet0
D       172.16.3.0 [90/2354611] via 172.16.6.1, 4d15h, Ethernet0
D       172.16.4.0 [90/2221056] via 172.16.6.1, 4d15h, Ethernet0
D       172.16.5.0 [90/2221056] via 172.16.6.1, 4d15h, Ethernet0
C       172.16.6.0 [90/2281542] via 172.16.6.1, 4d15h, Ethernet0
D       172.16.10.0 [90/2281542] via 172.16.6.1, 4d15h, Ethernet0
D       172.16.8.0 [90/2221056] via 172.16.6.1, 4d15h, Ethernet0
     192.169.1.0/24 is subnetted, 1 subnets
D       192.169.1.0 [90/2349056] via 172.16.6.1, 3d15h, Ethernet0

This slide is for reference only for the following slides; the table is taken from R4
Recall that multicast forwarding decisions are made based on the unicast routing table. Make sure you understand the UNICAST topology and stability before looking at MULTICAST issues


show ip mroute summary

R6#show ip mroute summary
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
Timers: Uptime/Expires
Interface state: Interface, Next-Hop, State/Mode

(*, 224.1.1.1), 00:01:47/00:02:55, RP 0.0.0.0, flags: D
  (172.16.12.2/32, 224.1.1.1), 00:01:47/00:02:54, flags: CT

(*, 224.0.1.40), 3d16h/00:00:00, RP 0.0.0.0, flags: DCL

A summarized version of the multicast routing table


show ip mroute
barrnet-gw>show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
Timers: Uptime/Expires
Interface state: Interface, Next-Hop, State/Mode

(*, 224.2.130.100), 00:18:53/00:02:59, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Fddi1/0, Forward/Dense, 00:09:20/00:02:38
    Hssi3/0, Forward/Dense, 00:18:53/00:00:00

(208.197.169.209/32, 224.2.130.100), 00:18:53/00:02:27, flags: T
  Incoming interface: Hssi3/0, RPF nbr 131.119.26.9
  Outgoing interface list:
    Fddi1/0, Forward/Dense, 00:16:16/00:02:38

(*, 239.100.111.224), 05:35:08/00:02:58, RP 171.69.10.13, flags: DP
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list: Null

Partial output taken from a production router in Cisco's network, for more interesting output
This is a generic multicast routing table
Note the:
  (*,G) and (S,G) entries
  incoming interface
  outgoing interface list (OIF)
  RP (if any)
  Flags
  times - how long the entry has been in the table and when it will expire
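
To make the fields listed above concrete, here is a small Python sketch that pulls them out of an entry header line. The regular expression is written only against the output format shown on this slide and is an assumption about that format, not a general parser for every IOS release; the names `ENTRY` and `parse_entry` are made up.

```python
import re

# Matches header lines such as:
#   (*, 224.2.130.100), 00:18:53/00:02:59, RP 0.0.0.0, flags: D
#   (208.197.169.209/32, 224.2.130.100), 00:18:53/00:02:27, flags: T
ENTRY = re.compile(
    r"^\((?P<source>\*|[\d.]+(?:/\d+)?), (?P<group>[\d.]+)\), "
    r"(?P<uptime>[^/,\s]+)/(?P<expires>[^,\s]+),"
    r"(?: RP (?P<rp>[\d.]+),)? flags: ?(?P<flags>[A-Z]*)$"
)


def parse_entry(line):
    """Return a dict of fields for an mroute entry header, or None."""
    match = ENTRY.match(line.strip())
    return match.groupdict() if match else None


star_g = parse_entry("(*, 224.2.130.100), 00:18:53/00:02:59, RP 0.0.0.0, flags: D")
s_g = parse_entry(
    "(208.197.169.209/32, 224.2.130.100), 00:18:53/00:02:27, flags: T"
)
print(star_g)
print(s_g)
```

A `source` of `*` identifies a (*,G) entry, and `rp` is None for (S,G) lines, which carry no RP field in this output.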


show ip mroute active


barrnet -gw>show barrnet -gw>show ip ip mroute mroute active active Active Active IP IP Multicast Multicast Sources Sources -- sending sending >= >= 44 kbps kbps Group: Group: 224.2.154.118, 224.2.154.118, Radio Radio Bandit Bandit Source: pilsnet .sunet.se) Source: 192.36.125.68 192.36.125.68 (falcon. (falcon. pilsnet .sunet.se) Rate: 11 pps/30 kbps(1sec), Rate: 11 pps/30 kbps(1sec), 30 30 kbps(last kbps(last 33 33 secs), secs), 23 23 kbps(life kbps(life avg) avg) Group: Group: 224.2.246.13, 224.2.246.13, UO UO Presents Presents KWAX KWAX Classical Classical Radio Radio Source: Source: 128.223.83.204 128.223.83.204 (d83-204.uoregon.edu) (d83-204.uoregon.edu) Rate: 24 pps/69 kbps(1sec), 72 kbps(last 2 secs), )) Rate: 24 pps/69 kbps(1sec), 72 kbps(last 2 secs), 70 70 kbps(life kbps(life avg avg Group: Group: 224.2.180.115, 224.2.180.115, ANL ANL TelePresence TelePresence Microscopy Microscopy Site Site Source: .anl.gov )) Source: 146.139.72.5 146.139.72.5 (aem005.amc (aem005.amc .anl.gov Rate: /5 ), )) Rate: 11 pps pps /5 kbps(1sec), kbps(1sec), 99 kbps(last kbps(last 52 52 secs secs ), 12 12 kbps(life kbps(life avg avg ... ...

Shows all active groups with an aggregate bandwidth greater than the specified kbps (4 kbps is the default). Listed in each entry is:
  group address
  session name
  source address and domain name
  averaged pps and kbps rates for this flow

show ip mroute count


sj-mbone> show ip mroute count
IP Multicast Statistics
1460 routes using 471528 bytes of memory
404 groups, 2.61 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 224.2.234.11, Source count: 1, Group pkt count: 3244
  RP-tree: Forwarding: 3244/0/1198/0, Other: 3244/0/0
  Source: 171.69.235.123/32, Forwarding: 0/0/0/0, Other: 0/0/0

Group: 224.2.247.22, Source count: 3, Group pkt count: 369
  RP-tree: Forwarding: 366/0/92/0, Other: 366/0/0
  Source: 171.69.10.13/32, Forwarding: 0/0/0/0, Other: 0/0/0
  Source: 171.69.200.191/32, Forwarding: 0/0/0/0, Other: 19/0/19
  Source: 171.69.248.71/32, Forwarding: 3/0/112/0, Other: 239/123/113
...


Useful for seeing statistics on each routing entry
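The slash-separated counters are easier to read once each position is named. The following Python sketch splits one counter fragment using the field order given in the "Forwarding Counts" and "Other counts" header lines of the output. The class and function names are illustrative assumptions, and the parser expects exactly the fragment format shown in the sample.

```python
from typing import NamedTuple


class Forwarding(NamedTuple):
    pkt_count: int
    pkts_per_sec: int
    avg_pkt_size: int
    kbps: int


class Other(NamedTuple):
    total: int
    rpf_failed: int
    other_drops: int


def parse_counts(fragment):
    """Split 'Forwarding: a/b/c/d, Other: x/y/z' into two named tuples."""
    fwd_text, other_text = fragment.split(", Other: ")
    fwd_text = fwd_text.replace("Forwarding: ", "")
    fwd = Forwarding(*(int(n) for n in fwd_text.split("/")))
    other = Other(*(int(n) for n in other_text.split("/")))
    return fwd, other


# RP-tree counters for group 224.2.234.11 from the sample output.
fwd, other = parse_counts("Forwarding: 3244/0/1198/0, Other: 3244/0/0")
print(fwd)
print(other)
```

A non-zero `rpf_failed` or `other_drops` value on an entry is usually the first thing to look at when packets arrive but are not forwarded.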


show ip mcache

R6#show ip mcache
IP Multicast Fast-Switching Cache
(172.16.12.2/32, 224.1.1.1), Ethernet1, Last used: 00:02:33
  Serial0  MAC Header: 0F000800
  Serial1  MAC Header: 0F000800

Displays the IP multicast (IPmc) fast-switching cache. Useful for debugging fast-switching bugs.

show ip pim rp mapping

sjck-rp1>show ip pim rp mapping
PIM Group-to-RP Mappings
This system is an RP (Auto-RP)
This system is an RP-mapping agent (Loopback1)

Group(s) 224.0.0.0/4
  RP 171.69.10.13 (sj-mbone-loopback0.cisco.com), v2v1
    Info source: 171.69.10.13 (sj-mbone-loopback0.cisco.com), via Auto-RP
         Uptime: 4w4d, expires: 00:02:55
Group(s) 239.192.111.0/24
  RP 192.168.165.15 (sjc25b-00rp-gw1-loop1.cisco.com), v2v1
    Info source: 192.168.165.15 (sjc25b-00rp-gw1-loop1.cisco.com), via Auto-RP
         Uptime: 1d18h, expires: 00:02:35

This command lists the contents of the Group-to-RP Mapping Cache. In the example above, there are two group ranges covered by two different RPs, both of which have been learned via Auto-RP. (RPs can be learned either dynamically or by static configuration.) Note that there can be multiple RPs in the network, each supporting a different multicast address range.
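Because the two group ranges in the example overlap (239.192.111.0/24 lies inside 224.0.0.0/4), it helps to see how a group address could be resolved to one RP. The Python sketch below picks the most specific matching range. Longest-prefix selection is an assumption of this sketch and is not stated on the slide; the names `RP_MAPPINGS` and `rp_for_group` are illustrative.

```python
import ipaddress

# Group ranges and RP addresses copied from the example output.
RP_MAPPINGS = {
    ipaddress.ip_network("224.0.0.0/4"): "171.69.10.13",
    ipaddress.ip_network("239.192.111.0/24"): "192.168.165.15",
}


def rp_for_group(group):
    """Return the RP for the most specific range containing `group`, or None."""
    addr = ipaddress.ip_address(group)
    matches = [net for net in RP_MAPPINGS if addr in net]
    if not matches:
        return None
    # Assumption: the longest prefix wins when ranges overlap.
    best = max(matches, key=lambda net: net.prefixlen)
    return RP_MAPPINGS[best]


print(rp_for_group("239.192.111.5"))  # falls in the /24 range
print(rp_for_group("224.2.130.100"))  # only the /4 range matches
```

On a router, `show ip pim rp <group>` remains the authoritative way to see which RP is actually in use for a group.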

show ip sdr
dallas-gw>show ip sdr
SDR Cache - 450 entries
  *cisco 100k Field
  *cisco 100K Field Sales Office
  *cisco 500k San Jose & RTP
  *cisco 500k SJ and RTP
...

By default, sdr cache entries are not deleted - use the command ip sdr cache-timeout <minutes> to remove cache entries after a period of time.


show ip sdr detail


dallas-gw>show ip sdr detail
SDR Cache - 450 entries
Session Name: *cisco 100K Field
  Description: 100K Video Continuous Test Channel
  Group: 0.0.0.0, ttl: 0, Contiguous allocation: 1
  Uptime: 3w0d, Last Heard: 00:09:44
  Announcement source: 171.68.224.10
  Created by: - 27981 25 IN IP4 171.68.224.8
  Phone number: TRC <(408) 526-8888>
  Email:
  URL: http://171.68.223.153/CustAdv/InfoSys/TRC/guides/webcast.html
  Media: video 54002 RTP/AVP 31 32 96
    Media group: 224.2.247.65, ttl: 15
    Attribute: quality:8
    Attribute: framerate:20
    Attribute: rtpmap:96 WBIH/90000
  Media: audio 23704 RTP/AVP 3 0 14 5 96 97 98 99 100 101 102 103
    Media group: 224.2.220.101, ttl: 15
    Attribute: rtpmap:96 L8/22050/2
    Attribute: rtpmap:97 L8/22050
    Attribute: rtpmap:98 L8/11025/2
...
show ip sdr [group | "session-name" | detail]
Displays the contents of the session directory cache.
The example shown is an advertisement of a Cisco-internal IP/TV broadcast.

Router Command Review

Show commands
Debug commands
Other useful commands


debug ip igmp

R4# debug ip igmp
IGMP: Send v2 Query on Ethernet1 to 224.0.0.1
IGMP: Received v2 Report from 172.16.7.2 (Ethernet1) for 224.1.1.1
IGMP: Received v2 Query from 172.16.6.1 (Ethernet0)
IGMP: Set report delay time to 2.2 seconds for 224.0.1.40 on Ethernet0
IGMP: Send v2 Report for 224.0.1.40 on Ethernet0
IGMP: Received v2 Report from 172.16.6.1 (Ethernet0) for 224.0.1.40
IGMP: Received v2 Report from 172.16.6.1 (Ethernet0) for 224.0.1.40

This is a useful debug to make sure you are sending queries and to determine the query interval. It is also useful for figuring out which IGMP version the clients are using, based on the reports they send back when queried.
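When the debug output is captured to a file, the IGMP version used by each client can be tallied from the "Received ... Report" lines. The Python sketch below assumes the line format shown in the sample output; the regular expression and the function name `versions_by_host` are illustrative, and other software releases may word these messages differently.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   IGMP: Received v2 Report from 172.16.7.2 (Ethernet1) for 224.1.1.1
REPORT = re.compile(r"IGMP: Received v(\d) Report from (\S+) \((\S+)\) for (\S+)")


def versions_by_host(lines):
    """Map each reporting host to the set of IGMP versions seen in its Reports."""
    seen = defaultdict(set)
    for line in lines:
        match = REPORT.search(line)
        if match:
            seen[match.group(2)].add(int(match.group(1)))
    return dict(seen)


sample = [
    "IGMP: Send v2 Query on Ethernet1 to 224.0.0.1",
    "IGMP: Received v2 Report from 172.16.7.2 (Ethernet1) for 224.1.1.1",
    "IGMP: Received v2 Query from 172.16.6.1 (Ethernet0)",
    "IGMP: Received v2 Report from 172.16.6.1 (Ethernet0) for 224.0.1.40",
]
print(versions_by_host(sample))
```

A host that shows up with more than one version in its set is a hint that mixed IGMP versions are present on that segment.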

debug ip mpacket

R6# debug ip mpacket 224.1.1.1 detail
IP: MAC sa=00e0.b063.cf4b (Ethernet1), IP last-hop=172.16.12.2
IP: IP tos=0x0, len=100, id=0x175, ttl=254, prot=1
IP: s=172.16.12.2 (Ethernet1) d=224.1.1.1 len 114, mroute olist null

Decode of a multicast packet.
USE CAUTION when turning on packet-level debugging, especially when the router is servicing high multicast loads!

debug ip mrouting

R6# debug ip mrouting 224.1.1.1
MRT: Create (*, 224.1.1.1), RPF Null, PC 0x6032D254
MRT: Create (172.16.12.2/32, 224.1.1.1), RPF Ethernet1/0.0.0.0, PC 0x6032D378


Useful for watching multicast routing table maintenance


debug ip pim

R4# debug ip pim 224.1.1.1
PIM: Send Router-Query on Ethernet0
PIM: Send Router-Query on Ethernet1
PIM: Received Router-Query on Ethernet0 from 172.16.6.1

Periodic Router-Query messages are used to keep track of PIM neighbors; they create and maintain neighbor adjacencies. There is no other PIM router on Ethernet1, but R3 is seen on Ethernet0.

debug ip pim (cont)


R4#
PIM: Building Join/Prune message for 224.1.1.1
PIM: For RP, Join-list: 172.16.8.1/32, RP-bit, WC-bit
PIM: Send periodic Join/Prune to RP via 172.16.6.1 (Ethernet0)
PIM: Received RP-Reachable on Ethernet0 from 172.16.8.1
     for group 224.1.1.1
PIM: Update RP expiration timer (270 sec) for 224.1.1.1

Here, the router is configured with the RP's address and hence sends out a periodic JOIN towards the RP. The RP in turn sends back an RP-Reachable message. The WC-bit indicates (*,G) state setup.

debug ip pim (cont)

R1#
PIM: Received Join/Prune on Serial0 from 172.16.3.2
PIM: Join-list: (*, 224.1.1.1) RP 172.16.8.1, RP-bit set, S-bit set
PIM: Add Serial0/172.16.3.2 to (*, 224.1.1.1), Forward state
PIM: Received Join/Prune on Serial0 from 172.16.3.2
PIM: Building Join/Prune message for 224.1.1.1
PIM: Send RP-reachability for 224.1.1.1 on Serial0

On R1, which is the RP for group 224.1.1.1:
The RP receives periodic JOINs for the (*,G), which is the pre-existing state in PIM Sparse mode. The RP updates its OIF list for the (*,G) and sends back an RP-Reachability message.

debug ip pim (cont)


R6#
PIM: Check RP 172.16.8.1 into the (*, 224.1.1.1) entry
PIM: Send Register to 172.16.8.1 for 172.16.12.2, group 224.1.1.1
PIM: Received RP-Reachable on Serial1 from 172.16.8.1
PIM: Received RP-Reachable on Serial2 from 172.16.8.1

PIM: Received Join/Prune on Ethernet0 from 172.16.9.1
PIM: Join-list: (172.16.12.2/32, 224.1.1.1), S-bit set
PIM: Add Ethernet0/172.16.9.1 to (172.16.12.2/32, 224.1.1.1), Forward state
PIM: Building Join/Prune message for 224.1.1.1
PIM: No sources in join or prune list
PIM: Received Join/Prune on Serial1 from 172.16.11.2
PIM: Join-list: (172.16.12.2/32, 224.1.1.1), S-bit set
PIM: Add Serial1/172.16.11.2 to (172.16.12.2/32, 224.1.1.1), Forward state

PIM: Send Null Register to 172.16.8.1
PIM: Received Register-Stop on Ethernet0 from 172.16.8.1

Taken from R6 (the router connected to the source). This shows the initiation of the shared tree in PIM Sparse mode.
Part 1 - When the source initiates transmission to group 224.1.1.1, R6 uses its (*,G) entry and sends the data to the RP encapsulated in Register packets for the source 172.16.12.2.
Part 2 - It then creates an (S,G) entry of the form (172.16.12.2/32, 224.1.1.1). JOINs from its PIM neighbors come in, causing the interfaces on which the JOINs are received to be added to the OIF list in the mroute table.
Part 3 - R6 now starts sending periodic Null Register messages to the RP and receives Register-Stop messages. This is for maintenance of the tree.

debug ip pim (cont)


R1#
PIM: Received Register on Ethernet1 from 172.16.9.2
PIM: Forward decapsulated data packet for 224.1.1.1 on Serial0
--------
PIM: Send Join on Ethernet1 to 172.16.8.2 for (172.16.12.2/32, 224.1.1.1)
PIM: Received Join/Prune on Serial0 from 172.16.3.2
PIM: Join-list: (172.16.12.2/32, 224.1.1.1), S-bit set
PIM: Add Serial0/172.16.3.2 to (172.16.12.2/32, 224.1.1.1), Forward state
PIM: Send RP-reachability for 224.1.1.1 on Serial0
--------
PIM: Received Join/Prune on Serial0 from 172.16.3.2
PIM: Join-list: (*, 224.1.1.1) RP 172.16.8.1, RP-bit set, S-bit set
PIM: Add Serial0/172.16.3.2 to (*, 224.1.1.1), Forward state
--------
PIM: Building Join/Prune message for 224.1.1.1
PIM: For 172.16.8.2, Join-list: 172.16.12.2/32
PIM: Send periodic Join/Prune to 172.16.8.2 (Serial0)
--------
PIM: Received Register on Ethernet1 from 172.16.9.2
PIM: Send Register-Stop to 172.16.9.2 for 0.0.0.0, group 0.0.0.0
--------
PIM: Received Join/Prune on Serial0 from 172.16.8.2
PIM: Prune-list: (172.16.12.2/32, 224.1.1.1) RP-bit set

On R1 (the RP):
The RP receives the Register messages from router R6, decapsulates the data from the source, and forwards it down the tree towards the receiver using the pre-existing (*, 224.1.1.1) state.
It sends a JOIN towards the source for (S,G) -> (172.16.12.2, 224.1.1.1). This builds the (S,G) mtree from the RP to the source (to stop the encapsulated data flow and switch to a native IPmc flow).
Meanwhile, the (*,G) is periodically renewed by the routers on the receiver side of the mtree.
The RP continues to send out periodic JOINs for (S,G) to maintain state.
The RP continues to receive the Null Register messages sent out by R6.
The RP then receives a PRUNE from R5 for (S,G) with the RP-bit set. The RP-bit indicates that the tree is switching from a shared tree to the shortest-path tree (SPT). The S-bit also signifies the switch.

Router Command Review

Show commands
Debug commands
Other useful commands


mtrace and mstat commands

Based on Unix mtrace command
Split into two separate commands
Both use the same mechanism
  draft-ietf-idmr-traceroute-ipm-xx.txt


mtrace/mstat - How it works

[Figure: Mtrace Packet Flow. Labels: src, First-hop Router, Multicast Dist. Tree, Mtrace Packet, Last-hop Router, dest, Unix Workstation or Cisco Router, mtrace request, mtrace response. Each router along the path is annotated "Adds mtrace data".]

Note: Mtrace packets use special IGMP packets with IGMP Type codes of 0x1E and 0x1F.

mtrace/mstat - How it works

Uses a special IGMP packet type
  IGMP type 0x1F = Queries/Requests
  IGMP type 0x1E = Response
Requestor sends Query/Request packet
  Sent to last-hop router of destination
  Can be initiated by 3rd party
Last-hop router rcvs Query packet
  Converts packet to traceroute Request
  Unicasts to upstream router toward source
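The two IGMP type codes named on this slide can be captured in a small enumeration, which is handy when labelling packets in a capture. This is a minimal Python sketch; `MtraceType` and `describe` are illustrative names, and the sketch only classifies a type value, it does not parse real packets.

```python
from enum import IntEnum


class MtraceType(IntEnum):
    """IGMP type codes used by mtrace/mstat, as listed on the slide."""
    QUERY_REQUEST = 0x1F  # sent by the requestor, then passed upstream toward the source
    RESPONSE = 0x1E       # carries the collected hop-by-hop data back to the requestor


def describe(igmp_type):
    """Name an IGMP type code if it is one of the two mtrace codes."""
    try:
        return MtraceType(igmp_type).name
    except ValueError:
        return "not an mtrace packet"


print(describe(0x1F))
print(describe(0x1E))
print(describe(0x11))
```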

mtrace/mstat - How it works

Each hop adds data to packet
  Query arrival time
  Incoming Interface
  Outgoing Interface
  Prev. Hop Router address
  Input packet count
  Output packet count
  Total packets for this Source/Group
  Routing Protocol
  TTL Threshold
  Forwarding/Error Code
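The per-hop data listed above can be pictured as one record appended by each router as the request travels toward the source. The Python sketch below models that accumulation. The field types, class names, and sample values (including the interface names) are assumptions made for illustration; this is not the on-the-wire encoding.

```python
from dataclasses import dataclass, field


@dataclass
class HopRecord:
    """One router's contribution; field names follow the slide's list."""
    query_arrival_time: float
    incoming_interface: str
    outgoing_interface: str
    prev_hop_router: str
    input_packet_count: int
    output_packet_count: int
    total_packets_for_sg: int
    routing_protocol: str
    ttl_threshold: int
    forwarding_code: int


@dataclass
class TracePacket:
    source: str
    group: str
    hops: list = field(default_factory=list)

    def add_hop(self, record):
        """Each router on the reverse path appends its own record."""
        self.hops.append(record)


# Hypothetical values, only to show the accumulation.
pkt = TracePacket(source="172.17.67.43", group="224.2.156.43")
pkt.add_hop(HopRecord(0.0, "Serial0", "Ethernet0", "171.68.86.193", 196, 196, 70, "PIM", 0, 0))
pkt.add_hop(HopRecord(0.03, "Serial1", "Serial0", "171.68.27.2", 195, 195, 70, "PIM", 0, 0))
print(len(pkt.hops), "hop records collected")
```

By the time the first-hop router converts the packet to a Response, the list holds one record per router on the path, which is what mtrace and mstat then display.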

mtrace/mstat - How it works

1st-hop router receives Request
  Adds own response data
  Converts packet to Response type
  Sends response back to Requestor
Requestor receives Response packet
  Packet contains hop-by-hop trace info


mtrace

Shows:
  Multicast path from source to receiver.
    Similar to unicast trace command
    Trace path between any two points in network
    TTL thresholds and delay shown at each node
Troubleshooting usage:
  Find where multicast traffic flow stops.
    Focus on router where flow stops
  Verify path multicast traffic is following.
    Identify sub-optimal paths.

mtrace

dallas-gw>mtrace bloom-iptv-svr bwilliam-ss5 224.2.156.43
Type escape sequence to abort.
Mtrace from 172.17.67.43 to 171.68.37.121 via group 224.2.156.43
From source (?) to destination (bwilliam-ss5.cisco.com)
Querying full reverse path...
 0  bwilliam-ss5 (171.68.37.121)
-1  dallas-gw (171.68.37.1) PIM thresh^ 0  3 ms
-2  wan-gw4 (171.68.86.193) PIM thresh^ 0  32 ms
-3  bloomington-mn-gw (171.68.27.2) PIM thresh^ 0  717 ms
-4  bloom-mnlab (171.68.39.28) PIM thresh^ 0  730 ms
-5  bloom-iptv-svr (172.17.67.43)
dallas-gw>
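One way to use this output is to look for the hop where the reported delay jumps the most. The Python sketch below parses the hop lines and reports the largest increase between consecutive routers. It treats the ms column as a running total along the reverse path, which is an assumption of this sketch; the regular expression and function names are illustrative and tied to the line format in the sample.

```python
import re

# Matches hop lines such as:
#   -3  bloomington-mn-gw (171.68.27.2) PIM thresh^ 0  717 ms
HOP = re.compile(r"^\s*(-?\d+)\s+(\S+)\s+\((\S+)\)(?:.*?(\d+)\s+ms)?\s*$")


def parse_hops(lines):
    """Return (hop number, router name, delay in ms or None) per hop line."""
    hops = []
    for line in lines:
        match = HOP.match(line)
        if match:
            delay = int(match.group(4)) if match.group(4) else None
            hops.append((int(match.group(1)), match.group(2), delay))
    return hops


def largest_delay_jump(hops):
    """Return (increase in ms, router name) for the biggest step, or None."""
    timed = [(name, delay) for _, name, delay in hops if delay is not None]
    jumps = [(b[1] - a[1], b[0]) for a, b in zip(timed, timed[1:])]
    return max(jumps) if jumps else None


sample = [
    " 0  bwilliam-ss5 (171.68.37.121)",
    "-1  dallas-gw (171.68.37.1) PIM thresh^ 0  3 ms",
    "-2  wan-gw4 (171.68.86.193) PIM thresh^ 0  32 ms",
    "-3  bloomington-mn-gw (171.68.27.2) PIM thresh^ 0  717 ms",
    "-4  bloom-mnlab (171.68.39.28) PIM thresh^ 0  730 ms",
    "-5  bloom-iptv-svr (172.17.67.43)",
]
print(largest_delay_jump(parse_hops(sample)))
```

In the sample trace the biggest step is between wan-gw4 and bloomington-mn-gw, so that link would be the first place to look.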


mstat

Shows:
  Multicast path in pseudo-graphic format.
    Trace path between any two points in network
    Drops/duplicates shown at each node
    TTLs and delay shown at each node
Troubleshooting usage:
  Locate congestion point in the flow.
    Focus on router with high drop/duplicate count
    Duplicates indicated as negative drops


mstat

dallas-gw>mstat 172.17.67.43 bwilliam-ss5 224.2.156.43
Source         Response Dest   Packet Statistics For   Only For Traffic
172.17.67.43   171.68.86.194   All Multicast Traffic   From 172.17.67.43
     |       __/  rtt  547 ms  Lost/Sent = Pct  Rate   To 224.2.156.43
     v      /     hop  547 ms  ---------------------   --------------------
172.17.67.33
171.68.39.28   bloom-mnlab
     |     ^      ttl    0
     v     |      hop -409 ms   -11/168 = --%  16 pps    0/67 =  0%  6 pps
171.68.39.1
171.68.27.2    bloomington-mn-gw
     |     ^      ttl    1
     v     |      hop  379 ms    -9/170 = --%  17 pps   -3/67 = --%  6 pps
171.68.27.1
171.68.86.193  wan-gw4
     |     ^      ttl    2
     v     |      hop   28 ms    -3/195 = --%  19 pps    0/70 =  0%  7 pps
171.68.86.194
171.68.37.1    dallas-gw
     |      \__   ttl    3
     v         \  hop    0 ms        196       19 pps      70        7 pps
171.68.37.121  171.68.86.194
  Receiver     Query Source


mstat

dallas-gw>mstat 172.17.67.43 bwilliam-ss5 224.2.156.43
Source         Response Dest   Packet Statistics For   Only For Traffic
172.17.67.43   171.68.86.194   All Multicast Traffic   From 172.17.67.43
     |       __/  rtt  399 ms  Lost/Sent = Pct  Rate   To 224.2.156.43
     v      /     hop  399 ms  ---------------------   --------------------
172.17.67.33
171.68.39.28   bloom-mnlab
     |     ^      ttl    0
     v     |      hop  119 ms    77/694 = 11%  69 pps    0/65 =  0%  6 pps
171.68.39.1
171.68.27.2    bloomington-mn-gw
     |     ^      ttl    1
     v     |      hop -150 ms   395/609 = 65%  60 pps   44/65 = 68%  6 pps
171.68.27.1
171.68.86.193  wan-gw4
     |     ^      ttl    2
     v     |      hop   30 ms    -8/39 = --%    3 pps   -1/21 = --%  2 pps
171.68.86.194
171.68.37.1    dallas-gw
     |      \__   ttl    3
     v         \  hop    0 ms        39         3 pps      22        2 pps
171.68.37.121  171.68.86.194
  Receiver     Query Source
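The Lost/Sent = Pct columns can be reproduced with a few lines of arithmetic, which makes the "--%" entries easier to interpret. The Python sketch below formats a Lost/Sent pair in the same style. Rounding to the nearest whole percent is an assumption chosen to match the figures in the sample output, and the "(duplicates)" label follows the earlier slide's statement that duplicates are indicated as negative drops; the function name `loss_report` is illustrative.

```python
def loss_report(lost, sent):
    """Format a Lost/Sent pair in the style of the mstat statistics columns."""
    if lost < 0:
        # Negative drops indicate duplicates, so no percentage is shown.
        return "{}/{} = --% (duplicates)".format(lost, sent)
    pct = round(100 * lost / sent) if sent else 0
    return "{}/{} = {}%".format(lost, sent, pct)


# Values taken from the second mstat example.
print(loss_report(77, 694))
print(loss_report(395, 609))
print(loss_report(44, 65))
print(loss_report(-8, 39))
```

In the second example, the hop through bloomington-mn-gw loses 65% of all multicast traffic and 68% of the traced flow, which marks it as the congestion point.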


mrinfo

berwyn-gw>mrinfo
171.68.56.1 (berwyn-gw.cisco.com) [version cisco 11.2] [flags: PMSA]:
  171.68.56.97 -> 0.0.0.0 [1/0/pim/querier/leaf]
  171.68.56.1 -> 0.0.0.0 [1/0/pim/querier/leaf]
  171.68.28.142 -> 171.68.28.141 (wan-gw6.cisco.com) [1/0/pim]

Used to query a peering router for multicast information. The example shown is from a remote-office router on the Cisco internal network. When no arguments are given, the router queries itself.

ping

ISP-251#ping 224.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:

Reply to request 0 from 172.16.12.2, 16 ms
Reply to request 0 from 172.16.7.2, 20 ms

Ping is the easiest way to generate multicast traffic in the lab and test the multicast tree. It pings all members of the group, and all members respond.

Caching IP Multicast Packet Headers

You can view
  {source, group} traffic pairs
  IP ident and ttl
  Inter-packet delay
Commands
  ip multicast cache-headers
  show ip mpacket <source> <group> [detail]


Caching IP Multicast Packet Headers (Cont.)


dino-cisco-fr#show ip mpacket cisco-beta
IP Multicast Header Cache - entry count: 29, next index: 30
Key: id/ttl timestamp (name) source group

D782/117 206416.908 (all-purpose-gunk.near.net) 199.94.220.184 224.2.231.173
7302/113 206417.172 (speedy.rrz.uni-koeln.de) 134.95.19.23 224.2.231.173
6CB2/114 206417.412 (wayback.uoregon.edu) 128.223.156.117 224.2.231.173
D786/117 206417.868 (all-purpose-gunk.near.net) 199.94.220.184 224.2.231.173
E2E9/123 206418.488 (dino-ss20.cisco.com) 171.69.58.81 224.2.231.173
1CA7/127 206418.544 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
1CAA/127 206418.584 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
1CAC/127 206418.624 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
1CAF/127 206418.664 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
1CB0/127 206418.704 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
1CB2/127 206418.744 (dino-ss2.cisco.com) 171.69.129.220 224.2.231.173
2BBB/114 206418.840 (crevenia.parc.xerox.com) 13.2.116.11 224.2.231.173
3D1D/123 206419.380 (dalvarez-ss20.cisco.com) 171.69.60.189 224.2.231.173
2BC0/114 206419.672 (crevenia.parc.xerox.com) 13.2.116.11 224.2.231.173
7303/113 206419.888 (speedy.rrz.uni-koeln.de) 134.95.19.23 224.2.231.173
7304/113 206420.140 (speedy.rrz.uni-koeln.de) 134.95.19.23 224.2.231.173
2C7E/123 206420.360 (lwei-ss20.cisco.com) 171.69.58.88 224.2.231.173
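The inter-packet delay mentioned on the previous slide can be derived from the timestamp column by subtracting consecutive entries for the same source. The Python sketch below does this for the six packets from 171.69.129.220. It assumes the timestamps are in seconds, which the slide does not state, and the names `entries` and `inter_packet_delays_ms` are illustrative.

```python
from decimal import Decimal

# (id/ttl, timestamp) pairs for source 171.69.129.220 from the header cache.
entries = [
    ("1CA7/127", "206418.544"),
    ("1CAA/127", "206418.584"),
    ("1CAC/127", "206418.624"),
    ("1CAF/127", "206418.664"),
    ("1CB0/127", "206418.704"),
    ("1CB2/127", "206418.744"),
]


def inter_packet_delays_ms(rows):
    """Return the gap in milliseconds between consecutive cached packets."""
    # Decimal avoids binary floating-point error when subtracting timestamps.
    stamps = [Decimal(ts) for _, ts in rows]
    return [int((later - earlier) * 1000) for earlier, later in zip(stamps, stamps[1:])]


print(inter_packet_delays_ms(entries))
```

A steady gap suggests the source is sending at a constant rate and nothing upstream is bunching or dropping packets; gaps in the IP ident sequence are a separate hint of loss.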


Debugging Strategies

What does the network look like when everything is working?
What is the expected behavior?
What specifically is not working?
Was it ever working correctly?
What has been changed?

These are standard questions to consider when debugging anything, including multicast.

Debugging Strategies - Troubleshooting Table

              Source   Network   Receivers
Signaling     NA       ?         ?
Packet Flow

Is each piece working correctly?



Debugging Strategies
Signaling is the process of setting up (and tearing down) the multicast session Packet flow is the actual sending, replication, and reception of the multicast packets based on the forwarding tables created by the signalling processes Each section of the table needs to be working for the application to work A similar table could be developed for unicast IP or other technologies, but the tools used to troubleshoot each case are different


Check source packet flow

Check interface counters on host
Check upstream router for traffic flow
  show ip mroute count
  show ip mroute active
debug ip mpacket on nearest upstream router (*use with caution*)
  detail argument, or ACL for granularity


Checking source packet flow


How can we tell if the source is actually sourcing packets?

First, check the interface counters on the source host to see if *it* thinks it is sending packets. If it doesn't, then it probably isn't; check for misconfiguration or bugs in the host stack and application.
Next, check the first upstream router or switch to see if it sees multicast packets from the source, using show commands.
Only if necessary, run debug ip mpacket on the router. This could have a serious performance impact on other traffic, so use with caution. The detail parameter for this command can be used to include packet headers in the debug output, and access lists can be used in conjunction with this command to check for traffic from specific sources.


Check Network signaling

Most complex piece
Depends on protocol, mode, etc.
Check initial flow creation
Check for pruning and timer expiration during session


Network signaling (continued)

show/debug ip mroute commands
show/debug ip pim commands
show/debug ip dvmrp commands
show ip rpf
  watch oilist for null entries
hop-by-hop process - use mtrace


PIM Sparse mode troubleshooting

show ip pim rp [<group>]
  indicates RP for the group

show ip pim rp mapping
  shows the group-to-RP mappings known to the router

debug ip pim auto-rp


DVMRP troubleshooting
show ip dvmrp route
  can include address or interface arguments

debug ip dvmrp
  Optional arguments are:
    detail - to capture headers
    ACL - to specify specific routes
    in | out - received or transmitted only
    pruning - watch pruning and grafting only

Check Network packet flow

mstat command
ping command
show ip mroute count
show ip mroute active
debug ip mpacket
  Be careful with this one!


Check Receiver signaling

show ip igmp interface
show ip igmp groups
debug ip igmp / cgmp
IGMPv1 vs. IGMPv2


Check Receiver packet flow

Check receiver interface stats
Is the stack installed and configured properly?
Is the application installed and configured properly?
Watch for duplicates
  performance implication


PIM Sparse Mode


Module 5

Module5.ppt

© 1998-2001, Cisco Systems, Inc. All rights reserved.

8/10/2001 2:37 PM


Module Objectives

Identify and explain the basic mechanisms of PIM Sparse Mode. Configure and verify normal PIM SM operation.


Module Agenda

PIM SM Overview
PIM SM Protocol Mechanics
PIM SM Review
Configuring PIM SM


PIM Sparse Mode Overview


Explicit join model
  Receivers join to the Rendezvous Point (RP)
  Senders register with the RP
  Data flows down the shared tree and goes only to places that need the data from the sources
  Last hop routers can join source tree if the data rate warrants by sending joins to the source

RPF check depends on tree type
  For shared trees, uses RP address
  For source trees, uses Source address

Explicit Join Model


Unlike PIM Dense mode, PIM Sparse mode uses the explicit join model, whereby the last-hop routers attached to Receivers send PIM Join messages toward a designated Rendezvous Point (RP). (The RP is the root of a shared distribution tree down which all multicast traffic flows.)
In order to get multicast traffic to the RP for distribution down the shared tree, Senders are registered with the RP: the first-hop router attached to a Sender sends Register messages to the RP on its behalf. Register messages cause the RP to send a Join towards the source so that multicast traffic can flow to the RP and hence down the shared tree.
Last hop routers may be configured with an SPT-Threshold which, once exceeded, will cause the last hop router to join the Shortest Path Tree (SPT). This causes the multicast traffic from the source to flow down the SPT from the source to the last hop router.

RPF Check depends on tree type


If traffic is flowing down the shared tree, the RPF check mechanism will use the IP address of the RP to perform the RPF check. If traffic is flowing down the SPT, the RPF check mechanism will use the IP address of the Source to perform the RPF check.
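
The tree-dependent RPF check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the route table contents, the interface names, and the function names rpf_interface and rpf_check are hypothetical and are not taken from any router implementation.

```python
from ipaddress import ip_address, ip_network

# Hypothetical unicast routing table: prefix -> interface used to reach it.
UNICAST_ROUTES = {
    ip_network("10.1.5.0/24"): "Ethernet0",     # path toward the RP
    ip_network("171.68.37.0/24"): "Serial0",    # path toward the source
}

def rpf_interface(address):
    """Interface the unicast table would use to reach `address`
    (longest-prefix match), or None if there is no route."""
    addr = ip_address(address)
    matches = [net for net in UNICAST_ROUTES if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return UNICAST_ROUTES[best]

def rpf_check(arrival_interface, source, rp, on_shared_tree):
    """Shared tree: check against the RP address.
    Source tree: check against the source address."""
    target = rp if on_shared_tree else source
    return rpf_interface(target) == arrival_interface
```

With this table, a packet arriving on Ethernet0 passes the check when it is treated as shared-tree traffic and fails it when treated as SPT traffic from 171.68.37.121.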


PIM Sparse Mode Overview

Only one RP is chosen for a particular group
RP statically configured or dynamically learned (Auto-RP or PIM v2 BSR)
Data forwarded based on the source state (S, G) if it exists, otherwise use the shared state (*, G)
RFC 2362 - PIM Sparse Mode Protocol Spec


Only one RP for a group may be active at a time


While it is usually the case that a single RP serves all groups, it is possible to configure different RPs for different ranges of groups. This is accomplished via access-lists. This permits the RPs to be placed in different locations and can improve the traffic flow for a group if its RP is placed close to the Source(s).

RP Configuration
RPs may be configured statically on each router in your network (although the routers must all agree or your network will be broken!). However, a better solution is to use the Auto-RP or PIMv2 BSR mechanisms to learn the RPs dynamically.

Data Forwarding
Multicast traffic forwarding in a PIM Sparse mode network is first attempted using any matching (S,G) entries in the Multicast Routing table. If no matching (S,G) state exists, then the traffic is forwarded using the matching (*,G) entry in the Multicast Routing table.
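
The lookup order in the Data Forwarding paragraph can be illustrated with a small Python sketch. The dictionary layout, the SAMPLE_MROUTE contents, and the function names are assumptions made for illustration; they do not describe how a router stores its table.

```python
# Hypothetical mroute table keyed by (source, group); "*" stands for any source.
SAMPLE_MROUTE = {
    ("*", "224.1.1.1"): {"iif": "Ethernet0", "oil": ["Ethernet1", "Serial0"]},
    ("171.68.37.121", "224.1.1.1"): {"iif": "Serial0", "oil": ["Ethernet1", "Ethernet0"]},
}

def lookup_entry(mroute, source, group):
    """Prefer the (S, G) entry; fall back to (*, G); None if neither exists."""
    entry = mroute.get((source, group))
    if entry is None:
        entry = mroute.get(("*", group))
    return entry

def forward(mroute, source, group, arrival_interface):
    """Return the interfaces the packet is replicated to (empty list = drop)."""
    entry = lookup_entry(mroute, source, group)
    if entry is None or entry["iif"] != arrival_interface:
        return []  # no state for the group, or the packet failed the RPF check
    return [i for i in entry["oil"] if i != arrival_interface]
```

For example, traffic from 171.68.37.121 is matched by the (S, G) entry, while traffic from any other source to 224.1.1.1 falls through to the (*, G) entry.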

PIM Sparse Mode Spec


PIM Sparse mode is now defined in RFC 2362.


PIM-SM Shared Tree Joins

[Diagram: RP and Receiver. Legend: (*, G) Joins, Shared Tree.]

(*, G) State created only along the Shared Tree.


PIM-SM Shared Tree Joins


In this example, an active receiver (attached to the leaf router at the bottom of the drawing) has joined multicast group G. The leaf router knows the IP address of the Rendezvous Point (RP) for group G and sends a (*, G) Join for this group towards the RP. This (*, G) Join travels hop-by-hop to the RP, building a branch of the Shared Tree that extends from the RP to the last-hop router directly connected to the receiver. At this point, group G traffic can flow down the Shared Tree to the receiver.


PIM-SM Sender Registration

[Diagram: Source, RP, and Receiver. Legend: (S, G) Register (unicast), (S, G) Joins, Shared Tree, Source Tree.]

(S, G) State created only along the Source Tree.


PIM-SM Sender Registration


As soon as an active source for group G sends a packet, the leaf router that is attached to this source is responsible for Registering this source with the RP and requesting the RP to build a tree back to that router.
The source router encapsulates the multicast data from the source in a special PIM SM message called the Register message and unicasts that data to the RP.
When the RP receives the Register message it does two things:
  It de-encapsulates the multicast data packet inside of the Register message and forwards it down the Shared Tree.
  It sends an (S, G) Join back towards the source network S to create a branch of an (S, G) Shortest-Path Tree. This results in (S, G) state being created in all the routers along the SPT, including the RP.


PIM-SM Sender Registration

[Diagram: Source, RP, and Receiver. Legend: (S, G) Register (unicast), (S, G) Joins, (S, G) Register-Stop (unicast), Shared Tree, Source Tree.]

RP sends Register-Stop back to first-hop router.


PIM-SM Sender Registration (cont.)


As soon as the SPT is built from the Source router to the RP, multicast traffic begins to flow natively from source S to the RP.
Once the RP begins receiving data natively (i.e. down the SPT) from source S, it sends a Register-Stop to the source's first-hop router to inform it that it can stop sending the unicast Register messages.
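
The RP side of the registration sequence can be sketched as a tiny state machine. This is a simplified illustration, not the protocol specification: the class name RendezvousPoint, the action strings, and the choice to answer a late Register with a Register-Stop once native traffic has been seen are all assumptions made for this sketch.

```python
class RendezvousPoint:
    """Toy model of how an RP reacts to Registers and native (S, G) traffic."""

    def __init__(self):
        self.sg_joined = set()   # (S, G) pairs for which a Join was already sent
        self.native = set()      # (S, G) pairs now arriving natively via the SPT

    def on_register(self, source, group):
        key = (source, group)
        if key in self.native:
            # Assumption: native traffic is already flowing, so just stop the Registers.
            return ["send Register-Stop to first-hop router"]
        actions = ["forward decapsulated packet down Shared Tree"]
        if key not in self.sg_joined:
            self.sg_joined.add(key)
            actions.append("send (S, G) Join toward source")
        return actions

    def on_native_packet(self, source, group):
        key = (source, group)
        first_native = key not in self.native
        self.native.add(key)
        actions = ["forward packet down Shared Tree"]
        if first_native:
            actions.append("send Register-Stop to first-hop router")
        return actions
```

The first Register both delivers data and triggers the (S, G) Join; the first natively received packet triggers the Register-Stop.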


PIM-SM Sender Registration

[Diagram: Source, RP, and Receiver. Legend: Traffic Flow, Shared Tree, Source Tree.]

Source traffic flows natively along SPT to RP.
From RP, traffic flows down the Shared Tree to Receivers.


PIM-SM Sender Registration (cont.)


At this point, multicast traffic from the source is flowing down the SPT to the RP and from there, down the Shared Tree to the receiver.


PIM-SM SPT Switchover

[Diagram: Source, RP, and Receiver. Legend: (S, G) Joins, (S, G)RP-bit Prunes, Shared Tree, Source Tree.]

Last-hop router joins the SPT.
Additional (S, G) State is created along new part of the Source Tree.
Additional (S, G) State is created along the Shared Tree to prune off (S, G) traffic.


PIM-SM Shortest-Path Tree Switchover


PIM-SM has the capability for last-hop routers (i.e. routers with directly connected members) to switch to the Shortest-Path Tree and bypass the RP if the traffic rate is above a set threshold called the SPT-Threshold.
The default value of the SPT-Threshold in Cisco routers is zero. This means that the default behavior for PIM-SM leaf routers attached to active receivers is to immediately join the SPT to the source as soon as the first packet arrives via the (*,G) shared tree.
In the above example, the last-hop router (at the bottom of the drawing) sends an (S, G) Join message toward the source to join the SPT and bypass the RP. This (S, G) Join message travels hop-by-hop to the first-hop router (i.e. the router connected directly to the source), thereby creating another branch of the SPT. This also creates (S, G) state in all the routers along this branch of the SPT.
Finally, special (S, G)RP-bit Prune messages are sent up the Shared Tree to prune off this (S,G) traffic from the Shared Tree. If this were not done, (S, G) traffic would continue flowing down the Shared Tree, resulting in duplicate (S, G) packets arriving at the receiver.
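
A rough Python sketch of the last-hop router's switchover decision follows. The function name, the kbps units, and the message tuples are hypothetical; the sketch also assumes the SPT and the Shared Tree diverge at this router, so the (S, G)RP-bit Prune is always sent after the Join.

```python
def on_shared_tree_packet(source, group, rate_kbps, spt_threshold_kbps=0):
    """Return the PIM messages a last-hop router would originate, in order,
    when an (S, G) packet arrives down the Shared Tree.

    A threshold of zero (the default described in the notes) means the
    router joins the SPT on the first packet, whatever the measured rate.
    """
    if spt_threshold_kbps != 0 and rate_kbps <= spt_threshold_kbps:
        return []  # threshold not exceeded: stay on the Shared Tree
    return [
        ("Join", (source, group), "toward source"),     # builds the SPT branch
        ("RP-bit Prune", (source, group), "toward RP"),  # stops duplicate traffic
    ]
```

With the default threshold the very first packet yields both messages; with a 64 kbps threshold a 10 kbps flow stays on the Shared Tree.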


PIM-SM SPT Switchover

[Diagram: Source, RP, and Receiver. Legend: Traffic Flow, Shared Tree, Source Tree.]

(S, G) Traffic flow is now pruned off of the Shared Tree and is flowing to the Receiver via the SPT.


PIM-SM Shortest-Path Tree Switchover


At this point, (S, G) traffic is now flowing directly from the first-hop router to the last-hop router and from there to the receiver.
Note: The RP will normally send (S, G) Prunes back toward the source to shut off the flow of now unnecessary (S, G) traffic to the RP if and only if it has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree. (This step has been omitted from the example above.)
As a result of this SPT-Switchover mechanism, PIM SM also supports the construction and use of SPT (S,G) trees but in a much more economical fashion than PIM DM in terms of forwarding state.


PIM-SM SPT Switchover

[Diagram: Source, RP, and Receiver. Legend: Traffic Flow, Shared Tree, Source Tree, (S, G) Prune.]

(S, G) traffic flow is no longer needed by the RP so it Prunes the flow of (S, G) traffic.


PIM-SM Shortest-Path Tree Switchover


At this point, the RP no longer needs the flow of (S, G) traffic since all branches of the Shared Tree (in this case there is only one) have pruned off the flow of (S, G) traffic.
As a result, the RP will send (S, G) Prunes back toward the source to shut off the flow of the now unnecessary (S, G) traffic to the RP.
Note: This will occur if and only if the RP has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree.


PIM-SM SPT Switchover

[Diagram: Source, RP, and Receiver. Legend: Traffic Flow, Shared Tree, Source Tree.]

(S, G) Traffic flow is now only flowing to the Receiver via a single branch of the Source Tree.


PIM-SM Shortest-Path Tree Switchover


As a result of the SPT-Switchover, (S, G) traffic is now only flowing from the first-hop router to the last-hop router and from there to the receiver. Notice that traffic is no longer flowing to the RP. As a result of this SPT-Switchover mechanism, it is clear that PIM SM also supports the construction and use of SPT (S,G) trees but in a much more economical fashion than PIM DM in terms of forwarding state.


PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM Neighbor Discovery


[Diagram: PIM Router 1 (171.68.37.1) and PIM Router 2 (171.68.37.2) exchange PIM Hellos on a shared network. Highest IP Address elected as DR (Designated Router).]

PIMv2 Hellos are periodically multicast to the All-PIM-Routers (224.0.0.13) group address. (Default = 30 seconds)
  Note: PIMv1 multicasts PIM Query messages to the All-Routers (224.0.0.2) group address.
If the DR times out, a new DR is elected.
The DR is responsible for sending all Joins and Register messages for any receivers or senders on the network.

PIM Neighbor Discovery


PIM Hellos are sent periodically to discover the existence of other PIM routers on the network and to elect the Designated Router. For Multi-Access networks (e.g. Ethernet), the PIM Hello messages are multicast to the All-PIM-Routers (224.0.0.13) multicast group address.

Designated Router (DR)


For multi-access networks, a Designated Router (DR) is elected. In PIM Sparse mode networks, the DR is responsible for sending Joins to the RP for members on the multi-access network and for sending Registers to the RP for sources on the multi-access network. For Dense mode, the DR has no meaning. The exception to this is when IGMPv1 is in use. In this case, the DR also functions as the IGMP Querier for the Multi-Access network.

Designated Router (DR) Election


To elect the DR, each PIM node on a multi-access network examines the received PIM Hello messages from its neighbors and compares the IP Address of its interface with the IP Address of its PIM Neighbors. The PIM Neighbor with the highest IP Address is elected the DR. If no PIM Hellos have been received from the elected DR after some period (configurable), the DR Election mechanism is run again to elect a new DR.
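
The election rule above (highest IP address wins, with re-election when the DR's Hellos stop) can be sketched as follows. The function name, the last_hello dictionary, and the holdtime parameter are assumptions for illustration; the notes only say the timeout period is configurable.

```python
from ipaddress import IPv4Address

def elect_dr(own_address, last_hello, now, holdtime):
    """Pick the DR among this router and its live PIM neighbors.

    last_hello maps neighbor address -> time (seconds) its last Hello arrived.
    A neighbor whose last Hello is older than `holdtime` is ignored, which
    models the re-election that happens when the DR times out.
    """
    live = [IPv4Address(addr) for addr, seen in last_hello.items()
            if now - seen < holdtime]
    return str(max(live + [IPv4Address(own_address)]))
```

Addresses are compared numerically rather than as strings, so 171.68.0.70 correctly beats 171.68.0.9. In the two-router example from the slide, 171.68.37.2 is elected while its Hellos keep arriving, and 171.68.37.1 takes over once they stop.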


PIM Neighbor Discovery


wan-gw8> show ip pim neighbor
PIM Neighbor Table
Neighbor Address  Interface      Uptime    Expires   Mode
171.68.0.70       FastEthernet0  2w1d      00:01:24  Sparse
171.68.0.91       FastEthernet0  2w6d      00:01:01  Sparse (DR)
171.68.0.82       FastEthernet0  7w0d      00:01:14  Sparse
171.68.0.86       FastEthernet0  7w0d      00:01:13  Sparse
171.68.0.80       FastEthernet0  7w0d      00:01:02  Sparse
171.68.28.70      Serial2.31     22:47:11  00:01:16  Sparse
171.68.28.50      Serial2.33     22:47:22  00:01:08  Sparse
171.68.27.74      Serial2.36     22:47:07  00:01:21  Sparse
171.68.28.170     Serial0.70     1d04h     00:01:06  Sparse
171.68.27.2       Serial1.51     1w4d      00:01:25  Sparse
171.68.28.110     Serial3.56     1d04h     00:01:20  Sparse
171.68.28.58      Serial3.102    12:53:25  00:01:03  Sparse


show ip pim neighbor command output


Neighbor Address - the IP address of the PIM Neighbor.
Interface - the interface where the PIM Hello of this neighbor was received.
Uptime - the period of time that this PIM Neighbor has been active.
Expires - the period of time after which this PIM Neighbor will no longer be considered active. (Reset by the receipt of another PIM Hello, or PIM Query in PIMv1.)
Mode - PIM mode (Sparse, Dense, Sparse/Dense) that the PIM Neighbor is using.
(DR) - Indicates that this PIM Neighbor is the Designated Router for the network.


PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM State

Describes the state of the multicast distribution trees as understood by the router at this point in the network.
Represented by entries in the multicast routing (mroute) table
  Used to make multicast traffic forwarding decisions
  Composed of (*, G) and (S, G) entries
  Each entry contains RPF information
    Incoming (i.e. RPF) interface
    RPF Neighbor (upstream)
  Each entry contains an Outgoing Interface List (OIL)
    OIL may be NULL


PIM State
In general, Multicast State basically describes the multicast distribution tree as it is understood by the router at this point in the network. However to be completely correct, Multicast State describes the multicast traffic forwarding state that is used by the router to forward multicast traffic.

Multicast Routing (mroute) Table


Multicast state is stored in the multicast routing (mroute) table, which can be displayed using the show ip mroute command. Entries in the mroute table are composed of (*, G) and (S, G) entries, each of which contains:
  RPF Information consisting of an Incoming (or RPF) interface and the IP address of the RPF (i.e. upstream) neighbor router in the direction of the source. (In the case of PIM-SM, this information in a (*, G) entry points toward the RP.)
  Outgoing Interface List (OIL), which contains the list of interfaces out which the multicast traffic is to be forwarded. (Multicast traffic must arrive on the Incoming interface before it will be forwarded out these interfaces. If multicast traffic does not arrive on the Incoming interface, it is simply discarded.)


PIM-SM State Example


sj-mbone> show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
       M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:13:28/00:02:59, RP 10.1.5.1, flags: SCJ
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:13:28/00:02:32
    Serial0, Forward/Sparse, 00:4:52/00:02:08

(171.68.37.121/32, 224.1.1.1), 00:01:43/00:02:59, flags: CJT
  Incoming interface: Serial0, RPF nbr 192.10.2.1
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11


PIM-SM State Example


(*, G) Entry - The (*, 224.1.1.1) entry shown in the sample output of the show ip mroute command is the (*, G) entry. If there is no matching (S, G) entry for a particular source, this entry is used to forward traffic down the Shared Tree. Notice the following:
  The Expires countdown timer in the first line of the (*, G) entry shows when the entry will expire and be deleted. This timer will remain at roughly 3 minutes as long as there is an interface in the Outgoing Interface list.
  The Incoming interface information is used to RPF check arriving (*, G) multicast traffic and is computed in the direction of the RP (in this case, 10.1.5.1).
  The Outgoing Interface list reflects the interfaces where (*,G) Joins have been received or where directly connected members of group G reside. Traffic flowing down the Shared Tree is forwarded out these interfaces. The Expires countdown timers on these interfaces are reset to 3 minutes by the receipt of periodic (*, G) Joins. If the count ever reaches zero, the entry in the OIL is deleted.

(S, G) Entry - The (171.68.37.121/32, 224.1.1.1) entry is an example of an (S, G) entry in the mroute table. This entry is used to forward any multicast traffic sent by source 171.68.37.121 to group 224.1.1.1. Notice the following:
  The Expires countdown timer in the first line of the (S, G) entry shows when the entry will expire and be deleted. This timer is reset to 3 minutes whenever an (S, G) multicast packet is forwarded.
  The Incoming interface information is used to RPF check arriving (S, G) multicast traffic. If a packet does not arrive via this interface, the packet is discarded.
  The Outgoing Interface list reflects the interfaces where (S,G) packets are to be forwarded.

PIM-SM (*,G) State Rules


(*,G) creation
  Receipt of a (*,G) Join or IGMP Report
  Automatically if (S,G) must be created

(*,G) reflects default group forwarding
  IIF = RPF interface toward RP
  OIL = interfaces
    that received a (*,G) Join or
    with directly connected members or
    manually configured

(*,G) deletion
  When OIL = NULL and no child (S,G) state exists

PIM-SM (*,G) State Rules


A (*, G) entry is created when a (*, G) Join or an IGMP Report is received. The latter condition can be simulated by manually configuring the interface to join the group.
(*, G) entries are also automatically created whenever an (S, G) entry for the group must be created. The (*, G) entry is created first and then the (S, G) entry. The reason for this will become clear shortly.
The IIF reflects the RPF interface/neighbor in the direction of the RP.
The OIL of a PIM-SM (*, G) entry reflects interfaces:
  That have received a (*, G) Join, or
  Where a directly connected member has joined the group, or
  That were manually configured to join the group. (Note: This may be accomplished using the ip igmp static-group <group> command.)
A (*, G) entry is deleted when its Expires timer counts down to zero. This will only occur when:
  The OIL is Null and
  No child (S, G) entry exists


PIM-SM (S,G) State Rules


(S,G) creation
  By receipt of (S,G) Join or Prune or
  By Register process
  Parent (*,G) created (if doesn't exist)

(S,G) reflects forwarding of S to G
  IIF = RPF Interface normally toward source
    RPF toward RP if RP-bit set
  OIL = Initially, copy of (*,G) OIL minus IIF

(S,G) deletion
  By normal (S,G) entry timeout

PIM-SM (S, G) Rules


In PIM-SM, (S, G) state is created as a result of:
  The receipt of an (S, G) Join or Prune, or
  The PIM-SM Register process, which is triggered by a first-hop router receiving a packet from a directly connected source.
When an (S, G) entry must be created, the following steps occur:
  If a corresponding (*, G) entry does not exist, it is created first.
  The RPF Information is computed for the source S. This information is stored in the (S, G) entry as the Incoming interface and the RPF neighbor (i.e. the PIM neighbor in the direction of the source). The exception to this rule is that if the RP-bit is set in the (S, G) entry, the RPF interface is pointed up the Shared Tree. This mechanism allows duplicate (S, G) traffic to be blocked from flowing down the Shared Tree after a downstream router has switched to the Shortest Path Tree. (More on this later.)
  The OIL of the (S, G) entry is populated with a copy of the OIL from the parent (*, G) entry less the Incoming interface. (The Incoming interface must not appear in the OIL, otherwise a multicast route loop could occur.)
In PIM-SM, (S, G) entries are deleted when their Expires timer counts down to zero. The Expires timer is reset whenever an (S, G) packet is received and forwarded.
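
The (S, G) creation steps can be written out as a short Python sketch. The dictionary-based table, the rpf_lookup callback, and the function name create_sg are hypothetical and serve only to make the order of the steps concrete.

```python
def create_sg(mroute, source, group, rpf_lookup, rp, rp_bit=False):
    """Create an (S, G) entry following the steps in the notes.

    mroute     : dict keyed by (source, group); "*" stands for any source
    rpf_lookup : callable mapping an IP address to the RPF interface
    rp         : address of the RP for this group
    rp_bit     : if True, RPF points up the Shared Tree (toward the RP)
    """
    # Step 1: the parent (*, G) entry is created first if it does not exist.
    star = mroute.setdefault(("*", group), {"iif": rpf_lookup(rp), "oil": set()})
    # Step 2: compute RPF information, toward the source unless the RP-bit is set.
    iif = rpf_lookup(rp if rp_bit else source)
    # Step 3: the OIL starts as a copy of the (*, G) OIL minus the incoming interface.
    entry = {"iif": iif, "oil": set(star["oil"]) - {iif}, "rp_bit": rp_bit}
    mroute[(source, group)] = entry
    return entry
```

Removing the incoming interface in step 3 is what prevents the route loop mentioned in the notes.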


PIM-SM OIL Rules


Interfaces in OIL added
  By receipt of Join message
    Interfaces added to (*,G) are added to all (S,G)s

Interfaces in OIL removed
  By receipt of Prune message
    Interfaces removed from (*,G) are removed from all (S,G)s
  Interface Expire timer counts down to zero
    Timer reset (to 3 min.) by receipt of periodic Join or
    By IGMP membership report

PIM-SM Outgoing Interface List Rules


Adding an interface
  Interfaces are added to an (S, G) OIL when an (S, G) Join message is received on an interface.
  Interfaces are added to the (*, G) OIL when a (*, G) Join message is received on an interface.
  Anytime an interface is added to the (*, G) OIL, the interface is added to the OIL of all associated (S, G) entries. (Note: A check is always made to prevent the IIF from appearing in the OIL.)
Removing an interface
  Interfaces are removed from the OIL of a (*, G) or (S, G) entry if the interface's Expires timer counts down to zero. Note: The interface Expires timer is reset to 3 minutes by the receipt of periodic Join messages sent by downstream routers once per minute or by an IGMP Report sent by a directly connected member on the interface.
  Interfaces are removed from the OIL if a Prune message is received (and it is not overridden by another router if the interface is a multi-access network).
  Interfaces removed from a (*, G) OIL are removed from the OIL of all associated (S, G) entries.
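
These OIL rules can be modelled with a small Python sketch. The Entry class and the helper names are hypothetical, Prune override on multi-access networks is ignored, and refreshing the child (S, G) timers whenever the (*, G) interface is refreshed is a simplifying assumption of the sketch.

```python
EXPIRY_SECONDS = 180  # the 3-minute interface Expires timer from the notes

class Entry:
    """One mroute entry: an incoming interface plus an OIL with expiry times."""
    def __init__(self, iif):
        self.iif = iif
        self.oil = {}  # interface name -> absolute expiry time in seconds

def add_or_refresh(entry, interface, now):
    if interface != entry.iif:  # the IIF must never appear in the OIL
        entry.oil[interface] = now + EXPIRY_SECONDS

def join_star_g(star, children, interface, now):
    """(*, G) Join or IGMP Report: add to (*, G) and to every child (S, G)."""
    add_or_refresh(star, interface, now)
    for child in children:
        add_or_refresh(child, interface, now)

def prune_star_g(star, children, interface):
    """(*, G) Prune: remove from (*, G) and from every child (S, G)."""
    star.oil.pop(interface, None)
    for child in children:
        child.oil.pop(interface, None)

def expire(entry, now):
    """Drop interfaces whose Expires timer has counted down to zero."""
    for interface in [i for i, t in entry.oil.items() if t <= now]:
        del entry.oil[interface]
```

Since downstream routers send periodic Joins once per minute, a healthy interface is refreshed well before its 3-minute timer runs out.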


PIM-SM OIL Rules


Triggering Join/Prune Messages
  (*,G) Joins are triggered when:
    The (*,G) OIL transitions from Null to non-Null
  (*,G) Prunes are triggered when:
    The (*,G) OIL transitions from non-Null to Null
  (S,G) Joins are triggered when:
    The (S,G) OIL transitions from Null to non-Null
  (S,G) Prunes are triggered when:
    The (S,G) OIL is Null AND
    A packet is received on the incoming interface
  (S,G)RP-bit Prunes are triggered when:
    The (S,G) RPF info != the (*,G) RPF info

PIM-SM Outgoing Interface List Rules


Triggering Join/Prune Messages
  (*,G) Joins are triggered whenever the (*,G) OIL is empty (Null) and an interface is added, making the OIL non-Null.
  (*,G) Prunes are triggered whenever the last interface is removed from the (*,G) OIL.
  (S,G) Joins are triggered whenever the (S,G) OIL is empty (Null) and an interface is added, making the OIL non-Null.
  (S,G) Prunes are triggered whenever the (S,G) OIL is empty AND a packet is received on the incoming interface. Note: This is an optimization that attempts to minimize the sending of (S,G) Prunes. Instead of sending the (S,G) Prune immediately when the last interface is removed, the state is just allowed to time out. However, if (S,G) traffic is still flowing, then the arrival of the next (S,G) packet will cause the prune to be sent.
  (S,G)RP-bit Prunes are sent whenever the (S,G) RPF information (incoming interface and RPF-neighbor) is not the same as the (*,G) RPF information. This indicates that the SPT and the Shared-Tree diverge at this point and that (S,G) traffic should be pruned from the Shared-Tree.
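
The trigger rules can be expressed as three small predicates. The function names and the representation of RPF information as an (interface, neighbor) tuple are assumptions made for this sketch.

```python
def star_g_trigger(oil_before, oil_after):
    """(*, G): Join on Null -> non-Null, Prune on non-Null -> Null."""
    if not oil_before and oil_after:
        return "Join"
    if oil_before and not oil_after:
        return "Prune"
    return None

def sg_trigger(oil_before, oil_after, packet_on_iif):
    """(S, G): Join on Null -> non-Null; Prune only when the OIL is Null
    AND a packet has just arrived on the incoming interface."""
    if not oil_before and oil_after:
        return "Join"
    if not oil_after and packet_on_iif:
        return "Prune"
    return None

def needs_rp_bit_prune(sg_rpf, star_g_rpf):
    """(S, G)RP-bit Prune: needed where the SPT and the Shared Tree diverge.
    Each argument is an (incoming interface, RPF neighbor) tuple."""
    return sg_rpf != star_g_rpf
```

Note the asymmetry: emptying a (*, G) OIL sends a Prune at once, while emptying an (S, G) OIL sends nothing until the next (S, G) packet arrives.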


PIM-SM State Flags

S = Sparse Mode
C = Directly Connected Host
L = Local (Router is member)
P = Pruned (All intfcs in OIL = Prune)
T = Forwarding via SPT
  Indicates at least one packet was forwarded


PIM-SM State Flags


S Flag ((*, G) entries only)
  Indicates the group is operating in Sparse mode.
C Flag
  Indicates that there is a member of the group directly connected to the router.
L Flag
  Indicates the router itself is a member of this group and is receiving the traffic. (This would be the case for the Auto-RP Discovery group 224.0.1.40, which all Cisco routers join automatically.)
P Flag
  Set whenever all interfaces in the outgoing interface list of an entry are Pruned (or the list is Null). This generally means that the router will send Prune messages to the RPF neighbor to try to shut off this traffic.
T Flag ((S, G) entries only)
  Indicates that at least one packet was received via the SPT.


PIM-SM State Flags (cont.)


J = Join SPT
  In (*, G) entry
    Indicates SPT-Threshold is being exceeded
    Next (S,G) received will trigger join of SPT
  In (S, G) entry
    Indicates SPT joined due to SPT-Threshold
    If rate < SPT-Threshold, switch back to Shared Tree

F = Register
  In (S,G) entry
    S is a directly connected source
    Triggers the Register Process
  In (*, G) entry
    Set when F set in at least one child (S,G)

PIM-SM State Flags


J Flag (Join SPT)
  When this flag is set in a (*, G) entry, it indicates that the rate of traffic flowing down the Shared Tree is above the SPT-Threshold and will cause a switch to the SPT for the next packet received down the shared tree. (More on this later.)
  When this flag is set in an (S, G) entry, it indicates that the (S, G) entry (and hence the SPT) was created as a result of the SPT-Threshold being exceeded. If the rate of this (S, G) traffic drops back below the SPT-Threshold, the router will attempt to switch this traffic flow back to the Shared Tree.
F Flag (Register)
  This flag is set on an (S, G) entry when source S is directly connected to the router. This indicates that this router is a first-hop router and triggers it to send Register messages to the RP to inform the RP of this active source.
  This flag can also be set for arriving (S, G) entries created at a border router, such as a router that borders on a DVMRP or other dense mode cloud. This causes the router to perform a proxy-register operation and send (S, G) Register messages to the RP on behalf of the downstream DVMRP routers. This proxy-register operation follows the same rules as for directly connected sources.
  The F flag is also set on a (*, G) entry if any associated (S, G) entries have the F flag set.


PIM-SM State Flags (cont.)


R = RP bit
  (S, G) entries only
  Set by (S,G)RP-bit Prune
  Indicates info is applicable to Shared Tree
  Used to prune (S,G) traffic from Shared Tree
    Initiated by Last-hop router after switch to SPT
  Modifies (S,G) forwarding behavior
    IIF = RPF toward RP (i.e. up the Shared Tree)
    OIL = Pruned accordingly

PIM-SM Flags
R Flag (RP-Bit): This flag is set on (S, G) entries only and indicates that the (S, G) forwarding information in the entry is applicable to (S, G) traffic flowing down the Shared Tree. The R flag is set on an (S, G) entry by the receipt of an (S, G)RP-bit Prune message. These messages are sent by downstream routers on the Shared Tree that are requesting that this specific (S, G) traffic flow be pruned from the Shared Tree. This is done to eliminate duplicate (S, G) traffic after a downstream router has switched to the (S, G) Shortest-Path Tree. Whenever the R flag is set on an (S, G) entry, the RPF information must be changed to point toward the RP instead of pointing at source S. This is done because the (S, G) entry is now applicable to (S, G) traffic arriving down the Shared Tree. As a result, the RPF information must point up the Shared Tree in order for arriving (S, G) packets to pass the RPF check. (This will become clearer later.)
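The effect of the R flag on RPF can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not router code; the function name and the use of "*" to mark a (*, G) entry are assumptions made for the example.

```python
def rpf_lookup_address(source, rp_address, flags):
    """Return the address toward which the RPF information is computed.

    source is "*" for a (*, G) entry, otherwise the source's IP address.
    flags is the entry's flag string, e.g. "SC", "T" or "R".
    """
    if source == "*" or "R" in flags:
        return rp_address   # RPF points up the Shared Tree, toward the RP
    return source           # RPF points along the SPT, toward the source


# An ordinary (S, G) entry RPFs toward the source ...
spt_target = rpf_lookup_address("171.68.37.121", "171.68.28.140", "T")
# ... but once the R flag is set it RPFs toward the RP instead.
rp_bit_target = rpf_lookup_address("171.68.37.121", "171.68.28.140", "R")
```

With the addresses used later in this module, spt_target is the source address and rp_bit_target is the RP address.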


PIM-SM State Flags (cont.)

X = Proxy Join Timer flag
  (S, G) entries only
  Indicates Proxy Join Timer is running
  Used to handle turn-around router case
    More on this in another Module
  When Proxy Join Timer is running
    (S, G) Joins are sent toward the source
    The sending of (S, G) Prunes is suppressed
      Even if the OIL is NULL


PIM-SM Flags
X Flag (Proxy Join Timer Running) This flag is set on (S, G) entries only and is used to indicate that the Proxy Join Timer is running. When this timer is running, the router will continue to send (S, G) Joins in the direction of the source even if the OIL is NULL. This is used to handle the special turn-around router situation which occurs when the SPT to the RP and the Shared Tree merge. (More on this special scenario will be presented in another module.)


PIM-SM State Flags (cont.)


M = MSDP Created bit
  (S, G) entries only
  Set when (S, G) learned via an MSDP SA msg

A = MSDP Advertise bit
  (S, G) entries only
  (S, G) may be advertised in an MSDP SA msg
    Presence of certain filters can affect this.
  Indicates source is in local SM domain
    Received a PIM (S,G) Register, or
    Source is directly connected, or
    (S,G) traffic was received on a DM interface via the RPF interface

PIM-SM Flags
M Flag (MSDP Created): This flag only appears on (S, G) entries and only on the router that is the active RP for group G. The flag indicates that the RP has learned of this particular source via an MSDP Source Active message. (MSDP is addressed in more detail in another module.)

A Flag (Advertise Flag): This flag only appears on (S, G) entries and only on the router that is the active RP for group G. The A flag indicates that this source is in the local PIM-SM domain and that it is a candidate for being announced to RPs in other networks via MSDP Source Active messages. A source is considered to be in the local domain if an (S, G) Register message was received for this source, or the source is directly connected to the RP, or the (S, G) traffic was received on a Dense mode interface that has been designated as a dense mode boundary interface.
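As a study aid, the flag letters covered in the preceding slides can be collected into a small lookup table. The Python sketch below is a hypothetical helper, not part of any router software; it covers only the flags described in this module and paraphrases their meanings.

```python
# Flag letters and short meanings, paraphrased from the notes above.
MROUTE_FLAGS = {
    "S": "Sparse mode group ((*, G) entries only)",
    "C": "member directly connected to the router",
    "L": "router itself is a member of the group",
    "P": "Pruned (outgoing interface list is Null)",
    "T": "traffic received via the SPT ((S, G) entries only)",
    "J": "Join SPT (SPT-Threshold exceeded)",
    "F": "Register (directly connected or proxy-registered source)",
    "R": "RP-bit ((S, G) entries only)",
    "X": "Proxy Join Timer running ((S, G) entries only)",
    "M": "MSDP created ((S, G) entries only)",
    "A": "MSDP advertise candidate ((S, G) entries only)",
}


def decode_flags(flags):
    """Expand a flag string such as 'SC' or 'FPT' into descriptions."""
    return [MROUTE_FLAGS.get(ch, f"unknown flag {ch!r}") for ch in flags.strip()]


for line in decode_flags("FPT"):
    print(line)
```

Flags not listed in the table (there are others outside the scope of this module) are reported as unknown rather than guessed.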


PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM SM Joining

Leaf routers send a (*,G) Join toward RP


Joins sent hop-by-hop along path toward RP

Each router along path creates (*,G) state


IF no (*,G) state,
Create it and send a Join toward RP

ELSE
Join process complete. Reached the Shared Tree.


Leaf (last-hop) routers join the Shared Tree (RPT)


When a last-hop router wishes to begin receiving multicast traffic for group G, it sends a PIM (*, G) Join message to its upstream PIM neighbor in the direction of the RP. While the Join is multicast to the All-Routers (224.0.0.2) multicast address (PIMv2 uses the All-PIM-Routers address, 224.0.0.13, instead), the upstream PIM neighbor's IP address is indicated in the body of the PIM Join message. This permits all PIM routers on a multi-access network to be aware of the Join, but only the indicated upstream PIM neighbor will process it.

Routers up the Shared Tree (RPT) create (*,G) state


When a PIM router receives a (*,G) Join for group G from one of its downstream PIM Neighbors, it will check to see if any (*, G) state exists for group G in its Multicast Routing table. If (*, G) state for group G already exists, then the interface from which the Join was received is placed on the (*,G) oilist. If no (*, G) state for group G exists, a (*, G) entry is created, the interface from which the Join was received is placed in the (*, G) oilist and a (*, G) Join is sent towards the RP. The end result of the above mechanism is to create (*, G) state all the way from the last-hop router to the RP so that group G multicast traffic will flow down the Shared Tree (RPT) to the last-hop router.
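The IF/ELSE logic on the slide can be written out as a short Python sketch. It is a toy model under stated assumptions: the class and method names are invented for illustration, an entry holds only its RP and oilist, and an is_rp switch (not mentioned in the notes) stops the Join at the RP.

```python
from dataclasses import dataclass, field


@dataclass
class StarGEntry:
    """Hypothetical, simplified (*, G) entry: just the RP and the oilist."""
    rp: str
    oilist: set = field(default_factory=set)


class PimRouter:
    """Toy model of the (*, G) Join handling described above."""

    def __init__(self, rp_address, is_rp=False):
        self.rp_address = rp_address
        self.is_rp = is_rp      # assumption: the RP sends no Join further upstream
        self.mroute = {}        # group -> StarGEntry
        self.joins_sent = []    # groups for which a (*, G) Join went upstream

    def receive_star_g_join(self, group, interface):
        """Process a (*, G) Join; return True if a Join was sent toward the RP."""
        entry = self.mroute.get(group)
        if entry is not None:
            # Already on the Shared Tree: only add the interface to the oilist.
            entry.oilist.add(interface)
            return False
        # No (*, G) state yet: create it, add the interface, then join upstream.
        self.mroute[group] = StarGEntry(self.rp_address, {interface})
        if self.is_rp:
            return False
        self.joins_sent.append(group)
        return True


rtr_a = PimRouter("10.1.5.1")
first = rtr_a.receive_star_g_join("224.1.1.1", "Ethernet0")    # creates state
second = rtr_a.receive_star_g_join("224.1.1.1", "Ethernet1")   # state exists
```

Only the first Join causes a new Join toward the RP; the second merely grows the oilist, which is why the process stops at the first router that is already on the Shared Tree.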


PIM SM Joining
[Diagram: Rcvr A -- E1 rtr-b E0 (10.1.2.2) -- E0 (10.1.2.1) rtr-a S0 (10.1.4.2), S1 -- Shared Tree toward RP (10.1.5.1). Step 1: IGMP Join from Rcvr A.]

1) Rcvr A wishes to receive group G traffic. Sends IGMP Join for G.


PIM SM Joining Example


1) Receiver A wishes to receive group G multicast traffic and therefore sends an IGMP Host Membership message (sometimes loosely referred to as an IGMP Join) which is received by rtr-b. rtr-b has no existing (*, G) state for group G and therefore creates an entry. (See next slide.)


PIM SM Joining
[Diagram: Rcvr A -- E1 rtr-b E0 (10.1.2.2) -- E0 (10.1.2.1) rtr-a S0 (10.1.4.2), S1 -- Shared Tree toward RP (10.1.5.1).]

(*, 224.1.1.1), 00:00:05/00:02:54, RP 10.1.5.1, flags: SC
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:00:05/00:02:54

rtr-b creates (*, 224.1.1.1) state


State in rtr-b after Joining (*, 224.1.1.1)


(*, 224.1.1.1) indicates the (*, G) entry.
00:00:05/00:02:54 indicates that the entry has existed for 5 seconds and will expire in 2 minutes and 54 seconds.
RP 10.1.5.1 is the IP address of the Rendezvous Point for group 224.1.1.1.
flags: SC indicates that this is a Sparse mode group (S) and that there is a member of this group directly connected (C) to the router.
Incoming interface: Ethernet0, RPF nbr 10.1.2.1 indicates the incoming interface (up the Shared Tree toward the RP) and that the RPF neighbor's IP address (in the direction of the RP) is 10.1.2.1.
Outgoing interface list: lists the interfaces that are in the outgoing interface list for this entry.
Ethernet1, Forward/Sparse, 00:00:05/00:02:54 indicates that Ethernet1 is in the oilist, that it is in the Forward state and in Sparse mode, and that it has been in the list for 5 seconds and will expire in 2 minutes and 54 seconds if no further (*, G) Join or IGMP Report is received on this interface.
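The uptime/expires pair can be converted to seconds with a few lines of Python, which is handy when comparing timers across several entries. This is a minimal sketch that assumes the plain hh:mm:ss/hh:mm:ss form shown on the slide; any other form a router may print is not handled. The function names are illustrative.

```python
def hms_to_seconds(text):
    """Convert 'hh:mm:ss' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_timers(timer_field):
    """Split 'uptime/expires' and return both values in seconds."""
    uptime, expires = timer_field.strip().split("/")
    return hms_to_seconds(uptime), hms_to_seconds(expires)


uptime_s, expires_s = parse_timers("00:00:05/00:02:54")
print(uptime_s, expires_s)   # 5 174
```

Here 174 seconds is the 2 minutes and 54 seconds remaining before the entry expires.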


PIM SM Joining
[Diagram: Rcvr A -- E1 rtr-b E0 (10.1.2.2) -- E0 (10.1.2.1) rtr-a S0 (10.1.4.2), S1 -- Shared Tree toward RP (10.1.5.1). Step 2: PIM Join from rtr-b to rtr-a.]

1) Rcvr A wishes to receive group G traffic. Sends IGMP Join for G.
2) rtr-b sends (*,G) Join towards RP.


PIM SM Joining Example


2) Because the OIL of the (*, G) entry transitioned from Null to non-Null (when rtr-b added Ethernet1 to the OIL of the newly created entry), a PIM (*, G) Join is sent to rtr-b's upstream PIM neighbor (rtr-a) in the direction of the RP. When rtr-a receives the (*, G) Join, it creates (*, G) state. (See next slide.)


PIM SM Joining
[Diagram: Rcvr A -- E1 rtr-b E0 (10.1.2.2) -- E0 (10.1.2.1) rtr-a S0 (10.1.4.2), S1 -- Shared Tree toward RP (10.1.5.1).]

(*, 224.1.1.1), 00:00:05/00:02:54, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.4.1
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:00:05/00:02:54

rtr-a creates (*, 224.1.1.1) state.


State in rtr-a after Joining (*, 224.1.1.1)


(*, 224.1.1.1) indicates the (*, G) entry.
00:00:05/00:02:54 indicates that the entry has existed for 5 seconds and will expire in 2 minutes and 54 seconds.
RP 10.1.5.1 is the IP address of the Rendezvous Point for group 224.1.1.1.
flags: S indicates that this is a Sparse mode group (S).
Incoming interface: Serial0, RPF nbr 10.1.4.1 indicates the incoming interface (up the Shared Tree toward the RP) and that the RPF neighbor's IP address (in the direction of the RP) is 10.1.4.1.
Outgoing interface list: lists the interfaces that are in the outgoing interface list for this entry.
Ethernet0, Forward/Sparse, 00:00:05/00:02:54 indicates that Ethernet0 is in the oilist, that it is in the Forward state and in Sparse mode, and that it has been in the list for 5 seconds and will expire in 2 minutes and 54 seconds if no further (*, G) Join or IGMP Report is received on this interface.


PIM SM Joining
[Diagram: Rcvr A -- E1 rtr-b E0 (10.1.2.2) -- E0 (10.1.2.1) rtr-a S0 (10.1.4.2), S1 -- Shared Tree toward RP (10.1.5.1). Step 3: PIM Join from rtr-a toward the RP. Step 4: Shared Tree extends to the RP.]

1) Rcvr A wishes to receive group G traffic. Sends IGMP Join for G.
2) rtr-b sends (*,G) Join towards RP.
3) rtr-a sends (*,G) Join towards RP.
4) Shared tree is built all the way back to the RP.

PIM SM Joining Example


3) Because the OIL of the (*, G) entry transitioned from Null to non-Null (when rtr-a added Ethernet0 to the OIL of the newly created entry), a PIM (*, G) Join is sent to rtr-a's upstream PIM neighbor in the direction of the RP. When the upstream router receives the (*, G) Join, it too creates (*, G) state and creates a branch of the Shared Tree.
4) This process continues all the way back to the RP (or until a router is reached that is already on the Shared Tree and therefore already has a (*, G) entry).


PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM SM Registering
Senders begin sourcing Multicast Traffic
  Senders don't necessarily perform IGMP group joins.

1st-hop router unicasts Registers to RP
  A Mcast packet is encapsulated in each Register msg
  Register messages follow unicast path to RP

RP receives Register messages
  De-encapsulates Mcast packet inside Register msg
  Forwards Mcast packet down Shared Tree
  Sends (S,G) Join toward Source to build an SPT from the Source to the RP

All Senders are not necessarily Receivers and vice versa.


It is not a requirement that all sources be receivers. In the case of a source-only host, it is permissible for the host to simply begin sending multicast traffic without ever joining the group via IGMP.

1st-hop router sends Registers to the RP


In PIM Sparse mode, when a 1st-hop router receives the first multicast packet from directly connected source S for group G, it creates (S, G) state and sets the F bit in the (S, G) entry to indicate that S is a directly connected source. It also sets the Registering flag to indicate that it is in the process of Registering. Next, the 1st-hop router encapsulates the original multicast packet in a PIM Register message and unicasts it to the RP. (Any subsequent multicast packets received from directly connected source S for group G are also encapsulated in a Register message and unicast to the RP. This continues until an (S, G) Register-Stop message is received from the RP.)

RP receives Register messages


When the RP receives a Register message, it de-encapsulates the message. If the packet is for a group for which the RP has (*, G) state, the RP will:
  Forward the original packet out all interfaces in the (*, G) entry's oilist.
  If it hasn't already done so, create (S, G) state and send an (S, G) Join back toward the source in order to join the Shortest-Path Tree (SPT) to source S.


PIM SM Registering (cont.)


1st-hop router receives (S,G) Join
  SPT between Source and RP is now built.
  Begins forwarding traffic down SPT to RP
  (S,G) Traffic temporarily flowing down 2 paths to RP

RP receives traffic down native SPT
  Sends a Register-Stop msg to the 1st-Hop router.

1st-hop router receives Register-Stop
  Stops encapsulating traffic in Register messages
  (S,G) Traffic now flowing down single SPT to RP


1st-hop router receives (S, G) Join


When the 1st-hop router receives the (S, G) Join (sent hop-by-hop from the RP), it processes it normally by adding the interface on which the Join was received to the oilist of the existing (S, G) entry. (This entry was originally created when the 1st-hop router received the first multicast packet from directly connected Source S.) This completes the building of the Shortest-Path Tree (SPT) from the Source to the RP. The 1st-hop router now begins forwarding Source S multicast traffic down the newly built SPT to the RP. Note: (S, G) traffic temporarily flows to the RP via two methods: via Register messages (until a Register-Stop message is received) and via the native Shortest-Path Tree (SPT).

RP begins receiving traffic down the (S, G) SPT.


As soon as the RP begins receiving (S, G) traffic natively (i.e. not encapsulated in Register messages) down the SPT, the RP will set the T bit in the (S, G) entry to denote that traffic is successfully flowing down the Shortest-Path Tree (SPT). Now when any (S, G) Register messages are received by the RP, it sees that the T bit is set in the (S, G) entry and responds by sending a PIM (S, G) Register-Stop message to the 1st-hop router. This notifies the 1st-hop router that traffic is now being received natively down the SPT.

1st-hop router receives Register-Stop message


When the (S, G) Register-Stop message is received by the 1st-hop router, it clears the Registering flag in the (S, G) entry and stops encapsulating (S,G) traffic in Register messages. Traffic is now flowing only down the SPT to the RP.
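The Register exchange described above can be modelled as two small state machines. The Python sketch below is a simplification under stated assumptions: class and method names are invented, actions are returned as plain strings instead of packets, the RP is assumed not to forward the encapsulated packet again once the T flag is set, and the case where the RP has no (*, G) state is not modelled.

```python
class FirstHopRouter:
    """Encapsulates source traffic in Registers until a Register-Stop arrives."""

    def __init__(self):
        self.registering = {}   # (source, group) -> still sending Registers?

    def on_source_packet(self, source, group, packet):
        key = (source, group)
        # First packet from a directly connected source: create (S, G)
        # state and start registering.
        self.registering.setdefault(key, True)
        if self.registering[key]:
            return {"type": "Register", "to": "RP", "inner": packet}
        return None             # traffic is forwarded natively only

    def on_register_stop(self, source, group):
        self.registering[(source, group)] = False


class RendezvousPoint:
    """RP side: forward, join the SPT, then stop the Registers."""

    def __init__(self, groups_with_star_g_state):
        self.star_g = set(groups_with_star_g_state)
        self.t_flag = {}        # (source, group) -> T flag of the (S, G) entry

    def on_register(self, source, group):
        key = (source, group)
        if self.t_flag.get(key):
            # Assumption: no further forwarding of the encapsulated packet here.
            return ["send Register-Stop"]
        actions = []
        if group in self.star_g:
            actions.append("forward packet down Shared Tree")
            if key not in self.t_flag:
                self.t_flag[key] = False   # create (S, G) state
                actions.append("send (S, G) Join toward source")
        return actions

    def on_native_packet(self, source, group):
        # The first native packet down the SPT sets the T flag.
        if (source, group) in self.t_flag:
            self.t_flag[(source, group)] = True


S, G = "171.68.37.121", "224.1.1.1"
fhr, rp = FirstHopRouter(), RendezvousPoint({G})
register = fhr.on_source_packet(S, G, b"data")
first_actions = rp.on_register(S, G)
rp.on_native_packet(S, G)
later_actions = rp.on_register(S, G)
fhr.on_register_stop(S, G)
after_stop = fhr.on_source_packet(S, G, b"data")
```

The sequence at the bottom mirrors the slide: the first Register triggers forwarding and an (S, G) Join, a native packet sets the T flag, the next Register draws a Register-Stop, and the 1st-hop router then stops encapsulating.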

PIM SM Register Examples

Receivers Join Group First
Source Registers First
Receivers along the SPT


PIM SM Register Examples


Depending on whether there are any existing Receivers for group G on the Shared Tree (RPT), the RP handles the Register process a little differently. In the following examples we will consider the Register process for the cases when:
  Receivers join group G first.
  The Source Registers first.
  Receivers are along the SPT.


PIM SM Registering
Receiver Joins Group First

[Diagram: E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree.]

(*, 224.1.1.1), 00:03:14/00:02:59, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:03:14/00:02:45
    Serial1, Forward/Sparse, 00:03:14/00:02:45

State in RP before any source registers (with receivers on Shared Tree)


State in RP before Registering (Rcvrs on Shared Tree)


Pay particular attention to the following in the (*, G) entry: The Incoming interface: is NULL and the RPF nbr is 0.0.0.0. This indicates that this router is the RP. The Outgoing interface list: contains Serial0 and Serial1 which are assumed to be the only two active branches of the Shared Tree (RPT).


PIM SM Registering
Receiver Joins Group First

[Diagram: E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree.]

rtr-b>sh ip mroute 224.1.1.1
No such group

State in rtr-b before any source registers (with receivers on Shared Tree)


State in rtr-b before source registers


Note that there is no group state information for this Group yet.


PIM SM Registering
Receiver Joins Group First

[Diagram: E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree.]

rtr-a>sh ip mroute 224.1.1.1
No such group.

State in rtr-a before any source registers (with receivers on Shared Tree)


State in 1st-hop router (rtr-a) before source registers


Note that there is no group state information for this Group yet.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree. Step 1: (171.68.37.121, 224.1.1.1) multicast packets arrive at rtr-a.]

1) Source begins sending group G traffic.



Receivers Join Group First Example


1) Source S begins sending traffic to group G.
2) The 1st-hop router (rtr-a) creates (*, G) and (S, G) state, encapsulates the multicast packets in PIM Register messages and unicasts them to the RP.
3) The RP (rtr-c) de-encapsulates the packets and sees that they are for group G, for which it already has (*, G) state. It then forwards the packets down the Shared Tree.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree. (171.68.37.121, 224.1.1.1) multicast packets arrive at rtr-a; Register messages are unicast from rtr-a to the RP.]

(*, 224.1.1.1), 00:00:03/00:02:56, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null
(171.68.37.121/32, 224.1.1.1), 00:00:03/00:02:56, flags: FPT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering
  Outgoing interface list: Null

rtr-a creates (S, G) state for source (after automatically creating a (*, G) entry)

1) Source begins sending group G traffic.
2) rtr-a encapsulates packets in Registers; unicasts to RP.

1st-hop router (rtr-a) creates (S, G) state


A (*, G) entry must be created before the (S, G) entry can be created. Note that:
  The RPF information for this entry points up the Shared Tree via Serial0 with the RPF neighbor of 171.68.28.191. (Serial0 of rtr-b.)
  Because in this example no members have joined the group (the sender is only sending), the OIL of the (*, G) entry is Null.
  The P flag (Pruned) is set since the OIL is Null.
The (S, G) entry is then created. Pay particular attention to the following:
  The RPF information for this entry points towards the source via Ethernet0. The RPF neighbor is 0.0.0.0 because the source is directly connected.
  The (S, G) OIL receives a copy of the (*, G) OIL. (Which is Null.)
  The F flag is set in the (S, G) entry, which indicates that this is a directly connected Source.
  The Registering flag is set in the (S, G) entry, which indicates that we are still sending Register messages to the RP for this Source.
  The P flag (Pruned) is set since the OIL is Null.
2) The 1st-hop router encapsulates the multicast packets in PIM Register message(s) and unicasts them to the RP.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 (171.68.28.139) -- S3 rtr-c (RP) S0, S1 -- Shared Tree. Register messages flow from rtr-a to the RP. Step 3: (*, 224.1.1.1) multicast traffic flows from the RP down the Shared Tree.]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:09:21/00:02:38
    Serial1, Forward/Sparse, 00:03:14/00:02:46
(171.68.37.121, 224.1.1.1), 00:01:15/00:02:46, flags:
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:00:49/00:02:11
    Serial1, Forward/Sparse, 00:00:49/00:02:11

RP processes Register; creates (S, G) state

3) rtr-c (RP) de-encapsulates packets; forwards down Shared tree.



The RP creates (S, G) state


As a result of the Register message that was received from rtr-a, the RP creates (S, G) state as follows:
  The RPF information is calculated using the source address contained in the multicast packet encapsulated inside of the Register message. This results in an IIF of Serial3 and an RPF neighbor of 171.68.28.139.
  Next, the OIL of the parent (*, G) entry is copied into the OIL of the new (S, G) entry. (An additional check is made to ensure that the IIF does not appear in the OIL. If it does, it is removed to prevent a route loop.)
Now the router is ready to forward the (S, G) packet that was encapsulated in the Register message using the newly created (S, G) state. (Note that traffic is always forwarded using the matching (S, G) entry if one exists. Otherwise, the (*, G) entry is used.) This is accomplished as follows:
  Because this packet was received inside of a Register message, the RPF check is skipped.
  Next, the router forwards a copy of the packet out all interfaces in the (S, G) OIL. In this case a copy is sent out Serial0 and Serial1, which correspond to the two branches of the Shared Tree.
The T flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is received natively (via the Incoming interface) and forwarded using this entry, the T flag will be set.
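Two rules in these notes lend themselves to a short Python sketch: the new (S, G) OIL is a copy of the parent (*, G) OIL minus the incoming interface, and forwarding prefers the (S, G) entry over the (*, G) entry. The table layout (a dictionary keyed by (source, group), with "*" as the wildcard) and the function names are assumptions made for the example.

```python
def build_s_g_oil(parent_oil, iif):
    """Copy the parent (*, G) OIL, dropping the IIF to avoid a route loop."""
    return [name for name in parent_oil if name != iif]


def select_entry(mroute, source, group):
    """Return the key of the entry used to forward (source, group) traffic."""
    if (source, group) in mroute:
        return (source, group)   # a matching (S, G) entry always wins
    if ("*", group) in mroute:
        return ("*", group)      # otherwise fall back to (*, G)
    return None


# State in the RP from the slide: (*, G) OIL is Serial0 and Serial1.
mroute = {("*", "224.1.1.1"): ["Serial0", "Serial1"]}
mroute[("171.68.37.121", "224.1.1.1")] = build_s_g_oil(
    mroute[("*", "224.1.1.1")], iif="Serial3"
)
used = select_entry(mroute, "171.68.37.121", "224.1.1.1")
```

In this example the IIF (Serial3) was never in the parent OIL, so the (S, G) OIL ends up identical to the (*, G) OIL, matching the output shown on the slide.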


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- rtr-c (RP) S0, S1 -- Shared Tree. Register messages flow from rtr-a to the RP; (*, 224.1.1.1) multicast traffic flows down the Shared Tree. Step 4: (S,G) Join from the RP to rtr-b.]

4) RP sends (S,G) Join toward Source to build SPT.



Receivers Join Group First Example (cont.)


4) Because the RP has existing (*, G) state (i.e. Receivers are already waiting on the Shared Tree), it sends an (S, G) Join toward source S to build a Shortest-Path Tree (SPT) from source S to the RP.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 (171.68.28.190) -- S0 rtr-b S1 -- rtr-c (RP) S0, S1 -- Shared Tree. Register messages flow from rtr-a to the RP; (*, 224.1.1.1) multicast traffic flows down the Shared Tree. Step 5: (S,G) Join from rtr-b to rtr-a.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list: Null
(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags:
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

rtr-b processes Join, creates (S, G) state (after automatically creating the (*, G) entry)

5) rtr-b sends (S,G) Join toward Source to continue building SPT.



rtr-b processes the (S, G) Join and creates state


A (*, G) entry must be created before the (S, G) entry can be created. Note that:
  The RPF information for the (*, G) entry points up the Shared Tree via Serial1 with the RPF neighbor of 171.68.28.140. (Serial3 of the RP.)
  Because in this example no members have joined the group, the OIL of the (*, G) entry is Null.
  The P flag (Pruned) is set since the OIL is Null.
The (S, G) entry is then created. Pay particular attention to the following:
  The RPF information for this entry points towards the source via Serial0. The RPF neighbor is 171.68.28.190. (Serial0 of rtr-a.)
  The (S, G) OIL initially receives a copy of the (*, G) OIL. (Which is Null.)
  Interface Serial1 (which is the interface that received the (S, G) Join) is added to the (S, G) OIL.
  The T flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is forwarded using this entry, the T flag will be set.
5) Because the OIL of the (S, G) entry transitioned from Null to non-Null (when rtr-b added Serial1 to the OIL of the newly created entry), a PIM (S, G) Join is sent to rtr-a to continue the process of joining the SPT.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- rtr-c (RP) S0, S1 -- Shared Tree. Register messages flow from rtr-a to the RP; (*, 224.1.1.1) multicast traffic flows down the Shared Tree.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null
(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:04:28/00:01:32

rtr-a processes the (S, G) Join; adds Serial0 to OIL


rtr-a processes the (S, G) Join


Because an (S, G) entry already existed, rtr-a simply added the interface on which it received the (S, G) join to the OIL. This results in the following: Serial0 is now listed in the Outgoing interface list (OIL) since the RP joined the SPT via this interface. The P flag (Pruned) is cleared since the OIL is no longer Null.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- rtr-c (RP) S0, S1 -- Shared Tree. Step 6: native (171.68.37.121, 224.1.1.1) multicast packets reach the RP alongside the Register messages. Step 7: Register-Stop from the RP to rtr-a.]

6) RP begins receiving (S,G) traffic down SPT.
7) RP sends Register-Stop to rtr-a.


A branch of the (S,G) SPT has been built to the RP.


6) Now that the SPT has been built from source S to the RP, traffic begins flowing down the Shortest-Path Tree (SPT). At this point, the RP is receiving the (S, G) traffic natively down the SPT. (This causes the T flags to be set in the (S, G) entries along this path including in the RP.) 7) The RP then sends an (S, G) Register-Stop to the 1st-hop router to inform it that the encapsulated group G Register messages from source S are no longer necessary.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- S3 rtr-c (RP) S0, S1 -- Shared Tree. Step 8: only native (171.68.37.121, 224.1.1.1) multicast packets flow from rtr-a to the RP.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null
(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:04:28/00:01:32

rtr-a stops sending Register messages (Final State in rtr-a)

8) (S,G) Traffic now flowing down a single path (SPT) to RP.



rtr-a stops sending Register messages


When the 1st-hop router (rtr-a) receives the (S, G) Register-Stop message it ceases sending encapsulated Register messages for (S, G) traffic. Notice that the Registering flag on the second line of the (S, G) entry is no longer being displayed indicating that rtr-a is not sending Registers. This is the final state in rtr-a after the Registration process. 8) (S, G) traffic is now only flowing down the Shortest-Path Tree (SPT).


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- E0 rtr-a S0 -- S0 rtr-b S1 -- rtr-c (RP) S0, S1 -- Shared Tree. Native (171.68.37.121, 224.1.1.1) multicast packets flow to the RP; (*, 224.1.1.1) multicast traffic flows down the Shared Tree.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list: Null
(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: T
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

Final state in rtr-b


Final state in rtr-b after the Registration process


Pay particular attention to the following:
    In the (S, G) entry, the T flag is now set, indicating that (S, G) traffic is flowing along this path.
    The (*, G) entry still has a Null OIL and the P flag is still set. This is because there are no members that have joined the Shared Tree.


PIM SM Registering
Receiver Joins Group First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1, 171.68.28.139) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic; Shared Tree.]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:09:21/00:02:38
    Serial1, Forward/Sparse, 00:03:14/00:02:46

(171.68.37.121, 224.1.1.1), 00:01:15/00:02:46, flags: T
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:00:49/00:02:11
    Serial1, Forward/Sparse, 00:00:49/00:02:11

Final state in the RP (with receivers on Shared Tree)

Final state in the RP after the Registration process


Pay particular attention to the following in the newly created (S, G) entry: The T flag is now set indicating that (S, G) traffic is flowing along this path.


PIM SM Register Examples

    Receivers Join Group First
    Source Registers First
    Receivers along the SPT


PIM SM Register Examples


Depending on whether there are any existing Receivers for group G on the Shared Tree (RPT), the RP handles the Register process a little differently. In the following examples we will consider the Register process for the cases when:
    Receivers join group G first;
    The Source Registers first.


PIM SM Registering
Source Registers First

[Diagram: (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1).]

rtr-c>show ip mroute 224.1.1.1
Group 224.1.1.1 not found.

State in RP before Registering (without receivers on Shared Tree)


State in RP before Registering (w/o Rcvrs on Shared Tree)


Notice that no state for group G exists since there are no Receivers on the Shared Tree yet.


PIM SM Registering
Source Registers First

[Diagram: (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1).]

rtr-b>show ip mroute 224.1.1.1
Group 224.1.1.1 not found.

State in rtr-b before any source registers (without receivers on Shared Tree)


State in rtr-b before source registers


Note that there is no group state information for this Group yet.


PIM SM Registering
Source Registers First

[Diagram: (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1).]

rtr-a>show ip mroute 224.1.1.1
Group 224.1.1.1 not found.

State in rtr-a before any source registers (without receivers on Shared Tree)


State in 1st-hop router (rtr-a) before source registers


Note that there is no group state information for this Group yet.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Label: (171.68.37.121, 224.1.1.1) Mcast Packets (step 1).]

1) Source begins sending group G traffic.



Source Registers First Example


1) Source S begins sending traffic to group G.
2) The 1st-hop router (rtr-a) creates (*, G) and (S, G) state, encapsulates the multicast packets in PIM Register message(s), and unicasts them to the RP.
3) The RP (rtr-c) de-encapsulates the (S, G) packet and creates (*, G) and (S, G) state. Since no one has joined the Shared Tree yet, the OILs of these entries will be NULL. Because the OIL of the (S, G) entry (just created) is NULL, the packet is discarded.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; Register Msgs (step 2).]

(*, 224.1.1.1), 00:00:03/00:02:56, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:00:03/00:02:56, flags: FPT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0, Registering
  Outgoing interface list: Null

rtr-a creates (S, G) state for source (After automatically creating a (*, G) entry)

1) Source begins sending group G traffic.
2) rtr-a encapsulates packets in Registers; unicasts to RP.

1st-hop router (rtr-a) creates (S, G) state


A (*, G) entry must be created before the (S, G) entry can be created. Note that:
    The RPF information for this entry points up the Shared Tree via Serial0 with the RPF neighbor of 171.68.28.191. (Serial0 of rtr-b.)
    Because in this example no members have joined the group (the sender is only sending), the OIL of the (*, G) entry is Null.
    The P flag (Pruned) is set since the OIL is Null.

The (S, G) entry is then created. Pay particular attention to the following:
    The RPF information for this entry points towards the source via Ethernet0. The RPF neighbor is 0.0.0.0 because the source is directly connected.
    The (S, G) OIL receives a copy of the (*, G) OIL. (Which is Null.)
    The F flag is set in the (S, G) entry, which indicates that this is a directly connected Source.
    The Registering flag is set in the (S, G) entry, which indicates that we are still sending Register messages to the RP for this Source.
    The P flag (Pruned) is set since the OIL is Null.

2) The 1st-hop router encapsulates the multicast packets in PIM Register message(s) and unicasts them to the RP.
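The order of operations described above can be sketched in a few lines of Python. This is only an illustrative model, not router code: the `MrouteEntry` class, the `first_hop_create_state` function and the dictionary used as a routing table are all hypothetical names, and only the flags discussed in these notes (S, P, F and the Registering indication) are modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MrouteEntry:
    """Simplified multicast route entry (hypothetical model, not IOS internals)."""
    source: str                 # "*" for a (*, G) entry
    group: str
    iif: Optional[str]          # incoming (RPF) interface
    rpf_nbr: str
    oil: list = field(default_factory=list)
    flags: set = field(default_factory=set)
    registering: bool = False


def first_hop_create_state(table, source, group, src_if, rp_if, rp_nbr):
    """Create the (*, G) entry first, then the (S, G) entry, at a 1st-hop router."""
    star_g = table.get(("*", group))
    if star_g is None:
        # RPF information of (*, G) points up the Shared Tree toward the RP.
        star_g = MrouteEntry("*", group, rp_if, rp_nbr, flags={"S"})
        table[("*", group)] = star_g
    if not star_g.oil:
        star_g.flags.add("P")   # Null OIL, so the entry is Pruned

    # The source is directly connected, so the RPF neighbor is 0.0.0.0.
    # The (S, G) OIL starts as a copy of the (*, G) OIL.
    sg = MrouteEntry(source, group, src_if, "0.0.0.0",
                     oil=list(star_g.oil), flags={"F"}, registering=True)
    if not sg.oil:
        sg.flags.add("P")
    table[(source, group)] = sg
    return sg


table = {}
sg = first_hop_create_state(table, "171.68.37.121", "224.1.1.1",
                            "Ethernet0", "Serial0", "171.68.28.191")
print(sorted(table[("*", "224.1.1.1")].flags), sorted(sg.flags), sg.registering)
```

Using the addresses from the slide, the sketch ends with a Pruned (*, G) entry and an (S, G) entry that has a Null OIL and is still Registering, which is the situation shown for rtr-a.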


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1, 171.68.28.139) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; Register Msgs; step 3 at rtr-c.]

(*, 224.1.1.1), 00:01:15/00:01:45, RP 171.68.28.140, flags: SP
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list: Null

(171.68.37.121, 224.1.1.1), 00:01:15/00:01:45, flags: P
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list: Null

RP processes Register; creates (S, G) state (After automatically creating the (*, G) entry)

3) rtr-c (RP) has no receivers on Shared Tree; discards packet.


The RP creates (S, G) state


As a result of the Register message that was received from rtr-a, the RP creates (*, G) and (S, G) state. However, because no previous (*, G) state existed, it must be created before the (S, G) entry can be created. This (*, G) entry is created as shown above. Notice that the (*, G) OIL is NULL. This is because the RP has not yet received any (*, G) Joins for this group. (Remember, in this example, the source registers first.)

Next, the (S, G) entry can be created and is accomplished as follows:
    The RPF information is calculated using the source address contained in the multicast packet encapsulated inside of the Register message. This results in an IIF of Serial3 and an RPF neighbor of 171.68.28.139.
    Next, the OIL of the parent (*, G) entry is copied into the OIL of the new (S, G) entry. Since the OIL of the (*, G) entry is NULL, this results in a NULL (S, G) OIL.

Now the router is ready to forward the (S, G) packet that was encapsulated in the Register message using the newly created (S, G) state. This is accomplished as follows:
    Because this packet was received inside of a Register message, the RPF check is skipped.
    Next, the router forwards a copy of the packet out all interfaces in the matching (S, G) OIL. However, because the (S, G) OIL is NULL (i.e. there are no branches of the Shared Tree), the packet is simply discarded.
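The Register handling described above can be approximated with the following Python sketch. It is a simplified model under stated assumptions: entries are plain dictionaries, `rp_process_register` is a hypothetical name, and only the "no receivers" outcome of this example is modeled (what the RP does when the OIL is not Null is outside this sketch).

```python
def rp_process_register(table, source, group, rpf_if, rpf_nbr):
    """Model of an RP handling one Register message.

    Returns (interfaces the inner packet is forwarded on, send_register_stop).
    """
    # A (*, G) entry must exist before the (S, G) entry can be created.
    # On the RP itself the (*, G) incoming interface is Null.
    star_g = table.setdefault(
        ("*", group), {"iif": None, "rpf_nbr": "0.0.0.0", "oil": []})

    sg = table.get((source, group))
    if sg is None:
        # RPF information comes from the source address of the inner packet;
        # the OIL is copied from the parent (*, G) entry.
        sg = {"iif": rpf_if, "rpf_nbr": rpf_nbr,
              "oil": [i for i in star_g["oil"] if i != rpf_if]}
        table[(source, group)] = sg

    # The packet arrived inside a Register, so no RPF check is made here.
    forwarded_on = list(sg["oil"])

    # With a Null OIL the packet is discarded and, in this example,
    # the RP answers with a Register-Stop.
    send_register_stop = not forwarded_on
    return forwarded_on, send_register_stop


table = {}
out, stop = rp_process_register(table, "171.68.37.121", "224.1.1.1",
                                "Serial3", "171.68.28.139")
print(out, stop)
```

With an empty table, as in this example, nothing is forwarded and the function reports that a Register-Stop should be sent.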


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; Register Msgs; Register-Stop.]

RP sends Register-Stop to rtr-a.


Source Registers First Example


4) Since the (*, G) OIL in the RP is Null, and hence there are no receivers on the Shared Tree, the RP does not need the (S, G) traffic. Therefore, the RP sends an (S, G) Register-Stop message to the 1st-hop router so that it will stop sending Register messages.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Label: (171.68.37.121, 224.1.1.1) Mcast Packets.]

rtr-a stops encapsulating traffic in Register Messages; drops packets from Source.

Source Registers First Example


5) The 1st-hop router receives the (S, G) Register-Stop message and stops sending Register messages for (S, G) traffic.

Note: Eventually, the original (S, G) entry will time out (approx. 3 min.) and be deleted. The Register process will start over again when the 1st-hop router receives the next multicast packet from the directly connected source S. The RP will again respond with a Register-Stop, which will prevent the (S, G) traffic from flowing to the RP until it is needed.
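The cycle in the note above (register, Register-Stop, time out, register again) can be pictured as a small state machine. The Python below is a sketch only: the class name `FirstHopSG` and the constant `ENTRY_LIFETIME_S` are hypothetical, the 180-second value is just the "approx. 3 min." from the note, and timer refresh rules of a real router are not modeled.

```python
ENTRY_LIFETIME_S = 180  # "approx. 3 min." from the notes; exact value is an assumption


class FirstHopSG:
    """Toy model of the (S, G) entry in a 1st-hop router with no downstream receivers."""

    def __init__(self):
        self.exists = False
        self.registering = False
        self.expires_in = 0

    def on_source_packet(self):
        """Return "register" if the packet goes to the RP in a Register, else "drop"."""
        if not self.exists:
            # First packet (or first packet after a timeout): create state
            # and start the Register process.
            self.exists = True
            self.registering = True
            self.expires_in = ENTRY_LIFETIME_S
        return "register" if self.registering else "drop"

    def on_register_stop(self):
        """The RP asked us to stop encapsulating traffic for this (S, G)."""
        self.registering = False

    def tick(self, seconds):
        """Advance time; delete the entry once its expiration timer reaches zero."""
        if self.exists:
            self.expires_in -= seconds
            if self.expires_in <= 0:
                self.exists = False
                self.registering = False


entry = FirstHopSG()
first = entry.on_source_packet()        # state created, packet is registered
entry.on_register_stop()
after_stop = entry.on_source_packet()   # OIL is Null, packet is dropped
entry.tick(ENTRY_LIFETIME_S)            # entry times out and is deleted
after_timeout = entry.on_source_packet()
print(first, after_stop, after_timeout)
```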


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Label: (171.68.37.121, 224.1.1.1) Mcast Packets.]

(*, 224.1.1.1), 00:01:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:01:28/00:01:32, flags: FPT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0
  Outgoing interface list: Null

State in rtr-a after Registering (without receivers on Shared Tree)

State in 1st-hop router after Registering (w/o Rcvrs on Shared Tree)


Pay particular attention to the following in the (S, G) entry:
    The Registering flag is now cleared.
    The Outgoing interface list is still Null since the RP did not join the SPT.
    The P flag (Pruned) is still set since the OIL is still Null.
    The 00:01:32 Expiration time value will count down to zero, at which time the (S, G) entry will be deleted. (The Register process will begin all over again when the next multicast packet is received from source S.)
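As a small worked example of the timer values in this entry, the Python below splits the "00:01:28/00:01:32" field into its two parts (uptime and expiration time) and converts them to seconds. The helper names are hypothetical. The observation that the two values add up to three minutes assumes the entry has not been refreshed since it was created.

```python
def hms_to_seconds(hms):
    """Convert an "hh:mm:ss" timer string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def split_timers(timer_field):
    """Split an "uptime/expires" field into two values in seconds."""
    uptime, expires = timer_field.split("/")
    return hms_to_seconds(uptime), hms_to_seconds(expires)


uptime_s, expires_s = split_timers("00:01:28/00:01:32")
print(uptime_s, expires_s, uptime_s + expires_s)  # 88 92 180
```

Here 88 seconds have elapsed and 92 remain, for a total of 180 seconds, which matches the "approx. 3 min." lifetime mentioned earlier.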


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Label: (171.68.37.121, 224.1.1.1) Mcast Packets.]

rtr-b>show ip mroute 224.1.1.1
Group 224.1.1.1 not found.

State in rtr-b after rtr-a Registers (without receivers on Shared Tree)


State in rtr-b after Registering (w/o Rcvrs on Shared Tree)


Notice that no state exists in rtr-b at this point in time.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1, 171.68.28.139) -- (S3) rtr-c, the RP (S0, S1). Label: (171.68.37.121, 224.1.1.1) Mcast Packets.]

(*, 224.1.1.1), 00:01:15/00:01:45, RP 171.68.28.140, flags: SP
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list: Null

(171.68.37.121, 224.1.1.1), 00:01:15/00:01:45, flags: P
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list: Null

State in RP after rtr-a Registers (without receivers on Shared Tree)


State in RP after Registering (w/o Rcvrs on Shared Tree)


Pay particular attention to the following in the newly created (S, G) entry:
    The RPF nbr is the IP Address of rtr-b.
    The Incoming interface is Serial3, which is the RPF interface towards source S via rtr-b.
    The Outgoing interface list is Null since the (*, G) OIL is also Null. (Indicates there are no Receivers on the Shared Tree yet.)
    The P flag (Pruned) is set since the OIL is Null.

The (S, G) state will remain in the RP as long as the source is still actively sending. This is accomplished by the fact that the first-hop router will continue sending periodic Register messages to the RP as long as the first-hop router is receiving traffic from the source.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, G) Join.]

Receivers begin joining the Shared Tree

RP (rtr-c) receives (*, G) Join from a receiver on Shared Tree.



Source Registers First Example


6) The RP now begins receiving (*, G) Joins from last-hop routers with Receivers that wish to join the Shared Tree.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (S, G) Join (step 7).]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:00:14/00:02:46

(171.68.37.121/32, 224.1.1.1), 00:01:15/00:02:46, flags: T
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:00:14/00:02:46

RP processes (*,G) Join (Adds Serial1 to Outgoing Interface Lists)

7) RP sends (S,G) Joins for all known Sources in Group.



The RP processes the (*, G) Join


In the (*, G) entry:
    Serial1 has been added to the (*, G) entry since a (*, G) Join was received on this interface, which is the only active branch of the Shared Tree (RPT).

In the (S, G) entry:
    Serial1 has also been added to the (S, G) OIL because the OILs of all (S, G) entries are always kept in sync with their parent (*, G).

Note: When the (S, G) OILs are synchronized with the OIL of their parent (*, G) entry, a check is made to ensure that the IIF of the (S, G) does not appear in the OIL of the (S, G). Otherwise, a route loop could result.

7) The transitioning of the (*, G) OIL from Null to non-Null triggers the RP to scan its list of (S, G) entries for group G and send (S, G) Joins towards all sources. (This will cause an SPT to be built from each active source back to the RP, which will eventually start the flow of (S, G) traffic to the RP.)
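The OIL synchronization rule and the Null to non-Null trigger can be sketched together in Python. This is an illustrative model, not router code: entries are plain dictionaries and `add_to_star_g_oil` is a hypothetical helper that returns the sources toward which (S, G) Joins would be sent.

```python
def add_to_star_g_oil(table, group, interface):
    """Add an interface to the (*, G) OIL and keep child (S, G) OILs in sync.

    Returns the list of sources to send (S, G) Joins toward, which is
    non-empty only when the (*, G) OIL goes from Null to non-Null.
    """
    star_g = table[("*", group)]
    was_null = not star_g["oil"]
    if interface not in star_g["oil"]:
        star_g["oil"].append(interface)

    for (src, grp), entry in table.items():
        if grp != group or src == "*":
            continue
        # Never place the incoming interface of an (S, G) in its own OIL.
        if interface != entry["iif"] and interface not in entry["oil"]:
            entry["oil"].append(interface)

    if was_null:
        return [src for (src, grp) in table if grp == group and src != "*"]
    return []


G = "224.1.1.1"
S = "171.68.37.121"
table = {
    ("*", G): {"iif": None, "oil": []},
    (S, G): {"iif": "Serial3", "oil": []},
}
first_joins = add_to_star_g_oil(table, G, "Serial1")   # (*, G) Join on Serial1
second_joins = add_to_star_g_oil(table, G, "Serial3")  # (*, G) Join on Serial3
print(first_joins, second_joins, table[("*", G)]["oil"], table[(S, G)]["oil"])
```

The second call illustrates the loop check: Serial3 is the incoming interface of the (S, G) entry, so it is added to the (*, G) OIL only.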


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0, 171.68.28.190) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (S, G) Join (step 8).]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags:
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

rtr-b processes Join, creates (S, G) state (After automatically creating the (*, G) entry)

8) rtr-b sends (S,G) Join toward Source to continue building SPT.



rtr-b processes the (S, G) Join and creates state


A (*, G) entry must be created before the (S, G) entry can be created. Note that:
    The RPF information for the (*, G) entry points up the Shared Tree via Serial1 with the RPF neighbor of 171.68.28.140. (Serial3 of the RP.)
    Because in this example no members have joined the group, the OIL of the (*, G) entry is Null.
    The P flag (Pruned) is set since the OIL is Null.

The (S, G) entry is then created. Pay particular attention to the following:
    The RPF information for this entry points towards the source via Serial0. The RPF neighbor is 171.68.28.190. (Serial0 of rtr-a.)
    The (S, G) OIL initially receives a copy of the (*, G) OIL. (Which is Null.)
    Interface Serial1 (which is the interface that received the (S, G) Join) is added to the (S, G) OIL.
    The T flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is forwarded using this entry, the T flag will be set.

8) Because the OIL of the (S, G) transitioned from Null to non-Null (when rtr-b added Serial1 to the OIL of the newly created entry), a PIM (S, G) Join is sent to rtr-a to continue the process of joining the SPT.
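The way an intermediate router such as rtr-b handles the (S, G) Join can be sketched as follows. This Python is a simplified model under assumptions: `process_sg_join` is a hypothetical function, entries are plain dictionaries, and the RPF lookups are passed in as (interface, neighbor) pairs rather than computed from a unicast routing table.

```python
def process_sg_join(table, source, group, arrived_on, rpf_to_source, rpf_to_rp):
    """Handle an (S, G) Join at a router on the path toward the source.

    Returns the upstream neighbor to which an (S, G) Join should be sent,
    or None when no new Join is triggered.
    """
    # A (*, G) entry is created first if none exists; its OIL is Null here.
    star_g = table.setdefault(("*", group), {
        "iif": rpf_to_rp[0], "rpf_nbr": rpf_to_rp[1],
        "oil": [], "flags": {"S", "P"}})

    sg = table.get((source, group))
    if sg is None:
        # The (S, G) OIL starts as a copy of the (*, G) OIL; the T flag
        # is not set until the first packet is forwarded on this entry.
        sg = {"iif": rpf_to_source[0], "rpf_nbr": rpf_to_source[1],
              "oil": list(star_g["oil"]), "flags": set()}
        table[(source, group)] = sg

    was_null = not sg["oil"]
    if arrived_on not in sg["oil"]:
        sg["oil"].append(arrived_on)

    # A Null to non-Null OIL transition triggers a Join toward the source.
    return sg["rpf_nbr"] if was_null else None


table = {}
args = ("171.68.37.121", "224.1.1.1", "Serial1",
        ("Serial0", "171.68.28.190"), ("Serial1", "171.68.28.140"))
upstream = process_sg_join(table, *args)
repeat = process_sg_join(table, *args)
print(upstream, repeat, table[("171.68.37.121", "224.1.1.1")]["oil"])
```

With the values from the slide, the first Join causes rtr-b to send its own Join to 171.68.28.190 (rtr-a); a repeated Join on the same interface changes nothing and triggers nothing.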


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets (step 9); (*, 224.1.1.1) Mcast Traffic (step 10).]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial0, RPF nbr 171.68.28.191,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: FT
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:04:28/00:01:32

rtr-a processes the (S, G) Join; adds Serial0 to OIL

9) RP begins receiving (S,G) traffic down SPT.
10) RP forwards (S,G) traffic down Shared Tree to receivers.

1st-hop router (rtr-a) processes the (S, G) Join


The (S, G) Join is processed as follows:
    Serial0 is added to the Outgoing interface list (OIL). (This is the interface on which the (S, G) Join arrived.)
    The P flag (Pruned) is cleared since the OIL is no longer Null.

9) As a result of Serial0 being added to the (S, G) OIL, traffic begins to flow down the SPT from the source to the RP.
10) The RP then forwards all incoming (S, G) traffic to the Receivers down the Shared Tree.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0, 171.68.28.190) -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: T
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

Final state in rtr-b after Receivers Join



State in rtr-b after Receivers Join


Pay particular attention to the following:
    Both (*, G) and (S, G) state were created as a result of the (S, G) Join received from the RP.
    The P flag is set in the (*, G) entry since there are no receivers on the Shared Tree at this point in the network.
    The T flag is set in the (S, G) entry, indicating that traffic is flowing down the Shortest-Path Tree.
    The RPF nbr is the IP Address of rtr-a.
    Serial0 is the Incoming interface of the (S, G) entry since this is the RPF interface for source S via rtr-a.
    Serial1 is listed in the Outgoing interface list of the (S, G) entry since the RP joined the SPT via this interface.


PIM SM Registering
Source Registers First
[Diagram: Source 171.68.37.121 -- (E0) rtr-a (S0) -- (S0) rtr-b (S1, 171.68.28.139) -- (S3) rtr-c, the RP (S0, S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:03:14/00:02:46

(171.68.37.121/32, 224.1.1.1), 00:01:15/00:02:46, flags: T
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:00:49/00:02:11

Final state in RP after Receivers Join


State in RP after Receivers Join


In the (*, G) entry:
    Serial1 has been added to the (*, G) entry since a (*, G) Join was received on this interface, which is the only active branch of the Shared Tree (RPT).

In the (S, G) entry:
    Serial1 has also been added to the (S, G) OIL because the OILs of all (S, G) entries are always kept in sync with their parent (*, G).

Note: When the (S, G) OILs are synchronized with the OIL of their parent (*, G) entry, a check is made to ensure that the IIF of the (S, G) does not appear in the OIL of the (S, G). Otherwise, a route loop could result.


PIM SM Register Examples

    Receivers Join Group First
    Source Registers First
    Receivers along the SPT


PIM SM Register Examples


Depending on whether there are any existing Receivers for group G on the Shared Tree (RPT), the RP handles the Register process a little differently. In the following examples we will consider the Register process for the cases when:
    Receivers join group G first;
    The Source Registers first;
    Receivers are located along the SPT.


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SP
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list: Null

(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: T
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

Current state in rtr-b



State in rtr-b with traffic flowing on the SPT


Pay particular attention to the following:
    Both (*, G) and (S, G) state were created as a result of the (S, G) Join received from the RP.
    The P flag is set in the (*, G) entry since there are no receivers on the Shared Tree at this point in the network.
    The T flag is set in the (S, G) entry, indicating that traffic is flowing down the Shortest-Path Tree.
    The RPF nbr is the IP Address of rtr-a.
    Serial0 is the Incoming interface of the (S, G) entry since this is the RPF interface for source S via rtr-a.
    Serial1 is listed in the Outgoing interface list of the (S, G) entry since the RP joined the SPT via this interface.


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:03:14/00:02:46

(171.68.37.121/32, 224.1.1.1), 00:01:15/00:02:46, flags: T
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:00:49/00:02:11

Current state in the RP



State in the RP with traffic flowing on the SPT


Pay particular attention to the following:
    The (*, G) entry only has Serial1 in its outgoing interface list.
    In the (S, G) entry, Serial3 is the Incoming interface since this is the RPF interface for source S via rtr-b.
    Serial1 is listed in the Outgoing interface list of the (S, G) entry because the OIL of the (S, G) entry is always kept in sync with the (*, G) OIL.


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1); Rcvr A attached to rtr-b (E0). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic; IGMP Join (step 1).]

1) Rcvr A wishes to receive group G traffic. Sends IGMP Join for G.


Receivers along the SPT


Step 1: A host directly connected to rtr-b, Receiver A, joins multicast group 224.1.1.1 by sending an IGMP Report.


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1); Rcvr A attached to rtr-b (E0). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:04:28/00:01:32, RP 171.68.28.140, flags: SC
  Incoming interface: Serial1, RPF nbr 171.68.28.140,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:00:30/00:02:30

(171.68.37.121/32, 224.1.1.1), 00:04:28/00:01:32, flags: CT
  Incoming interface: Serial0, RPF nbr 171.68.28.190
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32
    Ethernet0, Forward/Sparse, 00:00:30/00:02:30

(Added Interfaces: the Ethernet0 entries)

State in rtr-b after Rcvr A joins group



Receivers along the SPT


As a result of the IGMP Report sent by Receiver A for group 224.1.1.1, rtr-b updates its state for this group as follows. Ethernet0 is added to the OIL of the (*, G) entry. This is done to permit any (*, 224.1.1.1) traffic flowing down the Shared Tree to be forwarded to Receiver A. Next, the OILs of all child (S, G) entries are synchronized with the OIL change just made to the OIL of the (*, G). This results in Ethernet0 being added to the OIL of the (171.68.37.121/32, 224.1.1.1) entry. This permits traffic from this source to be picked off as it flows along the SPT through rtr-b on its way to the RP. (Note that this traffic does not flow to the RP and then back out the same interface to reach rtr-b. This is a common misperception.)


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1); Rcvr A attached to rtr-b (E0). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic; (*, G) Join.]

rtr-b triggers a (*,G) Join to join the Shared Tree


Receivers along the SPT


Step 2: Because the OIL of the (*, G) entry in rtr-b transitioned from NULL to non-Null (Ethernet0 is now in the (*, G) OIL), a (*, G) Join message is triggered. This message is sent up the Shared Tree so that rtr-b will be placed on a branch of the Shared Tree.


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1); Rcvr A attached to rtr-b (E0). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic.]

(*, 224.1.1.1), 00:09:21/00:02:38, RP 171.68.28.140, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:03:14/00:02:46
    Serial3, Forward/Sparse, 00:00:10/00:02:50

(171.68.37.121/32, 224.1.1.1), 00:01:15/00:02:46, flags: T
  Incoming interface: Serial3, RPF nbr 171.68.28.139,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:00:49/00:02:11

State in RP after rtr-b joins Shared Tree



Receivers along the SPT


When the RP receives the (*, G) Join sent by rtr-b, it adds Serial3 to the (*, G) OIL. Next, the RP synchronizes the OILs of all (S, G) entries by adding Serial3 to each (S, G) OIL. However in this case, Serial3 is the Incoming interface for the (171.68.37.121/32, 224.1.1.1) entry and is therefore not added to the OIL. (If it were, Serial3 would appear in both the incoming and outgoing interface list which could cause a route loop.)


PIM SM Registering
Receivers along the SPT
[Diagram: Source 171.68.37.121 -- rtr-a -- (S0) rtr-b (S1) -- (S3) rtr-c, the RP (S1); Rcvr A attached to rtr-b (E0). Labels: (171.68.37.121, 224.1.1.1) Mcast Packets; (*, 224.1.1.1) Mcast Traffic; step 3 at Rcvr A.]

3) Group G traffic begins to flow to Rcvr A.
(Note: 171.68.37.121 traffic doesn't flow to RP then back down to rtr-b)

Receivers along the SPT


Step 3: Traffic from source 171.68.37.121 is now being picked off by rtr-b and forwarded out Ethernet0 as the traffic flows down the SPT to the RP. Again, it is important to note that this source traffic does not flow to the RP and then turn around and come back out on the same interface on which it arrived. (Refer to the state in the RP shown on the previous page.)
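The forwarding behavior in this step can be checked with a short Python sketch that applies an RPF check and then replicates the packet to the (S, G) OIL. The `forward` function and the dictionary layout are hypothetical; the interface names and OIL contents are taken from the rtr-b and RP state shown on the preceding slides.

```python
def forward(entry, arrived_on):
    """Return the interfaces a packet is sent out of, using one (S, G) entry."""
    if arrived_on != entry["iif"]:
        return []  # RPF check failed: the packet is dropped
    # A packet is never sent back out of the interface it arrived on.
    return [ifname for ifname in entry["oil"] if ifname != arrived_on]


# (S, G) state in rtr-b after Rcvr A joins the group.
rtr_b_sg = {"iif": "Serial0", "oil": ["Serial1", "Ethernet0"]}
# (S, G) state in the RP after rtr-b joins the Shared Tree.
rp_sg = {"iif": "Serial3", "oil": ["Serial1"]}

print(forward(rtr_b_sg, "Serial0"))  # ['Serial1', 'Ethernet0']
print(forward(rp_sg, "Serial3"))     # ['Serial1']
```

rtr-b delivers the traffic to Rcvr A directly from the SPT, while the RP forwards it only out Serial1 and never back toward rtr-b on Serial3.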


PIM-SM Protocol Mechanics

    PIM Neighbor Discovery
    PIM State
    PIM SM Joining
    PIM SM Registering
    PIM SM SPT-Switchover
    PIM SM Pruning
    PIM SM State Maintenance

PIM SM SPT-Switchover
SPT Thresholds may be set for any Group
    Access Lists may be used to specify which Groups
    Default Threshold = 0 kbps (i.e. immediately join SPT)
    Threshold = infinity means never join SPT.

Threshold triggers Join of Source Tree
    Sends an (S,G) Join up SPT for next S in G packet received.

Pros
    Reduces Network Latency

Cons
    More (S,G) state must be stored in the routers.
Module5. ppt
1998 2001, Cisco Systems, Inc. All rights reserved.

8/10/2001 2:37 PM

80 80

SPT Thresholds
In PIM Sparse mode, SPT Thresholds may be configured to control when a router switches to the Shortest-Path Tree (SPT). SPT Thresholds are specified in Kbps and can be combined with an access list to specify the group(s) to which the threshold applies. The default SPT-Threshold is 0 Kbps, which means that any and all sources are immediately switched to the Shortest-Path Tree. If an SPT-Threshold of Infinity is specified for a group, its sources are not switched to the Shortest-Path Tree (SPT) and remain on the Shared Tree.

Exceeding the Threshold


When the group's SPT-Threshold is exceeded in a last-hop router, the next packet received for the group causes an (S, G) Join to be sent toward the source of that packet. This builds a Shortest-Path Tree from the source S to the last-hop router.

PROS
By switching to the Shortest-Path Tree (SPT), the (usually) optimal path is used to deliver the multicast traffic. Depending on the location of the source in relation to the RP, this switch to the SPT can reduce network latency substantially.

CONS
In networks with large numbers of senders (remember that most multicast applications, such as the IP/TV Client, send RTCP multicast packets in the background and are therefore senders), an increased amount of state must be kept in the routers. When latency is not an issue, an Infinity threshold may be used to force certain groups to remain on the Shared Tree.
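As a rough illustration of how a per-group threshold might be looked up and applied, the sketch below models the access list as an ordered list of (group prefix, threshold) pairs. The function names, the first-match semantics, and the 239.0.0.0/8 rule are all assumptions made for the example and are not taken from IOS. Only the 0 Kbps default and the meaning of Infinity come from the notes.

```python
import ipaddress
import math


def spt_threshold_for(group, rules, default_kbps=0.0):
    """Return the SPT-Threshold (Kbps) for `group`.

    `rules` is a hypothetical ordered list of (prefix, threshold_kbps)
    pairs standing in for an access list; the first match wins.
    """
    address = ipaddress.ip_address(group)
    for prefix, threshold_kbps in rules:
        if address in ipaddress.ip_network(prefix):
            return threshold_kbps
    return default_kbps  # default of 0 Kbps: switch immediately


def exceeds_threshold(aggregate_kbps, threshold_kbps):
    """True when the aggregate group rate is above the threshold."""
    return aggregate_kbps > threshold_kbps


# Hypothetical policy: keep 239.0.0.0/8 on the Shared Tree forever.
rules = [("239.0.0.0/8", math.inf)]
default_threshold = spt_threshold_for("224.1.1.1", rules)
scoped_threshold = spt_threshold_for("239.1.2.3", rules)
```

Representing Infinity as `math.inf` means no finite rate can exceed it, so groups matched by that rule never leave the Shared Tree in this model.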

PIM SM SPT-Switchover
SPT-Switchover Mechanism
Once each second
  Compute new (*, G) traffic rate
  If threshold exceeded, set J flag in (*, G)
For each (Si, G) packet received:
  If J flag set in (*, G)
    Join SPT for (Si, G)
    Mark (Si, G) entry with J flag
    Clear J flag in (*, G)

SPT-Threshold Myth
This is a frequently misunderstood mechanism. Many people think that the traffic rates of the individual sources in the group are monitored and compared against the SPT-Threshold. THIS IS NOT THE CASE. Instead, the total aggregate rate of Group traffic flowing down the Shared Tree (RPT) is calculated once per second. If this aggregate rate exceeds the SPT-Threshold, the next Group packet received causes its source to be switched to the Shortest-Path Tree (SPT).

SPT-Switchover Mechanism
Once each second, the aggregate (*, G) traffic rate is computed and checked against the SPT-Threshold. If the aggregate rate of all group traffic flowing down the Shared Tree (RPT) exceeds the threshold, the J flag is set in the (*, G) entry. As each multicast packet is received on the Shared Tree, the J flag is checked in the (*, G) entry. If the J flag is set, a new (S, G) entry is created for the source of the packet, and an (S, G) Join is sent toward the source in order to join the SPT. The J flag is set in the (S, G) entry to denote that this entry was created as a result of the SPT-Threshold switchover, and the J flag in the (*, G) entry is reset. (It will be set again within one second if the aggregate rate on the Shared Tree is still over the SPT-Threshold.) This mechanism can sometimes result in low-rate sources being switched to the SPT erroneously. However, the RPT-switchback mechanism corrects this situation, and eventually only the high-rate sources are received via SPTs while the low-rate sources remain on the Shared Tree.
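The mechanism can be modeled roughly as follows. This is a sketch under stated assumptions rather than the IOS implementation: the `GroupState` class, its method names, the source addresses, and the choice of 1 Kbps = 1000 bits per second are all invented for the illustration. The example deliberately reproduces the "myth" case: a high-rate source pushes the aggregate over the threshold, but the low-rate source whose packet arrives next is the one that gets switched.

```python
class GroupState:
    """Hypothetical last-hop router state for a single group G."""

    def __init__(self, threshold_kbps):
        self.threshold_kbps = threshold_kbps
        self.j_flag = False           # J flag on the (*, G) entry
        self.bits_in_interval = 0     # Shared Tree traffic in the current second
        self.sg_flags = {}            # source -> flags of its (S, G) entry

    def one_second_tick(self):
        """Once each second: compute the aggregate rate, maybe set the J flag."""
        rate_kbps = self.bits_in_interval / 1000
        if rate_kbps > self.threshold_kbps:
            self.j_flag = True
        self.bits_in_interval = 0
        return rate_kbps

    def shared_tree_packet(self, source, size_bytes):
        """For each packet received on the Shared Tree: check the J flag."""
        self.bits_in_interval += size_bytes * 8
        if not self.j_flag:
            return None
        self.sg_flags[source] = {"J"}  # mark the new (S, G) entry with J
        self.j_flag = False            # clear J in (*, G)
        return f"(S, G) Join toward {source}"


state = GroupState(threshold_kbps=64)
for _ in range(10):
    state.shared_tree_packet("10.0.0.1", 1000)      # high-rate source, 80 Kbps
state.one_second_tick()                              # aggregate > 64 Kbps: J set
action = state.shared_tree_packet("10.0.0.2", 100)  # low-rate source arrives next
```

In this run the Join is sent toward 10.0.0.2 even though 10.0.0.1 generated nearly all of the traffic, which is exactly the erroneous switch that the switchback mechanism later corrects.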

PIM SM SPT-Switchover
[Figure: example topology. rtr-c: S0 toward the RP (10.1.5.1), S1 (10.1.4.1) to rtr-a, S2 to rtr-d. rtr-a: S0 (10.1.4.2) to rtr-c, S1 toward source Si, E0 (10.1.2.1) to rtr-b. rtr-b: E0 (10.1.2.2) to rtr-a, E1 to Rcvr A. rtr-d: S0 to rtr-c, E0 to Rcvr B. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

State in rtr-c before switch

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.5.1,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:01:43/00:02:11
    Serial2, Forward/Sparse, 00:00:32/00:02:28

PIM-SM SPT-Switchover Example


Receivers A & B have joined multicast group 224.1.1.1, which has resulted in traffic flowing down the Shared Tree as shown by the solid arrows. The state in rtr-c prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Serial1 and Serial2 as a result of the (*, G) Joins that were sent up the Shared Tree by rtr-a and rtr-d, respectively.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

State in rtr-d before switch

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC
  Incoming interface: Serial0, RPF nbr 10.1.4.8,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11

PIM-SM SPT-Switchover Example


The state in rtr-d prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Ethernet0 as a result of the IGMP Reports for group 224.1.1.1 that are sent by Receiver B.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

State in rtr-a before switch

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.4.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11

PIM-SM SPT-Switchover Example


The state in rtr-a prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Ethernet0 as a result of the (*, G) Joins that were sent up the Shared Tree by rtr-b.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

State in rtr-b before switch

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

PIM-SM SPT-Switchover Example


The state in rtr-b prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Ethernet0. The OIL of the (*, G) entry contains Ethernet1 as a result of the IGMP Reports for group 224.1.1.1 that are sent by Receiver A.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated at rtr-b with step 1, "Group G rate > Threshold". Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SCJ
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

1) Group G rate exceeds SPT Threshold at rtr-b.
2) Set J Flag in (*, G) and wait for next (Si,G) packet.

PIM-SM SPT-Switchover Example


Step 1: The total amount of all traffic flowing down the Shared Tree begins to exceed the SPT-Threshold configured at rtr-b. Step 2: As a result, rtr-b sets the J flag in the (*, G) entry to denote that the rate is above the SPT-Threshold for this group.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 3 on the link between rtr-a and rtr-b. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SCJ
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

3) (Si,G) packet arrives down Shared tree.
4) Clear J Flag in the (*,G) & create (Si,G) state.

PIM-SM SPT-Switchover Example


Step 3: The very next packet to arrive via the Shared Tree happens to be from source Si. Because there is a member directly connected to this router (denoted by the C flag) and the traffic rate is above the SPT-Threshold (denoted by the J flag), rtr-b initiates a switch to the SPT for (Si, G) traffic. Step 4: The J flag in the (*, G) entry is first cleared and a new traffic rate measurement interval (1 second) is started. Next, (Si, G) state is created for source Si sending to group G.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

New State in rtr-b

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

(171.68.37.121/32, 224.1.1.1), 00:00:28/00:02:51, flags: CJT
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:00:28/00:02:32

J Flag indicates (S, G) created by exceeding the SPT-threshold

PIM-SM SPT-Switchover Example


The (171.68.37.121/32, 224.1.1.1) entry shown above is created as follows: To denote that this entry was created as a result of the SPT-Switchover mechanism, the J flag is set on the (S, G) entry. (With the J flag set, rtr-b monitors the rate of the (S, G) traffic; if the rate drops below the SPT-Threshold for over one minute, rtr-b will attempt to switch this traffic flow back to the Shared Tree.) The RPF information is calculated in the direction of source Si. This results in an Incoming interface of Ethernet0 and an RPF neighbor address of 10.1.2.1. (Note that the RPF information for the (S, G) entry is the same as that of the (*, G) entry. This indicates that the Shared Tree and the SPT follow the same path at this point.) The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry and then removing the IIF from this list to prevent a possible route loop. This results in an (S, G) OIL containing only Ethernet1.
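The construction of the new (S, G) entry can be sketched as below. The dictionary layout and the `build_sg_entry` helper are hypothetical; the interface names and addresses are the rtr-b values from the state shown above.

```python
def build_sg_entry(star_g, sg_rpf):
    """Build a simplified (S, G) entry from the (*, G) entry.

    `sg_rpf` is an (incoming interface, RPF neighbor) pair computed
    toward the source. The OIL is copied from (*, G) minus the IIF.
    """
    incoming, neighbor = sg_rpf
    return {
        "incoming": incoming,
        "rpf_nbr": neighbor,
        "flags": {"J"},  # created by the SPT-Threshold switchover
        "oil": [ifc for ifc in star_g["oil"] if ifc != incoming],
        # True when the Shared Tree and the SPT share this hop.
        "same_path_as_rpt": (star_g["incoming"], star_g["rpf_nbr"]) == sg_rpf,
    }


rtr_b_star_g = {"incoming": "Ethernet0", "rpf_nbr": "10.1.2.1", "oil": ["Ethernet1"]}
rtr_b_sg = build_sg_entry(rtr_b_star_g, ("Ethernet0", "10.1.2.1"))
```

For rtr-b the resulting OIL contains only Ethernet1, and `same_path_as_rpt` is true because both entries point at Ethernet0 and 10.1.2.1.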

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 5, an (Si,G) Join sent from rtr-b to rtr-a. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

5) Send (Si,G) Join towards Si.

PIM-SM SPT-Switchover Example


Step 5: Once state has been created for (Si, G), an (S, G) Join is sent toward source Si to build a branch of the SPT to rtr-b. These (Si, G) Joins will continue to be sent periodically (once a minute) as long as the (Si, G) entry is not Pruned (i.e. does not have a Null OIL).

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

New state in rtr-a

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.4.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11

(171.68.37.121/32, 224.1.1.1), 00:13:28/00:02:53, flags: T
  Incoming interface: Serial1, RPF nbr 10.1.9.2
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:13:25/00:02:30

PIM-SM SPT-Switchover Example


When the (Si, G) Join is received by rtr-a, the (171.68.37.121/32, 224.1.1.1) entry shown above is created as follows: The RPF information is calculated in the direction of source Si. This results in an Incoming interface of Serial1 and an RPF neighbor address of 10.1.9.2. (Note that the RPF information for the (S, G) entry is not the same as that of the (*, G) entry. This indicates that the paths of the Shared Tree and the SPT diverge at this point.) The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry and then removing the IIF from this list to prevent a possible route loop. This results in an (S, G) OIL containing only Ethernet0.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 6, an (Si,G) Join sent from rtr-a toward source Si. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

6) rtr-a forwards (Si,G) Join toward Si.

PIM-SM SPT-Switchover Example


Step 6: When the (Si, G) state is created at rtr-a, an (Si, G) Join is sent toward source Si. These (Si, G) Joins will continue to be sent periodically (once a minute) as long as the (Si, G) entry is not Pruned (i.e. does not have a Null OIL).

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 7, (Si,G) traffic arriving at rtr-a from source Si. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

6) rtr-a forwards (Si,G) Join toward Si.
7) (Si, G) traffic begins flowing down SPT tree.

PIM-SM SPT-Switchover Example


Step 7: When the (Si, G) Joins reach the first-hop router directly connected to source Si, a complete branch of the SPT has been built (shown by the dashed arrows). This permits (Si, G) traffic to flow via the SPT to rtr-b and receiver A.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with an (Si,G)RP-bit Prune sent from rtr-a to rtr-c. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

6) rtr-a forwards (Si,G) Join toward Si.
7) (Si, G) traffic begins flowing down SPT tree.
8) SPT & RPT diverge, triggering (Si,G)RP-bit Prunes toward RP.

PIM-SM SPT-Switchover Example


Step 8: Because the paths of the Shared Tree and the SPT diverge at rtr-a (note the difference in RPF information on the previous page), rtr-a begins sending (Si, G)RP-bit Prune messages up the Shared Tree to stop the flow of redundant (Si, G) traffic down the Shared Tree. (Note: This step is delayed until traffic begins arriving via the SPT, which is denoted by the T flag being set in the (Si, G) entry in the mroute table.)
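The two conditions in Step 8 (the trees diverge at this router, and traffic is already arriving via the SPT) can be expressed as a small predicate. The function name and the tuple representation of RPF information are assumptions for this sketch; the rtr-a and rtr-b values are taken from the state shown on the earlier pages.

```python
def should_send_rp_bit_prune(star_g_rpf, sg_rpf, sg_flags):
    """Decide whether to send an (S, G)RP-bit Prune up the Shared Tree.

    Each RPF argument is an (incoming interface, RPF neighbor) pair.
    The Prune is sent only if the Shared Tree and the SPT diverge here
    and the T flag shows that traffic is arriving via the SPT.
    """
    diverged = star_g_rpf != sg_rpf
    return diverged and "T" in sg_flags


# rtr-a: the (*, G) and (S, G) entries have different RPF information.
rtr_a_before_traffic = should_send_rp_bit_prune(
    ("Serial0", "10.1.4.1"), ("Serial1", "10.1.9.2"), set()
)
rtr_a_after_traffic = should_send_rp_bit_prune(
    ("Serial0", "10.1.4.1"), ("Serial1", "10.1.9.2"), {"T"}
)
# rtr-b: both entries share the same RPF information, so no Prune is sent.
rtr_b = should_send_rp_bit_prune(
    ("Ethernet0", "10.1.2.1"), ("Ethernet0", "10.1.2.1"), {"C", "J", "T"}
)
```

Waiting for the T flag avoids pruning the Shared Tree branch before the SPT is actually delivering packets, which would otherwise interrupt the flow.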

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B). Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

State in rtr-c after receiving the (Si, G) RP-bit Prune

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.5.1,
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:01:43/00:02:11
    Serial2, Forward/Sparse, 00:00:32/00:02:28

(171.68.37.121/32, 224.1.1.1), 00:13:28/00:02:53, flags: R
  Incoming interface: Serial0, RPF nbr 10.1.5.1
  Outgoing interface list:
    Serial2, Forward/Sparse, 00:00:32/00:02:28

PIM-SM SPT-Switchover Example


When the (Si, G)RP-bit Prune reaches rtr-c, the (171.68.37.121/32, 224.1.1.1) entry shown above is created as follows: Because this (S, G) entry was created as a result of the receipt of an (S, G)RP-bit Prune, the R bit is set to denote that this forwarding state applies to traffic flowing down the Shared Tree and not the Source Tree. Because the R bit is set, the RPF information is calculated in the direction of the RP instead of source Si. (Remember, this entry applies to (Si, G) traffic flowing down the Shared Tree, and therefore the RPF information must point up the Shared Tree.) This results in an Incoming interface of Serial0 and an RPF neighbor address of 10.1.5.1. The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry minus the interface on which the (Si, G)RP-bit Prune was received. Next, the IIF is removed from the OIL to prevent a possible route loop. These steps result in an (S, G) OIL containing only Serial2.

At this point, (Si, G) traffic flowing down the Shared Tree will be forwarded using the (Si, G) entry. (Si, G) traffic arriving at rtr-c will RPF correctly because the RPF information in the (Si, G) entry points up the Shared Tree (as a result of the R bit), and it will then be forwarded out all interfaces in the (Si, G) OIL. In this case, only Serial2 remains in the (Si, G) OIL, and therefore (Si, G) traffic will be sent to rtr-d but not rtr-a. This successfully prunes the redundant (Si, G) traffic from the branch of the Shared Tree between rtr-c and rtr-a.
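A minimal sketch of how the R-flagged entry is derived follows. The `build_rp_bit_entry` helper and the dictionary layout are hypothetical; the interfaces are the rtr-c values from the state shown above, with the Prune arriving on Serial1 (the link to rtr-a).

```python
def build_rp_bit_entry(star_g, prune_interface):
    """Build the (S, G) entry created by an (S, G)RP-bit Prune.

    The RPF information is inherited from (*, G) because the R bit
    makes the entry point up the Shared Tree. The OIL is the (*, G)
    OIL minus the interface the Prune arrived on, minus the IIF.
    """
    oil = [
        ifc
        for ifc in star_g["oil"]
        if ifc != prune_interface and ifc != star_g["incoming"]
    ]
    return {
        "flags": {"R"},
        "incoming": star_g["incoming"],
        "rpf_nbr": star_g["rpf_nbr"],
        "oil": oil,
    }


rtr_c_star_g = {
    "incoming": "Serial0",
    "rpf_nbr": "10.1.5.1",
    "oil": ["Serial1", "Serial2"],
}
rtr_c_sg = build_rp_bit_entry(rtr_c_star_g, prune_interface="Serial1")
```

The (*, G) entry itself is left unchanged, so traffic from other sources in the group continues to flow out both Serial1 and Serial2.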

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 9 on the link between rtr-c S1 and rtr-a S0. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

9) Unnecessary (Si, G) traffic is pruned from the Shared tree.

PIM-SM SPT-Switchover Example


Step 9: At this point, the redundant (Si, G) traffic is pruned from the Shared Tree branch from rtr-c to rtr-a. (Si, G) traffic is reaching receiver A via the SPT through rtr-a and rtr-b.

PIM SM SPT-Switchover
[Figure: example topology (rtr-c toward the RP 10.1.5.1; rtr-a toward source Si; rtr-b with Rcvr A; rtr-d with Rcvr B), annotated with step 10 on the branch from rtr-c toward rtr-d. Legend: (Si, G) Traffic Flow, Shared (RPT) Tree, SPT Tree.]

9) Unnecessary (Si, G) traffic is pruned from the Shared tree.
10) (Si, G) traffic still flows via other branches of the Shared tree.

PIM-SM SPT-Switchover Example


Step 10: (Si, G) traffic is still reaching receiver B via a branch of the Shared Tree through rtr-c and rtr-d. This is because the (Si, G) state in rtr-c still has Serial2 in its OIL.

PIM SM SPT-Switchover
Shared Tree Switchback Mechanism
Once each minute
  If J flag set in (Si, G) entry
    Compute new (Si, G) traffic rate
    If rate < SPT-threshold
      Rejoin (*, G) Tree for (Si, G) traffic
      Send (Si, G) prune up SPT toward Si
      Delete (Si, G) entry

Shared Tree Switchback


The Shared Tree Switchback (for lack of a better term) mechanism is used to switch sources back to the Shared Tree when their traffic rate falls below the SPT-Threshold.

Switchback Algorithm
The Switchback mechanism runs once a minute. (This helps prevent sources from cycling between the Shared Tree and the Shortest-Path Tree too rapidly.) For each (Si, G) entry in the Multicast Routing Table that has the J flag set, the mechanism computes the traffic rate for source Si. If the rate has fallen below the SPT-Threshold, the last-hop router initiates a switchback to the Shared Tree by:
Sending a Join/Prune message that contains a (*, G) Join without an (Si, G)RP-bit Prune up the Shared Tree (RPT). (This causes the (Si, G) Prune state along the RPT to be deleted, which permits (Si, G) traffic to begin flowing down the RPT again.)
Deleting its (Si, G) entry in the Multicast Routing Table.
Sending an (Si, G) Prune up the Shortest-Path Tree (SPT) to stop traffic from flowing down the SPT.
Note that this Switchback Algorithm is broken in older versions of IOS.
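One pass of the switchback check might look like the following sketch. The `switchback_pass` function, the action strings, the source addresses, and the rates are all invented for illustration; only the once-a-minute per-source rate check on J-flagged entries and the three resulting actions come from the notes.

```python
def switchback_pass(sg_entries, rates_kbps, threshold_kbps):
    """Run one once-a-minute switchback pass at a last-hop router.

    `sg_entries` maps source -> set of flags; `rates_kbps` maps
    source -> measured (S, G) rate. Entries that switch back are
    deleted in place. Returns the messages that would be sent.
    """
    actions = []
    for source in list(sg_entries):
        if "J" not in sg_entries[source]:
            continue  # only entries created by the SPT-Threshold are checked
        if rates_kbps.get(source, 0.0) < threshold_kbps:
            actions.append(f"(*, G) Join without ({source}, G)RP-bit Prune up the RPT")
            del sg_entries[source]
            actions.append(f"({source}, G) Prune up the SPT")
    return actions


entries = {"10.0.0.1": {"J", "T"}, "10.0.0.2": {"J", "T"}, "10.0.0.3": {"T"}}
rates = {"10.0.0.1": 500.0, "10.0.0.2": 2.0, "10.0.0.3": 1.0}
actions = switchback_pass(entries, rates, threshold_kbps=64)
```

Unlike the switchover check, this pass measures each source individually, which is why it can undo an erroneous switch of a low-rate source. Here only 10.0.0.2 is returned to the Shared Tree; 10.0.0.3 is also slow but lacks the J flag, so it is left alone.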

PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM SM Pruning

IGMP group times out / last host sends Leave
  Interface removed from all (*,G) & (S,G) entries
  IF all interfaces in oilist for (*,G) are pruned; THEN send Prune up shared tree toward RP
  Any (S, G) state allowed to time-out
Each router along path "prunes" interface
  IF all interfaces in oilist for (*,G) are pruned; THEN send Prune up shared tree toward RP
  Any (S, G) state allowed to time-out

SM Pruning
A locally connected host sends an IGMP Leave (or IGMP state times out in the router) for group G. The interface is removed from the (*, G) entry and all (S, G) entries in the Multicast Routing Table. If the (*, G) Outgoing interface list is now Null, the router sends a (*, G) Prune up the Shared Tree (RPT) toward the RP. Any remaining (S, G) entries are allowed to time out and be deleted from the Multicast Routing Table.

When the routers up the Shared Tree receive the (*, G) Prune, they remove the interface on which the Prune was received from their (*, G) Outgoing interface list. If the (*, G) Outgoing interface list becomes Null as a result, the router forwards a (*, G) Prune up the Shared Tree (RPT) toward the RP. Any remaining (S, G) entries are allowed to time out and be deleted from the Multicast Routing Table.
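The rule shared by the leaf router and every router upstream can be sketched as one helper. The function name and the set-based OIL representation are assumptions, and the interface names are used purely as sample values.

```python
def remove_interface(star_g_oil, sg_oils, interface):
    """Remove `interface` from the (*, G) OIL and from every (S, G) OIL.

    Returns True when the (*, G) OIL has become Null, meaning a
    (*, G) Prune should be sent up the Shared Tree toward the RP.
    (S, G) entries are not deleted here; they are left to time out.
    """
    star_g_oil.discard(interface)
    for oil in sg_oils.values():
        oil.discard(interface)
    return len(star_g_oil) == 0


# Leaf router with a single receiver interface: the OIL goes Null.
leaf_star_g = {"Ethernet1"}
leaf_sg = {"171.68.37.121": {"Ethernet1"}}
leaf_sends_prune = remove_interface(leaf_star_g, leaf_sg, "Ethernet1")

# Upstream router with another branch still joined: pruning stops here.
upstream_star_g = {"Serial1", "Serial2"}
upstream_sends_prune = remove_interface(upstream_star_g, {}, "Serial1")
```

The second call shows why pruning does not always propagate all the way to the RP: a router with any remaining outgoing interface simply stops forwarding on the pruned one.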

PIM SM Pruning
Shared Tree Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

State in rtr-b before Pruning

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: SC
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

State in rtr-b before Pruning


Pay particular attention to the following: Traffic is flowing down the Shared Tree. (Denoted by the existence of only the (*, G) entry.) The Incoming interface is Ethernet0. The Outgoing interface list contains Ethernet1. The C flag is set in the (*, G) entry, which denotes that there is a locally connected host for this group (Rcvr A).

PIM SM Pruning
Shared Tree Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

State in rtr-a before Pruning

(*, 224.1.1.1), 00:01:43/00:02:13, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.4.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11

State in rtr-a before Pruning RPT Case


Pay particular attention to the following: Traffic is flowing down the Shared Tree. (Denoted by the existence of only the (*, G) entry.) The Incoming interface is Serial0. The Outgoing interface list contains Ethernet0.

PIM SM Pruning
Shared Tree Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A), annotated with step 1 (IGMP Leave from Rcvr A), step 2 (at E1), and step 3 ((*,G) Prune from rtr-b to rtr-a). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

1) rtr-b is a Leaf router. Last host, Rcvr A, leaves group G.
2) rtr-b removes E1 from (*,G) and any (Si,G) oilists.
3) rtr-b (*,G) oilist now empty; sends (*,G) Prune toward RP.

PIM SM Pruning Example RPT Case


1) The last-hop or Leaf router (rtr-b) receives an IGMP Group Leave message from Rcvr A for group G. After performing the normal IGMP Leave processing and finding that Rcvr A was the last host to leave, the IGMP state for group G on interface E1 is deleted.
2) This causes interface E1 to be removed from the Outgoing interface list of the (*, G) entry and any (Si, G) entries (in this case there are none) in the Multicast Routing Table. Because E1 was the only interface in the (*, G) entry, its outgoing interface list becomes Null.
3) Because the (*, G) Outgoing interface list is now Null, a (*, G) Prune is sent up the Shared Tree (RPT) via E0 toward the RP.

PIM SM Pruning
Shared Tree Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1), annotated with step 4 (at rtr-a E0), a (*,G) Prune sent from rtr-a toward the RP, and step 6 (at rtr-a S0); pruned branches are marked with an X. Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

4) rtr-a receives Prune; removes E0 from (*,G) oilist. (After the 3 second Multi-access Network Prune delay.)
5) rtr-a (*,G) oilist now empty; send (*,G) Prune toward RP.
6) Pruning continues back toward RP.

PIM SM Pruning Example RPT Case (cont.)


4) The (*, G) Prune is received by rtr-a, which causes interface E0 to be removed from the Outgoing interface list of the (*, G) entry in the Multicast Routing Table. (Note: rtr-a delayed pruning E0 from the (*, G) entry for 3 seconds because this is a Multi-Access network and it needed to wait for a possible overriding Join from another PIM neighbor. Since none was received, the interface was pruned.)
5) Because the (*, G) Outgoing interface list is now Null, a (*, G) Prune is forwarded up the Shared Tree (RPT) via S0 toward the RP.
6) This pruning continues back toward the RP or until a router is reached whose (*, G) Outgoing interface list doesn't go to Null as a result of the Prune.
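The 3-second delay on multi-access networks can be modeled with simple timestamps. This is only a sketch: the `handle_prune` function, the use of floating-point seconds, and the list of override Join arrival times are assumptions introduced for the example.

```python
PRUNE_DELAY_SECONDS = 3  # multi-access network prune delay from the notes


def handle_prune(oil, interface, prune_time, override_join_times):
    """Process a Prune received on a multi-access interface.

    The interface is removed from `oil` only if no overriding Join
    from another PIM neighbor arrives within the delay window.
    Returns True when the OIL is Null afterwards, meaning a Prune
    should be forwarded upstream.
    """
    deadline = prune_time + PRUNE_DELAY_SECONDS
    overridden = any(prune_time <= t <= deadline for t in override_join_times)
    if not overridden:
        oil.discard(interface)
    return len(oil) == 0


# rtr-a in the example: no other neighbor overrides, so E0 is pruned.
rtr_a_oil = {"Ethernet0"}
forward_prune = handle_prune(rtr_a_oil, "Ethernet0", 0.0, [])

# Hypothetical variant: another router on the LAN sends a Join after 1.5 s.
shared_lan_oil = {"Ethernet0"}
forward_prune_overridden = handle_prune(shared_lan_oil, "Ethernet0", 0.0, [1.5])
```

The delay exists because one Prune on a shared LAN says nothing about the other downstream routers on that segment, which may still need the traffic.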

PIM SM Pruning
Source (SPT) Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1 toward source Si, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

State in rtr-b before Pruning

(*, 224.1.1.1), 00:01:43/00:02:59, RP 10.1.5.1, flags: SC
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11

(171.68.37.121/32, 224.1.1.1), 00:01:05/00:01:55, flags: CJT
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:05/00:02:55

State in rtr-b before Pruning SPT Case


Pay particular attention to the following: Both (*, G) and (S, G) entries exist. The J flag is set in the (S, G) entry. This indicates that the (S, G) state was created as a result of the SPT-Threshold being exceeded. The T flag is set in the (S, G) entry. This indicates that (S, G) traffic is being successfully received via the Shortest-Path Tree (SPT). The Incoming interface is the same for the (*, G) and the (S, G) entries. This indicates that the Shared Tree and the Shortest-Path Tree are the same at this point.

PIM SM Pruning
Source (SPT) Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1 toward source Si, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

State in rtr-a before Pruning

(*, 224.1.1.1), 00:01:43/00:02:59, RP 10.1.5.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.4.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:43/00:02:11

(171.68.37.121/32, 224.1.1.1), 00:01:05/00:01:55, flags: T
  Incoming interface: Serial1, RPF nbr 10.1.9.2
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:05/00:02:55

State in rtr-a before Pruning SPT Case


Pay particular attention to the following: Both (*, G) and (S, G) entries exist. The T flag is set in the (S, G) entry. This indicates that (S, G) traffic is being successfully received via the Shortest-Path Tree (SPT). The Incoming interface is different for the (*, G) and the (S, G) entries. This indicates that the Shared Tree and the Shortest-Path Tree diverge at this point.

PIM SM Pruning
Source (SPT) Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1 toward source Si, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A), annotated with step 1 (IGMP Leave from Rcvr A), step 2 (at E1), and step 3 ((*,G) Prune from rtr-b to rtr-a). Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

1) rtr-b is a Leaf router. Last host, Rcvr A, leaves group G.
2) rtr-b removes E1 from (*,G) and any (Si,G) oilists.
3) rtr-b (*,G) oilist now empty; sends (*,G) Prune toward RP.

PIM SM Pruning Example SPT Case


1) The last-hop or Leaf router (rtr-b) receives an IGMP Group Leave message from Rcvr A for group G. After performing the normal IGMP Leave processing and finding that Rcvr A was the last host to leave, the IGMP state for group G on interface E1 is deleted.
2) This causes interface E1 to be removed from the Outgoing interface list of the (*, G) entry and any (Si, G) entries in the Multicast Routing Table. Because E1 was the only interface in the (*, G) and the (Si, G) entries, their outgoing interface lists become Null.
3) Because the (*, G) Outgoing interface list is now Null, a (*, G) Prune is sent up the Shared Tree (RPT) via E0 toward the RP.

PIM SM Pruning
Source (SPT) Case
[Figure: rtr-a (S0 10.1.4.2 toward the RP 10.1.5.1, S1 toward source Si, E0 10.1.2.1) and rtr-b (E0 10.1.2.2, E1 to Rcvr A), annotated with the Periodic (S, G) Join from rtr-b to rtr-a. Legend: (Si, G) Traffic Flow, Shared Tree, SPT Tree.]

1) rtr-b is a Leaf router. Last host, Rcvr A, leaves group G.
2) rtr-b removes E1 from (*,G) and any (Si,G) oilists.
3) rtr-b (*,G) oilist now empty; sends (*,G) Prune toward RP.
4) rtr-b stops sending periodic (S, G) joins.

PIM SM Pruning Example SPT Case (cont.)


4) Because the (Si, G) Outgoing interface list is now Null, rtr-b stops sending Periodic (Si, G) Join messages up the Shortest-Path Tree (SPT).


PIM SM Pruning
Source (SPT) Case
[Figure: same rtr-a / rtr-b topology. The (*,G) Prune from rtr-b arrives on rtr-a E0 (5); rtr-a sends a (*,G) Prune out S0 toward the RP (6).]

5. rtr-a receives Prune; removes E0 from (*,G) oilist. (After the 3 second Multiaccess Network Prune delay.)
6. rtr-a (*,G) oilist now empty; sends (*,G) Prune toward RP.

PIM SM Pruning Example SPT Case (cont.)


5) The (*, G) Prune is received by rtr-a, which causes interface E0 to be removed from the Outgoing interface list of the (*, G) entry in the Multicast Routing Table. (Note: rtr-a delayed Pruning E0 from the (*, G) entry for 3 seconds since this is a Multi-Access network and it needed to wait for a possible overriding Join from another PIM neighbor. Since none was received, the interface was pruned.)
6) Because the (*, G) Outgoing interface list is now Null, a (*, G) Prune is forwarded on up the Shared Tree (RPT) via S0 toward the RP.
7) Because rtr-a is no longer receiving (Si, G) Join messages from rtr-b, the (Si, G) state eventually times out. This causes an (Si, G) Prune to be sent up the Shortest-Path Tree (SPT) towards the source Si.
8) Traffic stops flowing down the Shortest-Path Tree (SPT).
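The oilist bookkeeping in steps 1 through 6 can be sketched as a small Python model. This is an illustration only, not router code: the `Router` class, its attributes and its method names are assumptions made for this sketch, and neither the 3 second Multi-Access Prune delay nor the (Si, G) state timeout of step 7 is modelled.

```python
STAR_G = "(*,G)"
SI_G = "(Si,G)"


class Router:
    """Toy model of the oilist handling in steps 1 to 6 (hypothetical names)."""

    def __init__(self, name, rp_interface, oilists):
        self.name = name
        self.rp_interface = rp_interface        # interface toward the RP
        self.oilists = {entry: set(ifs) for entry, ifs in oilists.items()}
        self.sends_periodic_sg_joins = True
        self.sent = []                          # log of (message, interface)

    def last_member_left(self, interface):
        # Steps 1-2: the IGMP state is gone, so drop the interface everywhere.
        for interfaces in self.oilists.values():
            interfaces.discard(interface)
        self._check_empty_oilists()

    def receive_star_g_prune(self, interface):
        # Step 5: remove the interface from the (*,G) oilist only.
        # The 3 second multi-access prune delay is not modelled here.
        self.oilists[STAR_G].discard(interface)
        self._check_empty_oilists()

    def _check_empty_oilists(self):
        # Steps 3 and 6: an empty (*,G) oilist triggers a Prune toward the RP.
        if not self.oilists[STAR_G]:
            self.sent.append((STAR_G + " Prune", self.rp_interface))
        # Step 4: an empty (Si,G) oilist stops the periodic (Si,G) Joins.
        if not self.oilists[SI_G]:
            self.sends_periodic_sg_joins = False


rtr_b = Router("rtr-b", "E0", {STAR_G: ["E1"], SI_G: ["E1"]})
rtr_a = Router("rtr-a", "S0", {STAR_G: ["E0"], SI_G: ["E0"]})

rtr_b.last_member_left("E1")          # Rcvr A leaves group G
rtr_a.receive_star_g_prune("E0")      # rtr-b's Prune arrives on rtr-a's E0

print(rtr_b.sent, rtr_b.sends_periodic_sg_joins)
print(rtr_a.sent, rtr_a.sends_periodic_sg_joins)
```

In this sketch rtr-a keeps E0 in its (Si, G) oilist after the (*, G) Prune, which mirrors the notes: that state only disappears later, when the missing (Si, G) Joins let it time out.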


PIM SM Pruning
Source (SPT) Case
[Figure: same rtr-a / rtr-b topology.]

State in rtr-b after Pruning

(*, 224.1.1.1), 00:02:32/00:02:59, RP 10.1.5.1, flags: SP
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1,
  Outgoing interface list:

(171.68.37.121/32, 224.1.1.1), 00:01:56/00:00:53, flags: PT
  Incoming interface: Ethernet0, RPF nbr 10.1.2.1
  Outgoing interface list:


PIM SM Pruning
Source (SPT) Case
[Figure: same rtr-a / rtr-b topology.]

State in rtr-a after Pruning

(*, 224.1.1.1), 00:02:32/00:02:59, RP 10.1.5.1, flags: SP
  Incoming interface: Serial0, RPF nbr 10.1.4.1,
  Outgoing interface list:

(171.68.37.121/32, 224.1.1.1), 00:01:56/00:00:53, flags: PT
  Incoming interface: Serial1, RPF nbr 10.1.9.2
  Outgoing interface list:


PIM SM Pruning
Source (SPT) Case
[Figure: same rtr-a / rtr-b topology. (Si,G) Data arrives at rtr-a on S1 from the source (7); rtr-a sends an (Si,G) Prune toward the source (8).]

7. Another (Si,G) data packet arrives via Serial1.
8. rtr-a responds by sending an (Si,G) Prune toward source.


PIM SM Pruning
Source (SPT) Case
[Figure: same rtr-a / rtr-b topology. (Si,G) traffic no longer flows down the SPT (9).]

7. Another (Si,G) data packet arrives via Serial1.
8. rtr-a responds by sending an (Si,G) Prune toward source.
9. (Si,G) traffic ceases flowing down SPT.


PIM-SM Protocol Mechanics

PIM Neighbor Discovery
PIM State
PIM SM Joining
PIM SM Registering
PIM SM SPT-Switchover
PIM SM Pruning
PIM SM State Maintenance

PIM SM State Maintenance

Periodic Join/Prunes are sent to all PIM neighbors.
  Periodic Joins refresh interfaces in a PIM neighbor's oilists.
  Periodic Prunes refresh prune state in a PIM neighbor.
Received Multicast packets reset (S,G) entry expiration timers.

PIM SM State Maintenance


In PIM SM, Join/Prune state information has a normal expiration time of 3 minutes. If a periodic Join/Prune message is not received to refresh this state information, it automatically expires and is deleted. Therefore, a PIM router sends periodic Join/Prune messages to its PIM neighbors to maintain this state information.

When a Join message is received from a PIM Neighbor, the expiration timer of the interface (in the Outgoing interface list) on which the Join was received is reset to 3 minutes. If the interface expiration timer goes to zero, the interface is removed from the Outgoing interface list. (Note: This can trigger a Prune if the removal of the interface causes the Outgoing interface list to become Null.)

When a Prune message is received in PIM Sparse mode, the interface on which the Prune was received is normally just removed from the Outgoing interface list. The exception to this is the special case of (S, G)RP-bit Prunes, which are used to Prune (S, G) traffic from the Shared Tree. In this case, periodic (S, G)RP-bit Prunes must be sent in order to refresh the prune state in the upstream PIM Neighbor toward the RP.

All (S, G) entries have entry expiration timers which are reset to 3 minutes by the receipt of an (S, G) packet via the Shortest-Path Tree (SPT). If the source stops sending, this expiration timer goes to zero and the (S, G) entry is deleted.
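The interface expiration timers described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the `OutgoingInterfaceList` class and its method names are hypothetical, time is passed in explicitly as seconds, and only the 3 minute value comes from the text.

```python
EXPIRY_SECONDS = 180  # the 3 minute expiration time given in the notes


class OutgoingInterfaceList:
    """Hypothetical oilist whose interfaces expire unless refreshed by Joins."""

    def __init__(self):
        self._expires_at = {}

    def join_received(self, interface, now):
        # A Join (initial or periodic) resets the interface timer to 3 minutes.
        self._expires_at[interface] = now + EXPIRY_SECONDS

    def expire(self, now):
        """Remove timed-out interfaces.

        Returns True when this removal emptied the list, which is the
        case where a Prune could be triggered.
        """
        had_interfaces = bool(self._expires_at)
        self._expires_at = {
            interface: deadline
            for interface, deadline in self._expires_at.items()
            if deadline > now
        }
        return had_interfaces and not self._expires_at

    def interfaces(self):
        return sorted(self._expires_at)


oilist = OutgoingInterfaceList()
oilist.join_received("E0", now=0)
oilist.join_received("E1", now=0)
oilist.join_received("E0", now=60)      # a periodic Join refreshes E0 only
print(oilist.expire(now=180), oilist.interfaces())
print(oilist.expire(now=240), oilist.interfaces())
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would run per-interface timers instead.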


PIM Sparse Mode Review


[Figure: sample network containing the RP. Legend: Link, Data, Control.]

PIM Sparse Mode Review


The following slides review all of the major concepts previously presented, using a sample network.


PIM Sparse Mode Review


Receiver 1 Joins Group G
C Creates (*, G) State, Sends (*, G) Join to the RP

[Figure: router C sends a Join toward the RP on behalf of Receiver 1.]

PIM Sparse Mode Review (cont.)


Receiver 1 joins group G by sending an IGMP Host message to C. C creates a (*, G) entry that has the interface towards Receiver 1 in the Outgoing interface list. C then sends a (*, G) Join up the Shared Tree toward the Rendezvous Point (RP).


PIM Sparse Mode Review


RP Creates (*, G) State

[Figure: the Shared Tree now extends from the RP down to Receiver 1.]

PIM Sparse Mode Review (cont.)


The Rendezvous Point (RP) receives the (*, G) Join from C and creates a (*, G) entry that has the interface towards C in the Outgoing interface list. The Shared Tree for group G has now been built (indicated by the arrows in the drawing) down to Receiver 1.


PIM Sparse Mode Review


Source 1 Sends Data
A Sends Registers to the RP

[Figure: Source 1 is attached to router A; A sends Register messages to the RP; Receiver 1 is on the Shared Tree.]

PIM Sparse Mode Review (cont.)


Source 1 begins sending data to group G. (Note: It is not necessary for Source 1 to join group G by sending an IGMP Host Membership message before sending to group G.) A encapsulates Source 1 multicast packets in PIM Register messages and unicasts them to the Rendezvous Point (RP).


PIM Sparse Mode Review


RP De-Encapsulates Registers
Forwards Data Down the Shared Tree
Sends Joins Towards the Source

[Figure: the RP sends Joins hop by hop toward Source 1.]

PIM Sparse Mode Review (cont.)


The Rendezvous Point (RP) receives and de-encapsulates the Register messages and finds that they contain packets for group G, for which it has (*, G) state. The RP forwards the de-encapsulated packets down the Shared Tree. The RP then sends a Join towards Source 1 so that it can begin receiving native (i.e., unencapsulated) packets from Source 1.


PIM Sparse Mode Review


RP Sends Register-Stop Once Data Arrives Natively

[Figure: the RP sends a Register-Stop to router A, the router attached to Source 1.]

PIM Sparse Mode Review (cont.)


Once the data from Source 1 begins arriving natively, the Rendezvous Point (RP) sends a Register-Stop message to notify A that it no longer needs to encapsulate traffic in Register messages.


PIM Sparse Mode Review


C Sends (S, G) Joins to Join the Shortest-Path Tree (SPT)

[Figure: C sends an (S, G) Join to A, toward Source 1.]

PIM Sparse Mode Review (cont.)


Traffic exceeds the SPT-Threshold and C begins the process of switching to the Shortest-Path Tree (SPT) for Source 1 by sending an (S, G) Join to A up the SPT towards Source 1.


PIM Sparse Mode Review


C Sends Prunes Up the RP Tree for the Source
RP Deletes (S, G) OIF and Sends Prune Towards the Source

[Figure: C sends an (S, G) RP-bit Prune up the Shared Tree; the RP sends an (S, G) Prune toward Source 1. Routers C and E are labeled.]

PIM Sparse Mode Review (cont.)


In order to prune Source 1 traffic from the Shared Tree (it is now receiving this traffic via the Shortest-Path Tree), C sends an (S, G)RP-bit Prune up the Shared Tree. (Note: The RP-bit indicates to upstream routers that this prune should flow up the Shared Tree (RPT) to the RP.) The RP receives the (S, G)RP-bit Prune and removes the interface (towards C) from the (S, G) oilist. This stops the flow of Source 1 traffic down the Shared Tree to C. The RP's (S, G) oilist is now Null (i.e., there are no other branches on the Shared Tree that want Source 1 traffic), so the RP no longer needs Source 1 traffic. The RP responds by sending (S, G) Prunes towards Source 1. This stops the flow of Source 1 traffic to the RP.


PIM Sparse Mode Review


New Receiver 2 Joins
E Creates State and Sends (*, G) Join

[Figure: E sends a (*, G) Join toward C on behalf of Receiver 2. Routers B and C are labeled.]

PIM Sparse Mode Review (cont.)


Receiver 2 also joins group G by sending an IGMP Host message to E. E creates a (*, G) entry that has the interface towards Receiver 2 in the Outgoing interface list. E then sends a (*, G) Join up the Shared Tree toward the Rendezvous Point (RP).


PIM Sparse Mode Review


C Adds Link Towards E to the OIF List of Both (*, G) and (S, G)
Data from Source Arrives at E

[Figure: Source 1 traffic now flows through C to both Receiver 1 and, via E, Receiver 2.]

PIM Sparse Mode Review (cont.)


C receives the (*, G) Join from E and adds the interface (towards E) to the oilist in both the existing (*, G) and (S, G) entries in the Multicast Routing table. (Note: Since C already had (*, G) state, it is not necessary to send a (*, G) join toward the RP again.) Group G traffic now begins flowing through C and E to Receiver 2.
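What C does with the (*, G) Join from E can be sketched in Python. The function below is an assumption-laden illustration: the `mroute` dictionary layout (keys are `(source, group)` tuples, values are sets of interface names) and the interface names are invented for this sketch.

```python
def receive_star_g_join(mroute, group, interface):
    """Add `interface` to the (*, G) oilist and to every (S, G) oilist for G.

    Returns True only when the (*, G) entry had to be created, which is
    the case where a (*, G) Join must also be sent toward the RP.
    """
    star_g = ("*", group)
    created = star_g not in mroute
    mroute.setdefault(star_g, set()).add(interface)
    for (source, entry_group), oilist in mroute.items():
        if entry_group == group and source != "*":
            oilist.add(interface)
    return created


# Router C before Receiver 2 joins: both entries point at Receiver 1 only.
mroute_c = {
    ("*", "G"): {"to_receiver_1"},
    ("S1", "G"): {"to_receiver_1"},
}
must_join_rp = receive_star_g_join(mroute_c, "G", "to_E")
print(must_join_rp, mroute_c)
```

Because C already holds (*, G) state, the function reports that no new Join toward the RP is needed, matching the note in the text.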


PIM Sparse Mode Review


Source 2 Starts Sending, D Sends Registers, RP Forwards Data to Receivers down Shared Tree

[Figure: Source 2 is attached to router D; D sends Register messages to the RP.]

PIM Sparse Mode Review (cont.)


Source 2 now begins sending data to group G. (Note: Again, it is not necessary for Source 2 to join group G by sending an IGMP Host Membership message before sending to group G.) D encapsulates Source 2 multicast packets in PIM Register messages and unicasts them to the Rendezvous Point (RP). The RP receives and de-encapsulates the Register messages and finds that they contain packets for group G, for which it has (*, G) state. The RP forwards the de-encapsulated packets down the Shared Tree.


PIM Sparse Mode Review


RP Sends Joins to D

[Figure: D is still sending Registers while the RP sends a Join toward Source 2.]

PIM Sparse Mode Review (cont.)


The RP then sends a Join towards Source 2 so that it can begin receiving native (i.e., unencapsulated) packets from Source 2.


PIM Sparse Mode Review


Data begins flowing to RP down SPT, RP sends Register-Stop

[Figure: Source 2 traffic reaches the RP natively; the RP sends a Register-Stop to D.]

PIM Sparse Mode Review (cont.)


Data from Source 2 begins arriving natively down the Shortest-Path Tree (SPT) via D. The Rendezvous Point (RP) sends a Register-Stop message to notify D that it no longer needs to encapsulate traffic in Register messages.


PIM Sparse Mode Review


Both Shared Tree and SPT in use

[Figure: Source 1 traffic reaches the receivers via the SPT; Source 2 traffic reaches them via the RP and the Shared Tree.]

PIM Sparse Mode Review (cont.)


At this point, both the Shared Tree (Source 2 traffic) and a Shortest-Path Tree (Source 1 traffic) are being used to deliver group G traffic to the Receivers.


PIM-SM Configuration Steps


Enable Multicast Routing on every router
Configure every interface for PIM
Configure the RP
  Using Auto-RP or BSR
    Configure certain routers as Candidate RP(s)
    Configure certain routers as Mapping Agents/C-BSRs
    All other routers automatically learn elected RP
  Anycast/Static RP addressing
    RP address must be configured on every router
    Note: Anycast RP requires MSDP

Enable Multicast Routing (Forwarding) on every router.

This ensures that there will be no multicast black-holes caused by a non-multicast router in the unicast RPF path back to the source or RP.

Configure every interface for PIM

This ensures that there will be no multicast black-holes caused by a non-multicast interface in the unicast RPF path back to the source or the RP.

Configure the RP
Auto-RP and BSR are the simplest forms of RP configuration, as they allow the routers in the network to automatically learn the address of the RP. This requires two additional command lines on one or more routers in the network that have been selected as candidate RPs:
Configure one or more routers as Candidate RPs using the appropriate Auto-RP or BSR command.
Configure one or more routers as Mapping Agents (Auto-RP) or Candidate BSRs (BSR).
Anycast/Static RP addressing takes more work, as the single RP address must be configured on every router in the network. Anycast RP is a form of redundant static RPs which requires the use of the Multicast Source Discovery Protocol (MSDP) but provides rapid RP failover.


Enabling Multicast on the Router

Global Configuration Command


ip multicast-routing
  Enables IP multicast forwarding in the router.
  Configure on EVERY router in the network.

Enabling Multicast on the Router


ip multicast-routing
This command enables IP multicast forwarding on the router. Be sure to configure this command on every router in the network to avoid black-holes caused by a non-multicast router appearing in the unicast RPF path to a source or the RP.


Enabling Multicast on an Interface


Interface Configuration Commands
  Enables multicast forwarding on the interface.
  Controls the interface's mode of operation.
    ip pim dense-mode
      Interface mode is hard-wired to Dense mode operation.
    ip pim sparse-mode
      Interface mode is hard-wired to Sparse mode operation.
    ip pim sparse-dense-mode
      Interface mode is determined by the Group mode.
        If Group is Dense, interface operates in Dense mode.
        If Group is Sparse, interface operates in Sparse mode.
        Decision is made on a packet-by-packet basis.

Enabling Multicast on an Interface


The following commands enable multicast forwarding on an interface as well as determining the mode in which it operates.

ip pim dense-mode
Causes the interface to be hard-wired into operating in dense mode for all multicast traffic flows.

ip pim sparse-mode
Causes the interface to be hard-wired into operating in sparse mode for all multicast traffic flows.

ip pim sparse-dense-mode
Causes the interface to dynamically determine the interface mode on a packet-by-packet basis depending on whether the destination group is in Dense mode or Sparse mode. (This mode is shown by the D or S flag on the (*,G) entry.)
If the destination group of a packet is in Dense mode, the interface uses dense mode operation to forward the packet.
If the destination group of a packet is in Sparse mode, the interface uses sparse mode operation to forward the packet.
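The relationship between the three interface commands and the resulting mode of operation can be written as a small decision function. This is a sketch of the rule as stated in the text, not of IOS internals; the function name and the string values are assumptions chosen for the example.

```python
def interface_operating_mode(interface_config, group_mode):
    """Return "dense" or "sparse" for one packet on one interface.

    interface_config: "dense-mode", "sparse-mode" or "sparse-dense-mode"
                      (the keyword that follows `ip pim`).
    group_mode:       "dense" or "sparse", i.e. the D or S flag on the
                      (*, G) entry of the packet's destination group.
    """
    if interface_config == "dense-mode":
        return "dense"            # hard-wired, group mode is ignored
    if interface_config == "sparse-mode":
        return "sparse"           # hard-wired, group mode is ignored
    if interface_config == "sparse-dense-mode":
        return group_mode         # follows the group, packet by packet
    raise ValueError(f"unknown interface configuration: {interface_config}")


for config in ("dense-mode", "sparse-mode", "sparse-dense-mode"):
    for mode in ("dense", "sparse"):
        print(config, mode, "->", interface_operating_mode(config, mode))
```

Only the `sparse-dense-mode` row can never disagree with the group mode, which is the point the following slides build on.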


Group Mode vs. Interface Mode


Avoid Group/Interface mode mismatches.
  Group and Interface mode should be the same.
    Otherwise you may get unwanted/unpredictable results.
  Sparse-Dense interfaces always match the Group mode.
    Should normally be used if running Auto-RP.
      Permits Auto-RP groups to automatically run in Dense mode.
      All other groups run in Sparse mode. (Assuming an RP is defined for all other groups.)
    Can also be used for Sparse-only or Dense-only networks.

Group Mode vs. Interface mode


Care should be taken to make sure that the Group mode always matches the interface mode. Otherwise, the forwarding mechanisms may not perform as desired/predicted.

Group/Interface mismatches can be avoided by configuring ip pim sparse-dense-mode on an interface. This results in the interface always matching the Group mode.

Sparse/Dense mode should normally be used if running Auto-RP. This allows the two Auto-RP groups (224.0.1.39 and 224.0.1.40) to operate in Dense mode. Once the routers in the network learn (via Auto-RP) the address of the elected RP, they will operate all other groups in Sparse mode. (By default, Auto-RP learned Group-to-RP ranges never include the Auto-RP groups.)

Sparse/Dense mode can also be used if pure Sparse mode or pure Dense mode operation is desired, by either configuring or not configuring an RP address on each router, respectively.


Group Mode vs. Interface Mode
Common Misconception

"Interface Mode controls Group Mode."
  "If I set all interfaces to ip pim sparse-mode, the router will always operate in Sparse mode and never fall back into Dense mode."
  Bzzzztt!!! I'm sorry, but that's the incorrect answer.
Group mode is independent of interface mode.
  Interface mode only controls how the interface operates.
  Group mode is controlled by RP information!!!

Common Misconception
"Interface Mode controls Group Mode." This is a classic error often made by network administrators. They assume that, "If I set all interfaces to ip pim sparse-mode, the router will always operate in Sparse mode and never fall back into Dense mode." Unfortunately, this is incorrect. Group mode is solely controlled by the existence of a valid RP. If a valid RP is learned/configured for a group range, those groups will operate in sparse mode and the (*,G) entry will be created with the S flag set. Otherwise, the groups will operate in Dense mode and the D flag will be set on the (*,G) entry.
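The rule that RP information alone decides the Group mode can be illustrated with a short Python sketch. The mapping table format, the example group range and the first-match lookup are assumptions made for this illustration; how a real router selects among overlapping ranges is not modelled.

```python
import ipaddress


def group_mode(group, rp_mappings):
    """Return ("sparse", rp) if an RP is known for `group`, else ("dense", None).

    rp_mappings is a hypothetical list of (group_range, rp_address) pairs,
    whether learned or statically configured.
    """
    address = ipaddress.ip_address(group)
    for group_range, rp_address in rp_mappings:
        if address in ipaddress.ip_network(group_range):
            return "sparse", rp_address     # (*, G) created with the S flag
    return "dense", None                    # (*, G) created with the D flag


# Example: an RP is known only for 239.0.0.0/8.
mappings = [("239.0.0.0/8", "10.1.5.1")]
print(group_mode("239.1.1.1", mappings))
print(group_mode("224.1.1.1", mappings))
```

Note that no interface configuration appears anywhere in the function, which is exactly the misconception the slide corrects.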


Interface Mode Summary

Let Group mode control Interface mode.
  Use ip pim sparse-dense-mode command.
    Allows maximum flexibility.
    No need to ever change interface configuration.
Control Group mode with RP info.
  If RP info exists, Group = Sparse.
    Therefore interface mode = Sparse for this group.
  If RP info does not exist, Group = Dense.
    Therefore interface mode = Dense for this group.

Interface Mode Summary


By configuring all interfaces with ip pim sparse-dense-mode, we allow the Group mode to determine the interface mode on a packet-by-packet basis. This allows maximum flexibility since we never have to reconfigure the interface to change mode.

Control the Group mode (and hence the interface mode if ip pim sparse-dense-mode is configured) by defining RP information in the network.
If a router has a valid RP address for a particular group, the group will be created in sparse mode, thereby causing the interface to operate in sparse mode when ip pim sparse-dense-mode is configured.
If a router does not have a valid RP address for a particular group, the group will be created in dense mode, thereby causing the interface to operate in dense mode when ip pim sparse-dense-mode is configured.


Avoiding DM Fallback

To always guarantee Sparse mode operation (and avoid falling back to Dense mode), make sure that every router always knows of an RP for every group.


How to avoid falling back into Dense mode.


It is often desired that the network NEVER fall back into Dense mode. Even if all primary and backup RPs fail, it is often better to have multicast forwarding stop instead of reverting back to dense mode. In order to prevent falling back into Dense mode, make sure that there is always an RP learned/configured for the entire multicast group range.


Avoiding DM Fallback
Define an "RP-of-last-resort".
  Configure as a Static RP on every router.
    Will only be used if all Candidate-RPs fail.
    Can be a dummy address.
    Recommendation: Use lowest priority C-RP address.
Use ACL to avoid breaking Auto-RP

  ip pim rp-address <RP-of-last-resort> 10
  access-list 10 deny 224.0.1.39
  access-list 10 deny 224.0.1.40
  access-list 10 permit any

Avoiding DM Fallback
In order to guarantee that the router will never fall back into dense mode, it is necessary to guarantee that the router will never lose RP information. This can be accomplished by defining a static RP-of-last-resort in each router in the network. Since automatically learned RPs (Auto-RP or BSR) take precedence over statically defined RPs, the static entry will only be activated if all learned RPs time out and/or fail.

The recommendation is to define the lowest priority Candidate RP as the RP-of-last-resort by using a static RP definition pointing to this IP address. This locks the lowest priority RP into the bottom of the failover order. Even if this router fails (or its information times out), the static entry in each router will prevent a total loss of RP information.

Special care must be taken if an RP-of-last-resort is defined when using Auto-RP. By default, a static RP definition that covers the Auto-RP group range will be interpreted as the RP for the two Auto-RP groups. (This is unlike Auto-RP learned group ranges, which have an implied deny for these two groups so that the two Auto-RP groups will default to using dense mode.) The following example shows how to configure an RP-of-last-resort so that the two Auto-RP groups do not accidentally switch to sparse mode:

  ip pim rp-address <RP-of-last-resort> 10
  access-list 10 deny 224.0.1.39
  access-list 10 deny 224.0.1.40
  access-list 10 permit any
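The effect of the access list and of the learned-over-static precedence can be traced with a small Python sketch. It is a simplified model, not IOS behaviour in full: the helper names, the tuple encoding of the access list and the example RP addresses are assumptions, and each access-list line is modelled as a plain prefix match with first match wins.

```python
import ipaddress

AUTO_RP_GROUPS = {"224.0.1.39", "224.0.1.40"}

# access-list 10 from the example above, as (action, prefix) pairs.
ACL_10 = [
    ("deny", "224.0.1.39/32"),
    ("deny", "224.0.1.40/32"),
    ("permit", "0.0.0.0/0"),        # "permit any"
]


def acl_permits(acl, group):
    """First matching line decides; no match means the implicit deny."""
    address = ipaddress.ip_address(group)
    for action, prefix in acl:
        if address in ipaddress.ip_network(prefix):
            return action == "permit"
    return False


def select_rp(group, learned_rp, static_rp, static_acl):
    """Pick the RP for `group`, or None if the group falls to dense mode.

    learned_rp is the Auto-RP/BSR learned RP covering the group, or None.
    Learned ranges carry an implied deny for the two Auto-RP groups.
    """
    if learned_rp is not None and group not in AUTO_RP_GROUPS:
        return learned_rp                   # learned RPs take precedence
    if acl_permits(static_acl, group):
        return static_rp                    # the RP-of-last-resort
    return None


print(select_rp("239.1.1.1", "10.1.5.1", "10.9.9.9", ACL_10))
print(select_rp("239.1.1.1", None, "10.9.9.9", ACL_10))
print(select_rp("224.0.1.40", None, "10.9.9.9", ACL_10))
```

Replacing `ACL_10` with a single `("permit", "0.0.0.0/0")` line makes the last call return the static RP, which is the accidental switch of the Auto-RP groups to sparse mode that the two deny lines prevent.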


Simple PIM-SM Configuration


[Figure: PIM Sparse Mode network of routers A, B, C and D; B and C are each labeled RP/Mapping Agent.]

On every router:
  ip multicast-routing
On every interface:
  ip pim sparse-dense-mode
On routers B and C:
  ip pim send-rp-announce loopback0 scope 16
  ip pim send-rp-discovery loopback0 scope 16

Example PIM-SM Configuration


The above example network shows how to configure a network to run Sparse mode using Auto-RP with two Candidate-RPs/Mapping Agents. Note: A common practice is to combine the function of Candidate-RP and Mapping Agent on the same router. This is done more as a configuration convenience than for any operational requirement.

On every router in the network:
Configure the ip multicast-routing global command to enable multicast on the router.
Configure the ip pim sparse-dense-mode interface command on EVERY interface on each router. This allows the Auto-RP groups to function in Dense mode and all other groups to operate in Sparse or Dense mode depending on whether an RP has been configured for the group.

On the router(s) that are to function as Candidate-RPs, configure the ip pim send-rp-announce Loopback0 scope <ttl> command. (Make sure the <ttl> value is sufficient to allow the message to reach all Mapping Agents in the network.)

On the router(s) that are to function as Mapping Agents, configure the ip pim send-rp-discovery Loopback0 scope <ttl> command. (Make sure the <ttl> value is sufficient to allow the message to reach all routers in the network.)

No additional configuration is generally necessary. The network is now completely enabled for PIM-SM IP Multicast!


Rendezvous Points
Module 6

Module6. ppt

1998 2001, Cisco Systems, Inc. All rights reserved.

8/13/2001 11:12 AM

Copyright © 1998-2001, Cisco Systems, Inc.

Module6.ppt

Module Objectives

Explain the basic operation of Auto-RP.
Explain the basic operation of PIMv2 BSR.
Explain how to configure RPs.
Explain how to use various IOS commands to tune and control RP operation.

Module Agenda

Auto-RP
PIMv2 BSR
Static RPs
Tuning RP Operations
Debugging RP Operation
Special Cases

Auto-RP Overview
All routers automatically learn RP address
  No configuration necessary except on:
    Candidate RPs
    Mapping Agents
Makes use of Multicast to distribute info
  Two special IANA-assigned Groups used
    Cisco-Announce - 224.0.1.39
    Cisco-Discovery - 224.0.1.40
  Typically Dense mode is used to forward these groups
Permits backup RPs to be configured
Can be used with Admin-Scoping

Auto-RP Overview
Auto-RP allows all routers in the network to automatically learn Group-to-RP mappings. There are no special configuration steps that must be taken except on the router(s) that are to function as:
Candidate RPs
Mapping Agents

Multicast is used to distribute Group-to-RP mapping information via two special, IANA-assigned multicast groups:
Cisco-Announce Group - 224.0.1.39
Cisco-Discovery Group - 224.0.1.40

Because multicast is used to distribute this information, a "Chicken and Egg" situation can occur if the above groups operate in Sparse mode. (Routers would have to know a priori what the address of the RP is before they can learn the address of the RP(s) via Auto-RP messages.) Therefore, it is recommended that these groups always run in Dense mode so that this information is flooded throughout the network.

Multiple Candidate RPs may be defined so that in the case of an RP failure, the other Candidate RP can assume the responsibility of RP.

Auto-RP can be configured to support Administratively Scoped zones. (BSR cannot!) This can be important when trying to prevent high-rate group traffic from leaving a campus and consuming too much bandwidth on WAN links.


Auto-RP Fundamentals
Candidate RPs
  Multicast RP-Announcement messages
    Sent to Cisco-Announce (224.0.1.39) group
    Sent every rp-announce-interval (default: 60 sec)
  RP-Announcements contain:
    Group Range (default = 224.0.0.0/4)
    Candidate's RP address
    Holdtime = 3 x <rp-announce-interval>
  Configured via global config command
    ip pim send-rp-announce <intfc> scope <ttl> [group-list <acl>]
  Deny in group-list has variable meaning
    Before 12.0(1.1): Deny = "I'm not C-RP for this group-range"
    After 12.0(1.1): Deny = "Force group-range to always be DM"

Auto-RP Candidate RPs (C-RPs)


Candidate RPs multicast RP-Announcement messages to the Cisco-Announce (224.0.1.39) group. These messages announce the router as a candidate for selection as RP and are sent every 60 seconds by default.

RP-Announce messages contain:
  Group Range (default is all multicast groups, 224.0.0.0/4)
  The Candidate's IP address
  A holdtime, which is used to detect when the C-RP has failed. This holdtime is 3 times the announcement interval, or 3 x 60 = 180 seconds = 3 minutes.

C-RPs are configured using the (rather obtuse) command:
  ip pim send-rp-announce <intfc> scope <ttl> [group-list <acl>]
The <intfc> specifies which IP address is used as the source address in the RP-Announce messages, which are sent out all multicast interfaces on the router. The <ttl> value controls the TTL of the RP-Announce message. The optional group-list permits a group range other than the default to be assigned. This command may be configured more than once on a router so that the router will function as C-RP for multiple group ranges.

Note: A deny in the group-list access-list has a different meaning beginning with IOS release 12.0(1.1).
  Before 12.0(1.1): Deny means "I'm not the RP for this group range."
  After 12.0(1.1): Deny means "Force this group range to always work in Dense mode."
Note: Only a single C-RP needs to deny a group range to force it into Dense mode. In other words, the deny overrides any other router's permit advertisement.
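The holdtime arithmetic above can be sketched in a few lines of Python. This is only an illustration of the "3 x announce interval" rule; the class and field names are hypothetical and do not correspond to any IOS data structure.

```python
from dataclasses import dataclass


@dataclass
class RpAnnounce:
    """Hypothetical model of the fields carried in an RP-Announce message."""
    rp_address: str
    group_range: str = "224.0.0.0/4"   # default: all multicast groups
    announce_interval: int = 60        # seconds (default rp-announce-interval)

    @property
    def holdtime(self) -> int:
        # Holdtime is three times the announcement interval.
        return 3 * self.announce_interval

    def expired(self, seconds_since_last_announce: float) -> bool:
        # A C-RP is considered failed once its holdtime has elapsed
        # without a fresh announcement.
        return seconds_since_last_announce > self.holdtime


announce = RpAnnounce("1.1.1.1")
print(announce.holdtime)       # 180
print(announce.expired(170))   # False
print(announce.expired(181))   # True
```

With the default interval, a C-RP can miss two consecutive announcements before the third missed one lets the entry time out.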


Auto-RP Fundamentals
Mapping agents
  Receive RP-Announcements
    Stored in Group-to-RP Mapping Cache with holdtimes
    Elects highest C-RP IP address as RP for group range
  Multicast RP-Discovery messages
    Sent to Cisco-Discovery (224.0.1.40) group
    Sent every 60 seconds or when changes detected
  RP-Discovery messages contain:
    Elected RPs from MA's Group-to-RP Mapping Cache
  Configured via global config command
    ip pim send-rp-discovery [<interface>] scope <ttl>
  Source address of packets set by <interface> (12.0)
    If not specified, source address = output interface address
    Results in the appearance of multiple MAs (one per interface)

Auto-RP Mapping Agents (MAs)


Mapping Agents join the RP-Announce group (224.0.1.39) in order to receive the RP-Announcements sent by all Candidate RPs. When they receive an Announcement they:
  Save the Announcement in the Group-to-RP mapping cache
  Select the C-RP with the highest IP address as RP for the group range
The holdtimes are used to time out an entry in the cache if a C-RP fails and is no longer sending periodic C-RP announcements.

Mapping Agents periodically send the elected RPs from their Group-to-RP mapping cache to all routers in the network via RP-Discovery messages. RP-Discovery messages are multicast to the Auto-RP Discovery group, 224.0.1.40. They are sent every 60 seconds or when the information in the Group-to-RP mapping cache changes.

MAs are configured using the (rather obtuse) command:
  ip pim send-rp-discovery [<intfc>] scope <ttl>
The optional <intfc> specifies which IP address is used as the source address in the RP-Discovery messages, which are sent out all multicast interfaces on the router. (A Loopback interface is normally specified here.) If no interface is specified, each message is sent with the address of its outgoing interface as the source address, which makes the router appear as multiple Mapping Agents (one per interface).
Note: This clause is optional strictly for backwards compatibility with IOS releases prior to 12.0, which did not allow the interface to be specified. In practice, an interface should always be specified.
The <ttl> value controls the TTL of the RP-Discovery message.
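The Mapping Agent behaviour described above (cache announcements with a holdtime, then elect the highest C-RP address per group range) can be sketched as follows. This is a simplified model under stated assumptions, not the IOS implementation: the class name, method names, and the use of plain numbers for time are all hypothetical.

```python
import ipaddress


class MappingAgentCache:
    """Hypothetical model of a Mapping Agent's Group-to-RP mapping cache."""

    def __init__(self):
        # {group_range: {c_rp_address: expiry_time}}
        self._entries = {}

    def receive_announce(self, rp, group_range, holdtime, now):
        # Store (or refresh) the C-RP entry together with its expiry time.
        self._entries.setdefault(group_range, {})[rp] = now + holdtime

    def elected_rps(self, now):
        # For each group range, ignore timed-out C-RPs and elect the
        # remaining C-RP with the numerically highest IP address.
        result = {}
        for group_range, candidates in self._entries.items():
            alive = [rp for rp, expiry in candidates.items() if expiry > now]
            if alive:
                result[group_range] = max(alive, key=ipaddress.ip_address)
        return result


ma = MappingAgentCache()
ma.receive_announce("1.1.1.1", "224.0.0.0/4", holdtime=180, now=0)
ma.receive_announce("2.2.2.2", "224.0.0.0/4", holdtime=180, now=0)
print(ma.elected_rps(now=10))    # {'224.0.0.0/4': '2.2.2.2'}

# Only 1.1.1.1 keeps announcing; 2.2.2.2 times out after its holdtime.
ma.receive_announce("1.1.1.1", "224.0.0.0/4", holdtime=180, now=100)
print(ma.elected_rps(now=200))   # {'224.0.0.0/4': '1.1.1.1'}
```

Comparing addresses through `ipaddress.ip_address` rather than as strings matters: as strings, "10.0.0.9" would sort above "10.0.0.10".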


Auto-RP Fundamentals

All Cisco routers
  Join Cisco-Discovery (224.0.1.40) group
    Automatic
    No configuration necessary
  Receive RP-Discovery messages
    Stored in local Group-to-RP Mapping Cache
    Information used to determine RP for group range

All Cisco Routers


Automatically join the Cisco-Discovery (224.0.1.40) group in order to receive the Group-to-RP mapping information being multicast by the Mapping Agents in the network. No configuration is necessary!

Group-to-RP mapping information contained in the RP-Discovery messages is stored in the router's local Group-to-RP mapping cache. The router uses this information to map a Group address to the IP address of the active RP for the group.


Simple Auto-RP Configuration


[Figure: PIM Sparse Mode network with routers A, B, C and D. Routers B and C each act as both RP and Mapping Agent.]

On each router:
  ip multicast-routing
On each interface:
  ip pim sparse-dense-mode
On routers B and C:
  ip pim send-rp-announce loopback0 scope 16
  ip pim send-rp-discovery loopback0 scope 16

Example Auto-RP Configuration


The above example network shows how to configure a network to run Sparse mode using Auto-RP and two Candidate-RPs/Mapping Agents.
Note: A common practice is to combine the functions of Candidate-RP and Mapping Agent on the same router. This is done more as a configuration convenience than for any operational requirement.

On every router in the network:
  Configure the ip multicast-routing global command to enable multicast on the router.
  Configure the ip pim sparse-dense-mode interface command on EVERY interface on each router. This allows the Auto-RP groups to function in Dense mode and all other groups to operate in Sparse or Dense mode, depending on whether an RP has been configured for the group.

On the router(s) that are to function as Candidate-RPs, configure the ip pim send-rp-announce Loopback0 scope <ttl> command. (Make sure the <ttl> value is sufficient to allow the message to reach all Mapping Agents in the network.)

On the router(s) that are to function as Mapping Agents, configure the ip pim send-rp-discovery Loopback0 scope <ttl> command. (Make sure the <ttl> value is sufficient to allow the message to reach all routers in the network.)

No additional configuration is generally necessary. The network is now completely enabled for IP Multicast!


Auto-RP: From 10,000 Feet

[Figure: Mapping Agents A and B; Candidate RPs C (1.1.1.1) and D (2.2.2.2). RP-Announcements are multicast to the Cisco-Announce (224.0.1.39) group.]

Auto-RP - The big picture


In this example, routers A and B have been configured as Mapping Agents while routers C and D have been configured as Candidate RPs.
Step 1 The Candidate RPs begin multicasting their candidacy to be the RP via RP-Announce messages, which are sent to the Cisco-Announce group, 224.0.1.39.


Auto-RP: From 10,000 Feet

[Figure: Mapping Agents A and B; Candidate RPs C (1.1.1.1) and D (2.2.2.2). RP-Discoveries are multicast to the Cisco-Discovery (224.0.1.40) group.]

Auto-RP - The big picture


Step 2 The two Mapping Agents (routers A and B) receive the RP-Announce messages from the two Candidate RPs (routers C and D).
Step 3 The Mapping Agents store the C-RP information in their Group-to-RP mapping cache and elect the C-RP with the highest IP address (in this case, router D) as RP.
Step 4 The Mapping Agents both multicast the contents of their Group-to-RP mapping cache to the Cisco-Discovery group, 224.0.1.40.
Note: All Mapping Agents transmit this Group-to-RP mapping information simultaneously. The originally published specification on Auto-RP implied that there was a Master-Slave relationship between Mapping Agents and that only the Master would transmit while the Slave(s) were quiet until the Master failed. This specification is in error and this is not how Auto-RP has been implemented. As long as both Mapping Agents are transmitting identical information, there is no need to add the complexity of a Master-Slave failover scheme.
Step 5 The RP-Discovery messages are received via multicast by all routers in the network. The Group-to-RP mapping information contained in these messages is stored in the router's local Group-to-RP mapping cache. This information is subsequently used by the router to determine the IP address of the RP for a given group.


Auto-RP: A Closer Look

[Figure: Mapping Agent A with C-RPs C (1.1.1.1) and D (2.2.2.2). Initial cache state in the Mapping Agent.]

Rtr-A# show ip pim rp mapping
This system is an RP-mapping agent

Auto-RP Up Close
This is the same example that was presented in the previous slides. However, in this case, we will examine the process in more detail at each step. Step 1 At time zero, the Group-to-RP mapping caches in the Mapping Agents are empty since no RP-Announcements have been received. The output of the show ip pim rp mapping command shows that router A is a Mapping Agent and that the Group-to-RP mapping cache is empty.


Auto-RP: A Closer Look

[Figure: C-RPs C (1.1.1.1) and D (2.2.2.2) send Announce messages to Mapping Agent A. Each Announce carries RP = <C-RP address>, Group-Range = 224.0.0.0/4, Holdtime = 180 sec.]

Rtr-A# show ip pim rp mapping
This system is an RP-mapping agent
Group(s) 224.0.0.0/4
  RP 2.2.2.2 (Rtr-D), v2v1
    Info source: 2.2.2.2 (Rtr-D), via Auto-RP
      Uptime: 00:00:03, expires: 00:02:57
  RP 1.1.1.1 (Rtr-C), v2v1
    Info source: 1.1.1.1 (Rtr-C), via Auto-RP
      Uptime: 00:00:11, expires: 00:02:49

C-RP information is stored in MA's Group-to-RP Mapping Cache
Mapping Agent elects highest IP address as RP

Auto-RP Up Close
Step 2 Routers C and D begin sending their RP Announce messages advertising themselves as a candidate to be RP for all multicast groups. (Note the group range, the IP address of the C-RP and the holdtime in the message.) Step 3 The Mapping Agent (router A) receives these RP Announcements and stores this information in its Group-to-RP mapping cache. The output of the show ip pim rp mapping command on the Mapping Agent (router A) now shows both router C and D as candidates for group range 224.0.0.0/4 (i.e. all multicast groups with the exception of the Auto-RP groups). The Mapping Agent then elects the C-RP with the highest IP address as the active RP for the group range.


Auto-RP: A Closer Look

[Figure: Mapping Agent A multicasts Discovery messages carrying RP = 2.2.2.2, Group-Range = 224.0.0.0/4, Holdtime = 180 sec.]

Mapping Agent advertises elected RP via Discovery messages

Auto-RP Up Close
Step 4 The Mapping Agent begins advertising the results of the RP election to the rest of the network via Auto-RP Discovery messages.


Auto-RP: A Closer Look

All Mapping Agents Must Have Consistent Data!

[Figure: Mapping Agents A (172.16.2.1) and B (172.16.2.2).]

Rtr-A# show ip pim rp mapping
This system is an RP-mapping agent
Group(s) 224.0.0.0/4
  RP 2.2.2.2 (Rtr-D), v2v1
    Info source: 2.2.2.2 (Rtr-D), via Auto-RP
      Uptime: 00:00:03, expires: 00:02:57
  RP 1.1.1.1 (Rtr-C), v2v1
    Info source: 1.1.1.1 (Rtr-C), via Auto-RP
      Uptime: 00:00:11, expires: 00:02:49

Rtr-B# show ip pim rp mapping
This system is an RP-mapping agent
Group(s) 224.0.0.0/4
  RP 2.2.2.2 (Rtr-D), v2v1
    Info source: 2.2.2.2 (Rtr-D), via Auto-RP
      Uptime: 00:00:03, expires: 00:02:57
  RP 1.1.1.1 (Rtr-C), v2v1
    Info source: 1.1.1.1 (Rtr-C), via Auto-RP
      Uptime: 00:00:11, expires: 00:02:49

Auto-RP Up Close
It is critical that all Mapping Agents in the PIM-SM domain have identical information in their Group-to-RP mapping caches. Note that in our example network, they do. If the information in the mapping caches is not identical, the routers in the network can flip-flop between two different RPs.


Auto-RP: A Closer Look

Local Cache Initially Loaded from Router B

[Figure: Mapping Agents A (172.16.2.1) and B (172.16.2.2). A Discovery message carries RP = 2.2.2.2, Group-Range = 224.0.0.0/4, Holdtime = 180 sec.]

Rtr-X# show ip pim rp mapping
Group(s) 224.0.0.0/4
  RP 2.2.2.2 (Rtr-D), v2v1
    Info source: 172.16.2.2 (Rtr-B), via Auto-RP
      Uptime: 00:00:03, expires: 00:02:57

Auto-RP Up Close
Step 6 Assume that router B is the first MA to send its RP Discovery message containing its Group-to-RP mapping cache contents. Step 7 The routers in the network (router X in this example) all receive this RP Discovery message and install the information in their local Group-to-RP mapping cache. The output of the show ip pim rp mapping command shows that router D is currently selected as the RP for group range 224.0.0.0/4 (i.e. all multicast groups with the exception of the Auto-RP groups) and that this information was most recently received from router B.


Auto-RP: A Closer Look

Identical Info Received from Router A

[Figure: Mapping Agents A (172.16.2.1) and B (172.16.2.2), and router X. A Discovery message carries RP = 2.2.2.2, Group-Range = 224.0.0.0/4, Holdtime = 180 sec.]

Rtr-X# show ip pim rp mapping
Group(s) 224.0.0.0/4
  RP 2.2.2.2 (Rtr-D), v2v1
    Info source: 172.16.2.1 (Rtr-A), via Auto-RP
      Uptime: 00:00:03, expires: 00:02:57

Info source will continue to flip-flop between routers A and B
No performance impact

Auto-RP Up Close
Step 8 Next, router A sends an RP-Discovery message containing its Group-to-RP mapping cache contents.
Step 9 The routers in the network (router X in this example) all receive this RP-Discovery message and update the information in their local Group-to-RP mapping cache. Since both Mapping Agents are sending identical information, the only thing that changes in the local Group-to-RP mapping cache is the source of the information. The output of the show ip pim rp mapping command shows that router D is still selected as the RP for group range 224.0.0.0/4 (i.e. all multicast groups with the exception of the Auto-RP groups). However, the data reflects that this information was most recently received from router A. The flip-flop of the information source in the router's local Group-to-RP mapping cache has little or no performance impact on the router.
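The "info source flip-flop" can be illustrated with a small Python sketch. It models a router's local cache as a dictionary and records who sent each mapping; the class and method names are hypothetical, and the model ignores holdtimes for brevity.

```python
class RouterRpCache:
    """Hypothetical model of a router's local Group-to-RP mapping cache."""

    def __init__(self):
        # {group_range: (rp_address, info_source)}
        self.mappings = {}

    def receive_discovery(self, source_ma, elected):
        """Install mappings from one RP-Discovery message.

        Returns True only if the RP for some group range actually changed;
        a change of info source alone does not count as an RP change.
        """
        rp_changed = False
        for group_range, rp in elected.items():
            old = self.mappings.get(group_range)
            if old is None or old[0] != rp:
                rp_changed = True
            self.mappings[group_range] = (rp, source_ma)
        return rp_changed


router_x = RouterRpCache()
discovery = {"224.0.0.0/4": "2.2.2.2"}

print(router_x.receive_discovery("172.16.2.2", discovery))  # True  (first load, from B)
print(router_x.receive_discovery("172.16.2.1", discovery))  # False (same RP, from A)
print(router_x.mappings["224.0.0.0/4"])                     # ('2.2.2.2', '172.16.2.1')
```

As long as both Mapping Agents advertise the same RP, only the recorded source alternates. If they disagreed, the RP itself would alternate, which is the inconsistency the earlier slide warns about.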


PIMv2 BSR Overview


A single Bootstrap Router (BSR) is elected
  Multiple Candidate BSRs (C-BSR) can be configured
  Provides backup in case currently elected BSR fails
C-RPs send C-RP announcements to the BSR
  C-RP announcements are sent via unicast
  BSR stores ALL C-RP announcements in the RP-set
BSR periodically sends BSR messages to all routers
  BSR messages contain entire RP-set and IP address of BSR
  Messages are flooded hop-by-hop throughout the network away from the BSR
All routers select the RP from the RP-set
  All routers use the same selection algorithm; select same RP
BSR cannot be used with Admin-Scoping

BSR Overview
Bootstrap Router (BSR)
  A single router is elected as the BSR from a collection of Candidate BSRs.
  If the current BSR fails, a new election is triggered.
  The election mechanism is pre-emptive, based on C-BSR priority.

Candidate RPs (C-RPs)
  Send C-RP announcements directly to the BSR via unicast. (Note: C-RPs learn the IP address of the BSR via periodic BSR messages.)
  The BSR stores the complete collection of all received C-RP announcements in a database called the RP-set.

The BSR periodically sends out BSR messages to all routers in the network to let them know the BSR is still alive.
  BSR messages are flooded hop-by-hop throughout the network. They are multicast to the All-PIM-Routers group (224.0.0.13) with a TTL of 1.
  BSR messages also contain:
    The complete RP-set, consisting of all C-RP announcements.
    The IP address of the BSR, so that C-RPs know where to send their announcements.

All routers receive the BSR messages being flooded throughout the network.
  Each router selects the active RP for each group range using a common hash algorithm that is run against the RP-set. This results in all routers in the network selecting the same RP for a given group range.

BSR cannot be used with Admin-Scoping!
  Admin scoping was not considered when BSR was designed. One problem is that C-RP announcements that are unicast to the BSR cross multicast boundaries. There are several other problems as well.

PIMv2 BSR Fundamentals


Candidate RPs
  Unicast PIMv2 C-RP messages to BSR
    Learns IP address of BSR from BSR messages
    Sent every rp-announce-interval (default: 60 sec)
  C-RP messages contain:
    Group Range (default = 224.0.0.0/4)
    Candidate's RP address
    Holdtime = 3 x <rp-announce-interval>
  Configured via global config command
    ip pim rp-candidate <intfc> [group-list <acl>]

BSR Candidate RPs (C-RP)


C-RP Messages
  Sent periodically (default: 60 sec) directly to the BSR via unicast.
  Messages contain the Group-range, C-RP address and a holdtime.
  The IP address of the current BSR is learned from the periodic BSR messages that are received by all routers in the network.

C-RPs are configured using the following command:
  ip pim rp-candidate <intfc> [group-list <acl>]
The <intfc> parameter dictates the IP address that is advertised in the C-RP message. In most cases, a Loopback interface is used. The optional group-list access-list can be used to specify a group range other than the default of 224.0.0.0/4 (i.e. all multicast groups). This command may be configured more than once on a router so that the router will function as C-RP for multiple group ranges.


PIMv2 BSR Fundamentals


Bootstrap router (BSR)
  Receive C-RP messages
    Accepts and stores ALL C-RP messages
    Stored in Group-to-RP Mapping Cache w/holdtimes
  Originates BSR messages
    Multicast to All-PIM-Routers (224.0.0.13) group (sent with a TTL = 1)
    Sent out all interfaces. Propagate hop-by-hop
    Sent every 60 seconds or when changes detected
  BSR messages contain:
    Contents of BSR's Group-to-RP Mapping Cache
    IP address of active BSR

Bootstrap Router
The primary purpose of the Bootstrap router is to collect all C-RP announcements into a database called the RP-set and to periodically send the RP-set out to all other routers in the network inside BSR messages.

BSR Messages
  Sent periodically (default: 60 secs) by the BSR out all multicast interfaces.
  BSR messages are multicast to the All-PIM-Routers (224.0.0.13) group with a TTL of 1. These messages are received by all PIM neighbors, which retransmit them (again with a TTL of 1) out all interfaces except the one on which the message was received. (An RPF check is done to ensure the BSR message came in on the correct interface in the direction of the BSR.)
  BSR messages contain the RP-set and the IP address of the currently active BSR. (This is how C-RPs know where to unicast their C-RP messages.)
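The hop-by-hop flooding rule described above can be sketched as a single function. This is a minimal illustration, assuming the router already knows its RPF interface toward the BSR; the function name and the interface names used in the example are hypothetical.

```python
def forward_bsr_message(in_interface, rpf_interface_to_bsr, pim_interfaces):
    """Return the interfaces a BSR message should be re-sent on.

    The message is accepted only if it arrived on the RPF interface toward
    the BSR; it is then re-sent (with TTL 1) on every other PIM interface.
    """
    if in_interface != rpf_interface_to_bsr:
        return []   # failed the RPF check: drop, do not forward
    return [intf for intf in pim_interfaces if intf != in_interface]


interfaces = ["Ethernet0", "Ethernet1", "Serial0"]
print(forward_bsr_message("Ethernet0", "Ethernet0", interfaces))  # ['Ethernet1', 'Serial0']
print(forward_bsr_message("Serial0", "Ethernet0", interfaces))    # []
```

The RPF check is what keeps the flood loop-free: a copy that arrives over a redundant path fails the check and is not propagated further.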


PIMv2 BSR Fundamentals


Candidate bootstrap router (C-BSR)
  C-BSR with highest priority elected BSR
    C-BSR IP address used as tie-breaker (highest IP address wins)
  The active BSR may be preempted
    New router w/higher BSR priority forces new election
  Configured via global config command
    ip pim bsr-candidate <intfc> <hash-length> [priority <pri>]
      <intfc>        Determines IP address
      <hash-length>  Sets RP selection hash mask length
      <pri>          Sets the C-BSR priority (default = 0)

Candidate Bootstrap Routers (C-BSRs)


C-BSRs participate in the BSR election mechanism.
  The C-BSR with the highest priority is elected as the BSR.
  The highest IP address of the C-BSRs is used as a tie-breaker.
  The election mechanism is preemptive. If a new C-BSR with a higher priority comes up, it triggers a new election.

C-BSRs are configured using the following command:
  ip pim bsr-candidate <intfc> <hash-length> [priority <pri>]
The <intfc> parameter specifies the BSR's IP address, which is carried in BSR messages. (This is where C-RPs will send their messages if the C-BSR is elected as BSR.)
The <hash-length> parameter specifies the number of bits in the hash mask. This can be used to control RP load balancing across a group range: different RPs are selected for different blocks of groups within the range, and the block size is determined by the hash length in bits.
The optional <pri> value permits the C-BSR to be configured with a priority other than the default of zero.
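The outcome of the election (highest priority wins, highest IP address breaks ties) can be sketched as a comparison on a (priority, address) pair. This shows only the end result, not the message exchange that produces it; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def elect_bsr(candidates):
    """Pick the BSR from a list of (address, priority) tuples.

    Highest priority wins; the numerically highest IP address breaks ties.
    """
    winner = max(candidates, key=lambda c: (c[1], ipaddress.ip_address(c[0])))
    return winner[0]


# Equal (default) priorities: the highest IP address wins.
print(elect_bsr([("10.0.0.9", 0), ("10.0.0.10", 0)]))                   # 10.0.0.10

# A higher priority preempts, regardless of address.
print(elect_bsr([("10.0.0.9", 0), ("10.0.0.10", 0), ("10.0.0.1", 5)]))  # 10.0.0.1
```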


PIMv2 BSR Fundamentals


BSR election mechanism
  C-BSRs:
    Begin in Candidate-BSR state
      BSR-Timeout timer started (150 seconds)
      If higher priority (preferred) BSR message received
        Restart timer and forward BSR message
        Copy info to local Group-to-RP mapping cache
      Otherwise, discard BSR message
      If timer expires, transition to Elected-BSR state
    While in Elected-BSR state
      Periodically originate own BSR messages
        Include local Group-to-RP mapping cache in msg
      Return to Candidate-BSR state if preferred BSR message (higher priority) received

BSR Election
How and when routers in the network forward BSR messages plays a key role in the BSR election mechanism. The algorithm used to decide whether to process and forward an incoming BSR message depends on whether the router is a Candidate BSR or not.

BSR Message forwarding by C-BSR routers


C-BSRs operate in one of two states, Candidate-BSR or Elected-BSR. Initially, a C-BSR comes up in C-BSR state.

C-BSR State
A BSR-Timeout timer is started with a period of 150 seconds. If this timer expires, the router transitions to the Elected-BSR state.
If a BSR message is received with a higher priority than the C-BSR's priority, then the BSR (whose address is in the BSR message) is considered to be preferred and the BSR message is processed as follows:
  The BSR-Timeout timer is reset.
  The BSR message is forwarded out all other interfaces.
  The RP-set in the BSR message is copied into the local Group-to-RP mapping cache.
If a BSR message is received with a priority less than the C-BSR's priority, the BSR message is simply discarded.

Elected BSR State


The router has been elected as the BSR and periodically originates BSR messages containing the current RP-set. If a BSR message is received from another router with a higher priority, forward the BSR message and transition back to C-BSR state; otherwise discard the BSR message.
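The two C-BSR states described above can be sketched as a small state machine. This is a simplified model, not the IOS implementation: class and method names are hypothetical, timers are represented only by an explicit `on_timer_expired()` call, and the IP address tie-break is folded into the "preferred" comparison.

```python
import ipaddress


class CandidateBsr:
    """Hypothetical model of a C-BSR's handling of incoming BSR messages."""

    def __init__(self, address, priority=0):
        # (priority, address) pairs compare priority first, then address.
        self.key = (priority, ipaddress.ip_address(address))
        self.state = "candidate"   # "candidate" or "elected"
        self.rp_set = {}

    def on_bsr_message(self, bsr_address, bsr_priority, rp_set):
        sender = (bsr_priority, ipaddress.ip_address(bsr_address))
        if sender <= self.key:
            return "discard"       # not preferred over this C-BSR
        # Preferred BSR: (re)enter Candidate-BSR state, restart the
        # BSR-Timeout timer (not modelled), copy the RP-set, and forward.
        self.state = "candidate"
        self.rp_set = dict(rp_set)
        return "forward"

    def on_timer_expired(self):
        # No preferred BSR heard for 150 seconds: take over as BSR.
        self.state = "elected"


c = CandidateBsr("10.0.0.2", priority=5)
print(c.on_bsr_message("10.0.0.9", 1, {}))   # discard (lower priority)
c.on_timer_expired()
print(c.state)                               # elected
print(c.on_bsr_message("10.0.0.1", 9, {}))   # forward (higher priority)
print(c.state)                               # candidate
```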

PIMv2 BSR Fundamentals

BSR election mechanism


Non C-BSRs (i.e., all other routers):
  Start in Accept-Any state
    Accepts first BSR message received
    Saves BSR info and forwards BSR message
    Transitions to Accept-Preferred state
  While in Accept-Preferred state
    Starts BSR-Timeout timer
    Only accept and forward preferred BSR messages
      (i.e., BSR messages with priority >= current BSR priority)
    Otherwise, discard BSR message
    Return to Accept-Any state if timer expires

BSR Message forwarding by Non C-BSRs


Non C-BSR routers operate in two states, Accept-Any state and Accept-Preferred state. When a non C-BSR router boots up, it starts in the Accept-Any state.

Accept-Any State
Accept the first BSR message received and process it as follows:
  Copy the RP-set into the local Group-to-RP mapping cache.
  Save the BSR priority and BSR address carried in the BSR message as the current BSR.
  Forward the BSR message out all other interfaces.
  Transition to the Accept-Preferred state.

Accept-Preferred State
Start the BSR-Timeout timer with a period of 150 seconds. If this timer expires, transition back to the Accept-Any state.
Accept only BSR messages that are preferred. (A preferred BSR message is one with a priority greater than or equal to the current BSR priority.) The accepted BSR message is then processed as follows:
  The BSR-Timeout timer is reset.
  The BSR message is forwarded out all other interfaces.
  The RP-set in the BSR message is copied into the local Group-to-RP mapping cache.
  The BSR priority and BSR address carried in the BSR message are saved as the current BSR.
If a BSR message is received with a priority less than the current BSR's priority, the BSR message is simply discarded. (Remember, the IP address of the BSR is used to break any ties, with the winner being the C-BSR with the highest IP address.)
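The Accept-Any / Accept-Preferred behaviour can be sketched in the same style. This is again a simplified model with hypothetical names; the BSR-Timeout timer is represented only by an explicit `on_timer_expired()` call, and "preferred" is taken to mean a (priority, address) pair greater than or equal to that of the current BSR.

```python
import ipaddress


class NonCandidateRouter:
    """Hypothetical model of BSR message handling on a non C-BSR router."""

    def __init__(self):
        self.state = "accept-any"
        self.current_bsr = None    # (priority, address) of the accepted BSR
        self.rp_set = {}

    def on_bsr_message(self, bsr_address, bsr_priority, rp_set):
        sender = (bsr_priority, ipaddress.ip_address(bsr_address))
        if self.state == "accept-preferred" and sender < self.current_bsr:
            return "discard"       # less preferred than the current BSR
        # Accepted: save BSR info and RP-set, restart the timer (not
        # modelled), and forward out all other interfaces.
        self.current_bsr = sender
        self.rp_set = dict(rp_set)
        self.state = "accept-preferred"
        return "forward"

    def on_timer_expired(self):
        # Current BSR has gone quiet for 150 seconds: accept anyone again.
        self.state = "accept-any"
        self.current_bsr = None


r = NonCandidateRouter()
print(r.on_bsr_message("10.0.0.1", 5, {}))   # forward (first message seen)
print(r.on_bsr_message("10.0.0.1", 5, {}))   # forward (refresh from same BSR)
print(r.on_bsr_message("10.0.0.9", 0, {}))   # discard (lower priority)
r.on_timer_expired()
print(r.on_bsr_message("10.0.0.9", 0, {}))   # forward (back in Accept-Any)
```

The "greater than or equal" comparison is what lets periodic messages from the current BSR keep refreshing the timer.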


PIMv2 BSR Fundamentals

All PIMv2 routers
  Receive BSR messages
    Stored in local Group-to-RP Mapping Cache
    Information used to determine active BSR address
  Selects RP using Hash algorithm
    Selected from local Group-to-RP Mapping Cache
    All routers select same RP using same algorithm
    Permits RP-load balancing across group range

All PIMv2 Routers


Accept BSR messages based on the rules described in the previous pages. When a BSR message is accepted:
  The RP-set in the BSR message is stored in the local Group-to-RP mapping cache.
  The BSR message is forwarded out all other interfaces (except the one on which it was received) on the router.

Selects RP using a Hash Algorithm
  The RP for a group is selected from the set of C-RPs (stored in the Group-to-RP mapping cache) that have advertised their candidacy for a matching group range.
  The same hashing algorithm is used by all routers to select the RP from the set of C-RPs in the RP-set. Since all routers run the same algorithm on the same RP-set (received from the BSR), all routers will select the same RP for a given group.
  The hashing algorithm permits multiple C-RPs to load balance the duties of RP across a range of groups. Only one C-RP will be selected as RP for any single group in the group range. However, the hash algorithm may select other C-RPs as RP for another group within the group range.

For example, a BSR hash length of 30 bits applied to IPv4 group addresses leaves a remainder of 2 bits of the 32-bit address, i.e. blocks of 4 consecutive group addresses that are served by the same RP. In this scenario, if C-RP routers A and B both advertise their candidacy for group range 224.1.1.0/24 and the hash algorithm selects router A as RP for 224.1.1.0, the hash length of 30 bits will also cause router A to be selected as RP for groups 224.1.1.1, 224.1.1.2 and 224.1.1.3 (i.e. a contiguous group range of 4 addresses). If the hash algorithm selects router B as RP for group 224.1.1.4, it will also select router B for groups 224.1.1.5, 224.1.1.6 and 224.1.1.7.
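The effect of the hash length can be sketched in Python. The module does not give the hash function itself; the formula below is the one defined in the PIMv2 sparse-mode specification (RFC 2362), reproduced here as an assumption, so treat the constants as illustrative. The C-RP addresses are the ones used in the Auto-RP example and the function names are hypothetical. The point to observe is that only the masked bits of the group address enter the hash, so every group in a 4-address block maps to the same RP.

```python
import ipaddress


def hash_value(group, hash_mask_len, c_rp):
    # Assumed PIMv2 hash: only the top hash_mask_len bits of the group
    # address contribute, so groups sharing those bits share a value.
    g = int(ipaddress.ip_address(group))
    c = int(ipaddress.ip_address(c_rp))
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    return (1103515245 * ((1103515245 * (g & mask) + 12345) ^ c) + 12345) % 2**31


def select_rp(group, hash_mask_len, c_rps):
    # Highest hash value wins; highest C-RP address breaks ties.
    return max(c_rps, key=lambda rp: (hash_value(group, hash_mask_len, rp),
                                      ipaddress.ip_address(rp)))


c_rps = ["1.1.1.1", "2.2.2.2"]
for last_octet in range(8):
    group = f"224.1.1.{last_octet}"
    print(group, "->", select_rp(group, 30, c_rps))
```

Every router that runs this on the same RP-set gets the same answers, and groups 224.1.1.0 through 224.1.1.3 always share one RP, as do 224.1.1.4 through 224.1.1.7. Which C-RP each block lands on depends on the hash values.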

Basic PIMv2 BSR

[Figure: PIMv2 Sparse Mode network with routers A through G, one BSR and two C-RPs. The C-RPs send C-RP Advertisements (unicast) to the BSR. BSR Msgs are flooded hop-by-hop.]

BSR Example
Step 1 Candidate RPs unicast their C-RP messages to the previously elected BSR. (The C-RPs learned the IP address of the BSR from the BSR messages that are being flooded throughout the network.)
Step 2 The BSR receives and stores ALL C-RP information in a database called the RP-set (which is stored in the Group-to-RP mapping cache on Cisco routers).
Step 3 The BSR periodically sends BSR messages containing the RP-set out all of its interfaces. These BSR messages are forwarded hop-by-hop away from the BSR by all routers in the network. The RP-set is used by all routers in the network to calculate the RP for a group using a common hash algorithm.


Static RPs
Hard-coded RP address
  When used, must be configured on every router
  All routers must have the same RP address
  RP fail-over not possible
    Exception: If Anycast RPs are used. (See MSDP module.)
Command
  ip pim rp-address <address> [group-list <acl>] [override]
Optional group list specifies group range
  Default: Range = 224.0.0.0/4 (Includes Auto-RP Groups!!!!)
Override keyword overrides Auto-RP information
  Default: Auto-RP learned info takes precedence

Hard-code RP Addresses
Requires every router in the network to be manually configured with the IP address of a single RP. If this RP fails, there is no way for routers to fail-over to a standby RP. The exception to this rule is if Anycast-RPs are in use. This requires MSDP to be running between each RP in the network.

Command
ip pim rp-address <address> [group-list <acl>] [override]
The group-list allows a group range to be specified. The default is ALL multicast groups, or 224.0.0.0/4.

DANGER, WILL ROBINSON!!! The default range includes the Auto-RP groups (224.0.1.39 and 224.0.1.40), which will cause this router to attempt to operate these groups in Sparse mode. This is normally not desirable and can often lead to problems where some routers in the network are trying to run these groups in Dense mode (which is the normal method) while others are trying to use Sparse mode. This will result in some routers in the network being starved of Auto-RP information. This, in turn, can result in members of some groups not receiving multicast traffic.

The override keyword permits the statically defined RP address to take precedence over Auto-RP learned Group-to-RP mapping information. The default is that Auto-RP learned information has precedence.
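Both points above (the default range swallowing the Auto-RP groups, and the precedence controlled by the override keyword) can be checked with a short Python sketch. The helper names are hypothetical, and the precedence function models only the rule stated in the text, not the full IOS mapping logic.

```python
import ipaddress

AUTO_RP_GROUPS = ("224.0.1.39", "224.0.1.40")   # Cisco-Announce, Cisco-Discovery


def covered_auto_rp_groups(group_range):
    """Return the Auto-RP groups that fall inside a static RP's group range."""
    net = ipaddress.ip_network(group_range)
    return [g for g in AUTO_RP_GROUPS if ipaddress.ip_address(g) in net]


def choose_rp(static_rp, auto_rp, override=False):
    """Pick between a static RP and an Auto-RP learned RP for one group."""
    if auto_rp is None:
        return static_rp
    if static_rp is None:
        return auto_rp
    # Auto-RP learned information wins unless 'override' is configured.
    return static_rp if override else auto_rp


print(covered_auto_rp_groups("224.0.0.0/4"))   # ['224.0.1.39', '224.0.1.40']
print(covered_auto_rp_groups("239.0.0.0/8"))   # []
print(choose_rp("10.1.1.1", "2.2.2.2"))                  # 2.2.2.2
print(choose_rp("10.1.1.1", "2.2.2.2", override=True))   # 10.1.1.1
```

A check like the first one is a cheap way to confirm that a group-list ACL for a static RP leaves 224.0.1.39 and 224.0.1.40 out of the Sparse mode range.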


RP Placement

Q: Where do I put the RP?
A: Generally speaking, it's not critical

SPTs are normally used by default
  RP is a place for source and receivers to meet
  Traffic does not normally flow through the RP
  RP is therefore not a bottleneck

Exception: SPT-Threshold = Infinity
  Traffic stays on the shared tree
  RP could become a bottleneck


PIM MFAQ (Most Frequently Asked Question)


Q: Where should I put the RP?

A: Generally speaking, it is not critical. The default behavior of PIM-SM is to switch to the Shortest-Path Tree (aka the Source Tree) and bypass the RP as soon as a new source is detected. This means that in most cases, multicast traffic does not flow through the RP. Therefore, the RP does not become a point of congestion.

The default behavior can be overridden in Cisco routers by setting the SPT-Threshold to Infinity. This prevents the Cisco router from joining the SPT and keeps all group traffic flowing down the Shared Tree. In this case, the RP could become a bottleneck.


RP Performance Considerations
CPU load factors
  RP must process Registers
  RP must process Shared-tree Joins/Prunes
  RP must send periodic SPT Joins toward source
  PIM performs RPF recalculation every 5 seconds
    Watch the total number of mroute table entries in the RP
  Shared-tree forwarding
    Only when spt-threshold = infinity is in use

Memory load factors
  (*, G) entry ~ 380 bytes + OIL size
  (S, G) entry ~ 220 bytes + OIL size
  Outgoing interface list (OIL) size
    Each OIL entry ~ 150 bytes

RP Performance Considerations
CPU Load Factors

The RP will receive all Register messages for any new sources in the network. Although Register messages are processed at Process Level, the impact on the router is usually small since the RP immediately sends back a Register-Stop message.

The RP will receive and must process all Shared Tree Join/Prune messages from downstream routers on the Shared Tree. Downstream routers continue to send periodic (once a minute) Join/Prune messages up the Shared Tree. The number of these Join/Prune messages is generally quite small and therefore has little impact on the RP.

The RP must send periodic (once a minute) SPT Joins toward all sources for which it has members active on a branch of the Shared Tree.

In order to detect a network topology change, ALL PIM routers perform an RPF recalculation on every (*, G) and (S, G) entry in the mroute table every 5 seconds. The impact of this grows as the total number of entries in the mroute table increases and as the number of entries in the unicast routing table increases. (The latter is because each RPF calculation requires the route to the source to be looked up in the unicast routing table. If this table is quite large, as would be the case for poorly aggregated address space, the lookup can take more effort than when the number of entries is kept small.) Except for the following load factor, this is the most significant CPU load factor.

Any traffic that does have to flow through the RP requires it to replicate the packets out all outgoing interfaces.

Memory Factors

The amount of memory consumed by PIM is primarily a function of the size of the mroute table. (See the numbers in the slide for details.)
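As a rough worked example of the memory figures on the slide, the following Python sketch adds up the approximate per-entry sizes. The function name and the sample load (100 groups, two sources each, three outgoing interfaces per entry) are hypothetical and chosen only for illustration; the byte counts are the approximate values quoted on the slide.

```python
# Approximate per-entry figures quoted on the slide.
STAR_G_BYTES = 380      # (*, G) entry, excluding its OIL
S_G_BYTES = 220         # (S, G) entry, excluding its OIL
OIL_ENTRY_BYTES = 150   # each outgoing interface list (OIL) entry

def mroute_memory_bytes(entries):
    """Estimate mroute table memory.

    entries: iterable of (kind, oil_size) pairs, where kind is "*G" or "SG"
    and oil_size is the number of interfaces in that entry's OIL.
    """
    base_sizes = {"*G": STAR_G_BYTES, "SG": S_G_BYTES}
    total = 0
    for kind, oil_size in entries:
        if kind not in base_sizes:
            raise ValueError(f"unknown entry kind: {kind!r}")
        total += base_sizes[kind] + oil_size * OIL_ENTRY_BYTES
    return total

# Hypothetical load: 100 groups, each with one (*, G) and two (S, G) entries,
# every entry carrying three outgoing interfaces.
per_group = [("*G", 3), ("SG", 3), ("SG", 3)]
print(mroute_memory_bytes(per_group * 100))  # 217000
```

In this hypothetical case the table needs roughly 217 KB, which illustrates why memory only becomes a concern with a very large number of entries.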

Dealing with Overloaded RPs

Increase CPU horsepower
Increase memory
Use SPTs if not already
Split RP load across several RPs!


Dealing with overloaded RPs


If possible, increase the CPU horsepower of the RP. In some cases, this can be accomplished by changing the Route Processor (RP) or RSP card in the router.

If the multicast traffic in the network results in an extremely large number of mroute table entries, it may be necessary to increase the amount of memory in the router. This scenario is not likely to occur except where a low-end router with a minimal amount of memory is used in a network with a large number of multicast sources.

If the RP is overloaded due to multicast packet replication and forwarding demands, ensure that Shortest Path Trees are in use by making sure all routers in the network have an SPT-Threshold set to the default value of zero.

If all else fails, split the RP load across several RPs by assigning different group ranges to different RPs. (The Anycast-RP technique can be used in conjunction with MSDP to allow more than one RP to be active for a single group and to share the load.)


Auto-RP Announcement Scope


[Figure: PIM Sparse Mode network, network diameter = 32 hops. Candidate-RP C sends RP Announcements with scope 16; Mapping Agents A and B are shown. Callouts: "RP Announcements Leaking Outside of Network" and "RP Announcements Not Reaching this Map Agent".]



Auto-RP Announcement Scope


Care must be taken in selecting the TTL scope of the RP Announcement messages sent by C-RPs to ensure that the messages reach all Mapping Agents in the network.

Example
In the diagram above, an arbitrary scope of 16 was used in the ip pim send-rp-announce command on the C-RP router. However, the maximum diameter of the network is greater than 16 hops, and in this case one Mapping Agent is more than 16 hops away. As a result, this Mapping Agent does not receive the RP Announcement messages from the C-RP. This can cause the two Mapping Agents to have different information in their Group-to-RP mapping caches. If this occurs, each Mapping Agent will advertise a different router as the RP for a group, which will have disastrous results.

Notice also that the C-RP is fewer than 16 hops away from the edge of the network. This can result in RP Announcement messages leaking into adjacent networks and causing Auto-RP problems in those networks.
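The two failure modes in this example (messages falling short and messages leaking out) can be sketched as a simple hop-count comparison. This Python sketch assumes a simplified model in which a message sent with a given scope reaches every router at most that many hops away; the function name and all hop counts are hypothetical.

```python
def check_scope(scope, hops_to_targets, hops_to_edge):
    """Check a TTL scope against a simplified hop-count model.

    scope: TTL scope configured on the sender.
    hops_to_targets: dict of router name -> hop distance from the sender.
    hops_to_edge: hop distance from the sender to the nearest border router.

    Returns (unreached, leaks): the sorted names of routers the messages do
    not reach, and whether the messages can travel past the network edge.
    """
    unreached = sorted(name for name, hops in hops_to_targets.items() if hops > scope)
    leaks = hops_to_edge < scope
    return unreached, leaks

# Hypothetical distances from the C-RP: MA "A" is 10 hops away, MA "B" is 20,
# and the nearest border router is 8 hops away.
distances = {"A": 10, "B": 20}
print(check_scope(16, distances, 8))  # (['B'], True)
print(check_scope(32, distances, 8))  # ([], True)
```

Under these assumed distances, raising the scope from 16 to 32 fixes reachability but the messages can still leak, so a larger scope has to be paired with filtering at the network border.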


Auto-RP Announcement Scope


[Figure: PIM Sparse Mode network, network diameter = 32 hops. Candidate-RP C now sends RP Announcements with scope 32, so both Mapping Agents (A and B) receive announcements from the Candidate-RP. RP Announcements (224.0.1.39) are blocked from leaving or entering the network using ip multicast boundary commands.]

Auto-RP Announcement Scope


The best way to avoid the problems on the preceding page is to use a sufficiently large scope so that the RP Announcement messages reach all Mapping Agents in the network.

Example
In the above diagram, the maximum network diameter is 32. Therefore, by setting the scope to 32 or greater, we are assured that the RP Announcements will reach both Mapping Agents shown in the example network.

In order to prevent RP Announcement messages from leaking into adjacent networks, a multicast boundary is defined for the Cisco-Announce (224.0.1.39) multicast group on all border routers in the network. This not only stops RP Announcement messages from leaking out but, more importantly, stops any from leaking in from adjacent networks.


Auto-RP Discovery Scope


[Figure: PIM Sparse Mode network, network diameter = 32 hops. Mapping Agent A sends RP Discovery messages with scope 16. Callouts: "RP Discovery Messages Leaking Outside of Network" and "RP Discovery Messages Not Reaching this Router (Assumes All Groups Are Dense Mode)".]



Auto-RP Discovery Scope


Care must be taken in selecting the TTL scope of the RP Discovery messages sent by Mapping Agents to ensure that the messages reach all routers in the network.

Example
In the diagram above, an arbitrary scope of 16 was used in the ip pim send-rp-discovery command on the Mapping Agent. However, the maximum diameter of the network is greater than 16 hops, and in this case at least one router is more than 16 hops away. As a result, this router does not receive the RP Discovery messages from the MA. This can result in the router having no Group-to-RP mapping information. If this occurs, the router will attempt to operate in Dense mode for all multicast groups while other routers in the network are working in Sparse mode.

Notice also that the MA is fewer than 16 hops away from the edge of the network. This can result in RP Discovery messages leaking into adjacent networks and causing Auto-RP problems in those networks.


Auto-RP Discovery Scope


[Figure: PIM Sparse Mode network, network diameter = 32 hops. Mapping Agent A now sends RP Discovery messages with scope 32, so they reach the farthest router. RP Discoveries (224.0.1.40) are blocked from leaving or entering the network using ip multicast boundary commands.]



Auto-RP Discovery Scope


The best way to avoid the problems on the preceding page is to use a sufficiently large scope so that the RP Discovery messages reach all routers in the network.

Example
In the above diagram, the maximum network diameter is 32. Therefore, by setting the scope to 32 or greater, we are assured that the RP Discovery messages will reach the farthest router in the network.

In order to prevent RP Discovery messages from leaking into adjacent networks, a multicast boundary is defined for the Cisco-Discovery (224.0.1.40) multicast group on all border routers in the network. This not only stops RP Discovery messages from leaking out but, more importantly, stops any from leaking in from adjacent networks.


Constraining Auto-RP Messages


Need to block Auto-RP Discovery (224.0.1.40) and Announcement (224.0.1.39) messages from entering/leaving the network

[Figure: PIM Sparse Mode network containing Mapping Agent A and Candidate-RP C, both sending with scope 32. A boundary router connects the network to the outside via interface S0.]

interface S0
 ip multicast boundary 10

access-list 10 deny 224.0.1.39
access-list 10 deny 224.0.1.40
access-list 10 permit any


Constraining Auto-RP Messages


This example shows how to configure the multicast boundary on a border router so that Auto-RP messages do not leak into or out of the network. On the border interface (in this case, Serial0), the ip multicast boundary command is used. The access list associated with the ip multicast boundary command is as follows:

access-list 10 deny 224.0.1.39
access-list 10 deny 224.0.1.40
access-list 10 permit any

The above access list stops the flow of multicast traffic for the two Auto-RP groups (224.0.1.39 and 224.0.1.40) while allowing all other multicast traffic to enter or exit via interface Serial0.
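To illustrate how the three access-list lines are evaluated, here is a small Python sketch that parses them and applies first-match-wins logic to a group address. It is only a model for reasoning about the filter: the parser is hypothetical and handles just the two forms used here (a single host address and "any"), not the full access-list syntax.

```python
import ipaddress

ACL_TEXT = """\
access-list 10 deny 224.0.1.39
access-list 10 deny 224.0.1.40
access-list 10 permit any
"""

def parse_acl(text):
    """Parse 'access-list <number> <action> <host|any>' lines into (action, target) pairs."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] != "access-list":
            continue  # skip anything outside the simple form modeled here
        _, _number, action, target = fields
        entries.append((action, target))
    return entries

def boundary_allows(entries, group):
    """Return True if traffic for the group may cross the boundary (first match wins)."""
    addr = ipaddress.ip_address(group)
    for action, target in entries:
        if target == "any" or addr == ipaddress.ip_address(target):
            return action == "permit"
    return False  # assumed implicit deny when nothing matches

acl = parse_acl(ACL_TEXT)
for group in ("224.0.1.39", "224.0.1.40", "239.1.1.1"):
    print(group, boundary_allows(acl, group))
```

The two Auto-RP groups are stopped at the boundary, while any other group (239.1.1.1 is just a sample address) is allowed through.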


Constraining BSR Messages


Need to block all BSR messages from entering/leaving the network

[Figure: PIMv2 Sparse Mode network containing the BSR. Two border routers connect the network, each via its interface S0, to a neighboring PIMv2 domain. BSR messages are stopped at both border routers.]

On each border router:

interface S0
 .
 .
 ip pim bsr-border


Constraining BSR Messages


Like Auto-RP, allowing BSR messages to leak into or out of a network can cause problems both in the local network and in adjacent networks. In order to block BSR messages from entering or exiting on a given interface, the ip pim bsr-border interface command can be used.


Controlling RP Acceptance
What really determines if a router is the RP

Any router will assume the duties of the RP if:
  It receives a (*,G) Join that contains an RP address that is the IP address of one of its interfaces
  AND
  The interface is multicast enabled
  AND
  The RP address matches the RP in the Group-to-RP mapping cache
  OR
  There is no entry in the Group-to-RP mapping cache.

Misconfigured routers could create multiple RPs in the network
  Each sends a (*,G) Join with a different RP address
  Each (*,G) Join results in another RP for the same group

The Accept-RP command provides additional control (insurance?) to prevent this.



By default, a router operates as an RP for a group if:


It receives a (*, G) Join containing one of its addresses as the RP, or
It receives an (S, G) Register message.

Basic sanity check


A router performs a rudimentary sanity check to see whether it actually should be the RP for group G by searching the Group-to-RP mapping cache for an entry for group G. If an entry is found and the RP for the group is not this router, the router discards the (*, G) Join or (S, G) Register and does not become the RP. Otherwise, it assumes that it is the RP for group G and takes on the duties of the RP.
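The basic sanity check can be written out as a short decision function. This Python sketch is illustrative only: the function names are hypothetical, the mapping cache is modeled as a dictionary of group prefix to RP address, and the longest-prefix lookup is an assumption rather than something stated in this material.

```python
import ipaddress

def lookup_rp(mapping_cache, group):
    """Return the RP for the group from {prefix: rp_address}, or None if no entry.

    Assumption: the most specific (longest) matching prefix wins.
    """
    addr = ipaddress.ip_address(group)
    best = None
    for prefix, rp in mapping_cache.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, rp)
    return best[1] if best else None

def assumes_rp_duties(local_addresses, mapping_cache, group, rp_in_message):
    """Apply the basic sanity check to a received (*, G) Join or Register."""
    if rp_in_message not in local_addresses:
        return False                      # the message names some other router as RP
    cached_rp = lookup_rp(mapping_cache, group)
    if cached_rp is None:
        return True                       # no cache entry: assume RP duties
    return cached_rp in local_addresses   # entry found: it must point at this router

local = {"10.1.4.1"}
cache = {"224.0.0.0/4": "10.1.4.1", "239.0.0.0/8": "10.9.9.9"}
print(assumes_rp_duties(local, cache, "224.1.1.1", "10.1.4.1"))  # True
print(assumes_rp_duties(local, cache, "239.1.1.1", "10.1.4.1"))  # False
print(assumes_rp_duties(local, {}, "239.1.1.1", "10.1.4.1"))     # True
```

The last call shows why the check is only rudimentary: with an empty mapping cache, any Join naming a local address turns the router into an RP.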

Extended sanity checking


In order to provide additional control and sanity checking over which router should be accepted as the RP, the IOS command ip pim accept-rp was created.


Controlling RP Acceptance
Accept-RP Command
Global configuration commands
  ip pim accept-rp <rp-address> [<acl>]
  ip pim accept-rp Auto-rp [<acl>]
  ip pim accept-rp 0.0.0.0 [<acl>]

Multiple commands accepted
  Command list sorted in order shown above
  Only one Auto-RP and one 0.0.0.0 (wildcard) accepted
  Omitting ACL implies 224.0.0.0/4 group range

Search Rules
  Top down search
  Stop on RP address match: apply ACL and exit
    Exception: Auto-RP denies RP/Group
      Apply 0.0.0.0 (wildcard) entry (if it exists)

Command format
ip pim accept-rp <rp-address> [<acl>]
ip pim accept-rp Auto-rp [<acl>]
ip pim accept-rp 0.0.0.0 [<acl>]

The optional <acl> specifies which groups are valid using standard permit and deny clauses. Omitting the <acl> implies permit 224.0.0.0 15.255.255.255.

If more than one of the above commands is configured, they are sorted in the order shown above.

The Auto-rp entry matches any RP address learned via Auto-RP. (Note: This form has implied deny clauses for the Auto-RP groups, 224.0.1.39 and 224.0.1.40, that cannot be overridden in the optional <acl>. This helps prevent the Auto-RP groups from accidentally switching to Sparse mode.)

The 0.0.0.0 (wildcard) entry matches any RP address.

While multiple ip pim accept-rp <rp-address> commands may be configured, only a single Auto-rp and a single 0.0.0.0 (wildcard) command are accepted.

Search Rules
The list of configured commands is searched from the top down, and the search stops at the first entry that matches the RP address. The <acl> is applied and the RP is either permitted or denied.

Exception: If an Auto-RP entry denies an RP and a 0.0.0.0 entry exists, the 0.0.0.0 entry is also tried.
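The search rules can be modeled in a few lines of Python. This is a sketch of the rules as described here, not of the IOS implementation. Entries are hypothetical tuples of (kind, RP address, ACL), ACLs are ordered (action, prefix) pairs with an assumed implicit deny, and two behaviors the text does not spell out are labeled as assumptions in the comments.

```python
import ipaddress

AUTO_RP_GROUPS = {"224.0.1.39", "224.0.1.40"}
SORT_ORDER = {"address": 0, "auto-rp": 1, "wildcard": 2}

def acl_permits(acl, group):
    """First match wins; an omitted ACL (None) implies permit 224.0.0.0/4."""
    if acl is None:
        acl = [("permit", "224.0.0.0/4")]
    addr = ipaddress.ip_address(group)
    for action, prefix in acl:
        if addr in ipaddress.ip_network(prefix):
            return action == "permit"
    return False

def accept_rp(entries, rp, group, auto_rp_learned=()):
    """entries: list of (kind, rp_address_or_None, acl_or_None) tuples."""
    if not entries:
        return True  # ASSUMPTION: no accept-rp commands means no filtering
    # Specific addresses first, then Auto-rp, then the 0.0.0.0 wildcard.
    ordered = sorted(entries, key=lambda e: SORT_ORDER[e[0]])
    wildcard = next((e for e in ordered if e[0] == "wildcard"), None)
    for kind, address, acl in ordered:
        if kind == "address" and address == rp:
            return acl_permits(acl, group)            # stop on first match
        if kind == "auto-rp" and rp in auto_rp_learned:
            # Implied deny for the Auto-RP groups, regardless of the ACL.
            if group not in AUTO_RP_GROUPS and acl_permits(acl, group):
                return True
            # Exception: a deny here still lets the wildcard entry be tried.
            return wildcard is not None and acl_permits(wildcard[2], group)
        if kind == "wildcard":
            return acl_permits(acl, group)
    return False  # ASSUMPTION: commands exist but none matched the RP

entries = [
    ("wildcard", None, [("permit", "224.2.0.0/16")]),
    ("address", "10.1.1.1", [("permit", "239.0.0.0/8")]),
    ("auto-rp", None, None),
]
print(accept_rp(entries, "10.1.1.1", "239.1.1.1"))                # True
print(accept_rp(entries, "10.1.1.1", "224.2.1.1"))                # False
print(accept_rp(entries, "10.2.2.2", "224.0.1.40", {"10.2.2.2"}))  # False
print(accept_rp(entries, "10.9.9.9", "224.2.5.5"))                # True
```

The second call returns False even though the wildcard entry would permit the group: the search stops at the matching address entry, and only an Auto-RP denial falls through to the wildcard.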


Controlling RP Acceptance

Accept-RP Command Usage


Case 1: Controlling Group mode
Case 2: Accepting (*, G) joins
Case 3: Accepting PIM registers at the RP


Accept-RP: Case 1

Controlling Group Mode

[Figure: A router that has no state for group G receives an IGMP Join (G). The group address is used to search the Group-to-RP mapping cache; the resulting group address and RP address are passed through the Accept-RP filter list; on a permit, the state creation engine creates the group in Sparse mode.]


Case 1 - Controlling Group Mode


If a router has no (*, G) state when an IGMP Membership Report is received for group G, the router applies the configured Accept-RP rules to determine whether there is a valid RP for this group. If there isn't, the (*, G) entry is created and set as a Dense mode group. If there is a valid RP, the (*, G) entry is set in Sparse mode.

Step 1: The Group-to-RP mapping cache is searched for the group address in the IGMP Join message. If an entry is not found, the group is created in Dense mode.

Step 2: If a matching entry is found in the Group-to-RP mapping cache, the Group and RP addresses are run through the Accept-RP filters. If a permit is returned, the group is created in Sparse mode; otherwise the group is created in Dense mode.
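The two steps reduce to a small decision function. In this Python sketch the mapping cache lookup and the Accept-RP filter are passed in as callables; both, along with the sample addresses, are hypothetical stand-ins used only to show the flow of the decision.

```python
def group_mode(group, lookup_rp, accept_rp_filter):
    """Decide the mode of a new (*, G) entry.

    lookup_rp(group) returns the RP address from the Group-to-RP mapping
    cache, or None when there is no entry.
    accept_rp_filter(rp, group) returns True for a permit, False for a deny.
    """
    rp = lookup_rp(group)              # Step 1: search the mapping cache
    if rp is None:
        return "dense"
    if accept_rp_filter(rp, group):    # Step 2: run through the Accept-RP filters
        return "sparse"
    return "dense"

# Hypothetical cache and a filter that denies the two Auto-RP groups.
cache = {"239.1.1.1": "10.1.1.1", "224.0.1.40": "10.1.1.1"}

def deny_auto_rp_groups(rp, group):
    return group not in ("224.0.1.39", "224.0.1.40")

print(group_mode("239.1.1.1", cache.get, deny_auto_rp_groups))   # sparse
print(group_mode("224.0.1.40", cache.get, deny_auto_rp_groups))  # dense
print(group_mode("239.9.9.9", cache.get, deny_auto_rp_groups))   # dense
```

The second call shows the filter keeping an Auto-RP group in Dense mode even though the cache holds an RP for it.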


Accept-RP: Case 2

Accepting (*, G) Joins

[Figure: Router B sends a (*, G) Join to router A; (*, G) Join messages contain the RP address. On router A, the group address and RP address from the Join are passed through the Accept-RP filter list; on a permit, the join processing engine processes the Join.]


Case 2 - Accepting (*, G) Joins


If a router receives a (*, G) Join, it applies the configured Accept-RP rules to determine whether the RP address contained in the (*, G) Join is valid.

Step 1 (not shown): The Group-to-RP mapping cache is searched for the group address in the (*, G) Join message. If an entry is found and it is a negative entry indicating that the group has been forced to always be in Dense mode, the (*, G) Join is not accepted and an error message is generated.

Step 2: The Group and RP addresses (contained in the (*, G) Join) are run through the Accept-RP filters. If a permit is returned, the (*, G) Join is processed normally; otherwise the (*, G) Join is dropped and an error message is generated.

Example
When Auto-RP is in use, the two Auto-RP groups, 224.0.1.39 and 224.0.1.40, should normally be operating in Dense mode. However, if a downstream router is misconfigured with a static RP address, it will send (*, G) Joins for these Auto-RP groups. The routers that receive these (*, G) Joins will create a (*, G) entry in Sparse mode for these Auto-RP groups. This can result in portions of the network trying to operate these groups in Dense mode while other parts of the network are operating in Sparse mode. This will generally cause the Auto-RP mechanisms to fail.

The following Accept-RP command will cause a router to reject any (*, G) Joins for the Auto-RP groups and prevent these Joins from propagating:

ip pim accept-rp Auto-rp


Accept-RP: Case 3

Accepting PIM Registers at the RP

[Figure: Source S sits behind router B. B sends an (S, G) Register, unicast to one of A's interface addresses, to router A (the RP). On A, the group address and interface address are passed through the Accept-RP filter list; on a permit, the RP processing engine accepts the Register.]


Case 3 - Accepting PIM Register messages


If a router receives a PIM Register message, it applies the configured Accept-RP rules to determine whether the router is permitted to be the RP.

Step 1 (not shown): The Group-to-RP mapping cache is searched for the group address in the Register message. If an entry is found and it is a negative entry indicating that the group has been forced to always be in Dense mode, the PIM Register is not accepted, a Register-Stop is sent back to the first-hop router and an error message is generated.

Step 2: The Group address (contained in the multicast packet encapsulated in the Register message) and the RP address (the destination IP address of the Register message) are run through the Accept-RP filters. If a permit is returned, the PIM Register is processed normally; otherwise a Register-Stop is sent back to the first-hop router and an error message is generated.


Filtering C-RP Announcements


Use on Mapping Agents to filter out bogus C-RPs
  Some protection from RP-Spoofing denial-of-service attacks
  Multiple commands may be configured as needed

Global command
  ip pim rp-announce-filter rp-list <acl> [group-list <acl>]

  rp-list <acl>
    Specifies from which routers C-RP Announcements are accepted.

  group-list <acl>
    Specifies which groups in the C-RP Announcement are accepted.
    If not specified, defaults to deny all groups


Filtering RP Announcements
Network Administrators may wish to configure Mapping Agents so that they will only accept C-RP Announcements from well-known routers in the network. This will prevent C-RP Announcements from bogus routers from being accepted and potentially being selected as the RP.

Global Command
ip pim rp-announce-filter rp-list <acl> [group-list <acl>]

The rp-list <acl> specifies the IP address(es) from which C-RP announcements will be accepted.

The optional group-list <acl> specifies the group range(s) that are acceptable for the routers in the rp-list. If not specified, the default group-list <acl> is deny all.

Multiple instances of this command may be configured.


Controlling Source Registration


New global command - IOS 12.0(6)
  ip pim accept-register [list <acl>] | [route-map <map>]
  Used on RP to filter incoming Register messages
  Filter on Source address alone (Simple ACL)
  Filter on (S, G) pair (Extended ACL)
  May use route-map to specify what to filter
    Filter by AS-PATH if (m)BGP is in use.

Prevents unwanted sources from sending
  First hop router blocks traffic from reaching net
  Note: Source can still send traffic on local wire

Controlling Source Registration


In some cases, it may be desirable to control which hosts in the network can actually source traffic to a group. While there is currently no way to prevent a bogus source from transmitting traffic on its local segment, we can prevent it from being registered to the RP. This will, in most cases, prevent this traffic from going past the first-hop router and reaching other hosts in the network. A new IOS command, ip pim accept-register, was introduced which, when configured on an RP, controls which (S, G) Register messages are accepted and which are rejected.

Global Command (IOS 12.0(6) or later)

ip pim accept-register [list <acl>] | [route-map <map>]

If the list <acl> is specified, the <acl> can be either a simple access list that controls which hosts may send to any group, or an extended access list that specifies which source and group address combinations are permitted to send or denied from sending.

If the route-map <map> is specified, only matching (S, G) traffic will be accepted. (Note: This permits other matching criteria, such as AS-PATH, to be considered.)



Debugging Auto-RP Operation


Understand the Auto-RP mechanisms
  This is the fundamental debugging tool for problems with Auto-RP!

Verify Group-to-RP Mapping Caches
  First on the Mapping Agents
    Other routers will learn Group-to-RP mapping info from these routers
    If not correct, use debug commands to see what's wrong
    Make sure all MAs have consistent Group-to-RP information
      If not, watch for TTL Scoping problems
  Then on other routers
    If info doesn't match MA, there is a problem distributing the information
    Use show and debug commands to find where the break is

Debugging Auto-RP
First and foremost, you must understand the fundamental mechanisms behind Auto-RP in order to debug problems!

Verify Group-to-RP Mapping Caches on Mapping Agents
Because other routers in the network will learn the Group-to-RP mapping information from the Mapping Agents, it is important that this information is correct on the Mapping Agents. If the information is not correct, verify that the C-RPs are configured correctly and that C-RP Announcements are being received properly by the Mapping Agent.

If multiple Mapping Agents are in use, make sure that their Group-to-RP mapping information is identical. If not, the routers in the network will oscillate between the different RPs selected by the Mapping Agents. Again, make sure all Mapping Agents are properly receiving Auto-RP Announcements from all C-RPs in the network. Watch out for TTL scoping problems!

Verify Group-to-RP Mapping Caches on all other routers
Group-to-RP mapping information should match the MAs. If not, verify that the router is properly receiving Auto-RP Discovery messages from the Mapping Agents. Again, watch out for TTL scoping problems!


Debugging Auto-RP Operation

Ensure Auto-RP group state is correct
  Should normally be in Dense mode
  Watch out for mixed DM and SM conditions
    Can occur when Static RPs are also defined
      Always deny Auto-RP groups on Static RP configurations
    Use Accept-RP filters on all routers as insurance
  Watch out for DM problems in NBMA networks
    (See Module 7 for details)


Common Problem - Incorrect Auto-RP Group mode


The two Auto-RP groups, 224.0.1.39 and 224.0.1.40, are normally run in Dense mode so that this information is flooded throughout the network. Only in very rare situations is it desirable to run these two groups in Sparse mode, because doing so creates a chicken-and-egg problem. (How do you join the RP for the Auto-RP groups if you don't know the RP address?) Therefore, the following should be verified:

Ensure that the Auto-RP groups are operating in Dense mode on all routers in the network.

Mixed DM/SM situations can arise when static RP addresses have been configured on some routers in the network. To avoid this:
  Always specify an <acl> on any ip pim rp-address command that denies groups 224.0.1.39 and 224.0.1.40.
  Configure Accept-RP filters on all routers in the network that deny groups 224.0.1.39 and 224.0.1.40. (Note: The ip pim accept-rp auto-rp command has an implied set of deny clauses for these two groups to prevent them from switching into Sparse mode.)


Debugging BSR Operation


Understand the BSR mechanisms
  This is the fundamental debugging tool for problems with BSR!

Verify Group-to-RP Mapping Caches
  First on the BSR
    Other routers will learn Group-to-RP mapping info from this router
    If not correct, use debug commands to see what's wrong
  Then on other routers
    If info doesn't match BSR, there is a problem distributing the information
    Use show and debug commands to find where the break is


Debugging BSR Operation


First and foremost, you must understand the fundamental mechanisms behind BSR in order to debug problems!

Verify Group-to-RP Mapping Caches on the elected BSR
Because other routers in the network will learn the Group-to-RP mapping information from the elected BSR, it is important that this information is correct on this router. If the information is not correct, verify that the PIMv2 C-RPs are configured correctly and that C-RP Announcements are being received properly by the BSR.

Verify Group-to-RP Mapping Caches on all other routers
Group-to-RP mapping information should match the BSR. If not, verify that the router is properly receiving BSR messages.



RP on a Stick
Triggering conditions on the RP:
  A (*,G) entry (i.e. Shared Tree) exists with a single outgoing interface on the RP.
  And an (S,G) entry (i.e. a Source) exists on the same interface with a Null outgoing interface list.

Results in special state at the RP
  Frequently misunderstood and rarely seen
    Default behavior is to join SPT which avoids this
  Mishandled in versions of IOS prior to 12.0
  Requires the Proxy Join Timer to resolve
  Need to understand concept of atomic joins

RP on a Stick
This is a special situation that occurs under the following conditions:
  All branches of the Shared Tree are out a single interface on the RP (i.e. there is only a single interface in the (*, G) OIL at the RP).
  All sources for the group are out the same interface. (This results in (S, G) entries with Null OILs, since the incoming interface can never appear in the OIL of an entry.)
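The triggering conditions can be expressed as a check over the RP's mroute state. This Python sketch follows the slide's wording (a single-interface (*, G) OIL plus at least one (S, G) entry arriving on that same interface with a Null OIL). The function name and the data layout are hypothetical; real state would come from the router's mroute table.

```python
def is_rp_on_a_stick(star_g_oil, sg_entries):
    """Detect the RP-on-a-Stick condition for one group on the RP.

    star_g_oil: list of interface names in the (*, G) outgoing interface list.
    sg_entries: list of (incoming_interface, oil) pairs, one per (S, G) entry,
                where oil is a list of interface names (empty list = Null OIL).
    """
    if len(star_g_oil) != 1:
        return False  # the Shared Tree must leave via exactly one interface
    shared_tree_interface = star_g_oil[0]
    return any(
        incoming == shared_tree_interface and not oil
        for incoming, oil in sg_entries
    )

# Shared Tree out Ethernet0 only; source traffic arrives on Ethernet0 as well.
print(is_rp_on_a_stick(["Ethernet0"], [("Ethernet0", [])]))             # True
# Source arrives on a different interface and is forwarded normally.
print(is_rp_on_a_stick(["Ethernet0"], [("Ethernet1", ["Ethernet0"])]))  # False
# Shared Tree branches out two interfaces.
print(is_rp_on_a_stick(["Ethernet0", "Ethernet1"], [("Ethernet0", ["Ethernet1"])]))  # False
```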

Unusual State Results from this Condition

Special PIM rules that were not in the original PIMv2 specification had to be created in order to avoid situations where:
  (S, G) traffic flows were incorrectly pruned.
  (S, G) traffic continued to flow to the RP only to be dropped.
  (S, G) state would get stuck in the RP and the First-Hop router even when the source had long since stopped sending.

The problem was solved in IOS 12.0 by:
  A special Proxy Join Timer, and
  The introduction of Atomic and Non-Atomic (*, G) Joins.


RP on a Stick
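[Figure: The RP (rtr-a, interface E0, 10.1.4.1) shares an Ethernet segment with rtr-b (E1, 10.1.4.2) and rtr-c (E1, 10.1.4.3). rtr-b and rtr-c each have a second interface, E0.]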


RP-on-a-Stick Example
Consider the above network topology, where both rtr-b and rtr-c share a common Ethernet segment with the RP.


RP on a Stick
[Figure: Same topology, with a member of group 224.1.1.1 behind rtr-c (E0) and a Shared Tree branch from the RP (rtr-a) to rtr-c. The slide also lists the resulting (*, 224.1.1.1) entry on the RP.]


RP-on-a-Stick Example
When a host behind rtr-c joins group 224.1.1.1, a branch of the Shared Tree is created (shown by the solid arrow), which results in the following state on the RP:

(*, 224.1.1.1), 00:01:21/00:02:59, RP 10.1.4.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:21/00:02:39


RP on a Stick
[Figure: Same topology and Shared Tree, with a member of group 224.1.1.1 behind rtr-c. The slide also lists the resulting (*, 224.1.1.1) entry on rtr-c.]


RP-on-a-Stick Example
This also results in the following state on rtr-c:

(*, 224.1.1.1), 00:01:21/00:02:59, RP 10.1.4.1, flags: SC
  Incoming interface: Ethernet1, RPF nbr 10.1.4.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:01:21/00:02:39


RP on a Stick
[Figure: Same topology, with source 192.1.1.1 behind rtr-b and a member of group 224.1.1.1 behind rtr-c. Legend: (192.1.1.1, 224.1.1.1) Traffic Flow, Shared Tree, SPT.]


RP-on-a-Stick Example
Now assume that source 192.1.1.1 behind rtr-b begins sending to group 224.1.1.1. After the normal Register process has completed, a branch of the SPT (shown by the heavy dashed arrow) is built from rtr-b to the RP. This allows traffic to flow to the members as shown by the thin dashed arrows.


RP on a Stick
[Figure: Same topology, with source 192.1.1.1 behind rtr-b and a member of group 224.1.1.1 behind rtr-c. Legend: (192.1.1.1, 224.1.1.1) Traffic Flow, Shared Tree, SPT. The slide also lists the resulting (*, 224.1.1.1) and (192.1.1.1/32, 224.1.1.1) entries on rtr-b.]

RP-on-a-Stick Example
The creation of the SPT results in the following state on rtr-b:

(*, 224.1.1.1), 00:01:21/00:02:59, RP 10.1.4.1, flags: SP
  Incoming interface: Ethernet1, RPF nbr 10.1.4.1,
  Outgoing interface list: Null

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:46, flags: T
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:00:49/00:02:11

RP on a Stick
[Figure: Same topology and legend, with the (*, 224.1.1.1) and (192.1.1.1/32, 224.1.1.1) state on the RP shown next to rtr-a. Callout on the Null (S, G) OIL: "Normally results in the suppression of (S,G) Joins".]

RP-on-a-Stick Example
The creation of the SPT also results in the following state on the RP:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.4.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:46, flags: PT
  Incoming interface: Ethernet0, RPF nbr 10.1.4.2,
  Outgoing interface list: Null

Notice that the OIL of the (S, G) entry is Null which, in turn, results in the P flag being set. Normally, this would cause the RP to send an (S, G) Prune toward the source to shut off the flow of (S, G) traffic. In this case, however, the source and the member are reached through the same Ethernet segment, so pruning the flow would starve the host behind rtr-c of the desired group traffic. Something else must be done to prevent this.

RP on a Stick

Solution requires three new concepts
  Atomic & Non-Atomic Joins
  Proxy Join Timer/Flag
  Header-only Registers

RP-on-a-Stick Solution
To deal with this special situation, several new concepts were added to the 12.0 implementation of PIM. These are:
  Atomic vs. Non-Atomic (*, G) Joins
  The Proxy Join Timer (and its flag) on (S, G) entries
  Header-only Registers (aka Data-less Registers)
Each of these is discussed in the following pages.

RP on a Stick

Non-Atomic Joins
  Contains (*, G) Join for group G only
  This is the type of (*, G) Join we are familiar with

Atomic Joins
  Contains (*, G) Join for group G followed by (S, G)RP-bit Prunes for all sources in group G
  Used to prune unnecessary (S, G) traffic from the Shared Tree after switchover to the SPT.

All in the same PIM Join/Prune message

Non-Atomic Joins
This is a PIM Join/Prune message that contains only a (*, G) Join for group G in the Join list without any associated (S, G)RP-bit Prunes for group G in the Prune list. This is the typical (*, G) Join that has been described in most of the examples in Module 5, PIM-SM.

Atomic Joins
This is a PIM Join/Prune message that contains a (*, G) Join for group G in the Join list AND a complete list of all (S, G)RP-bit Prunes for group G in the Prune list. Remember, these (S, G)RP-bit Prunes are used to prune specific (S, G) traffic off of the Shared Tree after a router has joined the SPT directly toward the source.

RP on a Stick
PIM Join/Prune Message
[Figure: Layout of a PIM Join/Prune message. Header, followed by:
  Join List: (10.1.4.1, 224.1.0.5), (10.1.4.1, 224.1.1.1), (10.1.4.1, 224.2.2.2), (10.1.19.21, 224.2.2.2), . . .
  Prune List: (10.1.19.21, 224.1.0.5), (10.1.4.1, 224.3.3.3), (192.1.1.1, 224.1.1.1), (192.1.4.2, 224.2.2.2), . . .
Entries are annotated with their WC and RP bits. Highlighted for group 224.1.1.1: (*, G) Join + (S, G)RP-bit Prune = Atomic Join.]

Example: Atomic (*, G) Join


In this example, there is a (*, G) entry for group 224.1.1.1 in the Join list of the PIM Join/Prune message. (The WC (wildcard) and RP (RP-Tree) bits tell us that this entry is a (*, G) Join to RP 10.1.4.1.) In addition, there is an (S, G) entry for group 224.1.1.1, namely (192.1.1.1, 224.1.1.1), with the RP-bit set in the Prune list, i.e. an (S, G)RP-bit Prune exists for group 224.1.1.1. Because both a (*, G) Join and one or more (S, G)RP-bit Prunes exist in this Join/Prune message for group 224.1.1.1, the message is said to contain an Atomic (*, G) Join for group 224.1.1.1.

RP on a Stick
PIM Join/Prune Message
[Figure: The same PIM Join/Prune message layout as on the previous slide. Highlighted for group 224.1.0.5: (*, G) Join + NO (S, G)RP-bit Prunes = Non-Atomic Join.]

Example: Non-Atomic (*, G) Join


This example also contains a (*, G) entry for group 224.1.0.5 in the Join list of the PIM Join/Prune message. (The WC (wildcard) and RP (RP-Tree) bits tell us that this entry is a (*, G) Join to RP 10.1.4.1.) However, there are no (S, G) entries for group 224.1.0.5 with the RP-bit set in the Prune list, i.e. there are no (S, G)RP-bit Prunes for group 224.1.0.5. Because only a (*, G) Join exists in this Join/Prune message for group 224.1.0.5, without any corresponding (S, G)RP-bit Prunes, the message is said to contain a Non-Atomic (*, G) Join for group 224.1.0.5.
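The Atomic/Non-Atomic distinction can be expressed as a small check over the Join and Prune lists. The following Python sketch is illustrative only: the Entry type and the classify_star_g_join function are hypothetical names, not part of IOS or of any PIM library, and only the entries whose flags are described in the notes are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One Join-list or Prune-list entry (hypothetical model, not a wire format)."""
    source: str          # source address, or the RP address for a (*, G) entry
    group: str
    wc: bool = False     # WC (wildcard) bit
    rpt: bool = False    # RP (RP-Tree) bit


def classify_star_g_join(join_list, prune_list, group):
    """Return 'atomic', 'non-atomic', or None if there is no (*, G) Join for group."""
    # A (*, G) Join is a Join-list entry with both the WC and RP bits set.
    has_star_g_join = any(e.group == group and e.wc and e.rpt for e in join_list)
    if not has_star_g_join:
        return None
    # An (S, G)RP-bit Prune is a Prune-list entry with the RP bit set but not WC.
    has_sg_rpt_prune = any(
        e.group == group and e.rpt and not e.wc for e in prune_list
    )
    return "atomic" if has_sg_rpt_prune else "non-atomic"


join_list = [
    Entry("10.1.4.1", "224.1.0.5", wc=True, rpt=True),
    Entry("10.1.4.1", "224.1.1.1", wc=True, rpt=True),
]
prune_list = [
    Entry("192.1.1.1", "224.1.1.1", rpt=True),
]

print(classify_star_g_join(join_list, prune_list, "224.1.1.1"))  # atomic
print(classify_star_g_join(join_list, prune_list, "224.1.0.5"))  # non-atomic
```

The classification is made per group, which is why a single Join/Prune message can carry an Atomic Join for one group and a Non-Atomic Join for another.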

RP on a Stick
Proxy Join Timer
  Used on (S, G) entries only
  Started on the RP by the initial (S, G) Register
    If (S, G) OIL is Null AND (*, G) OIL is Non-Null
  Started/Restarted by receipt of a Non-Atomic Join
  The X flag indicates when it is running
  Times out in 2 minutes

Controls the sending of (S, G) Joins and Prunes down the SPT
  When Proxy Join Timer is Running (X flag set)
    Send (S, G) Joins down SPT
    Suppress sending any (S, G) Prunes down SPT

Proxy Join Timer


The Proxy Join Timer only exists on (S, G) entries in the Mroute table. Its purpose is to attract (S, G) traffic to the router even when the OIL of the (S, G) entry is Null. This maintains the flow of (S, G) traffic in cases such as the RP-on-a-Stick.

Proxy Join Timer Rules


The Proxy Join Timer is started when the RP receives the first (S, G) Register message when a source goes active if:
  The OIL is Null in the resulting (S, G) entry AND the OIL is non-Null in the (*, G) entry. (This is the RP-on-a-Stick condition.)

The Proxy Join Timer is started whenever the router receives a Non-Atomic (*, G) Join and an (S, G) entry already exists.

This timer runs for 2 minutes unless restarted by the receipt of another Non-Atomic (*, G) Join.

When this timer is running on an (S, G) entry, the X flag will be displayed in the flags field of the entry.

When the Proxy Join Timer is running, the router will:
  Send periodic (S, G) Joins toward the source even though the OIL is Null.
  Suppress the sending of (S, G) Prunes toward the source even though the OIL is Null.
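The rules above amount to a small state machine on each (S, G) entry. The sketch below models them in Python under simplifying assumptions: time is passed in explicitly as seconds, and the names SGEntry, on_first_register, on_non_atomic_join and periodic_action are hypothetical, chosen for illustration rather than taken from IOS.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

PROXY_JOIN_SECONDS = 120.0  # the timer runs for 2 minutes unless restarted


@dataclass
class SGEntry:
    """Hypothetical model of an (S, G) mroute entry."""
    oil: Set[str] = field(default_factory=set)   # outgoing interface list
    proxy_join_expires: Optional[float] = None   # absolute expiry time, if started

    def proxy_join_running(self, now: float) -> bool:
        # Corresponds to the X flag being displayed on the entry.
        return self.proxy_join_expires is not None and now < self.proxy_join_expires

    def start_proxy_join(self, now: float) -> None:
        self.proxy_join_expires = now + PROXY_JOIN_SECONDS


def on_first_register(sg: SGEntry, star_g_oil: Set[str], now: float) -> None:
    # RP-on-a-Stick condition: (S, G) OIL is Null AND (*, G) OIL is non-Null.
    if not sg.oil and star_g_oil:
        sg.start_proxy_join(now)


def on_non_atomic_join(sg: SGEntry, now: float) -> None:
    # A Non-Atomic (*, G) Join starts or restarts the timer on an existing entry.
    sg.start_proxy_join(now)


def periodic_action(sg: SGEntry, now: float) -> str:
    # With a Null OIL the router would normally prune; the running timer
    # makes it send (S, G) Joins instead and suppresses the Prunes.
    if sg.oil or sg.proxy_join_running(now):
        return "join"
    return "prune"


rp_entry = SGEntry()
on_first_register(rp_entry, {"Ethernet0"}, now=0.0)
print(periodic_action(rp_entry, now=60.0))    # join  (timer running)
print(periodic_action(rp_entry, now=121.0))   # prune (timer expired)
on_non_atomic_join(rp_entry, now=121.0)
print(periodic_action(rp_entry, now=200.0))   # join  (timer restarted)
```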

RP on a Stick

Header-only Registers
  Used to keep (S, G) state alive in the RP
  Sent every 2 minutes by First-hop router
    As long as source is still active
    Continues sending until a Register-Stop is received

Register message contains a null (S, G) data packet
  Processed by the RP
  Resets (S, G) entry timer at the RP
  RP doesn't send Null packet down Shared Tree

Header-only (Data-less) Registers


Normally, the Expire timer of an (S, G) entry is reset to 3 minutes every time the router forwards a packet associated with that entry. However, in the RP-on-a-Stick case, the (S, G) entry has a Null OIL and is therefore not forwarding any packets. This would normally result in the (S, G) entry timing out at the RP. This cannot be allowed to happen, because another member somewhere in the network could join the Shared Tree via another interface. If the (S, G) entry were allowed to time out, the RP would not be able to trigger the Batch-Join to rejoin the SPT when this new member joined, because there wouldn't be any (S, G) entry to tell the RP of the active source.

To prevent this from happening, the behavior of the First-Hop DR was changed in IOS 12.0 so that (S, G) Header-only (aka Data-less) Registers are sent periodically (every 2 minutes) to the RP. These Header-only Registers cause the RP to reset the Expire timer in the (S, G) entry, thereby preventing it from timing out.

Contents of Header-only Registers


Header-only Registers contain a specially formatted null or data-less (S, G) packet. These null (S, G) packets are not forwarded down the Shared Tree by the RP.
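The timing relationship can be checked with a short simulation. This Python sketch assumes only the two figures given in the notes (a 3-minute Expire timer and a Register every 2 minutes); the function name sg_entry_expiry is hypothetical, and the model ignores packet forwarding entirely because the OIL is Null.

```python
EXPIRE_SECONDS = 180      # Expire timer of an (S, G) entry: 3 minutes
REGISTER_INTERVAL = 120   # Header-only Registers: every 2 minutes


def sg_entry_expiry(observe_for, register_interval=None):
    """Return the second at which the (S, G) entry expires at the RP,
    or None if it is still alive after observe_for seconds.

    register_interval=None models a first-hop router that sends no
    Header-only Registers at all.
    """
    expires_at = EXPIRE_SECONDS
    next_register = register_interval
    for t in range(observe_for + 1):
        if next_register is not None and t == next_register:
            # A Header-only Register resets the Expire timer at the RP.
            expires_at = t + EXPIRE_SECONDS
            next_register += register_interval
        if t >= expires_at:
            return t
    return None


print(sg_entry_expiry(600))                      # 180: entry times out
print(sg_entry_expiry(600, REGISTER_INTERVAL))   # None: entry kept alive
```

Because the Register interval is shorter than the Expire timer, each Register arrives with about a minute to spare, which matches the reset Expire value of 00:02:59 seen in the RP state.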

RP on a Stick
[Figure: Same RP-on-a-Stick topology, with source 192.1.1.1 behind rtr-b and the member of 224.1.1.1 behind rtr-c. Legend: Non-Atomic Join, Normal Register, (S, G) Join. The RP state is shown with the (S, G) flags changing from PT to PXT. Callout: "Trigger condition met: Proxy Join Timer started".]

RP-on-a-Stick Example
In this example, rtr-c is sending Non-Atomic (*, G) Joins to the RP to stay on the Shared Tree. (Note that rtr-c has not joined the SPT at this point. This could be due to the SPT-Threshold being set to Infinity.) The RP is now running version 12.0 or later of IOS. Therefore, when the Non-Atomic (*, G) Join for group 224.1.1.1 is received, the RP starts the Proxy Join Timer in all (S, G) entries for group 224.1.1.1. This results in the following state in the RP:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.4.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:46, flags: PXT
  Incoming interface: Ethernet0, RPF nbr 10.1.4.2,
  Outgoing interface list: Null

Notice the X flag is set in the above example. This causes the RP to continue sending (S, G) Joins toward the source (even though the OIL is Null) which, in turn, keeps the traffic flowing to the member across the common Ethernet segment.

RP on a Stick
[Figure: Same RP-on-a-Stick topology. Legend: Periodic (S, G) Joins, Header-only Registers. The RP state is shown with the (S, G) Expire timer reset to 00:02:59. Callout: "Expiration Timer restarted by Header-only Registers".]

RP-on-a-Stick Example
The First-hop router (rtr-b) is also running version 12.0 or later of IOS, and it will therefore send periodic Header-only (S, G) Register messages to the RP. When the RP receives these Header-only (S, G) Registers (roughly every 2 minutes), it resets the Expire timer in the corresponding (S, G) entry in the Mroute table. This results in the following state in the RP:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.4.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: PXT
  Incoming interface: Ethernet0, RPF nbr 10.1.4.2,
  Outgoing interface list: Null

(Notice the Expire timer in the (S, G) entry has been reset.)

Turnaround Router

Extension of the RP-on-a-Stick Problem
  Occurs when the SPT and the Shared Tree share a single common path
  Want to avoid pulling traffic to the RP unnecessarily

Special state in the Turnaround Router
  Uses special techniques to resolve
    Proxy Join Timer
    Atomic and Non-Atomic Joins
    Header-only Registers

Turnaround Router
As it turns out, the RP-on-a-Stick problem is actually a special case of another problem, referred to as the Turnaround Router problem. This situation occurs whenever:
  There is only a single branch of the Shared Tree, and the Shared Tree and an SPT share a common path to the RP.
  We want the (S, G) traffic to flow along the SPT toward the RP, turn around at the appropriate router in the network, and flow back down the Shared Tree without sending unnecessary traffic to the RP.

Turnaround Router Solution


Once again, the new concepts of
  Proxy Join Timer
  Atomic and Non-Atomic Joins
  Header-only Registers
permit the routers to solve this problem.

Turnaround Router
[Figure: The RP (rtr-a, S0 10.1.3.1) connects over a serial link to rtr-x, the Turnaround Router (S0 10.1.3.2, E0 10.1.4.1). rtr-x, rtr-b (E1 10.1.4.2), and rtr-c (E1 10.1.4.3) share a common Ethernet segment. Source 192.1.1.1 sits behind rtr-b (E0) and the member of 224.1.1.1 behind rtr-c (E0). Legend: (192.1.1.1, 224.1.1.1) Traffic Flow, Shared Tree, SPT.]

Turnaround Router Example


In this example, we once again have a single branch of the 224.1.1.1 Shared Tree at the RP. The SPT for source (192.1.1.1, 224.1.1.1) merges with the single branch of the 224.1.1.1 Shared Tree at rtr-x. This router is referred to as the Turnaround Router because it is here that we want the (S, G) traffic to turn around and flow back down the Shared Tree to the members of group 224.1.1.1. Additionally, we do not want the (S, G) traffic to flow all the way to the RP, because there is only the single branch of the Shared Tree and the RP has nowhere else to forward it. In cases where the number of hops between the Turnaround Router and the RP is large, or where the links along this path are congested, the flow of traffic to the RP would simply waste precious network resources. Instead, we want the traffic to flow only as shown by the thin dashed arrows in the drawing above.

Turnaround Router
[Figure: Turnaround Router topology. Non-Atomic Joins flow from rtr-c to rtr-x and from rtr-x to the RP (rtr-a).]

State on the RP (rtr-a):
(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:02:43/00:02:17

State on rtr-x:
(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.3.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

Turnaround Router Step-by-Step


Step 1 The host connected to rtr-c joins group 224.1.1.1. This causes rtr-c to create (*, G) state and send a Non-Atomic (*, G) Join toward the RP.
Step 2 The Turnaround Router (rtr-x) receives this Non-Atomic (*, G) Join, and it too creates (*, G) state and sends a Non-Atomic (*, G) Join to the RP.
Step 3 The RP receives the (*, G) Join and creates (*, G) state with only Serial0 in the OIL.

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, Normal Register. The RP state is shown next to rtr-a. Callout: "(S, G) Entry created & Proxy Join Timer started".]

Turnaround Router Step-by-Step


Step 4 The source, 192.1.1.1, begins sending to group 224.1.1.1. This causes the first-hop router (rtr-b) to send an (S, G) Register to the RP.
Step 5 The RP processes the Register message and creates an (S, G) entry. Because the OIL of the newly created (S, G) entry is Null and the OIL of the (*, G) entry is non-Null, the RP starts the Proxy Join Timer in the (S, G) entry and sends an (S, G) Join toward the source. At this point, the state in the RP is as follows:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: PXT
  Incoming interface: Serial0, RPF nbr 10.1.3.2,
  Outgoing interface list: Null

Notice that the Proxy Join Timer is running (note the X flag in the (S, G) entry). While the Proxy Join Timer is running, the RP will continue to send periodic (S, G) Joins toward the source. The Proxy Join Timer will be restarted each time the RP receives another Non-Atomic Join from rtr-x.

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, (S, G) Joins. The (*, 224.1.1.1) and (192.1.1.1/32, 224.1.1.1) state on rtr-x is shown next to the router.]

Turnaround Router Step-by-Step


Step 6 The (S, G) Join travels hop-by-hop, building the SPT from the source to the RP. At this point, the state in the Turnaround Router (rtr-x) is as follows:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.3.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: T
  Incoming interface: Ethernet0, RPF nbr 10.1.4.2,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:00:48/00:02:12

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, (S, G) Joins, (192.1.1.1, 224.1.1.1) Traffic Flow.]

Turnaround Router Step-by-Step


Once rtr-b receives the (S, G) Join, traffic begins to flow as shown above.

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, Atomic Joins (containing (*, G) Join + (S, G)RP-bit Prune), (S, G) Joins, (192.1.1.1, 224.1.1.1) Traffic Flow. The rtr-x state is shown with the RPF information highlighted: the (*, 224.1.1.1) entry uses Serial0, RPF nbr 10.1.3.1, while the (192.1.1.1/32, 224.1.1.1) entry uses Ethernet0, RPF nbr 10.1.4.2.]

Turnaround Router Step-by-Step


Step 7 Router rtr-x detects that the paths of the SPT and the Shared Tree diverge at this point. As a result, rtr-x begins sending periodic (S, G)RP-bit Prunes up the Shared Tree in the same Join/Prune message with the periodic (*, G) Joins. In other words, it begins sending Atomic Joins to the RP instead of Non-Atomic Joins! (Note: Router rtr-x knows that the SPT and Shared Tree paths have diverged at this point because the RPF information (Incoming Interface and/or RPF neighbor) of the (S, G) entry is different from that of the (*, G) entry.)
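The divergence test in the note above is a plain comparison of RPF information. The Python sketch below shows it with the rtr-x values from Step 6; the Rpf tuple and both function names are hypothetical, and the second, non-diverging example uses made-up values purely to show the other branch.

```python
from typing import NamedTuple


class Rpf(NamedTuple):
    """RPF information of an mroute entry (hypothetical model)."""
    incoming_interface: str
    rpf_neighbor: str


def paths_diverge(star_g: Rpf, s_g: Rpf) -> bool:
    # The SPT and the Shared Tree diverge at this router when the
    # Incoming Interface and/or the RPF neighbor differ.
    return star_g != s_g


def periodic_join_type(star_g: Rpf, s_g: Rpf) -> str:
    # A diverged (S, G) entry adds an (S, G)RP-bit Prune to the (*, G) Join,
    # which makes the Join/Prune message an Atomic Join.
    return "atomic" if paths_diverge(star_g, s_g) else "non-atomic"


# rtr-x, using the state shown in Step 6.
star_g = Rpf("Serial0", "10.1.3.1")
s_g = Rpf("Ethernet0", "10.1.4.2")
print(periodic_join_type(star_g, s_g))   # atomic

# Illustrative values only: a router where both trees arrive the same way.
same = Rpf("Ethernet1", "10.1.4.1")
print(periodic_join_type(same, same))    # non-atomic
```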

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, Atomic Joins (containing (*, G) Join + (S, G)RP-bit Prune), (S, G) Joins, (192.1.1.1, 224.1.1.1) Traffic Flow. The RP state is shown with the (S, G) flags changing from PXT to PT. Callout: "Proxy Join Timer eventually expires (Non-Atomic Joins no longer being received)".]

Turnaround Router Step-by-Step


Because the RP is no longer receiving Non-Atomic Joins, the Proxy Join Timer for the (S, G) entry is no longer being restarted, and it eventually times out. This is indicated by the X flag being clear in the (S, G) entry shown below:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: PT
  Incoming interface: Serial0, RPF nbr 10.1.3.2,
  Outgoing interface list: Null

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, Atomic Joins (containing (*, G) Join + (S, G)RP-bit Prune), (S, G) Joins, (S, G) Prunes, (192.1.1.1, 224.1.1.1) Traffic Flow. The rtr-x state is shown: the (*, 224.1.1.1) entry (Incoming interface Serial0, RPF nbr 10.1.3.1, OIL Ethernet0) and the (192.1.1.1/32, 224.1.1.1) entry (Incoming interface Ethernet0, RPF nbr 10.1.4.2), with Serial0 being removed from the (S, G) OIL. Callouts: "Serial0 removed from (S, G) OIL", "Trigger condition met".]

Turnaround Router Step-by-Step


Step 8 Now that the Proxy Join Timer is no longer running, the RP resumes its normal behavior and sends (S, G) Prunes toward the source in response to the arrival of (S, G) packets.
Step 9 When rtr-x receives the (S, G) Prune, it removes Serial0 from its outgoing interface list. This results in the Turnaround trigger condition in rtr-x.

Turnaround Router
[Figure: Turnaround Router topology. Legend: Non-Atomic Joins, Atomic Joins (containing (*, G) Join + (S, G)RP-bit Prune), (S, G) Join, (192.1.1.1, 224.1.1.1) Traffic Flow. The rtr-x state is shown: the (*, 224.1.1.1) entry (Incoming interface Serial0, RPF nbr 10.1.3.1, OIL Ethernet0) and the (192.1.1.1/32, 224.1.1.1) entry (Incoming interface Ethernet0, RPF nbr 10.1.4.2, empty OIL) with its flags changing from PT to PXT. Callout: "Proxy Join Timer started by next non-Atomic Join."]

Turnaround Router Step-by-Step


As a result of Serial0 being removed from the (S, G) OIL, the flow of traffic to the RP is shut off.
Step 10 Non-Atomic Joins arriving at rtr-x now start the Proxy Join Timer. (Note the X flag in the (S, G) entry.) This causes the Turnaround Router (rtr-x) to suppress sending (S, G) Prunes and, instead, send (S, G) Joins toward the source. This keeps the traffic flowing as shown.

Turnaround Router
[Figure: Final Steady-State Condition on the Turnaround Router topology. Legend: Non-Atomic Joins, Atomic Joins, (S, G) Joins, Header-only Registers, (192.1.1.1, 224.1.1.1) Traffic Flow.]

Turnaround Router
Step 11 Finally, Header-only Registers sent by the First-hop router (rtr-b) continue to reset the Expire timer in the (S, G) entry at the RP. This prevents the (S, G) entry from timing out and being deleted at the RP.

Turnaround Router
[Figure: Final Steady-State Condition, same topology and legend as the previous slide, with the (*, 224.1.1.1) and (192.1.1.1/32, 224.1.1.1) state on the RP shown next to rtr-a.]

Turnaround Router
As a result of the Header-only Registers, the state in the RP will be as follows as long as the source and member remain active:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: PT
  Incoming interface: Serial0, RPF nbr 10.1.3.2,
  Outgoing interface list: Null

Turnaround Router
[Figure: Final Steady-State Condition, same topology and legend as the previous slide, with the (*, 224.1.1.1) and (192.1.1.1/32, 224.1.1.1) state on rtr-x shown next to the router.]

Turnaround Router
As a result of the Non-Atomic Joins, the state in the Turnaround Router will be as follows as long as the source and member remain active:

(*, 224.1.1.1), 00:02:43/00:02:59, RP 10.1.3.1, flags: S
  Incoming interface: Serial0, RPF nbr 10.1.3.1,
  Outgoing interface list:
    Ethernet0, Forward/Sparse, 00:02:43/00:02:17

(192.1.1.1/32, 224.1.1.1), 00:00:49/00:02:59, flags: PXT
  Incoming interface: Ethernet0, RPF nbr 10.1.4.2,
  Outgoing interface list: Null


Advanced IP Multicast Features


Module 7

Module7.ppt
Copyright © 1998-2001, Cisco Systems, Inc. All rights reserved.
8/14/2001 10:15 AM

Module Objectives

Develop an understanding of the more advanced multicast features available in IOS.


Module Agenda

Bandwidth Control of Multicast
Multicast Traffic Engineering
Network Redundancy
Multicast over NBMA Networks
Reliable Multicast

BW Control via Rate-Limiting

IP multicast traffic can be rate-limited
  Any data over the limit is discarded
  Rate-limit is on per second time slots
  Can rate-limit on input as well as output

Designed to
  Deal with misbehaving sources
  Share bandwidth with unicast traffic

Bandwidth Control via Rate-Limiting


In general, concern over bandwidth utilization by multicast traffic is often more of an issue of FUD (Fear, Uncertainty, Doubt) than a real issue. However, like many other traffic types, multicast traffic can be rate-limited. Rates are measured over 1-second windows and compared to configured limits. Once the configured limit has been reached, further data that would exceed this limit is discarded. Rate-limiting may be applied to either incoming or outgoing traffic.

Rate-limiting provides protection against:
  Misbehaving sources that are consuming too much bandwidth.
  Multicast consuming all of the available bandwidth.
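The 1-second window behavior can be sketched as a simple counter that resets each second. This Python model is an illustration of the description above, not the IOS implementation: the class name WindowRateLimiter is hypothetical, windows are assumed to be aligned to whole seconds, and 1 kbps is taken as 1000 bits per second.

```python
class WindowRateLimiter:
    """Hypothetical per-second window limiter: data that would exceed the
    configured limit within the current 1-second window is discarded."""

    def __init__(self, limit_kbps: int):
        self.limit_bits = limit_kbps * 1000   # assumption: 1 kbps = 1000 bit/s
        self.window = None                    # index of the current 1-second window
        self.used_bits = 0

    def allow(self, now: float, packet_bytes: int) -> bool:
        window = int(now)
        if window != self.window:
            # A new 1-second window starts: the measured rate begins again at zero.
            self.window = window
            self.used_bits = 0
        bits = packet_bytes * 8
        if self.used_bits + bits > self.limit_bits:
            return False                      # over the limit: discard
        self.used_bits += bits
        return True


limiter = WindowRateLimiter(limit_kbps=8)     # 8000 bits, i.e. 1000 bytes per window
print(limiter.allow(0.0, 500))   # True
print(limiter.allow(0.5, 500))   # True  (window is now exactly full)
print(limiter.allow(0.9, 1))     # False (would exceed the limit)
print(limiter.allow(1.0, 500))   # True  (new window)
```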

BW Control via Rate-Limiting

Interface-Based Rate-Limiting
  Limits total rate of all multicast flows in/out of an interface

Flow-Based Rate-Limiting
  Limits rate of each individual (S, G) or (*, G) flow in/out of an interface

Note: Both Interface and Flow-based limits may not be used on an interface at the same time!

Interface-Based Rate-Limiting
Rate limits may be applied to the total overall rate of multicast traffic flowing into or out of an interface. This is the most commonly used form of multicast rate limiting, particularly for outgoing traffic. This permits an upper bound to be set on the bandwidth consumed by multicast traffic on an interface.

Flow-Based Rate-Limiting
Flow-based rate limits may be applied to an interface on a Group or Source/Group basis. When flow-based rate limits are defined, the rate is applied to the incoming or outgoing interface of matching (*, G) or (S, G) entries in the mroute table. For example, if an outgoing 50Kbps flow-based rate limit has been configured on interface Serial0 for (*, 224.1.1.1), then a 50Kbps rate limit will be set on Serial0 whenever it appears in the OIL of any (*, 224.1.1.1) or (S, 224.1.1.1) mroute table entry. This will limit the rate of each of these flows to a maximum of 50Kbps.

Note: Flow-based rate limiting is applied independently to each individual flow. In the above example, this would limit each individual matching flow out Serial0 to 50Kbps. It would not rate limit the total flow of 224.1.1.1 traffic out Serial0 to 50Kbps. Therefore, if there were several sources for the 224.1.1.1 group, the total flow could easily exceed 50Kbps.

Keywords such as video and whiteboard may be used to further identify media-specific flows on a UDP port basis. However, for this to work, ip sdr listen must be configured so that the router can obtain the necessary SDR session information to identify which flows are video and which are whiteboard.
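The difference between the two kinds of limit is easy to see with numbers. In this Python sketch the three offered rates are made-up figures for three hypothetical sources sending to 224.1.1.1, and both function names are illustrative; only the 50Kbps limit comes from the example above.

```python
def flow_based_total(offered_kbps, limit_kbps):
    # Each flow is limited on its own, so the totals simply add up.
    return sum(min(rate, limit_kbps) for rate in offered_kbps)


def interface_based_total(offered_kbps, limit_kbps):
    # One limit covers the sum of all multicast flows on the interface.
    return min(sum(offered_kbps), limit_kbps)


offered = [80, 40, 60]   # hypothetical per-source rates in Kbps

print(flow_based_total(offered, 50))        # 140: 50 + 40 + 50
print(interface_based_total(offered, 50))   # 50
```

With a 50Kbps flow-based limit the three flows together still send 140Kbps out the interface, which is why an interface-based limit is the usual choice when the goal is an upper bound on total multicast bandwidth.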

Copyright ? ?1998-2001, Cisco Systems, Inc.

Module7.ppt

BW Control via Rate-Limiting


Rate limit interface command
ip multicast rate-limit in | out { [video] | [whiteboard] } [group-list <acl>] [source-list <acl>] [<kbps>]

An Interface-based rate limit is defined when the optional Group and/or Source ACLs are not used. A Flow-based rate limit is defined when the optional Group and/or Source ACLs are used
Multiple Flow-based entries may be used per interface Flow-based and Interface -based limits may not be used at the same time

Typical Rate-Limit Application


Use out form of command on WAN links
Set <kbps> to desired percentage usage of link BW

Rate-Limit Interface Command


The format of the rate-limit interface command is shown above. The in and out keywords specify whether the rate-limit is applied to incoming or outgoing multicast traffic, respectively.

An interface-based rate-limit is defined when the optional group-list and/or source-list ACLs are NOT used. Only one in and one out interface-based rate limit may be defined on an interface.

A flow-based rate-limit is defined when either the group-list or source-list ACL is specified. Multiple flow-based rate-limit commands may be configured on an interface.

When an interface is added to the outgoing interface list or becomes the incoming interface, the list of rate-limits configured for that interface is searched for a match as follows:
All rate-limits configured on an interface are maintained and searched in the order in which they were entered.
If the interface is being used as the incoming interface for the mroute table entry, the list is searched for an in limit. If the interface is being added to the outgoing interface list of the mroute entry, the list is searched for an out limit.
The list is searched for the first rate-limit that matches the optional group and source ACL. (If no group or source ACL was specified, the limit is an interface-based limit and matches unconditionally.)
The appropriate limit (interface or flow) is configured for the interface.

The most typical usage for rate-limits is to configure out interface-based limits on WAN links where the value of <kbps> is set to some desired percentage of the overall WAN link bandwidth. This prevents misbehaving multicast sessions from consuming all of the link bandwidth.
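The search order described above can be sketched in a few lines of Python. This is only an illustrative model, not router code: the RateLimit class, the find_limit function and the ACLs written as simple predicates are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RateLimit:
    """Hypothetical model of one configured 'ip multicast rate-limit' entry."""
    direction: str                                       # "in" or "out"
    kbps: int
    group_acl: Optional[Callable[[str], bool]] = None    # None = no group-list
    source_acl: Optional[Callable[[str], bool]] = None   # None = no source-list

    @property
    def interface_based(self) -> bool:
        # No ACLs at all means the limit matches unconditionally.
        return self.group_acl is None and self.source_acl is None


def find_limit(limits: List[RateLimit], direction: str,
               source: str, group: str) -> Optional[RateLimit]:
    """Return the first entry, in configured order, that matches."""
    for limit in limits:
        if limit.direction != direction:
            continue
        if limit.group_acl is not None and not limit.group_acl(group):
            continue
        if limit.source_acl is not None and not limit.source_acl(source):
            continue
        return limit
    return None


# Two flow-based entries on one interface, one interface-based entry on another.
serial1 = [
    RateLimit("out", 56, group_acl=lambda g: g == "224.1.1.1"),
    RateLimit("out", 128, group_acl=lambda g: g == "224.2.2.2"),
]
serial0 = [RateLimit("out", 512)]

match = find_limit(serial1, "out", "128.9.160.238", "224.1.1.1")
print(match.kbps if match else "no limit")
```

A group that matches none of the flow-based ACLs gets no limit on that interface in this model, while the interface-based entry on serial0 matches every outgoing entry.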


BW Control via Rate-Limiting


Limiting video or whiteboard streams
Add video or whiteboard keywords
Requires ip sdr listen to be enabled
Streams identified using info from sdr cache

Example:
Listening to IETF broadcast behind a 128kbps link
They're sending video at 128kbps and audio at 64kbps

Requirements
Want crystal clear audio
Want good response to data actions (interactive)
Marginal video acceptable

Configuration
interface serial 0
 ip multicast rate-limit out video 48

Router will differentiate UDP port numbers for the same group

Limiting Video or Whiteboard Streams


Keywords such as video and whiteboard may be used to further identify media-specific flows on a UDP port basis. For this to work, ip sdr listen must be configured so that the router can obtain the necessary SDR session information to identify which flows are video and which are whiteboard. The router uses the SDR session information to identify the multicast group and UDP port number that is being used to send video or whiteboard data.

In the example above, the upstream serial interface is configured so that video can only consume 48kbps of the 128kbps ISDN line. This permits the 64kbps audio to be sent without experiencing any loss due to the video stream oversubscribing the line.
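The bandwidth budget behind this example can be checked with a little arithmetic. The sketch below uses only the figures from the slide; the variable names are introduced here for illustration.

```python
LINK_KBPS = 128           # ISDN line
AUDIO_KBPS = 64           # audio stream, not rate-limited
VIDEO_OFFERED_KBPS = 128  # rate at which the video is being sent
VIDEO_LIMIT_KBPS = 48     # ip multicast rate-limit out video 48

# The video flow is capped at the configured limit.
video_kbps = min(VIDEO_OFFERED_KBPS, VIDEO_LIMIT_KBPS)

# Whatever the video does not use is available to audio and data.
remaining_kbps = LINK_KBPS - video_kbps
spare_after_audio_kbps = remaining_kbps - AUDIO_KBPS

print(video_kbps, remaining_kbps, spare_after_audio_kbps)
```

With the limit in place, the audio stream fits in the remaining bandwidth and a small amount is left over for interactive data.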


Debugging Rate-Limits
Rtr-A> show ip mroute 224.1.1.1
(*, 224.1.1.1), 7w0d/00:03:29, RP 171.69.10.13, flags: S
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 04:23:22/00:03:25, Int Limit 512 kbps
    Serial1, Forward/Sparse-Dense, 04:23:28/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 04:23:28/00:02:37,

(128.9.160.238, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:48/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 00:00:48/00:03:25,

(*, 224.2.2.2), 00:00:38/00:02:51, RP 171.68.20.1, flags: S
  Incoming interface: Ethernet0, RPF nbr 171.70.100.1, Int Limit 1000 kbps
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:38/00:02:51, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:38/00:03:29,
    Serial3, Forward/Sparse-Dense, 00:00:38/00:03:25,
. . .

Interface-based Output Rate-limit

Total output multicast traffic rate on Serial0 will not exceed 512 Kbps.

Debugging Rate-Limits
Rate limits may be displayed via the show ip mroute command. Output interface-based rate-limits are shown on the entries in the outgoing interface list (OIL) of each mroute table entry. The text following the OIL will have the word Int preceding the rate limit value indicating that this is an Interface-based limit. In the example above, an output interface-based rate-limit of 512kbps has been configured on interface Serial0.


Debugging Rate-Limits
Rtr-A> show ip mroute 224.1.1.1
(*, 224.1.1.1), 7w0d/00:03:29, RP 171.69.10.13, flags: S
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 04:23:22/00:03:25, Int Limit 512 kbps
    Serial1, Forward/Sparse-Dense, 04:23:28/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 04:23:28/00:02:37,

(128.9.160.238, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:48/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 00:00:48/00:03:25,

(*, 224.2.2.2), 00:00:38/00:02:51, RP 171.68.20.1, flags: S
  Incoming interface: Ethernet0, RPF nbr 171.70.100.1, Int Limit 1000 kbps
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:38/00:02:51, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:38/00:03:29
    Serial3, Forward/Sparse-Dense, 00:00:38/00:03:25
. . .

Interface-based Input Rate-limit

Total input multicast traffic rate on Ethernet0 will not exceed 1 Mbps.

Debugging Rate-Limits
Rate-limits may be displayed via the show ip mroute command. Input interface-based rate-limits are shown on the incoming interface of each mroute table entry. The text following the incoming interface information will have the word Int preceding the rate limit value, indicating that this is an interface-based limit. In the example above, an input interface-based rate-limit of 1 Mbps has been configured on interface Ethernet0.


Debugging Rate-Limits
Rtr-A> show ip mroute 224.1.1.1
(*, 224.1.1.1), 7w0d/00:03:29, RP 171.69.10.13, flags: S
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 04:23:22/00:03:25, Int Limit 512 kbps
    Serial1, Forward/Sparse-Dense, 04:23:28/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 04:23:28/00:02:37,

(128.9.160.238, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Serial2, RPF nbr 171.68.0.234
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:48/00:03:29, Limit 56 kbps
    Serial3, Forward/Sparse-Dense, 00:00:48/00:03:25,

(*, 224.2.2.2), 00:00:38/00:02:51, RP 171.68.20.1, flags: S
  Incoming interface: Ethernet0, RPF nbr 171.70.100.1, Int Limit 1000 kbps
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:38/00:02:51, Int limit 512 kbps
    Serial1, Forward/Sparse-Dense, 00:00:38/00:03:29
    Serial3, Forward/Sparse-Dense, 00:00:38/00:03:25
. . .

Flow-based Output Rate-limit

Each individual output multicast flow on Serial1 will not exceed 56 Kbps. The total output on Serial1 is the sum of all flows and can exceed 56Kbps.

Debugging Rate-Limits
Rate limits may be displayed via the show ip mroute command. Output flow-based rate-limits are shown on the entries in the outgoing interface list (OIL) of each mroute table entry. The word Int does not appear before the rate limit value, which indicates that this is a flow-based limit. In the example above, an output flow-based rate-limit of 56kbps has been configured on interface Serial1.

NOTE: Each individual matching flow will be rate-limited to 56kbps, not the total aggregate of all matching flows. This means that if there are 10 active flows that match the configured flow-based rate-limit, the total aggregate rate out Serial1 could be as high as 560kbps!


BW Control via Rate-Limiting


Example: Flow-based Rate-Limiting

interface serial0
 ip multicast rate-limit out group-list 10 100
access-list 10 permit 224.1.1.1

[Diagram: Video Server 1 (192.1.1.1, 224.1.1.1) sends 128 Kbps into E0 and Video Server 2 (192.2.2.2, 224.1.1.1) sends 128 Kbps into E1. Each flow leaves S0 at 100 Kbps. Total: 200 Kbps]

(192.1.1.1, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Limit 100 kbps

(192.2.2.2, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet1, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Limit 100 kbps

Flow-based Rate-Limiting Example


In the above example, two 128kbps video sources are sending to group 224.1.1.1. Their traffic flows through the 7500 router and out Serial0 to downstream receivers (not shown). Interface Serial0 is configured with an output flow-based rate limit of 100kbps (as shown in the configuration excerpt). The output of a show ip mroute command clearly shows that this output flow-based rate limit has been applied to Serial0 for both sources. The result is that each video flow will be rate-limited to a maximum of 100kbps, thereby causing drops in the multicast video streams.


BW Control via Rate-Limiting


Example: Flow-based Rate-Limiting

interface serial0
 ip multicast rate-limit out group-list 10 200
access-list 10 permit 224.1.1.1

[Diagram: Video Server 1 (192.1.1.1, 224.1.1.1) sends 128 Kbps into E0 and Video Server 2 (192.2.2.2, 224.1.1.1) sends 128 Kbps into E1. Each flow leaves S0 at 128 Kbps. Total: 256 Kbps!!!]

(192.1.1.1, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Limit 200 kbps

(192.2.2.2, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet1, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Limit 200 kbps

Flow-based Rate-Limiting Example


Continuing with the same example, interface Serial0 is now reconfigured with an output flow-based rate limit of 200kbps (as shown in the configuration excerpt). The output of a show ip mroute command clearly shows that this output flow-based rate limit has been applied to Serial0 for both sources. The result is that neither video flow will be rate-limited, since their maximum streaming speed is only 128kbps. This results in a total aggregate rate of 256kbps for both flows. Note that this rate-limit will NOT limit the total aggregate rate of these matching flows to a maximum of 200kbps, as some network engineers might expect.


BW Control via Rate-Limiting


Example: Interface-based Rate Limiting

interface serial0
 ip multicast rate-limit out 200

[Diagram: Video Server 1 (192.1.1.1, 224.1.1.1) sends 128 Kbps into E0 and Video Server 2 (192.2.2.2, 224.1.1.1) sends 128 Kbps into E1. Each flow leaves S0 at ??? Kbps. Total: 200 Kbps]

(192.1.1.1, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Int Limit 200 kbps

(192.2.2.2, 224.1.1.1), 00:00:48/00:02:42, flags: T
  Incoming interface: Ethernet1, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse-Dense, 00:00:48/00:02:41, Int Limit 200 kbps

Interface-based Rate-Limiting Example


Finally, interface Serial0 is once again reconfigured, this time with an output interface-based rate limit of 200kbps (as shown in the configuration excerpt). The output of a show ip mroute command clearly shows that this output interface-based rate limit has been applied to Serial0 for both sources. The result is that all multicast traffic (including these two video flows) will be rate-limited to a total aggregate rate of 200kbps. Note that this rate-limit results in potential loss for both flows, since the combined rate of the two flows exceeds the 200kbps output interface limit.
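The totals in the three examples can be reproduced with a small Python sketch. It is a simplified model that assumes each limit behaves as an ideal cap; the function names are introduced here for illustration.

```python
from typing import List


def flow_based_total(flow_rates_kbps: List[int], limit_kbps: int) -> int:
    """Each matching flow is capped independently; the caps are then summed."""
    return sum(min(rate, limit_kbps) for rate in flow_rates_kbps)


def interface_based_total(flow_rates_kbps: List[int], limit_kbps: int) -> int:
    """All multicast traffic on the interface shares a single cap."""
    return min(sum(flow_rates_kbps), limit_kbps)


video_flows = [128, 128]  # the two video servers sending to 224.1.1.1

print(flow_based_total(video_flows, 100))       # flow-based limit of 100
print(flow_based_total(video_flows, 200))       # flow-based limit of 200
print(interface_based_total(video_flows, 200))  # interface-based limit of 200
```

The model does not say how an interface-based limit divides the 200kbps between the two flows, which is why the slide shows the per-flow rates as unknown.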


BW Control via Rate-Limiting

Summary
Flow-based rate-limits do not limit the total aggregate of all the matching flows. Therefore, the use of interface-based rate-limits is recommended when an upper bound on multicast traffic rates is desired.


Summary
Flow-based rate-limits generally do not provide the sort of rate-limiting that is very useful in real-world networks. As a result, only interface-based rate-limits are normally used when an upper bound on multicast traffic is desired.


BW Control via Admin-Scoping


Limit high-BW source to local site
Use administratively-scoped zones

Simple scoped zone example:
239.192.0.0/16 = Site-Local Scope Zone
239.0.0.0/8 = Org.-Local Scope Zone
224.0.1.0 - 238.255.255.255 = Global scope (Internet) zone

High-BW sources use only site-local zone groups
Med.-BW, org-wide sources use org.-local zone
Low-Med. BW, Internet-wide sources use global zone

BW Control via Admin Scoping


Another method that can be used to control the BW usage of multicast traffic is to employ Admin-Scoped zones along with corresponding multicast boundaries. This keeps high-rate multicast flows inside geographical boundaries where bandwidth is plentiful and prevents them from going out over bandwidth-constrained WAN links.

In the above example, the following Admin-Scoped ranges are in use:
Site-Local Scope - 239.192.0.0/16
Organization-Local Scope (Company) - 239.0.0.0/8
Global Scope (Internet) - 224.0.1.0 - 238.255.255.255

High BW, Site-Local sources should always use a group address in the Site-Local Scope. Medium BW, Organization-Local sources should always use a group address in the Organization-Local Scope. Sources that wish to transmit to the Internet should use group addresses that do not fall in either the Site-Local or Organization-Local scopes.
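Because the Site-Local range sits inside the Organization-Local range, a group address has to be checked against the most specific zone first. The following Python sketch illustrates this with the standard ipaddress module; the scope_of function and the "other" label for addresses outside the three zones are assumptions made for this example.

```python
import ipaddress

SITE_LOCAL = ipaddress.ip_network("239.192.0.0/16")
ORG_LOCAL = ipaddress.ip_network("239.0.0.0/8")
GLOBAL_FIRST = ipaddress.ip_address("224.0.1.0")
GLOBAL_LAST = ipaddress.ip_address("238.255.255.255")


def scope_of(group: str) -> str:
    """Classify a multicast group into the zones used in the example."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # Most specific zone first: 239.192.0.0/16 is a subset of 239.0.0.0/8.
    if addr in SITE_LOCAL:
        return "site-local"
    if addr in ORG_LOCAL:
        return "org-local"
    if GLOBAL_FIRST <= addr <= GLOBAL_LAST:
        return "global"
    return "other"  # e.g. 224.0.0.x, which is outside the zones listed above


for g in ("239.192.5.1", "239.1.1.1", "224.2.2.2"):
    print(g, scope_of(g))
```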


BW Control via Admin-Scoping


[Diagram: Site A (HQ) contains the AS Border router (S0 to the Internet), a Site-Local RP/MA, and Border A. Border A connects over T1 lines from its S0 and S1 interfaces to interface S0 of Border B at Site B (LA) and of Border C at Site C (ATL). Sites B and C each have their own Site-Local RP/MA.]

Admin-Scoping Example
In this example, two remote sites (Los Angeles and Atlanta) are linked to the company HQ site via T1 lines. Each of the three sites has its own Mapping Agent and a Site-Local RP that serves the Site-Local group range 239.192.0.0/16 that was described in the previous slide. This is necessary since the goal is to keep all Site-Local traffic within each site and therefore each site must have its own independent RP for this group range.


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the Org-Local Boundary marked on the AS Border link to the Internet and Site-Local Boundaries marked on both T1 links between Border A and Border B / Border C.]

Admin-Scoping Example
The first step is to establish multicast boundaries that prevent Site-Local and Organization-Local multicast traffic from crossing certain boundaries. In the case of Site-Local traffic, these boundaries are established on the T1 links on each of the three border routers, A, B and C. The Site-Local boundaries are implemented using the ip multicast boundary interface command along with an access-control list that denies multicast traffic in the Site-Local group range (239.192.0.0/16) from crossing this boundary. The Organization-Local boundary is established on the link to the internet and is also implemented using the ip multicast boundary interface command. The access-control list for this boundary denies multicast traffic in the Organization-Local group range (239.0.0.0/8) from crossing this boundary.


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the Site-Local boundary configuration shown for Border B and Border C.]

Border B and Border C (the same configuration appears for each router):

Interface Serial0
 . . .
 ip multicast ttl-threshold 16
 ip multicast boundary 10

access-list 10 deny 239.192.0.0 0.0.255.255
access-list 10 permit any

Admin-Scoping Example
The configuration commands necessary to establish the Site-Local boundaries on border routers B and C are listed in the above drawing. In both configurations, the Site-Local boundary is established on interface Serial0 via the ip multicast boundary 10 interface command. Access-control list 10 is constructed to deny the passage of any traffic in the Site-Local group range (239.192.0.0/16) while all other multicast groups are permitted to cross the interface. Pay particular attention to the ip multicast ttl-threshold 16 command also configured on Serial0. The requirement for this command will be discussed in a later slide.
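How access-list 10 decides which groups may cross the boundary can be sketched in Python. This is an illustrative model of standard ACL wildcard matching (first matching entry wins, implicit deny at the end), not IOS code; the function names and the tuple representation of the ACL are assumptions made for this example.

```python
import ipaddress
from typing import List, Tuple

AclEntry = Tuple[str, str, str]  # (action, address, wildcard mask)

# access-list 10 deny 239.192.0.0 0.0.255.255
# access-list 10 permit any
ACL_10: List[AclEntry] = [
    ("deny", "239.192.0.0", "0.0.255.255"),
    ("permit", "0.0.0.0", "255.255.255.255"),  # "any"
]


def wildcard_match(addr: str, base: str, wildcard: str) -> bool:
    """Bits set in the wildcard mask are ignored; all other bits must match."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)


def boundary_permits(group: str, acl: List[AclEntry]) -> bool:
    """Return True if the group may cross the boundary interface."""
    for action, base, wildcard in acl:
        if wildcard_match(group, base, wildcard):
            return action == "permit"
    return False  # implicit deny when nothing matches


for g in ("239.192.1.1", "239.1.1.1", "224.2.2.2"):
    print(g, "crosses" if boundary_permits(g, ACL_10) else "blocked")
```

Only the Site-Local range is stopped at this boundary; Organization-Local and global groups still cross the T1 links.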


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the Site-Local boundary configuration shown for Border A.]

Interface Serial0
 . . .
 ip multicast ttl-threshold 16
 ip multicast boundary 10
Interface Serial1
 . . .
 ip multicast ttl-threshold 16
 ip multicast boundary 10

access-list 10 deny 239.192.0.0 0.0.255.255
access-list 10 permit any

Admin-Scoping Example
The configuration commands necessary to establish the Site-Local boundaries on border router A are listed in the above drawing. In this case, the Site-Local boundary is established on both interface Serial0 and Serial1 via the ip multicast boundary 10 interface command. Access-control list 10 is constructed to deny the passage of any traffic in the Site-Local group range (239.192.0.0/16) while all other multicast groups are permitted to cross the interface. Again notice the ip multicast ttl-threshold 16 command also configured on Serial0 and Serial1. The requirement for this command will be discussed in a later slide.


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the configuration of a Site-Local RP/MA shown.]

interface Loopback0
 ip address 192.168.10.2 255.255.255.255

ip pim send-rp-discovery Loopback0 scope 15
ip pim send-rp-announce Loopback0 scope 15 group 20
access-list 20 permit 239.192.0.0 0.0.255.255

Admin-Scoping Example
The next step is to configure the independent Mapping Agents for each site as well as the independent Site-Local group RP. In the drawing above, the configuration for the RP/Mapping Agent in Site B is shown.

In this case, interface Loopback0 has been configured for use as the interface of choice for the Mapping Agent and RP. A loopback interface is often used for this purpose as it provides more flexibility in the management of the Mapping Agent as well as the selection of the RP when multiple candidate RPs are defined. (The highest candidate IP address is chosen as the active RP by Mapping Agents.)

The ip pim send-rp-discovery global command defines the router as a Mapping Agent for Site B with the IP address of the loopback interface. Note that a TTL scope of 15 is used on this command so that the Discovery messages sourced by this Mapping Agent will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

The ip pim send-rp-announce global command, along with access-control list 20, defines the router as a Candidate RP for the Site-Local group range. Note that a TTL scope of 15 is used on this command so that the Announce messages sourced by this Candidate-RP will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

Note: It is crucial that the Candidate-RP Announce messages do not leak outside of this site and into other sites. If this were to occur, the Mapping Agent(s) in the other site(s) might select the Candidate-RP in Site B as the currently active RP for the Site-Local group. This would break Site-Local multicast in that site.
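The interaction between the TTL scope of 15 and the ttl-threshold of 16 can be sketched as follows. This Python model assumes, following the text, that a packet whose TTL is below the interface threshold is not forwarded; the behavior when the TTL equals the threshold is not stated here and the choice made in the code is an assumption, as are the function names.

```python
def ttl_on_arrival(initial_ttl: int, hops_travelled: int) -> int:
    """Each router hop decrements the TTL by one."""
    return initial_ttl - hops_travelled


def crosses_threshold(ttl: int, threshold: int) -> bool:
    """Assumed rule: packets with a TTL below the threshold are dropped.

    Treating TTL == threshold as forwarded is an assumption of this sketch.
    """
    return ttl >= threshold


SITE_BOUNDARY_THRESHOLD = 16   # ip multicast ttl-threshold 16

# Auto-RP messages sent with scope 15 can never reach TTL 16,
# however close the RP/MA is to the border router.
print(crosses_threshold(ttl_on_arrival(15, 0), SITE_BOUNDARY_THRESHOLD))

# Messages sent with scope 64 still cross after a few hops.
print(crosses_threshold(ttl_on_arrival(64, 3), SITE_BOUNDARY_THRESHOLD))
```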


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the configuration of a Site-Local RP/MA shown.]

interface Loopback0
 ip address 192.168.10.1 255.255.255.255

ip pim send-rp-discovery scope 15
ip pim send-rp-announce Loopback0 scope 15 group 20
access-list 20 permit 239.192.0.0 0.0.255.255

Admin-Scoping Example
In the drawing above, the configuration for the RP/Mapping Agent in Site C is shown. Interface Loopback0 has also been configured for use as the interface of choice for the Mapping Agent and RP.

The ip pim send-rp-discovery global command defines the router as a Mapping Agent for Site C with the IP address of the loopback interface. Note once again that a TTL scope of 15 is used on this command so that the Discovery messages sourced by this Mapping Agent will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

The ip pim send-rp-announce global command, along with access-control list 20, defines the router as a Candidate RP for the Site-Local group range. Note that a TTL scope of 15 is used on this command so that the Announce messages sourced by this Candidate-RP will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

Note: It is crucial that the Candidate-RP Announce messages do not leak outside of this site and into other sites. If this were to occur, the Mapping Agent(s) in the other site(s) might select the Candidate-RP in Site C as the currently active RP for the Site-Local group. This would break Site-Local multicast in that site.


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the configuration of a Site-Local RP/MA shown.]

interface Loopback0
 ip address 192.168.10.3 255.255.255.255

ip pim send-rp-discovery scope 15
ip pim send-rp-announce Loopback0 scope 15 group 20
access-list 20 permit 239.192.0.0 0.0.255.255

Admin-Scoping Example
In the drawing above, the configuration for the RP/Mapping Agent in Site A is shown. Interface Loopback0 has also been configured for use as the interface of choice for the Mapping Agent and RP.

The ip pim send-rp-discovery global command defines the router as a Mapping Agent for Site A with the IP address of the loopback interface. Note once again that a TTL scope of 15 is used on this command so that the Discovery messages sourced by this Mapping Agent will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

The ip pim send-rp-announce global command, along with access-control list 20, defines the router as a Candidate RP for the Site-Local group range. Note that a TTL scope of 15 is used on this command so that the Announce messages sourced by this Candidate-RP will not exit the Site. (This is accomplished by the use of a ttl-threshold of 16 on Serial0 that was configured in the previous slides.)

Note: It is crucial that the Candidate-RP Announce messages do not leak outside of this site and into other sites. If this were to occur, the Mapping Agent(s) in the other site(s) might select the Candidate-RP in Site A as the currently active RP for the Site-Local group. This would break Site-Local multicast in that site.


BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with Border A labeled as the Main RP/MA (non-Site Local Groups) and its configuration shown.]

interface Loopback0
 ip address 192.168.1.3 255.255.255.255

ip pim send-rp-discovery Loopback0 scope 64
ip pim send-rp-announce Loopback0 scope 64 group 20
ip pim rp-announce-filter rp-list 10

access-list 10 deny 192.168.10.3
access-list 10 permit any

access-list 20 permit 224.0.0.0 0.255.255.255
access-list 20 permit 225.0.0.0 0.255.255.255
. . .
access-list 20 permit 238.0.0.0 0.255.255.255

Admin-Scoping Example
Next, an RP for all the non-Site-Local multicast groups must be configured. Border router A has been chosen for this task. Additionally, router A is configured as a Mapping Agent with sufficient scope to cover the entire network. (Note: The independent Mapping Agents for each site make this step unnecessary as they can handle the Mapping Agent functionality for their site.)

Once again, interface Loopback0 has been configured for use as the interface of choice for the Mapping Agent and RP.

The ip pim send-rp-announce global command, along with access-control list 20, defines the router as a Candidate RP for all non-Site-Local groups. Note that a TTL scope of 64 is used on this command so that the Announce messages sourced by this Candidate-RP will reach all routers in all Sites.

The ip pim send-rp-discovery global command defines the router as a Mapping Agent for the entire network. Note that a TTL scope of 64 is also used on this command so that the Discovery messages sourced by this Mapping Agent reach all routers in all Sites.

Because the scope of the Discovery messages is 64, they will reach all routers in all sites. Therefore, care must be taken to ensure that the C-RP information from the Site-Local C-RP in the HQ site is not accepted and inadvertently advertised in Discovery messages by the Mapping Agent. If this were to occur, the routers in the other site(s) might select the Candidate-RP in the HQ Site as the currently active RP for the Site-Local group. This would break Site-Local multicast in that site. To prevent this from happening, the ip pim rp-announce-filter command, along with access-control list 10, is used to filter out C-RP Announcement messages from the Site-Local C-RP in the HQ Site.

Note: This problem can be avoided if router A is not configured as a Mapping Agent.
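The effect of the announce filter on the network-wide Mapping Agent can be illustrated with a short Python sketch. It is a simplified model that treats each announced group range as an opaque key and applies the "highest candidate IP address wins" rule mentioned earlier; the elect_rps function and the data layout are hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, Set, Tuple

Announcement = Tuple[str, str]  # (candidate RP address, announced group range)


def elect_rps(announcements: Iterable[Announcement],
              denied_rps: Set[str]) -> Dict[str, str]:
    """Build the group-range-to-RP mapping a Mapping Agent would advertise."""
    elected: Dict[str, str] = {}
    for rp, group_range in announcements:
        if rp in denied_rps:
            continue  # dropped by the announce filter
        current = elected.get(group_range)
        if current is None or ipaddress.ip_address(rp) > ipaddress.ip_address(current):
            elected[group_range] = rp  # highest candidate address wins
    return elected


# Announcements heard by the main Mapping Agent on Border A.
heard = [
    ("192.168.10.3", "239.192.0.0/16"),  # Site-Local C-RP in the HQ site
    ("192.168.1.3", "224.0.0.0/8"),      # main C-RP (one of its ranges)
]

print(elect_rps(heard, denied_rps=set()))             # without the filter
print(elect_rps(heard, denied_rps={"192.168.10.3"}))  # with access-list 10
```

Without the filter, the HQ Site-Local RP would be advertised to every site for 239.192.0.0/16; with it, only the non-Site-Local mapping is advertised.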

BW Control via Admin-Scoping


[Diagram: the three-site topology (Site A (HQ), Site B (LA), Site C (ATL)) with the multicast boundary configuration shown for the link to the Internet.]

Interface Serial0
 . . .
 ip multicast ttl-threshold 128
 ip multicast boundary 10

access-list 10 deny 239.0.0.0 0.255.255.255
access-list 10 permit any

Admin-Scoping Example
Finally, the AS Border router must be configured with a multicast boundary so that all locally scoped multicast traffic in the 239.0.0.0/8 range is blocked from entering or leaving the company. In this configuration, the multicast boundary is established on interface Serial0 via the ip multicast boundary 10 interface command. Access-control list 10 is constructed to deny the passage of any traffic in the Admin-Scoped group range (239.0.0.0/8) while all other multicast groups are permitted to cross the interface.

Although it is not necessary for implementing Admin-Scoping, the ip multicast ttl-threshold 128 command is also configured on Serial0 of the AS border router. This is often used to provide TTL scoping of traffic inside the company. Sources that do not wish to have their multicast traffic leave the company can transmit with a TTL less than 128 and be assured that the traffic will not be forwarded into the Internet.


Module Agenda

Bandwidth Control of Multicast
Multicast Traffic Engineering
Network Redundancy
Multicast over NBMA Networks
Reliable Multicast


Non-Congruent Networks

Why would you have non-congruent unicast & multicast networks?


Multicast is not enabled on all paths in the network
Tunnels are used to bypass normal unicast routing
You have policy reasons for making them different
You want to use idle links for multicast traffic

Non-congruent unicast/multicast networks


RPF Calculation cannot use unicast route table
Other source of RPF information must be used


Non-congruent Multicast/Unicast Networks


There are several cases where it might be desirable for Unicast and Multicast traffic to follow separate paths through a network. Some of these cases are:
Multicast is not enabled on all interfaces of a router.
Tunnels are in use to bypass unicast-only sections of a network.
Some policy dictates that multicast traffic follow different paths.
It is desirable for multicast traffic to flow over idle/backup links for better load balancing.

When the unicast and multicast networks are not congruent, certain limitations come into play, such as:
The RPF calculation cannot use the unicast routing table, since that would cause the unicast and multicast networks to be congruent by default.
Some source of information other than the unicast routing table must be used. This imposes additional configuration and administration requirements that (in many cases) are non-trivial.


PIM RPF Calculation Details


[Diagram: four sources of RPF information, listed in decreasing preference, each supply a Route/Mask and Distance to the RPF Calculation, which produces the IIF and RPF Neighbor.]

Static Mroute Table (First Match): Route/Mask, Dist. (Default Dist. = 0)
BGP MRIB (Best Path): Route/Mask, Dist. (eBGP Def. Dist. = 20) (iBGP Def. Dist. = 200)
DVMRP Route Table (Longest Match): Route/Mask, Dist. (Default Dist. = 0)
Unicast Routing Table (Longest Match): Route/Mask, Dist.

RPF Calculation: Use best Distance unless Longest Match (1) is enabled. If enabled, use longest Mask.

(1) Global Command: ip multicast longest-match

PIM RPF Calculations


Cisco IOS permits sources of information other than the unicast routing table to be used in the RPF calculation. In general, these sources are preferred based on their Admin. Distance. If the Admin. Distance values are equal, the sources are preferred in the order listed below:
    Static Mroute Table: Static Mroutes may be defined that are local to the router on which they are defined. The default Admin. Distance of a matching Static Mroute is zero, so it is preferred over other sources. (If another source also has a distance of zero, the Static Mroute takes precedence.)
    BGP Multicast RIB (M-RIB): If MBGP is in use and a matching prefix exists in the MBGP M-RIB, it will be used as long as its Admin. Distance is the lowest of all the sources. (MBGP M-RIB prefixes are preferred over DVMRP or unicast routes if the Admin. Distances are the same.)
    DVMRP Route Table: If DVMRP routes are being exchanged and a matching route exists in the DVMRP route table, the default Admin. Distance of this route is zero. DVMRP routes are preferred over unicast routes if their Admin. Distances are equal.
    Unicast Route Table: This is the least preferred source of information. It is used only if no other source has a matching route with a lower or equal Admin. Distance.
Note: The above behavior can be modified so that the longest-match route is used from the available sources. This is configured with the hidden global command ip multicast longest-match.
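The preference rules above can be sketched in a few lines of Python. This is only an illustrative model, not IOS code: it assumes each table has already produced its own candidate route, the names Candidate and select_rpf are hypothetical, and the tie-break used when longest-match is enabled (distance, then source order) is an assumption that the notes do not spell out.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Tie-break order from the notes: an earlier entry wins when distances are equal.
SOURCE_ORDER = ["static-mroute", "mbgp", "dvmrp", "unicast"]

@dataclass(frozen=True)
class Candidate:
    source: str    # one of SOURCE_ORDER
    prefix: str    # e.g. "172.16.1.0/24"
    distance: int  # administrative distance
    iif: str       # RPF (incoming) interface

def select_rpf(src, candidates, longest_match=False):
    """Pick the candidate used for the RPF check of source address src."""
    addr = ip_address(src)
    matching = [c for c in candidates if addr in ip_network(c.prefix)]
    if not matching:
        return None

    def key(c):
        rank = SOURCE_ORDER.index(c.source)
        if longest_match:
            # Assumed ordering: longest mask first, then distance, then source order.
            return (-ip_network(c.prefix).prefixlen, c.distance, rank)
        return (c.distance, rank)

    return min(matching, key=key)
```

With a default Static Mroute (distance 0) and a more specific unicast route (distance 90), the sketch selects the Static Mroute by default and the unicast route when longest_match=True.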


Alt. Path Routing with Static Mroutes


Statically configured using command:
ip mroute <source> <mask> [<protocol>][route-map <map>] <rpf-nbr> | <interface> [<distance>]

Multiple mroutes may be specified
    Searched in order of configuration
    Search stops on first match and route is used
    Admin distance of mroute compared to other routes

Mroutes have a default distance of zero
    Preferred over all other routes by default


Static Mroutes
A Static Mroute may be configured using the command listed above to match based on the <source> and <mask> parameters. Either an <rpf-nbr> address or a specific <interface> can be configured as the next hop.
An optional <protocol> may be configured, which requires that a matching route exist in the specified unicast routing protocol's database for the Static Mroute to match.
An optional route-map <map> clause may be configured so that the Static Mroute matches only when the qualifications specified in the route-map are met.
The default Admin. Distance of a Static Mroute is zero. This value may be overridden by the use of the <distance> parameter.
If multiple Static Mroutes are configured, they are searched for a match in the order in which they were configured. If a match is found, the search terminates and the Static Mroute is used if it has an Admin. Distance less than or equal to that of any other source of RPF information.
Note: Unlike their unicast counterparts, Static Mroutes only have significance on the router on which they are defined and cannot be redistributed.


Alt. Path Routing with Static Mroutes

A stub connection where you have a tunnel for multicast access


ip mroute 0.0.0.0 0.0.0.0 tunnel0

[Figure: Central Site, tunnel0, multicast router MR, unicast router UR.]


Static Mroute: Example 1


In this example, a multicast router (MR) has been configured with a tunnel to a multicast router outside of the network so that it can receive multicast traffic. The ip mroute 0.0.0.0 0.0.0.0 Tunnel0 command instructs router MR to RPF to Tunnel0 for all multicast sources instead of using the unicast routing information. Therefore, any multicast traffic arriving via Tunnel0 will RPF correctly using this Static Mroute. If this Static Mroute were not used, router MR would use the unicast routing information, which would cause it to RPF to router UR instead of Tunnel0 for traffic arriving from outside of the network.
Note: While this is a simple way to force router MR to RPF to the tunnel for traffic arriving from outside of the network, traffic arriving at MR from sources inside the network will fail the RPF check. This is because the source/mask covers all multicast sources, both inside and outside of the network. Obviously, a more sophisticated solution is required. (See next slide.)


Alt. Path Routing with Static Mroutes

You want to tailor RPF for many routes


ip mroute 0.0.0.0 0.0.0.0 ospf 1 null0 255
ip mroute 0.0.0.0 0.0.0.0 tunnel0

[Figure: Central Site, OSPF Domain, tunnel0, multicast router MR, unicast router UR.]


Static Mroute: Example 2


In this example, the multicast router (MR) has been reconfigured so that only traffic arriving from outside the network will RPF to Tunnel0, while traffic arriving from sources inside the OSPF domain will RPF correctly.
An additional Static Mroute command, ip mroute 0.0.0.0 0.0.0.0 ospf 1 null0 255, is configured ahead of the original Static Mroute used in the previous example. The command instructs router MR to match on any multicast source that has a corresponding entry in the OSPF 1 process database. For these matching sources, the RPF interface is set to Null0 and the Admin. Distance of the Static Mroute is set to 255.
The second (and original) ip mroute 0.0.0.0 0.0.0.0 Tunnel0 command instructs router MR to RPF to Tunnel0 for all multicast sources, just as in the previous example. However, since this command appears second in the configuration, it will only be reached if the first command fails to match (i.e. the source is not inside the OSPF domain).
Exactly how these two commands combine to achieve the desired result may not be immediately obvious and is therefore described below:
    Sources outside the OSPF domain: These will not match the first Static Mroute in the list, since there is no matching route in the OSPF database. However, the second Static Mroute will match with an RPF interface of Tunnel0 and an Admin. Distance of zero. Therefore, these sources will RPF to Tunnel0.
    Sources inside the OSPF domain: These will match the first Static Mroute because there is a matching route in the OSPF database, and the search of the Static Mroutes terminates with an RPF interface of Null0 and an Admin. Distance of 255. However, since the unicast routing table will also have a matching OSPF route with a lower (better) Admin. Distance than 255, the route in the unicast routing table will be used. Therefore, the router will RPF to the correct interface for this source inside the OSPF domain.
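The interaction of the two Static Mroutes can be traced with a small Python model. It is a sketch under stated assumptions: the prefix 10.1.0.0/16, the interface Ethernet0 and all function names are hypothetical, the OSPF distance of 110 is the usual IOS default, and the "matching route in the OSPF database" test is approximated by checking which protocol owns the best unicast route.

```python
from ipaddress import ip_address, ip_network

# Static Mroutes in configuration order:
# (prefix, required protocol or None, RPF interface, admin distance)
MROUTES = [
    ("0.0.0.0/0", "ospf 1", "Null0", 255),
    ("0.0.0.0/0", None, "Tunnel0", 0),
]

# Hypothetical unicast table: prefix -> (protocol, interface, admin distance)
UNICAST = {"10.1.0.0/16": ("ospf 1", "Ethernet0", 110)}

def unicast_lookup(src):
    """Longest-match lookup in the unicast table."""
    addr = ip_address(src)
    best = None
    for prefix, entry in UNICAST.items():
        net = ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best[1] if best else None

def static_mroute_lookup(src):
    """First-match search of the Static Mroutes; the search stops on a match."""
    addr = ip_address(src)
    uni = unicast_lookup(src)
    for prefix, proto, iif, dist in MROUTES:
        if addr not in ip_network(prefix):
            continue
        if proto is not None and (uni is None or uni[0] != proto):
            continue  # protocol qualifier not satisfied, try the next mroute
        return iif, dist
    return None

def rpf_interface(src):
    """Use the matched mroute only if its distance is <= the unicast route's."""
    mroute = static_mroute_lookup(src)
    uni = unicast_lookup(src)
    if mroute and (uni is None or mroute[1] <= uni[2]):
        return mroute[0]
    return uni[1] if uni else None
```

For a source such as 10.1.2.3 inside the OSPF domain the model returns Ethernet0, because the Null0 mroute (distance 255) loses to the OSPF route; for an outside source such as 192.0.2.7 it returns Tunnel0.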

Alt. Path Routing with DVMRP

Use DVMRP routes for RPF Check


    Permits separate unicast & multicast topologies
    Can use some unicast routes and some routes from the DVMRP table
    DVMRP routes are preferred by default
        Default DVMRP Distance = 0

Warning!
Care must be used to prevent route redistribution problems


Using DVMRP Routes


As shown in the slide on RPF Details, DVMRP routes can be used as an alternate source of RPF information which, in turn, can be used to provide separate unicast and multicast topologies. The default Admin. Distance of DVMRP routes (zero) makes them preferred over routes in the unicast route table. (In the case of a tie in Admin. Distance values, a DVMRP route is preferred over a unicast route for RPF calculation.)
Just as in unicast redistribution scenarios, care must be taken to prevent route loops when exchanging DVMRP routes between Cisco routers, because unicast routes are automatically injected into DVMRP by default. This redistribution can cause multicast route loops or RPF failures.


Alt. Path Routing with DVMRP


ip dvmrp unicast-routing causes DVMRP routes to be exchanged between two Cisco routers.
[Figure: two PIM routers, each holding a DVMRP Route Table and a Unicast Route Table, exchange DVMRP routes* over a link configured with ip dvmrp unicast-routing.]
* Split-Horizon is used between two Cisco routers.

Using DVMRP Routes


The ip dvmrp unicast-routing interface command instructs a Cisco router to begin sending and receiving DVMRP routes on the interface. When two Cisco routers are connected via interfaces that have this command configured (such as in the example above), they will:
    Receive DVMRP route reports and install them in their DVMRP route table using standard DVMRP metrics.
    Send DVMRP route reports from the routes contained in their DVMRP route table.
    Inject certain selected routes from the unicast routing table into DVMRP route reports. By default, only connected routes are injected into these DVMRP route reports. However, additional routes may be injected from the unicast routing table. (See next slide.)
Note: When two Cisco routers are exchanging DVMRP route reports, the normal DVMRP Poison-Reverse mechanism is not used. Instead, Split-Horizon is used so that DVMRP routes are not advertised back out the interface on which they were received.
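The Split-Horizon rule in the note can be illustrated with a short Python sketch. The table contents, interface names and the function name split_horizon_report are hypothetical, and metric adjustment on send or receive is deliberately left out.

```python
# Hypothetical DVMRP route table: network -> (interface learned on, metric)
DVMRP_TABLE = {
    "151.16.0.0/16": ("Tunnel0", 7),
    "172.34.15.0/24": ("Tunnel0", 10),
    "176.32.40.0/24": ("Ethernet1", 3),
}

def split_horizon_report(dvmrp_table, out_intf):
    """Return the (network, metric) pairs advertised out of out_intf.

    Routes learned on out_intf are simply omitted (Split-Horizon),
    rather than being advertised back as unreachable (Poison-Reverse).
    """
    return [
        (net, metric)
        for net, (learned_on, metric) in dvmrp_table.items()
        if learned_on != out_intf
    ]
```

In this model a report sent out Tunnel0 carries only the route learned on Ethernet1, while a report sent out any other interface carries all three routes.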


Alt. Path Routing with DVMRP


Injecting ALL routes using the ip dvmrp metric command
DVMRP Routes
    151.16.0.0/16, M=8
    172.34.15.0/24, M=11
    202.13.3.0/24, M=9
    176.32.10.0/24, M=1
    176.32.15.0/24, M=1
    176.32.20.0/24, M=1
    . . . (10,000 Routes!)

interface Tunnel 0
 ip unnumbered Ethernet 0
 ip dvmrp metric 1
 ...
interface E0
 ip addr 176.32.10.1 255.255.255.0
 ip pim sparse-dense-mode
interface E1
 ip addr 176.32.15.1 255.255.255.0
 ip pim sparse-dense-mode

Unicast Route Table (10,000 Routes)
Network          Intf   Metric     Dist
176.32.10.0/24   E0     10514432   90
176.32.15.0/24   E1     10512012   90
176.32.20.0/24   E1     45106272   90
(Includes 200-176.32 Routes)

DVMRP Route Table
Src Network      Intf   Metric   Dist
151.16.0.0/16    E0     7        0
172.34.15.0/24   E0     10       0
202.13.3.0/24    E0     8        0

[Figure: router with a Tunnel and interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24).]

Always use an Access-List with the ip dvmrp metric command

Using DVMRP Routes - Injecting unicast routes


The default behavior of a Cisco router is to inject only connected routes. However, the ip dvmrp metric interface command can be used to modify this behavior. When this command is used without an access-control list, ALL routes in the unicast routing table will be injected into DVMRP route reports. An example of this is shown in the drawing above. The ip dvmrp metric 1 command results in the entire unicast routing table being injected into DVMRP route reports with a metric of 1. In most cases, this is not desirable and an access-control list should be used to limit which routes are injected.


DVMRP Redistribution Problem


Unicast Route Table (router A)
Network          Intf   Metric     Dist
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.1.0/24    S0     45106272   90
172.16.20.0/24   S0     45126319   90

Unicast Route Table (router B)
Network          Intf   Metric     Dist
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.1.0/24    S1     45034510   90
172.16.20.0/24   S1     45085628   90

Configured at each end of the tunnel:
ip dvmrp unicast-routing
ip dvmrp metric 1

[Figure: routers A, B and C. A and B are joined by a GRE Tunnel (Tun 0 on each router). Interface labels: E0, E1, S0, S1. Networks: 172.16.10.0/24, 172.16.15.0/24, 172.16.1.0/24.]

DVMRP Redistribution Problem


As previously stated, care must be taken to avoid redistribution problems when DVMRP routes are being exchanged between Cisco routers. This example demonstrates what can happen if careful attention to route injection is not observed. The drawing above shows two Cisco routers (along with their unicast route table) connected via a GRE Tunnel which is being used to tunnel through a unicast-only cloud. At each end of the tunnel, ip dvmrp unicast-routing has been enabled and the entire set of routes from the unicast routing table is being injected into DVMRP route reports with a metric of 1 by the use of the ip dvmrp metric 1 command.


DVMRP Redistribution Problem


Unicast Route Table (router A)
Network          Intf   Metric     Dist
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.1.0/24    S0     45106272   90
172.16.20.0/24   S0     45126319   90

DVMRP Report
    172.16.10.0/24, M=1
    172.16.15.0/24, M=1
    172.16.1.0/24, M=1
    172.16.20.0/24, M=1
    ...

All unicast routes are advertised as DVMRP routes as a result of the ip dvmrp metric 1 command.

[Figure: routers A, B and C with the GRE Tunnel between A and B (Tun 0 on each router). Networks: 172.16.10.0/24, 172.16.15.0/24, 172.16.1.0/24.]

DVMRP Redistribution Problem


The drawing above shows router A injecting its entire set of unicast routes into a DVMRP report, each with a metric of 1.


DVMRP Redistribution Problem


DVMRP Route Table (router B)
Network          Intf   Metric   Dist
172.16.10.0/24   Tun0   2        0
172.16.15.0/24   Tun0   2        0
172.16.1.0/24    Tun0   2        0      <- Preferred Route
172.16.20.0/24   Tun0   2        0

Unicast Route Table (router B)
Network          Intf   Metric     Dist
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.1.0/24    S1     45034510   90
172.16.20.0/24   S1     45085628   90

DVMRP Redistribution Problem


As a result, router B has installed these DVMRP routes in its DVMRP route table with the appropriate metrics. Router B will now prefer the DVMRP route for network 172.16.1.0/24 over the same unicast route since the DVMRP route has an Admin Distance of zero. Now assume that the source begins to send multicast traffic which arrives at router B via interface Serial1. Unfortunately, this traffic will RPF Fail because the preferred DVMRP route indicates that the correct RPF interface for this traffic is Tunnel0.

[Figure: routers A, B and C with the GRE Tunnel between A and B (Tun 0 on each router). Traffic from the Source arriving at router B on S1 is marked "RPF Failure!".]

DVMRP Redistribution Problem


Correct Configuration
interface Tunnel0
 ip address <address> <mask>
 ip dvmrp unicast-routing
 ip dvmrp metric 1 list 10

access-list 10 permit 172.16.15.0 0.0.0.255
access-list 10 permit 172.16.10.0 0.0.0.255

Solution to DVMRP Redistribution Problem


The correct way to configure router A is to use an access-control list on the ip dvmrp metric command that specifies only those networks behind router A. (In this case, networks 172.16.15.0/24 and 172.16.10.0/24.)
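To see which routes such an access-control list lets through, the wildcard-mask comparison can be sketched in Python. This is a simplified model of a standard ACL with permit entries only; the function names are hypothetical and the route list is taken from router A's unicast table in the slides.

```python
from ipaddress import ip_address

def acl_permits(network, acl):
    """Return True if network matches a permit entry (base, wildcard).

    Bits set to 1 in the wildcard mask are "don't care" bits; all other
    bits must equal the base address. No match means implicit deny.
    """
    addr = int(ip_address(network))
    for base, wildcard in acl:
        care = ~int(ip_address(wildcard)) & 0xFFFFFFFF
        if addr & care == int(ip_address(base)) & care:
            return True
    return False

# access-list 10 from the configuration above
ACL_10 = [("172.16.15.0", "0.0.0.255"), ("172.16.10.0", "0.0.0.255")]

# Networks in router A's unicast route table
UNICAST_ROUTES = ["172.16.10.0", "172.16.15.0", "172.16.1.0", "172.16.20.0"]

def dvmrp_injected(routes, acl, metric=1):
    """Routes injected into DVMRP reports by 'ip dvmrp metric <metric> list <acl>'."""
    return [(route, metric) for route in routes if acl_permits(route, acl)]
```

Calling dvmrp_injected(UNICAST_ROUTES, ACL_10) keeps only 172.16.10.0 and 172.16.15.0, each with a metric of 1.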

[Figure: routers A, B and C with the GRE Tunnel between A and B (Tun 0 on each router), networks 172.16.10.0/24, 172.16.15.0/24 and 172.16.1.0/24, and the Source.]

DVMRP Redistribution Problem


Unicast Route Table (router A)
Network          Intf   Metric     Dist
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.1.0/24    S0     45106272   90
172.16.20.0/24   S0     45126319   90

Only selected unicast routes are advertised as DVMRP routes as a result of the new ACL on the ip dvmrp metric 1 command.

DVMRP Report
    172.16.10.0/24, M=1
    172.16.15.0/24, M=1

[Figure: routers A, B and C with the GRE Tunnel between A and B (Tun 0 on each router), networks 172.16.10.0/24, 172.16.15.0/24 and 172.16.1.0/24, and the Source.]

Solution to DVMRP Redistribution Problem


Using this new configuration, we now see that router A is only injecting networks 172.16.10.0/24 and 172.16.15.0/24 in DVMRP route reports that are sent to router B.


DVMRP Redistribution Problem


DVMRP Route Table (router B)
Network          Intf   Metric   Dist
172.16.10.0/24   Tun0   2        0
172.16.15.0/24   Tun0   2        0

Unicast Route Table (router B)
Network          Intf   Metric     Dist
172.16.1.0/24    S1     45034510   90     <- Preferred Route
172.16.10.0/24   E0     10514432   90
172.16.15.0/24   E1     10512012   90
172.16.20.0/24   S1     45085628   90

Solution to DVMRP Redistribution Problem


As a result, router B has installed only these two DVMRP routes in its DVMRP route table with the appropriate metrics. Now when router B receives traffic from the source on network 172.16.1.0/24, it will only find this route in its unicast routing table and will use this information to correctly calculate the RPF interface as Serial1. This will allow the RPF check to succeed for traffic arriving from the source.
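The before-and-after behavior at router B can be reproduced with a small Python model built from the tables in the slides. It is a sketch only: lookups are exact matches on the network rather than longest-match, and the function name rpf_check is hypothetical.

```python
# Router B's unicast route table: network -> (interface, admin distance)
UNICAST_B = {
    "172.16.10.0/24": ("E0", 90),
    "172.16.15.0/24": ("E1", 90),
    "172.16.1.0/24": ("S1", 90),
    "172.16.20.0/24": ("S1", 90),
}

# Router B's DVMRP route table (network -> interface), before and after the ACL on router A
DVMRP_BEFORE = {net: "Tun0" for net in UNICAST_B}
DVMRP_AFTER = {"172.16.10.0/24": "Tun0", "172.16.15.0/24": "Tun0"}

def rpf_check(arrival_intf, network, dvmrp_table, unicast_table):
    """Return True if traffic from network passes the RPF check on arrival_intf."""
    candidates = []  # (admin distance, tie-break rank, interface)
    if network in dvmrp_table:
        candidates.append((0, 0, dvmrp_table[network]))  # DVMRP default distance is 0
    if network in unicast_table:
        intf, dist = unicast_table[network]
        candidates.append((dist, 1, intf))
    if not candidates:
        return False
    _, _, rpf_intf = min(candidates)
    return rpf_intf == arrival_intf
```

With DVMRP_BEFORE, traffic from 172.16.1.0/24 arriving on S1 fails the check because the DVMRP route points at Tun0; with DVMRP_AFTER, only the unicast route exists for that network and the check succeeds.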

[Figure: routers A, B and C with the GRE Tunnel between A and B (Tun 0 on each router). Traffic from the Source arriving at router B on S1 is marked "RPF Succeeds!".]

Alt. Path Routing with DVMRP


Either make careful use of ACLs
    Use an ACL on all ip dvmrp metric commands
        To prevent route loops and RPF problems.
    Complex problem to administer.
    ACLs may prevent the network from converging after a failure.

Or run DVMRP everywhere!
    Results in ships-in-the-night routing
    Enable ip dvmrp unicast-routing on every interface
    Inject only connected routes (default)
        i.e. Don't use the ip dvmrp metric command

Using DVMRP Routes Summary


The use of DVMRP for alternate path routing requires careful planning and configuration. Access-control lists must be used on all ip dvmrp metric commands if route loops and RPF problems are to be avoided. The administration problems can grow quite large for networks where DVMRP alternate path routing is used on a large scale. Furthermore, because the access-control lists used to control which unicast routes are injected into DVMRP are static, changes in topology can result in the network failing to route around failed links.
The alternative is to use a Ships-in-the-night approach where DVMRP routes are exchanged over every interface in the network! When this approach is used, only connected networks should be injected on every interface by every router in the network. (This is the default behavior if the ip dvmrp metric command is not used when ip dvmrp unicast-routing is enabled on an interface.) The advantage of this method is that no ACLs are necessary and the network will re-converge around failed links. The disadvantage is that the network is now running both DVMRP and a separate unicast routing protocol, which pass each other like Ships-in-the-night at every point in the network. In most cases, this is not desirable.


Alternate Path Routing

CONCLUSION
Multicast Alternate Path Routing is very complex to implement and administer.

Avoid doing it if you can!



Alternate Path Routing Conclusion


The use of alternate path routing for multicast traffic generally results in networks that are complex and difficult to administer. Until new tools become available to simplify this task, network administrators are advised to use it only as a last resort.


Tunneling Capabilities

Tunnels are used when multicast routers don't have contiguous connectivity
Support two types of encapsulations for IP multicast traffic
    DVMRP tunnels (IP protocol number 4)
    GRE tunnels (IP protocol number 47)

Both are supported by fast switching


Tunneling Multicast
Cisco IOS supports two types of tunnels for IP Multicast traffic. Both of these tunnel modes are fast-switched.
    DVMRP Tunnels - This is actually an IP-in-IP tunnel and uses protocol number 4. DVMRP tunnels are not supported between two Cisco routers. They are intended solely for use between a Cisco router and a non-Cisco router running DVMRP.
    GRE Tunnels - IP Multicast traffic may be tunneled between two Cisco routers using GRE tunnels (protocol 47).


DVMRP Tunnels

Used between a Cisco router and a non-Cisco DVMRP router
DVMRP Tunnel Example
    interface tunnel0
     tunnel source ethernet0
     tunnel destination <ip-address>
     tunnel mode dvmrp
     ip unnumbered ethernet0
     ip pim sparse-dense-mode


DVMRP Tunnels
DVMRP tunnels are only used between Cisco routers and non-Cisco DVMRP routers. DVMRP tunnels cannot be used between two Cisco routers! A normal tunnel configuration (such as the one shown above) is used with the tunnel mode set to dvmrp.


GRE Tunnels
Used between two Cisco routers
Looks like a point-to-point link for all protocols in the box
    All packets get an extra IP header
    Provides data sequencing and security

Used when user wants non-congruent multicast and unicast topologies
GRE Tunnel Example
    interface tunnel0
     tunnel source ethernet0
     tunnel destination <ip-address>
     tunnel mode gre ip
     ip unnumbered ethernet0
     ip pim sparse-dense-mode

GRE Tunnels
When it is necessary to tunnel multicast traffic between two Cisco routers, a GRE Tunnel must be used. The GRE Tunnel appears as a point-to-point link to both ends. Each packet sent down the tunnel gets an extra IP header. GRE tunnels also provide data sequencing and some degree of security. A normal tunnel configuration (such as the one shown above) is used with the tunnel mode set to gre ip.


Load Splitting using Tunnels


We use tunnels and load split across different (S,G) entries
    Per packet when process level switching

(S1, G), oif = tu0, rewrite = serial0
(S2, G), oif = tu0, rewrite = serial1

[Figure: two routers connected by links serial0 and serial1, with tunnel0 configured between them.]

When doing MAC level rewrite, select among a set of equal-cost paths to the tunnel endpoint

Load Splitting using Tunnels


Normally, a multicast traffic flow can have only one RPF interface. Even if multiple equal-cost routes exist in the unicast routing table, only the one with the highest IP address is used. This normally precludes any load balancing of multicast traffic. However, the router can be made to see two equal-cost paths as a single interface if a tunnel is used.
The example above shows how a tunnel may be used to accomplish a limited degree of load balancing.
    Tunnel0 is configured between the two routers.
    Multicast is enabled on Tunnel0 but not on Serial0 and Serial1.
    Static Mroutes or some other form of alternate path routing is used so that the routers will RPF to Tunnel0 for all traffic arriving from the other router. (Warning: Care must be taken when alternate path routing is used to avoid route loops or black holes.)
The load balancing method will depend on whether Tunnel0 is process-switching or fast-switching.
    Fast Switching - Load balancing will be on a (*,G) or (S,G) flow basis.
    Process Switching - Load balancing will be on a per-packet basis, albeit at the reduced throughput associated with Process Switching.
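The difference between the two load balancing modes can be made concrete with a toy Python model. The notes do not say how a path is chosen for each flow, so the CRC-based hash below is purely an assumption for illustration, as are the function names.

```python
import itertools
import zlib

# The equal-cost physical paths underneath Tunnel0
PATHS = ["Serial0", "Serial1"]

def per_flow_path(source, group):
    """Fast switching: every packet of one (S,G) flow takes the same path.

    Assumption: the path is picked by hashing (S,G); the real selection
    method is not described in the notes.
    """
    digest = zlib.crc32(f"{source},{group}".encode())
    return PATHS[digest % len(PATHS)]

def per_packet_paths(n_packets):
    """Process switching: successive packets alternate across the paths."""
    round_robin = itertools.cycle(PATHS)
    return [next(round_robin) for _ in range(n_packets)]
```

In the per-flow model a single busy (S,G) flow can never be spread over both serial links, which is why the notes describe the result as only a limited degree of load balancing.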


SPT Thresholds

Shared trees are good for router state savings


When delay and frequency are not an issue

Source trees are good for low delay paths


At the expense of router state

SPT thresholds allow you to use both tree types: you can tailor when you switch from shared to source trees

SPT Thresholds
The key advantage of a Shared Tree is that all multicast flows in the group may use the Shared Tree, thereby reducing the amount of multicast forwarding state in the routers in the network. However, the (normally) sub-optimal paths of the Shared Tree introduce additional delay and possible points of congestion along the single common tree.
Switching to Source Trees (aka Shortest-Path Trees, or SPT for short) is the default behavior of Cisco's PIM implementation. The advantage of Source Trees is that multicast traffic flows via the shortest path from the sources to the receivers, which reduces latency and the potential for congestion. However, this is accomplished at the expense of more multicast forwarding state in the routers in the network.
SPT Thresholds permit the network engineer to tune the traffic rate (in kbps) at which a last-hop router switches to the SPT.


SPT Thresholds

How to configure SPT-Thresholds


ip pim spt-threshold <kbps> | infinity [group-list <acl>]

Must be configured on all last-hop routers


Not in the RP

When you want only Shared Trees


ip pim spt-threshold infinity


SPT Thresholds
SPT Thresholds may be configured on a router using the global configuration command shown above.
    <kbps> | infinity - Defines the rate in kbps at which a last-hop router switches to the SPT. If the value infinity is used, the router will never switch to the SPT and traffic will only flow down the Shared Tree.
    group-list <acl> - Optional ACL that defines the groups to which the SPT-threshold applies. If this ACL is not specified, all multicast groups are assumed.

SPT-Thresholds must be configured on each individual router in the network. They will not have the desired effect if they are configured only on the RP. (This is because the RP does not communicate this value to the routers in the network.)


SPT Threshold Example

Forcing 224.2.0.0/16 traffic to remain on the Shared Tree


ip pim spt-threshold infinity group-list 1
access-list 1 permit 224.2.0.0 0.0.255.255


SPT-Threshold Example
In the above example, the network engineer desires to force all multicast traffic in the 224.2.0.0/16 group range to never switch to the SPT. This will help reduce the amount of multicast forwarding state in the network for this group at the expense of sub-optimal routing paths.
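The decision a last-hop router makes with this configuration can be modeled in Python. This is a sketch, not the IOS algorithm: the function name is hypothetical, the rate comparison is simplified to a single check, and groups outside the group-list are assumed to fall back to a threshold of 0 kbps (immediate switchover, the default behavior described earlier).

```python
import math
from ipaddress import ip_address, ip_network

INFINITY = math.inf  # models the 'infinity' keyword

def should_join_spt(group, rate_kbps, threshold_kbps=0.0, group_list=None):
    """Return True if the last-hop router would switch this group to the SPT.

    group_list is a list of prefixes standing in for the ACL; None means
    the threshold applies to all groups.
    """
    applies = group_list is None or any(
        ip_address(group) in ip_network(prefix) for prefix in group_list
    )
    effective = threshold_kbps if applies else 0.0  # assumed default threshold
    return rate_kbps >= effective
```

With threshold_kbps=INFINITY and group_list=["224.2.0.0/16"], a group such as 224.2.1.1 stays on the Shared Tree at any rate, while a group outside that range still switches to the SPT.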


IP Multicast Helper Maps

Problem: hosts take longer than routers to get IP multicast deployed
Issue: there are host applications deployed that use UDP broadcast transmission
Solution: have routers map broadcast address to multicast address
    To make use of IP multicast in the infrastructure as soon as possible


IP Multicast Helper Maps


Although most of today's modern IP stacks support IGMP and multicast, there are still cases where multicast is not supported. For example, there are several cases where Tandem or IBM hosts are used to provide Stock Market ticker information that is sent via IP in unicast or (ugh) UDP broadcast. When it is desired to send this UDP broadcast traffic across a routed network, problems arise. UDP Flooding has often been used but is difficult to maintain and does not scale.
The solution is to have the first-hop router convert the UDP Broadcast into multicast. This is a straightforward process requiring the following steps:
    Packets of the UDP broadcast are identified by UDP port number.
    The destination broadcast IP address of the packet is rewritten with the desired multicast group address.
    A new checksum for the IP header is calculated.
    The (now) multicast packet is forwarded as any other multicast packet.
If it is also the case that the receivers do not support multicast, the last-hop routers can also convert specific multicast flows into local subnet broadcasts which can be received by these brain damaged hosts. This is also a straightforward process that requires the following steps:
    The router joins the desired multicast group.
    The packets of the desired UDP flow in this group are identified by UDP port number.
    The destination multicast group address is rewritten with the specified local subnet broadcast address.
    A new checksum for the IP header is calculated.
    The (now) UDP broadcast packet is forwarded onto the local subnet.

IP Multicast Helper Maps

Mapping from broadcast to multicast


ip multicast helper-map broadcast <group-addr> <acl> [ttl <ttl>]

Mapping from multicast to broadcast


ip multicast helper-map <group-addr> <bcast-address> <acl>

Router automatically joins group


ip igmp join-group <group-address>

The above command is automatically added to the router configuration


IP Multicast Helper Maps


Mapping from UDP Broadcast to Multicast is accomplished using the following command syntax:
ip multicast helper-map broadcast <group-addr> <acl> [ttl <ttl>]
    <group-addr> - is the multicast group address that the router will use to replace the destination broadcast address in the received broadcast packet.
    <acl> - is an extended access-control list that is used to identify which UDP broadcast flow is to be converted to multicast.
    <ttl> - is an optional parameter that can be used to specify the TTL value of the multicast packet. This may be important if TTL Scoping is in use.
Mapping from multicast back to broadcast is accomplished using the following command syntax:
ip multicast helper-map <group-addr> <bcast-addr> <acl>
    <group-addr> - is the group address of the multicast flow that the router will convert back to a local broadcast.
    <bcast-addr> - is the destination broadcast address that the router will use to replace the above multicast group address when it rewrites the packet. (This address also identifies the interface/subnet on which the rewritten packet is to be sent.)
    <acl> - is an extended access-control list that is used to identify the UDP port number of the multicast flow that is to be converted to broadcast.
When mapping from multicast back to broadcast, the router joins the group specified by the <group-addr> parameter by automatically adding an ip igmp join-group command to the configuration.


IP Multicast Helper Maps


Broadcast to Multicast Conversion

[Figure: A broadcast source (10.1.1.10) on subnet 10.1.1.0/24 attaches to interface E0 (.1) of router A. Router A receives the broadcast and emits multicast.]

ip multicast helper-map broadcast 224.1.1.1 100 ttl 15
ip forward-protocol udp 2000
access-list 100 permit udp any any eq 2000

[Figure, packet flow: Broadcast (10.1.1.10, 10.1.1.255) UDP Port 2000 -> ACL 100 (UDP port 2000): MATCH! -> multicast helper-map -> Multicast (10.1.1.10, 224.1.1.1) UDP Port 2000 -> Multicast Forwarding Engine]

Broadcast to Multicast Conversion Details


In the top half of the above drawing, router A is configured to convert arriving UDP broadcast packets with a port number of 2000 into multicast packets with a destination group address of 224.1.1.1 and a TTL of 15. Extended access-list 100 is used to identify the UDP broadcast flow to UDP port 2000. (Note that it is necessary to configure the ip forward-protocol udp 2000 command to prevent the router from immediately discarding broadcast packets to UDP port 2000.)

The bottom half of the drawing shows the steps that the router takes to convert the packet to multicast:

A UDP broadcast packet with a UDP port of 2000 is received from host 10.1.1.10, sent to the subnet broadcast address 10.1.1.255.
The packet is checked against extended access-list 100 to see if it matches the specified flow. (In this case, it does.)
The matching packet is then sent to the multicast-helper front-end, which simply replaces the destination broadcast address with the specified multicast group address (224.1.1.1) and recalculates the IP header checksum.
The (now) multicast packet is handed off to the router's Multicast Forwarding Engine, which processes the packet like any other arriving multicast packet. (Note: As far as the Multicast Forwarding Engine is concerned, the packet was sent by the original host, in this case 10.1.1.10.)
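The rewrite step described above can be sketched in a few lines of Python. This is only an illustration of the idea (match on UDP port, replace the destination address, optionally set the TTL, recompute the IPv4 header checksum), not the IOS implementation. The function names, the minimal 20-byte header and the starting TTL of 64 are assumptions made for the example; the port check stands in for extended access-list 100.

```python
import ipaddress
import struct
from typing import Optional


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement checksum over the header's 16-bit words (RFC 791)."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str, ttl: int) -> bytes:
    """Build a 20-byte IPv4 header (no options) for a UDP packet."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28, 0, 0, ttl, 17, 0,      # checksum field is zero for now
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
    checksum = ipv4_header_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def helper_map_broadcast(header: bytes, udp_dst_port: int, match_port: int,
                         group: str, ttl: Optional[int] = None) -> bytes:
    """Rewrite a matching broadcast header so that it is addressed to `group`."""
    if udp_dst_port != match_port:          # stand-in for the extended ACL check
        return header
    new_ttl = header[8] if ttl is None else ttl
    rewritten = (header[:8] + bytes([new_ttl]) + header[9:10] + b"\x00\x00"
                 + header[12:16] + ipaddress.IPv4Address(group).packed)
    checksum = ipv4_header_checksum(rewritten)
    return rewritten[:10] + struct.pack("!H", checksum) + rewritten[12:]


original = build_header("10.1.1.10", "10.1.1.255", ttl=64)
converted = helper_map_broadcast(original, 2000, 2000, "224.1.1.1", ttl=15)
print(ipaddress.IPv4Address(converted[16:20]))   # destination after the rewrite
print(ipv4_header_checksum(converted) == 0)      # a valid header sums to zero
```

The source address is left untouched, which mirrors the note that the Multicast Forwarding Engine still sees the packet as coming from the original host.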


IP Multicast Helper Maps


Multicast to Broadcast Conversion

[Figure: Router A receives multicast on S0 and emits broadcast on E0 (.1) toward a non-multicast receiver on subnet 10.2.2.0/24.]

ip multicast helper-map 224.1.1.1 10.2.2.255 100
ip forward-protocol udp 2000
access-list 100 permit udp any any eq 2000

[Figure, packet flow: Multicast (10.1.1.10, 224.1.1.1) UDP Port 2000 -> Multicast Forwarding Engine -> ACL 100 (UDP port 2000): MATCH! -> multicast helper-map -> Broadcast (10.1.1.10, 10.2.2.255) UDP Port 2000]

Multicast to Broadcast Conversion Details


In the top half of the above drawing, router A is configured to convert arriving group 224.1.1.1 multicast packets with a port number of 2000 into UDP broadcast packets with a destination subnet broadcast address of 10.2.2.255. Extended access-list 100 is used to identify UDP packets addressed to UDP port 2000 within the group 224.1.1.1 multicast flow.

The bottom half of the drawing shows the steps that the router takes to convert the packet to broadcast:

The Multicast Forwarding Engine identifies arriving multicast packets for group 224.1.1.1 and checks them against extended access-list 100 to see if they match the specified flow. (In this case, they do.)
The matching packet is then sent to the multicast-helper back-end, which simply replaces the destination multicast group address (224.1.1.1) with the specified broadcast address (10.2.2.255) and recalculates the IP header checksum.
The (now) UDP broadcast packet is forwarded to the appropriate subnet. (Note that it is necessary to configure the ip forward-protocol udp 2000 command so that the router will forward the broadcast packets to the destination subnet.)

IP Multicast Helper Maps


[Figure: Broadcast source (10.1.1.10) on 10.1.1.0/24 -> router A (E0 .1, S0) -> multicast-capable network -> router B (S0, E0 .1) -> non-multicast receiver on 10.2.2.0/24. The ip multicast helper-map configuration below applies to router A.]

interface Ethernet0
 ip address 10.1.1.1 255.255.255.0
 ip directed-broadcast
 ip multicast helper-map broadcast 224.1.1.1 100 ttl 15

access-list 100 permit udp any any eq 2000
ip forward-protocol udp 2000

IP Multicast Helper Maps - Example


This is an example configuration in which the UDP broadcast traffic from a broadcast source is converted to IP Multicast, travels across the multicast-enabled network, and is converted back to a directed subnet broadcast at the far end. The Router A configuration necessary to convert from broadcast to multicast is shown above. Notice the following:

ip directed-broadcast must be enabled.
ip forward-protocol udp 2000 must be configured to prevent the router from ignoring the UDP broadcast packets.
Extended access-list 100 is used to identify the UDP broadcast stream that is to be converted.
The ip multicast helper-map command causes the UDP broadcast stream to be converted to a 224.1.1.1 multicast stream with a TTL of 15.

IP Multicast Helper Maps


[Figure: Broadcast source (10.1.1.10) on 10.1.1.0/24 -> router A (E0 .1, S0) -> multicast-capable network -> router B (S0, E0 .1) -> non-multicast receiver on 10.2.2.0/24. The ip multicast helper-map configuration below applies to router B.]

interface Serial0
 ip address 172.16.255.2 255.255.255.252
 ip multicast helper-map 224.1.1.1 10.2.2.255 100
 ip igmp join-group 224.1.1.1

interface Ethernet0
 ip address 10.2.2.1 255.255.255.0
 ip directed-broadcast

access-list 100 permit udp any any eq 2000
ip forward-protocol udp 2000

IP Multicast Helper Maps - Example


The Router B configuration necessary to convert from multicast back to broadcast is shown above. Notice the following:

Extended access-list 100 is used to identify the UDP multicast stream within group 224.1.1.1 that is to be converted.
The ip multicast helper-map command causes the 224.1.1.1 multicast stream to be converted back to a UDP broadcast stream addressed to the subnet broadcast address 10.2.2.255.
The router has automatically configured the ip igmp join-group 224.1.1.1 command.
ip directed-broadcast must be enabled on Ethernet0.
ip forward-protocol udp 2000 must be configured to get the router to forward the UDP broadcast packets.
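As a rough illustration of why the broadcast address in the helper-map also selects the outgoing subnet, the following Python sketch looks up which interface owns a given directed broadcast address. The interface table mirrors the Router B addressing in this example; the function name and the lookup itself are assumptions for illustration and say nothing about how IOS performs the lookup internally.

```python
import ipaddress
from typing import Optional

# Interface addressing taken from the Router B example configuration.
INTERFACES = {
    "Serial0": ipaddress.ip_interface("172.16.255.2/30"),
    "Ethernet0": ipaddress.ip_interface("10.2.2.1/24"),
}


def egress_for_broadcast(bcast_addr: str) -> Optional[str]:
    """Return the interface whose subnet broadcast address equals bcast_addr."""
    target = ipaddress.ip_address(bcast_addr)
    for name, iface in INTERFACES.items():
        if iface.network.broadcast_address == target:
            return name
    return None   # no directly connected subnet uses this broadcast address


print(egress_for_broadcast("10.2.2.255"))
```

With the addressing above, 10.2.2.255 resolves to Ethernet0, which is the interface where ip directed-broadcast has to be enabled.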


IP Multicast Helper Maps


[Figure: Broadcast source (10.1.1.10) on 10.1.1.0/24 -> router A (E0 .1, S0) -> multicast-capable network -> router B (S0, E0 .1) -> non-multicast receiver on 10.2.2.0/24. Packet addressing on each segment: Broadcast (10.1.1.10, 10.1.1.255) on 10.1.1.0/24; Multicast (10.1.1.10, 224.1.1.1) across the multicast-capable network; Broadcast (10.1.1.10, 10.2.2.255) on 10.2.2.0/24.]

IP Multicast Helper Maps - Example


The results of this configuration are shown above.

UDP port 2000 broadcast packets arriving at Ethernet0 on router A have their IP destination address rewritten to multicast group 224.1.1.1.
These 224.1.1.1 multicast packets flow across the multicast-enabled network to router B.
UDP port 2000 multicast flows to group 224.1.1.1 arriving on Serial0 at router B have their IP destination address rewritten to the subnet broadcast address 10.2.2.255.
Router B forwards these 10.2.2.255 subnet broadcast packets to Ethernet0, which is subnet 10.2.2.0.

Module Agenda

Bandwidth Control of Multicast
Multicast Traffic Engineering
Network Redundancy
Multicast over NBMA Networks
Reliable Multicast

RP-Failover
RP failover time
  Function of Holdtime in RP-Announcement
    Holdtime = 3 x <rp-announce-interval>
    Default <rp-announce-interval> = 60 seconds
    Worst-case (default) failover ~ 3 minutes

Minimizing impact of RP failure
  Use SPTs to reduce impact
    Traffic on SPTs not affected by RP failure
    Immediate switch to SPTs is on by default
    New and/or bursty sources still a problem

RP Failover
The time it takes for the network to detect the failure of the active RP and switch to a backup RP depends on the value of the Holdtime field in the RP-Announcement. The Holdtime field indicates when the RP will time out and be assumed to be down if another RP-Announcement is not received by the Mapping Agent. Holdtime is computed as 3 times the RP-Announce-Interval, which has a default value of 60 seconds. This results in a Holdtime of 3 minutes, which is the worst-case failover time.

The use of SPTs will reduce the effect of an RP failure since the Shared Tree is not being used to deliver multicast traffic. Therefore, if the RP fails, traffic from currently active sources will continue to flow to currently active receivers. However, new sources and/or new receivers will not be able to register or join the Shared Tree until the RP failure has been detected and a switch to a backup RP occurs.

Note: There is no failover mechanism built into PIM for Static RPs. In order to use backup RPs, either Auto-RP or BSR must be used. (A special configuration called Anycast RP can be used if multiple static RPs are desired. This configuration, however, requires the use of MSDP and is not a native function of the PIM protocol.)

Tuning RP-Failover
Tune Candidate RPs
  New interval clause added for C-RPs
    ip pim send-rp-announce <intfc> scope <ttl> [group-list acl] [interval <seconds>]
  Allows rp-announce-interval to be adjusted
  Smaller intervals = faster RP failover
  Smaller intervals increase amount of Auto-RP traffic
    Increase is usually insignificant
  Total RP failover time reduced
    Min. failover ~ 3 seconds

Tuning RP Failover
Prior to IOS release 12.0, the rp-announce-interval was fixed at 60 seconds. Beginning with IOS 12.0, the interval <seconds> clause was added to the ip pim send-rp-announce command. This new clause allows the interval to be tuned, which in turn tunes the Holdtime advertised in the RP Announcements. By reducing the rp-announce-interval, RP failure is detected sooner and therefore failover to the backup RP occurs sooner. However, the reduced interval between announcements results in an increase in RP Announcement traffic in the network. This is generally insignificant and worth the reduced RP failover times. The minimum rp-announce-interval that may be set is 1 second. This corresponds to a worst-case failover of 3 seconds.
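The trade-off between failover time and announcement traffic follows directly from Holdtime = 3 x rp-announce-interval. The short Python sketch below tabulates it for a few intervals. The function name and the "announcements per hour" figure (per candidate RP) are illustrative additions derived from the interval, not values reported by IOS.

```python
def rp_failover(announce_interval_s: int) -> dict:
    """Worst-case RP failover and announcement rate for one candidate RP."""
    if announce_interval_s < 1:
        raise ValueError("the minimum rp-announce-interval is 1 second")
    return {
        "holdtime_s": 3 * announce_interval_s,            # worst-case failover
        "announcements_per_hour": 3600 // announce_interval_s,
    }


for interval in (60, 10, 1):
    result = rp_failover(interval)
    print(f"interval {interval:>2} s -> worst-case failover "
          f"{result['holdtime_s']:>3} s, "
          f"{result['announcements_per_hour']} announcements/hour")
```

The default of 60 seconds gives the 3-minute worst case, and the 1-second minimum gives the 3-second worst case at the cost of sixty times as many announcements.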


DR Failover

[Figure: Router A (.2, DR) and Rtr-B (.1) attached to LAN 192.168.1.0/24.]

Rtr-B>show ip pim neighbor
PIM Neighbor Table
Neighbor Address  Interface   Uptime    Expires   Mode
192.168.1.2       Ethernet0   4d22h     00:01:18  Sparse-Dense (DR)

Depends on neighbor expiration time
Expiration Time sent in PIM query messages
  Expiration time = 3 x <query-interval>
  Default <query-interval> = 30 seconds
  DR Failover ~ 90 seconds (worst case) by default

Designated Router Failover


When more than one multicast router is connected to a LAN, one is elected as the DR and is responsible for sending Registers and Joins on behalf of sources and receivers that are active on the LAN segment. If this router fails, the other router(s) on the LAN segment will detect this and a new DR will be elected. The length of time that it takes for the other routers on the LAN segment to detect that the DR has failed depends on the Expire time value advertised by the DR in its PIM Hello messages. This value is fixed at 3 times the PIM Query interval, which governs how often the router sends PIM Hello messages on the local LAN interface. By default, the query interval is set to 30 seconds, which results in an expiration time of 90 seconds. Therefore, in the worst case it will take the other routers on the LAN segment 90 seconds to detect the failure of the DR and elect a new one.
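To see why 90 seconds is the worst case rather than the typical case, the following Python sketch computes how long the neighbors take to expire a failed DR, depending on where in the Hello cycle the failure happens. It assumes that the expiry timer is restarted by every Hello received; the function name and parameters are hypothetical.

```python
def dr_detection_delay(query_interval: float, since_last_hello: float) -> float:
    """Seconds from DR failure until its neighbors expire it.

    since_last_hello is the time between the DR's last Hello and its failure;
    it cannot exceed one query interval.
    """
    if not 0 <= since_last_hello <= query_interval:
        raise ValueError("failure must fall within one query interval")
    expiration_time = 3 * query_interval      # advertised in the Hello
    return expiration_time - since_last_hello


for interval in (30, 1):
    worst = dr_detection_delay(interval, 0)          # fails right after a Hello
    best = dr_detection_delay(interval, interval)    # fails just before the next
    print(f"query-interval {interval:>2} s: detected after {best} to {worst} s")
```

Under this model the delay always lies between 2 and 3 query intervals, so the default of 30 seconds gives 60 to 90 seconds.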


Tuning DR Failover

Tune PIM query interval


Use interface configuration command
ip pim query-interval <seconds>

Permits DR failover to be adjusted


Min. DR failover ~ 3 seconds (worst case)
Smaller intervals increase PIM query traffic
  Increase is usually insignificant

Tuning DR Failover
The DR Failover process can be indirectly tuned by varying the PIM query interval on the interface. This is accomplished using the following IOS interface command:

ip pim query-interval <seconds>

By reducing this value from its default of 30 seconds, the period between PIM Hello messages is reduced as well as the expiration time advertised in the PIM Hello message. The minimum query interval that can be configured is 1 second. This results in an expiration time of 3 seconds, which is the worst-case scenario for DR failover. However, reducing the query interval from its default of 30 seconds increases the amount of PIM Hello traffic on the local LAN. In most cases, this is an acceptable trade-off when faster DR Failover is desired.

Network Topology Changes

Unicast routing must converge first
PIM converges ~ 5 seconds after unicast
PIM convergence algorithm
  Entire mroute table scanned every 5 seconds
  RPF interface recalculated for every (*, G) and (S, G)
  Joins/prunes/grafts triggered as needed

Network Topology Changes


Convergence of the PIM distribution tree topology is dependent on the convergence of the unicast routing topology. This is because PIM normally makes use of the unicast routing table to calculate the correct RPF interface for each mroute table entry. In order to synchronize changes in the unicast routing table with the PIM topology, the Cisco IOS implementation of PIM recalculates the RPF interface for every entry in the mroute table every 5 seconds. If an RPF interface changes, then the appropriate joins, prunes and/or grafts are sent by the router to rebuild the multicast distribution tree around the network failure. The end result is that the worst-case convergence of PIM is approximately 5 seconds after unicast routing converges.
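One pass of the periodic check described above can be sketched as follows. This is a simplified model, not the IOS algorithm: the table layouts, the rpf_target field (the source for an (S, G) entry, the RP for a (*, G) entry), the interface names and the next-hop addresses are all hypothetical, and grafts are left out.

```python
import ipaddress


def rpf_lookup(unicast_table, address):
    """Longest-prefix match; returns (interface, next_hop) or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, hop in unicast_table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


def rpf_scan(mroutes, unicast_table):
    """Recalculate RPF information for every entry; return triggered actions."""
    actions = []
    for key, entry in mroutes.items():
        new_rpf = rpf_lookup(unicast_table, entry["rpf_target"])
        if new_rpf != entry["rpf"]:
            if entry["rpf"] is not None:
                actions.append(("prune", key, entry["rpf"]))   # leave old path
            if new_rpf is not None:
                actions.append(("join", key, new_rpf))         # build new path
            entry["rpf"] = new_rpf
    return actions


unicast = {"10.1.1.0/24": ("Serial0", "172.16.1.1")}
sg = ("10.1.1.10", "224.1.1.1")
mroutes = {sg: {"rpf_target": "10.1.1.10", "rpf": ("Serial0", "172.16.1.1")}}

first_pass = rpf_scan(mroutes, unicast)                # nothing changed yet
unicast["10.1.1.0/24"] = ("Serial1", "172.16.2.1")     # unicast reconverges
second_pass = rpf_scan(mroutes, unicast)               # next 5-second scan
print(first_pass, second_pass)
```

Because the scan only runs periodically, a unicast change that lands just after one scan is not acted on until the next one, which is where the roughly 5-second worst case comes from.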


Module Agenda

Bandwidth Control of Multicast
Multicast Traffic Engineering
Network Redundancy
Multicast over NBMA Networks
Reliable Multicast

Multicast over NBMA


Full Mesh NBMA Network

[Figure: Frame Relay or ATM cloud forming Logical IP Subnet 192.1.1.0/24. Routers A (.1), B (.2), C (.3) and D (.4) each attach through physical interface S0, with a virtual circuit between every pair of routers.]

Multicast over NBMA


Non-Broadcast, Multi-Access (NBMA) networks such as ATM and Frame Relay are implemented using a Virtual Circuit (VC) concept. When these VCs are implemented using point-to-multipoint interfaces, the NBMA cloud is configured as a Logical IP Subnet (LIS). When this form of NBMA network connectivity is used, the special nature of how broadcast and multicast traffic is forwarded must be considered in order for IP Multicast in general, and PIM in particular, to operate correctly.

Note: It is a completely different situation when point-to-point sub-interfaces are used. In that case, the network is not a LIS and each point-to-point VC/sub-interface has its own subnet. Furthermore, the router sees the network as a collection of point-to-point links with the same characteristics as serial links. This section is not applicable to the point-to-point sub-interface model.

The interconnectivity of the nodes in the NBMA cloud generally falls into two categories, Full Mesh and Partial Mesh. The example network shown in the drawing above is a Full Mesh NBMA network. This has the following characteristics:

Each router has a single point-to-multipoint physical interface to the LIS.
Each router has a separate VC to every other router in the network.

Multicast over NBMA


Partial Mesh NBMA Network

[Figure: Frame Relay, ATM or Dialup cloud forming Logical IP Subnet 192.1.1.0/24. Central site router A (S0 .1) has a virtual circuit to each remote site router B (.2), C (.3) and D (.4); the remote site routers have no virtual circuits to one another.]

Multicast over NBMA


The example network shown in the drawing above is a Partial Mesh NBMA network. This has the following characteristics:

Each router has a single point-to-multipoint physical interface to the LIS.
Each router does not have a separate VC to every other router in the network.

Instead of having a complete mesh that interconnects every router in the network, only a partial set of VCs is configured. The typical configuration is for only the central site router to have a full set of VCs to every remote site router in the network. It is this particular configuration that poses some restrictions on the design of IP Multicast.

Frame Relay and ATM are the scenarios that most people immediately think of when this type of network is mentioned. However, another very common network scenario also fits in this category: Dialup networks. Most Dialup networks (Modem or ISDN) are configured such that all incoming dial connections are given an IP address within a Logical IP Subnet. (Assigning each dialup connection its own subnet would use 4 IP addresses: 2 host addresses, 1 subnet address and 1 broadcast address. The LIS approach saves precious IP addresses as it only requires 2 addresses per dialup connection.) As a result, the same issues that apply to the Partial Mesh, Frame Relay and ATM scenarios apply to Dialup networks.
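The difference between the two designs is easy to quantify. The Python sketch below counts the VCs needed for a full mesh versus a hub-and-spoke partial mesh, and checks the address arithmetic for a per-connection subnet. The function names are illustrative, and 192.1.1.0/30 is just a hypothetical prefix chosen to show what a dedicated two-host subnet consumes.

```python
import ipaddress


def full_mesh_vcs(routers: int) -> int:
    """One VC between every pair of routers."""
    return routers * (routers - 1) // 2


def hub_and_spoke_vcs(routers: int) -> int:
    """One VC from the central site router to each remote site router."""
    return routers - 1


for n in (4, 10, 50):
    print(f"{n:>2} routers: full mesh {full_mesh_vcs(n):>4} VCs, "
          f"hub and spoke {hub_and_spoke_vcs(n):>2} VCs")

# A dedicated subnet per dialup connection: 4 addresses, only 2 usable by hosts.
per_connection = ipaddress.ip_network("192.1.1.0/30")
print(per_connection.num_addresses, len(list(per_connection.hosts())))
```

The quadratic growth of the full mesh is the usual reason the partial mesh is chosen, which is why its multicast restrictions matter in practice.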


How Routers See NBMA at L3


Layer 3 Viewpoint

interface S0
 ip address 192.1.1.1 255.255.255.0
 ip pim sparse-dense-mode

[Figure: The central site router (S0 .1) sees 192.1.1.0/24 as one shared segment to which the remote site routers (S0 .2, .3, .4 and .5) are all attached.]

Layer 3 Viewpoint
When a point-to-multipoint NBMA network is configured using a LIS (as described in the previous slides), the router sees only a single physical interface from a Layer 3 perspective. In the above example, the Central Site router sees Serial0 as a single interface that is connected to the Remote Site routers. Furthermore, this single interface appears (from a Layer 3 point of view) to have the same characteristics as an Ethernet. That is to say, any broadcast or multicast packet sent on Serial0 is expected to reach all remote site routers.

How Routers See NBMA at L3


Layer 2 Reality

interface S0
 ip address 192.1.1.1 255.255.255.0
 frame-relay map ip 192.1.1.2 10 broadcast
 frame-relay map ip 192.1.1.3 11 broadcast
 frame-relay map ip 192.1.1.4 12 broadcast
 frame-relay map ip 192.1.1.5 13 broadcast

[Figure: The central site router (S0 .1) has a separate virtual circuit across 192.1.1.0/24 to each remote site router (S0 .2, .3, .4 and .5).]

Layer 2 Reality
The Layer 2 reality is shown in the above example. The Central Site router actually has separate VCs configured on Serial0 that connect it to each of the remote site routers.

How Routers See NBMA at L3


Layer 3 Viewpoint

(*, 224.1.1.1), 00:00:12/00:00:00, RP 10.1.1.1, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:00:12/00:02:48

[Figure: Remote site routers with members send (*, G) Joins across 192.1.1.0/24 to the central site router (S0 .1). The remote site routers are .2 through .5.]

Layer 3 Viewpoint (*, G) Joins


When several remote site routers join a group (as shown in the above drawing), the router's Layer 3 viewpoint treats Serial0 like an Ethernet interface capable of broadcast. The arriving (*, G) Joins result in only a single interface (Serial0) being put on the Outgoing Interface List of the (*, G) entry, as shown in the above example.

How Routers See NBMA at L3


[Figure (Layer 3 Viewpoint): A source on E0 of the central site router sends to group G. The router treats S0 (.1) on 192.1.1.0/24 as a single broadcast-capable interface toward remote site routers .2 through .5, some of which have members.]

Layer 3 Viewpoint Multicast Flow


When a source for group G goes active, the router's Layer 3 viewpoint causes it to treat Serial0 like an Ethernet interface capable of broadcast. Therefore, the router queues a single copy of the packet to the Serial0 output queue with the expectation that the packet will reach all remote site routers on the LIS.

How Routers See NBMA at L3


Layer 2 Reality

Router has to use pseudo broadcast to replicate packets in Layer 2 code
Packets go where they are not wanted
Process switched!

[Figure: Source on E0 of the central site router. Copies of each packet leave S0 (.1) on every virtual circuit to remote site routers .2 through .5 on 192.1.1.0/24, including a router without members (marked "Unwanted!").]

Layer 2 Reality Multicast Flow


The Layer 2 reality of this situation is that there is a separate VC configured on Serial0 for each remote site router. Therefore, the Frame Relay (or ATM or Dial) interface driver is forced to replicate and send a separate copy of the multicast packet out each VC (often with different MAC headers). This is referred to as Pseudo Broadcast and has the following implications:

Copies of the multicast packet are sent out all VCs regardless of whether the remote router at the other end needs the multicast packet or not.
The packet replication must be handled at the process switching level. This has an enormous impact on router performance and causes throughput to suffer!
These replicated packets are placed in a separate Broadcast output queue on the interface. This is a limited resource and can easily fill up at process switching speeds, resulting in dropped multicast packets. Dropping multicast data is bad enough, but consider what happens if the dropped multicast packet is a PIM control message such as a Join or a Prune. (More on that later.)

Remember: This scenario is only applicable to NBMA networks implemented using point-to-multipoint interfaces and a Logical IP Subnet configuration. It is not applicable when point-to-point sub-interfaces are used.
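The implications listed above can be made concrete with a toy model of pseudo broadcast. In this Python sketch the VC map reuses the remote addresses and DLCIs from the central site example, and the member set assumes routers .2, .3 and .4 have receivers; the function name and the queue_limit parameter (a stand-in for the finite broadcast queue) are hypothetical.

```python
# Remote address -> DLCI for every VC flagged "broadcast" on Serial0.
VC_MAP = {"192.1.1.2": 10, "192.1.1.3": 11, "192.1.1.4": 12, "192.1.1.5": 13}


def pseudo_broadcast(vc_map, members, queue_limit):
    """Replicate one multicast packet onto every VC, ignoring membership.

    Returns (sent, unwanted, dropped) lists of remote router addresses.
    """
    sent, dropped = [], []
    for remote in vc_map:                     # one copy per VC, always
        if len(sent) < queue_limit:
            sent.append(remote)
        else:
            dropped.append(remote)            # broadcast queue is full
    unwanted = [remote for remote in sent if remote not in members]
    return sent, unwanted, dropped


members = {"192.1.1.2", "192.1.1.3", "192.1.1.4"}
print(pseudo_broadcast(VC_MAP, members, queue_limit=4))
print(pseudo_broadcast(VC_MAP, members, queue_limit=2))
```

With enough queue space, router .5 receives a copy it never asked for; with a short queue, copies destined for routers that do have members are the ones that get dropped.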


Pruning Problem with Partial Mesh


[Figure (Layer 3 Viewpoint): Source on E0 of the central site router; S0 (.1) on 192.1.1.0/24 leads to remote site routers .2 through .5, some of which have members. A remote site router without members sends an (S, G) Prune.]

Pruning problems with Partial Mesh NBMA Networks


The misguided Layer 3 viewpoint of the routers in partial mesh networks also has an impact on normal PIM control mechanisms. Consider the network example shown above where only a subset of the remote sites have members for a particular group. Traffic is being sent (via the pseudo broadcast mechanism) from the source to all the remote site routers. One remote site router does not have a member for this group yet it is receiving unwanted (S, G) traffic. It therefore responds by multicasting an (S,G) Prune PIM control message to the LIS for the Central site router to process.


Pruning Problem with Partial Mesh


[Figure (Layer 2 Reality): Source on E0 of the central site router; S0 (.1) on 192.1.1.0/24 has a virtual circuit to each remote site router (.2 through .5). The (S, G) Prune travels only over the sender's virtual circuit to the central site router. It is not heard by the other remote site routers, so the other routers will not override the prune.]

Pruning problems with Partial Mesh NBMA Networks


The Layer 2 reality of the situation is that the (S, G) Prune message is only sent up the VC to the Central site router. The other remote site routers in the partial mesh networks do not hear the (S, G) Prune. As a result of not hearing the (S, G) Prune, the remote site routers do not know to send an (S, G) Join message to the Central site router to override the Prune.


Pruning Problem with Partial Mesh


[Figure (Layer 2 Reality): Source on E0 of the central site router; S0 (.1) on 192.1.1.0/24 has a virtual circuit to each remote site router (.2 through .5). After the prune, (S, G) traffic is shut off to all remote site routers, including those with members.]

Pruning problems with Partial Mesh NBMA Networks


As a result of not hearing any (S, G) Joins to override the Prune, the Central site router dutifully prunes interface Serial0 after the 3-second prune delay. This results in (S, G) traffic being shut off to the other sites!
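The failure can be reduced to a question of who hears the Prune. The Python sketch below models that with a simple "hears" relation for a true broadcast LAN and for a hub-and-spoke partial mesh. Router names A through E (A being the central site router), the member set and the function names are hypothetical, and the model assumes the upstream router always receives any overriding Join that is sent within the prune delay.

```python
def broadcast_lan(routers):
    """Every router hears every other router's PIM messages."""
    return {r: set(routers) - {r} for r in routers}


def hub_and_spoke(routers, hub):
    """A spoke's messages reach only the hub; the hub's reach every spoke."""
    return {r: (set(routers) - {hub}) if r == hub else {hub} for r in routers}


def forwarding_survives(pruner, members, hears):
    """True if some member hears the Prune and can send an overriding Join."""
    return any(m in hears[pruner] for m in members if m != pruner)


routers = ["A", "B", "C", "D", "E"]      # A is the central site router
members = {"B", "C", "D"}                # remote sites with group members

on_lan = forwarding_survives("E", members, broadcast_lan(routers))
on_partial_mesh = forwarding_survives("E", members, hub_and_spoke(routers, "A"))
print(on_lan, on_partial_mesh)
```

On the LAN a member overrides the Prune and traffic keeps flowing; on the partial mesh nobody but the hub hears it, so the hub prunes Serial0 for everyone.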


NBMA Mode
Solution: PIM-SM + NBMA mode
ip pim nbma-mode

Requires sparse mode
When router receives a join, it puts the interface and joiner in the outgoing interface list (OIL)
When router receives a prune, it removes the interface/joiner from the OIL

NBMA Mode Solution


In order to deal with these problems introduced by NBMA networks, it is necessary to provide the router with information about the underlying Layer 2 topology of the NBMA network. Cisco IOS accomplishes this with the following interface command:

ip pim nbma-mode

In order for this command to function correctly, sparse mode must be in use. The reason that this is necessary should be clear from the following description of how nbma-mode operates. When the ip pim nbma-mode command is configured on an interface, the normal PIM control message processing is modified as follows:

When a Join message is received on the interface, the router puts both the interface and the joiner (usually in the form of the joiner's IP address) in the Outgoing Interface List (OIL).
When a Prune message is received on the interface, the router removes the associated interface/joiner from the OIL.

This method effectively maintains a picture of the active underlying Layer 2 topology in the OIL, which allows the router to make the appropriate forwarding decisions at Layer 3.
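The modified Join/Prune processing amounts to keying the OIL on (interface, joiner) pairs instead of on the interface alone. A minimal Python sketch of that bookkeeping follows; the class and method names are hypothetical and timers are omitted.

```python
class NbmaOil:
    """Outgoing interface list that tracks each joiner separately."""

    def __init__(self):
        self.entries = set()                 # {(interface, joiner_address)}

    def join(self, interface, joiner):
        self.entries.add((interface, joiner))

    def prune(self, interface, joiner):
        # Only this joiner's entry is removed; other joiners are unaffected.
        self.entries.discard((interface, joiner))

    def copies(self):
        """One packet copy per remaining (interface, joiner) entry."""
        return sorted(self.entries)


oil = NbmaOil()
for joiner in ("192.1.1.2", "192.1.1.3", "192.1.1.4"):
    oil.join("Serial0", joiner)
oil.prune("Serial0", "192.1.1.3")
print(oil.copies())
```

A Prune from one remote router now removes only that router's entry, so the missing prune override on a partial mesh no longer cuts off the other sites.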


NBMA Mode
Avoiding Pseudo Broadcast by Using ip pim nbma-mode
interface S0
 ip address 192.1.1.1 255.255.255.0
 ip pim sparse-dense-mode
 ip pim nbma-mode

[Figure: Central site router (S0 .1) connected over 192.1.1.0/24 to remote site routers .2 through .5.]

NBMA Mode Example


Returning to our original example, we now configure ip pim nbma-mode on Serial0.


NBMA Mode
Avoiding Pseudo Broadcast by Using ip pim nbma-mode
[Figure: Remote site routers with members send (*, G) Joins across 192.1.1.0/24 to the central site router (S0 .1). The remote site routers are .2 through .5.]

NBMA Mode Example


Assume that some subset of the remote site routers have members for group G and therefore send (*, G) Joins toward the RP. (We are assuming the RP is at the Central site.)


NBMA Mode
Avoiding Pseudo Broadcast by Using ip pim nbma-mode
(*, 224.1.1.1), 00:03:23/00:00:00, RP 10.1.1.1, flags: S
  Incoming interface: Ethernet0, RPF nbr 10.1.1.1
  Outgoing interface list:
    Serial0, 192.1.1.2, Forward/Sparse, 00:00:12/00:02:48
    Serial0, 192.1.1.3, Forward/Sparse, 00:03:23/00:01:36
    Serial0, 192.1.1.4, Forward/Sparse, 00:00:48/00:02:12

[Figure: Central site router (S0 .1) connected over 192.1.1.0/24 to remote site routers .2 through .5, some of which have members.]

NBMA Mode Example


When the Central site router receives these (*, G) Joins from three of the remote site routers, it adds a separate interface/joiner entry in the OIL for each of the received (*, G) Joins. This results in the OIL as shown in the example above.


NBMA Mode
Avoiding Pseudo Broadcast by Using ip pim nbma-mode
Router can now replicate packets in Layer 3 code
Packets only go where needed
Fast switched!

[Figure: Source on E0 of the central site router. Packets leave S0 (.1) on 192.1.1.0/24 only toward the remote site routers that have members.]

NBMA Mode Example


Because the router now has detailed information about the underlying Layer 2 topology (in the form of separate interface/joiner entries in the OIL), it can now replicate the multicast packets in the Layer 3 PIM code. This has the following advantages:

The multicast packets are only sent to those remote site routers that have joined the group.
Because the multicast fast-switching cache headers are a part of the OIL data structure, the router has the necessary MAC header information to perform fast-switching of the multicast traffic. This improves throughput considerably over the pseudo broadcast method, which is process switched!

Auto-RP over NBMA Networks


[Figure (Layer 2 Reality): Hub router A (S0 .1) has a virtual circuit to each of routers B (.2), C (.3), D (.4) and E (.5) on 192.1.1.0/24. Router B is the C-RP or MA. Its Auto-RP messages (Announce messages, Discovery messages) reach A but are not heard by the other routers.]

Auto-RP over NBMA Networks


Because Auto-RP relies on Dense mode flooding of the two Auto-RP groups to function properly, care must be taken when using Auto-RP in partial-mesh NBMA networks. The example above shows a Candidate-RP or Mapping Agent that has been configured at a remote site that is connected to a Hub-and-Spoke, partial mesh NBMA cloud. Because the network is not fully meshed, Auto-RP Announcement and Discovery messages do not reach remote site routers C, D and E.


Auto-RP over NBMA networks


Solving the problem by moving the MAs

[Figure (Layer 2 Reality): Two Mapping Agents (MA) sit in the Central Site Network behind router A (S0 .1). Router A has a virtual circuit to each of routers B (.2), C (.3), D (.4) and E (.5) on 192.1.1.0/24. Announce messages travel from the C-RP at a remote site to the MAs; Discovery messages travel from the MAs to all remote sites.]

Auto-RP over NBMA Networks


One solution is to move the Mapping Agent(s) to the Central Site Network as shown in the example above. Announcement messages from the Candidate-RP travel from the remote site to the Mapping Agent(s) in the Central Site. The Mapping Agent(s) select the RP and send out Discovery messages which are flooded via Dense mode to all of the remote sites.


Auto-RP over NBMA networks


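Solving the problem with additional VCs

[Figure (Layer 2 Reality): Mapping Agents (MA) in the Central Site Network behind router A (S0 .1), plus an MA at a remote site. Routers B (.2), C (.3), D (.4) and E (.5) share 192.1.1.0/24, and additional virtual circuits connect the remote site MA to the other remote site routers.]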

Auto-RP over NBMA Networks


Another solution is to configure additional Virtual Circuits as shown in the example above. Discovery messages can now be flooded via Dense mode to all of the other remote sites. The obvious disadvantage is that more Virtual Circuits must be used, which often increases the overall operational cost of the network since most Network Service Providers charge for each Virtual Circuit.

Multicast over ATM

P2P PVCs
ATM NBMA Cloud with Pseudo Broadcast
ATM NBMA Cloud with PIM NBMA-Mode
ATM NBMA Cloud with a P2MP Broadcast SVC
ATM NBMA Cloud with a P2MP SVC per Group

Multicast over ATM


There are several methods that can be employed to run Multicast over a core ATM network. Each of the above methods is addressed in the following section.


P2P PVCs
Router Backbone
Each PVC is a p2p subinterface
Each PVC is a separate subnet
ATM fabric modeled as a collection of p2p links to IPmc
Use any PIM mode

[Figure: routers A, B and C interconnected by point-to-point PVCs across the ATM fabric.]

Comments
  Could use static PVCs or soft PVCs on the ATM switches
  Could use SVCs with mapping and broadcast (PVCs more stable)

Advantages
  Works well when p2mp VCs are not available
  Effective pruning and excellent configuration control since each VC looks like a separate interface
  Fast switching supported

Disadvantages
  Router does replication - watch out for high bandwidth flows and/or high fanout
  More configuration - more links, more subinterfaces, etc.
  Multiple copies of packet on the ATM fabric (unlike p2mp VCs)
  Scalability - configuration gets larger as # of routers in mesh gets large


ATM NBMA Cloud w/Pseudo Bcast


Each PVC is p2p on a multipoint (sub)interface
Router Backbone - each router fully meshed to the others
ATM fabric modeled as a cloud/subnet
Use any PIM mode
Pseudo Bcast has poor performance
Process-Switched!!

[Figure: routers A and B attached to a single ATM NBMA cloud.]

Comments
  Identical to Frame Relay cloud but over ATM
  Could use static PVCs
  Could use SVCs with mapping and broadcast (PVCs more stable)

Advantages
  Easy to configure.
  Works with any PIM mode.

Disadvantages
  Router does replication - watch out for high bandwidth flows and/or high fanout
  Fast switching not supported
  Multiple copies of packet on the ATM fabric (unlike p2mp VCs)
  Scalability - configuration gets larger as # of routers in mesh gets large


ATM NBMA Cloud w/PIM NBMA-mode


Each PVC is p2p on a multipoint (sub)interface
Router Backbone - each router fully meshed to the others
ATM fabric modeled as a cloud/subnet
Use PIM sparse mode only with nbma-mode
Fast-Switched!!

[Figure: routers A, B and C attached to a single ATM NBMA cloud.]

Comments
  Identical solution to the example in the NBMA mode section, but over ATM
  Could use static PVCs or soft PVCs on the ATM switches
  Could use SVCs with mapping and broadcast (PVCs more stable)

Advantages
  Traffic only goes to members who have joined the group via PIM SM even though it's a multipoint interface (i.e., we don't simply replicate to every neighbor that has a broadcast map)
  Works well when p2mp VCs are not available
  Effective pruning since we can control replication to each neighbor/VC
  Fast switching supported

Disadvantages
  Router does replication - watch out for high bandwidth flows and/or high fanout
  More configuration - more links, more subinterfaces, must configure map list commands, etc.
  Multiple copies of packet on the ATM fabric (unlike p2mp VCs)
  Scalability - configuration gets larger as # of routers in mesh gets large


ATM P2MP Broadcast SVCs


What if the WAN media could do native broadcast/multicast?
ATM can perform the broadcast/multicast packet replication task via p2mp SVCs

Answer: ATM multipoint-signaling


One p2mp SVC handles any and all outgoing broadcast & multicast traffic
Sends 1 copy to N neighbors out of K interested parties. (Not optimal)

ATM Point-to-Multipoint (p2mp) Broadcast SVCs


Many ATM networks have the ability to build p2mp Switched Virtual Circuits. These p2mp SVCs can be used to connect a router to all of the other routers in the ATM LIS. By configuring p2mp SVCs (one originating from each router in the ATM LIS), the ATM network can be made responsible for performing the broadcast function.

ATM Multipoint Signaling is a feature in Cisco IOS that permits a router to be configured so that it will create a p2mp SVC to the other designated routers in the ATM LIS. All outgoing broadcast and multicast traffic is sent on the p2mp SVC. The router sends only a single copy of each packet to the ATM network, which delivers it to all N neighbor routers even though only K of them are interested in the group. (If K is small compared to N, the efficiency of this solution decreases substantially.)


ATM NBMA Cloud w/P2MP Bcast SVCs


Each PVC is p2p on a multipoint (sub)interface
Router Backbone
p2mp SVCs do all broadcast/multicast replication instead of the router
Use any PIM mode
Suboptimal mcast solution
Fast-Switched!!

[Figure: ATM NBMA cloud with a p2mp broadcast SVC rooted at Router A. Note: Only VCs for Router A shown.]

Comments
  Leverage ATM fabric to do the broadcast/multicast replication via a single p2mp VC
  Could use static P2P PVCs or soft PVCs on the ATM switches

Advantages
  ATM fabric does the replication instead of the router
  Off-loads router CPU
  Only one packet sent to the ATM fabric
  Fast switching supported

Disadvantages
  All ATM routers get all the multicast groups even if they don't care about some of them
  ATM fabric must support p2mp VCs and have good replication performance
  Leafs of the p2mp VC are configured with map-list commands with the broadcast keyword
  Only useful for router-to-router backbones


ATM Multipoint Signaling

Command:
atm multipoint-signaling

Requires broadcast keyword on all ATM map-list statements


atm map-list mumble
  ip x.x.x.x atm-nsap xxxx.xxxx broadcast
  ip y.y.y.y atm-nsap yyyy.yyyy broadcast

ATM Multipoint Signaling


May be enabled via the following interface command
atm multipoint-signaling

An ATM map-list must be configured to specify the IP to ATM NSAP address mapping of all the routers in the ATM LIS. The broadcast keyword must be configured on each entry of the ATM map list. This triggers the router to signal the ATM UNI layer to build a p2mp SVC to these routers in the LIS.


ATM P2MP SVC per Group

What if each Group had its own ATM p2mp SVC?

NBMA-mode solved sending K copies to K interested parties out of N neighbors
A p2mp SVC/Group can solve sending 1 copy to K interested parties out of N neighbors

Answer: use PIM multipoint-signaling

ATM Point-to-Multipoint SVC per Group


Using a single p2mp SVC for all multicast traffic is sub-optimal, as precious bandwidth is consumed sending unwanted copies of the multicast packets to routers that do not have downstream members for the group. If each group had a dedicated p2mp SVC that connected to only those downstream routers that had joined the group (i.e., that have downstream members), efficiency would be maximized.

PIM Multipoint Signaling is a feature in Cisco IOS that permits a router to be configured so that it will create per-group p2mp SVCs to the other routers that have joined the group. A single copy of the packet is sent to exactly K interested routers out of a total of N neighbor routers.
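The N-versus-K comparison that runs through this section can be tabulated with a short sketch. The function name and method labels are hypothetical. The counts for nbma-mode, the broadcast p2mp SVC and the per-group p2mp SVC follow the slides; the counts for pseudo-broadcast (a copy to every neighbor with a broadcast map) and for p2p PVCs (pruning per subinterface) are a reading of the advantage and disadvantage lists, not a statement from the course.

```python
def replication_cost(method, n_neighbors, k_interested):
    """Return (copies the router sends, copies the ATM fabric delivers)."""
    if not 1 <= k_interested <= n_neighbors:
        raise ValueError("need 1 <= k_interested <= n_neighbors")
    if method == "pseudo-broadcast":
        # Assumed: the router replicates to every neighbor on the interface.
        return n_neighbors, n_neighbors
    if method in ("p2p-pvc", "nbma-mode"):
        # The router replicates, but only toward neighbors that joined.
        return k_interested, k_interested
    if method == "p2mp-broadcast-svc":
        # One copy into the fabric, delivered to all N neighbors.
        return 1, n_neighbors
    if method == "p2mp-svc-per-group":
        # One copy into the fabric, delivered to the K joined routers only.
        return 1, k_interested
    raise ValueError(f"unknown method: {method}")


for name in ("pseudo-broadcast", "p2p-pvc", "nbma-mode",
             "p2mp-broadcast-svc", "p2mp-svc-per-group"):
    sent, delivered = replication_cost(name, n_neighbors=10, k_interested=3)
    print(f"{name:20s} router sends {sent:2d}, fabric delivers {delivered:2d}")
```

With N = 10 and K = 3, only the per-group p2mp SVC keeps both numbers at their minimum (1 sent, 3 delivered).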


ATM NBMA Cloud w/P2MP SVC / Group


One p2mp SVC/group performs multicast replication instead of the router
Bcast p2mp SVC used when # Groups > max p2mp VC count
Use PIM Sparse mode
p2mp SVCs map group membership
Fast-Switched!!

[Figure: ATM NBMA cloud with per-group p2mp SVCs rooted at Router A. Note: Only p2mp SVCs for Router A shown for clarity.]

Comments
  Leverage ATM fabric to do the multicast replication via a p2mp VC per Group
  Could use static P2P PVCs or soft PVCs on the ATM switches
  Router cannot support unlimited number of p2mp VCs. Therefore Group to p2mp VC mapping is limited to a configured number of Groups.
  A rate threshold can be set to cause low traffic Groups to drop back to the shared broadcast p2mp VC.

Advantages
  ATM fabric does the replication instead of the router
  Off-loads router CPU
  Only one packet sent to the ATM fabric
  Fast switching supported

Disadvantages
  ATM fabric must support p2mp VCs and have good replication performance
  Leafs of the p2mp VC are configured with map-list commands with the broadcast keyword
  Only useful for router-to-router backbones


ATM P2MP SVC per Group

Algorithm: similar to NBMA mode


Rather than putting interface/joiner in OIL, put joiner on multipoint SVC
Received joins cause UNI signaling ADD-PARTYs
Received prunes cause UNI signaling DROP-PARTYs

Use a VC count threshold to keep down the number of SVCs opened


Use shared multipoint SVC and fanout as tie breaker


ATM P2MP SVC per Group


This feature is implemented similarly to the ip pim nbma-mode feature, with one difference: each individual downstream router is not added to the outgoing interface list as a separate interface/joiner ID, as is done in ip pim nbma-mode. Instead, the ATM interface and VC ID of the p2mp SVC are put in the outgoing interface list.

Received PIM Joins trigger the router to send an ADD-PARTY signal (with the NSAP address of the Joiner) to the ATM UNI layer. Received PIM Prunes trigger the router to send a DROP-PARTY signal (with the NSAP address of the Joiner) to the ATM UNI layer.

ATM SVCs are limited resources in both the routers and switches in the ATM network. Therefore an upper limit must be placed on the total number of p2mp group SVCs that can be created. To accomplish this, a VC Count threshold is used to limit the number of SVCs opened.
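The Join/Prune handling described above can be sketched as a small state holder. This is an illustration only: the class name `GroupP2mpSvc`, its methods and the string NSAP labels are hypothetical, and the initial call setup of the SVC is not modeled.

```python
class GroupP2mpSvc:
    """One per-group p2mp SVC rooted at this router (illustrative model)."""

    def __init__(self, group, interface, vcd):
        self.group = group
        self.interface = interface
        self.vcd = vcd
        self.leaves = set()   # NSAP addresses currently on the p2mp SVC
        self.signals = []     # UNI signals this router would send, in order

    def oil_entry(self):
        # The outgoing interface list holds the interface and VCD,
        # not one entry per downstream joiner.
        return (self.interface, self.vcd)

    def on_pim_join(self, nsap):
        if nsap not in self.leaves:
            self.leaves.add(nsap)
            self.signals.append(("ADD-PARTY", nsap))

    def on_pim_prune(self, nsap):
        if nsap in self.leaves:
            self.leaves.remove(nsap)
            self.signals.append(("DROP-PARTY", nsap))


svc = GroupP2mpSvc("224.1.1.1", "ATM0/0", 3)
svc.on_pim_join("nsap-B")
svc.on_pim_join("nsap-C")
svc.on_pim_join("nsap-B")    # refresh of an existing leaf: no new signal
svc.on_pim_prune("nsap-B")
print(svc.oil_entry())       # ('ATM0/0', 3)
print(svc.signals)
```

The outgoing interface list entry stays the same while leaves come and go, which is the main contrast with nbma-mode, where each joiner is tracked as its own entry.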


ATM P2MP VC per Group

Commands:
ip pim multipoint-signalling
ip pim vc-count <number>
ip pim minimum-vc-rate <pps>

Good for single LIS which is fully meshed


Need the shared broadcast p2mp SVC
otherwise uses Pseudo-Broadcast (ugh!)


[no] ip pim vc-count <number>


Configures the maximum number of p2mp SVCs PIM opens. The default value is 200. When the router hits this maximum limit it will delete inactive p2mp SVCs so it may open other p2mp SVCs for new groups that might have activity.

[no] ip pim minimum-vc-rate <pps>


Configures the minimum traffic rate to keep p2mp SVCs active. When the maximum number of p2mp SVCs are opened and a new p2mp SVC needs to be opened, the router will scan existing p2mp SVCs. SVCs that have a current 1 second rate less than or equal to <pps> are eligible for deletion. Ties are broken by group fanout: higher fanout groups lose and are deleted. (The idea is that high fanout groups can be moved to the broadcast p2mp SVC with a minimum loss of efficiency, since these p2mp SVCs come closest to mapping to all routers in the LIS.)

If a p2mp SVC is deleted, packets for its group no longer have their own multipoint SVC. However, the packets will flow over the shared broadcast/multicast p2mp SVC, which delivers packets to all PIM neighbors. If all p2mp SVCs have a 1 minute rate more than <pps>, the new group will use the shared broadcast/multicast p2mp SVC. The default value of minimum-vc-rate is 0 packets per second.
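The interaction of the two thresholds can be sketched as follows. This is a simplified model, not the IOS implementation: the names `SvcEntry` and `admit_group` are hypothetical, a single rate value stands in for both the 1 second and 1 minute measurements, and picking the lowest-rate eligible SVC before applying the fanout tie-break is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SvcEntry:
    group: str
    rate_pps: int   # measured traffic rate for the group
    fanout: int     # number of leaves on the group's p2mp SVC


def admit_group(svcs, vc_count=200, minimum_vc_rate=0):
    """Decide what happens when a new group needs a p2mp SVC.

    Returns ("open", None) when there is room, ("replace", victim) when an
    existing SVC is deleted to make room, or ("shared", None) when the new
    group must use the shared broadcast/multicast p2mp SVC.
    """
    if len(svcs) < vc_count:
        return "open", None
    eligible = [s for s in svcs if s.rate_pps <= minimum_vc_rate]
    if not eligible:
        return "shared", None
    # Lowest rate first; among equal rates the higher fanout loses.
    victim = min(eligible, key=lambda s: (s.rate_pps, -s.fanout))
    return "replace", victim


# Leaf counts and rates taken from the "show ip pim vc" example (max is 5).
table = [
    SvcEntry("224.0.1.40", 0, 2),
    SvcEntry("224.2.2.2", 0, 1),
    SvcEntry("224.1.1.1", 0, 1),
    SvcEntry("224.4.4.4", 0, 2),
    SvcEntry("224.5.5.5", 0, 1),
]
action, victim = admit_group(table, vc_count=5)
print(action, victim.group, victim.fanout)
```

Here every SVC is idle, so the tie-break selects one of the two groups with fanout 2; its traffic would then fall back to the shared broadcast p2mp SVC.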


ATM P2MP VC per Group


Debugging P2MP VCs
rtr-a> show ip pim vc
IP Multicast ATM VC Status
ATM0/0 VC count is 5, max is 5
Group            VCD     Interface     Leaf Count     Rate
224.0.1.40       21      ATM0/0        2              0 pps
224.2.2.2        26      ATM0/0        1              0 pps
224.1.1.1        28      ATM0/0        1              0 pps
224.4.4.4        32      ATM0/0        2              0 pps
224.5.5.5        35      ATM0/0        1              0 pps

Debugging P2MP SVCs


show ip pim vc
Displays ATM VC status information for multipoint VCs opened by PIM. When <group-or-name> is specified, only the single group is displayed. When <interface> is specified, only the single ATM interface is displayed. [11.3]


ATM P2MP VC per Group


Debugging P2MP VCs
rtr-a> show atm vc
                                AAL /          Peak    Avg.    Burst
Interface  VCD  VPI  VCI  Type    Encapsulation  Kbps    Kbps    Cells  Status
ATM0/0     1    0    5    PVC     AAL5-SAAL      155000  155000  96     ACT
ATM0/0     2    0    16   PVC     AAL5-ILMI      155000  155000  96     ACT
ATM0/0     3    0    124  MSVC-3  AAL5-SNAP      155000  155000  96     ACT
ATM0/0     4    0    125  MSVC    AAL5-SNAP      155000  155000  96     ACT
ATM0/0     5    0    126  MSVC    AAL5-SNAP      155000  155000  96     ACT
ATM0/0     6    0    127  MSVC    AAL5-SNAP      155000  155000  96     ACT
ATM0/0     9    0    130  SVC     AAL5-SNAP      155000  155000  96     ACT
ATM0/0     10   0    131  SVC     AAL5-SNAP      155000  155000  96     ACT
ATM0/0     11   0    132  MSVC-3  AAL5-SNAP      155000  155000  96     ACT
ATM0/0     12   0    133  MSVC-1  AAL5-SNAP      155000  155000  96     ACT
ATM0/0     13   0    134  SVC     AAL5-SNAP      155000  155000  96     ACT
ATM0/0     14   0    135  MSVC-2  AAL5-SNAP      155000  155000  96     ACT
ATM0/0     15   0    136  MSVC-2  AAL5-SNAP      155000  155000  96     ACT

Callout (MSVC-3 entry): Root P2MP VC with 3 Leaf Routers
Callout (MSVC entry): P2MP VC for which we are a Leaf

Debugging P2MP SVCs


show atm vc
Displays ATM VC status information for all open VCs. Notice the MSVC-n entries in the Type field. These indicate that the VC is a p2mp SVC for which this router is the root. The number following the dash indicates the number of leaf nodes on the p2mp SVC. Entries of MSVC without the dash indicate p2mp SVCs for which this router is a leaf (i.e., some other router is the root of the p2mp SVC).
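When reading long "show atm vc" listings, the rule for the Type field can be applied mechanically. The helper below is a hypothetical sketch that works on the Type string only; it is not part of any Cisco tool.

```python
import re


def classify_vc_type(vc_type):
    """Interpret the Type column of "show atm vc" as described in the notes.

    Returns (role, leaf_count). leaf_count is None unless this router is
    the root of the p2mp SVC.
    """
    match = re.fullmatch(r"MSVC-(\d+)", vc_type)
    if match:
        return "p2mp-root", int(match.group(1))
    if vc_type == "MSVC":
        return "p2mp-leaf", None
    return "not-p2mp", None


# A few (VCD, Type) pairs taken from the example output.
for vcd, vc_type in [(1, "PVC"), (3, "MSVC-3"), (4, "MSVC"), (9, "SVC"), (12, "MSVC-1")]:
    print(vcd, vc_type, classify_vc_type(vc_type))
```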


ATM P2MP VC per Group


Debugging P2MP VCs
show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:03:57/00:02:54, RP 130.4.101.1, flags: SJ
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    ATM0/0, VCD 3, Forward/Sparse, 00:03:57/00:02:53

Callout (outgoing interface entry): ATM P2MP VC information for Group

Debugging P2MP SVCs


show ip mroute The information displayed in the outgoing interface list when p2mp SVCs are in use is modified to reflect not only the interface but the Virtual Circuit Descriptor (VCD) number of the p2mp SVC as well.


ATM P2MP VC per Group


Debugging P2MP VCs
rtr-a> show atm vc 3
ATM0/0: VCD: 3, VPI: 0, VCI: 124, etype:0x0, AAL5 - LLC/SNAP, Flags: 0x650
PeakRate: 155000, Average Rate: 155000, Burst Cells: 96, VCmode: 0xE000
OAM DISABLED, InARP DISABLED
InPkts: 0, OutPkts: 12, InBytes: 0, OutBytes: 496
InPRoc: 0, OutPRoc: 0, Broadcasts: 12
InFast: 0, OutFast: 0, InAS: 0, OutAS: 0
OAM F5 cells sent: 0, OAM cells received: 0
Status: ACTIVE, TTL: 2, VC owner: IP Multicast (224.1.1.1)
interface = ATM0/0, call locally initiated, call reference = 2
vcnum = 11, vpi = 0, vci = 132, state = Active
aal5snap vc, multipoint call
Retry count: Current = 0, Max = 10
timer currently inactive, timer value = 00:00:00
Leaf Atm Nsap address: 47.0091810000000002BA08E101.444444444444.02
Leaf Atm Nsap address: 47.0091810000000002BA08E101.333333333333.02
Leaf Atm Nsap address: 47.0091810000000002BA08E101.222222222222.02

Callout (VC owner line): P2MP VC Opened by Group 224.1.1.1
Callout (Leaf Atm Nsap address lines): NSAP Addresses of Leaf Nodes

Debugging P2MP SVCs


show atm vc <vcd> The output of this command reflects the IP Multicast group responsible for the p2mp SVC being created. In addition, a list of all of the other nodes on the p2mp SVC (listed by NSAP address) is also displayed.


Module Agenda

Bandwidth Control of Multicast
Multicast Traffic Engineering
Network Redundancy
Multicast over NBMA Networks
Reliable Multicast

Pragmatic General Multicast

IETF draft
draft-speakman-pgm-spec-??.txt

Routers assist the retransmit process


NAK suppression mechanism
Retransmission constraint mechanism
Maintain NAK/retransmission state only

Important point:
Routers don't do the retransmitting

Pragmatic General Multicast


PGM is a reliable multicast transport protocol for applications that require ordered, duplicate-free multicast data delivery from multiple sources to multiple receivers. PGM guarantees that a receiver in a multicast group either receives all data packets from transmissions and retransmissions, or can detect unrecoverable data packet loss. PGM is intended as a solution for multicast applications with basic reliability requirements. It is network layer-independent; the Cisco implementation of PGM Router Assist supports PGM over IP.

The PGM Router Assist feature allows Cisco routers to support optimal operation of Pragmatic General Multicast (PGM). The PGM Reliable Transport Protocol itself is implemented on the source and receiver hosts.

PGM uses a NAK suppression mechanism so that typically only one host sends a NAK for a particular lost packet. Routers acknowledge the NAK by multicasting a NAK Confirmation (NCF) message and by instantiating retransmission state. This state is later used to forward the retransmitted packet to only those portions of the network where the packet was reported as lost.

Important Point: Routers do NOT do the retransmitting. Retransmission is normally done by the source.

The benefits of the PGM Router Assist are:
  It saves bandwidth: The PGM Router Assist feature saves bandwidth by substantially reducing the number of negative acknowledgments (NAKs) sent to the source and by constraining the retransmissions to only those receivers that experience data loss.
  It improves PGM efficiency: The PGM Router Assist feature is not absolutely required for hosts that implement PGM, but PGM operates optimally in conjunction with routers that have this feature enabled.

Pragmatic General Multicast

Source multicasts packets (ODATA)


  Identified by Transport Session Id (TSI)
  Sequenced by Sequence Number (SQ)

Receivers detect drops via TSI/SQ
  Wait random delay before sending NAK
  NAKs are unicast to upstream PGM router

Routers send NAK Confirmations (NCF)
  NCFs are multicast back to receivers
  Other receivers suppress NAKs upon hearing NCF

Pragmatic General Multicast


PGM supports any number of sources within a multicast group, each fully identified by a globally unique Transport Session Identifier (TSI). Sequence numbers (SQNs) identify packets. Thus the combination of TSI and SQN uniquely identifies a packet that must be retransmitted.

In the normal course of data transfer, a source multicasts sequenced data packets (ODATA), and receivers unicast selective negative acknowledgements (NAKs) for data packets detected to be missing from the expected sequence. Network elements (Cisco routers) forward NAKs PGM-hop-by-PGM-hop to the source, and confirm each hop by multicasting a NAK confirmation (NCF) in response on the interface on which the NAK was received. Retransmissions (RDATA) may be provided either by the source itself or by a Designated Local Repairer (DLR) in response to a NAK, or by another receiver in response to an NCF.

NAKs provide the sole mechanism for reliability. A receiver keeps repeating its NAK until it receives an NCF. When a router receives a NAK, it sends a NAK confirmation (NCF) on the interface on which the NAK was received. The NCF is multicast so that other receivers can see it and suppress their own NAKs. NCFs are not propagated by PGM-enabled routers.


Pragmatic General Multicast


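[Figure: PGM distribution tree with source S00, network elements N20, N21, N10, N11, N00 and N01, and receivers R00-R05, R11 and R12. A packet is lost. One receiver unicasts a NAK to its upstream PGM router, which multicasts an NCF and creates retransmission state (TSI/SQN, OIF). A second receiver hears the NCF and suppresses its NAK. Legend: NCF (Multicast), NAK (Unicast).]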

PGM Example
When a retransmission is needed to make up for a lost packet, a sequence of events occurs. This sequence is collectively depicted by this slide and the following three slides. Upon detection of a missing data packet (error), a receiver repeatedly unicasts a NAK to the last-hop PGM router on the distribution tree from the source. A receiver repeats this NAK until it receives a NAK confirmation (NCF) multicast to the group from that PGM router. That router responds with an NCF to the first occurrence of the NAK and any further retransmissions of that same NAK from any receiver.


Pragmatic General Multicast


[Figure: the same PGM tree (S00, N20, N21, N10, N11, N00, N01, receivers R00-R05, R11, R12). The NAK is forwarded one PGM hop upstream; the upstream router answers with an NCF and creates its own retransmission state (TSI/SQN, OIF). Legend: NCF (Multicast), NAK (Unicast).]

PGM Example
In turn, the router repeatedly forwards the NAK to the upstream PGM router on the reverse of the distribution path from the source of the original data packet until it also receives an NCF from that router.


Pragmatic General Multicast


[Figure: the same PGM tree. The NAK continues hop by hop toward source S00, with an NCF returned and retransmission state (TSI/SQN, OIF) created at each hop. Legend: NCF (Multicast), NAK (Unicast).]

PGM Example
In turn, the router repeatedly forwards the NAK to the upstream PGM router on the reverse of the distribution path from the source of the original data packet until it also receives an NCF from that router. This occurs repeatedly as needed.


Pragmatic General Multicast


[Figure: the same PGM tree. Retransmission state (TSI/SQN, OIF) now exists at every PGM router on the path from the NAKing receiver to source S00, and the retransmission follows that state back down the tree. Legend: NCF (Multicast), NAK (Unicast), Retransmission.]

PGM Example
Finally, the source itself receives and confirms the NAK by multicasting an NCF to the group. The source then retransmits the missing data packet to the group address. PGM routers along the path forward the retransmitted packet according to their retransmission state: where state exists for the packet, it is forwarded only out the interfaces on which NAKs were received.
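The four-slide sequence above can be condensed into a toy model of router-assist state. This is a sketch under several assumptions: the class `PgmRouter`, the helper `deliver_rdata` and the three-router topology are hypothetical (they are not the S00/Nxx tree in the figures), interfaces are named after whatever is attached to them, and NAK forwarding is treated as a single reliable call rather than being repeated until an NCF arrives.

```python
class PgmRouter:
    """Keeps only NAK/retransmission state; it never retransmits data itself."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream   # next PGM hop toward the source, or None
        self.state = {}            # (tsi, sqn) -> interfaces that sent a NAK
        self.ncf_log = []          # NCFs multicast by this router

    def receive_nak(self, tsi, sqn, from_iface):
        # Every NAK, first or repeated, is confirmed on its arrival interface.
        self.ncf_log.append((from_iface, tsi, sqn))
        first = (tsi, sqn) not in self.state
        self.state.setdefault((tsi, sqn), set()).add(from_iface)
        # Only the first NAK for a packet is passed on toward the source.
        if first and self.upstream is not None:
            self.upstream.receive_nak(tsi, sqn, from_iface=self.name)

    def forward_rdata(self, tsi, sqn):
        # Retransmissions follow the stored state, which is then discarded.
        return self.state.pop((tsi, sqn), set())


def deliver_rdata(router, routers, tsi, sqn):
    """Follow retransmission state downstream; return the segments reached."""
    delivered = []
    for iface in sorted(router.forward_rdata(tsi, sqn)):
        if iface in routers:
            delivered += deliver_rdata(routers[iface], routers, tsi, sqn)
        else:
            delivered.append(iface)
    return delivered


top = PgmRouter("N1")                      # attached to the source
left = PgmRouter("N2a", upstream=top)
right = PgmRouter("N2b", upstream=top)
routers = {r.name: r for r in (top, left, right)}

left.receive_nak("tsi-1", 7, "lan-left")   # a receiver reports packet 7 lost
left.receive_nak("tsi-1", 7, "lan-left")   # repeated NAK: NCF again, no new upstream NAK
reached = deliver_rdata(top, routers, "tsi-1", 7)
print(reached)                             # ['lan-left']
```

The branch behind the second downstream router never sees the retransmission, which is the bandwidth saving the notes attribute to PGM Router Assist.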


DVMRP
Module 8

Module8.ppt

Copyright © 1998-2001, Cisco Systems, Inc. All rights reserved.

8/14/2001 11:24 AM

Module Objectives

Explain the basic concepts of DVMRP
Identify the various DVMRP Packet types
Explain the detailed operation of DVMRP

Module Agenda

Geekometer

DVMRP Overview
DVMRP Packet Formats
DVMRP Basic Concepts

DVMRP Overview
Distance vector-based
  Similar to RIP
  Infinity = 32 hops
  Subnet masks in route advertisements

Routing information carried in IGMP packets
  IP Protocol 2 (IGMP)
  IGMP type 0x13 (DVMRP)

DVMRP Overview
DVMRP is a distance vector-based protocol that is modeled after RIPv1, with the following fundamental differences:
  Infinity = 32 (hops).
  Subnet masks are sent in the route advertisements, which makes DVMRP a classless protocol.
  DVMRP makes special use of Poison-Reverse advertisements, which are explained in the following slides.

DVMRP routing information is carried inside IGMP (IP protocol 2) packets. Therefore, if you are trying to capture a DVMRP conversation using equipment like a Network General Sniffer, you will need to capture IGMP packets and further decode them. The IGMP type code for DVMRP is 0x13.


DVMRP Overview
Similar to PIM DM
  Broadcast and prune operation
  Uses DVMRP route table for RPF check

Virtual interfaces
  Physical: Ethernet, FDDI, Token Ring, etc.
  Tunnels: IP-in-IP tunnels (IP Protocol 4)

Current version
  mrouted 3.9
  Available for most UNIX environments

DVMRP Overview (cont.)


In many ways, DVMRP operates like PIM Dense mode. The same basic Broadcast and Prune model is used, with the exception that DVMRP builds source distribution trees using DVMRP routing information. Unlike PIM, which uses the Unicast routing table to perform the RPF check, DVMRP uses its own DVMRP Multicast routing table, which is built from periodic DVMRP route advertisements.

DVMRP uses the concept of Virtual interfaces, which may be any of the following:
  Physical: Ethernet, FDDI, Token Ring, etc. are all examples of Physical interfaces.
  Tunnels: DVMRP makes extensive use of IP-in-IP Tunnels to traverse Unicast-only clouds. (IP-in-IP Tunneling is assigned IP Protocol 4.)

Current version of DVMRP: In most cases, DVMRP is implemented to run on Unix workstations as mrouted. At the time that this presentation was written, mrouted version 3.8 is the latest version and is available for most Unix environments. A beta version of 3.9 is also in the field and appears to be stable.


DVMRP Basic Packets


DVMRP Probes
for DVMRP Neighbor Discovery

DVMRP Reports
for Multicast Route Exchange

DVMRP Prunes
for pruning multicast delivery trees

DVMRP Grafts
for grafting multicast delivery trees

DVMRP Graft Acks


for acknowledging graft msgs

DVMRP Basic Packets


Again, DVMRP rides inside of the IGMP protocol (Type = 0x13). The different DVMRP packets are differentiated via the Code field of the IGMP packet.
  DVMRP Probes: Used to discover other DVMRP Neighbors on a Network.
  DVMRP Reports: Used to exchange DVMRP Source routing information. These packets are used to build the DVMRP Multicast Routing table which, in turn, is used to build Source Trees and also to perform RPF checks on incoming multicast packets.
  DVMRP Prunes: These function similarly to PIM Prunes and prune the multicast delivery tree(s).
  DVMRP Grafts: These function similarly to PIM Grafts and graft a branch back onto the multicast delivery tree.
  DVMRP Graft-Acks: These function similarly to PIM Graft-Acks in that they are used to acknowledge DVMRP Graft messages.
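For anyone decoding a capture by hand, the Type/Code demultiplexing can be sketched in a few lines. The Code values are the ones shown on the packet format slides in this module (0x1, 0x2, 0x7, 0x8, 0x9); the function name and its arguments are hypothetical, and the input is assumed to be the IGMP payload with the IP header already removed.

```python
IP_PROTOCOL_IGMP = 2
IGMP_TYPE_DVMRP = 0x13

DVMRP_CODES = {
    0x1: "Probe",
    0x2: "Report",
    0x7: "Prune",
    0x8: "Graft",
    0x9: "Graft Ack",
}


def classify_dvmrp(ip_protocol, igmp_payload):
    """Return the DVMRP packet name, or None if this is not a DVMRP packet."""
    if ip_protocol != IP_PROTOCOL_IGMP or len(igmp_payload) < 2:
        return None
    igmp_type, code = igmp_payload[0], igmp_payload[1]
    if igmp_type != IGMP_TYPE_DVMRP:
        return None
    return DVMRP_CODES.get(code, f"Unknown code {code:#x}")


print(classify_dvmrp(2, bytes([0x13, 0x01, 0x00, 0x00])))  # Probe
print(classify_dvmrp(2, bytes([0x13, 0x07, 0x00, 0x00])))  # Prune
print(classify_dvmrp(2, bytes([0x11, 0x00, 0x00, 0x00])))  # None (some other IGMP type)
```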


DVMRP Probe Packet


DVMRP Probe packet layout (32 bits per row; bit boundaries at 7, 15, 23, 31):

  Type (0x13)  | Code (0x1)   | Checksum
  Reserved     | Capabilities | Minor      | Major
  Generation ID
  Neighbor IP Address 1
  Neighbor IP Address 2
  ...
  Neighbor IP Address N

Minor: Minor Code Version #
Major: Major Code Version #
Generation ID: Random non-decreasing #
Neighbor IP Addresses: IP Address of other DVMRP Routers seen.
Capabilities:
  Bit 0 : Leaf     Router is a leaf node
  Bit 1 : Prune    Router understands pruning
  Bit 2 : GenID    Router sends Generation IDs
  Bit 3 : Mtrace   Router handles Mtrace requests
  Bit 4 : SNMP     Router handles SNMP requests

DVMRP Probe Packet


Capabilities:
  Bit 0 : Leaf     Router is a leaf node
  Bit 1 : Prune    Router understands pruning
  Bit 2 : GenID    Router sends Generation IDs
  Bit 3 : Mtrace   Router handles Mtrace requests
  Bit 4 : SNMP     Router handles SNMP requests
Minor: Minor Code Version #
Major: Major Code Version #
Generation ID: Random non-decreasing number that permits other DVMRP neighbors to detect when this router has rebooted.
Neighbor IP Addresses: IP Address of other DVMRP Routers seen by this router. When a receiving DVMRP router sees its own IP address in the list, it knows that a 2-way neighbor adjacency has been established.
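The Probe fields above map directly onto a fixed 12-byte header followed by a list of 4-byte neighbor addresses. The parser below is a sketch, not a reference decoder: the function name `parse_probe` and the sample values are hypothetical, Bit 0 is assumed to be the least significant bit of the Capabilities octet, and the checksum is not verified.

```python
import ipaddress
import struct

CAPABILITY_BITS = ["Leaf", "Prune", "GenID", "Mtrace", "SNMP"]


def parse_probe(data):
    """Decode a DVMRP Probe carried as an IGMP payload (IP header removed)."""
    if len(data) < 12 or (len(data) - 12) % 4 != 0:
        raise ValueError("truncated or misaligned Probe")
    (igmp_type, code, checksum, _reserved,
     caps, minor, major, generation_id) = struct.unpack("!BBHBBBBI", data[:12])
    if igmp_type != 0x13 or code != 0x1:
        raise ValueError("not a DVMRP Probe")
    neighbors = [str(ipaddress.IPv4Address(data[i:i + 4]))
                 for i in range(12, len(data), 4)]
    return {
        "checksum": checksum,
        "capabilities": [name for bit, name in enumerate(CAPABILITY_BITS)
                         if caps & (1 << bit)],
        "version": (major, minor),
        "generation_id": generation_id,
        "neighbors": neighbors,
    }


# Made-up sample: Prune + GenID + Mtrace set, one neighbor already seen.
sample = struct.pack("!BBHBBBBI", 0x13, 0x1, 0, 0, 0x0E, 0xFF, 3, 0x12345678)
sample += ipaddress.IPv4Address("192.1.1.2").packed
probe = parse_probe(sample)
print(probe["capabilities"], probe["neighbors"])
```

A router that finds its own address in `neighbors` would conclude, per the notes, that the adjacency is 2-way.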


DVMRP Route Report Packet


Route Advertisements (Packed Format)

DVMRP Route Report packet layout (32 bits per row; bit boundaries at 7, 15, 23, 31):

  Type (0x13) | Code (0x2) | Checksum
  Reserved    | Minor      | Major
  Mask1, SrcNet11 (cont.)..., Metric11, SrcNet12 (cont.)..., Metric12
  Mask2 (cont.), SrcNet21 (cont.)..., Metric21, ...
  Mask3 (cont.), ...

Masks: Octets 2-4. Octet 1 = 0xFF (assumed)
SrcNets: Length in octets varies with length of mask.
Metrics: (Hops)
  1 - 31 = Valid Metric
  32 = Infinity (unreachable)
  33 - 63 = Poison Reverse

Note: Poison Reverse is used to inform the Parent (up-tree) DVMRP Router that we are a Child (down-tree) DVMRP Router and expect to receive traffic from these Source networks via this interface.

DVMRP Report Packet


Masks: Octets 2-4. Octet 1 = 0xFF (assumed)
SrcNets: Length in octets varies with length of mask.
Metrics (Hops):
  1 - 31 = Valid Metric
  32 = Infinity (unreachable)
  33 - 63 = Poison Reverse

Poison Reverse metrics have a special function in DVMRP. When a DVMRP router receives a DVMRP Route Advertisement with a Poison Reverse metric from one of its DVMRP neighbors, it indicates that the neighbor is a Child DVMRP router (i.e., down-tree) which expects to receive multicast traffic from the advertised source network(s) via this (parent) DVMRP router.
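The metric ranges above can be expressed as a small classifier. This is a sketch for illustration; the function names are hypothetical and not part of any DVMRP implementation.

```python
INFINITY = 32  # DVMRP "unreachable" metric, per the ranges listed above


def classify_metric(metric):
    """Map an advertised metric to the meaning given in the notes."""
    if 1 <= metric < INFINITY:
        return "valid"
    if metric == INFINITY:
        return "unreachable"
    if INFINITY < metric < 2 * INFINITY:
        return "poison-reverse"
    raise ValueError("metric outside the 1-63 range: %r" % metric)


def poison_reverse(metric):
    """Poison Reverse a valid metric by adding infinity (32) to it."""
    return metric + INFINITY


# A route learned with metric 4 is advertised back up-tree as 36.
advertised = poison_reverse(4)
```

Subtracting 32 from a Poison Reverse metric recovers the child's own metric to the source network.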


DVMRP Prune Packet


Packet layout (bit positions 0-31 per row):

  Type (0x13) | Code (0x7) | Checksum
  Reserved | Minor | Major
  Source Address
  Group Address
  Prune Lifetime

Source Address: Source IP Address S of the (S,G) to prune.
Group Address: Multicast Group address G of the (S,G) to prune.
Prune Lifetime: Time (in seconds) this prune is active.

DVMRP Prune Packet


Source Address: Source IP Address S of the (S,G) to prune.
Group Address: Multicast Group address G of the (S,G) to prune.
Prune Lifetime: Time (in seconds) this prune is active.
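To make the Prune layout concrete, here is a rough sketch that packs and unpacks the fields with Python's struct module. Several things are assumptions rather than facts from the slide: the field widths (8-bit Type and Code, 16-bit Checksum, 16-bit Reserved, 8-bit Minor and Major, 32-bit addresses and lifetime) are inferred from the bit ruler, the version values are placeholders, and the checksum is left at zero instead of being computed.

```python
import socket
import struct

DVMRP_TYPE = 0x13   # Type value shown on the slide
CODE_PRUNE = 0x7    # Code value shown on the slide


def build_prune(source, group, lifetime, minor=0, major=0):
    """Pack a Prune message. Checksum is left as 0 (not computed here)."""
    # Assumed widths: Type(8) Code(8) Checksum(16) Reserved(16) Minor(8) Major(8)
    header = struct.pack("!BBHHBB", DVMRP_TYPE, CODE_PRUNE, 0, 0, minor, major)
    body = (socket.inet_aton(source) + socket.inet_aton(group)
            + struct.pack("!I", lifetime))
    return header + body


def parse_prune(packet):
    """Return (source, group, lifetime) from a packed Prune message."""
    msg_type, code = struct.unpack("!BB", packet[:2])
    if (msg_type, code) != (DVMRP_TYPE, CODE_PRUNE):
        raise ValueError("not a DVMRP Prune")
    source = socket.inet_ntoa(packet[8:12])
    group = socket.inet_ntoa(packet[12:16])
    (lifetime,) = struct.unpack("!I", packet[16:20])
    return source, group, lifetime


# Prune (198.14.32.10, 224.0.1.254) for 120 seconds.
pkt = build_prune("198.14.32.10", "224.0.1.254", 120)
```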


DVMRP Graft & Graft Ack


Graft Packet layout (bit positions 0-31 per row):

  Type (0x13) | Code (0x8) | Checksum
  Reserved | Minor | Major
  Source Address
  Group Address

Graft Ack Packet layout (bit positions 0-31 per row):

  Type (0x13) | Code (0x9) | Checksum
  Reserved | Minor | Major
  Source Address
  Group Address

Source Address: Source IP Address S of the (S,G) to graft.
Group Address: Multicast Group address G of the (S,G) to graft.

DVMRP Graft & Graft-Ack Packets


Source Address: Source IP Address S of the (S,G) to graft. Group Address: Multicast Group address G of the (S,G) to graft.


DVMRP Debug Packets

DVMRP Ask-Neighbors2
request for list of DVMRP Neighbors
(used by mrinfo)

DVMRP Neighbors2
response to above
(used by mrinfo)


DVMRP Debug Packets


DVMRP defines some special Debug packets that are used by Multicast troubleshooting applications such as mrinfo. These debug packets are as follows:

DVMRP Ask-Neighbors
  Used to request a list of all known Multicast neighbor routers. (This form of the packet is obsolete and has been replaced by the Ask-Neighbors2 packet.)
DVMRP Neighbors
  Used to respond to the above Ask-Neighbors request. (This form of the packet is obsolete and has been replaced by the Neighbors2 packet.)
DVMRP Ask-Neighbors2
  This is the newer format of the Ask-Neighbors packet.
DVMRP Neighbors2
  This is the newer format of the Neighbors packet.


DVMRP Ask-Neighbors2 Packet


Packet layout (bit positions 0-31 per row):

  Type (0x13) | Code (0x5) | Checksum
  Reserved | Minor | Major

The Ask-Neighbors2 packet is a unicast request packet directed at a DVMRP router requesting the destination router to respond with a unicast Neighbors2 message back to the sender.

DVMRP Ask-Neighbors2 Packet


The Ask-Neighbors2 packet is a unicast request packet directed at a DVMRP router requesting the destination router to respond with a unicast Neighbors2 message back to the sender.


DVMRP Neighbors2 Packet


Packet layout (bit positions 0-31 per row):

  Type (0x13) | Code (0x6) | Checksum
  Reserved | Capabilities | Minor | Major
  Local Address1
  Metric 1 | Threshold1 | Flags 1 | Nbr Cnt1
  Nbr1
  ...
  NbrM
  Local AddressN
  Metric N | ThresholdN | Flags N | Nbr CntN
  Nbr1
  ...
  NbrK

Capabilities: 0 - Leaf, 1 - Prune, 2 - GenID, 3 - Mtrace, 4 - SNMP
Local AddressN: IP Address of a local interface.
MetricN: Interface Metric
ThresholdN: Metric-Threshold
FlagsN: 0 - Tunnel, 1 - Src Route, 2 - Reserved, 3 - Down, 4 - Disabled, 5 - Reserved, 6 - Leaf
Nbr CntN: Number of Neighbors on this interface.
NbrX - NbrY: List of Neighbor IP Addresses

DVMRP Neighbors2 Packet


Capabilities: 0 - Leaf, 1 - Prune, 2 - GenID, 3 - Mtrace, 4 - SNMP
Local AddressN: IP Address of a local interface.
MetricN: Interface Metric
ThresholdN: Metric-Threshold
FlagsN: 0 - Tunnel, 1 - Src Route, 2 - Reserved, 3 - Down, 4 - Disabled, 5 - Reserved, 6 - Leaf
Nbr CntN: Number of Neighbors on this interface.
NbrX - NbrY: List of Neighbor IP Addresses


DVMRP Basic Concepts

Neighbor discovery
Route exchange
Source trees
Multicast forwarding
Pruning
Grafting
Tunnels

DVMRP Neighbor Discovery

[Figure: DVMRP Router 1 (171.68.37.1) and DVMRP Router 2 (171.68.37.2), both running mrouted, on a common network.]

1. Router 1 sends Probe. Neighbor List = NULL
2. Router 2 receives Probe from DVMRP Router 1
3. Router 2 sends Probe. Neighbor List = 171.68.37.1
4. Router 1 sends Probe. Neighbor List = 171.68.37.2

Probes are multicast to the All-DVMRP-Routers (224.0.0.4) group address.

DVMRP Neighbor Discovery


1) DVMRP Router 1 multicasts a DVMRP Probe packet (to All-DVMRP-Routers, 224.0.0.4) with an empty Neighbor List.
2) DVMRP Router 2 receives this packet and adds Router 1 to its internal list of DVMRP Neighbors.
3) DVMRP Router 2 now multicasts its own DVMRP Probe and includes Router 1 in the packet's Neighbor List.
4) DVMRP Router 1 receives the packet, adds Router 2 to its internal list of DVMRP Neighbors, and responds with a DVMRP Probe that has Router 2 in the packet's Neighbor List.
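The four steps above can be walked through in a toy simulation. This is only a sketch: the class and method names are hypothetical, Probes are modeled as plain dictionaries, and timers and Generation IDs are ignored.

```python
class DvmrpRouter:
    """Toy model of Probe-based neighbor discovery (no timers, no GenID)."""

    def __init__(self, address):
        self.address = address
        self.heard = []        # neighbors whose Probes we have received
        self.two_way = set()   # neighbors that list us in their Probes

    def build_probe(self):
        return {"sender": self.address, "neighbors": list(self.heard)}

    def receive_probe(self, probe):
        if probe["sender"] not in self.heard:
            self.heard.append(probe["sender"])
        # Seeing our own address in the list means a 2-way adjacency.
        if self.address in probe["neighbors"]:
            self.two_way.add(probe["sender"])


r1 = DvmrpRouter("171.68.37.1")
r2 = DvmrpRouter("171.68.37.2")

p1 = r1.build_probe()    # step 1: empty Neighbor List
r2.receive_probe(p1)     # step 2: Router 2 learns Router 1
p2 = r2.build_probe()    # step 3: Neighbor List = 171.68.37.1
r1.receive_probe(p2)     # step 4: Router 1 sees itself in the list
p3 = r1.build_probe()    # step 4: Neighbor List = 171.68.37.2
r2.receive_probe(p3)
```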


DVMRP Route Exchange

Initial State

DVMRP Router 2 (171.68.37.2), interfaces E0 and S0
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    3
204.1.16.0/24    S0    10

DVMRP Router 1 (171.68.37.1), interfaces E0 and S0
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    6
198.14.32.0/24   S0    3

DVMRP Route Exchange


Initial State - Note the following: Both Router 1 and Router 2 have DVMRP Route Table entries for network 151.10.0.0/16 albeit with different metrics.


DVMRP Route Exchange

1. Router 2 sends Route Report: 151.10.0.0/16, M=3; 204.1.16.0/24, M=10
2. Router 1 receives Route Report and updates its table

DVMRP Router 2 (171.68.37.2)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    3
204.1.16.0/24    S0    10

DVMRP Router 1 (171.68.37.1)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E0    4       Updated (was S0, metric 6)
204.1.16.0/24    E0    11      Added
198.14.32.0/24   S0    3

DVMRP Route Exchange (cont.)


After the two routers are connected via the Ethernet as shown above, the following transactions take place:
1) Router 2 sends a DVMRP Route Report containing the following two routes from its DVMRP Route Table: 151.10.0.0/16, Metric = 3; 204.1.16.0/24, Metric = 10.
2) Router 1 receives the DVMRP Route Report, increments the received metrics by 1, and performs the following: it updates its entry for 151.10.0.0/16 to point to Router 2, since the metric of 4 is better than its old metric of 6, and it adds an entry for 204.1.16.0/24 pointing to Router 2.


DVMRP Route Exchange

3. Router 1 sends Route Report: 151.10.0.0/16, M=36 (Poison Reverse); 198.14.32.0/24, M=3; 204.1.16.0/24, M=43 (Poison Reverse)

Poison Reverse indicates Router 1 is a Child (Down-Tree) of Router 2 for these Sources.

DVMRP Router 2 (171.68.37.2)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    3
204.1.16.0/24    S0    10

DVMRP Router 1 (171.68.37.1)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E0    4
204.1.16.0/24    E0    11
198.14.32.0/24   S0    3

DVMRP Route Exchange (cont.)


3) Router 1 sends its own DVMRP Route Report containing its routes to Router 2. However, since Router 2 has the better metric for networks 151.10.0.0/16 and 204.1.16.0/24, Router 1 Poison Reverses these two routes by adding 32 to the metric (4 becomes 36 and 11 becomes 43).


DVMRP Route Exchange

4. Router 2 receives Route Report and adds entry

DVMRP Router 2 (171.68.37.2)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    3
204.1.16.0/24    S0    10
198.14.32.0/24   E0    4       Added

DVMRP Router 1 (171.68.37.1)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E0    4
204.1.16.0/24    E0    11
198.14.32.0/24   S0    3

DVMRP Route Exchange (cont.)


4) Router 2 receives the DVMRP Route Report from Router 1 and performs the following steps: it adds network 198.14.32.0/24 to its DVMRP Route Table (after incrementing the received metric by 1), and it notes the fact that it received Poison Reverse advertisements for networks 151.10.0.0/16 and 204.1.16.0/24 from Router 1. This indicates that Router 1 expects to receive multicast traffic for sources in these networks from Router 2 (i.e., Router 1 is a Child of Router 2 for these networks).


DVMRP Route Exchange

5. Router 2 sends Route Report: 151.10.0.0/16, M=3; 204.1.16.0/24, M=10; 198.14.32.0/24, M=36 (Poison Reverse)

Poison Reverse indicates Router 2 is a Child (Down-Tree) of Router 1 for these Sources.

DVMRP Router 2 (171.68.37.2)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    S0    3
204.1.16.0/24    S0    10
198.14.32.0/24   E0    4

DVMRP Router 1 (171.68.37.1)
DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E0    4
204.1.16.0/24    E0    11
198.14.32.0/24   S0    3

DVMRP Route Exchange (cont.)


5) Router 2 again sends a DVMRP Route Report containing its routes to Router 1. However, since Router 1 has the better metric for network 198.14.32.0/24, Router 2 Poison Reverses this route by adding 32 to the metric. This informs Router 1 that Router 2 expects to receive multicast traffic for sources in network 198.14.32.0/24 from Router 1 (i.e., Router 2 is a Child of Router 1 for this network).
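The whole five-step exchange can be replayed with a few lines of code. This is a simplified sketch, not how mrouted is implemented: the function names are hypothetical, a route table is a dictionary of network to (interface, metric), and a route is assumed to be Poison Reversed whenever it was learned on the interface the report is sent out of.

```python
INFINITY = 32


def send_report(table, out_intf):
    """Build a Route Report; routes learned via out_intf are Poison Reversed."""
    return {net: (metric + INFINITY if intf == out_intf else metric)
            for net, (intf, metric) in table.items()}


def receive_report(table, report, in_intf):
    """Install routes that are new or better after adding one hop."""
    for net, metric in report.items():
        if metric >= INFINITY:
            continue  # unreachable or Poison Reverse: not a usable route
        new_metric = metric + 1
        if net not in table or new_metric < table[net][1]:
            table[net] = (in_intf, new_metric)


# Initial state from the slides.
router1 = {"151.10.0.0/16": ("S0", 6), "198.14.32.0/24": ("S0", 3)}
router2 = {"151.10.0.0/16": ("S0", 3), "204.1.16.0/24": ("S0", 10)}

report_a = send_report(router2, "E0")      # step 1
receive_report(router1, report_a, "E0")    # step 2
report_b = send_report(router1, "E0")      # step 3
receive_report(router2, report_b, "E0")    # step 4
report_c = send_report(router2, "E0")      # step 5
```

With these inputs, report_b carries metrics 36 and 43 for the two routes Router 1 learned from Router 2, and report_c carries 36 for 198.14.32.0/24, matching the slides.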


DVMRP Source Trees


Truncated Broadcast Trees are built using the best DVMRP metrics back to the source network. The lowest IP address is used in case of a tie. (Note: IP Address of D < C < B < A)

[Figure: Source Network S1 attached to routers A and B, with further mrouted routers (including X, E, and Y) downstream. Links are labeled with advertised route metrics (1, 2, 3) and Poison Reverse metrics (33, 34, 35).]

Legend:
  n : Route for source network of metric n
  m : Poison reverse (metric + infinity) sent to upstream parent router. Router depends on parent to receive traffic for this source.
  Resulting Truncated Broadcast Tree for Source Network

DVMRP Source Trees


DVMRP builds its Source Trees utilising the concept of Truncated Broadcast Trees. The basic definition of a Truncated Broadcast Tree (TBT) is as follows: A Truncated Broadcast Tree (TBT) for source subnet S1 represents a shortest-path spanning tree rooted at subnet S1 to all other routers in the network. In DVMRP, the TBTs for all sub-networks are built by the exchange of periodic DVMRP routing updates between all DVMRP routers in the network. Just like its unicast cousin, RIPv2, DVMRP updates contain network prefixes/masks along with route metrics (in hop counts) that describe the cost of reaching a particular subnet in the network. Unlike RIPv2, a downstream DVMRP router makes use of a special Poison-Reverse advertisement to signal an upstream router that this link is on the TBT for source subnet S1. This Poison-Reverse (PR) is created by adding 32 to the advertised metric and sending it back to the upstream router.

Example DVMRP TBT for network S1:


In the above example, DVMRP updates are being exchanged for source network S1. Routers A and B both advertise a metric of 1 (hop) to reach network S1 to routers C and D. In the case of router D, the advertisement from B is the best (only) route to source network S1, which causes router D to send back a PR advertisement (metric = 33) to B. This tells router B that router D is on the TBT for source network S1. In the case of router C, it received an advertisement from both A and B with the same metric. It breaks the tie using the lowest IP address and therefore sends a PR advertisement to router B. B now knows it has two branches of the TBT, one to router C and one to router D. These DVMRP updates flow throughout the entire network, causing each router to send PR advertisements to its upstream DVMRP neighbor on the TBT for source network S1.
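Router C's choice of upstream neighbor can be sketched as a "best metric, then lowest IP address" selection. The function name and the addresses below are made up for illustration; they are chosen only so that B's address is lower than A's, as the slide note requires.

```python
import ipaddress


def choose_parent(advertisements):
    """Pick the upstream neighbor from (neighbor_ip, metric) pairs.

    Best (lowest) metric wins; the lowest IP address breaks a tie.
    """
    best = min(advertisements,
               key=lambda adv: (adv[1], ipaddress.ip_address(adv[0])))
    return best[0]


# Hypothetical addresses: A = 10.0.0.2, B = 10.0.0.1, both advertise metric 1.
ROUTER_A, ROUTER_B = "10.0.0.2", "10.0.0.1"
parent_of_c = choose_parent([(ROUTER_A, 1), (ROUTER_B, 1)])
poison_reverse_metric = 1 + 32  # what C sends back to its parent
```

Comparing parsed addresses rather than strings matters: as text, "10.0.0.10" sorts before "10.0.0.9", which would pick the wrong neighbor.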


DVMRP Source Trees


[Figure: Resulting Truncated Broadcast Tree for Source Network S1, spanning routers A, B, X, Y, and the other mrouted routers in the network.]

Example DVMRP TBT for network S1 (cont.)


Once the DVMRP network has converged and all PR advertisements have been sent up the TBT toward source network S1, the S1 TBT has been built. The drawing above shows the S1 TBT that resulted from the DVMRP route update exchanges on the previous page. Notice that this is a shortest-path spanning tree that is rooted at source network S1 and spans all routers in the network. If a multicast source were to now go active in network S1, the DVMRP routers in the network would initially flood this source's traffic down the S1 TBT.


DVMRP Source Trees


Each Source Network has its Own Truncated Broadcast Tree

[Figure: S2 Truncated Broadcast Tree, rooted at Source Network S2 and spanning routers A, B, X, Y, and the other mrouted routers in the network.]

Note: IP Address of D < C < B < A

Every source network has its own TBT


In the drawing above, the TBT for network S2 is shown. This TBT would also be created by the exchange of DVMRP route updates and by PR advertisements sent by all routers in the network toward network S2. It is important to remember that these TBTs simply exist in the form of PR advertisements in the DVMRP routing tables of the routers in the network and, as such, there is one TBT for every source network in the DVMRP network.

Advantages of TBTs
The advantage of TBTs is that the initial flooding of multicast traffic throughout the DVMRP network is limited to flowing down the branches of the TBT. This ensures that no duplicate packets are sent as a result of parallel paths in the network.

Disadvantages of TBTs
The disadvantage of using TBTs is that they require separate DVMRP routing information to be exchanged throughout the entire network. (This is unlike other multicast protocols such as PIM, which make use of the existing unicast routing table and do not have to exchange additional multicast routing data.) Additionally, because DVMRP is based on a RIP model, it has all of the problems associated with a Distance-Vector protocol, including count-to-infinity, holddown, and periodic updates. One has to ask oneself, "Would I recommend someone build a unicast network based on RIP today?" The answer is of course not; protocols like OSPF, IS-IS, and EIGRP have long since superseded RIP in robustness and scalability. The same is true of DVMRP.


DVMRP Multicast Forwarding

Incoming Packets Must Pass RPF Check

[Figure: A multicast packet from source 151.10.3.21 arrives at an mrouted router with interfaces S0, S1, E0, and E1.]

DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E1    4
198.14.32.0/24   S0    3
204.1.16.0/24    E0    11

RPF Check Fails! Packet Arrived on Wrong Interface!

DVMRP Multicast Forwarding


Just as in PIM, incoming packets must pass a Reverse Path Forwarding (RPF) check before they are accepted. However, DVMRP has its own DVMRP Route Table that is used for performing the RPF check. In the above example, a multicast packet from a source in the 151.10.0.0/16 network was received via interface S0. However, the DVMRP Route Table shows that the correct interface should be E1, not S0. Therefore, the RPF check for this packet fails and the packet is discarded.


DVMRP Multicast Forwarding

Packets Forwarded Out All Non-Pruned, Downstream Interfaces

[Figure: A multicast packet from source 198.14.32.10 arrives at an mrouted router with interfaces S0, S1, E0, and E1.]

DVMRP Route Table
Network          Intf  Metric
151.10.0.0/16    E1    4
198.14.32.0/24   S0    3
204.1.16.0/24    E0    11

RPF Check Succeeds! Packet Arrived on Correct Interface!

DVMRP Multicast Forwarding (cont.)


In the above example, a multicast packet from a source in the 198.14.32.0/24 network was received via interface S0. In this case the DVMRP Route Table shows that the correct interface is S0. Therefore, the RPF check for this packet succeeds and the packet is forwarded out all unpruned interfaces on the source tree.
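Both RPF examples can be reproduced with a short lookup against the DVMRP Route Table from the slides. This is a sketch under stated assumptions: the function names are hypothetical, and the use of a longest-prefix match is an assumption (the three example prefixes do not overlap, so it does not affect the result here).

```python
import ipaddress

# DVMRP Route Table from the slides: (network, interface, metric).
ROUTE_TABLE = [
    ("151.10.0.0/16", "E1", 4),
    ("198.14.32.0/24", "S0", 3),
    ("204.1.16.0/24", "E0", 11),
]


def rpf_interface(source, table=ROUTE_TABLE):
    """Return the interface expected for packets from source, or None."""
    addr = ipaddress.ip_address(source)
    matches = [(ipaddress.ip_network(net), intf)
               for net, intf, _metric in table
               if addr in ipaddress.ip_network(net)]
    if not matches:
        return None
    # Assumption: prefer the most specific (longest) matching prefix.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, arrival_intf, table=ROUTE_TABLE):
    """True when the packet arrived on the interface leading back to source."""
    return rpf_interface(source, table) == arrival_intf


first = rpf_check("151.10.3.21", "S0")    # expected interface is E1
second = rpf_check("198.14.32.10", "S0")  # expected interface is S0
```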


DVMRP Multicast Forwarding

Forwarding onto Multiaccess Networks

[Figure: Router A is attached to Network X; routers B and C share a multi-access network. Route advertisements for network X are labeled with metrics 1 and 2. Legend: n = route advertisement for network X of metric n.]

Both B & C have routes to network X.
To avoid duplicates, only one router can be Designated Forwarder for network X.
Router with best metric is elected as the Designated Forwarder.
Lowest IP address used as tie-breaker.
Router B wins in this example. (Note: IP Address of B < C)

DVMRP Multicast Forwarding (cont.)


When two or more routers share a common multi-access network, only one can be the Designated Forwarder, which is responsible for forwarding a source network's traffic onto the multi-access network; otherwise duplicate packets will be generated. The Designated Forwarder is selected based on the best route metric back to the source network (with the lowest IP address used as a tie-breaker). In the example above, Routers B and C share a common multi-access network and each has a route to network X. Since both have the same metric to network X, the lowest IP address is used to break the tie (in this case, Router B wins).
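From a single router's point of view, the election reduces to one question: does any other router on the segment beat me? The sketch below is illustrative only; the function name and the addresses are hypothetical, with B given the lower address as in the slide note.

```python
import ipaddress


def is_designated_forwarder(my_ip, my_metric, others):
    """True if this router wins against every (ip, metric) pair in others."""
    mine = (my_metric, ipaddress.ip_address(my_ip))
    return all(mine < (metric, ipaddress.ip_address(ip))
               for ip, metric in others)


# Hypothetical addresses on the shared segment; both have metric 2 to network X.
ROUTER_B, ROUTER_C = "192.168.1.2", "192.168.1.3"
b_forwards = is_designated_forwarder(ROUTER_B, 2, [(ROUTER_C, 2)])
c_forwards = is_designated_forwarder(ROUTER_C, 2, [(ROUTER_B, 2)])
```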


DVMRP Pruning
Initial Flooding of (S, G) Multicast Packets Down Truncated Broadcast Tree

[Figure: Source S at the top of a network of mrouted routers (A, B, X, and Y are labeled), with Receiver 1 (Group G) attached below. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.]

DVMRP Pruning
In this example we see source S has begun to transmit multicast traffic to group G. Initially, the traffic (shown by the solid arrows) is flooded to all routers in the network down the Truncated Broadcast Tree (indicated by the dashed arrows) and is reaching Receiver 1.


DVMRP Pruning
Router C is a Leaf Node so it sends an (S, G) Prune Message. Router B prunes the interface.

[Figure: Same network as the previous slide, with a Prune message shown travelling up-tree. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.]

DVMRP Pruning (cont.)


At this point, we see that router C is a leaf node on the TBT and has no need for the traffic. Therefore, it sends a DVMRP (S, G) Prune message up the TBT to router B to shut off the unwanted flow of traffic. Router B receives this (S, G) Prune message and shuts off the flow of (S, G) traffic to router C.


DVMRP Pruning
Routers X and Y are also Leaf Nodes so they send Prune (S, G) Messages. Router E prunes the interface.

[Figure: Same network, with Prune messages shown from routers X and Y. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.]

DVMRP Pruning (cont.)


Both routers X and Y are also leaf nodes that have no need for the (S, G) traffic (i.e., they have no directly connected receivers) and therefore send (S, G) Prunes up the TBT to router E. Once router E has received (S, G) Prune messages from all DVMRP neighbours on the subnet, it prunes the Ethernet interface connecting to routers X and Y.


DVMRP Pruning
Router E is now a Leaf Node; it sends an (S, G) Prune message. Router D prunes the interface.

[Figure: Same network, with a Prune message shown from router E. Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.]

DVMRP Pruning (cont.)


At this point, all of router E's downstream interfaces on the TBT have been pruned and it no longer has any need for the (S, G) traffic. As a result, it too sends an (S, G) Prune up the TBT to router D. When router D receives this (S, G) Prune, it prunes the interface and shuts off the flow of (S, G) traffic to router E.


DVMRP Pruning
Final Pruned State

[Figure: Same network in its final pruned state; (S, G) traffic flows only toward Receiver 1 (Group G). Legend: Truncated Broadcast Tree based on DVMRP route metrics; (S, G) Multicast Packet Flow.]

DVMRP Pruning (cont.)


In the drawing above, we see the final pruned state of the TBT, which leaves traffic flowing only to the receiver. However, because DVMRP is a flood-and-prune protocol, these pruned branches of the TBT will time out (typically after 2 minutes) and (S, G) traffic will once again flood down all branches of the TBT. This will again trigger the sending of (S, G) Prune messages up the TBT to prune off the unwanted traffic.
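The prune bookkeeping at a router such as E can be sketched as follows. This is a toy model with hypothetical names: prune state is tracked per downstream neighbor, time is a plain number of seconds, and the 120-second default reflects the "typically 2 minutes" timeout mentioned above.

```python
class PruneState:
    """Toy (S, G) prune state for one router, keyed by downstream neighbor."""

    def __init__(self, downstream, local_receivers=False):
        self.downstream = set(downstream)
        self.local_receivers = local_receivers
        self.pruned_until = {}  # neighbor -> time at which the prune expires

    def receive_prune(self, neighbor, now, lifetime=120):
        self.pruned_until[neighbor] = now + lifetime

    def forwarding_to(self, now):
        """Neighbors that still receive (S, G) traffic at time now."""
        return {n for n in self.downstream
                if self.pruned_until.get(n, now) <= now}

    def should_prune_upstream(self, now):
        """A router with no receivers and nothing downstream prunes itself."""
        return not self.local_receivers and not self.forwarding_to(now)


router_e = PruneState(downstream=["X", "Y"])
router_e.receive_prune("X", now=0)
after_x = router_e.should_prune_upstream(now=1)    # Y still wants traffic
router_e.receive_prune("Y", now=1)
after_y = router_e.should_prune_upstream(now=2)    # E can now prune toward D
reflooded = router_e.forwarding_to(now=200)        # prunes have timed out
```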


DVMRP Grafting

Receiver 2 joins Group G. Router Y sends a Graft (S, G) Message.

[Figure: Source S at the top of a network of mrouted routers (X, E, and Y are labeled), with Receiver 1 (Group G) and Receiver 2 (Group G) attached below. A Graft message is shown from router Y. Legend: Source Tree S Based on DVMRP Route Tables; (S, G) Multicast Packet Flow.]

DVMRP Grafting
Let's now assume that Receiver 2 joins Group G by sending an IGMP Host Membership Report for group G to Router Y. Router Y finds that it has (S, G) state and that it has previously pruned this source's traffic from the Source Tree (shown with dashed green arrows). As a result, Router Y sends an (S, G) Graft message to its upstream neighbor, Router E.


DVMRP Grafting

Router E responds with a Graft-Ack and sends its own Graft (S, G) Message.

[Figure: Same network; a Graft-Ack is shown from router E to router Y, and a Graft from router E up-tree. Legend: Source Tree S Based on DVMRP Route Tables; (S, G) Multicast Packet Flow.]

DVMRP Grafting
Router E receives the Graft for (S, G) from Router Y and first responds by sending a Graft-Ack message to acknowledge the receipt of Router Y's Graft message. Router E now finds that it too has previously pruned (S, G) from the Source Tree and must therefore send an (S, G) Graft to its upstream neighbor, Router D.


DVMRP Grafting

Router D responds with a Graft-Ack and begins forwarding (S, G) packets.

[Figure: Same network; a Graft-Ack is shown from router D to router E, and (S, G) traffic again flows toward Receiver 2. Legend: Source Tree S Based on DVMRP Route Tables; (S, G) Multicast Packet Flow.]

DVMRP Grafting
Router D receives the Graft for (S, G) from Router E and first responds by sending a Graft-Ack message to acknowledge the receipt of Router E's Graft message. Router D has not pruned (S, G) traffic from the Source Tree and therefore simply adds the interface towards Router E to its outgoing interface list. This restarts the flow of (S, G) traffic down to Receiver 2.
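The hop-by-hop Graft and Graft-Ack sequence can be traced with a short loop. This is a sketch only: the function name is hypothetical, routers are plain strings, and retransmission of unacknowledged Grafts is not modeled.

```python
def graft(path_to_source, pruned):
    """Trace Graft/Graft-Ack messages up the tree.

    path_to_source lists routers from the grafting router toward the source.
    pruned is the set of routers that previously pruned (S, G); it is updated
    in place as routers rejoin the tree.
    Returns a list of (sender, receiver, message) tuples in order.
    """
    messages = []
    for child, parent in zip(path_to_source, path_to_source[1:]):
        if child not in pruned:
            break  # this router still receives (S, G); nothing more to graft
        messages.append((child, parent, "Graft"))
        messages.append((parent, child, "Graft-Ack"))
        pruned.discard(child)
    return messages


# Routers Y and E had pruned (S, G); router D had not.
pruned_routers = {"Y", "E"}
trace = graft(["Y", "E", "D"], pruned_routers)
```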


DVMRP Tunnels

[Figure: mrouted routers in the MBONE connected to an mrouted router in the Local Network, with one path crossing a Non-Multicast Network. Legend: Tunnel; Direct Connection.]

DVMRP Tunnels
DVMRP Tunnels are primarily used to tunnel through non-Multicast capable networks. DVMRP Tunnels also have the added benefit of reducing the hop count across non-Multicast enabled networks to 1 hop. This is important when one considers that infinity is 32 hops in DVMRP. Since the Internet is considerably more than 32 hops in diameter, DVMRP Tunnels are a necessity in the Internet.


DVMRP Tunnels

IP in IP Encapsulation

[Figure: Tunnel between the mrouted routers at 192.1.1.1 and 156.23.10.2.]

New IP Header:       Src: 192.1.1.1       Dst: 156.23.10.2    Protocol: 4
Original IP Header:  Src: 196.14.23.101   Dst: 224.0.1.254    Protocol: xx
Original Data:       Multicast Data

DVMRP Tunnels
DVMRP Tunnels are implemented using IP-in-IP Encapsulation. This encapsulation method uses an additional IP wrapper header with a protocol number of 4. The IP wrapper header uses the IP addresses of the end points of the tunnel as the Source and Destination addresses.
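The wrapper header can be illustrated by building the bytes by hand. This is a minimal sketch, not a tunnel implementation: the helper names are hypothetical, the TTL of 64 and the inner protocol number 17 (UDP) are assumptions (the slide shows the inner protocol only as "xx"), and IP options and fragmentation are ignored.

```python
import socket
import struct

IPPROTO_IPIP = 4  # protocol number of the wrapper header, per the notes


def ipv4_checksum(header):
    """Standard 16-bit one's-complement checksum over an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src, dst, protocol, payload_len, ttl=64):
    """Build a 20-byte IPv4 header (no options) with a valid checksum."""
    header = struct.pack("!BBHHHBBH4s4s",
                         (4 << 4) | 5, 0, 20 + payload_len, 0, 0,
                         ttl, protocol, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def encapsulate(inner_packet, tunnel_src, tunnel_dst):
    """Prepend the wrapper header; tunnel endpoints become Src and Dst."""
    return ipv4_header(tunnel_src, tunnel_dst, IPPROTO_IPIP,
                       len(inner_packet)) + inner_packet


data = b"multicast data"
inner = ipv4_header("196.14.23.101", "224.0.1.254", 17, len(data)) + data
outer = encapsulate(inner, "192.1.1.1", "156.23.10.2")
```

The receiving tunnel endpoint strips the first 20 bytes and processes the original multicast packet as if it had arrived on a directly connected interface.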


Interconnecting PIM & DVMRP Multicast Networks


Module 9

Module9.ppt

© 1998-2001, Cisco Systems, Inc. All rights reserved.

8/14/2001 2:11 PM


Module Objectives

Learn that Cisco routers don't run DVMRP!
Explain Cisco routers' PIM-DVMRP interoperability features & functions.
Demonstrate an ability to configure PIM-DVMRP interoperability
Demonstrate an ability to debug PIM-DVMRP interoperability

Module Agenda

PIM-DVMRP Interoperability
Advanced PIM-DVMRP Features
Debugging Tips

PIM-DVMRP Interoperability

DVMRP router discovery
Basic PIM-DVMRP interaction
PIM-DVMRP route exchange
PIM-DVMRP RPF checking
PIM-DVMRP congruency
PIM-DM/DVMRP Boundaries
PIM-SM/DVMRP Boundaries

DVMRP Router Discovery


Automatic Detection of DVMRP Routers on PIM Enabled Interfaces

Router A configuration:

  ip multicast-routing
  interface E0
   ip addr 10.1.1.1 255.255.255.0
   ip pim dense-mode
  interface E1
   ip addr 176.32.10.0 255.255.255.0
   ip pim dense-mode

[Figure: An mrouted router sends a DVMRP Probe onto network 10.1.1.0/24, which attaches to interface E0 of Router A. Router A's interface E1 attaches to 176.32.10.0/24.]

Router A: "Whoa! There's a DVMRP router on this PIM interface! I'll turn on DVMRP interoperation."

DVMRP Router Discovery


A Cisco router detects the existence of a DVMRP router on an interface whenever it receives a DVMRP Probe packet. PIM-DVMRP Interoperability is enabled automatically whenever a DVMRP router is detected on an interface that has PIM enabled.


Basic PIM-DVMRP Interaction

PIM-DVMRP Interaction over p2p interfaces

[Figure: An mrouted router connected through a DVMRP tunnel (Tunnel0) to a Cisco router whose interface E0 faces the PIM network. DVMRP Probes and DVMRP Reports flow from mrouted to the Cisco router, which keeps a DVMRP Route Table. The Cisco router sends DVMRP Reports* (with Poison Reverse) drawn from its Unicast Route Table. DVMRP Prunes and Grafts flow in both directions.]

* Unicast routes advertised depend on several factors

Interaction over P2P interfaces


Cisco routers receive DVMRP Probes, which notify the Cisco router that a DVMRP neighbor is on the link. However, Cisco routers do not send DVMRP Probes.
Cisco routers receive DVMRP Route Reports over p2p links and store them in a separate DVMRP routing table. The Cisco router sends back Poison-Reverse routes to the DVMRP neighbor so that the DVMRP neighbor will forward any traffic from sources in these networks across the link. (This effectively puts the Cisco router on the Truncated Broadcast Tree for sources in the networks that were Poison-Reversed.)
Cisco routers redistribute certain Unicast routes to the DVMRP neighbor as DVMRP Route Reports.
Cisco routers receive and process DVMRP Grafts and Prunes that arrive via p2p links.
Cisco routers also send DVMRP Prunes and Grafts (as well as Graft-Acks) via p2p links.


Basic PIM-DVMRP Interaction


Old PIM-DVMRP interaction over non-p2p interfaces (Ethernet, etc.) prior to 11.2(13)

[Figure: an mrouted DVMRP router on a multi-access link attached to interface E1 of a Cisco router; the Cisco router's E0 interface faces the PIM network. mrouted sends DVMRP Probes, DVMRP Prunes (ignored), DVMRP Reports (ignored), and DVMRP Grafts. The Cisco router sends DVMRP Reports* built from the Unicast Route Table, IGMP Group Joins**, and DVMRP Prunes and Grafts.]

* The unicast routes advertised depend on several factors.
** IGMP group joins are sent for all known groups in the PIM cloud.

Interaction over non-P2P interfaces (prior to 11.2(13))


Cisco routers receive DVMRP Probes, which notify the Cisco router that a DVMRP neighbor is on the link. However, Cisco routers do not send DVMRP Probes.
Cisco routers ignore DVMRP Route Reports received over non-p2p links prior to IOS version 11.2(13).
Cisco routers redistribute certain Unicast routes to the DVMRP neighbor as DVMRP Route Reports.
Prior to IOS version 11.2(13), Cisco routers send IGMP Joins via non-p2p links for any active groups in the mroute table.
Cisco routers ignore DVMRP Prunes received via non-p2p links prior to IOS version 11.2(13). This is because Cisco routers do not keep track of how many DVMRP neighbors are on a multi-access link, or of which ones have sent Prunes and which have not. (Note: DVMRP Prunes are supported on multi-access links in later releases of IOS.)
Cisco routers send DVMRP Prunes and Grafts (as well as Graft-Acks) via non-p2p links prior to IOS version 11.2(13).
Cisco routers receive and process DVMRP Grafts that arrive via non-p2p links prior to IOS version 11.2(13).


PIM-DVMRP Interaction
Problem with old default PIM-DVMRP interaction over non-p2p links (prior to 11.2(13))

[Figure: an mrouted DVMRP router, with Source S sending to Group G, attached to Border Router A of a PIM Sparse Mode network. Router A: multicast cache empty. Router B (the RP): multicast cache (*,G). Router C: multicast cache (*,G). A Receiver for Group G sits in the PIM Sparse Mode network.]

The Border Router doesn't know about group G and doesn't send an IGMP group join for G.
Therefore, the DVMRP router will not forward packets from source S.

Problem with the IGMP Join interaction


Consider the example PIM-SM network in the drawing above. Here we have a receiver for group G which has resulted in the creation of a branch of the Shared Tree from the RP (Router B) down to the receiver. Router A is the border router connected to the DVMRP router. Unfortunately, Router A has no state for group G since it is not on the Shared Tree. As a result, Router A will not send IGMP Joins and source S traffic will not be forwarded across the boundary.


Basic PIM-DVMRP Interaction


Using ip dvmrp unicast-routing to modify the old default PIM-DVMRP interaction over non-p2p links (prior to version 11.2(13))

[Figure: an mrouted DVMRP router on a multi-access link attached to interface E1 of a Cisco router; the Cisco router's E0 interface faces the PIM network. mrouted sends DVMRP Probes, DVMRP Prunes (ignored), DVMRP Reports, and DVMRP Grafts. The Cisco router keeps a DVMRP Route Table and a Unicast Route Table, and sends DVMRP Reports* with Poison Reverse plus DVMRP Prunes and Grafts. IGMP Group Joins** are no longer sent.]

* The unicast routes advertised depend on several factors.
** IGMP Group Joins discontinued.

Using ip dvmrp unicast-routing on non-p2p links prior to 11.2(13)


Cisco routers receive DVMRP Probes, which notify the Cisco router that a DVMRP neighbor is on the link. However, Cisco routers do not send DVMRP Probes.
When ip dvmrp unicast-routing is configured on the link, Cisco routers receive DVMRP Route Reports over non-p2p links and store them in a separate DVMRP routing table. The Cisco router then also begins sending Poison-Reverse routes back to the DVMRP neighbor so that the DVMRP neighbor will forward any traffic from sources in these networks across the link. (This effectively puts the Cisco router on the Truncated Broadcast Tree for sources in the networks that were Poison-Reversed.)
Cisco routers redistribute certain Unicast routes to the DVMRP neighbor as DVMRP Route Reports.
Prior to IOS version 11.2(13), Cisco routers send IGMP Joins via non-p2p links for any active groups in the mroute table. However, this behavior is suppressed when ip dvmrp unicast-routing is configured on the link. (The router instead sends Poison-Reverse routes for the source networks from which it wishes to receive traffic, as described above.)
Cisco routers ignore DVMRP Prunes received via non-p2p links. This is because Cisco routers do not keep track of how many DVMRP neighbors are on a multi-access link, or of which ones have sent Prunes and which have not. (Note: DVMRP Prunes are supported on multi-access links in later releases of IOS.)
Cisco routers send DVMRP Prunes and Grafts (as well as Graft-Acks) via non-p2p links prior to IOS version 11.2(13).
Cisco routers receive and process DVMRP Grafts that arrive via non-p2p links prior to IOS version 11.2(13).


PIM-DVMRP Interaction
Using ip dvmrp unicast-routing over non-p2p links (prior to version 11.2(13))

[Figure: an mrouted DVMRP router, with Source S sending to Group G, attached to Border Router A (pre-11.2(13)) of a PIM Sparse Mode network. Router A is configured with ip dvmrp unicast-routing and sends Poison Reverse routes to mrouted. Router A, Router B (the RP), and Router C each hold (S,G) in their multicast tables. A Receiver for Group G sits in the PIM Sparse Mode network.]

The Border Router now sends Poison-Reverse routes, so the DVMRP router knows to forward packets from S.

Workaround: Using ip dvmrp unicast-routing prior to 11.2(13)


Prior to IOS version 11.2(13), Cisco routers would not send Poison-Reverse routes over non-p2p links. Instead, a router would send IGMP Joins if and only if it had state for a group. The ip dvmrp unicast-routing command can be used to change this behavior and force the Cisco router to exchange DVMRP Route updates, including Poison-Reverse updates. The Poison-Reverse updates inform the DVMRP router that the Cisco router should be placed on the Truncated Broadcast Tree for sources on the network specified in the Poison-Reversed route. This causes the traffic from source S in this example to be flooded across the boundary to Router A. When Router A receives this traffic from source S, it automatically proxy-registers S to the RP (as if S were a directly connected source). In addition, Router A joins the Shared Tree for group G. (This ensures that traffic for group G that is originated by a source in the PIM-SM cloud will be forwarded to the DVMRP router.)


Basic PIM-DVMRP Interaction


New PIM-DVMRP interaction over non-p2p (Ethernet, FDDI, etc.) beginning with 11.2(13)*

[Figure: an mrouted DVMRP router on a multi-access link attached to interface E1 of a Cisco router; the Cisco router's E0 interface faces the PIM network. mrouted sends DVMRP Probes, DVMRP Prunes (ignored), DVMRP Reports, and DVMRP Grafts. The Cisco router keeps a DVMRP Route Table and a Unicast Route Table, and sends DVMRP Reports** with Poison Reverse plus DVMRP Prunes and Grafts.]

* ip dvmrp unicast-routing no longer necessary.
** The unicast routes advertised depend on several factors.

Interaction over non-p2p links after IOS version 11.2(13)


Cisco routers receive DVMRP Probes, which notify the Cisco router that a DVMRP neighbor is on the link. However, Cisco routers do not send DVMRP Probes.
Cisco routers automatically receive DVMRP Route Reports over non-p2p links and store them in a separate DVMRP routing table. (It is no longer necessary to configure the ip dvmrp unicast-routing command on the link.) The Cisco router also automatically sends Poison-Reverse routes back to the DVMRP neighbor so that the DVMRP neighbor will forward any traffic from sources in these networks across the link. (This effectively puts the Cisco router on the Truncated Broadcast Tree for sources in the networks that were Poison-Reversed.)
Cisco routers redistribute certain Unicast routes to the DVMRP neighbor as DVMRP Route Reports.
Cisco routers send DVMRP Prunes and Grafts (as well as Graft-Acks) via non-p2p links, as they did prior to IOS version 11.2(13).
Cisco routers ignore DVMRP Prunes received via non-p2p links. This is because Cisco routers do not keep track of how many DVMRP neighbors are on a multi-access link, or of which ones have sent Prunes and which have not. (Note: DVMRP Prunes are supported on multi-access links in later releases of IOS.)
Cisco routers receive and process DVMRP Grafts that arrive via non-p2p links, as they did prior to IOS version 11.2(13).


Basic PIM-DVMRP Interaction


Prune problem on non-p2p links prior to 12.1

[Figure: two mrouted DVMRP routers share a multi-access link with a Cisco router that forwards (S, G) traffic from Network S onto the link. One mrouted router doesn't need the (S, G) traffic and sends Prune (S, G), which the Cisco router ignores.]

Cisco router: "I'm not keeping track of each DVMRP neighbor, so I don't know if they have ALL sent a Prune or not!"

Prune problem on non-p2p links prior to IOS version 12.1


The DVMRP pruning mechanism on non-p2p links works differently from the pruning mechanism of PIM. In PIM, the upstream router waits for 3-4 seconds before pruning a non-p2p link (i.e. a multi-access link) over which it has received a PIM Prune message. This delay gives other PIM routers on the link time to override the Prune with a Join message. In DVMRP, there is no Prune override operation. Instead, the upstream DVMRP router is expected to keep track of all downstream DVMRP neighbors on the link and to prune the traffic only when ALL DVMRP routers on the link have sent a Prune message. (In other words, the upstream router must get a unanimous vote to Prune from all DVMRP routers on the link before it prunes.)
Prior to release 12.1, Cisco routers did not keep track of the individual DVMRP routers on a multi-access (non-p2p) link. A Cisco router was therefore unable to ascertain that ALL DVMRP routers on the link had sent a DVMRP Prune for the traffic. As a result, the Cisco router would ignore all DVMRP Prune messages and continue to send traffic, under the assumption that some other DVMRP router on the link still needed it.
Beginning with IOS version 12.1, Cisco routers keep track of all DVMRP routers on the link and prune the link once they have received a DVMRP Prune from all DVMRP routers on the wire.
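The unanimous-vote rule described above can be sketched in a few lines of Python. This is only an illustrative model, not IOS or mrouted code: the class name MultiAccessLink, its methods, and the neighbor names are hypothetical, and the sketch ignores PIM neighbors, local receivers, and prune lifetimes.

```python
class MultiAccessLink:
    """Upstream router's view of one (S, G) flow on a multi-access link."""

    def __init__(self, dvmrp_neighbors):
        # Neighbors assumed to have been learned from DVMRP Probes on this link.
        self.neighbors = set(dvmrp_neighbors)
        self.pruned = set()

    def receive_prune(self, neighbor):
        # Record the prune only if it comes from a tracked neighbor.
        if neighbor in self.neighbors:
            self.pruned.add(neighbor)

    def receive_graft(self, neighbor):
        # A graft cancels an earlier prune from the same neighbor.
        self.pruned.discard(neighbor)

    def should_forward(self):
        # Keep forwarding until every tracked neighbor has pruned.
        return self.pruned != self.neighbors


link = MultiAccessLink({"mrouted-1", "mrouted-2"})
link.receive_prune("mrouted-1")
after_one_prune = link.should_forward()    # one neighbor still wants traffic
link.receive_prune("mrouted-2")
after_all_prunes = link.should_forward()   # unanimous, so the link is pruned
link.receive_graft("mrouted-1")
after_graft = link.should_forward()        # forwarding resumes
```

A router that does not track neighbors, as described for releases before 12.1, cannot evaluate the condition in should_forward, which is why it falls back to ignoring Prunes.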


Basic PIM-DVMRP Interaction


Poison-Reverse ignored prior to 11.2(13)

[Figure: a Cisco router attached to Networks Si, Sj, Sk sends Normal Routes for them to an mrouted DVMRP router in DVMRP Reports. mrouted returns Poison Reverse Routes for Si and Sk, which the Cisco router ignores. The Cisco router forwards Si, Sj, Sk traffic to mrouted, which results in Unwanted Traffic!!! The Cisco router also holds a DVMRP Route Table.]

Poison-Reverse messages ignored prior to 11.2(13)


Prior to IOS version 11.2(13), Cisco routers ignored all Poison-Reverse routes from DVMRP routers and simply assumed that the DVMRP routers wanted all traffic sourced from the PIM cloud. (This was equivalent to assuming that every route advertised to a DVMRP neighbor resulted in a Poison-Reverse route being sent back from the DVMRP neighbor.) This behavior was sub-optimal, as the DVMRP router would sometimes have a better route to a particular source network via some other DVMRP router in the DVMRP cloud. In that case, unwanted traffic would flow to the DVMRP router. This was fixed after release 11.2(13).


Basic PIM-DVMRP Interaction


New PIM-DVMRP interaction over non-p2p (Ethernet, FDDI, etc.) beginning with 12.1*

[Figure: an mrouted DVMRP router on a multi-access link attached to interface E1 of a Cisco router; the Cisco router's E0 interface faces the PIM network. mrouted sends DVMRP Probes, DVMRP Prunes, DVMRP Reports, and DVMRP Grafts. The Cisco router keeps a DVMRP Route Table and a Unicast Route Table, and sends DVMRP Reports** with Poison Reverse plus DVMRP Prunes and DVMRP Grafts.]

* DVMRP neighbors tracked. DVMRP Prunes supported.
** The unicast routes advertised depend on several factors.

Interaction on non-p2p links after IOS version 12.1


Beginning with IOS version 12.1, DVMRP Prunes are no longer ignored on multi-access (non-p2p) links. This leads to the following PIM-DVMRP interaction over all links.
Cisco routers receive DVMRP Probes, which notify the Cisco router that a DVMRP neighbor is on the link. However, Cisco routers do not send DVMRP Probes.
Cisco routers automatically receive DVMRP Route Reports and store them in a separate DVMRP routing table. The Cisco router also automatically sends Poison-Reverse routes back to the DVMRP neighbor so that the DVMRP neighbor will forward any traffic from sources in these networks across the link. (This effectively puts the Cisco router on the Truncated Broadcast Tree for sources in the networks that were Poison-Reversed.)
Cisco routers redistribute certain Unicast routes to the DVMRP neighbor as DVMRP Route Reports.
Cisco routers send DVMRP Prunes and Grafts (as well as Graft-Acks).
Cisco routers receive and process DVMRP Prunes and Grafts.


PIM-DVMRP Interaction

Recommendations
Use IOS release 12.1 or later.
OR
Use DVMRP tunnels or p2p interfaces whenever possible.


PIM-DVMRP Recommendation
Use IOS version 12.1 or later in order to more closely emulate the functions of a DVMRP router and to avoid several of the issues discussed on the previous pages. If this is not possible, the use of p2p links or DVMRP Tunnels is recommended in order to avoid issues such as the pruning issue.
Note: DVMRP Tunnels can be used across a multi-access link to avoid many of the problems outlined previously. This requires a separate DVMRP Tunnel to each DVMRP router on the multi-access network. In addition, the PIM router and the DVMRP router SHOULD disable multicast on this multi-access link and only permit multicast traffic to flow via the DVMRP Tunnels.


PIM-DVMRP Route Exchange


Received DVMRP Routes Are Installed in a Separate DVMRP Route Table

[Figure: an mrouted DVMRP router connected by a Tunnel to a Cisco router with interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24).]

DVMRP Report (received from mrouted)
  151.16.0.0/16, M=6
  172.34.15.0/24, M=9
  202.13.3.0/24, M=7

DVMRP Route Table
  Src Network      Intf  Metric  Dist
  151.16.0.0/16    E0    7       0
  172.34.15.0/24   E0    10      0
  202.13.3.0/24    E0    8       0

Unicast Route Table (10,000 Routes)
  Network          Intf  Metric    Dist
  176.32.10.0/24   E0    10514432  90
  176.32.15.0/24   E1    10512012  90
  176.32.20.0/24   E1    45106272  90
  (Includes 200-176.32 Routes)

PIM-DVMRP Route Exchange


When a Cisco router receives a DVMRP Route update, the routes are stored in a separate DVMRP route table. The routes in the DVMRP route table have a default Admin. Distance of zero and are therefore preferred over routes in the Unicast Route table when performing the RPF calculation.


PIM-DVMRP Route Exchange


Only Connected Unicast Routes Are Advertised by Default

[Figure: an mrouted DVMRP router connected by a Tunnel to a Cisco router with interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24).]

  interface Tunnel 0
   ip unnumbered Ethernet 0
   ...
  interface E0
   ip addr 176.32.10.1 255.255.255.0
   ip pim dense-mode
  interface E1
   ip addr 176.32.15.1 255.255.255.0
   ip pim dense-mode

DVMRP Report (sent to mrouted)
  151.16.0.0/16, M=39
  172.34.15.0/24, M=42
  202.13.3.0/24, M=40
  176.32.10.0/24, M=1
  176.32.15.0/24, M=1

DVMRP Route Table
  Src Network      Intf  Metric  Dist
  151.16.0.0/16    E0    7       0
  172.34.15.0/24   E0    10      0
  202.13.3.0/24    E0    8       0

Unicast Route Table (10,000 Routes)
  Network          Intf  Metric    Dist
  176.32.10.0/24   E0    10514432  90
  176.32.15.0/24   E1    10512012  90
  176.32.20.0/24   E1    45106272  90
  (Includes 200-176.32 Routes)

PIM-DVMRP Route Exchange


Cisco routers will advertise the routes in the DVMRP route table to DVMRP neighbors. The Cisco router will use the normal DVMRP Poison-Reverse mechanism to indicate those networks that it expects to receive traffic from the DVMRP router. In addition to the DVMRP routes mentioned above, the Cisco router will, by default, redistribute (advertise) all connected routes to the DVMRP neighbor as DVMRP routes with a metric of 1.

In the example above, the Cisco router has received three DVMRP routes from the DVMRP neighbor:
  151.16.0.0/16    Metric=7
  172.34.15.0/24   Metric=10
  202.13.3.0/24    Metric=8

The Cisco router Poison-Reverses these routes by adding Infinity (32) to these metrics and sending them back to the DVMRP router. This informs the DVMRP router that the Cisco router should be included on the Truncated Broadcast Tree for these source networks. This will cause traffic from any source in these networks to be flooded across the interface.

The Cisco router is automatically redistributing (advertising) all connected routes from its unicast route table with a DVMRP metric of one. This results in the following DVMRP route advertisements being sent to the DVMRP neighbor:
  176.32.10.0/24   Metric=1
  176.32.15.0/24   Metric=1
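The metric arithmetic above can be checked with a short Python sketch. It is a simplified illustration, not router code: the function names poison_reverse and build_report are hypothetical, and the route values are the ones quoted in the example.

```python
DVMRP_INFINITY = 32  # "Infinity" in DVMRP metrics, as stated in the notes


def poison_reverse(dvmrp_routes):
    """Add Infinity to each learned metric before echoing it back."""
    return {prefix: metric + DVMRP_INFINITY for prefix, metric in dvmrp_routes.items()}


def build_report(dvmrp_routes, connected_prefixes, connected_metric=1):
    """Poison-Reversed DVMRP routes plus connected routes at metric 1."""
    report = poison_reverse(dvmrp_routes)
    for prefix in connected_prefixes:
        report[prefix] = connected_metric
    return report


# Routes held in the DVMRP route table in the example.
dvmrp_table = {"151.16.0.0/16": 7, "172.34.15.0/24": 10, "202.13.3.0/24": 8}
connected = ["176.32.10.0/24", "176.32.15.0/24"]

report = build_report(dvmrp_table, connected)
for prefix, metric in report.items():
    print(f"{prefix}, M={metric}")
```

The resulting metrics (39, 42, 40, 1, 1) match the DVMRP Report shown on the slide.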


PIM-DVMRP Route Exchange


Classful Summarization of Unicast Routes

[Figure: an mrouted DVMRP router connected by a Tunnel (204.10.10.0/24) to a Cisco router with interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24).]

  interface Tunnel 0
   ip addr 204.10.10.1 255.255.255.0
   ...
  interface E0
   ip addr 176.32.10.1 255.255.255.0
   ip pim dense-mode
  interface E1
   ip addr 176.32.15.1 255.255.255.0
   ip pim dense-mode

DVMRP Report (sent to mrouted)
  151.16.0.0/16, M=39
  172.34.15.0/24, M=42
  202.13.3.0/24, M=40
  176.32.0.0/16, M=1

DVMRP Route Table
  Src Network      Intf  Metric  Dist
  151.16.0.0/16    E0    7       0
  172.34.15.0/24   E0    10      0
  202.13.3.0/24    E0    8       0

Unicast Route Table (10,000 Routes)
  Network          Intf  Metric    Dist
  176.32.10.0/24   E0    10514432  90
  176.32.15.0/24   E1    10512012  90
  176.32.20.0/24   E1    45106272  90
  (Includes 200-176.32 Routes)

Classful Summarization of redistributed Unicast Routes


Cisco routers automatically perform classful summarization of routes that are advertised to a DVMRP neighbor. In the example above, the interface between the Cisco router and the DVMRP router is not within the same classful network as the connected routes. This causes the connected networks to be summarized into a single Class B network advertisement, resulting in the following advertised route:
  176.32.0.0/16    Metric=1
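The summarization behavior in the two examples can be modeled roughly as follows. This Python sketch rests on an assumption drawn from the notes: a connected route is summarized to its classful (major) network only when the link to the DVMRP neighbor lies in a different major network. The function names are hypothetical, and only address classes A to C are handled.

```python
import ipaddress


def classful_network(prefix):
    """Return the class A/B/C major network that contains the prefix."""
    net = ipaddress.ip_network(prefix)
    first_octet = int(net.network_address) >> 24
    if first_octet < 128:
        length = 8       # Class A
    elif first_octet < 192:
        length = 16      # Class B
    else:
        length = 24      # Class C (classes D and E are not modeled here)
    return net.supernet(new_prefix=length) if net.prefixlen > length else net


def advertise_connected(connected, link_prefix):
    """Summarize connected routes when the link is in another major network."""
    link_major = classful_network(link_prefix)
    advertised = []
    for prefix in connected:
        major = classful_network(prefix)
        route = ipaddress.ip_network(prefix) if major == link_major else major
        if route not in advertised:
            advertised.append(route)
    return [str(route) for route in advertised]


connected = ["176.32.10.0/24", "176.32.15.0/24"]

# Tunnel numbered out of 204.10.10.0/24: a different major network.
summarized = advertise_connected(connected, "204.10.10.0/24")

# Tunnel unnumbered to Ethernet 0 (176.32.10.0/24): the same major network.
unsummarized = advertise_connected(connected, "176.32.10.0/24")
```

With the 204.10.10.0/24 tunnel the two connected networks collapse into 176.32.0.0/16, while the unnumbered tunnel of the earlier example leaves both /24 routes intact.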


PIM-DVMRP Route Exchange


Forcing DVMRP Route Summarization (11.2(5))

[Figure: an mrouted DVMRP router connected by a Tunnel to a Cisco router with interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24).]

  interface Tunnel 0
   ip unnumbered Ethernet 0
   ip dvmrp summary-addr 176.32.0.0 255.255.0.0
   ...

DVMRP Report (sent to mrouted)
  151.16.0.0/16, M=39
  172.34.15.0/24, M=42
  202.13.3.0/24, M=40
  176.32.0.0/16, M=1

DVMRP Route Table
  Src Network      Intf  Metric  Dist
  151.16.0.0/16    E0    7       0
  172.34.15.0/24   E0    10      0
  202.13.3.0/24    E0    8       0

Unicast Route Table (10,000 Routes)
  Network          Intf  Metric    Dist
  176.32.10.0/24   E0    10514432  90
  176.32.15.0/24   E1    10512012  90
  176.32.20.0/24   E1    45106272  90
  (Includes 200-176.32 Routes)

Forcing Route Summarization


Beginning with IOS version 11.2(5), classful summarization can be forced even though the Tunnel interface falls within the same classful network as the connected routes. In the above example, the ip dvmrp summary-address command is being used to summarize the two connected networks into the following DVMRP route advertisement: 176.32.0.0/16 Metric=1


PIM-DVMRP Route Exchange


The Deadly ip dvmrp metric n Command

[Figure: an mrouted DVMRP router connected by a Tunnel to a Cisco router with interfaces E0 (176.32.10.0/24) and E1 (176.32.15.0/24). mrouted: "10,000 routes!!! Choke, gasp, wheeze!"]

  interface Tunnel 0
   ip unnumbered Ethernet 0
   ip dvmrp metric 1
   ...
  interface E0
   ip addr 176.32.10.1 255.255.255.0
   ip pim dense-mode
  interface E1
   ip addr 176.32.15.1 255.255.255.0
   ip pim dense-mode

DVMRP Report (sent to mrouted)
  151.16.0.0/16, M=39
  172.34.15.0/24, M=42
  202.13.3.0/24, M=40
  176.32.10.0/24, M=1
  176.32.15.0/24, M=1
  176.32.20.0/24, M=1
  . . . (10,000 Routes!)

DVMRP Route Table
  Src Network      Intf  Metric  Dist
  151.16.0.0/16    E0    7       0
  172.34.15.0/24   E0    10      0
  202.13.3.0/24    E0    8       0

Unicast Route Table (10,000 Routes)
  Network          Intf  Metric    Dist
  176.32.10.0/24   E0    10514432  90
  176.32.15.0/24   E1    10512012  90
  176.32.20.0/24   E1    45106272  90
  (Includes 200-176.32 Routes)

Redistributing other Unicast Routes


If it is desired to redistribute more routes from the Unicast Routing table than the default connected routes, the interface may be configured with the ip dvmrp metric command. In the example above, the Tunnel interface is configured with the ip dvmrp metric 1 command. Because no optional ACL was specified in this command, a permit any is assumed, which results in all 10,000 routes from the Unicast Routing table being redistributed to the DVMRP neighbor with a DVMRP metric of 1.
Note: The ip dvmrp metric command should normally be used with an optional ACL in order to limit the routes being redistributed to the DVMRP neighbor so that route loops are not formed. The use of an ACL is especially important if the Unicast Routing table is large, as it may be possible to overwhelm the DVMRP router with too many routes.
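The effect of the optional ACL can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the function name redistribute is hypothetical, and the ACL is represented as a plain list of permitted prefixes rather than real IOS access-list syntax. Only the three unicast routes shown on the slide are used.

```python
import ipaddress


def redistribute(unicast_routes, metric, acl=None):
    """Return the unicast routes advertised as DVMRP routes with the given metric.

    acl is a list of permitted prefixes (a simplification of an IOS ACL).
    With no ACL, every route is permitted, mirroring the implicit permit any.
    """
    if acl is None:
        permitted = list(unicast_routes)
    else:
        permitted_nets = [ipaddress.ip_network(entry) for entry in acl]
        permitted = [
            route for route in unicast_routes
            if any(ipaddress.ip_network(route).subnet_of(net) for net in permitted_nets)
        ]
    return {route: metric for route in permitted}


unicast_routes = ["176.32.10.0/24", "176.32.15.0/24", "176.32.20.0/24"]

# No ACL: the whole unicast table is redistributed.
all_routes = redistribute(unicast_routes, metric=1)

# Hypothetical ACL permitting only two of the networks.
filtered = redistribute(unicast_routes, metric=1, acl=["176.32.10.0/24", "176.32.15.0/24"])
```

Scaled up to a 10,000-route unicast table, the unfiltered call is the case the slide warns about.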


PIM-DVMRP RPF Checking

Sources of RPF Information
  Static Mroutes
  MBGP Multicast NLRI (M-RIB)
  DVMRP Routes
  Unicast Routes

RPF Information source selection
  Based on Admin. Distance
  If equal, preferred in order listed above
    i.e. Static Mroute is most preferred source
  Default DVMRP Admin. Distance = 0

Sources of RPF Information


Cisco routers calculate RPF Information (Incoming Interface and RPF Neighbor) using several sources. These sources are:
  Static Mroutes
  MBGP Multicast NLRI (M-RIB)
  DVMRP Routes
  Unicast Routes
The selection as to which source to use is based on Administrative Distance. If the Admin. Distances are equal, the tie is broken using the order listed above. That is to say, Static Mroutes are preferred over MBGP Multicast NLRI, which is preferred over DVMRP routes, which are preferred over Unicast routes.
Note: The default Admin. Distance of routes in the DVMRP route table is zero.
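The selection rule just described (lowest Admin. Distance first, then the listed order as a tie-break) can be expressed as a sort key. The Python below is a minimal sketch, not the IOS implementation: the Candidate type, the source labels, and the interface names are hypothetical, and each candidate is assumed to already match the source address being checked.

```python
from dataclasses import dataclass

# Tie-break order from the notes: earlier entries win when distances are equal.
SOURCE_ORDER = ("static-mroute", "mbgp", "dvmrp", "unicast")


@dataclass(frozen=True)
class Candidate:
    source: str       # one of SOURCE_ORDER
    distance: int     # administrative distance of the matching route
    interface: str    # incoming interface (IIF) this route would select


def select_rpf(candidates):
    """Pick the lowest distance; break ties using SOURCE_ORDER."""
    return min(candidates, key=lambda c: (c.distance, SOURCE_ORDER.index(c.source)))


# A DVMRP route (default distance 0) against a unicast route with distance 90.
dvmrp_vs_unicast = select_rpf([
    Candidate("unicast", 90, "Ethernet1"),
    Candidate("dvmrp", 0, "Tunnel0"),
])

# Equal distances: the static mroute is preferred over the DVMRP route.
tie = select_rpf([
    Candidate("dvmrp", 0, "Tunnel0"),
    Candidate("static-mroute", 0, "Serial0"),
])
```

The first call shows why a router holding DVMRP routes stops using its unicast table for those sources.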


PIM RPF Calculation Details


[Figure: PIM RPF calculation. Four sources, shown in order of decreasing preference, each supply a Route/Mask and Distance to the RPF Calculation, which outputs the IIF and RPF Neighbor.]

  Source                  Lookup          Default Dist.
  Static Mroute Table     First Match     0
  BGP MRIB                Best Path       eBGP = 20, iBGP = 200
  DVMRP Route Table       Longest Match   0
  Unicast Routing Table   Longest Match   (not listed)

RPF Calculation: Use best Distance unless Longest Match (1) is enabled. If enabled, use longest Mask.

(1) Global Command: ip multicast longest-match

PIM RPF Calculations


Cisco IOS permits sources of information other than the unicast routing table to be used in the RPF calculation. In general, these sources are preferred based on their Admin. Distance. If Admin. Distance values are equal, the sources are preferred in the order listed below:

Static Mroute Table
  Static Mroutes may be defined that are local to the router on which they are defined. A matching Static Mroute has a default Admin. Distance of zero and is therefore preferred over other sources. (If another source also has a distance of zero, the Static Mroute takes precedence.)
BGP Multicast RIB (M-RIB)
  If MBGP is in use and a matching prefix exists in the MBGP M-RIB, it will be used as long as its Admin. Distance is the lowest among the sources. (MBGP M-RIB prefixes are preferred over DVMRP or Unicast routes if the Admin. Distances are the same.)
DVMRP Route Table
  If DVMRP routes are being exchanged and there exists a matching route in the DVMRP route table, the default Admin. Distance of this route is zero. DVMRP routes are preferred over Unicast routes if their Admin. Distances are equal.
Unicast Route Table
  This is the least preferred source of information. If no other source has a matching route with a lower Admin. Distance, then this information is used.

Note: The above behavior can be modified so that the longest-match route is used from the available sources. This is configured with the hidden ip multicast longest-match command.


PIM-DVMRP Congruency

RPF Information Source
  Should be consistent in all routers
  Otherwise, RPF checks may fail at some routers
  This results in congruency problems

DVMRP often causes inconsistencies
  Some routers use DVMRP routes for RPF check
  Others use Unicast routes for RPF check

PIM-DVMRP Congruency
The source of RPF information should normally be consistent across all routers in the network. Otherwise, congruency problems can occur that cause multicast traffic to fail the RPF check at some routers. The use of DVMRP routes can often cause these inconsistencies if not all routers in the network maintain DVMRP route tables. Some routers then use the DVMRP route table for RPF calculations while the other routers still use the Unicast route table.


PIM/DVMRP Congruency
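[Figure: a DVMRP cloud connected to a PIM-DM network containing routers MR1 and MR2, with Receivers attached. Legend: Physical Path, Tunnel Path, Data Flow, RPF Path. A DVMRP tunnel terminates on MR2. Data arriving through the tunnel flows on to MR1, where the RPF path does not match the data flow: "RPF Failure!!! Packets not forwarded!!!"]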

PIM-DVMRP Congruency
The slide above shows an example of how the use of DVMRP routes can cause RPF failures as a result of Unicast/DVMRP incongruency. Router MR2 is connected to a router in the DVMRP cloud via a DVMRP tunnel over which it receives DVMRP routes for sources in the DVMRP network. Since the default distance of DVMRP routes is zero, MR2 uses the DVMRP routes to perform the RPF calculation for sources in the DVMRP cloud. Multicast traffic from sources in the DVMRP cloud arrives via the DVMRP Tunnel and passes the RPF check. (This is based on the DVMRP Routing table in MR2, which RPFs up the Tunnel.) When the multicast traffic flow reaches router MR1, MR1 does not have any DVMRP routes and therefore uses the Unicast Route table to calculate the RPF interface, which is not the interface where the multicast traffic is arriving. As a result, the multicast traffic fails the RPF check and is discarded.
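The failure at MR1 can be reproduced with a toy RPF check in Python. This is an illustrative sketch only: the dictionaries standing in for route tables, the interface labels ("Tunnel0", "to-exit", "to-MR2"), and the placeholder source network are all hypothetical.

```python
def rpf_interface(router, source_net):
    """Interface on which this router expects traffic from source_net."""
    # A DVMRP route (default distance 0) is preferred over a unicast route.
    if source_net in router["dvmrp"]:
        return router["dvmrp"][source_net]
    return router["unicast"][source_net]


def passes_rpf(router, source_net, arrival_interface):
    """The RPF check passes only when traffic arrives on the RPF interface."""
    return rpf_interface(router, source_net) == arrival_interface


SRC = "dvmrp-source-net"  # placeholder for a source network in the DVMRP cloud

# MR2 terminates the tunnel and holds DVMRP routes; MR1 has unicast routes only.
mr2 = {"dvmrp": {SRC: "Tunnel0"}, "unicast": {SRC: "to-exit"}}
mr1 = {"dvmrp": {}, "unicast": {SRC: "to-exit"}}

mr2_accepts = passes_rpf(mr2, SRC, "Tunnel0")  # traffic arrives via the tunnel
mr1_accepts = passes_rpf(mr1, SRC, "to-MR2")   # traffic arrives from MR2
```

MR2 accepts the traffic because its DVMRP route points up the tunnel, while MR1 discards it because its unicast route points toward the normal exit of the PIM domain.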


PIM/DVMRP Congruency

Don't run any DVMRP!
  Sometimes necessary to transition to PIM

Make the topologies congruent
  Not always an option (although the best)

Propagate DVMRP routes to other routers as necessary
  Transition method

PIM/DVMRP Congruency
Avoid the use of DVMRP in your network if at all possible. However, it is sometimes unavoidable when transitioning a legacy DVMRP network over to PIM. The best solution is to maintain congruency between PIM routing and DVMRP routing. Since PIM normally uses the unicast routing table for the RPF calculation, this means that the DVMRP routing information would have to match the Unicast routing information in order to make the topologies congruent. This is not always an option. Another transition method is to propagate DVMRP routes to other routers in the PIM domain to ensure that the topologies remain congruent. This should only be considered as a transition method, since it may be necessary to propagate DVMRP routes to EVERY router in the network in order to accomplish this.


PIM/DVMRP Congruency
Congruent Topologies

[Figure: a DVMRP cloud connected to a PIM-DM network containing routers MR3, MR1, and MR2, with Receivers attached. Legend: Physical Path, Tunnel Path, Data Flow, RPF Path. The DVMRP tunnel terminates on MR3.]

PIM/DVMRP Congruency
In the example above, the DVMRP and PIM (Unicast) topologies are physically congruent. This is because the DVMRP tunnel is terminated on router MR3 which happens to also be the path out of the PIM domain. Hence, the RPF calculations done on MR1 and MR2 (based on the unicast route table) will RPF correctly.


PIM/DVMRP Congruency
Propagating DVMRP routes

[Figure: a DVMRP cloud connected to a PIM-DM network containing routers MR3, MR1, and MR2, with Receivers attached. Legend: Physical Path, Tunnel Path, Data Flow, RPF Path. The DVMRP tunnel terminates on MR2. MR1 and MR2 are connected through their S0 interfaces, with ip dvmrp unicast-routing enabled on that link.]

MR2:
  interface tunnel0
   ip unnumbered ethernet0
   ip pim dense-mode
   tunnel mode dvmrp
   tunnel source ethernet0
   tunnel destination <mrouted>
  interface Serial0
   ip pim dense-mode
   ip dvmrp unicast-routing

MR1:
  interface Serial0
   ip pim dense-mode
   ip dvmrp unicast-routing

PIM/DVMRP Congruency
In the example above, the DVMRP and PIM (unicast) topologies are not physically congruent on routers MR1 and MR3. This is because the DVMRP tunnel is terminated further down inside of the PIM domain. (In this case, it is terminated on MR2). As a result, router MR1 would normally have an RPF path (for sources outside the PIM domain) via router MR3. This would cause RPF failures at router MR1 for arriving traffic received via router MR2 and the DVMRP tunnel. A transition strategy would be to have MR2 and MR1 exchange DVMRP routes using the ip dvmrp unicast-routing command. This will permit MR2 to send DVMRP routes to MR1 which will in turn, result in MR1 using these routes to RPF for sources in the DVMRP network. Note: It would also be necessary to exchange DVMRP routes from MR1 to MR3 if there are any other interfaces (not shown) on MR3 where there are hosts or PIM neighbor routers. Otherwise, MR3 would also RPF fail any traffic arriving from MR1.


Propagating DVMRP Routes


Using ip dvmrp unicast-routing between two PIM neighbors causes DVMRP routes to be exchanged.

[Figure: two PIM routers, each with a DVMRP Route Table and a Unicast Route Table, connected by a link configured with ip dvmrp unicast-routing. DVMRP Reports* are exchanged over the link.]

* Split-Horizon is used between two PIM neighbors instead of Poison Reverse.

Propagating DVMRP Routes between Cisco routers


By configuring the ip dvmrp unicast-routing interface command on the interfaces connecting two Cisco routers, DVMRP route updates will be exchanged in the same fashion as if there were a DVMRP router on the interface. Exception: Poison-Reverse is not used between Cisco routers. Split-Horizon is used instead.
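The difference between Poison-Reverse and Split-Horizon can be shown by how a route report is built for each kind of neighbor. The following Python is a simplified sketch: the function name build_report, the table layout, the interface names, and the sample prefix and metrics are illustrative assumptions, and metric increments between hops are not modeled.

```python
DVMRP_INFINITY = 32


def build_report(dvmrp_table, out_interface, neighbor_is_dvmrp_router):
    """Build the DVMRP routes advertised out of one interface.

    dvmrp_table maps prefix -> (interface the route was learned on, metric).
    """
    report = {}
    for prefix, (learned_on, metric) in dvmrp_table.items():
        if learned_on != out_interface:
            report[prefix] = metric                   # normal advertisement
        elif neighbor_is_dvmrp_router:
            report[prefix] = metric + DVMRP_INFINITY  # Poison-Reverse
        # Otherwise Split-Horizon applies: the route is simply not sent back.
    return report


# MR2 learned the route over its tunnel from the DVMRP router.
mr2_table = {"151.16.0.0/16": ("Tunnel0", 7)}
to_mrouted = build_report(mr2_table, "Tunnel0", neighbor_is_dvmrp_router=True)
to_mr1 = build_report(mr2_table, "Serial0", neighbor_is_dvmrp_router=False)

# MR1 learned the same route from MR2 over Serial0.
mr1_table = {"151.16.0.0/16": ("Serial0", 8)}
back_to_mr2 = build_report(mr1_table, "Serial0", neighbor_is_dvmrp_router=False)
```

Toward the DVMRP router the learned route is echoed with Infinity added, toward the other Cisco router it is advertised normally, and nothing is sent back over the link on which it was learned from a Cisco neighbor.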


DVMRP Tunneling Problem


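[Figure: a DVMRP tunnel from an mrouted router terminates partway down the PIM hierarchy. Multicast traffic arriving via the tunnel flows against the RPF direction on the routers at and above the termination point, causing RPF failures (marked X).]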

DVMRP Tunneling Problem


Care must be taken when terminating a DVMRP tunnel inside a PIM domain, because the routers between the tunnel termination router and the normal network exit will experience RPF failures. An example of this is shown above. Notice that the routers at and above the tunnel termination point use the unicast routing table to RPF to sources in the DVMRP network, so traffic arriving via the tunnel fails the RPF check as shown.


DVMRP Tunneling Problem


[Figure: the DVMRP tunnel from the mrouted router terminates on the router at the top of the PIM hierarchy, so the RPF direction and the multicast traffic flow coincide.]

Solution 1: Terminate tunnel at top of hierarchy.

DVMRP Tunneling Problem


The best way to terminate a DVMRP tunnel in a PIM domain is to terminate the tunnel on the border router where the PIM domain and the DVMRP domain connect. This results in a congruent topology, and all routers down the hierarchy will RPF correctly using the normal unicast routing table information.


DVMRP Tunneling Problem


[Figure: the DVMRP tunnel from the mrouted router terminates partway down the PIM hierarchy. ip dvmrp unicast-routing is configured from the termination point upward so that the RPF direction matches the multicast traffic flow.]

Solution 2: Use ip dvmrp unicast-routing

DVMRP route exchange must be enabled from here up in the hierarchy!!

DVMRP Tunneling Problem


While it is possible to use ip dvmrp unicast-routing to get the Cisco routers to exchange DVMRP routes (and hence RPF correctly for sources in the DVMRP network), the number of routers that must exchange DVMRP routes can quickly get out of hand. The basic rule of thumb is that ip dvmrp unicast-routing must be run on all routers from the tunnel termination point up the hierarchy as shown in the drawing above. Therefore, one should terminate the DVMRP tunnel as close to the PIM-DVMRP network border as possible to minimize the number of routers that must exchange DVMRP routes.
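The rule of thumb above can be sketched as a walk up the hierarchy. The topology, router names and the upstream map are hypothetical, and the sketch ignores side branches that may also need the routes (such as MR3 in the earlier congruency example).

```python
def routers_needing_dvmrp_exchange(upstream, tunnel_router):
    """Return the routers from the tunnel termination point up to the top.

    upstream maps each router to its next router toward the normal
    network exit; the top router has no entry.
    """
    path = [tunnel_router]
    # Stop at the top of the hierarchy, or if the map accidentally loops.
    while path[-1] in upstream and upstream[path[-1]] not in path:
        path.append(upstream[path[-1]])
    return path

# Hypothetical hierarchy: R1 is the border router at the top.
upstream = {"R4": "R2", "R2": "R1", "R3": "R1"}

deep = routers_needing_dvmrp_exchange(upstream, "R4")
at_border = routers_needing_dvmrp_exchange(upstream, "R1")
print(deep, at_border)
```

Terminating the tunnel on R4 requires DVMRP route exchange on three routers, while terminating it on the border router R1 requires none beyond R1 itself.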


PIM-DVMRP Interoperability
  DVMRP router discovery
  Basic PIM-DVMRP interaction
  PIM-DVMRP route exchange
  PIM-DVMRP RPF checking
  PIM-DVMRP congruency
  PIM-DM/DVMRP Boundaries
  PIM-SM/DVMRP Boundaries

PIM-DM/DVMRP Boundaries

DVMRP Network uses Push model
  Traffic flooded everywhere
PIM-DM Network uses same model
  Traffic flooded everywhere
Common model makes interface easy.
  Use PIM-DM flood and prune mechanism.
  Must also observe key DVMRP signals
    Poison Reverse
    DVMRP Prunes

PIM-DM/DVMRP Boundaries
DVMRP is a dense mode protocol which uses the Push model to flood traffic to all points in the network. (Routers that don't have receivers for the traffic will then Prune the traffic flow.) PIM-DM uses the same Push model to Flood-and-Prune traffic to all parts of the network. Because the two protocols use the same basic Push model, traffic can be flooded across the PIM-DM/DVMRP boundary without using any complex mechanisms. The PIM-DM boundary router will use the DVMRP Poison-Reverse mechanism as well as normal DVMRP Prunes and Grafts to maintain traffic flow between the two networks.


PIM-DM/DVMRP Boundaries

Traffic sourced from DVMRP cloud
  When packet arrives, create (S,G) state (if it doesn't already exist)
    Use normal PIM-DM state rules
  Perform normal RPF check
  If passed, flood out all (S,G) OIL entries
    Observing any Pruned (S,G) OIL state

PIM-DM/DVMRP Boundaries
Traffic sourced from sources in the DVMRP cloud: When the first packet arrives from source S in the DVMRP cloud at the PIM-DM router, the router creates (S,G) state using the normal PIM-DM state creation rules. The incoming interface for the (S,G) entry is then computed. Assuming that a route for this source exists in the DVMRP routing table (it should, since the PIM router is receiving DVMRP routes), the incoming interface will point to the DVMRP neighbor. Packets arriving from the DVMRP neighbor will now RPF correctly and be flooded out all other interfaces on the PIM-DM router where there are PIM-DM neighbors. This permits the Push model to work across the boundary from the DVMRP network towards the PIM network.


PIM-DM/DVMRP Boundaries

Traffic sourced from PIM-DM cloud
  When packet arrives, create (S,G) state (if it doesn't already exist)
    Use normal PIM-DM state rules plus
      If DVMRP neighbor has sent Poison-Reverse for route
        Add DVMRP neighbor's interface to (*,G) OIL. Results in interface being added to (S,G) OIL.
      Prune interface if all DVMRP neighbors send prune.
  Perform normal RPF check. If passed then flood out all (S,G) OIL entries.
    Observing any Pruned (S,G) OIL state
(Above behavior assumes latest IOS code.)

PIM-DM/DVMRP Boundaries
Traffic sourced from sources in the PIM-DM cloud: When the first packet arrives from source S in the PIM-DM cloud at the PIM/DVMRP border router, the router creates (S,G) state using the normal PIM-DM state creation rules. In addition, if the border router has received a DVMRP Poison-Reverse route for this particular source network, it adds the DVMRP neighbor's interface to the outgoing interface list of the (*,G) and (S,G) forwarding entries. If a Prune message is received from all DVMRP neighbors on this interface, the PIM router Prunes the interface in the (S,G) outgoing interface list.
The incoming interface for the (S,G) entry is computed from the unicast routing table, so that it points to the PIM-DM neighbor in the direction of the source in the PIM cloud. Packets arriving from this PIM-DM neighbor will now RPF correctly and be flooded out all interfaces in the (S,G) outgoing interface list, including the interface on which the DVMRP neighbor resides. This permits the Push model to work across the boundary from the PIM network towards the DVMRP network.
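The outgoing-interface decision described above can be modeled in a few lines. This is a minimal sketch with hypothetical class and attribute names, not the IOS data structures: an interface with DVMRP neighbors forwards (S,G) traffic only if a Poison-Reverse was received for the source network and not every DVMRP neighbor on it has pruned.

```python
class BorderInterface:
    """One interface of a PIM-DM/DVMRP border router (illustrative model)."""

    def __init__(self, name, dvmrp_neighbors):
        self.name = name
        self.dvmrp_neighbors = set(dvmrp_neighbors)
        self.poison_reversed = set()  # source networks a neighbor poison-reversed
        self.pruned_by = {}           # (S, G) -> set of neighbors that sent a Prune

    def receive_poison_reverse(self, source_net):
        self.poison_reversed.add(source_net)

    def receive_prune(self, neighbor, s, g):
        self.pruned_by.setdefault((s, g), set()).add(neighbor)

    def in_oil(self, source_net, s, g):
        """True if the interface should be in the (S,G) outgoing interface list."""
        if source_net not in self.poison_reversed:
            return False
        pruned = self.pruned_by.get((s, g), set())
        # Prune the interface only when all DVMRP neighbors have sent a Prune.
        return not pruned >= self.dvmrp_neighbors

# Hypothetical addresses and group.
tunnel = BorderInterface("Tunnel0", ["135.1.22.98"])
s, g, net = "135.1.2.100", "224.2.0.1", "135.1.2.0/24"

no_poison = tunnel.in_oil(net, s, g)
tunnel.receive_poison_reverse(net)
forwarding = tunnel.in_oil(net, s, g)
tunnel.receive_prune("135.1.22.98", s, g)
after_prune = tunnel.in_oil(net, s, g)
print(no_poison, forwarding, after_prune)
```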



PIM-SM/DVMRP Boundaries

DVMRP Network uses Push model
  Traffic flooded everywhere
PIM-SM Network uses Pull model
  Traffic only sent when requested
Differences in models can result in problems at the PIM-SM/DVMRP boundary.
  Requires DVMRP Rcvr-is-Sender hack

PIM-SM/DVMRP Boundaries
DVMRP is a dense mode protocol which uses the Push model to flood traffic to all points in the network. (Routers that don't have receivers for the traffic will then Prune the traffic flow.) PIM-SM uses the Pull model to forward traffic to only those parts of the network where it has been explicitly requested. Because DVMRP uses the Push model while the PIM-SM network uses the Pull model, special considerations are needed to get traffic to flow across the PIM-SM/DVMRP boundary. The PIM-SM/DVMRP solution requires the use of the Receiver-is-a-Sender hack in order to get traffic to flow across the boundary.


PIM-SM/DVMRP Boundaries

Traffic sourced from DVMRP cloud
  Treat as directly connected source/receiver.
    Create (S,G) state (if it doesn't exist)
      Using normal PIM-SM state rules
    Proxy-Register initial (S,G) packets with RP.
    Join Shared-Tree for group G.
    Operate using normal PIM-SM forwarding rules.

PIM-SM/DVMRP Boundaries
Traffic sourced by sources in the DVMRP cloud. Since the DVMRP cloud uses the Push model, multicast traffic will be flooded to all points in the network, including the PIM-SM/DVMRP border. When the first packet arrives from source S in the DVMRP cloud at the PIM-SM router, the router creates (S,G) state using the normal PIM-SM state creation rules. The incoming interface for the (S,G) entry is then computed. Assuming that a route for this source exists in the DVMRP routing table (it should, since the PIM router is receiving DVMRP routes), the incoming interface will point to the DVMRP neighbor.
The PIM-SM border router will then Register this traffic to the RP as if it were coming from a directly connected source. This ultimately results in the traffic flowing to the RP and on down the Shared Tree to any interested receivers for group G traffic in the PIM-SM network.
The PIM-SM border router will also include the DVMRP neighbor's interface in the (*,G) outgoing interface list and trigger a (*,G) Join toward the RP to build a branch of the Shared Tree. This causes any traffic being sent by sources in the PIM-SM network to flow down the Shared Tree to the PIM border router and be forwarded to the DVMRP router.
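The border router's reaction to the first packet from a DVMRP source can be sketched as below. The state dictionary, the action tuples and the "toward-RP" placeholder are hypothetical simplifications chosen for illustration; they are not the IOS multicast table format.

```python
def on_first_dvmrp_packet(state, s, g, dvmrp_iface):
    """Model the PIM-SM border router handling a packet from a DVMRP source.

    state maps (S, G) or ("*", G) to {"iif": ..., "oil": set()}.
    Returns the protocol actions triggered by this packet.
    """
    actions = []
    if (s, g) not in state:
        # Treat the source as if it were directly connected: Register with the RP.
        state[(s, g)] = {"iif": dvmrp_iface, "oil": set()}
        actions.append(("register", s, g))
    star = ("*", g)
    if star not in state:
        state[star] = {"iif": "toward-RP", "oil": set()}
    if dvmrp_iface not in state[star]["oil"]:
        # Adding the DVMRP neighbor's interface triggers a (*,G) Join toward the RP.
        state[star]["oil"].add(dvmrp_iface)
        actions.append(("join", "*", g))
    return actions

state = {}
first = on_first_dvmrp_packet(state, "172.21.32.5", "224.2.0.1", "Tunnel0")
second = on_first_dvmrp_packet(state, "172.21.32.5", "224.2.0.1", "Tunnel0")
print(first, second)
```

Note that nothing happens until a packet actually arrives from the DVMRP side: with an empty state table there is no Join, which is the root of the receive-only host problem.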


PIM-SM/DVMRP Boundaries

Traffic sourced from PIM-SM cloud.
  Depends on DVMRP Rcvr-is-Sender hack.
    A host in the DVMRP cloud must send first.
      Causes PIM-SM border router to join Shared Tree.
    Traffic arriving via Shared Tree is flooded to DVMRP neighbor.

PIM-SM/DVMRP Boundaries
Traffic sourced by sources in the PIM-SM cloud. Since the DVMRP cloud uses the Push model instead of the Pull model, the DVMRP border router has no way of knowing whether there are any receivers in the DVMRP network and therefore cannot inform the PIM-SM router of this fact. As a result of the differences in these two models, the only way to ensure that traffic flows from the PIM-SM network to the DVMRP network is by using the Receiver-is-a-Sender hack. This hack relies on the receiver to first send multicast traffic to the group. This traffic will be flooded to the PIM-SM/DVMRP border and will cause the border router to join the Shared Tree for the group. (See explanation on the previous page.) Once a branch of the Shared Tree is set up, any traffic being sent by sources in the PIM-SM network can flow down the Shared Tree to the PIM border router and be forwarded to the DVMRP router.


Sparse Mode Boundary Issue


Problem: Receive-Only Hosts in DVMRP Cloud.

[Figure: an mrouted router with a receive-only host for group G connects via Tunnel0 to the border router of a PIM sparse-mode network that contains the RP and source S1 for group G. Multicast tables: Router A: NULL; Router B: (*, G), (S1, G); Router C: (*, G), (S1, G).]

Border router is unaware of Recv-Only Host in DVMRP cloud and therefore has no state for Group G. (S1, G) traffic doesn't make it to the Border router nor the DVMRP Cloud.

Sparse Mode Boundary Issue


The problem with Receive-Only hosts is shown in the drawing above. Although the traffic from source S1 is flowing to the RP in the PIM-SM network, it is not reaching the border router, Router A. This is because Router A has no way of knowing about the receive-only host in the DVMRP network.


Sparse Mode Boundary Issue


Solution 1: Terminate Tunnel on RP.

[Figure: Tunnel0 from the mrouted router (receive-only host, group G) terminates on the RP in the PIM sparse-mode network instead of on the border router; source S1 sends to group G. Multicast tables: Router A: NULL; Router B: (*, G), (S1, G); Router C: (*, G), (S1, G).]

Not always possible. (e.g. Multiple RPs, Multi-homing)

Solution 1:
One of the easiest solutions to this problem is to always terminate the DVMRP tunnel (or border) on the RP. This will cause any traffic sent by sources in the PIM-SM cloud to flow to the RP (using normal PIM-SM forwarding rules) and then be flooded across the interface to the DVMRP neighbor. Unfortunately, this solution is not always possible as there may be multiple candidate RPs in the network or other network topology issues may preclude this approach.


Sparse Mode Boundary Issue


Solution 2: Use Dense mode between Border & RP

[Figure: the mrouted router (receive-only host, group G) connects to the border router; ip pim dense-mode is configured between the border router and the RP in the PIM sparse-mode network; source S1 sends to group G. Multicast tables: Router A: (*, G), (S1, G), (S2, G); Router B: (*, G), (S1, G), (S2, G); Router C: (*, G), (S1, G), (S2, G).]

Need to minimize distance between Border & RP to keep Dense-mode cloud small. Requires careful planning of topology.

Solution 2:
Another possible (but rather ugly) solution is to use PIM-DM to extend the dense mode flooding from the RP to the DVMRP neighbor. This has some very obvious drawbacks, including dense-mode flooding across potentially large portions of the network if the distance from the RP to the border is great.


Sparse Mode Boundary Issue


Solution 3: Receiver is Sender Hack.

[Figure: a host for group G that both sends and receives sits behind the mrouted router, which connects via Tunnel0 to the border router of the PIM sparse-mode network containing the RP and source S1 for group G. Multicast tables: Router A: (*, G), (S1, G), (S2, G); Router B: (*, G), (S1, G), (S2, G); Router C: (*, G), (S1, G), (S2, G).]

Works fine for most Mbone applications since they both send and receive. Not applicable for applications that are Rcv-Only.

Solution 3:
The most often used solution is to employ the Receiver-is-a-Sender hack. This solution normally works well for legacy multicast applications such as the Mbone multimedia conferencing applications (vic, vat, rat, wb, etc.) since these applications always send back-channel RTCP traffic to the multicast group. Unfortunately, this solution can not be used reliably for applications that are truly Receive-only.


Module Agenda

  PIM-DVMRP Interoperability
  Advanced PIM-DVMRP Features
  Debugging Tips

Route Redistribution

Command for route injection:
  ip dvmrp metric <metric> [list <acl>] [<protocol> | dvmrp] [route-map <map>]
Metric 0 means don't inject
Can select routes based on:
  Routing protocol
  route-map specification
  Enumeration using access-lists

Route Redistribution
By default, only Connected routes are advertised by a router in DVMRP route updates. This behavior can be overridden with the following interface command:
ip dvmrp metric <metric> [list <acl>] [<protocol> | dvmrp] [route-map <map>]
This command can be used multiple times on an interface. The <metric> field is in hops and is normally set to a value of 1. A metric of 0 has a special meaning: any matching routes are not to be advertised. Either ACLs or Route-Maps may be used to match the routes in the Unicast routing table that are to be advertised.
If <protocol> (with its process ID, where applicable) is configured, only routes learned by the specified routing protocol will be advertised in DVMRP Report messages. This parameter can be used in conjunction with list <acl> so that a selective list of destinations learned from a given routing protocol is reported. If this command is not used, only directly connected networks are advertised when DVMRP neighbors are discovered.
If the "dvmrp" keyword is configured, only routes from the DVMRP routing table will be selected to be advertised with <metric>.
Warning: Care should be used when configuring this command, as it is easy to create route loops whenever route redistribution is used.
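The selection logic can be illustrated with a small model. Several details here are assumptions made for the sketch and are not confirmed by the text above: rules are evaluated in order with the first match deciding, unmatched connected routes are advertised with metric 1, and the rule and route dictionaries are hypothetical stand-ins for the ACL and protocol arguments.

```python
def select_routes(unicast_routes, rules):
    """Return prefix -> metric for the routes that would be advertised.

    unicast_routes: list of {"prefix": str, "protocol": str}
    rules: list of {"metric": int, "protocol": str or None, "acl": set or None},
           one per hypothetical "ip dvmrp metric" line.
    """
    adverts = {}
    for route in unicast_routes:
        for rule in rules:
            if rule.get("protocol") not in (None, route["protocol"]):
                continue  # rule restricted to another routing protocol
            if rule.get("acl") is not None and route["prefix"] not in rule["acl"]:
                continue  # route not permitted by this rule's list
            if rule["metric"] > 0:
                adverts[route["prefix"]] = rule["metric"]
            break  # assumption: first matching rule decides; metric 0 suppresses
        else:
            # No rule matched: only connected routes are advertised (metric 1 assumed).
            if route["protocol"] == "connected":
                adverts[route["prefix"]] = 1
    return adverts

routes = [
    {"prefix": "135.1.2.0/24", "protocol": "connected"},
    {"prefix": "10.1.0.0/16", "protocol": "ospf"},
    {"prefix": "10.2.0.0/16", "protocol": "ospf"},
]
rules = [
    {"metric": 0, "protocol": "ospf", "acl": {"10.2.0.0/16"}},  # do not advertise
    {"metric": 1, "protocol": "ospf", "acl": None},             # advertise the rest
]
adverts = select_routes(routes, rules)
print(adverts)
```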


Route Redistribution
Preventing redistribution of DVMRP routes back into the DVMRP cloud to avoid being a transit network.

[Figure: two mrouted routers in the DVMRP cloud each connect to a Cisco border router in the PIM cloud. Each border interface toward the DVMRP cloud is configured with ip dvmrp metric 0 dvmrp, and the two Cisco routers exchange DVMRP routes with each other using ip dvmrp unicast-routing.]

Avoid being a Transit DVMRP Network


It is possible to avoid becoming a DVMRP transit network by using the metric 0 form of the ip dvmrp metric interface command. In the example above, the two Cisco routers in the bottom network are exchanging DVMRP routes by use of the ip dvmrp unicast-routing command. This would normally result in DVMRP routes learned via one router being advertised back into the DVMRP cloud by the other, which could turn the bottom network into a transit DVMRP network. With the ip dvmrp metric 0 dvmrp command configured on the border interfaces as shown above, the Cisco routers will not advertise routes from their DVMRP routing table to the DVMRP cloud. This prevents the network from becoming a possible transit network for the DVMRP network.


Route Redistribution
You can specify what you want to receive:
  ip dvmrp accept-filter <acl> [<distance>]
And from whom:
  ip dvmrp accept-filter <acl> [neighbor-list <acl>] [<distance>]
  Note: DVMRP Probes are ignored from DVMRP neighbors that are denied in the neighbor-list <acl>.
    This can be used to disable automatic PIM-DVMRP interoperability on an interface.
And what metric to add in:
  ip dvmrp metric-offset [in | out] <increment>

Route Redistribution Tuning


ip dvmrp accept-filter <acl> [<distance>]
This interface command controls which DVMRP routes will be accepted based on the specified ACL. An optional <distance> value may be used to set the Admin. Distance of the matching DVMRP routes to some value other than the default of zero.
ip dvmrp accept-filter <acl> [neighbor-list <acl>] [<distance>]
This form of the accept-filter interface command works identically to the previous form, with the addition that it can be used to control which DVMRP routes will be accepted from which DVMRP neighbor. The neighbor-list ACL is used to accept or ignore updates from certain DVMRP neighbors. Note: DVMRP Probes are also ignored from any DVMRP neighbors that are denied (either explicitly or implicitly) by the neighbor-list ACL. This feature can be used to disable the automatic PIM-DVMRP interoperability on an interface by configuring a "deny all" ACL for the neighbor-list.
ip dvmrp metric-offset [in | out] <increment>
This interface command can be used to offset the metrics of the incoming or outgoing DVMRP route updates by the amount specified by <increment>.
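How these three controls combine on an incoming Report can be sketched as follows. The function and parameter names are hypothetical, ACLs are modeled as ordered (action, network) pairs with first-match semantics and an implicit deny, and the offset default of 0 is only the sketch's own default, not a statement about IOS defaults.

```python
import ipaddress

def permitted(acl, address):
    """First matching ACL entry decides; anything unmatched is implicitly denied."""
    addr = ipaddress.ip_address(address)
    for action, network in acl:
        if addr in ipaddress.ip_network(network):
            return action == "permit"
    return False

def accept_routes(report, neighbor, route_acl, neighbor_acl=None,
                  distance=0, offset_in=0):
    """Filter one incoming DVMRP Report (prefix -> metric) from a neighbor."""
    if neighbor_acl is not None and not permitted(neighbor_acl, neighbor):
        return {}  # neighbor is ignored entirely
    accepted = {}
    for prefix, metric in report.items():
        network_address = prefix.split("/")[0]
        if permitted(route_acl, network_address):
            accepted[prefix] = {"metric": metric + offset_in, "distance": distance}
    return accepted

report = {"172.21.32.0/24": 2, "130.1.0.0/16": 3}
route_acl = [("permit", "172.21.0.0/16")]

accepted = accept_routes(report, "135.1.22.98", route_acl, offset_in=1)
ignored = accept_routes(report, "135.1.22.98", route_acl,
                        neighbor_acl=[("deny", "0.0.0.0/0")])
print(accepted, ignored)
```

The second call mirrors the "deny all" neighbor-list case described above: every neighbor is rejected, so no DVMRP routes are accepted on the interface.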


Route Redistribution

Control rate at which reports are sent.
  ip dvmrp output-report-delay <delay> [<burst>]
    Defaults: delay = 100ms; burst = 2
    Added IOS 11.2(9)
    Prior releases: delay = 0; burst = infinity

Route Redistribution Tuning


ip dvmrp output-report-delay <delay> [<burst>]
This interface command was first introduced in IOS 11.2(9) and can be used to control the rate at which DVMRP route updates are transmitted. Prior to this release, Cisco routers would simply transmit all DVMRP routes in one continuous burst every DVMRP update interval (60 seconds). This often overran the DVMRP neighbor's input buffers, which resulted in lost update messages. This, in turn, caused instabilities in the DVMRP network as routes would time out and go into holddown. To solve this problem, the ip dvmrp output-report-delay command was introduced to provide pacing of the DVMRP route updates. If this command is not configured, the default values for the delay and burst size are 100ms and 2 packets.
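The pacing can be pictured as a simple schedule. The sketch below assumes that <delay> is the gap between successive bursts and <burst> is the number of report packets per burst; the function name is hypothetical.

```python
def report_schedule(num_packets, delay_ms=100, burst=2):
    """Return the send-time offset in milliseconds of each report packet."""
    return [(i // burst) * delay_ms for i in range(num_packets)]

# Five report packets with the default pacing: two per burst, 100 ms apart.
paced = report_schedule(5)
# Behavior before pacing was introduced: everything goes out at once.
unpaced = report_schedule(5, delay_ms=0, burst=5)
print(paced, unpaced)
```

With the defaults, a report that needs N packets finishes ((N - 1) // 2) * 100 ms after it starts, which spreads the load on the neighbor's input buffers.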


Default Route Origination

You can originate the DVMRP default route with other routes
  ip dvmrp default-information originate
You can originate the DVMRP default route only
  ip dvmrp default-information only
  Good for stub networks and low-speed links; saves bandwidth and state

Default Route Origination


ip dvmrp default-information originate
This interface command controls the origination of a default DVMRP route. If this command is configured, the router will originate a DVMRP route for 0.0.0.0 out this interface in addition to the other DVMRP routes that would normally be sent.
ip dvmrp default-information only
This interface command causes the router to originate only the DVMRP default route out this interface. No other DVMRP routes will be sent. Note: This is very useful when connecting to DVMRP stub networks, as it reduces the bandwidth and routing state consumed in the stub network router(s).


Summary Origination
Classful summarization is on by default
  As long as the subnets are from a different network number than the network number of the interface the advertisement is being sent out on
You can turn auto summarization off
  no ip dvmrp auto-summary

Summary Origination
Classful summarization is on by default for DVMRP. This means that subnets whose network number differs from the network number of the outgoing interface are automatically summarized into their classful network. This behavior may be disabled using the following command:
no ip dvmrp auto-summary
This interface command turns off the classful summarization of routes across an interface.
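A minimal sketch of this rule, using Python's standard ipaddress module. The function names and sample prefixes are hypothetical, and only class A, B and C unicast ranges are handled.

```python
import ipaddress

def classful_network(prefix):
    """Return the classful (major) network that contains the given prefix."""
    net = ipaddress.ip_network(prefix)
    first_octet = int(net.network_address) >> 24
    if first_octet < 128:
        length = 8     # class A
    elif first_octet < 192:
        length = 16    # class B
    else:
        length = 24    # class C (other ranges are not modeled here)
    if net.prefixlen <= length:
        return net     # already at or above the classful boundary
    return net.supernet(new_prefix=length)

def auto_summarize(routes, interface_prefix):
    """Summarize subnets of other major networks; keep subnets of the interface's own."""
    interface_major = classful_network(interface_prefix)
    adverts = []
    for prefix in routes:
        major = classful_network(prefix)
        advert = ipaddress.ip_network(prefix) if major == interface_major else major
        if advert not in adverts:
            adverts.append(advert)
    return [str(net) for net in adverts]

routes = ["135.1.2.0/24", "172.21.32.0/24", "172.21.33.0/24", "10.1.1.0/24"]
adverts = auto_summarize(routes, "135.1.3.0/24")
print(adverts)
```

The subnet of 135.1.0.0 is advertised as-is because the outgoing interface belongs to the same major network, while the other subnets collapse to their classful networks.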


Summary Origination

Custom classless summarization


ip dvmrp summary-address <address> <mask> metric <metric>


Summary Origination
ip dvmrp summary-address <address> <mask> metric <metric>
This interface command configures a summary address to be advertised out the interface. If there is at least one more specific route in the unicast routing table that matches the <address>/<mask>, the summary will be advertised. (Note: Routes in the DVMRP routing table are not candidates for summarization.) When the metric keyword is supplied, the summary will be advertised with the metric specified by <metric>. The default metric is 1. Multiple summary addresses can be configured on an interface. When multiple overlapping summary addresses are configured on an interface, the one with the longest mask takes precedence. (This command was first introduced in IOS version 11.2.)
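The two rules above (a summary is sent only if a more specific unicast route exists, and the longest mask wins among overlapping summaries) can be sketched as follows. The function name and prefixes are hypothetical, and the sketch interprets "longest mask takes precedence" as each route being attributed to its longest matching summary, which is an assumption.

```python
import ipaddress

def apply_summaries(unicast_routes, summaries):
    """Return (adverts, uncovered).

    summaries maps summary prefix -> metric. adverts holds the summaries that
    are triggered; uncovered lists routes matched by no configured summary.
    """
    ordered = sorted(
        ((ipaddress.ip_network(p), metric) for p, metric in summaries.items()),
        key=lambda item: item[0].prefixlen,
        reverse=True,  # longest mask first
    )
    adverts, uncovered = {}, []
    for prefix in unicast_routes:
        route = ipaddress.ip_network(prefix)
        for summary, metric in ordered:
            # The route must be strictly more specific than the summary.
            if route.subnet_of(summary) and route.prefixlen > summary.prefixlen:
                adverts[str(summary)] = metric
                break
        else:
            uncovered.append(prefix)
    return adverts, uncovered

unicast = ["10.1.1.0/24", "10.1.2.0/24", "10.2.0.0/16", "192.168.5.0/24"]
summaries = {"10.0.0.0/8": 2, "10.1.0.0/16": 1, "172.16.0.0/12": 1}
adverts, uncovered = apply_summaries(unicast, summaries)
print(adverts, uncovered)
```

The 172.16.0.0/12 summary is never advertised because no more specific unicast route falls inside it.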


Legacy MBONE Features

Limiting number of routes advertised
  ip dvmrp route-limit <route-count>
Generating route-hog Warnings
  ip dvmrp routehog-notification <route-count>
Ignore DVMRP neighbors that don't support Pruning.
  ip dvmrp reject-non-pruners

Legacy MBONE Features


The following commands are considered legacy features that were primarily used when the DVMRP-based MBONE was the primary transit for multicast traffic across the Internet.
ip dvmrp route-limit <route-count>
This global command limits the number of DVMRP routes advertised over an interface enabled to run DVMRP, that is, a DVMRP tunnel, an interface where a DVMRP neighbor has been discovered, or an interface configured to run "ip dvmrp unicast-routing". The default value is 7000. This command is automatically written to the configuration file when at least one interface is enabled for multicast routing. It is necessary so that misconfigured "ip dvmrp metric" commands don't cause massive route injection into the MBONE. The "no" version of the command configures no limit. [11.0]
ip dvmrp routehog-notification <route-count>
This global command configures the number of routes allowed within an approximate one-minute interval before a syslog message is issued warning that there may be a route surge going on in the MBONE. This is typically used to quickly detect when someone has misconfigured their routers to inject a large number of routes into the MBONE. The default value is 10,000. You can find a running count in the "show ip igmp interface" display. When the count is exceeded, you'll see an "*** ALERT ***" string appended to the line. [10.2]
ip dvmrp reject-non-pruners
This interface command causes the router not to peer with a DVMRP neighbor if the neighbor doesn't support DVMRP Pruning/Grafting. This command was added so that the policy of no support for non-Pruning DVMRP versions could be enforced in the Internet.


Debugging Tips

Example Network

[Figure: an mrouted router in the DVMRP network connects via Tunnel0 to router pim-dvmrp-gw, which has interfaces Ethernet0 and Ethernet1. A host workstation at 135.1.2.100 sits on the Ethernet1 subnet.]

pim-dvmrp-gw:
interface tunnel0
 ip unnumbered ethernet0
 ip pim dense-mode
 tunnel mode dvmrp
 tunnel source ethernet0
 tunnel destination 135.1.22.98
interface ethernet0
 ip addr 135.1.3.102 255.255.255.0
 ip pim dense-mode
interface ethernet1
 ip addr 135.1.2.102 255.255.255.0
 ip pim dense-mode

Example Network
The network shown in the above drawing will be used throughout this section on debugging tips. Take particular note of the Tunnel0 source and destination addresses as well as the address of the host workstation. These addresses will be important in the upcoming pages.


Debugging Tips

  Verifying the DVMRP tunnel
  Verifying DVMRP route exchange
  Verifying Multicast reception
  Verifying Multicast transmission

Verifying the DVMRP Tunnel

Using the show interface Command


pim-dvmrp-gw> show int tunnel 0
Tunnel0 is up, line protocol is up
  Hardware is Tunnel
  Interface is unnumbered. Using address of Ethernet0 (135.1.3.102)
  MTU 1500 bytes, BW 9 Kbit, DLY 500000 usec, rely 255/255, load 1/255
  Encapsulation TUNNEL, loopback not set, keepalive set (10 sec)
  Tunnel source 135.1.3.102 (Ethernet0), destination 135.1.22.98
  Tunnel protocol/transport IP/IP (DVMRP), key disabled, sequencing disabled
  Checksumming of packets disabled, fast tunneling enabled
  Last input 00:00:05, output 00:00:08, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0 (size/max/drops); Total output drops: 0
  . . .

Verifying the DVMRP Tunnel


Use the show interface Tunnel0 command to verify that Tunnel0 is up and the line protocol is also up. (This is normally the case as soon as the tunnel has been configured. However, it does not necessarily mean that the tunnel is operational.)


Verifying the DVMRP Tunnel

Using the mrinfo Command


pim-dvmrp-gw>mrinfo
135.1.3.102 [version cisco 11.2] [flags: PMA]:
  135.1.3.102 -> 0.0.0.0 [1/0/pim/querier/leaf]
  135.1.2.102 -> 135.1.2.2 [1/0/pim/querier]
  135.1.2.102 -> 135.1.2.3 [1/0/pim/querier]
  135.1.3.102 -> 135.1.22.98 [1/0/tunnel/querier]

pim-dvmrp-gw>mrinfo 135.1.22.98
135.1.22.98 [version mrouted 3.8] [flags: GPM]:
  172.21.32.98 -> 172.21.32.191 [1/1]
  172.21.32.98 -> 172.21.32.1 [1/1]
  135.1.22.98 -> 135.1.22.102 [1/1/querier]
  135.1.22.98 -> 135.1.3.102 [1/1/tunnel]

Both Ends See Each Other

Verifying the DVMRP Tunnel


The best way to verify that the tunnel is up in both directions is to use the mrinfo command on the Cisco PIM-DVMRP gateway router.
In the first example above, the mrinfo command is entered on the Cisco gateway router without any parameters. This causes the Cisco gateway router to display its own multicast interface status information. We are looking for the tunnel interface status from the Cisco gateway router to the DVMRP router at the other end of the tunnel. This is shown on the line that reads:
135.1.3.102 -> 135.1.22.98 [1/0/tunnel/querier]
The key item to look for is the absence of the word "down" in the status. This indicates that the tunnel is up in the direction from the Cisco router to the DVMRP router.
In the second example above, the mrinfo command is entered on the PIM-DVMRP gateway router with the IP address of the DVMRP router (i.e. the tunnel destination address, 135.1.22.98). This causes the DVMRP router at the other end of the tunnel to report back its multicast interface status information. We are looking for the tunnel interface status from the DVMRP router at the other end of the tunnel back toward the Cisco gateway router. This is shown on the line that reads:
135.1.22.98 -> 135.1.3.102 [1/1/tunnel]
Again, the key item to look for is the absence of the word "down" in the status. This indicates that the tunnel is up in the direction from the DVMRP router to the Cisco gateway router. Since neither end of the tunnel is displaying "down", the tunnel is up in both directions.
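When this check has to be repeated often, the captured mrinfo text can be scanned with a short script. The sketch below works only on saved output in the layout shown on the slide; the function name is hypothetical, and the assumption that a failed tunnel shows "down" as one of the slash-separated status fields is based on the description above rather than on a captured example.

```python
import re

# Matches lines such as: "135.1.3.102 -> 135.1.22.98 [1/0/tunnel/querier]"
LINK_LINE = re.compile(r"^\s*(\S+) -> (\S+) \[([^\]]*)\]\s*$")

def tunnel_status(mrinfo_output, local, remote):
    """Return "up", "down", or None if no tunnel line from local to remote exists."""
    for line in mrinfo_output.splitlines():
        match = LINK_LINE.match(line)
        if not match:
            continue  # header lines and blanks do not match
        src, dst = match.group(1), match.group(2)
        fields = match.group(3).split("/")
        if src == local and dst == remote and "tunnel" in fields:
            return "down" if "down" in fields else "up"
    return None

cisco_end = """135.1.3.102 [version cisco 11.2] [flags: PMA]:
  135.1.3.102 -> 0.0.0.0 [1/0/pim/querier/leaf]
  135.1.3.102 -> 135.1.22.98 [1/0/tunnel/querier]
"""
mrouted_end = """135.1.22.98 [version mrouted 3.8] [flags: GPM]:
  135.1.22.98 -> 135.1.22.102 [1/1/querier]
  135.1.22.98 -> 135.1.3.102 [1/1/tunnel]
"""

forward = tunnel_status(cisco_end, "135.1.3.102", "135.1.22.98")
reverse = tunnel_status(mrouted_end, "135.1.22.98", "135.1.3.102")
print(forward, reverse)
```

The tunnel is considered operational only when both directions report "up".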


Verifying DVMRP Route Exchange Using the show ip dvmrp route Command
pim -dvmrp pimdvmrp-gw gw# # show ip dvmrp route DVMRP Routing Table - 8 entries 130.1.0.0/16 [0/3] uptime 00:19:03, expires 00:02:13 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 135.1.0.0/16 [0/3] uptime 00:19:03, expires 00:02:13 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 135.1.22.0/24 [0/2] uptime 00:19:03, expires 00:02:13 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 171.69.0.0/16 [0/3] uptime 00:19:03, expires 00:02:13 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 172.21.27.0/24 [0/3] uptime 00:19:04, expires 00:02:12 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 172.21.32.0/24 [0/2] uptime 00:19:04, expires 00:02:12 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 172.21.33.0/24 [0/3] uptime 00:19:04, expires 00:02:12 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM] 172.21.120.0/24 [0/3] uptime 00:19:04, expires 00:02:12 via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM]

Verifying DVMRP route exchange


The easiest way to verify that DVMRP routes are being exchanged is to use the show ip dvmrp route command on the Cisco PIM-DVMRP gateway router. In the example above, the show ip dvmrp route command is entered on the Cisco gateway router. This causes the Cisco gateway router to display the contents of its DVMRP route table. In this case, we see that the Cisco gateway router has learned 8 DVMRP routes from the DVMRP neighbor at the other end of the tunnel. This is a pretty good indication that DVMRP routes are being exchanged via the tunnel.
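When the route table is large, counting entries by eye is error-prone. The sketch below is one possible way to parse captured show ip dvmrp route output in Python; it is not a Cisco tool. It assumes the bracketed pair on each entry is [distance/metric], and the function name parse_dvmrp_route_table is hypothetical. The sample text is shortened to two entries for illustration.

```python
import re

HEADER = re.compile(r"DVMRP Routing Table - (\d+) entr")
# Assumption: the bracketed pair after the prefix is [distance/metric].
ENTRY = re.compile(r"^(\d+\.\d+\.\d+\.\d+/\d+)\s+\[(\d+)/(\d+)\]")

def parse_dvmrp_route_table(text):
    """Return (entry count declared in the header, {prefix: metric})."""
    declared = None
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.search(line)
        if header:
            declared = int(header.group(1))
            continue
        entry = ENTRY.match(line)
        if entry:
            metrics[entry.group(1)] = int(entry.group(3))
    return declared, metrics

# Shortened sample in the same layout as the slide (two entries only).
sample = """\
DVMRP Routing Table - 2 entries
130.1.0.0/16 [0/3] uptime 00:19:03, expires 00:02:13
    via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM]
135.1.22.0/24 [0/2] uptime 00:19:03, expires 00:02:13
    via 135.1.22.98, Tunnel0, [version mrouted 3.8] [flags: GPM]
"""
declared, metrics = parse_dvmrp_route_table(sample)
```

A non-zero declared count that matches the number of parsed entries is the same indication the notes describe: routes are arriving over the tunnel.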


Verifying DVMRP Route Exchange

Using the debug ip dvmrp Command


pim-dvmrp-gw# debug ip dvmrp
DVMRP debugging is on
pim-dvmrp-gw#
Mar 20 11:39:36.335: DVMRP: Aging routes, 0 entries expired
Mar 20 11:39:41.271: DVMRP: Received Probe on Tunnel0 from 135.1.22.98
Mar 20 11:39:45.335: DVMRP: Building Report for Tunnel0 224.0.0.4
Mar 20 11:39:45.335: DVMRP: Send Report on Tunnel0 to 135.1.22.98
Mar 20 11:39:45.335: DVMRP: 2 unicast, 8 DVMRP routes advertised
Mar 20 11:39:47.335: DVMRP: Aging routes, 0 entries expired
Mar 20 11:39:51.371: DVMRP: Received Probe on Tunnel0 from 135.1.22.98
Mar 20 11:39:52.379: DVMRP: Received Report on Tunnel0 from 135.1.22.98


Verifying DVMRP route exchange


Another way to verify that DVMRP routes are being exchanged is to use the debug ip dvmrp command on the Cisco PIM-DVMRP gateway router.

In the example above, the debug ip dvmrp command is entered on the Cisco gateway router. This causes the Cisco gateway router to display the DVMRP routing protocol events.

In the first highlighted section, we see that the Cisco gateway router builds and sends a DVMRP Report message that contains 8 DVMRP routes and 2 unicast routes to the DVMRP router at the other end of the tunnel.

Note: The 8 DVMRP routes being sent are actually Poison Reverse routes for the 8 DVMRP routes received from the DVMRP neighbor. The 2 unicast routes are local unicast routes (in this case the directly connected networks on the Cisco gateway router) that are being advertised to the DVMRP neighbor in the DVMRP Report. However, this information is not obvious unless we turn on more debugging detail.

In the second highlighted section, we see that the Cisco gateway router has received a DVMRP Report via Tunnel0 from the DVMRP neighbor, 135.1.22.98. Unfortunately, with this level of debug we do not know how many routes were contained in the Report.


Verifying DVMRP Route Exchange

Checking DVMRP Routes being Advertised


pim-dvmrp-gw# debug ip dvmrp detail
DVMRP debugging is on
Mar 20 11:42:45.337: DVMRP: Building Report for Tunnel0 224.0.0.4
Mar 20 11:42:45.337: DVMRP: Report 130.1.0.0/16, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 135.1.0.0/16, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 135.1.22.0/24, metric 34, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 171.69.0.0/16, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 172.21.27.0/24, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 172.21.32.0/24, metric 34, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 172.21.33.0/24, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 172.21.120.0/24, metric 35, from DVMRP table
Mar 20 11:42:45.337: DVMRP: Report 135.1.2.0/24, metric 1
Mar 20 11:42:45.337: DVMRP: Report 135.1.3.0/24, metric 1
Mar 20 11:42:45.337: DVMRP: Send Report on Tunnel0 to 135.1.22.98
Mar 20 11:42:45.337: DVMRP: 2 unicast, 8 DVMRP routes advertised

Verifying DVMRP route exchange


In order to get more detailed information, it is necessary to enter the debug ip dvmrp detail command on the Cisco PIM-DVMRP gateway router. This will result in the Cisco router reporting detailed contents of each sent and received DVMRP route report. (Warning: Care should be taken when using this debug command as the router may be overloaded if the number of DVMRP routes in these reports is large.) In the example above, the debug ip dvmrp detail command is entered on the Cisco gateway router. This causes the Cisco gateway router to display the DVMRP routing protocol events as well as the contents of the DVMRP Report messages. In the example above, we see the Cisco gateway router build and send a DVMRP Report message that contains 8 DVMRP routes and 2 unicast routes to the DVMRP router at the other end of the tunnel. Notice that the 8 DVMRP routes have a metric greater than 32 (infinity) which indicates that these routes are Poison-Reverse routes. The two unicast routes being advertised in this DVMRP report are highlighted in the above example. Notice that these are the two connected networks on the Cisco gateway router. (Refer to the Example network drawing in a previous slide.)
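The metric rule in these notes can be expressed as a small helper. This is a sketch rather than Cisco's implementation: it assumes, as in DVMRP generally, that a Poison-Reverse route is advertised with the learned metric plus infinity (32). That assumption is consistent with the example, where 130.1.0.0/16 has metric 3 in the route table and is reported back with metric 35. The function name classify_dvmrp_metric is hypothetical.

```python
DVMRP_INFINITY = 32  # the metric value the notes call "infinity"

def classify_dvmrp_metric(metric):
    """Return (kind, base_metric) for a metric seen in a DVMRP Report."""
    if metric > DVMRP_INFINITY:
        # Poison Reverse: infinity was added to the metric the route
        # was originally learned with (assumption stated in the lead-in).
        return ("poison-reverse", metric - DVMRP_INFINITY)
    if metric == DVMRP_INFINITY:
        return ("unreachable", None)
    return ("normal", metric)

# Metrics taken from the debug output above.
report_for_130_1 = classify_dvmrp_metric(35)   # a DVMRP route sent back to its source
report_for_135_1_2 = classify_dvmrp_metric(1)  # a connected unicast route
```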


Verifying DVMRP Route Exchange

Checking DVMRP Routes being Received


pim-dvmrp-gw# debug ip dvmrp detail
DVMRP debugging is on
... : DVMRP: Received Report on Tunnel0 from 135.1.22.98
... : DVMRP: Origin 130.1.0.0/16, metric 2, metric-offset 1, distance 0
... : DVMRP: Origin 135.1.0.0/16, metric 2, metric-offset 1, distance 0
... : DVMRP: Origin 171.69.0.0/16, metric 2, metric-offset 1, distance 0
... : DVMRP: Origin 135.1.2.0/24, metric 34, metric-offset 1, infinity
... : DVMRP: Origin 135.1.3.0/24, metric 34, metric-offset 1, infinity
... : DVMRP: Origin 135.1.22.0/24, metric 1, metric-offset 1, distance 0
... : DVMRP: Origin 172.21.27.0/24, metric 2, metric-offset 1, distance 0
... : DVMRP: Origin 172.21.32.0/24, metric 1, metric-offset 1, distance 0
... : DVMRP: Origin 172.21.33.0/24, metric 2, metric-offset 1, distance 0
... : DVMRP: Origin 172.21.120.0/24, metric 2, metric-offset 1, distance 0


Verifying DVMRP route exchange


In the example above, we again see an example of some debug ip dvmrp detail output from the Cisco PIM-DVMRP gateway router. In this example, we see the Cisco gateway router has received a DVMRP Report message that contains a total of 10 routes. Notice that 8 of the routes have a metric less than 32 (infinity), which indicates that these routes are normal DVMRP routes that are being advertised by the DVMRP neighbor. Notice also the two routes with a metric of 34. These are Poison-Reverse routes being sent from the DVMRP neighbor back toward the Cisco router. These are the two unicast routes that were originally advertised to the DVMRP neighbor by the Cisco gateway router. (Refer to the Example network drawing in a previous slide.)


Debugging Tips

Verifying the DVMRP tunnel
Verifying DVMRP route exchange
Verifying Multicast reception
Verifying Multicast transmission


Verifying Multicast Reception


Use SDR multicasts (224.2.127.254) as a test signal
Enable ip sdr listen on an interface
Use ip sap listen on newer versions of IOS

Should begin seeing entries in mroute table


Use show ip mroute summary to list

Should begin seeing SDR cache entries


Use show ip sdr to list
Use debug ip sd to observe SDR packets
Note: Assumes SDR is being used in the old DVMRP cloud to announce sessions.

Verifying Multicast Reception


The key to verifying multicast reception is to first know of an active source somewhere in the network that can be used as a multicast beacon. As far as applications go, SDR is almost always active somewhere in the Internet and often is active in other multicast networks as well. Therefore, if there are no other known active multicast applications, the SDR application is often a good choice.

In order to use the SDR application as a multicast test beacon, the ip sdr listen interface command can be configured on one of the router interfaces. (Note: On newer versions of IOS, the command has been renamed to ip sap listen.) This will cause the router to join the SDR group (224.2.127.254) and begin receiving traffic from any SDR sources in the network.

Once the router has joined the SDR multicast group, you should begin to see (S,G) entries in the mroute table for sources sending to this group. This will confirm that you are receiving multicast traffic.

In addition to the mroute entries, the router will cache SDR Session Announcements for advertised multimedia sessions. The contents of this cache can be displayed by using the show ip sdr command. (Alternatively, you can enable debug ip sd and see these Session Announcements being processed by the router.) This will also confirm that you are receiving multicast traffic.


Verifying Multicast Reception


Should begin seeing entries in mroute table
pim-dvmrp-gw# show ip mroute summary 224.2.127.254
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
Timers: Uptime/Expires
Interface state: Interface, Next-Hop, State/Mode

(*, 224.2.127.254), 00:08:07/00:02:56, RP 0.0.0.0, flags: SJC
  (128.32.131.87/32, 224.2.127.254), 00:02:30/00:00:26, flags: CT
  (129.89.142.50/32, 224.2.127.254), 00:02:35/00:00:19, flags: CT
  (171.69.58.109/32, 224.2.127.254), 00:02:40/00:00:21, flags: CT
  . . .

Verifying Multicast Reception


The slide above is an example of the show ip mroute summary command being used to display a list of (S,G) entries for the SDR multicast group (224.2.127.254). This tells us that we have successfully received multicast traffic from these sources which in turn verifies that we have multicast reception.


Verifying Multicast Reception


Should begin seeing SDR cache entries
pim-dvmrp-gw# show ip sdr
SDR Cache - 249 entries
  Michigan State University Instructional Television
  !CannesCast '97 - 50th anniversary
  Aberdeen University, Scotland
  ACM 97
  Alan Kay: Georgia Tech Distinguished Lecture
  AmiNet
  Argonne Petroleum Seminar Series
  as/cd discussions
  ATM setup Mannheim-Bonn
  Berkeley Audio for Sunday
  Basler Fasnacht 1997 !
  . . .

Verifying Multicast Reception


The slide above is an example of the show ip sdr command being used to display a list of the names of the multimedia sessions learned via the SDR multicast group (224.2.127.254). This tells us that we have successfully received Session Announcement multicast traffic from the sources advertising the sessions. This, in turn, verifies that we have multicast reception.


Debugging Tips

Verifying the DVMRP tunnel
Verifying DVMRP route exchange
Verifying Multicast reception
Verifying Multicast transmission


Verifying Multicast Transmission

Use sdr & rat MBONE applications


Run sdr on WS or PC
Join a Well-Known rat session

Check mroute table at pim-dvmrp-gw


Should have (S, G) entry for WS or PC


Verifying Multicast Transmission


Verifying multicast transmission via the DVMRP tunnel is a little more complex. First it is necessary to activate a source somewhere inside your network for a multicast group that you know has active receivers in the DVMRP network. Again, SDR is almost always active somewhere in the Internet and often is active in other multicast networks as well. Therefore, if there are no other known active multicast applications, the SDR application is often a good choice. Another choice is to use the SDR application to identify an active multimedia session that may have active receivers.

The next step is to activate a workstation or PC and run the SDR application or the RAT (Robust Audio Tool) to source multicast traffic from your network.

In order to use the SDR application as a multicast test source, you need to activate the SDR application on a PC or workstation and create a new multimedia session. This will cause the PC/WS to begin sourcing SDR Session Announcements to the SDR multicast group (224.2.127.254). Assuming that there are active SDR receivers in the DVMRP network (or beyond), this SDR multicast traffic will begin flowing across the tunnel. This should be verified by making sure that there is an (S,G) mroute entry for the PC/WS on the PIM-DVMRP gateway router. Furthermore, the Tunnel interface should appear in the outgoing interface list in an unpruned (Forwarding) state.

An alternative to using SDR is to launch the RAT (Robust Audio Tool) audio conferencing tool for an active multimedia audio session that is being advertised by SDR. Assuming that there are active receivers in the DVMRP network (or beyond) for this session, this RAT multicast traffic will begin flowing across the tunnel. This should be verified by making sure that there is an (S,G) mroute entry for the PC/WS on the PIM-DVMRP gateway router. Furthermore, the Tunnel interface should appear in the outgoing interface list in an unpruned (Forwarding) state.


Verifying Multicast Transmission


Should Have (S, G) Entry for WS or PC
pim-dvmrp-gw> show ip mroute summary
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
Timers: Uptime/Expires
Interface state: Interface, Next-Hop, State/Mode

(*, 224.2.0.1), 00:08:07/00:02:56, RP 0.0.0.0, flags: DJC
  (13.2.116.11/32, 224.2.0.1), 08:11:47/00:02:55, flags: PCT
  (128.16.64.19/32, 224.2.0.1), 16:05:41/00:02:52, flags: PCT
  (129.99.50.40/32, 224.2.0.1), 1d08h/00:02:57, flags: PCT
  (134.164.1.2/32, 224.2.0.1), 01:42:58/00:02:57, flags: PCT
  (135.1.2.100/32, 224.2.0.1), 00:05:40/00:00:43, flags: CLT
  (138.25.8.74/32, 224.2.0.1), 13:16:05/00:02:56, flags: PCT
  (171.69.58.109/32, 224.2.127.254), 00:02:40/00:00:21, flags: PCT


Verifying Multicast Transmission


The slide above is an example of the show ip mroute summary command being used to verify that an (S,G) entry for the sending PC/WS for the RAT multi-media session exists in the PIM-DVMRP gateway. Notice the highlighted entry which corresponds to the IP address of the WS in the initial example network drawing. This tells us that we are successfully sending multicast traffic from this WS to the PIM-DVMRP gateway. In order to verify that the WS traffic is flowing across the Tunnel, the outgoing interface list of the highlighted entry would be checked. If the interface Tunnel0 is in the outgoing interface list in Forwarding state, it verifies that we are sending multicast traffic across the tunnel.


Multiprotocol BGP (MBGP)


Module 10

Module10.ppt

© 1998-2001, Cisco Systems, Inc. All rights reserved.

8/14/2001 3:35 PM

Copyright © 1998-2000, Cisco Systems, Inc.

Module Objectives

Understand the basic concepts of BGP
Explain the MBGP extensions to BGP
Identify steps associated with configuring and debugging MBGP


Agenda
BGP Review
MBGP Overview
MBGP Update Messages
MBGP Capability Negotiation
MBGP NLRI Information
Advanced MBGP Features
New 12.1 MBGP Syntax
Debugging MBGP

BGP Border Gateway Protocol

Routing Protocol used between ASs
Currently Version 4
Runs over TCP
Path Vector Protocol
Incremental Updates


BGP Peers
[Figure: AS 100 (220.220.8.0/24), AS 101 (220.220.16.0/24), and AS 102 (220.220.32.0/24), with eBGP TCP/IP peer connections between routers in different ASs]

BGP speakers are called peers
Peers in different ASs are called External Peers

Note: eBGP Peers normally should be directly connected.



External BGP Peers


An external BGP peering is established when two BGP speakers in different Autonomous Systems (AS) are connected via a TCP/IP connection. External BGP peers are normally directly connected to each other. This simplifies the establishment of the TCP/IP session since the two BGP peers do not have to rely on an IGP or static routing to establish the TCP/IP path.


BGP Peers
[Figure: AS 100 (220.220.8.0/24), AS 101 (220.220.16.0/24), and AS 102 (220.220.32.0/24), with iBGP TCP/IP peer connections between routers in the same AS]

BGP speakers are called peers
Peers in the same AS are called Internal Peers

Note: iBGP Peers don't have to be directly connected.

Internal BGP Peers


An internal BGP peering is established when two BGP speakers in the same Autonomous System (AS) are connected via a TCP/IP connection. Internal BGP peers are normally not directly connected to each other. For example, it is assumed that routers A & B in the above drawing are some distance apart and are not directly connected. This means that IGP routing information must be available in order to establish the TCP/IP session between the two iBGP peers.


BGP Peers
[Figure: AS 100 (220.220.8.0/24), AS 101 (220.220.16.0/24), and AS 102 (220.220.32.0/24), with BGP Update messages exchanged between peers]

BGP Peers exchange Update messages containing Network Layer Reachability Information (NLRI)

BGP Update Messages


Routing information is exchanged between BGP peers in the form of BGP Update messages which contain Network Layer Reachability Information (NLRI).


Configuring BGP Peers


[Figure: AS 100 (220.220.8.0/24) and AS 101 (220.220.16.0/24) connected by an eBGP TCP connection over 222.222.10.0/30; the AS 100 end of the link is .2 and the AS 101 end is .1]

Router B (AS 100):

interface Serial 0
 ip address 222.222.10.2 255.255.255.252
router bgp 100
 network 220.220.8.0 mask 255.255.255.0
 neighbor 222.222.10.1 remote-as 101

Router C (AS 101):

interface Serial 0
 ip address 222.222.10.1 255.255.255.252
router bgp 101
 network 220.220.16.0 mask 255.255.255.0
 neighbor 222.222.10.2 remote-as 100

BGP Peering sessions are established using the BGP neighbor configuration command
External (eBGP) is configured when AS numbers are different

Configuring BGP Peers


The BGP neighbor configuration command is used to establish the TCP/IP connection between two BGP peers. For example, the following command instructs Router B to open a TCP/IP session to 222.222.10.1 (Router C):

neighbor 222.222.10.1 remote-as 101

Likewise, the following command instructs Router C to open its end of the TCP/IP session to Router B via the 222.222.10.2 IP address:

neighbor 222.222.10.2 remote-as 100

The BGP peering established by these two commands will be an External BGP session because the AS numbers are different (i.e., Routers B & C are in different ASs).


Configuring BGP Peers


[Figure: AS 100 (220.220.8.0/24) and AS 101 (220.220.16.0/24) connected over 222.222.10.0/30, with an iBGP TCP connection between the two routers inside AS 101 (.2 and .1)]

Router C (AS 101):

interface Serial 1
 ip address 220.220.16.2 255.255.255.252
router bgp 101
 network 220.220.16.0 mask 255.255.255.0
 neighbor 220.220.16.1 remote-as 101

Router D (AS 101):

interface Serial 1
 ip address 220.220.16.1 255.255.255.252
router bgp 101
 network 220.220.16.0 mask 255.255.255.0
 neighbor 220.220.16.2 remote-as 101

BGP Peering sessions are established using the BGP neighbor configuration command
External (eBGP) is configured when AS numbers are different
Internal (iBGP) is configured when AS numbers are same

Configuring BGP Peers


In this example, the following command instructs Router C to open a TCP/IP session to 220.220.16.1 (Router D):

neighbor 220.220.16.1 remote-as 101

Likewise, the following command instructs Router D to open its end of the TCP/IP session to Router C via the 220.220.16.2 IP address:

neighbor 220.220.16.2 remote-as 101

The BGP peering established by these two commands will be an Internal BGP session because the AS numbers are the same (i.e., Routers C & D are in the same AS).
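The eBGP/iBGP distinction comes down to comparing the local AS number with the remote-as value of each neighbor statement. The Python sketch below illustrates that rule against a configuration fragment; the function name classify_neighbors is hypothetical, and the parser only understands the two command forms used in these slides.

```python
import re

def classify_neighbors(config):
    """Map each neighbor address to 'eBGP' or 'iBGP' for one router's config."""
    local_as = None
    sessions = {}
    for raw in config.splitlines():
        line = raw.strip()
        router = re.match(r"router bgp (\d+)$", line)
        if router:
            local_as = int(router.group(1))
            continue
        neighbor = re.match(r"neighbor (\S+) remote-as (\d+)$", line)
        if neighbor and local_as is not None:
            remote_as = int(neighbor.group(2))
            # Same AS number on both sides means an internal session.
            sessions[neighbor.group(1)] = "iBGP" if remote_as == local_as else "eBGP"
    return sessions

# A router in AS 101 with both neighbor statements from the slides.
router_c_config = """\
router bgp 101
 network 220.220.16.0 mask 255.255.255.0
 neighbor 222.222.10.2 remote-as 100
 neighbor 220.220.16.1 remote-as 101
"""
router_c_sessions = classify_neighbors(router_c_config)
```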


Configuring BGP Peers


[Figure: routers in AS 100, including Router B, joined by iBGP TCP/IP peer connections]

Each iBGP speaker must peer with every other iBGP speaker in the AS

Configuring BGP Peers


Normally, all Internal BGP (iBGP) speakers in an AS must peer with every other iBGP speaker in the AS. In other words, all iBGP speakers in the AS must be fully meshed. Note: Special BGP techniques such as Route Reflectors and BGP Confederations may be employed to avoid having to fully mesh all iBGP speakers in an AS. However, the details of these techniques are beyond the scope of this tutorial. Consult other documentation on Deploying BGP.
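The cost of the full-mesh requirement is easy to quantify: n iBGP speakers need n(n-1)/2 sessions. The following Python sketch enumerates them; the function name full_mesh_sessions and the router names are illustrative only.

```python
from itertools import combinations

def full_mesh_sessions(speakers):
    """Return every pair of iBGP speakers that must peer in a full mesh."""
    return list(combinations(speakers, 2))

three_router_mesh = full_mesh_sessions(["A", "B", "C"])
# Ten speakers already need 10 * 9 / 2 = 45 sessions.
ten_router_session_count = len(full_mesh_sessions(range(10)))
```

This quadratic growth is the reason techniques such as Route Reflectors and Confederations exist.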


Configuring BGP Peers


[Figure: three routers in AS 100 with loopback addresses 215.10.7.1, 215.10.7.2, and 215.10.7.3, joined by iBGP TCP/IP peer connections]

Loopback interfaces are normally used as peer connection end-points



Configuring BGP Peers


Loopback interfaces are often used as the peering address in order to prevent BGP peering sessions from being lost when a single physical interface goes down.


Configuring BGP Peers


[Figure: three routers in AS 100 with loopback addresses 215.10.7.1, 215.10.7.2, and 215.10.7.3, joined by iBGP TCP/IP peer connections]

Router A:

interface loopback 0
 ip address 215.10.7.1 255.255.255.255
router bgp 100
 network 220.220.1.0
 neighbor 215.10.7.2 remote-as 100
 neighbor 215.10.7.2 update-source loopback0
 neighbor 215.10.7.3 remote-as 100
 neighbor 215.10.7.3 update-source loopback0


Configuring BGP Peers


In this example, Router A has been configured to establish iBGP sessions to routers B and C using the address of their Loopback interfaces 215.10.7.2 and 215.10.7.3, respectively. This is accomplished by the following commands:

neighbor 215.10.7.2 remote-as 100
neighbor 215.10.7.2 update-source loopback0
neighbor 215.10.7.3 remote-as 100
neighbor 215.10.7.3 update-source loopback0

Notice that the update-source commands are used in the above configuration so that router A will make the TCP/IP connections using the address of its Loopback interface, 215.10.7.1. If this is not done, the TCP/IP connection will not come up as there will be a mismatch in source/destination addresses.


Configuring BGP Peers


[Figure: three routers in AS 100 with loopback addresses 215.10.7.1, 215.10.7.2, and 215.10.7.3, joined by iBGP TCP/IP peer connections]

Router B:

interface loopback 0
 ip address 215.10.7.2 255.255.255.255
router bgp 100
 network 220.220.5.0
 neighbor 215.10.7.1 remote-as 100
 neighbor 215.10.7.1 update-source loopback0
 neighbor 215.10.7.3 remote-as 100
 neighbor 215.10.7.3 update-source loopback0


Configuring BGP Peers


In this example, Router B has been configured to establish iBGP sessions to routers A and C using the address of their Loopback interfaces 215.10.7.1 and 215.10.7.3, respectively. This is accomplished by the following commands:

neighbor 215.10.7.1 remote-as 100
neighbor 215.10.7.1 update-source loopback0
neighbor 215.10.7.3 remote-as 100
neighbor 215.10.7.3 update-source loopback0

Notice that the update-source commands are used in the above configuration so that router B will make the TCP/IP connections using the address of its Loopback interface, 215.10.7.2. If this is not done, the TCP/IP connection will not come up as there will be a mismatch in source/destination addresses.


Configuring BGP Peers


[Figure: three routers in AS 100 with loopback addresses 215.10.7.1, 215.10.7.2, and 215.10.7.3, joined by iBGP TCP/IP peer connections]

Router C:

interface loopback 0
 ip address 215.10.7.3 255.255.255.255
router bgp 100
 network 220.220.1.0
 neighbor 215.10.7.1 remote-as 100
 neighbor 215.10.7.1 update-source loopback0
 neighbor 215.10.7.2 remote-as 100
 neighbor 215.10.7.2 update-source loopback0

Configuring BGP Peers


In this final example, Router C has been configured to establish iBGP sessions to routers A and B using the address of their Loopback interfaces 215.10.7.1 and 215.10.7.2, respectively. This is accomplished by the following commands:

neighbor 215.10.7.1 remote-as 100
neighbor 215.10.7.1 update-source loopback0
neighbor 215.10.7.2 remote-as 100
neighbor 215.10.7.2 update-source loopback0

Notice that the update-source commands are used in the above configuration so that router C will make the TCP/IP connections using the address of its Loopback interface, 215.10.7.3. If this is not done, the TCP/IP connection will not come up as there will be a mismatch in source/destination addresses.
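The address mismatch that update-source avoids can be illustrated with a toy model in Python. This is a simplification, not a description of the IOS implementation: it models a single inbound connection attempt and assumes the receiving router accepts it only when the source address matches one of its neighbor statements. The physical interface address used here (192.0.2.1) is a made-up placeholder, as is the function name.

```python
def inbound_connection_accepted(source_address, configured_neighbors):
    """True if the initiator's source address matches a neighbor statement."""
    return source_address in configured_neighbors

# Router B's neighbor statements, from the configuration shown earlier.
ROUTER_B_NEIGHBORS = {"215.10.7.1", "215.10.7.3"}

ROUTER_A_LOOPBACK = "215.10.7.1"
ROUTER_A_PHYSICAL = "192.0.2.1"  # hypothetical outgoing interface address

# With update-source loopback0, Router A sources the session from its loopback.
with_update_source = inbound_connection_accepted(ROUTER_A_LOOPBACK, ROUTER_B_NEIGHBORS)
# Without it, the session is sourced from the outgoing physical interface.
without_update_source = inbound_connection_accepted(ROUTER_A_PHYSICAL, ROUTER_B_NEIGHBORS)
```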


BGP Update Messages


The BGP UPDATE Message
Unfeasible Routes Length (2 Octets)
Withdrawn Routes (Variable)
    Length (1 Octet)
    Prefix (Variable)
Total Path Attribute Length (2 Octets)
Path Attributes (Variable)
    Attribute Type
    Attribute Length
    Attribute Value
Network Layer Reachability Information (Variable)
    Length (1 Octet)
    Prefix (Variable)

A BGP update is used to advertise a single feasible route to a peer, or to withdraw multiple unfeasible routes
Each update message contains attributes, like Origin, AS-Path, Next-Hop, ...

BGP Update Messages


The drawing above shows the basic format of the BGP Update Message. These messages are used to advertise a single feasible route to a peer as well as to optionally withdraw one or more unfeasible routes.

The first section of the Update Message is the Withdrawn Routes. This is a variable length field whose length is specified in the first two bytes. Each withdrawn route consists of a length and the prefix of the route being withdrawn.

The second section of the message is the Path Attributes section. This is also a variable length section whose length is specified in the first two bytes of the section. Each attribute is encoded in a TLV (Type, Length, Value) format and applies to the Network Layer Reachability Information (NLRI) contained in the next section.

The last section is the Network Layer Reachability Information (NLRI) field, which consists of a length and prefix of the advertised route.
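To make the three-section layout concrete, here is a minimal Python sketch that splits the body of an Update message into withdrawn routes, path attributes, and NLRI. It goes beyond the slide in a few details that are taken from the BGP-4 specification (RFC 4271) and should be read as assumptions here: the prefix length is counted in bits, each attribute's type field is a flags octet followed by a type-code octet, and the attribute length is two octets when the Extended Length flag (0x10) is set. The sketch handles IPv4 only, starts after the fixed BGP message header, and the function names are hypothetical.

```python
import struct

def decode_prefixes(data):
    """Decode consecutive (length, prefix) pairs; the length is a bit count."""
    prefixes = []
    pos = 0
    while pos < len(data):
        bits = data[pos]
        if bits > 32:
            raise ValueError("IPv4 prefix length out of range")
        nbytes = (bits + 7) // 8  # the prefix occupies whole octets
        raw = data[pos + 1:pos + 1 + nbytes]
        if len(raw) != nbytes:
            raise ValueError("truncated prefix")
        octets = list(raw) + [0] * (4 - nbytes)
        prefixes.append(".".join(str(o) for o in octets) + "/%d" % bits)
        pos += 1 + nbytes
    return prefixes

def parse_update_body(body):
    """Return (withdrawn prefixes, [(type code, value bytes)], NLRI prefixes)."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    pos = 2
    withdrawn = decode_prefixes(body[pos:pos + withdrawn_len])
    pos += withdrawn_len
    (attrs_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    end = pos + attrs_len
    attrs = []
    while pos < end:
        flags, type_code = body[pos], body[pos + 1]
        pos += 2
        if flags & 0x10:  # Extended Length: two-octet attribute length
            (alen,) = struct.unpack_from("!H", body, pos)
            pos += 2
        else:
            alen = body[pos]
            pos += 1
        attrs.append((type_code, body[pos:pos + alen]))
        pos += alen
    nlri = decode_prefixes(body[end:])
    return withdrawn, attrs, nlri

# Hand-built example: no withdrawn routes, ORIGIN (1), AS_PATH (2) = AS 100,
# NEXT_HOP (3) = 192.20.2.1, advertising 160.10.0.0/16.
example_body = bytes([
    0, 0,                          # Unfeasible Routes Length = 0
    0, 18,                         # Total Path Attribute Length = 18
    0x40, 1, 1, 0,                 # ORIGIN
    0x40, 2, 4, 2, 1, 0, 100,      # AS_PATH: one sequence segment, AS 100
    0x40, 3, 4, 192, 20, 2, 1,     # NEXT_HOP
    16, 160, 10,                   # NLRI: 160.10.0.0/16
])
withdrawn, attrs, nlri = parse_update_body(example_body)
```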


BGP Updates NLRI

Network Layer Reachability Information
Used to advertise feasible routes
Composed of:
    Network Prefix
    Mask Length


Network Layer Reachability Information (NLRI)


This information is the heart and soul of the BGP routing protocol. NLRI are feasible routes that are exchanged between BGP speakers. NLRI is composed of:

Mask Length (or the length of the Prefix)
Network Prefix
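As an illustration of the (Mask Length, Network Prefix) pair, the Python sketch below encodes a prefix the way it would appear on the wire. It assumes the encoding defined in the BGP-4 specification (RFC 4271): one octet holding the mask length in bits, followed by only as many prefix octets as that length requires. The function name encode_nlri is hypothetical.

```python
import ipaddress

def encode_nlri(prefix):
    """Encode 'a.b.c.d/len' as one length octet plus the significant prefix octets."""
    network = ipaddress.ip_network(prefix)
    nbytes = (network.prefixlen + 7) // 8  # trailing all-zero octets are not sent
    return bytes([network.prefixlen]) + network.network_address.packed[:nbytes]

encoded_24 = encode_nlri("220.220.8.0/24")  # length 24 followed by 220, 220, 8
encoded_16 = encode_nlri("160.10.0.0/16")   # length 16 followed by 160, 10
```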


BGP Updates Attributes


Used to convey information associated with NLRI
AS path
Next hop
Local preference
Multi-Exit Discriminator (MED)
Community
Origin
Aggregator

BGP Attributes
Attributes are associated with the NLRI (i.e. the route) being advertised. These attributes include (but are not limited to) the following:

AS path
Next-Hop
Local Preference
Multi-Exit Discriminator (MED)
Community
Origin
Aggregator

The discussion of all of these attributes is beyond the scope of this tutorial. However, the AS-Path and Next-Hop attributes are discussed in the following sections since they are fundamental to the basic operation of BGP.


AS-Path Attribute
Sequence of ASes a route has traversed
Loop detection
Apply policy

[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16), and AS 500, annotated with the two AS-Path tables below]

Network          Path
180.10.0.0/16    300 200 100
170.10.0.0/16    300 200

Network          Path
180.10.0.0/16    300 200 100
170.10.0.0/16    300 200
150.10.0.0/16    300 400


AS-Path Attribute
This attribute describes the sequence of AS numbers that must be traversed to reach the network whose prefix is advertised in the NLRI field of the Update message. As each eBGP speaker in the network forwards this Update message on to its eBGP neighbors, the local AS number is prepended to this AS-Path list.

In the above example, network 180.10.0.0/16 resides inside of AS 100. Notice that after this network prefix reaches AS 500, the AS-Path for network 180.10.0.0/16 is 300 200 100. This means that traffic destined for this network must travel to AS 300, then on to AS 200 and finally AS 100 where network 180.10.0.0 resides.

The same thing occurs for networks 170.10.0.0/16 and 150.10.0.0/16. Update messages for these networks are originated by AS 200 and AS 400, respectively. When these Update messages reach AS 500, a separate entry is maintained for each network along with its unique AS-Path.
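The prepend rule, together with the loop-detection use of the AS-Path listed on the slide, can be sketched in a few lines of Python. The function names are hypothetical; the sketch assumes the usual loop-detection rule that a speaker rejects any Update whose AS-Path already contains its own AS number.

```python
def path_sent_to_ebgp_neighbor(as_path, local_as):
    """AS-Path as advertised to an eBGP neighbor: the local AS is prepended."""
    return [local_as] + list(as_path)

def passes_loop_check(as_path, local_as):
    """Assumed loop-detection rule: reject a path that already contains our AS."""
    return local_as not in as_path

# 180.10.0.0/16 originates in AS 100 and is forwarded via AS 200 and AS 300.
path = path_sent_to_ebgp_neighbor([], 100)    # as sent by AS 100
path = path_sent_to_ebgp_neighbor(path, 200)  # as sent by AS 200
path = path_sent_to_ebgp_neighbor(path, 300)  # as received by AS 500

accepted_by_as500 = passes_loop_check(path, 500)
accepted_by_as200 = passes_loop_check(path, 200)  # would be a routing loop
```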


Next Hop Attribute


[Figure: AS 100 (160.10.0.0/16), AS 200 (150.10.0.0/16), and AS 300 (140.10.0.0/16). AS 100 and AS 200 are linked over 192.20.2.0/30, and AS 200 and AS 300 are linked over 192.10.1.0/30. BGP Update messages carry the entries below.]

Network          Next-Hop      Path
160.10.0.0/16    192.20.2.1    100

Next hop to reach a network
Usually a local network is the next hop in eBGP session


Next-Hop Attribute
The Next-Hop attribute contains the IP address of the next-hop router to which traffic destined for the network specified in the NLRI is to be sent. This is normally a directly connected network in the case of eBGP sessions. In the above example, network 160.10.0.0/16 resides in AS 100. Router A originates an Update message containing this network in the NLRI and sends this to Router B as shown. The Next-Hop attribute in the Update message contains the IP address of Router A's serial port, namely 192.20.2.1. This information instructs Router B that traffic for network 160.10.0.0/16 should be sent to 192.20.2.1 (Router A) for forwarding on to the destination.


Next Hop Attribute

[Figure: AS 100 (160.10.0.0/16) connects to AS 200 (150.10.0.0/16, Router B) over 192.20.2.0/30; AS 200 connects to AS 300 (140.10.0.0/16, Routers D and E) over 192.10.1.0/30.]

BGP Update Messages
  Network          Next-Hop      Path
  150.10.0.0/16    192.10.1.1    200
  160.10.0.0/16    192.10.1.1    200 100

Next hop to reach a network
Usually a local network is the next hop in eBGP session
Next Hop updated between eBGP Peers

Next-Hop Attribute
The Next-Hop attribute is normally updated with the local IP address of the eBGP router when an Update message is forwarded to another eBGP peer. In the above example, the Update for network 160.10.0.0/16 is being forwarded by Router C to its eBGP peer Router D. Notice that the Next-Hop attribute in the Update message has been updated to contain the IP address of Router C's serial port, namely 192.10.1.1. This information instructs Router D that traffic for network 160.10.0.0/16 should be sent to 192.10.1.1 (Router C) for forwarding on to the destination.


Next Hop Attribute

[Figure: AS 100 (160.10.0.0/16) connects to AS 200 (150.10.0.0/16, Router B) over 192.20.2.0/30; AS 200 connects to AS 300 (140.10.0.0/16, Routers D and E) over 192.10.1.0/30.]

BGP Update Messages
  Network          Next-Hop      Path
  150.10.0.0/16    192.10.1.1    200
  160.10.0.0/16    192.10.1.1    200 100

Next hop not changed between iBGP peers

Next-Hop Attribute
The Next-Hop attribute is not updated when the Update message is being sent to an iBGP peer. In the above example, the Update for network 160.10.0.0/16 is being forwarded by Router D to its iBGP peer Router E. Notice that the Next-Hop attribute for network 160.10.0.0/16 has not been updated and still contains the IP address of Router C's serial port, namely 192.10.1.1. This means that the IGP running in AS 300 must contain routing information for 192.10.1.1 so that Router E can resolve how to forward the traffic for network 160.10.0.0/16 across AS 300 to Router D.
Note: The requirement of carrying this Next-Hop information through the IGP (in this case a route to 192.10.1.1) can be eliminated by the use of the next-hop-self command at Router D. This forces Router D to update the Next-Hop attribute with its own IP address when sending the Update to its iBGP neighbor, Router E.
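The eBGP and iBGP Next-Hop rules from the last three pages can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the Route class and the advertise function are hypothetical names, and the address 10.3.0.4 used for Router D's iBGP peering is invented for illustration (the original does not give it).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: tuple


def advertise(route, local_as, local_ip, peer_as, next_hop_self=False):
    """Return the route as it would appear in the Update sent to a peer."""
    if peer_as != local_as:
        # eBGP peer: Next-Hop becomes our own address, our AS is prepended.
        return replace(route, next_hop=local_ip,
                       as_path=(local_as,) + route.as_path)
    if next_hop_self:
        # iBGP peer with next-hop-self: only the Next-Hop is rewritten.
        return replace(route, next_hop=local_ip)
    # iBGP peer, default behavior: attributes are passed on unchanged.
    return route


# Route for 160.10.0.0/16 as held inside AS 200 (learned from Router A).
at_c = Route("160.10.0.0/16", "192.20.2.1", (100,))

# Router C (AS 200, 192.10.1.1) to its eBGP peer Router D (AS 300).
at_d = advertise(at_c, local_as=200, local_ip="192.10.1.1", peer_as=300)

# Router D to its iBGP peer Router E, without and with next-hop-self.
# 10.3.0.4 is a hypothetical address for Router D.
at_e = advertise(at_d, local_as=300, local_ip="10.3.0.4", peer_as=300)
at_e_nhs = advertise(at_d, local_as=300, local_ip="10.3.0.4", peer_as=300,
                     next_hop_self=True)

print(at_d)
print(at_e)
print(at_e_nhs)
```

Without next-hop-self, Router E still sees 192.10.1.1 as the Next-Hop, which is why the IGP in AS 300 needs a route to that address.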


Next Hop Attribute (more)

IGP should carry route to next hops
Recursive route look-up
Unlinks BGP from actual physical topology
Allows IGP to make intelligent forwarding decision

Next-Hop Attribute
In general, the IGP should carry a route to the Next-Hop address, as these addresses are often outside the address space in the IGP. iBGP speakers must perform a recursive route lookup to resolve the BGP Next-Hop information to a local IGP next hop. (In other words, the iBGP router must determine the internal next hop in the direction of the iBGP speaker on the other side of the AS that advertised the network.) This uncouples BGP from the actual physical topology of the network inside the AS. As long as the IGP can find a path through the network to reach the Next-Hop address, transit traffic can be routed across the AS to the exit-point iBGP router. This also permits the IGP to make intelligent forwarding decisions based on the internal metrics set in the local network.
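A recursive lookup of this kind can be illustrated with Python's standard ipaddress module. This is a minimal sketch, not how IOS implements it: the table contents are assumptions modeled on Router E in the earlier example, and 10.3.0.1 is a hypothetical IGP neighbor address in the direction of Router D.

```python
import ipaddress


def longest_match(table, address):
    """Return the most specific prefix in table that contains address."""
    addr = ipaddress.ip_address(address)
    matches = [p for p in table if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)


def resolve(bgp_routes, igp_routes, destination):
    """Resolve destination -> BGP Next-Hop -> directly reachable IGP next hop."""
    bgp_prefix = longest_match(bgp_routes, destination)
    if bgp_prefix is None:
        return None
    bgp_next_hop = bgp_routes[bgp_prefix]
    # Second (recursive) lookup: how do we reach the BGP Next-Hop itself?
    igp_prefix = longest_match(igp_routes, bgp_next_hop)
    if igp_prefix is None:
        return None  # Next-Hop unreachable, so the BGP route cannot be used
    return igp_routes[igp_prefix]


# Router E's view: BGP says 160.10.0.0/16 is reached via 192.10.1.1,
# and the IGP carries a route to the 192.10.1.0/30 link.
bgp_routes = {"160.10.0.0/16": "192.10.1.1"}
igp_routes = {"192.10.1.0/30": "10.3.0.1"}  # hypothetical internal next hop

print(resolve(bgp_routes, igp_routes, "160.10.5.9"))
```

If the IGP has no route covering the BGP Next-Hop, the lookup fails, which mirrors the requirement stated above.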


BGP Updates: Withdrawn Routes

Used to withdraw network reachability
Each Withdrawn Route is composed of:
  Network Prefix
  Mask Length

Withdrawn Routes
This section of the Update message contains zero or more routes (prefixes) that are to be withdrawn. It is used to inform a BGP neighbor that the specified routes are no longer reachable.


BGP Updates: Withdrawn Routes

[Figure: AS 123 (.1) and AS 321 (.2) peer over 192.168.10.0/24. Connectivity to network 192.192.25.0/24 is lost.]

BGP Update Message
  Withdraw Routes: 192.192.25.0/24

  Network            Next-Hop        Path
  150.10.0.0/16      192.168.10.2    321 200
  192.192.25.0/24    192.168.10.2    321

Withdrawn Routes
In this example, network 192.192.25.0/24 was previously advertised to AS 123. However, the only interface to this network has failed. As a result, an Update message is sent to AS 123 with the prefix of network 192.192.25.0/24 in the Withdrawn Routes section of the message. The eBGP peer in AS 123 will update the information in its BGP Routing Information Base (RIB) to mark this route as withdrawn.


BGP Routing Information Base

BGP RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i

router bgp 100
 network 160.10.1.0 255.255.255.0
 network 160.10.3.0 255.255.255.0
 no auto-summary

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

BGP network commands are normally used to populate the BGP RIB with routes from the Route Table

BGP Routing Information Base


The BGP Routing Information Base contains all of the prefixes, along with their associated attributes, that have been received from other BGP peers. In addition, the router can be configured to inject information from its local Route Table into its local BGP RIB. This information is, in turn, advertised to the router's BGP peers. The Cisco IOS network command is the most common method of injecting local Route Table information into the BGP RIB. In the example shown above, prefixes 160.10.1.0/24 and 160.10.3.0/24 are being injected via the use of the network command.


BGP Routing Information Base

BGP RIB
  Network             Next-Hop      Path
  *> 160.10.0.0/16    0.0.0.0       i
  * i                 192.20.2.2    i
  s> 160.10.1.0/24    192.20.2.2    i
  s> 160.10.3.0/24    192.20.2.2    i

router bgp 100
 network 160.10.1.0 255.255.255.0
 network 160.10.3.0 255.255.255.0
 aggregate-address 160.10.0.0 255.255.0.0 summary-only
 no auto-summary

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

BGP aggregate-address commands may be used to install summary routes in the BGP RIB

BGP Routing Information Base


Prefix aggregation may be accomplished by the use of the Cisco IOS aggregate-address command, as shown in the example above. In this example, the use of the summary-only clause results in only the aggregate address being advertised to the router's BGP peers. The small "s" to the left of the other 160.10.x.x entries in the BGP RIB indicates that these prefixes are suppressed and are not being advertised to the router's BGP peers.
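The effect of summary-only on what gets advertised can be modeled roughly as follows. This Python sketch is an approximation, not IOS behavior in full: the function name is hypothetical, and it assumes that an aggregate is only generated when at least one more-specific component prefix is present in the BGP RIB.

```python
import ipaddress


def advertised_prefixes(rib, aggregate, summary_only=True):
    """Return the prefixes advertised to peers after aggregation.

    rib is a list of prefix strings held in the BGP RIB.
    """
    agg = ipaddress.ip_network(aggregate)
    nets = [ipaddress.ip_network(p) for p in rib]
    # Component routes are the more-specific prefixes covered by the aggregate.
    components = [n for n in nets if n != agg and n.subnet_of(agg)]
    if not components:
        # Assumption: with no component route, no aggregate is generated.
        return sorted(str(n) for n in nets)
    # With summary-only, the components are suppressed (the "s" entries).
    kept = [n for n in nets if not (summary_only and n in components)]
    if agg not in kept:
        kept.append(agg)
    return sorted(str(n) for n in kept)


rib = ["160.10.1.0/24", "160.10.3.0/24"]
print(advertised_prefixes(rib, "160.10.0.0/16"))
print(advertised_prefixes(rib, "160.10.0.0/16", summary_only=False))
```

With summary_only set, only the /16 aggregate remains in the advertised list; without it, the aggregate is sent alongside the two /24 prefixes.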


BGP Routing Information Base

BGP RIB
  Network             Next-Hop      Path
  *> 160.10.0.0/16    0.0.0.0       i
  * i                 192.20.2.2    i
  s> 160.10.1.0/24    192.20.2.2    i
  s> 160.10.3.0/24    192.20.2.2    i
  *> 192.1.1.0/24     192.20.2.2    ?

router bgp 100
 network 160.10.0.0 255.255.0.0
 redistribute static route-map foo
 no auto-summary

access-list 1 permit 192.1.0.0 0.0.255.255

route-map foo permit 10
 match ip address 1

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

BGP redistribute commands can also be used to populate the BGP RIB with routes from the Route Table

BGP Routing Information Base


The redistribute command may also be used to inject specific routes from the Route Table into the BGP RIB, as shown in the example above. In this example, all static routes that match the route-map foo are being injected into the BGP RIB. In this case, the static route for network 192.1.1.0/24 is a match and is being injected into the BGP RIB.


BGP Routing Information Base

[Figure: an incoming Update passes through the IN Process into the BGP RIB; the OUT Process sends Updates to peers.]

Incoming Update
  Network          Next-Hop      Path
  173.21.0.0/16    192.20.2.1    100

BGP RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i
  *> 173.21.0.0/16    192.20.2.1    100

BGP in process
  receives path information from peers
  results of BGP path selection placed in the BGP table
  best path flagged (denoted by ">")

BGP Input Process


The BGP Input Process receives all incoming BGP Update messages from the other BGP peers. Although not shown in the above example, multiple paths to a particular network (prefix) may exist. The Input Process applies the various selection criteria to the incoming information and selects the best path for a given prefix. The selected best path is indicated by the > character as shown in the above example.


BGP Routing Information Base

[Figure: an incoming Update passes through the IN Process into the BGP RIB; the OUT Process sends Updates to peers.]

BGP RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i
  *> 173.21.0.0/16    192.20.2.1    100

Outgoing Update
  Network          Next-Hop      Path
  160.10.1.0/24    192.20.2.2    200
  160.10.3.0/24    192.20.2.2    200
  173.21.0.0/16    192.20.2.2    200 100    (Next-Hop changed from 192.20.2.1)

BGP out process
  builds update using info from RIB
  may modify update based on config
  sends update to peers

BGP Output Process


The BGP Output Process is responsible for constructing and sending BGP Update messages to the router's other BGP peers. The contents of these Update messages may be modified using certain IOS configuration commands so that the desired routing policy is established. In the example above, the next-hop address being used in the Update message has been modified (possibly using the next-hop-self configuration command).


BGP Routing Information Base

BGP RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i
  *> 173.21.0.0/16    192.20.2.1    100

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24
  B  173.21.0.0/16

Best paths installed in routing table if:
  prefix and prefix length are unique
  lowest protocol distance

Injection of BGP prefixes into the local Route Table


A best-path BGP prefix is installed in the local routing table when no other routing source offers the same prefix and prefix length or, if one does, when the BGP route has the lowest Administrative Distance among them.
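The installation rule can be sketched as a comparison of Administrative Distances. The Python below is an illustration only: the install function is a hypothetical name, the distance values are the usual Cisco IOS defaults, and reading the "D" code in the slide as EIGRP is an assumption.

```python
# Cisco IOS default Administrative Distances (lower is preferred).
ADMIN_DISTANCE = {"static": 1, "ebgp": 20, "eigrp": 90, "rip": 120, "ibgp": 200}


def install(route_table, prefix, source):
    """Try to install prefix from source; return True if it was installed.

    route_table maps "prefix/length" strings to the source that owns them.
    """
    current = route_table.get(prefix)
    if current is None or ADMIN_DISTANCE[source] < ADMIN_DISTANCE[current]:
        route_table[prefix] = source
        return True
    return False


route_table = {
    "10.1.2.0/24": "eigrp",
    "160.10.1.0/24": "eigrp",
    "160.10.3.0/24": "eigrp",
    "153.22.0.0/16": "rip",
    "192.1.1.0/24": "static",
}

# 173.21.0.0/16 is unique, so the eBGP best path is installed.
ebgp_installed = install(route_table, "173.21.0.0/16", "ebgp")
# 160.10.1.0/24 is already known via the IGP, which beats iBGP (90 < 200).
ibgp_installed = install(route_table, "160.10.1.0/24", "ibgp")

print(ebgp_installed, ibgp_installed)
```

This matches the slide, where 173.21.0.0/16 appears in the Route Table as a "B" route while the 160.10.x.0/24 prefixes remain "D" routes.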


MBGP: Multiprotocol BGP

MBGP Overview
MBGP Update Messages
MBGP Capability Negotiation
MBGP NLRI exchange
MBGP-DVMRP redistribution
BGP-to-MBGP redistribution

MBGP Overview

MBGP: Multiprotocol BGP
  Defined in RFC 2283 (extensions to BGP)
  Can carry different types of routes
    IPv4 Unicast
    IPv4 Multicast
    IPv6 Unicast
  May be carried in same BGP session
  Does not propagate multicast state info
    Still need PIM to build Distribution Trees
  Same path selection and validation rules
    AS-Path, LocalPref, MED, ...

MBGP Overview
Multiprotocol BGP (MBGP) is defined in RFC 2283. This RFC defines extensions to the existing BGP protocol to allow it to carry more than just IPv4 route prefixes. Examples of some of the new types of routing information include (but are not limited to):
  IPv4 prefixes for Unicast routing
  IPv4 prefixes for Multicast RPF checking
  IPv6 prefixes for Unicast routing
A common misconception is that MBGP is a replacement for PIM. This is incorrect. MBGP does not propagate any multicast state information, nor does it build any sort of multicast distribution trees. MBGP can distribute unicast prefixes that can be used for the multicast RPF check. Because MBGP is an extension to the existing BGP protocol, the same basic rules apply to path selection, path validation, etc.


MBGP Overview

Separate BGP tables maintained
  Unicast Routing Information Base (U-RIB)
  Multicast Routing Information Base (M-RIB)
  New BGP nlri keyword specifies which RIB
  Allows different unicast/multicast topologies or policies

Unicast RIB (U-RIB)
  Contains unicast prefixes for unicast forwarding
  Populated with BGP unicast NLRI

Multicast RIB (M-RIB)
  Contains unicast prefixes for RPF checking
  Populated with BGP multicast NLRI

Routing Information Bases


Previously, BGP only maintained a single Routing Information Base (RIB) for IPv4 unicast prefixes. In the case of MBGP, separate RIBs must be maintained for each type of routing information being exchanged. This implies that a separate Unicast RIB (U-RIB) and a separate Multicast RIB (M-RIB) can be maintained by MBGP. A new nlri keyword was added to the Cisco IOS command structure to differentiate between the U-RIB and the M-RIB. (Note: This keyword will soon be deprecated in order to generalize MBGP for other protocols such as IPv6. Consult your latest IOS documentation for the correct syntax.)
Unicast RIB (U-RIB): This RIB contains the unicast prefixes that were previously used by BGP for IPv4 unicast traffic forwarding.
Multicast RIB (M-RIB): This new RIB contains the same type of unicast prefixes contained in the U-RIB, except that the prefixes stored in the M-RIB are used to RPF check arriving multicast traffic.


MBGP Overview

Cisco IOS MBGP implementation
  All the familiar BGP configuration knobs
  Carries multicast routes in MP_REACH_NLRI
    Still carries unicast routes in old style NLRI
  NLRI capability negotiation
  Redistribution between MBGP and DVMRP
  Redistribution of BGP stubs into MBGP

Cisco IOS Implementation of MBGP


All of the familiar BGP configuration capabilities are maintained and may be applied separately to the U-RIB and/or the M-RIB. The initial implementation of MBGP carries the multicast routes in the new MBGP attribute MP_REACH_NLRI, while the unicast routes are still carried in the old-style NLRI format. Capability negotiation has been added to permit MBGP peers to negotiate which route information will be exchanged over a session. The ability to redistribute MBGP multicast NLRI into DVMRP and vice versa has been added to permit a transition from legacy DVMRP-based networks to native multicast networks. Translation of BGP unicast NLRI into MBGP multicast NLRI is supported to permit easy transition of non-MBGP-capable networks.


MBGP Update Message

The MBGP UPDATE Message
  Unfeasible Routes Length (2 Octets)
  Withdrawn Routes (Variable)
    each entry: Length (1 Octet), Prefix (Variable)
  Total Path Attribute Length (2 Octets)
  Path Attributes (Variable)
    each entry: Attribute Type, Attribute Length, Attribute Value
  Network Layer Reachability Information (Variable)
    each entry: Length (1 Octet), Prefix (Variable)

New Multiprotocol Attributes added to Path Attributes:
  MP_REACH_NLRI
  MP_UNREACH_NLRI

MBGP Update Message


The MBGP Update message is identical to the old BGP Update message, with the exception that two new attribute types have been added. These two attributes are:
  MP_REACH_NLRI
  MP_UNREACH_NLRI


MBGP Update Message

MP_REACH_NLRI Attribute
  Address Family Identifier (2 Octets)             (see RFC 1700)
  Subsequent Address Family Identifier (1 Octet)
  Length of the Next-Hop Address (1 Octet)
  Network Address of Next-Hop (Variable)
  Number of SNPAs (1 Octet)                        (may be zero)
  Length of first SNPA (1 Octet)
  First SNPA (Variable)
  Length of second SNPA (1 Octet)
  Second SNPA (Variable)
  ......
  Length of last SNPA (1 Octet)
  Last SNPA (Variable)
  Network Layer Reachability Information (Variable)
    each entry: Length (1 Octet), Prefix (Variable)

MP_REACH_NLRI Attribute
The key characteristics of this new attribute are the Address Family Identifier (AFI) and Subsequent Address Family Identifier (Sub-AFI) fields. These two fields define the type of routing information that is carried in the NLRI field of this attribute. The Next-Hop Address information is contained in the fields following the AFI and Sub-AFI fields. Following the Next-Hop Address fields are zero or more SNPA (Subnetwork Point of Attachment) fields, which describe attachment points of the next-hop router. The other path attributes that apply to this NLRI (AS-Path, MED, and so on) are carried in the Update message as ordinary BGP path attributes. Finally, the NLRI field contains the Length and Prefix information of the route that is being advertised as reachable.
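To make the field order concrete, here is a small Python sketch that decodes the value of an MP_REACH_NLRI attribute. It is deliberately limited: it assumes IPv4 (AFI = 1) addresses and prefixes, handles only the case where the Number of SNPAs is zero, and the function name and sample bytes are invented for illustration.

```python
import struct


def parse_mp_reach_nlri(data):
    """Decode an MP_REACH_NLRI attribute value (IPv4, zero SNPAs only)."""
    # AFI (2 octets), Sub-AFI (1 octet), Next-Hop length (1 octet).
    afi, safi, nh_len = struct.unpack_from("!HBB", data, 0)
    offset = 4
    next_hop = ".".join(str(b) for b in data[offset:offset + nh_len])
    offset += nh_len
    snpa_count = data[offset]
    offset += 1
    if snpa_count:
        raise NotImplementedError("this sketch handles only zero SNPAs")
    prefixes = []
    while offset < len(data):
        bits = data[offset]            # prefix length in bits
        offset += 1
        nbytes = (bits + 7) // 8       # prefix is padded to whole octets
        raw = data[offset:offset + nbytes] + bytes(4 - nbytes)
        offset += nbytes
        prefixes.append("%s/%d" % (".".join(str(b) for b in raw), bits))
    return afi, safi, next_hop, prefixes


# AFI 1, Sub-AFI 2, Next-Hop 192.168.200.2, no SNPAs, NLRI 192.192.2.0/24.
sample = bytes([0, 1, 2, 4, 192, 168, 200, 2, 0, 24, 192, 192, 2])
print(parse_mp_reach_nlri(sample))
```

The Sub-AFI value of 2 in the sample marks the prefix as multicast RPF information, as described on the following pages.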


MBGP Update Message

Address Family Information (AFI)
  Identifies Address Type (see RFC 1700)
    AFI = 1 (IPv4)
    AFI = 2 (IPv6)

Sub-Address Family Information (Sub-AFI)
  Sub category for AFI Field
  Address Family Information (AFI) = 1 (IPv4)
    Sub-AFI = 1 (NLRI is used for unicast)
    Sub-AFI = 2 (NLRI is used for multicast RPF check)
    Sub-AFI = 3 (NLRI is used for both unicast and multicast RPF check)

Address Family Information (AFI)


This field is based on the address families defined in RFC 1700.
  AFI = 1 (IPv4)
  AFI = 2 (IPv6)

Sub-Address Family Information (Sub-AFI)


This field contains further information regarding the type of routing information being exchanged in the NLRI field. The following are the current definitions for Sub-AFI codes associated with the IPv4 Address Family:
  Sub-AFI = 1 (NLRI is used for unicast routing)
  Sub-AFI = 2 (NLRI is used for multicast RPF check)
  Sub-AFI = 3 (NLRI is used for both unicast routing and multicast RPF checking)


MBGP Update Message

MP_UNREACH_NLRI Attribute
  Address Family Identifier (2 Octets)
  Subsequent Address Family Identifier (1 Octet)
  Withdrawn Routes (Variable)
    each entry: Length (1 Octet), Prefix (Variable)

MP_UNREACH_NLRI Attribute
This new attribute permits unfeasible routes of the new protocol types to be withdrawn in the same fashion as the Withdrawn Routes field is used in BGP. Notice that this attribute also carries the AFI and Sub-AFI fields along with the associated Length and Prefix of the withdrawn route.


MBGP: Capability Negotiation

BGP routers establish BGP sessions through the OPEN message
  OPEN message contains optional parameters
BGP session is terminated if OPEN parameters are not recognised
New parameter: CAPABILITIES
  Multiprotocol extension
  Multiple routes for same destination

MBGP Capability Negotiation


MBGP has extended the Open message format to include new optional Capability negotiation parameters. MBGP routers negotiate the lowest common set of capabilities using these Capability option fields. If two MBGP peers are unable to agree on the Capabilities supported, the MBGP session is terminated and an error message is written to the console.


MBGP: Capability Negotiation

New nlri keyword on neighbor command
  neighbor <foo> remote-as <asn> nlri multicast unicast

Configures router to negotiate either or both types of NLRI
If neighbor configures both or subset, common NLRI is used in both directions
If there is no match, notification is sent and peering doesn't come up

MBGP Capability Negotiation


A new nlri keyword is used on the neighbor command to control which capabilities are to be enabled with which peer. For example, the command:
  neighbor foo remote-as 100 nlri multicast unicast
results in both multicast and unicast capabilities being enabled to neighbor foo.

If foo has configured the same set of capabilities, then both unicast and multicast NLRI can be exchanged via the session. If the two peers do not match, the lowest common subset is used. If there is no match between the capabilities, the peering will not come up.
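The "lowest common subset" rule amounts to a set intersection. The Python sketch below models only that outcome, not the actual OPEN exchange; the function name is hypothetical, and treating a peer that sends no capabilities at all as unicast-only is an assumption based on the fallback behavior described later in this module.

```python
def negotiate(local, remote):
    """Return the NLRI types both peers agree to exchange.

    local and remote are sets such as {"unicast", "multicast"}.
    remote may be None for a peer that sends no Capability parameter.
    """
    if remote is None:
        # Assumption: a non-MBGP peer ends up in unicast-only mode.
        return {"unicast"}
    common = set(local) & set(remote)
    if not common:
        raise ConnectionError("no common NLRI type; peering does not come up")
    return common


print(sorted(negotiate({"unicast", "multicast"}, {"unicast", "multicast"})))
print(sorted(negotiate({"unicast", "multicast"}, {"multicast"})))
```

A pair of peers configured with disjoint sets, for example one with only unicast and the other with only multicast, raises the error, corresponding to a peering that never comes up.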


MBGP Capability Negotiation

[Figure: AS 123 (.1) and AS 321 (.2) connected over 192.168.100.0/24, with a multicast Receiver and Sender.]

router bgp 123
 neighbor 192.168.100.2 remote-as 321 nlri unicast multicast
 . . .

MBGP Capability Negotiation


In this example, the router on the left is configured to peer with the router on the right. The command:
  neighbor 192.168.100.2 remote-as 321 nlri unicast multicast
instructs the router on the left to attempt to negotiate both unicast and multicast NLRI exchange.


MBGP Capability Negotiation

[Figure: AS 123 (.1) and AS 321 (.2) connected over 192.168.100.0/24, with a multicast Receiver and Sender.]

router bgp 321
 neighbor 192.168.100.1 remote-as 123 nlri unicast multicast
 . . .

MBGP Capability Negotiation


In this example, the router on the right is configured to peer with the router on the left. The command:
neighbor 192.168.100.1 remote-as 123 nlri unicast multicast

instructs the router on the right to attempt to negotiate both unicast and multicast NLRI exchange.


MBGP Capability Negotiation

[Figure: AS 123 (.1) and AS 321 (.2) connected over 192.168.100.0/24, with a multicast Receiver, a Sender and network 192.192.25.0/24. An MBGP session for unicast and multicast NLRI runs between the two routers.]

BGP: 192.168.100.2 open active, local address 192.168.100.1
BGP: 192.168.100.2 went from Active to OpenSent
BGP: 192.168.100.2 sending OPEN, version 4
BGP: 192.168.100.2 OPEN rcvd, version 4
BGP: 192.168.100.2 rcv OPEN w/option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/1
BGP: 192.168.100.2 rcv OPEN w/option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/2
BGP: 192.168.100.2 went from OpenSent to OpenConfirm
BGP: 192.168.100.2 went from OpenConfirm to Established

MBGP Capability Negotiation


In this example, the two routers can be seen exchanging Capabilities in the router debug message output. In this case, both unicast and multicast NLRI have been successfully negotiated. Therefore, both unicast and multicast NLRI will be exchanged in this single MBGP peer session.
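The numbers in the debug output (parameter type 2, length 6, capability code 1, length 4) follow from how the multiprotocol capability is laid out in the OPEN message: a 2-octet AFI, a reserved octet and a 1-octet Sub-AFI, wrapped in a capability header and an optional-parameter header. The Python sketch below encodes and decodes one such parameter; the function names are hypothetical and the output string is only styled after the debug lines above.

```python
import struct

CAPABILITIES_PARAM = 2  # optional parameter type: Capabilities
MP_EXT_CAPABILITY = 1   # capability code: Multiprotocol Extensions


def encode_mp_ext(afi, safi):
    """Build one Capabilities optional parameter advertising afi/safi."""
    value = struct.pack("!HBB", afi, 0, safi)  # AFI, reserved, Sub-AFI
    capability = struct.pack("!BB", MP_EXT_CAPABILITY, len(value)) + value
    return struct.pack("!BB", CAPABILITIES_PARAM, len(capability)) + capability


def describe(param):
    """Decode a parameter produced by encode_mp_ext into a readable line."""
    ptype, plen, code, clen, afi, _reserved, safi = struct.unpack(
        "!BBBBHBB", param)
    return ("type: %d, len: %d; CAPABILITY code: %d, length %d; "
            "afi/safi: %d/%d" % (ptype, plen, code, clen, afi, safi))


for safi in (1, 2):  # IPv4 unicast and IPv4 multicast
    print(describe(encode_mp_ext(1, safi)))
```

One parameter is sent per afi/safi pair, which is why the debug output shows the "option parameter" lines twice.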


MBGP Capability Negotiation

If neighbor doesn't include the CAPABILITY parameters in open, Cisco backs off and reopens with no capability parameters
Peering comes up in unicast-only mode
Hidden command
  neighbor <foo> dont-capability-negotiate

MBGP Capability Negotiation


In cases where the far end of the attempted peering is not running MBGP, the router at that end will not send an OPEN message with the Capability option. Therefore, the Cisco router will retry the OPEN handshake without any Capability options. This means the peering will come up and only IPv4 unicast NLRI will be exchanged. In some cases, non-MBGP routers have been known to crash or get stuck in a loop during the OPEN handshake. If this occurs, the Cisco MBGP router can be configured with the following hidden command so that it does not attempt the Capability negotiation:
  neighbor <foo> dont-capability-negotiate


MBGP NLRI Information

RIBs may be populated by:
  Network commands
    network <foo> <foo-mask> [nlri multicast unicast]

New nlri keyword controls in which RIB the matching route(s) is (are) stored
  M-RIB if multicast keyword specified
  U-RIB if unicast keyword specified (or if nlri clause omitted)
  Both RIBs if both keywords specified

Controlling RIB population


Like the neighbor command, the network command also accepts the new nlri keyword. This controls which RIB (U-RIB, M-RIB, or both) the network information is injected into. If the nlri clause is omitted, the Unicast RIB (U-RIB) is assumed.


MBGP NLRI Information

Unicast RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i

Multicast RIB
  Network             Next-Hop      Path
  (no entries)

router bgp 100
 network 160.10.1.0 255.255.255.0 nlri unicast
 network 160.10.3.0 255.255.255.0 nlri unicast
 no auto-summary

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

New nlri keyword used to control RIB population (e.g. network command)
  Unicast RIB only

RIB Population Example


In this example, only the unicast keyword is used in the nlri clause. This causes the matching networks in the local Route Table to be injected into only the MBGP Unicast RIB.


MBGP NLRI Information

Unicast RIB
  Network             Next-Hop      Path
  (no entries)

Multicast RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i

router bgp 100
 network 160.10.1.0 255.255.255.0 nlri multicast
 network 160.10.3.0 255.255.255.0 nlri multicast
 no auto-summary

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

New nlri keyword used to control RIB population (e.g. network command)
  Unicast RIB only
  Multicast RIB only

RIB Population Example


In this example, only the multicast keyword is used in the nlri clause. This causes the matching networks in the local Route Table to be injected into only the MBGP Multicast RIB.


MBGP NLRI Information

Unicast RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i

Multicast RIB
  Network             Next-Hop      Path
  *>i160.10.1.0/24    192.20.2.2    i
  *>i160.10.3.0/24    192.20.2.2    i

router bgp 100
 network 160.10.1.0 255.255.255.0 nlri unicast multicast
 network 160.10.3.0 255.255.255.0 nlri unicast multicast
 no auto-summary

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

New nlri keyword used to control RIB population (e.g. network command)
  Unicast RIB only
  Multicast RIB only
  Both RIBs

RIB Population Example


In this example, both the unicast and multicast keywords are used in the nlri clause. This causes the matching networks in the local Route Table to be injected into both the MBGP Unicast and Multicast RIBs.


MBGP NLRI Information

RIBs may be populated by:
  Redistribution commands

router bgp 100
 redistribute <unicast> [<process>] route-map foo

access-list 1 permit 192.10.0.0 0.0.255.255

route-map foo permit 10
 match ip address 1
 set nlri multicast

Route map set nlri clause controls in which RIB the matching route(s) is (are) stored

RIB Population Example


This example demonstrates that the nlri clause may be used in a route-map that is in turn, used in a redistribute command. This may be useful in some scenarios where tighter control is desired as to what is injected into the desired RIB. In this example, any network in the 192.10.0.0/16 range will be injected into only the MBGP Multicast RIB.


MBGP NLRI Information

Unicast RIB
  Network             Next-Hop      Path
  (no entries)

Multicast RIB
  Network             Next-Hop      Path
  *>i192.1.1.0/24     192.20.2.2    ?

router bgp 100
 redistribute static route-map foo

access-list 1 permit 192.1.1.0 0.0.0.255

route-map foo permit 10
 match ip address 1
 set nlri multicast

Route Table
  D  10.1.2.0/24
  D  160.10.1.0/24
  D  160.10.3.0/24
  R  153.22.0.0/16
  S  192.1.1.0/24

New nlri keyword used to control RIB population in a route-map
  (Multicast RIB only example)

RIB Population Example


In this example, network 192.1.1.0/24 is being injected into only the MBGP Multicast RIB as a result of route-map foo.


MBGP NLRI Information

Other nlri keyword commands
  Aggregation
    aggregate-address <foo> <foo-mask> [nlri multicast unicast]
    Generates an aggregate route for network <foo>
  Default Origination
    neighbor <foo> default-originate [nlri multicast unicast]
    Originates a default route to the neighbor
  In Route Maps
    match nlri multicast unicast
    Matches on the NLRI type
    set nlri multicast unicast
    Injects the matched route into the specified unicast or multicast RIB

Controlling RIB population


The new nlri keyword can also be applied in other MBGP commands. These include:
  aggregate-address
  default-originate
  route-maps


MBGP NLRI Information

RIBs are also populated by:
  Receiving MP_REACH_NLRI from Peers
    Storage controlled by AFI/SAFI value:
      AFI/SAFI = 1/1 (IPv4 / Unicast): U-RIB only
      AFI/SAFI = 1/2 (IPv4 / Multicast): M-RIB only
      AFI/SAFI = 1/3 (IPv4 / Unicast-Multicast): Both RIBs
  Receiving NLRI (old style) from Peers
    Stored in U-RIB only

RIB Population from Update Messages


Information received in the form of Update messages is written into either the Unicast RIB, the Multicast RIB or both RIBs, depending on the AFI/Sub-AFI value:
  AFI/Sub-AFI = 1/1 (IPv4 / Unicast): U-RIB only
  AFI/Sub-AFI = 1/2 (IPv4 / Multicast): M-RIB only
  AFI/Sub-AFI = 1/3 (IPv4 / Unicast-Multicast): Both RIBs
  Old style NLRI (i.e. not in an MP_REACH_NLRI attribute): U-RIB only
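The mapping above is small enough to express as a lookup table. The following Python sketch is only a restatement of those four rules; the function name and the use of None to represent old-style NLRI are choices made for this example.

```python
IPV4 = 1  # AFI value for IPv4

# Sub-AFI value -> RIBs that receive the prefix (IPv4 address family).
RIBS_BY_SAFI = {
    1: {"U-RIB"},
    2: {"M-RIB"},
    3: {"U-RIB", "M-RIB"},
}


def target_ribs(afi=None, safi=None):
    """Return the set of RIBs into which a received prefix is written.

    Passing no AFI/Sub-AFI models old-style NLRI, i.e. a prefix that was
    not carried inside an MP_REACH_NLRI attribute.
    """
    if afi is None:
        return {"U-RIB"}
    if afi != IPV4:
        raise ValueError("this sketch only covers the IPv4 address family")
    return set(RIBS_BY_SAFI[safi])


print(sorted(target_ribs(1, 1)))
print(sorted(target_ribs(1, 2)))
print(sorted(target_ribs(1, 3)))
print(sorted(target_ribs()))
```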


MBGP NLRI Information


BGP Update from Peer
MP_REACH_NLRI: 192.192.2/24
AFI: 1, Sub-AFI: 1 (unicast)
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

Unicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i
*>i192.192.2.0/24   192.168.200.2   300 200 i

Multicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i

Storage of arriving NLRI information depends on AFI/SAFI fields in the Update message
Unicast RIB only (AFI=1/SAFI=1 or old style NLRI)


Receiving Update Messages


Storage of arriving NLRI information depends on the AFI/Sub-AFI fields in the MP_REACH_NLRI attribute. In the above example, an Update message is received that contains an MP_REACH_NLRI with an AFI/Sub-AFI = 1/1 which indicates that the information is for the Unicast RIB. As a result, the information is processed by the MBGP input process and written into the U-RIB as shown.


MBGP NLRI Information


BGP Update from Peer
MP_REACH_NLRI: 192.192.2/24
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

Unicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i

Multicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i
*>i192.192.2.0/24   192.168.200.2   300 200 i

Storage of arriving NLRI information depends on AFI/SAFI fields in the Update message
Unicast RIB only (AFI=1/SAFI=1 or old style NLRI)
Multicast RIB only (AFI=1/SAFI=2)

Receiving Update Messages


In the above example, an Update message is received that contains an MP_REACH_NLRI with an AFI/Sub-AFI = 1/2, which indicates that the information is for the Multicast RIB. As a result, the information is processed by the MBGP input process and written into the M-RIB as shown.


MBGP NLRI Information


BGP Update from Peer
MP_REACH_NLRI: 192.192.2/24
AFI: 1, Sub-AFI: 3 (both)
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

Unicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i
*>i192.192.2.0/24   192.168.200.2   300 200 i

Multicast RIB
Network             Next-Hop        Path
*>i160.10.1.0/24    192.20.2.2      i
*>i160.10.3.0/24    192.20.2.2      i
*>i192.192.2.0/24   192.168.200.2   300 200 i

Storage of arriving NLRI information depends on AFI/SAFI fields in the Update message
Unicast RIB only (AFI=1/SAFI=1 or old style NLRI)
Multicast RIB only (AFI=1/SAFI=2)
Both RIBs (AFI=1/SAFI=3)

Receiving Update Messages


In the above example, an Update message is received that contains an MP_REACH_NLRI with an AFI/Sub-AFI = 1/3, which indicates that the information is for both the Unicast and Multicast RIBs. As a result, the information is processed by the MBGP input process and written into both the U-RIB and the M-RIB as shown.


MBGP NLRI Information

Congruent Topologies
AS 123
.1

BGP Session for Unicast and Multicast NLRI


192.168.100.0/24 .2

AS 321

192.168.10.0/24

router bgp 123
 neighbor 192.168.100.2 remote-as 321 nlri unicast multicast
 network 192.168.10.0 255.255.255.0 nlri unicast multicast
 no auto-summary

192.192.25.0/24

Receiver

Sender


Congruent Unicast-Multicast Topologies


When unicast and multicast traffic follow the same path, the two topologies are said to be congruent. In the example above, both the neighbor and the network statements in the router on the left contain the nlri unicast multicast clause. This causes the router to negotiate a single BGP session over which both unicast and multicast NLRI are exchanged.


MBGP NLRI Information

Congruent Topologies
AS 123
.1

BGP Session for Unicast and Multicast NLRI


192.168.100.0/24 .2

AS 321

192.168.10.0/24

router bgp 321
 neighbor 192.168.100.1 remote-as 123 nlri unicast multicast
 network 192.192.25.0 255.255.255.0 nlri unicast multicast
 no auto-summary

192.192.25.0/24

Receiver

Sender


Congruent Unicast-Multicast Topologies


Continuing with the example from the previous page, both the neighbor and the network statements in the router on the right also contain the nlri unicast multicast clause. This causes this router to also negotiate a single BGP session over which both unicast and multicast NLRI are exchanged. Once the two routers have opened the session, the Unicast and Multicast topologies will be congruent.


MBGP NLRI Information

Congruent Topologies
AS 123
.1

BGP Session for Unicast and Multicast NLRI


192.168.100.0/24 .2

AS 321

192.192.25.0/24

Receiver

Routing Update

Unicast Information
NLRI: 192.192.25/24
AS_PATH: 321
MED:
Next-Hop: 192.168.100.2
...

Multicast Information
MP_REACH_NLRI: 192.192.25/24
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 321
MED:
Next-Hop: 192.168.100.2
...

192.168.10.0/24

Sender

Congruent Unicast-Multicast Topologies


Continuing with the example from the previous pages we see the router on the right sending an Update message containing both an old style NLRI for the unicast traffic and an MP_REACH_NLRI attribute containing the multicast NLRI information. Notice that the Next-Hop address is identical for both NLRI. This results in the Unicast and Multicast topologies being congruent.


MBGP NLRI Information

Congruent Topologies
AS 123
.1

BGP Session for Unicast and Multicast NLRI


192.168.100.0/24 .2

AS 321

Routing Update

Unicast Information
NLRI: 192.168.10/24
AS_PATH: 123
MED:
Next-Hop: 192.168.100.1
...

Multicast Information
MP_REACH_NLRI: 192.168.10/24
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 123
MED:
Next-Hop: 192.168.100.1
...

192.168.10.0/24

192.192.25.0/24

Receiver

Sender

Congruent Unicast-Multicast Topologies


Continuing with the example from the previous pages, we see the router on the left sending an Update message. This message also contains both an old-style NLRI for the unicast traffic and an MP_REACH_NLRI attribute containing the multicast NLRI information. Notice that the Next-Hop address is identical for both NLRI. This results in the Unicast and Multicast topologies being congruent.


MBGP NLRI Information

Incongruent Topologies
AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 .2

Multicast Traffic .2 192.168.200.0/24 192.192.25.0/24

192.168.10.0/24

router bgp 321
 . . .
 network 192.192.25.0 255.255.255.0 nlri unicast multicast
 neighbor 192.168.100.1 remote-as 123 nlri unicast
 neighbor 192.168.200.1 remote-as 123 nlri multicast

Sender


Incongruent Unicast-Multicast Topologies


In some cases it is desirable to have the multicast traffic follow one path and the unicast traffic another. When this occurs, the topologies are said to be incongruent. In the example above, two separate neighbor statements are used so that the router on the right will negotiate and establish two separate BGP sessions with its neighbor router on the left. This is accomplished with the following two commands:
neighbor 192.168.100.1 remote-as 123 nlri unicast
neighbor 192.168.200.1 remote-as 123 nlri multicast

The first command instructs the router on the right to negotiate a unicast-only BGP session over the top serial link, while the second command instructs it to negotiate a multicast-only session over the bottom serial link. The router on the right is also instructed to inject network 192.192.25.0/24 into both its local U-RIB and M-RIB using the following command:
network 192.192.25.0 255.255.255.0 nlri unicast multicast

Once this network is injected into both RIBs, its network prefix will be advertised to the router on the left. However, the unicast NLRI for this network will be advertised over the top BGP session and the multicast NLRI over the bottom BGP session.


MBGP NLRI Information

Incongruent Topologies
AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 .2

Multicast Traffic .2 192.168.200.0/24 192.192.25.0/24

192.168.10.0/24

router bgp 123
 . . .
 network 192.168.10.0 255.255.255.0 nlri unicast multicast
 neighbor 192.168.100.2 remote-as 321 nlri unicast
 neighbor 192.168.200.2 remote-as 321 nlri multicast

Sender


Incongruent Unicast-Multicast Topologies


Continuing with the example, two separate neighbor statements are also used on the router on the left so it will negotiate and establish two separate BGP sessions with its neighbor router on the right. This is accomplished with the following two commands:
neighbor 192.168.100.2 remote-as 321 nlri unicast
neighbor 192.168.200.2 remote-as 321 nlri multicast

The first command instructs the router on the left to negotiate a unicast-only BGP session over the top serial link, while the second command instructs it to negotiate a multicast-only session over the bottom serial link. The router on the left is also instructed to inject network 192.168.10.0/24 into both its local U-RIB and M-RIB using the following command:
network 192.168.10.0 255.255.255.0 nlri unicast multicast

Once this network is injected into both RIBs, its network prefix will be advertised to the router on the right. However, the unicast NLRI for this network will be advertised over the top BGP session and the multicast NLRI over the bottom BGP session.


MBGP NLRI Information

Incongruent Topologies
AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 .2

Multicast Traffic .2 192.168.200.0/24 192.192.25.0/24

192.168.10.0/24

Sender

Routing Update

Unicast Information
NLRI: 192.192.25/24
AS_PATH: 321
MED:
Next-Hop: 192.168.100.2


Incongruent Unicast-Multicast Topologies


Once the two separate BGP sessions have been established, unicast and multicast NLRI will be exchanged over their respective sessions as shown in the example above. In the drawing above, the router on the right is sending a BGP Update message over the top serial link (the unicast session) containing only a unicast NLRI for network 192.192.25.0/24. Notice that the Next-Hop information indicates that unicast traffic for this network should be sent to 192.168.100.2, which corresponds to the top serial link. This will result in unicast traffic destined for the 192.192.25.0/24 network flowing over the top serial link.


MBGP NLRI Information

Incongruent Topologies
AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 .2

Multicast Traffic .2 192.168.200.0/24 192.192.25.0/24

192.168.10.0/24

Routing Update

Multicast Information
MP_REACH_NLRI: 192.192.25/24
AFI: 1, Sub-AFI: 2
AS_PATH: 321
MED:
Next-Hop: 192.168.200.2

Sender


Incongruent Unicast-Multicast Topologies


Continuing with the NLRI exchange example shown in the drawing above, the router on the right is now sending a BGP Update message over the bottom serial link (the multicast session) containing only a multicast NLRI for network 192.192.25.0/24. Notice that the Next-Hop information indicates that multicast traffic from any source in the network 192.192.25.0/24 should RPF to the 192.168.200.2 address, which corresponds to the bottom serial link. This will result in multicast traffic from any source in network 192.192.25.0/24 flowing over the bottom serial link.


MBGP NLRI Information

Incongruent Topologies
AS 123 AS 321 .1 .1 Unicast Traffic 192.168.100.0/24 .2

Multicast Traffic .2 192.168.200.0/24 192.192.25.0/24

192.168.10.0/24

Sender

Unicast RIB
Network            Next-Hop         Path
192.192.25.0/24    192.168.100.2    321

Multicast RIB
Network            Next-Hop         Path
192.192.25.0/24    192.168.200.2    321


Incongruent Unicast-Multicast Topologies


In the final drawing of our example we see that the router on the left has processed the two Update messages and inserted an entry for network 192.192.25/24 in both the U-RIB and the M-RIB. Notice that the Next-Hop information in the entry in the M-RIB indicates that the Next-Hop is 192.168.200.2 while the Next-Hop information in the entry in the U-RIB indicates that the Next-Hop is 192.168.100.2. Because each RIB entry for this network has a different Next-Hop address, the unicast and multicast traffic will flow over different paths.
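
The effect of keeping two RIBs can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the two dictionaries hold only the entries shown in the drawing, the helper names (longest_match, unicast_next_hop, rpf_neighbor) are invented for this example, and a real router's RPF check considers more than a single table lookup.

```python
import ipaddress

# RIB contents taken from the drawing: prefix -> next hop.
U_RIB = {"192.192.25.0/24": "192.168.100.2"}
M_RIB = {"192.192.25.0/24": "192.168.200.2"}

def longest_match(rib, address):
    """Return the next hop of the longest prefix in rib covering address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

def unicast_next_hop(destination):
    """Unicast forwarding consults the U-RIB, keyed on the destination."""
    return longest_match(U_RIB, destination)

def rpf_neighbor(source):
    """The multicast RPF check consults the M-RIB, keyed on the source."""
    return longest_match(M_RIB, source)
```

With these tables, a unicast packet to a host in 192.192.25.0/24 is sent towards 192.168.100.2 (the top link), while multicast traffic from a source in the same network is expected to arrive via 192.168.200.2 (the bottom link).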


Unicast-Multicast NLRI Translation

BGP stubs that don't have MBGP support need to get their prefixes into the Multicast backbone
They get external routes via MBGP default or static default


Unicast to Multicast NLRI Translation


In some transition cases, it may be desirable to convert unicast NLRI into Multicast NLRI on behalf of an old-style BGP stub AS that would like to receive multicast traffic but is unable to upgrade to MBGP. Because this is a BGP stub AS, it is possible for this AS to simply RPF using either a static route or the unicast NLRI prefixes received from the upstream AS.


Unicast-Multicast NLRI Translation

Use command
neighbor <foo> translate-update [nlri multicast]

Arriving BGP Updates are translated into an MP_REACH_NLRI attribute


As if the neighbor sent AFI 1/SAFI 2 routes
Results written into the Multicast RIB

Original BGP Update processed as normal


Results written into the Unicast RIB


Unicast to Multicast NLRI Translation


Unicast-Multicast NLRI Translation may be enabled by the following command:
neighbor <foo> translate-update nlri multicast
This instructs the router to clone the received BGP Update message and to convert the unicast NLRI in the copy into a multicast NLRI using the MBGP MP_REACH_NLRI attribute. The copy is processed as if it had been received from the BGP stub AS, and the results are written into the M-RIB. The original BGP Update containing the unicast NLRI is processed as normal and the results are written into the U-RIB.


Unicast-Multicast NLRI Translation


translate-update Front-end

BGP Update from Peer
NLRI: 192.192.2/24
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

Translated Multicast update
MP_REACH_NLRI: 192.192.2/24
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

Original Unicast update
NLRI: 192.192.2/24
AS_PATH: 300 200
MED:
Next-Hop: 192.168.200.2

BGP IN Process

Arriving Unicast update intercepted by translate-update Front-end
A translated Multicast update is created & passed to the IN Process
Original Unicast update is passed on to the IN Process
Both updates processed normally by the IN Process

Unicast to Multicast NLRI Translation


The above drawing depicts the sequence of events when translate-update has been enabled for a BGP neighbor. At the left of the drawing, an old-style BGP Update arrives containing unicast NLRI. The Update message is intercepted by the translate-update front-end process, and a Multicast Update message is created and passed on to the MBGP Input process. The original Unicast Update message is also passed on unchanged to the MBGP Input process. Both updates are processed normally and the results written into the appropriate RIB.
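
The sequence of events can be modelled roughly as follows. This is a sketch for illustration only: the Update class, the function names, and the choice to pass an update that already carries MP_REACH_NLRI through untouched are assumptions, not a description of the IOS implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass(frozen=True)
class Update:
    prefix: str
    as_path: Tuple[int, ...]
    next_hop: str
    safi: Optional[int] = None   # None = old-style NLRI, 2 = multicast

def translate_update_front_end(update):
    """Return the updates handed to the input process for one arriving update."""
    if update.safi is not None:
        return [update]                       # assumed: nothing to translate
    multicast_copy = replace(update, safi=2)  # cloned copy, marked as multicast
    return [multicast_copy, update]           # the original is passed on unchanged

def input_process(updates, u_rib, m_rib):
    """Write each update into the RIB selected by its Sub-AFI."""
    for u in updates:
        rib = m_rib if u.safi == 2 else u_rib
        rib[u.prefix] = (u.next_hop, u.as_path)
```

Feeding the old-style update for 192.192.2/24 from the drawing through both functions leaves an identical entry in the U-RIB and in the M-RIB.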


Unicast-Multicast NLRI Translation


Unicast NLRI Only Stub AS
.1 192.168.1.0/24 .2

Unicast / Multicast NLRI AS

AS 2
LO0 192.170.1.1

Unicast / Multicast NLRI AS

AS 4
192.192.25.0/24

AS 1

router bgp 1
 . . .
 neighbor 192.168.1.1 remote-as 4
 neighbor 192.168.1.1 translate-update nlri multicast
 neighbor 192.170.1.1 remote-as 2 nlri unicast multicast
 neighbor 192.180.1.1 remote-as 3 nlri unicast multicast
 . . .

AS 3
LO0 192.180.1.1

Unicast / Multicast NLRI AS


Unicast to Multicast NLRI Translation


In this drawing we see a BGP-only stub AS (AS 4) using AS 1 as its transit for both unicast and multicast traffic. However, in order for other ASes to be able to correctly RPF towards AS 4 for network 192.192.25.0/24, the router in AS 1 has been configured to do unicast-multicast NLRI translation as shown in the configuration above.


Unicast-Multicast NLRI Translation


Unicast NLRI Only Stub AS
.1 192.168.1.0/24 .2

Unicast Update
NLRI: 192.192.25/24
AS_PATH: 4
MED:
Next-Hop: 192.168.1.1
...

Unicast / Multicast NLRI AS

AS 2
LO0 192.170.1.1

Unicast / Multicast NLRI AS

AS 4
192.192.25.0/24

AS 1

AS 3
LO0 192.180.1.1

Unicast / Multicast NLRI AS


Unicast to Multicast NLRI Translation


Continuing with the example, we see prefix 192.192.25.0/24 being sent in an old-style BGP Update message to the router in AS 1.


Unicast-Multicast NLRI Translation


Unicast NLRI Only Stub AS
.1 192.168.1.0/24 .2

Unicast / Multicast NLRI AS

AS 2
LO0 192.170.1.1

Unicast / Multicast NLRI AS

AS 4
192.192.25.0/24

AS 1

AS 3
Multicast Updates
MP_REACH_NLRI: 192.192.25/24
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 1, 4
MED:
Next-Hop: ...

LO0 192.180.1.1

Unicast / Multicast NLRI AS


Unicast to Multicast NLRI Translation


The router in AS 1 has been configured to perform unicast-multicast NLRI translation and therefore converts the incoming unicast (old-style) NLRI into a Multicast NLRI in an MBGP Update message using the MP_REACH_NLRI attribute. This translated NLRI is processed normally by the router in AS1 which results in it forwarding Multicast NLRI updates to its neighbors for network 192.192.25.0/24.


Unicast-Multicast NLRI Translation


Unicast NLRI Only Stub AS
.1 192.168.1.0/24 .2

Unicast / Multicast NLRI AS

AS 2
LO0 192.170.1.1

Unicast / Multicast NLRI AS

AS 4
192.192.25.0/24

AS 1

AS 3
Unicast Updates
NLRI: 192.192.25/24
AS_PATH: 1, 4
MED:
Next-Hop: ...

LO0 192.180.1.1

Unicast / Multicast NLRI AS


Unicast to Multicast NLRI Translation


Finally, the original unicast NLRI update is processed normally by the router in AS1 which results in it forwarding unicast NLRI updates to its neighbors for network 192.192.25.0/24.


Unicast-Multicast NLRI Translation


Route Maps can be used for finer control over which prefixes are translated.
Omit nlri clause in translate-update cmd
neighbor <foo> translate-update

Use set nlri in Route map

neighbor <foo> remote-as 4
neighbor <foo> route-map bgp-to-mbgp in
access-list 1 permit 192.192.25.0 0.0.0.255
route-map bgp-to-mbgp permit 10
 match ip address 1
 set nlri multicast


Unicast to Multicast NLRI Translation


Route Maps may be used to provide finer control over which prefixes are translated into Multicast NLRI. When this is desired, the nlri clause is omitted from the translate-update command. (The translate-update command is still necessary for the neighbor.) The example above shows a configuration that will result in only prefixes in the 192.192.25.0/24 range being translated into Multicast NLRI. Because the translate-update function has been enabled for this neighbor, all arriving unicast NLRI are translated (by the translate-update front-end) into Multicast NLRI. The Route map in the above example will only match on prefixes in the 192.192.25.0/24 range that are received from neighbor foo. The set nlri multicast clause in the body of the Route Map causes only these matching prefixes to be injected into the local M-RIB. As a result, only prefixes in the above range will be translated.
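
The selection performed by the route map can be sketched in Python. The sketch assumes the usual standard access-list semantics, in which bits set in the wildcard mask are ignored when comparing the prefix's network address against the base address; the function names acl_permits and translated_prefixes are hypothetical and exist only for this example.

```python
import ipaddress

def acl_permits(prefix, base, wildcard):
    """Standard access-list style match: bits set in the wildcard are ignored."""
    addr = int(ipaddress.ip_network(prefix).network_address)
    care = ~int(ipaddress.ip_address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (int(ipaddress.ip_address(base)) & care)

def translated_prefixes(unicast_prefixes):
    """Prefixes the route map above would inject into the M-RIB.

    Mirrors 'access-list 1 permit 192.192.25.0 0.0.0.255' followed by
    'set nlri multicast'; everything else is left out of the M-RIB.
    """
    return [p for p in unicast_prefixes
            if acl_permits(p, "192.192.25.0", "0.0.0.255")]
```

Given a neighbor that advertises 192.192.25.0/24 together with unrelated prefixes, only 192.192.25.0/24 is returned.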


DVMRP to MBGP Redistribution


You can also put routes in the MRIB that are currently in the DVMRP routing table
router bgp <asn>
 redistribute dvmrp route-map <map>

You can do your typical set operations
Used when connecting DVMRP access points into the MBGP backbone
Used at strategic interconnect points with the old DVMRP MBONE

DVMRP to MBGP Redistribution


Routes in the router's local DVMRP routing table may be redistributed into the MBGP Multicast RIB using standard Route Maps. This is typically done where legacy DVMRP networks still exist and the received DVMRP routes need to be injected into the M-RIB. Careful redistribution of these DVMRP routes into the M-RIB will allow other MBGP routers to RPF correctly for sources in the DVMRP network. This is considered a transitional function that permits DVMRP networks to be phased out gracefully and should be used only at strategic interconnect points with old DVMRP networks. (Otherwise, multicast route loops and multicast instability can occur.)


DVMRP to MBGP Redistribution


Unicast RIB
Network             Next-Hop      Path

Multicast RIB
Network             Next-Hop      Path
*>i192.1.1.0/24     192.20.2.2    ?
*>i192.1.5.0/24     192.20.2.2    ?

router bgp 100
 redistribute dvmrp route-map dvmrp-to-bgp
access-list 1 permit 192.1.0.0 0.0.255.255
route-map dvmrp-to-bgp permit 10
 match ip address 1
 set nlri multicast

DVMRP Route Table
Route            Hops
160.10.1.0/24    4
160.10.3.0/24    3
153.22.0.0/16    7
192.1.1.0/24     5
192.1.5.0/24     6

Route map used to control DVMRP route redistribution into MBGP


DVMRP to MBGP Redistribution


In the above configuration example, a Route Map is being used to redistribute DVMRP routes from the router's local DVMRP routing table into the MBGP Multicast RIB. Notice that the set nlri multicast clause in the body of the Route Map is what causes this to occur.
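
As a rough check of which DVMRP routes the example redistributes, the following Python sketch applies the access list to the DVMRP route table from the drawing. It assumes that the wildcard entry 192.1.0.0 0.0.255.255 can be treated as the equivalent network 192.1.0.0/16; the names DVMRP_ROUTES, PERMITTED, and redistribute_dvmrp are invented for the illustration.

```python
import ipaddress

# DVMRP route table from the drawing: prefix -> hop count.
DVMRP_ROUTES = {
    "160.10.1.0/24": 4,
    "160.10.3.0/24": 3,
    "153.22.0.0/16": 7,
    "192.1.1.0/24": 5,
    "192.1.5.0/24": 6,
}

# access-list 1 permit 192.1.0.0 0.0.255.255, written as the equivalent /16.
PERMITTED = ipaddress.ip_network("192.1.0.0/16")

def redistribute_dvmrp(routes):
    """Return the prefixes the route map would place in the M-RIB."""
    return sorted(
        prefix for prefix in routes
        if ipaddress.ip_network(prefix).subnet_of(PERMITTED)
    )
```

Only 192.1.1.0/24 and 192.1.5.0/24 pass the filter, which matches the two entries shown in the Multicast RIB on the slide.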


DVMRP to MBGP Redistribution


AS 2
LO0 192.170.1.1

DVMRP Tunnel

AS 1 DVMRP Stub Network


router bgp 1
 neighbor 192.168.1.1 remote-as 4
 neighbor 192.170.1.1 remote-as 2 nlri unicast multicast
 neighbor 192.180.1.1 remote-as 3 nlri unicast multicast
 redistribute dvmrp route-map dvmrp-to-mbgp
access-list 1 permit 192.168.0.0 0.0.255.255
route-map dvmrp-to-mbgp permit 10
 match ip address 1
 set nlri multicast

AS 3
LO0 192.180.1.1


DVMRP to MBGP Redistribution


The drawing above is another example of DVMRP to MBGP redistribution. Notice that the configuration in the router in AS1 uses a Route Map that matches on any DVMRP prefixes in the 192.168.0.0/16 range via access-list 1. The set nlri multicast clause in the body of the Route Map causes the matching DVMRP routes to be injected into the MBGP M-RIB.


DVMRP to MBGP Redistribution


AS 2
LO0 192.170.1.1

DVMRP Tunnel

AS 1 DVMRP Stub Network


DVMRP Report
Route             Metric
192.168.0.0/16    14

AS 3
LO0 192.180.1.1


DVMRP to MBGP Redistribution


When the router in AS1 receives a DVMRP Route Report for network 192.168.0.0/16, it is installed in the DVMRP Route table.


DVMRP to MBGP Redistribution


AS 2
LO0 192.170.1.1

DVMRP Tunnel

AS 1 DVMRP Stub Network

AS 3
Multicast Updates
MP_REACH_NLRI: 192.168/16
AFI: 1, Sub-AFI: 2 (multicast)
AS_PATH: 1, 4
MED:
Next-Hop: ...

LO0 192.180.1.1


DVMRP to MBGP Redistribution


As a result of the Route Map associated with the redistribute dvmrp command that was configured on the router in AS1, the DVMRP route, 192.168.0.0/16 is a match and is injected into the MBGP M-RIB. Once this has been done, the 192.168.0.0/16 prefix is advertised to MBGP peer routers in AS2 and AS3 as a Multicast NLRI.


MBGP to DVMRP Redistribution

MBGP routes can be sent into DVMRP


interface tunnel0
 ip dvmrp metric 1 route-map <map> mbgp

Can use typical match operations
However, we recommend tail sites using DVMRP access to accept DVMRP default route


MBGP to DVMRP Redistribution


Prefixes in the MBGP M-RIB may also be advertised as DVMRP routes to the DVMRP neighbor at the other end of Tunnel 0 by the use of the following interface commands:
interface tunnel 0
 ip dvmrp metric 1 route-map mrib-to-dvmrp mbgp
The above command causes any prefixes that are matched by the mrib-to-dvmrp Route Map to be sent in DVMRP Route Reports out Tunnel 0.
Note: The mbgp keyword in the above command is a little confusing, as it really refers to the M-RIB of the local MBGP process. This syntax may change in the near future.
Typically, ISPs reconfigure their connectivity to legacy DVMRP networks so that they are stub networks as they phase out support for DVMRP over time. This lets the ISP simplify its support for any remaining DVMRP networks by injecting only the default route into the DVMRP network.


12.1 MBGP Syntax Changes


MBGP History
Introduced in IOS 12.0S
nlri clause added to BGP syntax
Covered IPv4 Unicast and Multicast NLRI only
Didn't cover other protocols or address-families

Extending nlri syntax considered


Rejected as being too inflexible

New syntax as of 12.1/12.0(7)T


Exception: 12.0S train retains old syntax

MBGP History
Support for Multiprotocol BGP was first introduced in IOS release 12.0S. In order to support different types of NLRI exchange, the nlri clause was added to many of the existing BGP configuration commands. However, this initial release only supported ipv4 unicast and ipv4 multicast NLRI. In order to support other types of NLRI (such as IPv6), it was decided that the nlri syntax was not suitable, and a new syntax was introduced beginning in IOS releases 12.1 and 12.0(7)T. The 12.0S train still retains the old syntax.


12.1 MBGP Syntax Changes


New address-family structure
router bgp <asn>
 address-family <afi> [<sub-afi>]
 . .
 exit-address-family

Replaces most nlri clauses Separates configurations by address family


Note: ipv4-unicast is default address-family
Implied address-family ipv4 unicast block
To override this default behavior, use:
no bgp default ipv4-unicast

Address-family structure
In order to support different types of address-family and sub-address family NLRI, the address-family block was added to the BGP configuration command syntax. The address-family block replaces many of the old nlri clauses. However, there are still some commands that retain the use of the nlri keyword. The address-family block is used to separate groups of configuration commands by address-family/sub-address family. In order to remain as backwards compatible as possible, the default address family is ipv4 unicast. This results in an implied address-family ipv4 unicast block. This default behavior is sometimes confusing because it merges common bgp configuration commands with ipv4 unicast specific commands. In addition, neighbor definition implies ipv4 unicast capability negotiation by default. This makes specifying ipv4 multicast only neighbors a bit confusing. The no bgp default ipv4-unicast command may be used to override this rather confusing behavior. When this command has been configured, a separate address-family ipv4 unicast block can be configured. This allows the configuration to clearly separate ipv4 unicast and multicast into separate address-family blocks.


12.1 MBGP Syntax Changes


New 2-step neighbor configuration
Neighbors are first identified
neighbor <foo> remote-as <asn>

Then they are activated by address family


neighbor <foo> activate

Controls MBGP capability negotiation

Exception:
Unicast neighbors are automatically activated in the implied address-family ipv4 unicast block This default behavior can be overridden with:
no bgp default ipv4-unicast

2-Step Neighbor Configuration


In the new BGP syntax, BGP neighbors (peers) are first identified and then activated by address family. The well-known neighbor <foo> remote-as <asn> command simply identifies the neighbor with which a peering connection is to be established. The neighbor <foo> activate command is used inside an address family block to specify that this type of NLRI is to be exchanged with neighbor <foo>. For example, the neighbor <foo> activate command in the address-family ipv4 multicast block activates the exchange of ipv4 multicast NLRI. The no neighbor <foo> activate form explicitly disables the exchange of this type of NLRI with neighbor <foo>.
EXCEPTION: In order to provide backward compatibility, ipv4 unicast NLRI exchange is automatically activated in the implied ipv4 unicast address family block. No explicit activate command is necessary to activate ipv4 unicast NLRI exchange. This default behavior is often confusing to someone trying to configure multiple types of NLRI exchange, since the activate command must be configured for all other NLRI types EXCEPT ipv4 unicast. To disable this behavior and permit explicit ipv4 unicast configuration using an ipv4 unicast address family block, the no bgp default ipv4-unicast command may be configured.


12.1 MBGP Syntax Changes


Address-Family Example:
router bgp 101
 no synchronization
 bgp log-neighbor-changes
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 neighbor 172.16.1.2 remote-as 301
 neighbor 172.16.11.2 remote-as 201
 no neighbor 172.16.11.2 activate
 no auto-summary
 !
 address-family ipv4 multicast
 neighbor 172.16.11.2 activate
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 exit-address-family

Implied ipv4 unicast address family block with implied neighbor activate commands

Explicit ipv4 multicast address family block


Address Family Example


In the above example, the implied ipv4 unicast address family block consists of the following lines:
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 neighbor 172.16.1.2 remote-as 301
 neighbor 172.16.11.2 remote-as 201
 no neighbor 172.16.11.2 activate

Notice that the neighbor 172.16.1.2 remote-as 301 command also implies a neighbor 172.16.1.2 activate for ipv4 unicast NLRI exchange to neighbor 172.16.1.2. In addition, the no neighbor 172.16.11.2 activate overrides the implied neighbor 172.16.11.2 activate for ipv4 unicast NLRI exchange to neighbor 172.16.11.2. The bottom section of the example is the ipv4 multicast address family block. Notice that only neighbor 172.16.11.2 is activated for ipv4 multicast NLRI exchange. In this case, the no neighbor 172.16.1.2 activate is implied.
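
The activation rules just described can be modelled in a few lines of Python. This is only a sketch of the rules as stated in these notes, not an IOS parser: the function name activated_families and its default_ipv4_unicast parameter are hypothetical, and only the neighbor, address-family and no bgp default ipv4-unicast lines are interpreted.

```python
def activated_families(config_lines, default_ipv4_unicast=True):
    """Return {neighbor: set of address families activated for it}.

    Simplified model of the 12.1 rules described in the notes above.
    """
    current_af = "ipv4 unicast"          # the implied block at the top
    active = {}
    for raw in config_lines:
        line = raw.strip()
        words = line.split()
        if line.startswith("address-family "):
            current_af = line[len("address-family "):]
        elif line == "exit-address-family":
            current_af = "ipv4 unicast"
        elif line == "no bgp default ipv4-unicast":
            default_ipv4_unicast = False
        elif words[:1] == ["neighbor"] and "remote-as" in words:
            families = active.setdefault(words[1], set())
            if default_ipv4_unicast:
                families.add("ipv4 unicast")     # implied activate
        elif words[:1] == ["neighbor"] and words[2:] == ["activate"]:
            active.setdefault(words[1], set()).add(current_af)
        elif words[:2] == ["no", "neighbor"] and words[3:] == ["activate"]:
            active.setdefault(words[2], set()).discard(current_af)
    return active


EXAMPLE = """
router bgp 101
 no synchronization
 bgp log-neighbor-changes
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 neighbor 172.16.1.2 remote-as 301
 neighbor 172.16.11.2 remote-as 201
 no neighbor 172.16.11.2 activate
 no auto-summary
 !
 address-family ipv4 multicast
 neighbor 172.16.11.2 activate
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 exit-address-family
""".splitlines()

result = activated_families(EXAMPLE)
for neighbor, families in sorted(result.items()):
    print(neighbor, sorted(families))
```

Under this model, neighbor 172.16.1.2 ends up with ipv4 unicast only and neighbor 172.16.11.2 with ipv4 multicast only, which matches the explanation in the notes.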


12.1 MBGP Syntax Conversion

12.0S to 12.1 Syntax Conversion

Occurs when upgrading to 12.1 or later
Can result in some confusing changes


12.1 Syntax Conversion


Conversion of the old 12.0S syntax to the new 12.1 syntax occurs automatically when the IOS version running in the router is upgraded from 12.0S to 12.1. Unfortunately, the conversion can produce configurations whose effect is not immediately obvious.


12.0S to 12.1 Syntax Conversion

Before Conversion
router bgp 5
 network 171.69.214.0 mask 255.255.255.0 nlri unicast multicast
 neighbor 171.69.214.38 remote-as 2 nlri unicast
 neighbor 171.69.214.50 remote-as 2 nlri multicast

After Conversion
router bgp 5
 network 171.69.214.0 mask 255.255.255.0
 neighbor 171.69.214.38 remote-as 2
 neighbor 171.69.214.50 remote-as 2
 no neighbor 171.69.214.50 activate
 !
 address-family ipv4 multicast
 neighbor 171.69.214.50 activate
 network 171.69.214.0 mask 255.255.255.0
 exit-address-family

Overrides implied neighbor activate for the ipv4 unicast address family


Syntax Conversion Example


Prior to conversion, the 12.0S configuration was:
 router bgp 5
  network 171.69.214.0 mask 255.255.255.0 nlri unicast multicast
  neighbor 171.69.214.38 remote-as 2 nlri unicast
  neighbor 171.69.214.50 remote-as 2 nlri multicast

After the conversion, the configuration has the implied ipv4 unicast address family block in the top lines as follows:
 router bgp 5
  network 171.69.214.0 mask 255.255.255.0
  neighbor 171.69.214.38 remote-as 2
  neighbor 171.69.214.50 remote-as 2
  no neighbor 171.69.214.50 activate

Again, the neighbor 171.69.214.38 remote-as 2 command in this implied ipv4 unicast address family block has an implied neighbor 171.69.214.38 activate. This automatically activates ipv4 unicast NLRI exchange, as was taking place in the original 12.0S configuration.

In addition, the no neighbor 171.69.214.50 activate overrides the implied neighbor 171.69.214.50 activate that would normally occur in the implied ipv4 unicast block. This prevents ipv4 unicast NLRI exchange with this neighbor.

The ipv4 multicast address family block also contains the neighbor 171.69.214.50 activate command, which explicitly activates ipv4 multicast NLRI exchange with this neighbor.

Finally, the network 171.69.214.0 mask 255.255.255.0 command appears in both the implied ipv4 unicast address family block AND the explicit ipv4 multicast address family block. This causes this network to be injected into both the ipv4 Unicast and Multicast RIBs.
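
The conversion shown above can be approximated with a short Python sketch. It handles only the two command forms that appear in this example (network and neighbor lines carrying an nlri clause); the function name convert_12_0s is hypothetical and the real IOS conversion covers many more commands.

```python
def convert_12_0s(lines):
    """Rewrite 12.0S 'nlri' clauses into 12.1 address-family form (simplified)."""
    top, mcast_neighbors, mcast_networks = [], [], []
    for raw in lines:
        line = raw.strip()
        base, sep, kinds = line.partition(" nlri ")
        if not sep:                      # no nlri clause: copy unchanged
            top.append(line)
            continue
        kinds = kinds.split()
        words = base.split()
        if words[0] == "network":
            if "unicast" in kinds:
                top.append(base)
            if "multicast" in kinds:
                mcast_networks.append(base)
        elif words[0] == "neighbor":
            peer = words[1]
            top.append(base)             # step 1: identify the neighbor
            if "unicast" not in kinds:
                # override the implied ipv4 unicast activation
                top.append(f"no neighbor {peer} activate")
            if "multicast" in kinds:
                mcast_neighbors.append(f"neighbor {peer} activate")
    if mcast_neighbors or mcast_networks:
        top += ["!", "address-family ipv4 multicast",
                *mcast_neighbors, *mcast_networks, "exit-address-family"]
    return top


OLD = [
    "router bgp 5",
    "network 171.69.214.0 mask 255.255.255.0 nlri unicast multicast",
    "neighbor 171.69.214.38 remote-as 2 nlri unicast",
    "neighbor 171.69.214.50 remote-as 2 nlri multicast",
]

for line in convert_12_0s(OLD):
    print(line)
```

The multicast-only neighbor is the interesting case: it produces both a no neighbor ... activate line in the implied unicast block and a neighbor ... activate line in the multicast block.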

12.1 MBGP Syntax Tricks

Disabling default ipv4 unicast family


no bgp default ipv4-unicast

Allows a clear separation of unicast and multicast configurations
All ipv4 unicast commands are placed in a separate (explicit) address-family block


Syntax Tricks
By using the no bgp default ipv4-unicast command, we can disable the default, implied ipv4 unicast address family block and its implied neighbor activation commands (which can be quite confusing in a multiprotocol environment). When this command is configured, it allows a clear separation of ipv4 unicast and multicast configuration commands. In addition, activate commands must be explicitly configured in each section.


12.1 MBGP Syntax Changes


Using no bgp default ipv4-unicast
router bgp 101
 no synchronization
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 172.16.1.2 remote-as 301
 neighbor 172.16.11.2 remote-as 201
 no auto-summary
 !
 address-family ipv4 unicast
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 neighbor 172.16.1.2 activate
 exit-address-family
 !
 address-family ipv4 multicast
 neighbor 172.16.11.2 activate
 network 172.16.21.0 mask 255.255.255.0
 network 192.168.1.0
 exit-address-family

Common neighbor definition block

Explicit ipv4 unicast address family block

Explicit ipv4 multicast address family block


Example Configuration with no bgp default ipv4-unicast


In this example, the no bgp default ipv4-unicast command has been added to the common BGP configuration block at the top of the BGP configuration. In this section, we also identify the BGP neighbors using the normal neighbor command. However, no activate commands are implied because the no bgp default ipv4-unicast command has been configured.

The next section is the explicit ipv4 unicast address family block. In this section we see that two network prefixes are being injected into the ipv4 Unicast RIB and that neighbor 172.16.1.2 has been activated for ipv4 unicast NLRI exchange. (No others are implied.)

The last section is the ipv4 multicast address family block. Here the same two network prefixes are being injected into the ipv4 Multicast RIB, and neighbor 172.16.11.2 has been activated for ipv4 multicast NLRI exchange.


Debugging MBGP
show ip bgp neighbor
asimov# show ip bgp neighbor
BGP neighbor is 10.0.10.3, remote AS 1, internal link
 Index 2, Offset 0, Mask 0x4
 BGP version 4, remote router ID 193.78.81.4
 BGP state = Established, table version = 4, up for 22:32:50
 Last read 00:00:49, hold time is 180, keepalive interval is 60 seconds
 Neighbor NLRI negotiation:
  Configured for unicast and multicast routes
  Peer negotiated unicast and multicast routes
  Exchanging unicast and multicast routes
 Minimum time between advertisement runs is 5 seconds
 Received 8916 messages, 0 notifications, 0 in queue
 Sent 8923 messages, 0 notifications, 0 in queue
 Connections established 4; dropped 3
 Last reset 22:32:59, due to User reset
 0 accepted unicast prefix consume 0 bytes of memory
 0 history unicast paths consume 0 bytes of memory
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Local host: 10.0.10.1, Local port: 11004
Foreign host: 10.0.10.3, Foreign port: 179


Debugging MBGP
The above command may be used to debug the status of a (M)BGP peer connection with a neighbor. Notice that the highlighted text indicates exactly what capabilities and NLRI are being exchanged between the router and this peer. Note: If the Neighbor NLRI negotiation field is missing, only unicast NLRI information is being exchanged.


Debugging MBGP
Old 12.0S Syntax
show ip bgp
asimov# show ip bgp
BGP table version is 4, local router ID is 10.0.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop      Metric LocPrf Weight Path
*> 10.0.100.0/24    0.0.0.0            0         32768 i

show ip mbgp
asimov# show ip mbgp
MBGP table version is 6, local router ID is 10.0.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop      Metric LocPrf Weight Path
*> 10.0.70.0/24     10.0.20.2     307200         32768 i
*> 10.0.80.0/24     10.0.20.2      10000         32768 ?

Debugging MBGP
Two different commands are currently used to display the contents of the Unicast RIB and the Multicast RIB. These are:
 show ip bgp     Shows the contents of the U-RIB
 show ip mbgp    Shows the contents of the M-RIB

The information displayed by the above commands is fundamentally the same. The only difference is that one shows the contents of the Unicast RIB and the other shows the contents of the Multicast RIB. Note: The syntax of the above commands will change in the near future to avoid the confusing practice of referring to mbgp as meaning Multicast NLRI or Multicast RIB instead of Multiprotocol BGP.


Debugging MBGP
New 12.1 Syntax
show ip bgp ipv4 unicast
asimov# show ip bgp ipv4 unicast
BGP table version is 4, local router ID is 10.0.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop      Metric LocPrf Weight Path
*> 10.0.100.0/24    0.0.0.0            0         32768 i

show ip bgp ipv4 multicast
asimov# show ip bgp ipv4 multicast
BGP table version is 6, local router ID is 10.0.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop      Metric LocPrf Weight Path
*> 10.0.70.0/24     10.0.20.2     307200         32768 i
*> 10.0.80.0/24     10.0.20.2      10000         32768 ?

Debugging MBGP
In the new 12.1 syntax, two commands display the contents of the Unicast RIB and the Multicast RIB. These are:
 show ip bgp ipv4 unicast      Shows the contents of the U-RIB
 show ip bgp ipv4 multicast    Shows the contents of the M-RIB

The information displayed by these commands is fundamentally the same as with the old syntax. The only difference between the two is that one shows the contents of the Unicast RIB and the other shows the contents of the Multicast RIB. The new syntax avoids the confusing practice of referring to mbgp as meaning Multicast NLRI or Multicast RIB instead of Multiprotocol BGP.


Debugging MBGP
Same for both Old and New Syntax
MBGP debug commands
debug ip mbgp updates

Logs Multicast related information passed in MBGP update messages.


debug ip bgp updates

Logs Unicast related information passed in MBGP update messages.


debug ip mbgp dampening [<acl>]

Logs multicast route flap dampening activity.


debug ip bgp dampening [<acl>]

Logs unicast route flap dampening activity.



Debugging MBGP
Use the following command to display Multicast NLRI passed in MBGP update messages.
 debug ip mbgp updates
Use the following command to display Unicast NLRI passed in MBGP update messages.
 debug ip bgp updates
Use the following command to display Multicast route flap dampening activity.
 debug ip mbgp dampening [<acl>]
Use the following command to display Unicast route flap dampening activity.
 debug ip bgp dampening [<acl>]
Note: The syntax of the above commands will change in the near future.


MBGP Summary

Solves part of inter-domain problem

Can exchange multicast RPF information
Uses standard BGP configuration knobs
Permits separate unicast and multicast topologies if desired

Still must use PIM to:

Build multicast distribution trees
Actually forward multicast traffic
PIM-SM recommended


MBGP Summary
MBGP solves part of the inter-domain multicast problem by allowing ASs to exchange Multicast RPF information in the form of MBGP Multicast NLRI. Because this is accomplished using an extension to the BGP protocol that makes it support multiple protocols (i.e. Multiprotocol BGP), the same BGP configuration knobs are available for both Unicast and Multicast information. The separation of unicast and multicast prefixes into separate Unicast and Multicast RIBs permits unicast and multicast traffic to follow different paths if desired.

MBGP is only one piece of the overall Inter-domain Multicast solution, and PIM must still be used to:
 Build the multicast distribution trees. (Typically via PIM-SM.)
 Actually RPF check and forward multicast traffic.

PIM-SM is recommended as it permits the use of MSDP, which solves most of the remaining issues and is covered in another section.
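
To see why separate RIBs permit different paths, consider the following Python sketch. It assumes, for illustration only, that unicast forwarding consults just the Unicast RIB and that the RPF lookup consults just the Multicast RIB; a real router's RPF selection involves more inputs. The function name longest_match and the unicast next hop 192.0.2.1 are hypothetical; the prefix 10.0.70.0/24 and next hop 10.0.20.2 are taken from the M-RIB display shown earlier.

```python
import ipaddress


def longest_match(rib, address):
    """Return the next hop of the longest prefix in `rib` covering `address`."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Same prefix, learned with different next hops in each RIB.
unicast_rib = {"10.0.70.0/24": "192.0.2.1"}      # hypothetical unicast path
multicast_rib = {"10.0.70.0/24": "10.0.20.2"}    # from the M-RIB output above

source = "10.0.70.5"
print("unicast next hop toward source:", longest_match(unicast_rib, source))
print("RPF neighbor for source:       ", longest_match(multicast_rib, source))
```

Because the two lookups use different tables, unicast packets toward the source and multicast packets arriving from it may use different links.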


Multicast Source Discovery Protocol (MSDP)


Module 11

ModuleN.ppt

1998 2001, Cisco Systems, Inc. All rights reserved.

8/21/2001 2:33 PM

Copyright © 1998-2001, Cisco Systems, Inc.

Module11.ppt

Module Objectives

Understand the issues relating to Interdomain IP Multicast
Explain fundamental concepts of MSDP
Identify steps associated with configuring and debugging MSDP


Agenda
Inter-domain Multicast
Past & Future

MSDP Overview
MSDP Peers
MSDP Messages
MSDP Mesh Groups
MSDP SA Caching
MSDP Applications

Past History
DVMRP MBONE
Virtual network overlaid (tunneled) on the unicast Internet infrastructure
DVMRP MBONE uses RIP-like routing
Flood and Prune technology
Initially instantiated by MROUTED, and later implemented by various router vendors
Very successful in academic circles

DVMRP MBone
Historically, the very limited amount of multicast traffic that flowed across the Internet used DVMRP Tunnels to interconnect the multicast-enabled portions of the Internet. Unfortunately, DVMRP is basically an extension to the RIP unicast routing protocol and has all of the problems associated with RIP as a routing protocol. DVMRP uses a Flood and Prune methodology where traffic is periodically flooded to every part of the network and pruned back where it is unwanted.

The first implementation of DVMRP was the mrouted program that runs on Unix platforms. Later implementations of DVMRP were developed by commercial router vendors.

Initially, the DVMRP MBone was limited to academic sites and was managed by a handful of dedicated academic types that kept it running smoothly. The occasional outages were not considered to be a problem since the MBone was largely seen as an academic experiment.


Past History

Problem
DVMRP can't scale to Internet sizes
Distance vector-based routing protocol
Periodic updates
Full table refresh every 60 seconds

Table sizes
Internet > 40,000 prefixes

Stability
Hold-down, count-to-infinity, etc.


DVMRP Problems
DVMRP has problems scaling to any significant size, particularly to the size of the Internet. These problems include:
 DVMRP is based on a Distance Vector routing protocol (RIP).
 Periodic updates of the entire routing table are sent every 60 seconds. This is fine for networks where the routing table is relatively small but is unthinkable for really large networks such as the Internet, where the number of prefixes (routes) exceeds 40,000.
 Distance Vector based protocols suffer from some well-known stability issues including route Holddown, Count-to-Infinity and other problems.


In the Future
BGMP (Border Gateway Multicast Protocol)

Shared tree of domains
Bidirectional trees
Explicit join-model
Joins sent toward root domain

Single root domain per group

Multicast group prefixes assigned by domain
MASC proposed as assignment method

Requires BGP4+ (aka MBGP)

Must carry group prefixes in NLRI field
Needed to build bidirectional trees

Future Border Gateway Multicast Protocol (BGMP)


Work is underway in the IETF to define a new protocol (other than PIM) that will provide for scalable Inter-domain IP Multicast. This protocol is the Border Gateway Multicast Protocol (BGMP), which has the following characteristics:

Shared Tree of Domains. BGMP uses Bidirectional Shared Trees to interconnect multicast domains. These trees are built using an Explicit join-model where the Join messages are sent toward the root domain.

Single Root Domain per Group. Each multicast domain serves as the root domain for a contiguous range of multicast addresses. This has the potential for Group state aggregation. The address allocation method proposed for dynamically allocating these contiguous ranges is called Multicast Address Set-Claim (MASC).

BGMP requires BGP4+ (MBGP). In order to build the Bidirectional Shared Trees towards a root domain, Multicast Group addresses must be distributed using MBGP in special Multicast Group NLRI. (This is not to be confused with unicast prefixes distributed in Multicast NLRI for the purpose of RPF calculation.)


In the Future
BGMP Example

[Figure: Domains A through G interconnected by BGMP border routers (BR). Domain B is the root domain for group 224.2.2.2. Receivers (r) in Domains E, F and G have joined 224.2.2.2, and Joins are sent from their border routers toward the root domain.]

BGMP Example
In the above example, multicast group 224.2.2.2 has been assigned to Domain B. Therefore, Domain B is the root domain for this multicast group and this fact is communicated to all other domains.

Let's now assume that a receiver in Domain E joins multicast group 224.2.2.2. The last-hop router R directly connected to the receiver would communicate this to the BGMP Border Routers (BR) by whatever method is appropriate for the multicast routing protocol (PIM, MOSPF, DVMRP) running in Domain E. When the BGMP BR learns that there is a receiver in its domain for group 224.2.2.2, it sends a BGMP Join for group 224.2.2.2 toward the root domain, Domain B. This Join travels domain by domain, building a branch of the Bidirectional BGMP Shared Tree from the root domain to Domain E.

If receivers in Domains F and G also join the multicast group, their last-hop routers also trigger their BGMP BRs to join the Bidirectional Shared Tree by sending BGMP Joins toward the root domain. The end result is a Bidirectional Shared Tree that connects all domains that have active receivers for group 224.2.2.2.
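
The way Joins build the shared tree domain by domain can be sketched in Python. The domain-level topology below (which domain is the next hop toward root Domain B) is a hypothetical reading of the figure, and the rule that a Join stops at the first domain already on the tree is an assumption borrowed from ordinary shared-tree behavior; neither is taken from the BGMP drafts.

```python
def send_join(domain, toward_root, tree):
    """Propagate a Join from `domain` toward the root, adding branches to `tree`.

    `toward_root` maps each non-root domain to its next-hop domain.
    `tree` is a set of (upstream, downstream) branches and is updated in place.
    """
    while domain in toward_root:
        parent = toward_root[domain]
        branch = (parent, domain)
        if branch in tree:
            break                 # this part of the tree already exists
        tree.add(branch)
        domain = parent
    return tree


# Hypothetical next hops toward the root domain B for group 224.2.2.2.
TOWARD_ROOT = {"A": "B", "C": "B", "D": "B", "G": "B", "E": "D", "F": "D"}

tree = set()
for receiver_domain in ("E", "F", "G"):
    send_join(receiver_domain, TOWARD_ROOT, tree)

for upstream, downstream in sorted(tree):
    print(f"{upstream} <-> {downstream}")
```

With this assumed topology, the Join from Domain F only adds the D to F branch, because the branch from B to D was already built by the earlier Join from Domain E.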


In the Future
BGMP Example

[Figure: The same topology of Domains A through G with BGMP border routers (BR) and root Domain B for group 224.2.2.2. A source (s) is active in Domain D; receivers (r) are in Domains E, F and G. Legend: BGMP Routing, M-IGP Routing.]

BGMP Example
Let's now assume that a source in Domain D connected to first-hop router S goes active. This source traffic is routed by the Multicast IGP (PIM, MOSPF, DVMRP) to the BGMP BRs that are on the Bidirectional Shared Tree for multicast group 224.2.2.2. When the BGMP BRs receive this traffic, they forward it up/down the Bidirectional Shared Tree.

The multicast traffic flows up and down the Bidirectional Shared Tree to the BGMP BRs in all domains that are part of the Shared Tree. The BGMP BRs that receive the multicast traffic forward it via the Multicast IGP to the last-hop routers that have directly connected receivers.

Notice that in some cases, a domain is acting as a transit domain. This is the case for the root domain, Domain B. In this case, the multicast traffic must be forwarded by the Multicast IGP in Domain B from one BGMP BR to the other so that traffic will continue to flow down the Bidirectional Shared Tree to Domain G.


In the Future
MASC (Multicast Address Set-Claim)
Multicast address space is hierarchical
Top of hierarchy is at an Internet exchange
Children get address space from parent
Results in aggregateable multicast address space

Allocation has a lifetime

Children must renew address allocation
May not receive same space at renewal time
Parent may reclaim space at renewal time
Permits reallocation of space
Complex garbage collection problem

Multicast Address Set-Claim (MASC)


The IETF is also working on a new protocol to do dynamic allocation of blocks of multicast address space. This protocol is called the Multicast Address Set-Claim (MASC) protocol. This protocol has the following characteristics:

Multicast Address space is hierarchical. At the top of the hierarchy are one or more root MASC nodes that are responsible for the dynamic allocation of multicast group ranges. (The exact range is yet to be determined. It is very likely that MASC will, at least initially, only control a subset of the global IP multicast address space.) Children request (set-claim) multicast group ranges from their parent MASC nodes. Parents allocate sub-ranges of the multicast group range(s) that they themselves have been allocated by their parent MASC nodes. This allocation scheme results in aggregateable group ranges.

Allocation has a lifetime. Children must renew their allocation with their parent. However, there is no guarantee that they will get the same range at renewal time. It is possible that they may get a reduced range or a completely different range of addresses. This permits parents to periodically reclaim address space from their children for the purpose of address reallocation. This presents MASC with a complex garbage collection problem, which is made even more complex by the fact that it must be performed in a distributed fashion across all MASC nodes in the Internet.
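
The two characteristics above (hierarchical sub-ranges and allocations with a lifetime) can be illustrated with a small Python model. This is not the MASC protocol: the class MascParent, its methods, the first-fit policy, and the parent range 224.2.0.0/16 are all hypothetical choices made only to show how a child's range always falls inside its parent's range and how a renewal after expiry may return a different range.

```python
import ipaddress


class MascParent:
    """Toy parent node that leases sub-ranges of its own block to children."""

    def __init__(self, block):
        self.block = ipaddress.ip_network(block)
        self.leases = {}          # child name -> (network, expiry time)

    def expire(self, now):
        """Reclaim every lease whose lifetime has run out."""
        self.leases = {child: (net, end)
                       for child, (net, end) in self.leases.items()
                       if end > now}

    def claim(self, child, prefixlen, now, lifetime):
        """Lease the first free sub-range of the requested size (first fit)."""
        self.expire(now)
        in_use = [net for net, _ in self.leases.values()]
        for candidate in self.block.subnets(new_prefix=prefixlen):
            if not any(candidate.overlaps(net) for net in in_use):
                self.leases[child] = (candidate, now + lifetime)
                return candidate
        return None               # parent's block is exhausted


parent = MascParent("224.2.0.0/16")
a = parent.claim("A", 24, now=0, lifetime=10)
b = parent.claim("B", 24, now=0, lifetime=10)
print("A:", a, " B:", b)

# Both leases have expired by time 20; B renews first and gets a new range.
b_renewed = parent.claim("B", 24, now=20, lifetime=10)
print("B after renewal:", b_renewed)
```

Since every child range is carved out of the parent's block, the parent can still advertise its single aggregate range on behalf of all its children.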


In the Future

BGMP and MASC are a long ways off


Both are quite complex to implement
Still only in draft proposal stages

ISPs want to deploy multicast now


What are their minimum requirements?


In the Future: A Summary


The complex nature of BGMP and MASC makes them non-trivial protocols to implement. At the present time, both BGMP and MASC are only in the draft stages (and have been for quite some time). Unfortunately, ISPs want to deploy Inter-Domain multicast now.


ISP Requirements to Deploy Now


Want an explicit join protocol for efficiency
PIM-SM

Use existing (unicast) operation model


MBGP

Will not share RP with competitors


Results in third-party resource dependency
Hmmm

Want flexibility regarding RP placement


Hmmm

ISP Requirements to deploy IP Multicast now


Explicit Join Model Protocol. The Flood and Prune behavior of Dense mode protocols precludes them from being used by most ISPs. However, this requirement can be met using PIM-SM today.

Existing unicast-like operations model. ISPs want to minimize the impact of multicast on their operations model. This can be achieved by using MBGP to carry both unicast and multicast routing information. Because MBGP uses the same configuration, maintenance and policy controls as BGP, virtually no new training is required of operations staff.

Will not share an RP with their competitors. Sharing an RP can result in a third-party dependency, which is unacceptable to all ISPs. Something new is required here.

Want flexibility regarding the placement of their RPs. Interfacing multiple SM domains together has always been a problem. Historically, the workaround was to place all SM domain RPs at a common location. This is clearly unacceptable to all ISPs, and something new is required to meet this requirement.


ISP Requirements to Deploy Now

Interim solution: MBGP + PIM-SM


Environment
ISPs run MBGP and PIM-SM (internally)
ISPs multicast peer at a public interconnect

Deployment
Each ISP puts their own administered RP attached to the interconnect
That RP as well as all border routers run MBGP
The interconnect runs dense-mode PIM


Interim Solution: MBGP + PIM-SM


In order to move forward with Inter-domain multicast across the Internet, several ISPs agreed to an interim solution using only PIM-SM and MBGP.

Environment: Each ISP ran MBGP and PIM-SM internally in their network. Each ISP MBGP peered with the other multicast ISPs at a public multicast interconnect point.

Deployment: Each ISP put their single RP at the interconnect point. This router MBGP peered with the other multicast routers at the interconnection point in order to exchange multicast NLRI for the purpose of RPF calculation. To deal with the problems of connecting multiple PIM-SM domains together, the interconnect network was run in Dense mode. As a result, any multicast traffic sent by a source in one ISP domain would be flooded across the interconnect to all other ISP domains.


ISP Requirements to Deploy Now

Interim Solution: MBGP + PIM-SM


[Figure: Four ISPs (A, B, C and D), each running PIM-SM and iMBGP internally, each with its own RP attached to a common Public Interconnect. Labels at the interconnect: AS 10888, eMBGP.]

Interim Solution: MBGP + PIM-SM


The drawing above illustrates the interim solution.


ISP Requirements to Deploy Now

Interim solution: MBGP + PIM-SM


Too restrictive regarding RP placement
Need multiple interconnect points between ISPs

Using multiple interconnect points


Fine if all ISP RPs at same interconnect
Can degenerate into large PIM-DM cloud

Back to the requirements list


Interim Solution: MBGP + PIM-SM


While the interim solution allowed the ISPs to move forward and do some initial testing of native multicast across portions of the Internet, it was clearly not a solution that would scale.

Too restrictive regarding RP placement. For redundancy purposes, multiple RPs per ISP were needed.

Multiple Interconnection Points. Not all ISPs share the same interconnection point. Therefore, in order to make the model work, it was necessary to link (generally via tunnels) multiple interconnection points into a single interconnection AS. As the size of this interconnection AS grew, native multicast across the Internet would degenerate into a large PIM-DM cloud.


ISP Requirements to Deploy Now

Must interconnect PIM-SM domains


Interconnect using shared trees
That's BGMP! Can't wait

Interconnect using source trees


Need a way to discover all multicast sources
Hmmm. Interesting idea!

Solution: MSDP
Multicast Source Discovery Protocol


Must somehow interconnect multiple PIM-SM domains


Solution 1: Interconnect using Shared Trees. When all is said and done, this solution results in something as complex as BGMP. In either case, ISPs didn't want to wait for something that complex to be developed.

Solution 2: Interconnect using Source Trees. PIM-SM has the ability to send explicit (S,G) Joins toward the source and to join the source tree. This is, of course, the basic way that an RP receives (S,G) traffic from sources within its domain. (That is to say, by explicitly joining the Source Tree.) If RPs could somehow learn of the existence of active sources in other PIM-SM domains, they could simply send (S,G) Joins toward those sources and join their Source Trees. This resulted in the concepts of the Multicast Source Discovery Protocol (MSDP).


Agenda
Inter-domain Multicast
Past & Future

MSDP Overview
MSDP Peers
MSDP Messages
MSDP Mesh Groups
MSDP SA Caching
MSDP Applications

MSDP Overview

Simple but elegant


Abandon inter-domain shared trees; just use inter-domain source trees
Reduces the problem to locating active sources
RP or receiver last-hop can join inter-domain source tree


MSDP Overview
By abandoning the notion of inter-domain Shared-Trees and using only inter-domain Source-Trees, the complexity of interconnecting PIM-SM domains is reduced considerably. The remaining problem becomes one of communicating the existence of active sources between the RPs in the PIM-SM domains.
The RP can join the inter-domain Source-Tree for sources that are sending to groups for which the RP has receivers. This is possible because the RP is the root of the Shared-Tree, which has branches to all points in the domain where there are active receivers.
Note: If the RP either has no Shared-Tree for a particular group or has a Shared-Tree whose outgoing interface list is Null, it does not send a Join to the source in another domain.
Once a last-hop router learns of a new source outside the PIM-SM domain (via the arrival of a multicast packet from the source down the Shared-Tree), it too can send a Join toward the source and join the Source-Tree.


MSDP Overview

Works with PIM-SM only


RPs know about all sources in a domain
  Sources cause a PIM Register to the RP
  Can tell RPs in other domains of its sources
    Via MSDP SA (Source Active) messages

RPs know about receivers in a domain
  Receivers cause a (*, G) Join to the RP
  RP can join the source tree in the peer domain
    Via normal PIM (S, G) joins
    Only necessary if there are receivers for the group


MSDP Overview
The entire concept of MSDP depends on the RPs in the interconnected domains being the be-all, know-all oracle that is aware of all sources and receivers in its local domain. As a result, MSDP can only work with PIM-SM.
RPs know about all sources in a domain. Whenever a source goes active in a PIM-SM domain, the first-hop router immediately informs the RP of this via the use of PIM Register messages. (S,G) state for the source is kept alive in the RP by normal PIM-SM mechanisms as long as the source is actively sending. As a result, the RP can inform other RPs in other PIM-SM domains of the existence of active sources in its local domain. This is accomplished via MSDP Source Active (SA) messages.
RPs know about receivers in a domain. Receivers cause last-hop routers to send (*, G) Joins to the RP to build branches of a Shared-Tree for a group. If an RP has (*, G) state for a group and the outgoing interface list of the (*, G) entry is not Null, it knows it has active receivers for the group. Therefore, when it receives an SA message announcing an active source for group G in another domain, it can send (S, G) Joins toward the source in the other domain.
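The RP's decision described in the notes above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `StarGEntry` class, the `should_join_source` function and the interface name `Serial0` are hypothetical names invented for this sketch, and the mroute table is modeled as a plain dictionary keyed by group address.

```python
from dataclasses import dataclass, field


@dataclass
class StarGEntry:
    """Hypothetical model of (*, G) state on an RP."""
    group: str
    oil: set = field(default_factory=set)  # outgoing interface list


def should_join_source(mroute_table, group):
    """Return True if the RP should send an (S, G) Join for an SA in `group`.

    The RP joins only when it has (*, G) state for the group AND the
    outgoing interface list of that entry is not Null (empty).
    """
    entry = mroute_table.get(group)
    return entry is not None and len(entry.oil) > 0


# Example state: one group with receivers, one group with a Null OIL.
table = {
    "224.2.2.2": StarGEntry("224.2.2.2", {"Serial0"}),
    "224.3.3.3": StarGEntry("224.3.3.3"),
}
```

With this state, an SA for group 224.2.2.2 would trigger a Join, while SAs for 224.3.3.3 (Null outgoing interface list) or for a group with no (*, G) state at all would not.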


MSDP Overview
MSDP peers talk via TCP connections
UDP encapsulation option

Source Active (SA) messages


Peer-RPF forwarded to prevent loops
  RPF check on AS-PATH back to the peer RP
  If successful, flood SA message to other peers
  Stub sites accept all SA messages
    Since they have only one exit (e.g., default peer)

MSDP speaker may cache SA messages
  Other MSDP speakers can query for active sources
  Reduces join latency
    No need to wait for periodic SA message

MSDP Overview
MSDP Peers (typically RPs) are connected via TCP sessions. Note: The MSDP specification describes a UDP encapsulation option but this is not currently available in the IOS implementation.
Source Active (SA) Messages: RPs periodically originate SA messages for sources that are active in their local domain. These SA messages are sent to all active MSDP peers. When an MSDP speaker receives an SA message from one of its peers, it is RPF forwarded to all of its other peers. An RPF check is performed on the arriving SA message (using the originating RP address in the SA message) to ensure that it was received via the correct AS-PATH. Only if this RPF check succeeds is the SA message flooded downstream to its peers. This prevents SA messages from looping through the Internet. Stub domains (i.e. domains with only a single MSDP connection) do not have to perform this RPF check since there is only a single entrance/exit.
MSDP speakers may cache SA messages. Normally, these messages are not stored, in order to minimize memory usage. However, by storing SA messages, join latency can be reduced as RPs do not have to wait for the arrival of periodic SA messages when the first receiver joins the group. Instead, the RP can scan its SA cache to immediately determine what sources are active and send (S, G) Joins. Non-caching MSDP speakers can query caching MSDP speakers in the same domain for information on active sources for a group.
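The peer-RPF flooding behavior described above can be pictured with the following minimal Python sketch. It assumes the RPF peer for the originating RP has already been determined by some other means (the RPF rules are covered later in this module); the function name `forward_sa` and the peer names are hypothetical.

```python
def forward_sa(received_from, rpf_peer, peers):
    """Return the list of peers an arriving SA message is flooded to.

    received_from: MSDP peer the SA arrived from
    rpf_peer:      MSDP peer that passes the RPF check for the originating RP
                   (assumed to be precomputed in this sketch)
    peers:         all MSDP peers of this speaker
    """
    if received_from != rpf_peer:
        return []  # RPF check failed: drop the SA, flood to nobody
    # RPF check passed: flood to every peer except the one it came from.
    return [p for p in peers if p != received_from]


peers = ["peer1", "peer2", "peer3"]
accepted = forward_sa("peer1", "peer1", peers)
dropped = forward_sa("peer2", "peer1", peers)
```

Here the copy arriving from `peer1` is flooded to `peer2` and `peer3`, while the copy of the same SA arriving from `peer2` is dropped, which is what keeps SA messages from looping.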


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions interconnect the RPs of Domains A through E. A Join (*, 224.2.2.2) is sent on behalf of a receiver (r) to the RP in Domain E.]


MSDP Example
In the example above, PIM-SM domains A through E each have an RP which is an MSDP speaker. The solid lines between these RPs represent the MSDP peer sessions via TCP and not actual physical connectivity between the domains. Note: The physical connectivity between the domains is not shown in the drawing above.
Assume that a receiver in Domain E joins multicast group 224.2.2.2 which, in turn, causes its DR to send a (*, G) Join for this group to the RP. This builds a branch of the Shared-Tree from the RP in Domain E to the DR as shown.
When a source goes active in Domain A, the first-hop router (S) sends a PIM Register message to the RP. This informs the RP in Domain A that a source is active in the local domain. The RP responds by originating an (S, G) SA message for this source and sending it to its MSDP peers in domains B and C. (The RP will continue to send these SA messages periodically as long as the source remains active.)
When the RPs in domains B and C receive the SA messages, they are RPF checked and forwarded downstream to their MSDP peers. These SA messages continue to travel downstream and eventually reach the MSDP peers (the RPs) in domains D and E.
Note: The SA message traveling from domain B to domain C will fail the RPF check at the domain C RP (MSDP speaker) and will be dropped. However, the SA message arriving at domain C from domain A will RPF correctly and will be processed and forwarded on to domains D and E.


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions and Source Active messages. The source (s) in Domain A causes a Register (192.1.1.1, 224.2.2.2) to be sent to the Domain A RP, which originates an SA message for (192.1.1.1, 224.2.2.2). The SA messages are flooded between the RPs of Domains A through E.]


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions between the RPs of Domains A through E. The RP in Domain E sends a Join (S, 224.2.2.2) toward the source (s) in Domain A; the receiver (r) is in Domain E.]


MSDP Example
Once the SA message arrives at the RP (MSDP speaker) in domain E, it sees that it has an active branch of the Shared-Tree for group 224.2.2.2. It responds to the SA message by sending an (S, G) Join toward the source.
IMPORTANT: The (S, G) Join will follow the normal inter-domain routing path from the RP to the source. This inter-domain routing path is not necessarily the same path as that used by the MSDP connections. In order to emphasize this point, the (S, G) Join is shown following a different path between domains.


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions and multicast traffic. Traffic from the source (s) in Domain A flows along the Source Tree to the RP in Domain E and on to the receiver (r).]


MSDP Example
Once the (S, G) Join message reaches the first-hop router (S) in domain A, (S, G) traffic begins to flow to the RP in domain E via the Source Tree shown.
IMPORTANT: The (S, G) traffic will not flow over the TCP MSDP sessions. It will instead follow the path of the Source Tree that was built in the preceding step.


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions and multicast traffic. A Join (S, 224.2.2.2) is sent from Domain E toward the source (s) in Domain A.]


MSDP Example
Once the (S, G) traffic reaches the last-hop router (R) in domain E, the last-hop router may optionally send an (S, G) Join toward the source in order to bypass the RP in domain E. This is shown in the above example.


MSDP Overview
MSDP Example
[Figure: MSDP peer sessions and multicast traffic. Traffic from the source (s) in Domain A flows along the Source Tree to the receiver (r) in Domain E.]


MSDP Example
At this point in the example, the (S, G) traffic is flowing to the last-hop router (R) in domain E via the Source-Tree as shown in the above example.


Back to ISP Requirements


Want an explicit join protocol for efficiency
PIM-SM

Use existing (unicast) operation model


MBGP

Will not share RP with competitors


MSDP

Want flexibility regarding RP placement


MSDP

ISP Requirements Revisited


The requirements have all now been met using a combination of PIM-SM, MBGP and MSDP.
Want an Explicit Join Protocol: Again, PIM-SM meets this requirement.
Want to use an existing unicast operations model: The extension of BGP to MBGP (BGP4+) permits both unicast and multicast traffic flows to be configured and managed using the same existing set of tools.
Will not share RPs with competitors: MSDP permits each PIM-SM domain to have its own RP for each group.
Want flexibility regarding placement of the RPs: RPs can be placed anywhere in the PIM-SM domain as long as they are linked via MSDP to other RPs in other domains. (Note: It is not necessary to have a full mesh of MSDP connections. It is sufficient to have a single MSDP connection that leads to the rest of the MSDP speakers (RPs) in the Internet.)



MSDP Peers
MSDP establishes a neighbor relationship between MSDP peers
Peers connect using TCP port 639
Lower address peer initiates connection
Higher address peer waits in LISTEN state

Peers send keepalives every 60 secs. (fixed)
Peer connection reset after 75 seconds if no MSDP packets or keepalives are received


MSDP Peers
Like BGP, MSDP establishes neighbor relationships with other MSDP peers. MSDP peers connect using TCP port 639. The peer with the lower IP address takes the active role of opening the TCP connection. The peer with the higher IP address waits in LISTEN state for the other to make the connection.
MSDP peers send Keepalives every 60 seconds. The arrival of data performs the same function as the Keepalive and keeps the session from timing out. If no Keepalive or data is received for 75 seconds, the TCP connection is reset and reopened.
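The connection-role and timer rules above can be illustrated with a short Python sketch. The function and constant names are hypothetical; only the values (60 and 75 seconds, lower address initiates) come from the text. Addresses are compared numerically via the standard `ipaddress` module, since a plain string comparison would order 220.220.8.1 and 220.220.16.1 incorrectly.

```python
import ipaddress

KEEPALIVE_INTERVAL = 60  # seconds between keepalives (fixed)
HOLD_TIME = 75           # seconds without keepalive or data before reset


def connection_role(local_addr, peer_addr):
    """Return 'active' if this speaker opens the TCP connection, else 'listen'."""
    local = ipaddress.ip_address(local_addr)
    peer = ipaddress.ip_address(peer_addr)
    if local == peer:
        raise ValueError("local and peer addresses must differ")
    # The lower address initiates; the higher address waits in LISTEN state.
    return "active" if local < peer else "listen"


def session_expired(last_rx_time, now):
    """True if no keepalive or data has arrived for HOLD_TIME seconds."""
    return (now - last_rx_time) >= HOLD_TIME
```

For instance, a speaker at 220.220.8.1 peering with 220.220.16.1 would take the active role, and a session whose last received message is 75 or more seconds old would be reset.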


MSDP Peers

MSDP peers must run BGP!


BGP NLRI is used to RPF check arriving SA messages.
May use NLRI from MRIB, URIB or both

Exceptions:
When peering with only a single MSDP peer.
When using an MSDP Mesh-Group.


MSDP Peers
MSDP speakers must run BGP. This requirement is due to the fact that the SA message RPF check mechanism uses AS-PATH information contained in the MBGP M-RIB or URIB.
There are some special cases where the requirement to perform an RPF check on the arriving SA message is suspended. This is the case when there is only a single MSDP peer connection or when MSDP mesh groups are in use. In these cases, (M)BGP is not necessary.


MSDP Peers
[Figure: three RPs with Loopback0 addresses 220.220.8.1, 220.220.16.1 and 220.220.32.1 (Router B), interconnected by BGP TCP/IP peer connections and MSDP TCP/IP peer connections.]

interface Loopback0
 ip address 220.220.8.1 255.255.255.255
ip msdp peer 220.220.16.1 connect-source Loopback0
ip msdp peer 220.220.32.1 connect-source Loopback0

MSDP peer connections are established using the MSDP peer configuration command
ip msdp peer <ip-address> [connect-source <intfc>]

MSDP Peers
Peer connections are established by the use of the following IOS command:
ip msdp peer <ip-address> [connect-source <interface>]
In the above example, Router A has MSDP peer connections with both Routers B and C, using their Loopback addresses as the connection addresses.


MSDP Peers
[Figure: three RPs with Loopback0 addresses 220.220.8.1, 220.220.16.1 and 220.220.32.1 (Router B), interconnected by BGP TCP/IP peer connections and MSDP TCP/IP peer connections.]

interface Loopback0
 ip address 220.220.16.1 255.255.255.255
ip msdp peer 220.220.8.1 connect-source Loopback0
ip msdp peer 220.220.32.1 connect-source Loopback0

MSDP peer connections are established using the MSDP peer configuration command
ip msdp peer <ip-address> [connect-source <intfc>]

MSDP Peers
In the above example, Router C has MSDP peer connections with both Routers A and B, using their Loopback addresses as the connection addresses.


MSDP Peers
[Figure: three RPs with Loopback0 addresses 220.220.8.1 (Router A), 220.220.16.1 and 220.220.32.1 (Router B), interconnected by BGP TCP/IP peer connections and MSDP TCP/IP peer connections.]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp peer 220.220.8.1 connect-source Loopback0
ip msdp peer 220.220.16.1 connect-source Loopback0

MSDP peer connections are established using the MSDP peer configuration command
ip msdp peer <ip-address> [connect-source <intfc>]

MSDP Peers
In the above example, Router B has MSDP peer connections with both Routers A and C, using their Loopback addresses as the connection addresses.


MSDP Peers
[Figure: a stub network RP (Lo0 220.220.32.1) with a single MSDP TCP/IP peer connection to an RP (Lo0 220.220.8.1) in the ISP.]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp default-peer 220.220.8.1

Stub-networks may use default peering without being an MBGP or BGP peer by using the MSDP default-peer configuration command.
ip msdp default-peer <ip-address>

MSDP Peers
Stub networks may use default peering to a single MSDP peer. This eliminates the need to run (M)BGP to have the information necessary to perform the RPF check on arriving SA messages. Note: Since there is only a single connection, there is no need to perform the RPF check since this is the only path that SA messages can take.
The format of the IOS command that establishes a default peer connection is:
ip msdp default-peer <ip-address>


MSDP Peers
[Figure: a stub network RP (Lo0 220.220.32.1) with MSDP TCP/IP peer connections to an RP in ISP1 (Lo0 220.220.8.1) and an RP in ISP2 (Lo0 192.168.2.2).]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp default-peer 220.220.8.1
ip msdp default-peer 192.168.2.2

Multiple default-peers may be configured in case connection to first default-peer goes down.


MSDP Peers
Stub networks may configure additional secondary default peer connections to provide some redundancy in case the primary default peer goes down. In the above example, the primary default peer connection is to Router A (220.220.8.1). The secondary default peer connection is to Router C (192.168.2.2). This connection will not be activated unless the connection to Router A is lost.


MSDP Peers
[Figure: the same stub network RP (Lo0 220.220.32.1) with default peers in ISP1 (Lo0 220.220.8.1) and ISP2 (Lo0 192.168.2.2); the connection to the first default peer is marked X (lost).]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp default-peer 220.220.8.1
ip msdp default-peer 192.168.2.2

When connection to first default-peer is lost, the next one in the list is tried.


MSDP Peers
Continuing with the previous example, the primary default peer connection to Router A (220.220.8.1) has gone down. The secondary default peer connection to Router C is now activated.
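The failover behavior described in the last two examples can be modeled roughly as "use the first configured default peer whose connection is available". The Python sketch below is a simplified illustration only; the function name `active_default_peer` and the idea of passing in a set of reachable peers are assumptions made for this example and do not describe how IOS tracks peer state internally.

```python
def active_default_peer(default_peers, reachable):
    """Return the default peer that should be in use, or None.

    default_peers: peers in configured order (primary first)
    reachable:     set of peers whose connection can currently be established
    """
    for peer in default_peers:
        if peer in reachable:
            return peer  # earlier entries in the list take precedence
    return None


configured = ["220.220.8.1", "192.168.2.2"]
normal = active_default_peer(configured, {"220.220.8.1", "192.168.2.2"})
failover = active_default_peer(configured, {"192.168.2.2"})
```

In the normal case the primary peer 220.220.8.1 is selected; once it is lost, the next peer in the list (192.168.2.2) is selected instead.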


MSDP Peers
[Figure: a stub network RP (Lo0 220.220.32.1) with a single MSDP TCP/IP peer connection to an RP (Lo0 220.220.8.1) in the ISP.]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp peer 220.220.8.1 connect-source Loopback0

Stub-networks configured with only a single MSDP peer are treated in the same manner as when a single default-peer is configured. (i.e. BGP is not required.)

MSDP Peers
Stub networks may configure a single MSDP peer using the normal ip msdp peer IOS command. When only a single MSDP peer is configured in this manner, it is treated in the same manner as a default peering. This eliminates the need to run (M)BGP to have the information necessary to perform the RPF check on arriving SA messages. Note: Since there is only a single connection, there is no need to perform the RPF check since this is the only path that SA messages can take.


MSDP Peers
[Figure: an RP (Lo0 220.220.32.1) with both BGP TCP/IP and MSDP TCP/IP peer connections to an RP in ISP1 (Lo0 220.220.8.1) and an RP in ISP2 (Lo0 192.168.2.2).]

interface Loopback0
 ip address 220.220.32.1 255.255.255.255
ip msdp peer 220.220.8.1 connect-source Loopback0
ip msdp peer 220.220.16.1 connect-source Loopback0

Remember: BGP is necessary when multiple MSDP peers are configured.


MSDP Peers
If more than one MSDP peer is configured using the ip msdp peer command, (M)BGP must also be configured. In the example above, Router B has active MSDP peering sessions with both Router A and Router C. In this case, (M)BGP must also be configured so that Router B has the necessary AS-PATH information to properly RPF Check arriving SA messages. Note: The only exception to this rule is if all three routers are in an MSDP Mesh Group. (Mesh Groups are discussed in a later section.)


MSDP Peers
Showing MSDP Peers
show ip msdp summary

sj-mbone# show ip msdp summary
MSDP Peer Status Summary
Peer Address     AS     State  Uptime/   Reset  Peer Name
                               Downtime  Count
192.150.44.254   10888  Up     1d19h     10     pao5.pao4.verio.net
192.150.44.250   10876  Up     04:52:34  25     maoz.com

Clearing MSDP Peers


clear ip msdp peer <peer-address>


MSDP Peers
Summary information on a router's MSDP peer connections can be displayed using the following command:
show ip msdp summary
An MSDP connection can be reset by using the following command:
clear ip msdp peer <peer-address>


MSDP Peers
Showing MSDP Peer detail status
show ip msdp peer [<peer-address>]
sj-mbone# show ip msdp peer
MSDP Peer 192.150.44.254 (pao5.pao4.verio.net), AS 10888
Description: PAIX
  Connection status:
    State: Up, Resets: 10, Connection source: none configured
    Uptime(Downtime): 1d19h, Messages sent/received: 148699/8689
    Output messages discarded: 0
    Connection and counters cleared 5d14h ago
  SA Filtering:
    Input filter: 111, route-map: none
    Output filter: 111, route-map: none
  SA-Requests:
    Input filter: none
    Sending SA-Requests to peer: disabled
  Peer ttl threshold: 32
  Input queue size: 0, Output queue size: 0


MSDP Peers
Detailed information on a router's MSDP peer connections can be displayed using the following command:
show ip msdp peer [<peer-address>]



MSDP Messages
MSDP Message Contents
One or more messages (in TLV format)
  Keepalives
  Source Active (SA) Messages
  Source Active Request (SA-Req) Messages
  Source Active Response (SA-Resp) Messages

Source Active (SA) Messages
  Used to advertise active Sources in a domain
  Can also carry initial multicast packet from source
    Hack for Bursty Sources (a la SDR)

SA Message Contents:
  IP Address of Originating RP
  Number of (S, G) pairs being advertised
  List of active (S, G)s in the domain
  Encapsulated Multicast packet [optional]

MSDP Message Contents


There are four basic MSDP message types, each encoded in its own TLV format:
  Keepalives
  Source Active (SA)
  Source Active Request (SA-Req)
  Source Active Response (SA-Resp)
Source Active (SA) Messages: These messages are used to advertise active sources in a domain. In addition, these SA messages may contain the initial multicast data packet that was sent by the source. Carrying this first data packet in the initial SA message helps to deal with the bursty source problem, such as low-rate SDR announcements.
SA Messages contain the IP address of the originating RP as well as one or more (S,G) pairs being advertised. In addition, the SA message may contain an encapsulated data packet.


MSDP Messages
MSDP Message Contents (cont.)
SA Request (SA-Req) Messages
  Used to request a list of active sources for a group
    Sent to an MSDP SA Cache Server
    Reduces Join Latency to active sources
  SA Request Messages contain:
    Requested Group Address

SA Response (SA-Resp) Messages
  Sent in response to an SA Request message
  SA Response Messages contain:
    IP Address of Originator (usually an RP)
    Number of (S, G) pairs being advertised
    List of active (S, G)s in the domain

Keepalive messages
  Used to keep MSDP peer connection up

MSDP Message Contents


Source Active Request (SA-Req) Messages: These messages are used to request a list of active sources for a specific group. These messages are sent to an MSDP SA Cache Server that is maintaining a list of active (S, G) pairs in its SA cache. Join latency can be reduced by using this technique to request the list of active sources for a group instead of having to wait up to 60 seconds for all active sources in the group to be readvertised by the originating RP(s).
Source Active Response (SA-Resp) Messages: These messages are sent by the MSDP SA Cache Server in response to an SA-Req message. An SA-Resp message contains the IP address of the originating RP as well as one or more (S, G) pairs of the active sources in the originating RP's domain.
Keepalive Messages: These messages are sent every 60 seconds in order to keep the MSDP session active. If no Keepalives or SA messages are received for 75 seconds, the MSDP session is cleared and reopened.
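The SA-Req/SA-Resp exchange can be pictured as a lookup against a per-group cache. The Python sketch below is a toy model under that assumption; the `SACache` class and its method names are hypothetical, and it ignores message encoding and cache entry timeouts entirely.

```python
from collections import defaultdict


class SACache:
    """Toy SA cache: remembers (source, originating RP) pairs per group."""

    def __init__(self):
        self._by_group = defaultdict(set)

    def learn(self, origin_rp, source, group):
        """Record an (S, G) pair carried in a received SA message."""
        self._by_group[group].add((source, origin_rp))

    def sa_response(self, group):
        """Answer an SA-Req: list the active sources known for `group`."""
        return sorted(self._by_group.get(group, ()))


cache = SACache()
cache.learn("172.16.6.1", "192.1.1.1", "224.2.2.2")
cache.learn("172.16.6.1", "192.1.1.2", "224.2.2.2")
reply = cache.sa_response("224.2.2.2")
```

A requester receiving `reply` could send (S, G) Joins immediately for both sources rather than waiting for the next periodic SA message.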


Receiving SA Messages
SA Message RPF Check
Accept SAs via a single deterministic path
Ignore all other arriving SAs Necessary to prevent SAs from looping endlessly

Problem
Need to know MSDP topology of Internet
But, MSDP does not distribute topology data!

Solution
Use (m)BGP data to infer MSDP topology.
Impact:
The MSDP topology must follow BGP topology.
An MSDP peer must generally also be an (m)BGP peer.

SA Message RPF Check


SA messages must only be accepted from the MSDP RPF peer that is on the best path back towards the originator. The same SA message arriving from other MSDP peers must be ignored or SA loops can occur.
Deterministically selecting the MSDP RPF peer for an arriving SA message requires knowledge of the MSDP topology. However, MSDP does not distribute topology information in the form of routing updates. This means that the MSDP topology must be inferred via some other means.
The solution is to use the (m)BGP routing data as the best approximation of the MSDP topology for the SA RPF check mechanism. This has the following implications: The MSDP topology must follow the same general topology as the BGP peer topology. This means that, with a couple of exceptions, an MSDP peer generally should also be an (m)BGP peer.


Receiving SA Messages
RPF Check Rules depend on peering
Rule 1: Sending MSDP peer = i(m)BGP peer
Rule 2: Sending MSDP peer = e(m)BGP peer
Rule 3: Sending MSDP peer != (m)BGP peer

Exceptions:
RPF check is skipped when:
  Sending MSDP peer = Originating RP
  Sending MSDP peer = Mesh-Group peer
  Sending MSDP peer = only MSDP peer
    (i.e. the default-peer or the only msdp-peer configured.)

Receiving SA Messages
RPF Check rules depend on the BGP peering between the MSDP peers.
  Rule 1: Applied when the sending MSDP peer is also an i(m)BGP peer.
  Rule 2: Applied when the sending MSDP peer is also an e(m)BGP peer.
  Rule 3: Applied when the sending MSDP peer is not an (m)BGP peer.
RPF Checks are not done in the following cases:
  If the sending MSDP peer is the only MSDP Peer. This would be the case if a single msdp-peer command is configured or if only the default-peer command is used.
  If the sending MSDP peer is a Mesh-Group peer.
  If the sending MSDP peer address is the RP address contained in the SA message.
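The three exception cases listed above can be expressed as a single predicate. The following Python sketch is illustrative only; the function name `rpf_check_skipped` and its parameters are hypothetical, and the peer sets are assumed to be available from configuration.

```python
def rpf_check_skipped(sending_peer, originating_rp, configured_peers,
                      mesh_group_peers):
    """Return True if the SA RPF check is not performed for this SA.

    sending_peer:     address of the MSDP peer the SA arrived from
    originating_rp:   RP address contained in the SA message
    configured_peers: all MSDP peers configured on this router
    mesh_group_peers: the subset of peers that are Mesh-Group members
    """
    if len(configured_peers) == 1 and sending_peer in configured_peers:
        return True  # the only MSDP peer (single peer or default-peer)
    if sending_peer in mesh_group_peers:
        return True  # Mesh-Group peer
    if sending_peer == originating_rp:
        return True  # SA received directly from the originating RP
    return False


peers = {"220.220.8.1", "220.220.16.1"}
```

With two configured peers and no mesh group, an SA from 220.220.8.1 is exempt only if 220.220.8.1 is itself the originating RP named in the SA.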


Receiving SA Messages
Determining Applicable RPF Rule
Using IP address of sending MSDP peer
  Find (m)BGP neighbor w/matching IP address
  IF (no match found)
    Use Rule 3
  IF (matching neighbor = i(m)BGP peer)
    Use Rule 1
  ELSE {matching neighbor = e(m)BGP peer}
    Use Rule 2

Implication
The MSDP peer address must be configured using the same IP address as the (m)BGP peer!

Determining the Applicable RPF Rule


Cisco IOS uses the following logic to determine which RPF rule will be applied:
Find the (m)BGP neighbor that has the same IP address as the sending MSDP peer.
  IF no match is found
    Apply Rule 3
  IF the matching (m)BGP neighbor is an internal BGP peer
    Apply Rule 1
  IF the matching (m)BGP neighbor is an external BGP peer
    Apply Rule 2
The implication of the above rule selection logic is the following: The IP address used to configure an MSDP peer on a router must match the IP address used to configure the (m)BGP peer on the same router.
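The rule selection logic above maps directly onto a small lookup. In the Python sketch below, the (m)BGP neighbor table is assumed to be a dictionary from neighbor address to the string "internal" or "external"; that representation and the function name `select_rpf_rule` are assumptions made for illustration.

```python
def select_rpf_rule(sending_msdp_peer, bgp_neighbors):
    """Return 1, 2 or 3: the RPF rule applied to an SA from this peer.

    bgp_neighbors maps each configured (m)BGP neighbor address to
    "internal" or "external".
    """
    kind = bgp_neighbors.get(sending_msdp_peer)
    if kind is None:
        return 3  # sending MSDP peer is not an (m)BGP peer
    if kind == "internal":
        return 1  # sending MSDP peer is an i(m)BGP peer
    return 2      # sending MSDP peer is an e(m)BGP peer


neighbors = {"172.16.3.1": "internal", "192.168.2.2": "external"}
```

Note that the match is made on the exact address: if the MSDP peer were configured with a different address of the same router than the one used for the (m)BGP session, the lookup would miss and Rule 3 would be selected, which is the implication stated above.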


RPF Check Rule 1


When MSDP peer = i(m)BGP peer
Find Best Path to RP in BGP Tables
  Search MRIB first then URIB.
  If no path to Originating RP found, RPF Fails

Note BGP Neighbor that advertised path
  (i.e. IP Address of BGP peer that sent us this path)

Warning:
  This is not the same as the Next-hop of the path!!!
  i(m)BGP peers normally do not set Next-hop = Self.
  This is also not necessarily the same as the Router-ID!

Rule 1 Test Condition:
  MSDP Peer address = BGP Neighbor address?
    If Yes, RPF Succeeds

RPF Check Rule 1


Applied when the sending MSDP peer is also an i(m)BGP peer.
Search the BGP MRIB for the best path to the RP that originated the SA message. If a path is not found in the MRIB, search the URIB. If a path is still not found, the RPF check fails.
Determine the address of the BGP Neighbor for this path. (This is the address of the BGP neighbor that sent us this path in a BGP Update message.)
Be careful not to assume the BGP Neighbor address is the same as the Next-Hop address in the path. Since i(m)BGP peers do not update the Next-Hop attribute of a path, it is usually the case that the Next-Hop address is not the same as the address of the BGP peer that sent us the path. The BGP Neighbor address is also not necessarily the same as the BGP Router-ID of the peer that sent us the path.
Rule 1 Test: If the IP address of the sending MSDP peer is the same as the BGP Neighbor address (i.e. the address of the BGP peer that sent us the path), then the RPF check succeeds; otherwise it fails.
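The Rule 1 procedure can be sketched as follows. This is a simplified model, not the IOS implementation: each RIB is assumed to be a list of (prefix, advertising neighbor) tuples holding only best paths, the best path is chosen by longest prefix match, and the names `best_path_neighbor` and `rule1_rpf_check` are hypothetical. The Next-Hop attribute is deliberately absent from the model because it plays no part in the test.

```python
import ipaddress


def best_path_neighbor(rib, rp_addr):
    """Return the neighbor that advertised the best path to rp_addr, or None."""
    rp = ipaddress.ip_address(rp_addr)
    matches = []
    for prefix, neighbor in rib:
        network = ipaddress.ip_network(prefix)
        if rp in network:
            matches.append((network.prefixlen, neighbor))
    if not matches:
        return None
    # Longest prefix match stands in for BGP best-path selection here.
    return max(matches, key=lambda m: m[0])[1]


def rule1_rpf_check(sending_peer, rp_addr, mrib, urib):
    """Rule 1: the sending MSDP peer must be the BGP neighbor that sent the path."""
    neighbor = best_path_neighbor(mrib, rp_addr)
    if neighbor is None:
        neighbor = best_path_neighbor(urib, rp_addr)  # fall back to URIB
    if neighbor is None:
        return False  # no path to the originating RP: RPF fails
    return sending_peer == neighbor


mrib = [("172.16.6.0/24", "172.16.3.1")]
```

With a path to 172.16.6.0/24 learned from neighbor 172.16.3.1, an SA originated by RP 172.16.6.1 passes only when it arrives from MSDP peer 172.16.3.1; an SA arriving from any other address of the topology, such as the path's next hop, fails.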


RPF Check Rule 1


Test Condition:
MSDP Peer address = BGP Neighbor address?

Implications:
The MSDP topology must mirror the (m)BGP topology
Specifically, the MSDP peer address must be the same as the i(m)BGP peer address! If this condition is not met, RPF Check Rule 1 will fail!!!

Pay attention to addresses used when configuring MSDP and i(m)BGP peers.

RPF Check Rule 1 Implications


The MSDP topology must mirror the (m)BGP topology. Generally speaking, this means that wherever you have an i(m)BGP peer connection between two routers, you should configure an MSDP peer connection. More specifically, the IP address of the far-end MSDP peer connection must be the same as that of the far-end i(m)BGP peer connection.

The reason for this is that the BGP topology between i(m)BGP peers inside an AS is not described by the AS path. If i(m)BGP peers always updated the Next-Hop address in the path when sending an update to another i(m)BGP peer, then we could rely on the Next-Hop address to describe the i(m)BGP topology (and hence the MSDP topology). However, this is not the case, since the default is for i(m)BGP peers not to update the Next-Hop address. Instead, we must use the address of the i(m)BGP peer that sent us the path to describe the i(m)BGP/MSDP topology inside the AS. (Fortunately, BGP keeps track of the address of the sending (m)BGP peer, so it is easy to use this information for this MSDP RPF check rule.)

Care must be taken when configuring the MSDP peer addresses to make sure that the same address is used as was used when configuring the i(m)BGP peer addresses.


Rule1: MSDP peer = i(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; routers D (172.16.4.1) and E (172.16.3.1) and RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

i(m)BGP Peer address = 172.16.3.1 (advertising best-path to RP)
MSDP Peer address = 172.16.3.1
MSDP Peer address = i(m)BGP Peer address
SA RPF Check Succeeds

show ip mbgp 172.16.6.1
BGP routing table entry for 172.16.6.0/24, version 8745118
Paths: (1 available, best #1)
  7 5, (received & used)
    172.16.5.1 (metric 68096) from 172.16.3.1 (172.16.3.1)

Rule 1 Example 1
In this example, router A receives an SA message originated by router G from router E, which is an i(m)BGP peer. Applying Rule 1, the following occurs:
  The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located.
  This best path was received from i(m)BGP peer 172.16.3.1.
  The sending MSDP peer address is also 172.16.3.1.
  Therefore RPF check Rule 1 succeeds.


Rule1: MSDP peer = i(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; routers D (172.16.4.1) and E (172.16.3.1) and RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

i(m)BGP Peer address = 172.16.3.1 (advertising best-path to RP)
MSDP Peer address = 172.16.4.1
MSDP Peer address != i(m)BGP Peer address
SA RPF Check Fails

show ip mbgp 172.16.6.1
BGP routing table entry for 172.16.6.0/24, version 8745118
Paths: (1 available, best #1)
  7 5, (received & used)
    172.16.5.1 (metric 68096) from 172.16.3.1 (172.16.3.1)

Rule 1 Example 2
In this example, router A receives the same SA message (originated by router G) from router D, which is an i(m)BGP peer. Applying Rule 1, the following occurs:
  The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located.
  This best path was received from i(m)BGP peer 172.16.3.1.
  The sending MSDP peer address is 172.16.4.1.
  Therefore RPF check Rule 1 fails.


Rule1: MSDP peer = i(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; routers D (172.16.4.1) and E (172.16.3.1 and 172.16.20.1) and RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

Common Mistake #1:
  Failure to use same addresses for MSDP peers as i(m)BGP peers!

i(m)BGP Peer address = 172.16.3.1 (advertising best-path to RP)
MSDP Peer address = 172.16.20.1
MSDP Peer address != i(m)BGP Peer address
SA RPF Check Fails

show ip mbgp 172.16.6.1
BGP routing table entry for 172.16.6.0/24, version 8745118
Paths: (1 available, best #1)
  7 5, (received & used)
    172.16.5.1 (metric 68096) from 172.16.3.1 (172.16.3.1)

Rule 1 Common Mistake 1


The most common mistake is for the MSDP and (m)BGP peering sessions to the same router to use different IP addresses. In this example:
  The MSDP peer address of router E is 172.16.20.1.
  The (m)BGP peer address of router E is 172.16.3.1.
Router A receives the SA message (originated by router G) from router E, which is an i(m)BGP peer. Applying Rule 1, the following occurs:
  The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located.
  This best path was received from router E, which is the i(m)BGP peer with an IP address of 172.16.3.1.
  However, the address of the sending MSDP peer (also router E) is 172.16.20.1.
  Therefore RPF check Rule 1 fails. (Even though the SA message arrived via the correct path.)


Rule1: MSDP peer = i(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; routers D (172.16.4.1) and E (172.16.3.1), Route Reflector RR (172.16.1.1) and RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

Common Mistake #2:
  Failure to follow i(m)BGP topology! Can happen when RRs are used.

i(m)BGP Peer address = 172.16.1.1 (advertising best-path to RP)
MSDP Peer address = 172.16.3.1
MSDP Peer address != i(m)BGP Peer address
SA RPF Check Fails

show ip mbgp 172.16.6.1
BGP routing table entry for 172.16.6.0/24, version 8745118
Paths: (1 available, best #1)
  7 5, (received & used)
    172.16.5.1 (metric 68096) from 172.16.1.1 (172.16.1.1)

Rule 1 Common Mistake 2


The other common mistake is the failure of the MSDP topology to follow the i(m)BGP peering topology. This can happen when Route Reflectors are used. In this example:
  The MSDP peer address of router E is 172.16.3.1.
  The i(m)BGP peer router is the Route Reflector RR, whose peer address is 172.16.1.1.
Router A receives an SA message (originated by router G) from router E, which is the MSDP peer. Applying Rule 1, the following occurs:
  The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located.
  This best path was received from the Route Reflector, which is the i(m)BGP peer with an IP address of 172.16.1.1.
  However, the sending MSDP peer is router E, whose address is 172.16.3.1.
  Therefore RPF check Rule 1 fails.


RPF Check Rule 2

When MSDP peer = e(m)BGP peer


Find (m)BGP Best Path to RP
Search MRIB first then URIB.
If no path to Originating RP found, RPF Fails

Rule 2 Test Condition:


First AS in path to the RP = AS of e(m)BGP peer?
If Yes, RPF Succeeds


RPF Check Rule 2


Applied when the sending MSDP peer is also an e(m)BGP peer. Search the BGP MRIB for the best-path to the RP that originated the SA message. If a path is not found in the MRIB, search the URIB. If a path is still not found, the RPF check fails. Rule 2 Test: If the first AS in the best-path to the RP is the same as the AS of the e(m)BGP peer (which is also the sending MSDP peer), then the RPF check succeeds; otherwise it fails.
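A minimal Python sketch of the Rule 2 test follows. It is an illustrative model only: best_as_path, lookup, rpf_rule2 and the table layout (prefix mapped to the AS path of the best path) are hypothetical names, and the peer's AS number is assumed to be known from the e(m)BGP neighbor configuration. The table holds the best paths from Router A's BGP table in the Rule 2 examples.

```python
import ipaddress


def best_as_path(rib, address):
    """Return the AS path of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    hits = [(ipaddress.ip_network(prefix).prefixlen, as_path)
            for prefix, as_path in rib.items()
            if addr in ipaddress.ip_network(prefix)]
    return max(hits, key=lambda hit: hit[0])[1] if hits else None


def lookup(mrib, urib, address):
    """Search the MRIB first, then the URIB."""
    as_path = best_as_path(mrib, address)
    return as_path if as_path is not None else best_as_path(urib, address)


def rpf_rule2(mrib, urib, originating_rp, peer_as):
    """Rule 2: first AS in the best path to the RP must equal the AS
    of the e(m)BGP peer that sent the SA message."""
    as_path = lookup(mrib, urib, originating_rp)
    if not as_path:            # no path found (or empty AS path): RPF fails
        return False
    return as_path[0] == peer_as


# Best paths (marked *>) from Router A's BGP table in the examples.
MRIB_A = {
    "172.16.3.0/24": (3,),
    "172.16.4.0/24": (1,),
    "172.16.5.0/24": (3, 7),
    "172.16.6.0/24": (3, 7, 5),
}

print(rpf_rule2(MRIB_A, {}, "172.16.6.1", 3))  # SA from the peer in AS3
print(rpf_rule2(MRIB_A, {}, "172.16.6.1", 1))  # SA from the peer in AS1
```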


RPF Check Rule 2


Test Condition:
First AS in path to the RP = AS of e(m)BGP peer?

Implication:
The MSDP topology must mirror the (m)BGP topology
  Should MSDP peer with the e(m)BGP peer.
  Normal case is to configure MSDP peering wherever e(m)BGP peering is configured.
  Exception: When Rule 3 is used.


RPF Check Rule 2 Implications


The MSDP topology must mirror the (m)BGP topology. Generally speaking, this means that wherever you have an e(m)BGP peer connection between two routers, you should configure an MSDP peer connection. In this case, the IP address of the far-end MSDP peer connection does not have to be the same as that of the far-end e(m)BGP peer connection.

The reason that the addresses do not have to be identical is that the BGP topology between two e(m)BGP peers is described by the AS path, so Rule 2 can compare AS numbers rather than peer addresses.

This contrasts with the i(m)BGP case covered by Rule 1. Inside an AS the AS path does not describe the topology, and because the default is for i(m)BGP peers not to update the Next-Hop address, the Next-Hop address cannot be relied on either. Rule 1 must therefore use the address of the i(m)BGP peer that sent us the path, which is why the same address must be used when configuring the MSDP and i(m)BGP peerings inside the AS. No such requirement applies to e(m)BGP peers.


Rule2: MSDP peer = e(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; RP D (172.16.4.1) in AS1; RP E (172.16.3.1) in AS3; RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

First-AS in best-path to RP = 3
AS of MSDP Peer = 3
First-AS in best-path to RP = AS of e(m)BGP Peer
SA RPF Check Succeeds

Router A's BGP Table
   Network          Next Hop     Path
*> 172.16.3.0/24    172.16.3.1   3 i
   172.16.3.0/24    172.16.4.1   1 3 i
*> 172.16.4.0/24    172.16.4.1   1 i
   172.16.4.0/24    172.16.3.1   3 1 i
*> 172.16.5.0/24    172.16.3.1   3 7 i
   172.16.5.0/24    172.16.4.1   1 3 7 i
*> 172.16.6.0/24    172.16.3.1   3 7 5 i
   172.16.6.0/24    172.16.4.1   1 3 7 5 i

Rule 2 Example 1
In this example, router A receives an SA message originated by router G via router E which is an e(m)BGP peer. Applying Rule 2, the following occurs: The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located. The first-hop AS in the best path to the originating RP is AS3. The origin AS of the sending MSDP peer (172.16.3.1) is also AS3. (This is determined by locating the best-path to the MSDP peer and then finding the last AS in the AS-Path list.) Therefore the RPF check Rule 2 succeeds.


Rule2: MSDP peer = e(m)BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; RP D (172.16.4.1) in AS1; RP E (172.16.3.1) in AS3; RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

First-AS in best-path to RP = 3
AS of e(m)BGP Peer = 1
First-AS in best-path to RP != AS of e(m)BGP Peer
SA RPF Check Fails!

Router A's BGP Table
   Network          Next Hop     Path
*> 172.16.3.0/24    172.16.3.1   3 i
   172.16.3.0/24    172.16.4.1   1 3 i
*> 172.16.4.0/24    172.16.4.1   1 i
   172.16.4.0/24    172.16.3.1   3 1 i
*> 172.16.5.0/24    172.16.3.1   3 7 i
   172.16.5.0/24    172.16.4.1   1 3 7 i
*> 172.16.6.0/24    172.16.3.1   3 7 5 i
   172.16.6.0/24    172.16.4.1   1 3 7 5 i

Rule 2 Example 2
In this example, router A receives the same SA message (originated by router G) via router D which is also an e(m)BGP peer. Applying Rule 2, the following occurs: The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located. The first-hop AS in the best path to the originating RP is AS3. The origin AS of the sending MSDP peer (172.16.4.1) is not AS3, it is AS1. (This is determined by locating the best-path to the MSDP peer and then finding the last AS in the AS-Path list.) Therefore the RPF check Rule 2 fails.


RPF Check Rule 3


When MSDP peer != (m)BGP peer
Find (m)BGP Best Path to RP
Search MRIB first then URIB.
If no path to Originating RP found, RPF Fails

Find (m)BGP Best Path to MSDP peer


Search MRIB first then URIB.
If no path to sending MSDP Peer found, RPF Fails

Note AS of sending MSDP Peer


Origin AS (last AS) in AS-PATH to MSDP Peer

Rule 3 Test Condition:


First AS in path to RP = Sending MSDP Peer AS ?
If Yes, RPF Succeeds

RPF Check Rule 3


Applied when the sending MSDP peer is not an (m)BGP peer at all. Search the BGP MRIB for the best-path to the RP that originated the SA message. If a path is not found in the MRIB, search the URIB. If a path is still not found, the RPF check fails. Search the BGP MRIB for the best-path to the MSDP peer that sent us the SA message. If a path is not found in the MRIB, search the URIB. If a path is still not found, the RPF check fails. Note the AS of MSDP peer that sent us the SA. (This is the origin AS which is the last AS in the AS-PATH to the MSDP peer.) Rule 3 Test: If the first AS in the best-path to the RP is the same as the AS of the sending MSDP peer, then the RPF check succeeds; otherwise it fails.
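The Rule 3 procedure differs from Rule 2 in that the AS of the sending MSDP peer must itself be looked up in BGP. The Python sketch below models that extra step. All names (best_as_path, lookup, rpf_rule3) and the prefix-to-AS-path table layout are hypothetical; the table holds the best paths from Router A's BGP table in the Rule 3 examples.

```python
import ipaddress


def best_as_path(rib, address):
    """Return the AS path of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    hits = [(ipaddress.ip_network(prefix).prefixlen, as_path)
            for prefix, as_path in rib.items()
            if addr in ipaddress.ip_network(prefix)]
    return max(hits, key=lambda hit: hit[0])[1] if hits else None


def lookup(mrib, urib, address):
    """Search the MRIB first, then the URIB."""
    as_path = best_as_path(mrib, address)
    return as_path if as_path is not None else best_as_path(urib, address)


def rpf_rule3(mrib, urib, originating_rp, msdp_peer):
    """Rule 3: first AS toward the RP must equal the origin AS
    (last AS in the AS path) of the sending MSDP peer."""
    rp_path = lookup(mrib, urib, originating_rp)
    peer_path = lookup(mrib, urib, msdp_peer)
    if not rp_path or not peer_path:   # either lookup failed: RPF fails
        return False
    peer_origin_as = peer_path[-1]     # origin AS = last AS in the path
    return rp_path[0] == peer_origin_as


# Best paths (marked *>) from Router A's BGP table in the examples.
MRIB_A = {
    "172.16.3.0/24": (3,),
    "172.16.4.0/24": (1,),
    "172.16.5.0/24": (3, 7),
    "172.16.6.0/24": (3, 7, 5),
}

print(rpf_rule3(MRIB_A, {}, "172.16.6.1", "172.16.3.1"))  # SA via router E
print(rpf_rule3(MRIB_A, {}, "172.16.6.1", "172.16.4.1"))  # SA via router D
```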


Rule3: MSDP peer != BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; RP D (172.16.4.1) in AS1; RP E (172.16.3.1) in AS3; RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

First-AS in best-path to RP = 3
AS of MSDP Peer = 3
First-AS in best-path to RP = AS of MSDP Peer
SA RPF Check Succeeds

Router A's BGP Table
   Network          Next Hop     Path
*> 172.16.3.0/24    172.16.3.1   3 i
   172.16.3.0/24    172.16.4.1   1 3 i
*> 172.16.4.0/24    172.16.4.1   1 i
   172.16.4.0/24    172.16.3.1   3 1 i
*> 172.16.5.0/24    172.16.3.1   3 7 i
   172.16.5.0/24    172.16.4.1   1 3 7 i
*> 172.16.6.0/24    172.16.3.1   3 7 5 i
   172.16.6.0/24    172.16.4.1   1 3 7 5 i

Rule 3 Example 1
In this example, router A receives an SA message originated by router G via router E which is neither an i(m)BGP peer nor an e(m)BGP peer. Applying Rule 3, the following occurs: The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located. The first-hop AS in the best path to the originating RP is AS3. The origin AS of the sending MSDP peer (172.16.3.1) is also AS3. (This is determined by locating the best-path to the MSDP peer and then finding the last AS in the AS-Path list.) Therefore the RPF check Rule 3 succeeds.


Rule3: MSDP peer != BGP peer


[Figure: RP G (172.16.6.1, source) in AS5; RP F (172.16.5.1) in AS7; RP D (172.16.4.1) in AS1; RP E (172.16.3.1) in AS3; RP A in AS100. Legend: BGP Peer, MSDP Peer, SA Message.]

First-AS in best-path to RP = 3
AS of MSDP Peer = 1
First-AS in best-path to RP != AS of MSDP Peer
SA RPF Check Fails

Router A's BGP Table
   Network          Next Hop     Path
*> 172.16.3.0/24    172.16.3.1   3 i
   172.16.3.0/24    172.16.4.1   1 3 i
*> 172.16.4.0/24    172.16.4.1   1 i
   172.16.4.0/24    172.16.3.1   3 1 i
*> 172.16.5.0/24    172.16.3.1   3 7 i
   172.16.5.0/24    172.16.4.1   1 3 7 i
*> 172.16.6.0/24    172.16.3.1   3 7 5 i
   172.16.6.0/24    172.16.4.1   1 3 7 5 i

Rule 3 Example 2
In this example, router A receives the same SA message (originated by router G) via router D which is neither an i(m)BGP peer nor an e(m)BGP peer. Applying Rule 3, the following occurs: The best path in the BGP M-RIB for 172.16.6.1 (the originating RP) is located. The first-hop AS in the best path to the originating RP is AS3. The origin AS of the sending MSDP peer (172.16.4.1) is AS1. (This is determined by locating the best-path to the MSDP peer and then finding the last AS in the AS-Path list.) Therefore the RPF check Rule 3 fails.


Debugging SA RPF Checking


MSDP debug commands
debug ip msdp [<peer-address>] [detail] [routes]

sj-mbone# debug ip msdp
. . .
MSDP: 193.78.83.1: Received 53-byte message from peer
MSDP: 193.78.83.1: SA TLV, len: 53, ec: 1, RP: 10.0.30.1, with data
MSDP: 193.78.83.1: Peer RPF check passed for 10.0.30.1, used EMBGP peer
MSDP: (10.0.10.1/32, 224.5.5.5/32), accepted
MSDP: 193.78.84.1: Forward 53-byte SA to peer
MSDP: 20.0.20.2: Forward 53-byte SA to peer


Debugging SA RPF Checking


Use the following IOS command to debug the exchange of MSDP messages and to see the results of the RPF check as each SA message arrives:
  debug ip msdp [<peer-address>] [detail] [routes]
The MSDP debug lines in the above example indicate that:
  A 53-byte MSDP message was received from MSDP peer 193.78.83.1.
  This MSDP message was an SA message that was originated by the RP whose address is 10.0.30.1. It also contains an encapsulated multicast data packet.
  The RPF check succeeded on this message, and the rule that was applied was the MSDP Peer = External (M)BGP Neighbor rule.
  The SA message contained a single (S, G) source advertisement for (10.0.10.1, 224.5.5.5).
  The SA message was forwarded to MSDP peer 193.78.84.1.
  The SA message was forwarded to MSDP peer 20.0.20.2.


Debugging SA Origination

MSDP debug commands


debug ip msdp [<peer-address>] [detail] [routes]

sj-mbone# debug ip msdp
. . .
MSDP: 193.78.83.2: Send 33-byte SA encapsulated data for (10.0.10.1, 224.5.5.5)
MSDP: 193.78.83.2: Send 33-byte SA encapsulated data for (10.0.10.2, 224.5.5.5)
MSDP: 193.78.83.2: Send 33-byte SA encapsulated data for (10.0.10.3, 224.5.5.5)
MSDP: 193.78.83.2: Send 20-byte message to peer


Debugging SA Origination on RPs


The debug ip msdp command can also be used to debug the origination of SA messages by RPs. The MSDP debug lines in the above example indicate that the router (which is an RP) originated SA messages for the following active sources in its local PIM-SM domain:
  The router originated an SA message for local source (10.0.10.1, 224.5.5.5) to MSDP neighbor 193.78.83.2.
  The router originated an SA message for local source (10.0.10.2, 224.5.5.5) to MSDP neighbor 193.78.83.2.
  The router originated an SA message for local source (10.0.10.3, 224.5.5.5) to MSDP neighbor 193.78.83.2.
  The router sent a 20-byte MSDP message to MSDP peer 193.78.83.2.


Processing SA Messages
Check mroute table for joined members.
  i.e. (*,G) entry with an OIL that is != NULL
If so, create (S,G) state.
  If it does not already exist.
Send join toward source.
Flood SA to all other MSDP peers except
  The RPF peer.
  Any MSDP Peers that are in the same MSDP Mesh-Group. (More on that later.)

Note: SA messages are saved if SA-Caching has been enabled. (On by default after 12.1(?))

Processing SA Messages
The following steps are taken by a router whenever it processes an SA message:
  Using group address G of the (S, G) pair in the SA message, locate the associated (*, G) entry in the mroute table.
  If the (*, G) entry is found AND its outgoing interface list is not Null, then there are active receivers in the PIM-SM domain for the source advertised in the SA message.
    Create an (S, G) entry for the advertised source.
    If the (S, G) entry did not already exist, immediately trigger an (S, G) Join toward the source in order to join the source tree.
  Flood the SA message to all other MSDP peers with the exception of:
    The MSDP peer from which the SA message was received.
    Any MSDP peers that are in the same MSDP Mesh Group as this router. (More on MSDP Mesh Groups later.)
Note: SA messages are not stored locally by the router unless SA-Caching has been enabled on the router. (In most cases, Network Administrators enable SA-Caching in order to improve network debugging capabilities.)
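The processing steps can be sketched as the following Python model. It is not IOS code; the Mroutes class, the process_sa function and the mesh_group_of mapping are hypothetical names. The mesh-group exception is modelled under one stated assumption: peers are skipped when they belong to the same mesh group as the peer that sent the SA.

```python
from dataclasses import dataclass, field


@dataclass
class Mroutes:
    # group -> set of outgoing interfaces of the (*, G) entry
    star_g_oil: dict = field(default_factory=dict)
    # set of (source, group) entries
    sg: set = field(default_factory=set)


def process_sa(mroutes, source, group, rpf_peer, peers, mesh_group_of=None):
    """Return (send_join, flood_to) for one (S, G) in an accepted SA."""
    mesh_group_of = mesh_group_of or {}
    send_join = False
    oil = mroutes.star_g_oil.get(group)
    if oil:                                    # (*, G) exists, OIL not Null
        if (source, group) not in mroutes.sg:
            mroutes.sg.add((source, group))    # create (S, G) state
            send_join = True                   # trigger (S, G) Join
    # Assumption: skip peers in the same mesh group as the sending peer.
    sender_group = mesh_group_of.get(rpf_peer)
    flood_to = [p for p in peers
                if p != rpf_peer
                and not (sender_group is not None
                         and mesh_group_of.get(p) == sender_group)]
    return send_join, flood_to


m = Mroutes(star_g_oil={"224.5.5.5": {"Ethernet0"}})
peers = ["193.78.83.1", "193.78.84.1", "20.0.20.2"]
print(process_sa(m, "10.0.10.1", "224.5.5.5", "193.78.83.1", peers))
```

A second SA for the same (S, G) is still flooded, but no new Join is triggered because the (S, G) entry already exists.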


Filtering SA Messages
SA Filter Command:
ip msdp sa-filter {in|out} <peer-address> [list <acl>] [route-map <map>]

Filters (S,G) pairs to / from peer based on specified ACL.
Can filter based on AS-Path by using optional route-map clause with a path-list acl.
You can filter flooded and originated SAs based on a specific peer, incoming and outgoing.

Caution: Filtering SA messages can break the Flood and Join mechanism!

SA Filtering
SA Filtering can be configured by the use of the following IOS command:
ip msdp sa-filter {in|out} <peer-address> [list <acl>] [route-map <map>]

The above command may be used to filter incoming or outgoing SA messages based on the (S, G) pairs specified in the list <acl> clause. It may also be used to filter incoming or outgoing SA messages based on AS-PATH, using the route map specified by the route-map <map> clause.

Caution: Arbitrary filtering of SA messages can result in downstream MSDP peers being starved of SA messages for legitimate active sources. Care should be used when applying these sorts of filters so that this does not occur. (Normally, these filters are only used to reject Bogons such as sources in network 10.0.0.0, etc.)


Originating SA Messages

Local Sources
A source is local if:
The router received a Register for (S, G), or
The source is directly connected to RP

SAs are only originated for local sources


Denoted by the A flag on an (S,G) entry

Other conditions may suppress SA messages from being originated for local sources.
More on that later.


Originating SA Messages for Local Sources


A local source is defined as a source for which the RP:
  Has received a Register message for the source, or
  Is directly connected to the source.
An RP originates SA messages only for local sources in its PIM-SM domain. A local source is denoted by the A flag being set in the (S, G) mroute entry on the RP. This indicates that the source is a candidate for advertisement by the RP to other MSDP peers.

NOTE: In some IOS versions, the key in the show ip mroute command states that the A flag indicates that the (S, G) IS being announced via MSDP. This is in fact not correct, as other factors such as filters may block this (S, G) from actually being advertised.


Originating SA Messages

SA messages are triggered when any new source in the local domain goes active.
Initial multicast packet is encapsulated in an SA message.
This is an attempt at solving the bursty-source problem


Originating SA Messages
SA messages are triggered by an RP (assuming MSDP is configured) when any new source goes active within the local PIM-SM domain. When a source in the local PIM-SM domain initially goes active, it causes the creation of (S, G) state in the RP. New sources are detected by the RP by: The receipt of a Register message or The arrival of the first (S, G) packet from a directly connected source. The initial multicast packet sent by the source (either encapsulated in the Register message or received from a directly connected source) is encapsulated in the initial SA message in an attempt to solve the problem of bursty sources.


Originating SA Messages
Encapsulating Initial Multicast Packets
Can bypass TTL-Thresholds
Original TTL is inside of data portion of SA message
SA messages sent via Unicast with TTL = 255

Requires special command to control


ip msdp ttl-threshold <peer-address> <ttl>

Encapsulated multicast packets with a TTL lower than <ttl> for the specific MSDP peer are not forwarded or originated.

Originating SA Messages
A TTL-Threshold problem can be introduced by the encapsulation of a source's initial multicast packet in an SA message. Because the multicast packet is encapsulated inside the unicast SA message (whose TTL = 255), its TTL is not decremented as the SA message travels to the MSDP peer. Furthermore, the total number of hops that the SA message traverses can be drastically different from that of a normal multicast packet. This is because multicast and unicast traffic may follow completely different paths to the MSDP peer and hence to the remote PIM-SM domain. This can result in TTL-Thresholds being violated by the encapsulated packet.

The solution to this problem is to configure a TTL Threshold that is associated with any multicast packet that is encapsulated in an SA message sent to a particular MSDP peer. This can be accomplished by configuring the following IOS command:
  ip msdp ttl-threshold <peer-address> <ttl>
The above command prevents any multicast packet whose TTL is below <ttl> from being encapsulated in an SA message sent to the MSDP peer whose IP address is <peer-address>.
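The effect of the per-peer threshold can be illustrated with a small Python sketch. The function name and the dictionary of configured thresholds are hypothetical, and the sketch assumes that a peer with no configured threshold behaves as if the threshold were 0 (every packet qualifies).

```python
def may_encapsulate(packet_ttl, peer, ttl_thresholds):
    """Decide whether a multicast packet may be carried inside an SA
    message sent to the given MSDP peer."""
    # Assumption: an unconfigured peer behaves like a threshold of 0.
    threshold = ttl_thresholds.get(peer, 0)
    # Packets with a TTL lower than the threshold are not encapsulated.
    return packet_ttl >= threshold


# Hypothetical configuration: ip msdp ttl-threshold 193.78.83.2 32
thresholds = {"193.78.83.2": 32}

print(may_encapsulate(15, "193.78.83.2", thresholds))  # TTL below threshold
print(may_encapsulate(64, "193.78.83.2", thresholds))  # TTL above threshold
```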


Originating SA Messages
Use msdp redistribute to control what SAs are originated.
Think of this as msdp sa-originate-filter function
ip msdp redistribute [list <acl>] [asn <aspath-acl>] [route-map <map>]

Filter by (S,G) pair using list <acl>
Filter by AS-PATH using asn <aspath-acl>
Filter based on route-map <map>

Omitting all acls stops all SA origination


Example: ip msdp redistribute

Default: Originate SAs for all local sources


If msdp redistribute command is not configured

Originating SA Messages
By default, an RP configured to run MSDP will originate SA messages for any and all local sources for which it is the RP. In some cases this may not be desirable. (Example: If sources inside the PIM-SM domain are using private addresses such as network 10.0.0.0/8, it is generally not a good idea to advertise these to other MSDP peers in the Internet.)

Control over which local sources should be advertised in SA messages can be accomplished using the following IOS command on the RP:
  ip msdp redistribute [list <acl>] [asn <aspath-acl>] [route-map <map>]
This command permits filtering of the SA messages that are originated by the RP based on:
  (S, G) pair, using the list <acl> clause.
  AS-PATH, using the asn <aspath-acl> clause.
  Other criteria, using the route-map <map> clause.
Configuring this command without any of the acl or route-map clauses causes all SA origination by this RP to be stopped. (Note: The router will still forward SA messages from other MSDP peers in the normal fashion. It will just not originate any of its own.)

Author's Note: The choice of syntax for this command is a bit confusing and could have been better chosen. The term redistribute typically implies some other operation in IOS, such as route redistribution. I prefer to mentally translate the syntax of this command into ip msdp sa-originate-filter because it is more descriptive of what the command actually does.


Originating SA Messages

Once a minute
Router scans mroute table
  If group = sparse AND router = RP for group
    For each (S,G) entry for the group:
      If the msdp redistribute filter permits
      AND if the source is a local source
        Then originate an SA message for (S,G)


Originating SA Messages
The RP continues to periodically (every 60 seconds) originate SA messages for all active sources in the local PIM-SM domain for which it is functioning as the RP. The details of this mechanism, which is performed every minute, are as follows:

For all Sparse mode (*, G) entries in the mroute table for which the router is functioning as the RP, originate an SA message for each subordinate (S, G) entry that meets the following conditions:
  The entry must be permitted by any msdp redistribute filters, AND
  The source is a local source. (Denoted by the A flag.)
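One pass of the once-a-minute scan can be sketched in Python as follows. The SGEntry record, the sas_to_originate function and the redistribute_permits callback are hypothetical names; the callback merely stands in for whatever msdp redistribute filters are configured, and the flags string mimics the flags shown by show ip mroute.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SGEntry:
    source: str
    group: str
    flags: str   # e.g. "TA", as displayed by show ip mroute


def sas_to_originate(entries, sparse_groups_where_rp, redistribute_permits=None):
    """Return the (S, G) pairs for which one scan originates an SA."""
    if redistribute_permits is None:
        # Default: no msdp redistribute filter configured, permit all.
        redistribute_permits = lambda source, group: True
    return [(e.source, e.group) for e in entries
            if e.group in sparse_groups_where_rp      # sparse group, we are RP
            and "A" in e.flags                        # local source
            and redistribute_permits(e.source, e.group)]


def deny_net_10(source, group):
    """Hypothetical filter: do not advertise sources in 10.0.0.0/8."""
    return ipaddress.ip_address(source) not in ipaddress.ip_network("10.0.0.0/8")


entries = [
    SGEntry("171.69.185.51", "224.2.246.13", "TA"),   # local source
    SGEntry("128.63.58.45", "224.2.246.13", "M"),     # learned via MSDP
    SGEntry("10.0.10.1", "224.2.246.13", "TA"),       # local, private address
]
print(sas_to_originate(entries, {"224.2.246.13"}, deny_net_10))
```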


MSDP SA Statistics
Showing MSDP SA counters
show ip msdp count

sj-mbone# show ip msdp count
SA State per ASN Counters, <asn>: <# sources>/<# groups>
Total entries: 1359
  24: 4/4, 25: 3/3, 52: 60/16, 70: 1/1
  103: 2/2, 131: 8/6, 145: 130/61, 293: 15/13
  668: 218/56, 683: 7/4, 704: 151/89, 1239: 10/10
  1249: 25/10, 1275: 17/14, 1835: 41/28, 1879: 2/2
  2513: 3/2, 2603: 4/4, 2914: 2/2, 3582: 24/20
  3701: 6/5, 5640: 2/1, 5779: 242/169, 6194: 2/2
  6461: 7/5, 7660: 91/29, 9270: 209/56, 10490: 16/12
  10680: 3/3, 10888: 47/41, 11423: 7/1


MSDP Statistics
Information on the number of sources and groups being advertised on a per-AS basis can be obtained by the use of the following IOS command:
  show ip msdp count
The example show ip msdp count output shown above indicates that:
  There are a total of 1359 sources being advertised via MSDP.
  AS 24 is advertising 4 sources and 4 groups.
  AS 25 is advertising 3 sources and 3 groups.
  AS 52 is advertising 60 sources and 16 groups.
  etc.
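When this output is collected by a script, the per-AS counters are easy to pull apart. The following Python sketch parses text in the format shown in the example; the parse_msdp_count function is a hypothetical helper, and the sample is an abbreviated copy of the example output rather than the full listing.

```python
import re

SAMPLE = """\
SA State per ASN Counters, <asn>: <# sources>/<# groups>
Total entries: 1359
24: 4/4, 25: 3/3, 52: 60/16, 70: 1/1
103: 2/2, 131: 8/6, 145: 130/61, 293: 15/13
"""


def parse_msdp_count(text):
    """Return (total_entries, {asn: (sources, groups)})."""
    total = None
    match = re.search(r"Total entries:\s*(\d+)", text)
    if match:
        total = int(match.group(1))
    per_as = {}
    # Each counter has the form "<asn>: <sources>/<groups>".
    for asn, sources, groups in re.findall(r"(\d+):\s*(\d+)/(\d+)", text):
        per_as[int(asn)] = (int(sources), int(groups))
    return total, per_as


total, per_as = parse_msdp_count(SAMPLE)
print(total, per_as[52])
```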


MSDP Mroute Flags


New mroute Flags for MSDP
sj-mbone#show ip mroute summary
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
       M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.2.246.13), 5d17h/00:02:59, RP 171.69.10.13, flags: S
  (171.69.185.51, 224.2.246.13), 3d17h/00:03:29, flags: TA
  (128.63.58.45, 224.2.246.13), 00:02:16/00:00:43, flags: M
  (128.63.58.54, 224.2.246.13), 00:01:16/00:01:43, flags: M

M flag indicates source was learned via MSDP
A flag indicates source is a candidate for advertisement by MSDP

MSDP Mroute Flags


MSDP has added two new flags to the complement of flags that may appear on (S, G) entries in the Mroute Table on the RP. These new flags are as follows:
  M - Indicates that this source was learned via an MSDP SA message.
  A - Indicates that this source is a candidate for advertisement in an SA message.
Note that the flag key is not 100% correct, as it is possible for the A flag to be set without this source being advertised via MSDP by this router. This could be the case if an msdp redistribute filter were in use which denied a particular source from being advertised.
Author's Note: I prefer to think of the A flag as an indicator that the source is a local source in the PIM-SM domain and thus is a candidate for the RP to advertise via MSDP. Unfortunately, there is currently no flag that positively indicates that a source actually is being advertised by MSDP. About the only way to determine that is to turn on debug ip msdp.
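The distinction between "candidate for advertisement" (A flag) and "actually advertised" can be illustrated with a small Python sketch. It is only a model of the rule described in these notes, not router code: the (source, group, flags) tuples are copied from the example mroute output, and the permit callback is a hypothetical stand-in for an msdp redistribute filter.

```python
# (source, group, flags) tuples taken from the example mroute output.
ENTRIES = [
    ("171.69.185.51", "224.2.246.13", "TA"),
    ("128.63.58.45", "224.2.246.13", "M"),
    ("128.63.58.54", "224.2.246.13", "M"),
]


def learned_via_msdp(entries):
    """(S, G) pairs carrying the M flag, i.e. learned from an SA message."""
    return [(s, g) for s, g, flags in entries if "M" in flags]


def advertised(entries, permit=lambda source, group: True):
    """A-flagged (S, G) pairs that the (hypothetical) redistribute filter permits."""
    return [(s, g) for s, g, flags in entries if "A" in flags and permit(s, g)]
```

With the default permit-all filter, only 171.69.185.51 is advertised; passing a filter that denies that source returns an empty list even though the A flag is still set on the entry.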


Agenda
Inter-domain Multicast
Past & Future

MSDP Overview
MSDP Peers
MSDP Messages
MSDP Mesh Groups
MSDP SA Caching
MSDP Applications

MSDP Mesh-Groups
Optimises SA flooding
Useful when 2 or more peers are in a group

Reduces amount of SA traffic in the net


SAs are not flooded to other mesh-group peers

No RPF checks on arriving SA messages


  When received from a mesh-group peer
  SAs always accepted from mesh-group peers

MSDP Mesh-Groups
An MSDP Mesh-Group can be configured on a group of MSDP peers that are fully meshed. (In other words, each of the MSDP peers in the group has an MSDP connection to every other MSDP peer in the group.) When an MSDP Mesh-Group is configured between a group of MSDP peers, SA flooding is reduced. This is because when an MSDP peer in the group receives an SA message from another MSDP peer in the group, it can assume that this SA message was sent to all the other MSDP peers in the group. As a result, it is neither necessary nor desirable for the receiving MSDP peer to flood the SA message to the other MSDP peers in the group. MSDP Mesh-Groups may also be used to eliminate the need to run (M)BGP to do RPF checks on arriving SA messages. This is because SA messages are never flooded to other MSDP peers in the mesh-group. As a result, it is not necessary to perform the RPF check on arriving SA messages.


MSDP Mesh-Groups

Configured with:
ip msdp mesh-group <name> <peer-address>

Peers in the mesh-group should be fully meshed. Multiple mesh-groups per router are supported.


MSDP Mesh-Groups
An MSDP Mesh-Group may be configured by using the following IOS configuration command:
  ip msdp mesh-group <name> <peer-address>
This command configures the router as a member of the mesh-group <name>, of which MSDP peer <peer-address> is also a member. All MSDP peers in the mesh-group must be fully meshed with all other MSDP peers in the group. This means that each router must be configured with ip msdp peer and ip msdp mesh-group commands for every other member of the mesh-group. Routers may be members of multiple Mesh-Groups.


MSDP Mesh-Group Example


R1:
  ip msdp peer R2
  ip msdp peer R3
  ip msdp mesh-group My-Group R2
  ip msdp mesh-group My-Group R3

R2:
  ip msdp peer R1
  ip msdp peer R3
  ip msdp peer R4
  ip msdp mesh-group My-Group R1
  ip msdp mesh-group My-Group R3

R3:
  ip msdp peer R1
  ip msdp peer R2
  ip msdp peer R5
  ip msdp mesh-group My-Group R1
  ip msdp mesh-group My-Group R2

[Figure: routers R1, R2 and R3 joined by MSDP mesh-group peering, with R4 and R5 outside the mesh-group; arrows show the path of an SA message. Caption: SA not forwarded to other members of the mesh-group.]


MSDP Mesh-Group Example.


In the above example, routers R1, R2 and R3 are all configured as members of the same MSDP mesh-group. In addition, router R2 is also MSDP peering with router R4 and router R3 is MSDP peering with router R5. Neither R4 nor R5 is a member of the MSDP mesh-group.
Assume router R4 originates an SA message for a source in its local PIM-SM domain. This message is sent to router R2 as shown in the drawing above. When router R2 receives this SA message, it must perform an RPF check on the message because it was received from an MSDP peer that is not a member of the mesh-group. In this case the RPF check is successful and router R2 floods the SA message (received from a non-mesh-group member) to all other members of the mesh-group.
When routers R1 and R3 receive the SA message from mesh-group member R2, they do not have to perform an RPF check on the arriving message, nor do they flood the SA message to each other, since they are both members of the mesh-group. (They know that the other members of the mesh-group will have received a copy directly from R2 and therefore they do not have to forward the SA message to each other. This is why a full mesh between mesh-group members is required.)
Finally, router R3 floods the SA message to all of its MSDP peers that are not members of the mesh-group. In this case, the SA message is flooded to router R5 to continue the flow of the SA message downstream away from the RP.
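The forwarding rule in this example can be traced with a short Python simulation. This is a sketch under stated assumptions rather than a model of the IOS implementation: the peer table mirrors the R1 to R5 example, the RPF check on SAs arriving from non-mesh-group peers is assumed to always succeed, and the names PEERS, MESH_GROUP, forward_targets and flood are hypothetical.

```python
from collections import deque

# MSDP peerings and mesh-group membership from the R1..R5 example.
PEERS = {
    "R1": {"R2", "R3"},
    "R2": {"R1", "R3", "R4"},
    "R3": {"R1", "R2", "R5"},
    "R4": {"R2"},
    "R5": {"R3"},
}
MESH_GROUP = {"R1", "R2", "R3"}


def forward_targets(router, received_from):
    """Peers to which router floods an SA that arrived from received_from."""
    others = PEERS[router] - {received_from}
    if router in MESH_GROUP and received_from in MESH_GROUP:
        # SA came from a mesh-group peer: never re-flood inside the mesh-group.
        return others - MESH_GROUP
    # SA came from outside the mesh-group (RPF check assumed to pass).
    return others


def flood(origin):
    """Flood one SA from origin; return the (sender, receiver) deliveries in order."""
    deliveries = []
    queue = deque((origin, peer) for peer in sorted(PEERS[origin]))
    while queue:
        sender, receiver = queue.popleft()
        deliveries.append((sender, receiver))
        for nxt in sorted(forward_targets(receiver, sender)):
            queue.append((receiver, nxt))
    return deliveries
```

Calling flood("R4") yields four deliveries: R4 to R2, R2 to R1, R2 to R3, and R3 to R5. R1 and R3 never send the SA to each other, which is the reduction in flooding that the mesh-group provides.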


Avoid Mesh-Group Loops!!!


WARNING: There is no RPF check between Mesh-groups!!!

[Figure: RPs in AS1, AS2, AS3 and another ISP. AS1 uses mesh-groups as1 and as1-as3, AS 2 uses mesh-groups as1-as2 and as2-as3, and AS3 uses mesh-group as3. The chained mesh-groups form a ring, labeled "SA Loop!!!". Legend: MSDP Peering, i(m)BGP Peering, e(m)BGP Peering.]

DON'T DO THIS!!!!

Avoid Mesh-Group Loops!!!


There is no RPF checking between Mesh-groups. Ensure that you aren't creating a continuous loop of Mesh-Groups.


Agenda
Inter-domain Multicast
Past & Future

MSDP Overview
MSDP Peers
MSDP Messages
MSDP Mesh Groups
MSDP SA Caching
MSDP Applications

MSDP SA Caching
With MSDP SA Caching
RPF check received SA
If RPF OK
  If RP for group, trigger any necessary (S,G) Joins
  Store in SA cache
  If new cache entry, immediately flood downstream
  If existing entry, reset entry's SA-expire-timer
    Timer is reset to 6 minutes by receipt of another SA.
    When timer = zero, entry has expired and is deleted.

Once per minute, scan SA cache
  Send SA downstream for remaining entries

With MSDP SA Caching


Arriving SA messages are RPF checked in the normal fashion. Note: Arriving SA messages must pass any incoming SA filters that have been configured.
If the RPF check succeeds:
  An (S,G) join is triggered toward this source if the router is the active RP for this group AND it has a non-null (*,G) OIL (which indicates that there are members for the group on the shared tree).
  The router then stores the SA message in the SA cache. If this is a new entry, the SA message is immediately flooded downstream to all MSDP neighbors.
  Next, the SA-expire-timer for the entry (new or existing) is reset to 6 minutes. Each time an (S,G) SA message is received, this timer in the cache entry is reset to 6 minutes. If the timer ever counts down to zero, the entry has expired and is removed from the SA cache.
Once per minute:
  The router scans the SA cache and sends an SA message downstream for each unexpired cache entry. Note: The SA messages must pass any outgoing SA filters that have been configured before they are sent.
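The cache behavior described above (store, refresh to 6 minutes, expire, periodic scan) can be sketched in a few lines of Python. This is an illustration only and leaves out RPF checks, filters and join triggering; the class name SACache and its method names are hypothetical, and time is passed in explicitly as seconds so the example stays deterministic.

```python
SA_EXPIRE_SECONDS = 6 * 60  # SA-expire-timer value given in the notes


class SACache:
    """Toy SA cache keyed by (source, group) with an absolute expiry time."""

    def __init__(self):
        self._expires = {}

    def receive(self, source, group, now):
        """Store or refresh an entry; return True if it is new (flood immediately)."""
        key = (source, group)
        is_new = key not in self._expires or self._expires[key] <= now
        self._expires[key] = now + SA_EXPIRE_SECONDS
        return is_new

    def scan(self, now):
        """Once-per-minute scan: drop expired entries, return those to re-advertise."""
        self._expires = {k: t for k, t in self._expires.items() if t > now}
        return sorted(self._expires)
```

An entry received at t=0 and refreshed at t=60 is still returned by a scan at t=120, and is gone by the scan at t=420 if no further SA arrives.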


MSDP SA Caching
SA Caching Pros
Reduces join latency
RP maintains list of all active sources.
Can immediately send (S,G) Joins as needed.
  When a receiver joins the group.
  No need to wait for next (S,G) SA message to arrive

Valuable debugging tool


Use show ip msdp sa-cache
Lists all active sources in the Internet

Helps prevent SA Storms


SAs are advertised periodically from cache
Paces SA message propagation

SA Caching Cons
Consumes more memory
Minor memory impact to date

SA Caching Pros
Reduces Join latency
  When SA caching is configured, the router will begin caching all (S,G) pairs received in SA messages. This reduces join latency as the RP maintains a list of all active sources. Therefore, when the first receiver joins the group, the RP doesn't have to wait up to 60 seconds for the next SA message before sending out the (S,G) Join.
Valuable debugging tool
  The contents of the SA cache are a valuable source of MSDP debugging information. The show ip msdp sa-cache command lists its contents and displays all active sources in the Internet, along with the AS in which each resides and the RP that advertised it.
Helps prevent SA Storms
  Since SAs are advertised periodically from the cache (instead of as soon as they are received from an upstream neighbor), the propagation of SA messages throughout the Internet is better paced. This helps to avoid overrunning TCP input queues on the MSDP peers, which results in session resets and MSDP instability.

SA Caching Cons
Memory consumption The memory impact of turning on SA Caching in most RPs is, in general, very small.


MSDP SA Caching
Without MSDP SA Caching
RPF check received SA
If RPF OK
  If RP for group, trigger any necessary (S,G) Joins
  Immediately flood SA downstream
    Propagates the effect of any SA storms downstream
    Often results in TCP queue overflows and session resets

Current MSDP IETF specification
  Requires SA Caching
  Benefits outweigh the memory consumption

Without MSDP SA Caching


Arriving SA messages are RPF checked in the normal fashion. Note: Arriving SA messages must pass any incoming SA filters that have been configured. If the RPF check succeeds: An (S,G) join is triggered toward this source if the router is the active RP for this group AND it has a non-null (*,G) OIL (which indicates that there are members for the group on the shared tree). The router then immediately forwards the SA message to all downstream MSDP peers. This behavior does nothing to prevent SA Storms since SAs are flooded to downstream peers as fast as they arrive.

Current MSDP IETF specification


The current draft of the MSDP specification requires an implementation to use SA caching and to pace the transmission of SA messages to downstream neighbors. This was added to the specification to help prevent SA Storms which had occurred in the early days of MSDP usage in the Internet.


MSDP SA Caching
Enabling SA Caching
ip msdp cache-sa-state [list <acl>]

Caching is on by default
Beginning with IOS versions 12.1(7), 12.0(14)S1
Cannot be turned off.

Router begins caching all SA messages


  Optional acl controls which (S, G) are cached
  Cached (S, G) entries timeout after 6 minutes
    If not refreshed by another (S,G) SA message


Enabling SA Caching
SA caching can be enabled by the use of the following IOS command : ip msdp cache-sa-state [list <acl>] When this command is configured, the router will begin caching all (S,G) pairs received in SA messages. This reduces join latency as the RP maintains a list of all active sources. (The memory impact of turning on SA Caching in most RPs is, in general, very small. The other benefit is that additional information becomes available for network debugging if SA Caching is enabled.) The optional list <acl> clause may be used to control which (S,G) pairs are cached. If this optional clause is not specified, all (S,G) pairs are cached. (S,G) pairs in the SA cache have a 6 minute timeout. If a new SA message is not received with this (S,G) pair in that period, the entry is removed from the cache.


MSDP SA Caching Server

SA Caching Server function


  On whenever SA Caching is enabled
  Router will respond to SA-Requests
    Received from non-caching routers
    Returns list of sources active for requested group


MSDP SA Caching Server


If SA Caching has been enabled on a router, it will respond to SA Request messages received from other routers. When an SA Request for group G is received by a router with SA Caching enabled, the router responds by sending back SA Response messages containing a list of all active sources for the requested group G.
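The server side of this exchange amounts to a lookup in the SA cache by group. A minimal Python sketch follows; it models the cache as a plain set of (source, group) pairs, the function name sa_response is hypothetical, and the sample entries are taken from the show ip msdp sa-cache example elsewhere in this module.

```python
# A few (source, group) pairs as they might sit in an SA cache.
SA_CACHE = {
    ("147.228.44.30", "233.0.0.1"),
    ("128.117.16.142", "233.0.0.1"),
    ("128.119.40.229", "224.2.0.1"),
}


def sa_response(sa_cache, group):
    """Sources currently cached as active for the requested group."""
    return sorted(source for source, g in sa_cache if g == group)
```

A request for 233.0.0.1 returns both cached sources for that group, while a request for a group with no cached sources returns an empty list.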


MSDP SA Caching Client


Enabling SA Caching Client
ip msdp sa-request <server-address>

Seldom used feature.
Router will send SA-Requests to server
  When it is the RP for a group AND
  The (*, G) OIL goes from NULL to non-NULL

Reduces join latency
  Router doesn't have to wait for periodic SA messages
  Learns all sources at once from Caching Server

MSDP SA Caching
Routers that do not have SA Caching enabled can benefit from other routers that do have SA Caching enabled by sending SA Request messages whenever receivers first join a group. In order to enable the sending of SA Requests, the following IOS command must be configured:
  ip msdp sa-request <server-address>
This command will cause the router to send an SA-Request message to the SA Caching router whose IP address is <server-address> when both of the following conditions are met:
  The outgoing interface list of a (*,G) entry goes from Null to non-Null, AND
  The router is the RP for group G.
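The two trigger conditions can be written as a single predicate. The Python sketch below only restates the rule from the notes; the function name and its parameters are hypothetical, and the outgoing interface lists are represented as plain Python lists of interface names.

```python
def should_send_sa_request(is_rp_for_group, oil_before, oil_after):
    """True when the (*,G) OIL goes from Null to non-Null on the RP for group G."""
    return is_rp_for_group and not oil_before and bool(oil_after)
```

For example, an RP whose (*,G) OIL changes from empty to ["Ethernet0"] would send an SA-Request, while a second receiver joining on another interface (OIL already non-Null) would not trigger one.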


MSDP SA Caching

SA-Request filtering
ip msdp filter-sa-request <ip-address> [list <acl>]

Filter SA-Requests from <ip-address>


  Based on optional group list acl
  Default is deny all groups if no acl specified.


SA Request Filtering
In some situations, a router that has SA Caching enabled may not wish to honor all received SA Request messages. If this is desired, an SA Request filter may be configured using the following IOS command : ip msdp filter-sa-request <ip-address> [list <acl>] This command will cause the router to filter all SA-Requests received from the router whose IP address is <ip-address> based on the optional list <acl> clause. If the optional list <acl> clause is not specified, the default behavior is to not respond to SA Requests from this router.


MSDP SA Caching
Listing the contents of the SA Cache
show ip msdp sa-cache [<group-or-source>] [<asn>]
sj-mbone# show ip msdp sa-cache
MSDP Source-Active Cache - 1997 entries
(193.92.8.77, 224.2.232.0), RP 194.177.210.41, MBGP/AS 5408, 00:01:51/00:04:09
(128.119.167.221, 224.77.0.0), RP 128.119.3.241, MBGP/AS 1249, 06:40:59/00:05:12
(147.228.44.30, 233.0.0.1), RP 195.178.64.113, MBGP/AS 2852, 00:04:48/00:01:11
(128.117.16.142, 233.0.0.1), RP 204.147.128.141, MBGP/AS 145, 00:00:41/00:05:18
(132.250.95.60, 224.253.0.1), RP 138.18.100.1, MBGP/AS 668, 01:15:07/00:05:55
(128.119.40.229, 224.2.0.1), RP 128.119.3.241, MBGP/AS 1249, 06:40:59/00:05:12
(130.225.245.71, 227.37.32.1), RP 130.225.245.71, MBGP/AS 1835, 1d00h/00:05:29
(194.177.210.41, 227.37.32.1), RP 194.177.210.41, MBGP/AS 5408, 00:02:53/00:03:07
(206.190.42.106, 236.195.60.2), RP 206.190.40.61, MBGP/AS 5779, 00:07:27/00:04:04
. . .

Clearing the contents of the SA Cache


clear ip msdp sa-cache [<group-address> | <group-name>]

MSDP SA Caching
The contents of the SA Cache can be very helpful when debugging MSDP problems in the network. (This is why most network administrators enable SA Caching on all MSDP routers.) The following IOS command can be used to display the contents of the SA Cache:
  show ip msdp sa-cache [<group-or-source>] [<asn>]
The optional <group-or-source> and <asn> qualifiers may be used to limit the displayed output to only the desired entries. In the above example of this command we see that there are 1997 entries in the SA Cache. The information on the first entry is as follows:
  (193.92.8.77, 224.2.232.0)  = Active source/group information
  RP 194.177.210.41           = IP address of the originating RP
  MBGP/AS 5408                = The RP resides in AS 5408
  00:01:51/00:04:09           = The source has been active for 1 min, 51 sec and will expire in 4 min, 9 sec

The contents of the SA Cache can be cleared by the use of the following IOS command: clear ip msdp sa-cache [<group-address> | <group-name>] The optional <group-address> and <group-name> qualifiers may be specified to limit which entries are to be cleared from the cache.
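When the SA cache holds thousands of entries, it can help to capture the show ip msdp sa-cache output and pick the fields apart offline. The following Python sketch parses one entry line in the format shown in the example; it is not a Cisco tool, the names SA_ENTRY and parse_sa_entry are hypothetical, and the pattern assumes IPv4 addresses and the exact field order of the example output.

```python
import re

# "(S, G), RP <rp>, MBGP/AS <asn>, <uptime>/<expires>"
SA_ENTRY = re.compile(
    r"\((?P<source>[\d.]+),\s*(?P<group>[\d.]+)\),\s*"
    r"RP\s+(?P<rp>[\d.]+),\s*"
    r"MBGP/AS\s+(?P<asn>\d+),\s*"
    r"(?P<uptime>[^/\s]+)/(?P<expires>\S+)"
)


def parse_sa_entry(line):
    """Return a dict of fields for one SA cache line, or None if it does not match."""
    match = SA_ENTRY.search(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["asn"] = int(entry["asn"])
    return entry
```

Applied to the first entry of the example, this returns source 193.92.8.77, group 224.2.232.0, RP 194.177.210.41, AS 5408, uptime 00:01:51 and expiry 00:04:09. Header lines such as the entry count do not match and return None.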


Pseudo MSDP peer


[Figure: Pseudo MSDP peer. Labels: Other ISP (two), Provider, RR, RP, Customer (three). Legend: MSDP Peering, i(m)BGP Peering.]

Pseudo MSDP Peer


MSDP peers do not have to be an RP. They will still forward any SAs (that pass the RPF check) to other MSDP peers. i(m)BGP must be used in parallel between the Pseudo (non-RP) MSDP peer and all other MSDP peers. This is necessary so that SA messages can be RPF checked. The Pseudo MSDP peer concept is often used on a (m)BGP Route Reflector.


Pseudo MSDP peer

The MSDP peers do not have to be an RP to forward SAs.
Use i(m)BGP for the RPF check on the Pseudo (non-RP) peer.
The Pseudo peer is often an i(m)BGP Route Reflector.



Agenda
Inter-domain Multicast
Past & Future

MSDP Overview
MSDP Peers
MSDP Messages
MSDP Mesh Groups
MSDP SA Caching
MSDP Applications

Anycast RP
draft-ietf-mboned-anycast-rp-nn.txt

Within a PIM-SM domain, deploy more than one RP for the same group range.
  Each RP configured with the same IP address.

DRs use closest RP
  Sources and receivers are registered/joined to the closest RP.

RPs use MSDP to inform each other about active sources in their part of the domain.
  Other RPs join SPT to these sources as needed.

Anycast RP
MSDP may be used to implement the concept of Anycast RPs within a PIM-SM domain to provide RP redundancy, rapid RP fail-over and RP load balancing. This concept was first documented in the following IETF draft:
  draft-ietf-mboned-anycast-rp-nn.txt
The Anycast RP mechanism works as follows:
  Two or more routers are configured as active RP for the same group range at the same time. (This is normally a configuration error that would partition the PIM-SM domain. However, MSDP is used to prevent this from happening.)
  Each RP is assigned the same RP Address. (This is usually accomplished using a Loopback interface and private address space.) Each router advertises its RP address as a host route in the unicast routing protocol.
  Sources and receivers (more specifically, their DRs) will use the closest RP based on their unicast routing table.
  The Anycast RPs are all connected via MSDP. This allows each RP to learn which sources have been registered with the other Anycast RPs in the domain.
  The normal PIM-SM RP behavior will result in the RPs joining the source tree of active sources in the other parts of the network as necessary.


Anycast RP
Benefits
  RP backup without using Auto-RP or BSR.
  RP fail-over at speed of unicast routing protocol.

Requirements
  Use only one IP address for all your RPs.
  RPs advertise this address as a host route.
  MSDP is used between the RP routers.
  Use ip msdp originator-id command.
    Disambiguates which RP originated SA message

Anycast RP Benefits
Anycast RPs provide for backup RPs without having to use Auto-RP or BSR. RP fail-over occurs roughly at the same speed as the unicast routing protocol converges.

Anycast RP Requirements
All Anycast RPs are configured to use the same IP address.
All Anycast RPs advertise this IP address as a host route. This causes the DRs in the network to see only the closest RP. Note: If there does happen to be a metric tie, the normal RPF mechanism will select only one path back to the RP. The path selected will be the one that has the highest next hop address.
All Anycast RPs are tied together via MSDP peering sessions.
The ip msdp originator-id command is used to control the IP address that is sent in any SA messages that are originated by an RP. This is done to disambiguate which RP originated the SA message. If this were not done, all RPs would originate SA messages using the same IP address.
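The "closest RP, highest next hop on a tie" rule can be expressed compactly. The Python sketch below is only an illustration of that selection rule as stated in these notes; the function name select_rpf_path and the (metric, next_hop) tuple layout are hypothetical, and the addresses in the usage note are made up.

```python
import ipaddress


def select_rpf_path(candidates):
    """Pick one (metric, next_hop) pair: lowest metric, then highest next-hop address."""
    return min(
        candidates,
        key=lambda c: (c[0], -int(ipaddress.ip_address(c[1]))),
    )
```

Given candidate paths (20, "192.168.1.1"), (10, "192.168.2.1") and (10, "192.168.3.1") toward the Anycast RP address, the function selects (10, "192.168.3.1"): the two metric-10 paths tie, and the higher next hop address wins.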


Anycast RP Overview

[Figure: RP1 (A) and RP2 (B), each labeled with address 10.1.1.1, connected via MSDP.]


Anycast RP Overview
In the drawing above, two Anycast RPs are configured with the same IP address: RP1 in San Francisco and RP2 in New York. They are connected to each other via MSDP. (Yes, you must use some address other than 10.0.0.1 in the ip msdp peer commands.)


Anycast RP Overview

[Figure: RP1 (A) and RP2 (B), each labeled 10.1.1.1, exchange SA messages over MSDP. Each RP has a nearby source (Src) and receivers (Rec).]

Anycast RP Overview
Notice that initially, the DRs for the sources and receivers register to the closest RP based on their unicast routing table entry for IP address 10.0.0.1. This causes DRs in the eastern half of the U.S. to register/join to the RP in New York while the DRs in the western half register/join to the RP in San Francisco. When a new source registers with the nearest RP, that RP will send an MSDP SA message to its peer. This will cause the peer RP to join the SPT to the new source so it can pull the source's traffic to itself and then send it down the shared tree to its receivers.


Anycast RP Overview

[Figure: Anycast RP fail-over. RP1 (A, 10.1.1.1) is marked with an X; RP2 (B, 10.1.1.1) remains, with the sources (Src) and receivers (Rec) still shown.]

Anycast RP Overview
Continuing with our example, let's assume that the RP in San Francisco goes down. When the unicast routing protocol reconverges, all of the DRs in the western half of the U.S. will now see that the route to IP address 10.0.0.1 points toward the New York RP. This results in new registers/joins being sent by the DRs in the western half of the U.S. to the RP in New York, and the flow of traffic is reestablished.


Anycast RP Configuration

ip pim rp-address 10.0.0.1

RP1
E0 S0

RP2
ip pim rp-address 10.0.0.1

10.1.1.1 via E0

RP1:
  Interface loopback 0
   ip address 10.0.0.2 255.255.255.255
  Interface loopback 1
   ip address 10.0.0.1 255.255.255.255
  !
  ip msdp peer 10.0.0.3 connect-source loopback 0
  ip msdp originator-id loopback 0

RP2:
  Interface loopback 0
   ip address 10.0.0.3 255.255.255.255
  Interface loopback 1
   ip address 10.0.0.1 255.255.255.255
  !
  ip msdp peer 10.0.0.2 connect-source loopback 0
  ip msdp originator-id loopback 0


Anycast RP Example
In this example, two Anycast RPs are configured with the same IP address, 10.0.0.1, using Loopback 1. They are connected to each other via MSDP using their Loopback 0 addresses, 10.0.0.2 and 10.0.0.3. (Yes, you must use some address other than 10.0.0.1 in the ip msdp peer commands.)


Anycast RP Tips
Avoid Anycast RP/Router-ID conflicts
Ensure Loopback address used for Anycast RP address is not accidentally used as a Router-ID.
  This will mess up OSPF and BGP.

How to avoid conflict with Router-ID
  Configure Anycast RP address as the lowest IP address, or
  Use secondary IP address on Loopback for Anycast IP address, or
  Use router-id commands in OSPF and BGP to statically configure Router-ID.

Anycast RP Tips
Care must be taken to prevent the Loopback addresses being used for the Anycast RPs from being accidentally used as the Router-ID for OSPF and BGP. If this occurs, there will be multiple OSPF/BGP routers in the network with the same Router-ID. (Can you say, "My network is broken"? Sure, I knew you could.)
Avoiding the Router-ID conflict:
  Configure the Anycast RP Loopback address using the lowest IP address in the box.
  Configure a secondary address on the Loopback interface and use this address for Anycast RP configuration.
  Use the router-id configuration commands to statically configure the OSPF and/or BGP Router-IDs.
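The "lowest IP address" tip works because, when no router-id is configured, the Router-ID is normally taken from the highest Loopback address on the box. The Python sketch below checks a planned addressing scheme against that rule. The selection rule is stated here as an assumption (verify it for your platform and software release), and the function names are hypothetical.

```python
import ipaddress


def default_router_id(loopbacks):
    """Assumed default: the highest Loopback address becomes the Router-ID."""
    return str(max(ipaddress.ip_address(addr) for addr in loopbacks))


def anycast_conflict(loopbacks, anycast_rp):
    """True if the Anycast RP address would be picked as the default Router-ID."""
    return default_router_id(loopbacks) == anycast_rp
```

With the Loopback addresses from the configuration example (10.0.0.2 and the Anycast address 10.0.0.1), the Router-ID would default to 10.0.0.2 and there is no conflict. If the only other Loopback were lower than 10.0.0.1, the check would flag a conflict.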


Single-Homed, ISP RP, Non-MBGP


PIM Border Constraints
Tail-site Customer
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  int pos0/0
   ip pim sparse-dense-mode

Transit router:
  ip pim send-rp-announce Loopback0 scope 255
  ip pim send-rp-discovery Loopback0 scope 255
  int pos0/0
   ip pim sparse-dense-mode

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, ISP RP, Non-MBGP


Checking PIM Border (RP Mapping)
Tail-site Customer
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

tail-gw#show ip pim rp mapping
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/4
  RP 3.3.3.7 (loopback.transit.net), v2v1
    Info source: 1.1.1.2 (tail.transit.net), via Auto-RP
         Uptime: 21:57:41, expires: 00:02:08

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, ISP RP, Non-MBGP


Checking PIM Border (RP Mapping)
Tail-site Customer
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Transit-tail#show ip pim rp mapping
PIM Group-to-RP Mappings
This system is an RP (Auto-RP)
This system is an RP-mapping agent

Group(s) 224.0.0.0/4
  RP 3.3.3.7 (loopback.transit.net), v2v1
    Info source: 3.3.3.7 (loopback.transit.net), via Auto-RP
         Uptime: 22:08:47, expires: 00:02:14

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, ISP RP, Non-MBGP


Border RPF Check
Tail-site Customer
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  ip route 0.0.0.0 0.0.0.0 1.1.1.2

Transit router:
  ip route 192.168.100.0 255.255.255.0 1.1.1.1
  router bgp 109
   ...
   network 192.168.100.0 nlri unicast multicast

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, ISP RP, Non-MBGP


MSDP RPF Check
Tail-site Customer
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  - no RP / no MSDP

Transit router:
  - no downstream RP
  - no downstream MSDP peering

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, Customer RP, Non-MBGP


PIM Border Constraints
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  ip msdp sa-filter out 1.1.1.2 111
  ip msdp sa-filter in 1.1.1.2 111

[Diagram labels: 192.168.100.0/24, Receiver]

Note: Access-list 111 = Recommended SA Filter



Single-Homed, Customer RP, Non-MBGP


PIM Border Constraints
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

192.168.100.0/24 Receiver

Transit router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  ip msdp sa-filter out 1.1.1.1 111
  ip msdp sa-filter in 1.1.1.1 111

Note: Access-list 111 = Recommended SA Filter



Single-Homed, Customer RP, Non-MBGP


Border RPF Check
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  ip route 0.0.0.0 0.0.0.0 1.1.1.2

Transit router:
  ip route 192.168.100.0 255.255.255.0 1.1.1.1
  router bgp 109
   ...
   network 192.168.100.0 nlri unicast multicast

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, Customer RP, Non-MBGP


MSDP RPF Check
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Transit router (1.1.1.2):
  ip msdp peer 1.1.1.1 connect-source pos0/0

Tail-site customer router (1.1.1.1):
  ip msdp peer 1.1.1.2 connect-source pos0/0

[Diagram labels: 192.168.100.0/24, Receiver]


Single-Homed, Customer RP, MBGP


PIM Border Constraints
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

Tail-site customer router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  ip msdp sa-filter out 1.1.1.2 111
  ip msdp sa-filter in 1.1.1.2 111

[Diagram labels: 192.168.100.0/24, Receiver]

Note: Access-list 111 = Recommended SA Filter



Single-Homed, Customer RP, MBGP


PIM Border Constraints
Tail-site Customer
RP
pos0/0 1.1.1.1 pos0/0 1.1.1.2

Transit AS109
RP

192.168.100.0/24 Receiver

Transit router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  ip msdp sa-filter out 1.1.1.1 111
  ip msdp sa-filter in 1.1.1.1 111

Note: Access-list 111 = Recommended SA Filter



Single-Homed, Customer RP, MBGP


Border RPF Check

[Diagram: Tail-site Customer RP (pos0/0, 1.1.1.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0, 1.1.1.2).]

Customer router:
  router bgp 100
   network 192.168.100.0 nlri unicast multicast
   neighbor 1.1.1.2 remote-as 109 nlri unicast multicast
   neighbor 1.1.1.2 update-source pos0/0

Transit AS109 router:
  router bgp 109
   neighbor 1.1.1.1 remote-as 100 nlri unicast multicast
   neighbor 1.1.1.1 update-source pos0/0


Single-Homed, Customer RP, MBGP


MSDP RPF Check

[Diagram: Tail-site Customer RP (pos0/0, 1.1.1.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0, 1.1.1.2).]

Customer router:
  ip msdp peer 1.1.1.2 connect-source pos0/0

Transit AS109 router:
  ip msdp peer 1.1.1.1 connect-source pos0/0


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Customer AS100 router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  int pos1/0

  ip msdp sa-filter out 1.1.1.2 list 111
  ip msdp sa-filter in 1.1.1.2 list 111


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Transit AS109 router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1

  ip msdp sa-filter out 1.1.1.1 list 111
  ip msdp sa-filter in 1.1.1.1 list 111


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Transit AS110 router:
  Hey, this site knows no multicast, so there is no PIM to constrain.


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Customer AS100 router:
  router bgp 100
   network 192.168.100.0 nlri unicast multicast
   neighbor 1.1.1.2 remote-as 109 nlri multicast
   neighbor 1.1.1.2 update-source pos0/0
   neighbor 1.1.2.2 remote-as 110 nlri unicast
   neighbor 1.1.2.2 update-source pos1/0


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Transit AS109 router:
  router bgp 109
   neighbor 1.1.1.1 remote-as 100 nlri multicast
   neighbor 1.1.1.1 update-source pos0/0


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Transit AS110 router (unicast only; the neighbor is the customer's pos1/0 address):
  router bgp 110
   neighbor 1.1.2.1 remote-as 100
   neighbor 1.1.2.1 update-source pos0/0


Dual-Homed, Customer RP, MBGP
Incongruent Multicast/Unicast

MSDP RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2, multicast transit) and to Transit AS110 (pos0/0 1.1.2.2, unicast transit).]

Customer AS100 router:
  ip msdp peer 1.1.1.2 connect-source pos0/0

Transit AS109 router:
  ip msdp peer 1.1.1.1 connect-source pos0/0

Transit AS110 router:
  Again, no multicast clue, so no MSDP peering.



Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Customer AS100 router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1
  int pos1/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1

  ip msdp sa-filter out 1.1.1.2 list 111
  ip msdp sa-filter in 1.1.1.2 list 111
  ip msdp sa-filter out 1.1.2.2 list 111
  ip msdp sa-filter in 1.1.2.2 list 111


Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Transit AS109 router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1

  ip msdp sa-filter out 1.1.1.1 list 111
  ip msdp sa-filter in 1.1.1.1 list 111


Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

PIM Border Constraints

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Transit AS110 router:
  int pos0/0
   ip pim sparse-mode
   ip pim bsr-border
   ip multicast boundary 1

  ip msdp sa-filter out 1.1.2.1 list 111
  ip msdp sa-filter in 1.1.2.1 list 111


Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Customer AS100 router:
  router bgp 100
   network 192.168.100.0 nlri unicast multicast
   neighbor 1.1.1.2 remote-as 109 nlri unicast multicast
   neighbor 1.1.1.2 update-source pos0/0
   neighbor 1.1.2.2 remote-as 110 nlri unicast multicast
   neighbor 1.1.2.2 update-source pos1/0

Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Transit AS109 router:
  router bgp 109
   neighbor 1.1.1.1 remote-as 100 nlri unicast multicast
   neighbor 1.1.1.1 update-source pos0/0


Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

Border RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Transit AS110 router:
  router bgp 110
   neighbor 1.1.2.1 remote-as 100 nlri unicast multicast
   neighbor 1.1.2.1 update-source pos0/0

Dual-Homed, Customer RP, MBGP
Congruent Multicast/Unicast

MSDP RPF Check

[Diagram: Customer AS100 RP (pos0/0 1.1.1.1, pos1/0 1.1.2.1; receiver on 192.168.100.0/24) linked to Transit AS109 RP (pos0/0 1.1.1.2) and to Transit AS110 RP (pos0/0 1.1.2.2); both provide unicast and multicast transit.]

Customer AS100 router:
  ip msdp peer 1.1.1.2 connect-source pos0/0
  ip msdp peer 1.1.2.2 connect-source pos1/0

Transit AS109 router:
  ip msdp peer 1.1.1.1 connect-source pos0/0

Transit AS110 router:
  ip msdp peer 1.1.2.1 connect-source pos0/0



PIM Protocol Extensions


Module 12

Module12.ppt
Copyright © 1998-2001, Cisco Systems, Inc. All rights reserved.
8/21/2001 3:39 PM

Objectives

Upon completion of this module, you will be able to perform the following tasks:
  Describe the principles of Source-Specific Multicast and configure it.
  Describe the principles of Bidir-PIM and configure it.


Agenda

  Source Specific Multicast
  Bidirectional (Bidir) PIM


Barriers to Inter-domain Multicast Deployment


Multicast Address Allocation
  Dynamic Address Allocation
    No adequate dynamic address allocation methods exist.
    SDR: doesn't scale.
    MASC: a long way off!
  Static Address Allocation (GLOP)
    Based on AS number.
    Insufficient address space for large content providers.

Multicast Content Jammers
  Undesirable sources on a multicast group.
    "Capt. Midnight" sources bogus data/noise to group.
    Can cause DoS attack by congesting low-speed links.


Source Specific Multicast (SSM)

Simple solution for well-known sources
  Particularly in cases where there is a single source sending to a given group.
  Allows immediate use of SPT to a specific source without creating shared tree.
  Eliminates dependence on MSDP for finding sources.
  Simplifies global group address allocation.


Source Specific Multicast


Another variant of PIM Sparse Mode supports Source Specific Multicast (SSM) applications. PIM SS (Source Specific) retains the benefits of sparse mode protocols but eliminates shared trees entirely and builds only source-specific shortest-path trees. These trees are built directly on receipt of group membership reports that request a given source. PIM SS is a draft proposal (draft-bhaskar-pim-ss-00.txt).

SSM is suitable for well-known sources within a domain or in another domain. The Multicast Source Discovery Protocol (MSDP), which is needed for inter-domain multicast routing when regular PIM Sparse Mode is used within a domain, is no longer needed for SSM.

A dedicated multicast group address range, 232/8, is used exclusively for shortest-path trees for SSM. Routers are prevented from building a shared tree for any group in this address range. The address range 232/8 is assigned for global well-known sources. Source Specific Multicast (SSM) is a datagram delivery model that best supports one-to-many applications, also known as broadcast applications.
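The 232/8 rule above can be sketched in a few lines of Python. This is only an illustration, not router code: the function names is_ssm_group and may_build_shared_tree are made up for this example, and the only fact taken from the text is that 232/8 is the SSM range in which shared trees must not be built.

```python
import ipaddress

# 232/8 is the SSM range named in the notes above.
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def is_ssm_group(group: str) -> bool:
    """Return True if the group address falls inside 232/8."""
    return ipaddress.ip_address(group) in SSM_RANGE


def may_build_shared_tree(group: str) -> bool:
    """Hypothetical helper: a (*,G) shared tree is allowed only outside 232/8."""
    return not is_ssm_group(group)
```

For example, may_build_shared_tree("232.3.4.5") is False, while a group such as 239.1.1.1 is still handled by ordinary sparse mode.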


SSM Overview
Hosts initiate join requests for specific source(s) within a group.
Last-hop router sends (S,G) join toward source without joining/creating shared tree.
Content identified by both source and group address instead of group address alone.
Eliminates shared tree, simplifying address allocation.
  Dissimilar content sources can use same group without fear of interfering with each other.

SSM: For Well-Known Sources


Source Specific Multicast allows the last-hop router to send an (S,G) Join toward the source immediately. The PIM Sparse Mode (*,G) Join toward the RP is thus eliminated entirely, and first-hop routers start forwarding multicast traffic down the shortest-path tree (SPT) from the very beginning, as soon as the SPT is built on receipt of the first (S,G) Join. The assigned address range 232/8 also simplifies address allocation, since it is a global range for sources that have to be well known. Router implementations must not build any shared tree for those groups. Source-specific groups can coexist with other groups in PIM Sparse Mode domains.


Source Specific Multicast Example


[Diagram labels: Source; router A; PIM (S,G) Join; IGMPv3 (S,G) Join; Receiver 1; out-of-band source directory (example: web server).]

Host learns of source, group/port
First-hop learns of source, group/port
First-hop sends PIM (S,G) Join

SSM Example
The prerequisite for SSM deployment is a mechanism that allows hosts to report not only the group they want to join but also the source for that group. This mechanism is built into the emerging IGMP version 3 standard. With IGMPv3, the last-hop router learns both the multicast source and the group from the report. It then simply creates an (S,G) Join and forwards it directly toward the source. Hosts can learn about the existence of sources in different ways, normally via some directory service (session announcements directly from sources or some out-of-band mechanism, e.g. web pages). Most of those mechanisms distribute the information via multicast.


Source Specific Multicast Example


Source

Result: Shortest path tree rooted at the source, with no shared tree.

Out-of-band source directory, example: web server

Receiver 1

SSM Example
The result of building a source-rooted tree (shortest-path tree) right from the beginning is that RP mechanisms for source-specific groups are eliminated completely. RPs for those groups are no longer needed, and routers must not build shared trees for groups in the range 232/8. The benefits of building shortest-path trees directly (and not via the PIM Sparse Mode switchover mechanism) are evident: the latency of multicast traffic is decreased and less multicast state is kept in multicast forwarding tables.

Another major benefit of SSM is in address management. Traditionally, multicast applications had to acquire a unique IP multicast group address because traffic distribution was based only on the group address used. When two applications with different sources and receivers used the same IP multicast group address, the receivers received the traffic from both sources. In SSM, traffic from each source is forwarded between routers in the network independently of traffic from other sources. Thus different sources can reuse multicast group addresses in the SSM range.
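The address-reuse argument can be made concrete with a small Python sketch. It models delivery only, under the assumption that a packet is just a (source, group) pair; the function names and all addresses are invented for the illustration and do not come from any router implementation.

```python
def deliver_group_only(packets, joined_groups):
    """Traditional model: a receiver gets every source sending to a joined group."""
    return [(s, g) for (s, g) in packets if g in joined_groups]


def deliver_ssm(packets, joined_channels):
    """SSM model: a receiver gets only the (S,G) channels it subscribed to."""
    return [(s, g) for (s, g) in packets if (s, g) in joined_channels]


# Two unrelated (hypothetical) sources that happen to use the same group address.
asm_packets = [("10.1.1.1", "239.1.1.1"), ("10.2.2.2", "239.1.1.1")]
ssm_packets = [("10.1.1.1", "232.3.4.5"), ("10.2.2.2", "232.3.4.5")]

both_sources = deliver_group_only(asm_packets, {"239.1.1.1"})
one_source = deliver_ssm(ssm_packets, {("10.1.1.1", "232.3.4.5")})
```

With group-only joins the receiver sees both sources; with an (S,G) subscription it sees only the one it asked for, which is why two providers can share a group address in the SSM range.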


Effect of shared trees on SSM

SSM will work with shared trees but...
  Can't control who transmits on shared tree.
  Can't avoid address collisions.

IANA has allocated 232/8 for SSM.
  Shared trees are prohibited in this range.
  Requires special filter configuration for legacy network support.


Effect of Shared Trees on SSM


Source Specific Multicast can coexist with shared trees, but some caveats affect deployment. The assigned SSM address range, 232/8, is not recognized by older multicast implementations, so if a source creates a session to any group in this range, legacy multicast routers could build shared trees for that group. The 232/8 address range was assigned under the assumption that implementations would prohibit building shared trees for the range altogether. With older multicast router implementations this is certainly not true, which can result at least in address collisions, if not in multicast forwarding loops.


Eliminating shared trees in 232/8

Filtering Register Messages
  Use ip pim accept-register at RP.
    Prevents sources from registering in 232/8 range.

Filtering SA Messages
  Use ip msdp redistribute at RP.
    Stops SA message origination in the 232/8 range.
  Use ip msdp sa-filter on MSDP peers.
    Prevents forwarding of SA messages in the 232/8 range.

Eliminating Shared Trees in 232/8


If some multicast routers in the network do not support the Source Specific Multicast 232/8 address range, additional filtering is needed to prevent shared trees from being built for the address range assigned to SSM. The following 232/8 filtering mechanisms have to be in place:
  Prevent first-hop routers from registering with the RP: filter on the RP (e.g. send Register-Stop immediately).
  Prevent last-hop routers from originating (*,G) Joins: filter on last-hop routers.
  Prevent intermediate routers from originating any (S,G) Prune with the RP-bit set: filter on intermediate routers.
  In inter-domain multicast routing using MSDP, prevent origination and/or forwarding of information on sources that are active for groups in the SSM range: filter on the RP or on a border router.
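The four filters listed above share one test: is this shared-tree signalling for a group in 232/8? A minimal Python sketch of that decision follows. The message labels are hypothetical names chosen for this example, not identifiers from IOS or from the PIM and MSDP specifications, and the function models only the decision, not where in the network each filter is applied.

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")

# Hypothetical labels for the four kinds of shared-tree signalling listed above.
SHARED_TREE_MESSAGES = {
    "pim_register",        # first-hop router registering with the RP
    "star_g_join",         # (*,G) Join from a last-hop router
    "sg_prune_rp_bit",     # (S,G) Prune with the RP-bit set
    "msdp_sa",             # MSDP Source-Active message
}


def should_filter(message_type: str, group: str) -> bool:
    """Return True if the message must be dropped to keep 232/8 free of shared trees."""
    in_ssm_range = ipaddress.ip_address(group) in SSM_RANGE
    return in_ssm_range and message_type in SHARED_TREE_MESSAGES
```

An ordinary (S,G) Join is deliberately absent from the set, since source trees are exactly what SSM wants in this range.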


SSM Host Signalling Overview


IETF proposed signalling: IGMPv3
  draft-ietf-idmr-igmp-v3-04.txt
  Proposed for IP SSM
  Also for filtering in RFC 1112 style IP Multicast service.

IGMPv3 will only be active ...
  IF supported in last-hop routers
  AND IF supported in host operating systems
  AND IF supported in receiver applications

IGMPv3 Host Signalling


(Too?) Complex protocol!
  No report suppression by hosts.
  Complex set-theory rules for Include/Exclude processing.

Host membership reports sent to 224.0.0.22.
  Allows for implementation of IGMPv3 snooping in less expensive switches.

Possible memberships for hosts:
  INCLUDE({S1,...,Sn}, G): traffic from {S1,...,Sn} to G
  EXCLUDE({S1,...,Sn}, G): traffic from all sources to G, except for traffic from {S1,...,Sn}
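The two membership types can be read as a simple set test on the source address. The Python sketch below shows only that reading; wants_traffic is a made-up name, the addresses are arbitrary examples, and the full IGMPv3 state-merging rules are not modelled.

```python
def wants_traffic(mode: str, sources: set, source: str) -> bool:
    """Does a host with this membership for group G want traffic from `source`?

    mode is "INCLUDE" or "EXCLUDE"; sources is the set {S1,...,Sn}.
    """
    if mode == "INCLUDE":
        return source in sources
    if mode == "EXCLUDE":
        return source not in sources
    raise ValueError(f"unknown membership mode: {mode}")
```

Note that EXCLUDE with an empty source set accepts every source, which is the classic group-only join.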

IGMPv3 Host Signalling


IGMPv3 interaction inside of SSM-Range:
  Routers ignore IGMPv1/v2 membership reports.
  Routers ignore IGMPv3 EXCLUDE type membership reports.
  Uses only IGMPv3 INCLUDE({S1,...,Sn},G) memberships!
  No way to request all sources' traffic.

IGMPv3 interaction outside of SSM-Range:
  Filters traffic on local wire according to Include/Exclude membership reports received from hosts.
  Permits PIM to filter traffic back to source.
    PIM-DM: Simple, only (S,G) state anyhow.
    PIM-SM: More complex but doable. (SPT vs. RPT filtering)
    Bidir-PIM: No way, there is only (*,G) state!
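The inside/outside distinction above amounts to a small acceptance rule for membership reports. Here is a hedged Python sketch of it: router_accepts_report is a hypothetical name, the mode argument is None for IGMPv1/v2 reports by convention of this example, and the rule for groups outside 232/8 is simplified to "every report type is processed".

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def router_accepts_report(igmp_version: int, mode, group: str) -> bool:
    """Decide whether a membership report is acted on.

    mode is "INCLUDE" or "EXCLUDE" for IGMPv3 and None for IGMPv1/v2.
    """
    if ipaddress.ip_address(group) not in SSM_RANGE:
        # Outside the SSM range all report types are processed (simplified).
        return True
    # Inside the SSM range only IGMPv3 INCLUDE memberships count.
    return igmp_version == 3 and mode == "INCLUDE"
```

This is also why the bootstrap solutions later in the module need an extra (S,G) signal: an IGMPv1/v2 report for a 232/8 group is ignored by this rule on its own.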

SSM Host Signalling


How to bootstrap deployment?

IGMPv3:
  Should eventually become industry standard.
  Cisco IGMPv3 implementation in IOS 12.1(3)T and 12.0(15)S.

Questions:
  When will host operating systems get IGMPv3 support?
  When will applications be written to use IGMPv3 support?
  Do we want to wait for all this to happen?

Answer: No!
  We need the benefits of IP SSM today to:
    Resolve certain multicast security issues
    Avoid address collisions

SSM Host Signalling


Bootstrap Solutions

Cisco IOS value-added IP SSM bootstrap solutions
  URD (URL Rendezvous Directory):
    Enable existing receiver applications for IP SSM via the web.
  IGMP v3lite:
    Provide for a partial IGMPv3 API on IGMPv1/v2 hosts.
    Enables us to write and run IP SSM applications NOW.

Common idea of URD and IGMP v3lite:
  Generate an (S,G) channel subscription (somehow) in addition to the IGMPv1/v2 membership that MUST already come from the kernel of the application's host...
  ...and let the router figure out the right thing to do.

SSM Host Signalling


Bootstrap Solutions

URD and IGMP v3lite Concepts:
  IGMPv1/v2 report alone has no meaning in SSM!
    IGMPv1/v2 report indicates host wants to receive traffic.
    (S,G) subscription tells routers from which sources!

  Router begins forwarding (S,G) traffic IFF:
    IGMPv1/v2 Report received AND
    URD or v3lite (S,G) channel subscription received.

  Router continues forwarding (S,G) traffic:
    Based on IGMPv1/v2 Group membership.
    No need for refreshing of (S,G) subscriptions.
    Implication: (S,G) traffic flows until last host leaves group!
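The begin/continue rules above behave like a small state machine per (S,G) channel. The Python sketch below is one way to picture it and is not Cisco's implementation: the class and method names are invented, and dropping the subscription when the last member leaves is an assumption of this example, since the slide says only that traffic flows until the last host leaves.

```python
class ChannelState:
    """State a last-hop router might keep for one (S,G) channel on one interface."""

    def __init__(self):
        self.members = set()      # hosts with an IGMPv1/v2 membership for G
        self.subscribed = False   # URD or v3lite (S,G) subscription seen

    def igmp_report(self, host):
        self.members.add(host)

    def channel_subscription(self):
        # Received once; the rules above say it needs no refreshing.
        self.subscribed = True

    def igmp_leave(self, host):
        self.members.discard(host)
        if not self.members:
            # Assumption: the subscription is dropped with the last member.
            self.subscribed = False

    def forwarding(self):
        # Forward IFF a group membership exists AND a subscription was received.
        return bool(self.members) and self.subscribed
```

A report alone leaves forwarding() False; adding the subscription turns it on, and it stays on through later joins and leaves until the member set is empty.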

SSM Host Signalling Bootstrap Solutions

URD and IGMP v3lite Concepts:
  IGMP (v1/v2) Snooping and CGMP still work!
    Group-based traffic restriction due to IGMPv1/v2 reports.
    IGMP Snooping will not work with full IGMPv3 reports unless supported explicitly!

  Solutions don't work outside of IP SSM range.
    IGMPv1/v2 reports still interpreted as (*,G) Joins!
    Causes router to join the Shared Tree.
      Implies forward ALL sources in the group!


URD Overview

A content provider builds a web page that contains URD links.
  List of sources willing to provide multicast content
The user (receiver) clicks on one of the links.
A CGI script runs that provides the host an HTTP redirect to TCP port 659.
When the host follows the redirect, the request is intercepted by the last-hop router (directly connected to the host).

URD Overview
The idea of URD as an interim solution for transition to IGMP v3 is that the content provider builds a web page that contains URD links. Those links contain information on sources that are willing to provide the multicast content for certain groups. When a user clicks on such a link the browser of a host will try to open a TCP connection to the web server on port 659. If the last hop router is enabled for URD on the interface where the router receives the TCP packets from the host, it will intercept all packets for TCP connections destined to port 659 independent of the actual destination address of the TCP connection. From the information in URD the router learns about sources and groups. Because normal IGMPv1/v2 group membership reports are still sent by the application, URD is compatible with IGMPv1/v2 snooping and CGMP in switches.


URD Host Signalling


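[Screenshot: the user's desktop with the user's favourite browser. The page reads "Please select your TV program:" with three links: "Click here for the Movie", "Click here for the Sports", "Click here for the News". The user clicks the Sports link, http://www.broadcast.com/sports.htm.]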

URD Host Signalling


[Screenshot: the browser now shows "Thank you for choosing this Sports channel. Currently showing Euro 2000 Soccer live from Brussels. England : Germany 3:1, Min 89:00".]


URD Host Signalling


[Screenshot: the Sports channel page in the browser, plus an old streaming video receiver application that does IP Multicast, but not IP SSM.]

Works fine if we don't try to run it in the SSM range.



URD Host Signalling


[Screenshot: the Sports channel page in the browser, plus an old streaming video receiver application that does IP Multicast, but not IP SSM.]

That is... unless some unwanted traffic disturbs the reception, maybe some DoS attack...

URD Host Signalling


[Screenshot: the Sports channel page in the browser, plus an old streaming video receiver application that does IP Multicast, but not IP SSM.]

Running the application in the SSM range alone does not help: the application will receive nothing!

URD Host Signalling


[Screenshot: the Sports channel page in the browser, plus an old streaming video receiver application that does IP Multicast, but not IP SSM. Browser status bar: "Retrieved URL String successfully".]

But thanks to URD, the old application can run on an address in the SSM range and will only receive traffic from the right source!


URD Host Signalling


Let's repeat this in slow motion...

[Screenshot: the browser shows the program selection page ("Please select your TV program:" with links for the Movie, the Sports, and the News).]


URD Host Signalling


[Screenshot: the browser shows the program selection page ("Please select your TV program:" with links for the Movie, the Sports, and the News). The user clicks the Sports link, http://www.broadcast.com/sports.htm.]

0. The user sees some HTML page in his browser.
1. The user clicks on a hotlink. A hotlink is a URL that the browser will then start to retrieve (via HTTP).
2. The browser learns that the content of the URL is another HTML page.

URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel".]

3. The browser will clear the display to start painting this new HTML page. The browser then starts reading and interpreting that HTML page.


URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel".]

3. While interpreting, the browser stumbles across a reference to another URL.

View source: http://www.broadcast.com/sports.htm

...
<FRAME SRC="http://sessions.broadcast.com/sports.sdp" NAME="Frame to start receiver app">
...


URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel"; status bar: "Transferring from sessions.broadcast.com".]

3... It will retrieve this URL and see from the content-type (NOT HTML!) that this is input for an application that it has to start (or run as a plugin).

HTTP connection to sessions.broadcast.com for /sports.sdp:

GET /sports.sdp HTTP/1.0
...
Content-type: application/x-sdp
Content-length:

Actual URL content:

i=Sports Channel
c=232.3.4.5
...



URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel"; status bar: "Transferring from sessions.broadcast.com".]

4. The browser will look into its application mappings for this content-type x-sdp, and start the appropriate application: our old player.

URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel"; status bar: "Transferring from sessions.broadcast.com". The URL content ("i=Sports Channel", "c=232.3.4.5", ...) is passed to the player application.]

4. While doing so, the browser will also hand over the actual URL content to that application (typically in a file as a command line argument for the application).

URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel"; status bar: "Transferring from sessions.broadcast.com". The player application sends an IGMPv1/v2 Join Group for 232.3.4.5.]

5. From this URL, the application knows the multicast group to use, and it will join that group.



URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel"; status bar: "Transferring from sessions.broadcast.com".]

6. But the application will not yet receive traffic, because it is an IP SSM group, and this old application's group membership report alone is not good enough!



URD Host Signalling


[Screenshot: the browser shows "Thank you for choosing this Sports channel. Currently showing Euro 2000 Soccer live from Brussels"; status bar: "Transferring from sessions.broadcast.com".]

7. Back to the browser, which continues to interpret and display its original HTML page...

URD Host Signalling


[Figure: the user's desktop; the browser shows "Thank you for choosing this Sports channel. Currently showing Euro 2000 Soccer live from Brussels. England : Germany 3:1 Min 89:00".]

View source: http://www.broadcast.com/sports.htm

...
<FRAME SRC="http://www.broadcast.com:659/urd-helper?group=232.3.4.5&source=192.44.81.5" NAME="URD command URL">
...

8. ...and stumbles across another embedded URL that it needs to retrieve.
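As an illustration of what a URD command URL carries, here is a minimal Python sketch that extracts the group and source from the URL shown on the slide. The function name `parse_urd_url` is hypothetical, and the checks (port 659, multicast group address) are assumptions about what a careful parser would verify, not a description of the IOS code.

```python
from urllib.parse import urlsplit, parse_qs
import ipaddress

URD_PORT = 659  # TCP port used by the URD command URL in the slides


def parse_urd_url(url):
    """Return (source, group) if url looks like a URD command URL, else None.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.port != URD_PORT:
        return None
    query = parse_qs(parts.query)
    try:
        group = ipaddress.IPv4Address(query["group"][0])
        source = ipaddress.IPv4Address(query["source"][0])
    except (KeyError, ValueError):
        return None
    if not group.is_multicast:
        return None
    return str(source), str(group)


url = "http://www.broadcast.com:659/urd-helper?group=232.3.4.5&source=192.44.81.5"
print(parse_urd_url(url))
```

The result is the (S,G) pair that the rest of the walkthrough refers to.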



URD Host Signalling


[Figure: zoomed-out view. The host (the user's desktop with its browser) connects through the last-hop router, running IOS 12.1(3)T or later with "ip urd" enabled on the interface to the host, across the Internet to the web server www.broadcast.com.]

Let's zoom out a bit...


URD Host Signalling


[Figure: the host, the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host), the Internet, and the web server www.broadcast.com.]

If the browser tries to retrieve the URL http://www.broadcast.com:659/urd-helper?group=232.3.4.5&source=192.44.81.5


URD Host Signalling


[Figure: the host, the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host), the Internet, and the web server www.broadcast.com.]

If the browser tries to retrieve the URL http://www.broadcast.com:659/urd-helper?group=232.3.4.5&source=192.44.81.5
Then it wants to open a TCP connection to www.broadcast.com, port 659


URD Host Signalling


[Figure: the host, the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host), the Internet, and the web server www.broadcast.com.]

If the browser tries to retrieve the URL http://www.broadcast.com:659/urd-helper?group=232.3.4.5&source=192.44.81.5
Then it wants to open a TCP connection to www.broadcast.com, port 659
But it only gets as far as the first-hop router, which intercepts all TCP connections to port 659, whatever their destination address!

URD Host Signalling


[Figure: watching the TCP connection. The host's browser sends "GET /urd-helper?group=232.3.4.5&source=192.44.81.5 HTTP/1.0" towards the web server www.broadcast.com; the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host) sits between the host and the Internet.]

The router disguises itself as a web server and listens to what the host wants to have.


URD Host Signalling


[Figure: watching the TCP connection. The host's browser sends "GET /urd-helper?group=232.3.4.5&source=192.44.81.5 HTTP/1.0" towards the web server www.broadcast.com; the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host) intercepts it.]

And the router thinks:

"I understand this URL request. Let's remember to PIM-SSM join to group 232.3.4.5 for source 192.44.81.5 if, or as soon as, I also have an IGMPv1/v2 group membership report for 232.3.4.5 from this interface."


URD Host Signalling


[Figure: watching the TCP connection. The host's browser has sent "GET /urd-helper?group=232.3.4.5&source=192.44.81.5 HTTP/1.0" towards the web server www.broadcast.com; the last-hop router (IOS 12.1(3)T or later, "ip urd" enabled on the interface to the host) replies in its place.]

And so the router answers:

HTTP/1.1 200 OK
Server: cisco IOS
Content-Type: text/html
<html>
<body>
Retrieved URL string successfully
</body>
</html>

And closes the TCP connection.


URD Host Signalling


[Figure: the last-hop router sends a PIM join (192.44.81.5, 232.3.4.5) across the Internet towards the video source 192.44.81.5; the host sends IGMPv1/v2 membership reports for 232.3.4.5.]

And once it sees the first IGMPv1/v2 report for the group (from the application), the router joins to the source via PIM-SSM and stays joined as long as the IGMPv1/v2 group reports keep coming in.

Note: The URL request from the browser and the first IGMPv1/v2 report from the application may arrive in any order within ~1 minute.
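The pairing of the intercepted URL request with the IGMPv1/v2 report can be pictured with a small Python sketch. This is a toy model under stated assumptions, not IOS code: the class name `UrdInterface`, the 60-second window (standing in for the "~1 minute" above), and the explicit `now` timestamps are all hypothetical.

```python
class UrdInterface:
    """Toy model of one URD-enabled router interface (illustrative only)."""

    def __init__(self, window=60.0):
        self.window = window          # assumed value for the "~1 minute" window
        self.pending_sources = {}     # group -> (source, time URL was intercepted)
        self.reported_groups = {}     # group -> time of latest IGMPv1/v2 report
        self.joined = set()           # (source, group) pairs joined via PIM-SSM

    def on_urd_request(self, source, group, now):
        """Called when a URD command URL for (source, group) is intercepted."""
        self.pending_sources[group] = (source, now)
        return self._maybe_join(group)

    def on_igmp_report(self, group, now):
        """Called when an IGMPv1/v2 membership report for group arrives."""
        self.reported_groups[group] = now
        return self._maybe_join(group)

    def _maybe_join(self, group):
        # Both pieces of information are required before joining.
        if group not in self.pending_sources or group not in self.reported_groups:
            return None
        source, t_url = self.pending_sources[group]
        if (source, group) not in self.joined:
            # The two events may arrive in either order, but close together.
            if abs(t_url - self.reported_groups[group]) > self.window:
                return None
            self.joined.add((source, group))
        return (source, group)


iface = UrdInterface()
print(iface.on_urd_request("192.44.81.5", "232.3.4.5", now=0.0))
print(iface.on_igmp_report("232.3.4.5", now=5.0))
```

The first call returns nothing to join yet; the second completes the pair and yields the (S,G) to join. Timer expiry when reports stop arriving is deliberately left out of the sketch.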

URD Host Signalling


[Figure: the last-hop router has sent a PIM join (192.44.81.5, 232.3.4.5) across the Internet towards the video source 192.44.81.5; the host keeps sending IGMPv1/v2 membership reports for 232.3.4.5.]

And finally the picture arrives, and it is forwarded as long as the application runs and sends IGMPv1/v2 membership reports.


URD Host Signalling


[Figure: the user's desktop with the browser and the old streaming video receiver application, which does IP Multicast but not IP SSM. A browser frame shows "Retrieved URL String successfully".]

And all the user might notice is the string returned by the router (which may be hidden)!

URD Configuration

Enable IP SSM for existing applications


Works with every browser that supports frames (or one click more for those without)

No plugins required
Complete host platform independence

Nothing to configure on the host
URL easily added to WWW server HTML pages
No additional CGI scripts required.


URD - Cisco's Implementation


Supported in IOS 12.1(3)T, 12.0(15)S and later.
Supported in the process, fast and CEF paths.
Intercepting solely based on TCP port 659.
If the first-hop router is not URD enabled, the web server may want to reply to HTTP on that port too (error discovery).

Port 659 assigned by IANA for Cisco URD.
URD - URL Rendezvous Directory
The name still carries the idea that it is also quite simple to write a CGI script that completely emulates an RP, i.e. add web pages that you click on if you are a source, and the script then creates the URD command URLs for the receivers.

IGMP v3lite Overview


Source side:
No application changes required!

Receiver side IGMPv3 API:


draft-ietf-idmr-msf-api-00.txt: Socket Interface Extensions for Multicast Source Filter
Supports all memberships possible with IGMPv3: group membership with INCLUDE or EXCLUDE list of sources.
Different subsets of the API defined (one for IP SSM).
Kernel implementation of this will also filter out any unwanted received traffic still forwarded to the host.
Layer 2 of hosts is not IP SSM aware; input filtering is on group only.
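The INCLUDE/EXCLUDE semantics mentioned above can be sketched in a few lines of Python. This only models the filtering decision; it does not call any real socket API, and the function name `passes_filter` and the dictionary layout are assumptions made for illustration.

```python
def passes_filter(filters, group, source):
    """Decide whether a packet from `source` to `group` should be delivered.

    `filters` maps group -> (mode, set_of_sources), where mode is
    "INCLUDE" or "EXCLUDE". Hypothetical data layout for illustration.
    """
    if group not in filters:
        return False  # no membership for this group at all
    mode, sources = filters[group]
    if mode == "INCLUDE":
        return source in sources        # only listed sources are wanted
    if mode == "EXCLUDE":
        return source not in sources    # everything except listed sources
    raise ValueError(f"unknown filter mode: {mode}")


# IP SSM uses the INCLUDE subset with the wanted source(s) listed explicitly.
filters = {"232.3.4.5": ("INCLUDE", {"192.44.81.5"})}
print(passes_filter(filters, "232.3.4.5", "192.44.81.5"))
print(passes_filter(filters, "232.3.4.5", "10.9.9.9"))
```

Because Layer 2 input filtering is on the group only, a check of this kind is what removes traffic from unwanted sources that still reaches the host.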

IGMP v3lite Overview

IGMP v3lite HSIL (Host Side IGMP Library)


Provides for the IP SSM subset of the IGMPv3 API.
Applications must still filter out unwanted traffic.
Forward compatible with OS-supported IGMPv3:
Recompile of application without HSIL.
HSIL may also be able to detect and support host-native IGMPv3 if available.


IGMP v3lite Host Signalling


[Figure: on the host, the IP SSM application(s) use the IP SSM API provided by HSIL, alongside the IGMP v3lite daemon and the host operating system. The host attaches to a Cisco IOS 12.1(3)T or later router with "ip igmp v3lite" enabled.]


IGMP v3lite Host Signalling


[Figure: host with IP SSM application(s), HSIL, the IGMP v3lite daemon and the host operating system, attached to a Cisco IOS 12.1(3)T or later router with "ip igmp v3lite" enabled. The application issues Join (S,G) through the IP SSM API.]


IGMP v3lite Host Signalling


[Figure: host with IP SSM application(s), HSIL, the IGMP v3lite daemon and the host operating system, attached to a Cisco IOS 12.1(3)T or later router with "ip igmp v3lite" enabled. The application issues Join (S,G) through the IP SSM API; a Join (G) is passed to the host operating system.]


IGMP v3lite Host Signalling


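[Figure: host with IP SSM application(s), HSIL, the IGMP v3lite daemon and the host operating system, attached to a Cisco IOS 12.1(3)T or later router with "ip igmp v3lite" enabled. The application issues Join (S,G) through the IP SSM API and a Join (G) is passed to the host operating system. A membership report INCLUDE (S,G) is sent to the router on UDP port 659, and the host operating system sends an IGMPv1/v2 membership report for G.]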



IGMP v3lite - Cisco's Implementation


Solution to start developing and deploying IP SSM applications with an IGMPv3 API subset.
Router side supported in IOS 12.1(3)T and later.
Host side written by Whitebarn for Cisco.
Supported for typical OSs (Windows, Unix, Linux).
Host side binaries will be freely downloadable from www.whitebarn.com.
Supported ONLY at the IP SSM API in the host, i.e. do not try to write your own HSIL or daemon and expect IOS to do the right thing.

SSM Summary

Solves multicast address allocation problems.


Flows differentiated by both source and group.
Not just by group.

Content providers can use same group ranges.


Since each (S,G) flow is unique.

Helps prevent certain DoS attacks


Bogus source traffic:
Can't consume network bandwidth.
Not received by host application.


Agenda

Source Specific Multicast
Bidirectional (Bidir) PIM


Multicast Application Categories


One-to-Many Applications
Video, TV, Radio, Concerts, Stock Ticker, etc.

Few-to-Few Applications
Small (<10 member) Video/Audio Conferences

Few-to-Many Applications
TIBCO RV Servers (Publishing)

Many-to-Many Applications
Stock Trading Floors, Gaming

Many-to-Few Applications
TIBCO RV Clients (Subscriptions)

Multicast Application Categories


PIM-SM (S, G) State

One-to-Many Applications
Single (S,G) entry

Few-to-Few Applications
Few (<10 typical) (S,G) entries

Few-to-Many Applications
Few (<10 typical) (S,G) entries

Many-to-Many Applications
Unlimited (S,G) entries

Many-to-Few Applications
Unlimited (S,G) entries

Many-to-Any State Problem


Creates huge amounts of (S,G) state
State maintenance workloads skyrocket
High OIL fanouts make the problem worse

Router performance begins to suffer

Using Shared-Trees only.


Provides some (S,G) state reduction.
Results in (S,G) state only along SPT to RP.
Frequently still too much (S,G) state.
Need a solution that only uses (*,G) state.
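To make the scale of the problem concrete, the following Python sketch compares entry counts under a deliberately simple assumption: a router that sits on every tree holds one (*,G) entry per group plus one (S,G) entry per active source per group, whereas a (*,G)-only scheme holds one entry per group. The function names and the example numbers are hypothetical.

```python
def sg_based_entries(num_groups, sources_per_group):
    """Worst-case entries on one router when every source has (S,G) state."""
    # one (*,G) per group plus one (S,G) per source in that group
    return num_groups * (1 + sources_per_group)


def star_g_only_entries(num_groups):
    """Entries on one router when only (*,G) state is kept."""
    return num_groups


# Example: 10 groups, each with 500 participants that all send.
print(sg_based_entries(10, 500))
print(star_g_only_entries(10))
```

The first figure grows with the number of senders; the second depends only on the number of groups, which is the property a many-to-any application needs.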

Eliminating (S,G) State


Solution 1

Register-Encapsulate all data to RP


Easy to implement
RP never bothers to send a Register-Stop

Effectively IP-IP tunneling traffic to RP.
Still results in (S,G) state in:
The RP
The first-hop routers

Each packet must be de-encapsulated:
Process-switched.
Only feasible if data rates are very low.

Eliminating (S,G) State


Solution 2

Bidirectional Shared-Trees
Allows data to flow up the Shared Tree
Source traffic follows Shared Tree to get to the RP and all other receivers on the Shared Tree

Cannot use current (*,G) RPF rules


Care must be taken to avoid multicast loops

Requires a Designated Forwarder (DF)


Responsible for forwarding traffic up Shared Tree
DFs will accept data on the interfaces in their OIL.
Then send it out all other interfaces. (Including the IIF.)

Bidirectional (Bidir) PIM


Idea:
Use the same tree for traffic from sources towards RP and from RP to receivers

Benefits:
Less state in routers
Only (*,G) state is used
Source traffic follows the Shared Tree
Flows up the Shared Tree to reach the RP.
Flows down the Shared Tree to reach all other receivers.


Bidir PIM
PIM Sparse Mode in its native form is unidirectional: the traffic from sources to the RP initially flows encapsulated in Register messages, which presents a significant burden due to the encapsulation/decapsulation mechanisms. Additionally, a shortest path tree is built between the RP and the source (initiated by the RP), which results in (*,G) and (S,G) entries at least on the path between the RP and the source.

Several multicast applications use a many-to-many model in which each participant is both receiver and sender. In such an environment, (*,G) and (S,G) entries appear everywhere along the path between the participants and the associated RP in a PIM Sparse Mode domain, resulting in increased memory and protocol overhead. It is also possible that the path from the source to the RP and the opposite path (from the RP to the source, which is a receiver as well) are incongruent.

Bi-directional PIM dispenses with both encapsulation and source state by allowing packets to be natively forwarded from a source to the RP using shared tree state only. This ensures that only (*,G) entries appear in multicast forwarding tables and that the path taken by packets flowing from the participant (source and/or receiver) to the RP and vice versa is the same.


Bidirectional PIM Overview

[Figure: a shared tree rooted at the RP, connecting two receivers and a sender/receiver.]



Bidirectional PIM Overview

[Figure: a shared tree rooted at the RP, connecting two receivers and a sender/receiver; source traffic follows the shared tree.]

Source Traffic forwarded bidirectionally using (*,G) state.

Bidirectional PIM Overview

[Figure: a shared tree rooted at the RP, connecting two receivers and a sender/receiver; source traffic follows the shared tree.]

(*,G) State created only along the Shared Tree.
Source Traffic forwarded bidirectionally using (*,G) state.

PIM Modifications for Bidir Operation

Designated Forwarders (DF)


On each link the router with the best path to the RP is elected to be the DF.
Note: Designated Routers (DR) are not used for bidir groups.

The DF is responsible for forwarding traffic upstream towards the RP.
No special treatment is required for local sources.

PIM Modifications for Bidir Operation


The major modification of PIM Sparse Mode to support bidirectional mode is the addition of a Designated Forwarder (DF), which takes over the role of the Designated Router (DR) and has the following responsibilities:
It is the only router that forwards packets travelling downstream (towards receiver segments) onto the link.
It is the only router that picks up upstream-travelling packets (away from the source) off the link and forwards them towards the RP.

There is one DF per RP for bidirectional group(s) on each link. One and only one election is performed at RP discovery time. There is no constant control traffic; control messages appear only on changes. The election is robust and enforces a consistent view on all routers on the link. The router with the best unicast route to the RP is elected as the DF.

The election has no effect on local sources: their traffic reaches locally attached receivers directly, and special treatment is no longer required when the sources are directly connected to a router. Data from those sources is automatically picked up by the DF and forwarded towards the RP.


Forwarding / Tree Building


Downstream routers with receivers Join towards the DF.

[Figure: the RP, routers N1 to N4 and receivers R1 and R2; the downstream routers send "Join to DF" messages towards the DF, building the shared tree.]

Forwarding / Tree Building


The DF also forwards / initiates all (*,G) Joins towards the RP for the active group. Downstream routers forward their (*,G) Joins via upstream DFs. They indicate that in the Upstream Router field of a PIM Join message.


Forwarding / Tree Building


Downstream routers with receivers Join towards the DF.
DF adds link to (*,G) olist and Joins towards the RP.

[Figure: the RP, routers N1 to N4 and receivers R1 and R2; the DF sends a Join towards the RP along the shared tree.]

Forwarding / Tree Building


When a (*,G) Join is received by the DF on the link, it adds the link to the outgoing interface list (OIL) for the group. If the entry already exists, the interface timer is refreshed. The (*,G) Join is then forwarded by the DF towards the RP for the group.


Forwarding / Tree Building


Downstream routers with receivers Join towards the DF.
DF adds link to (*,G) olist and Joins towards the RP.
Shared Tree is now built.

[Figure: the RP, routers N1 to N4 and receivers R1 and R2, connected by the completed shared tree.]

Forwarding / Tree Building


Downstream routers with receivers Join towards the DF.
DF adds link to (*,G) olist and Joins towards the RP.
Shared Tree is now built.
The DF forwards all traffic from the link upstream towards the RP.
At the same time, traffic flows down the tree.

[Figure: the RP, routers N1 to N4, receivers R1 and R2, and source S1; the DF forwards S1's traffic up the shared tree.]

Forwarding/Tree Building
The Designated Forwarder (DF) has all the responsibilities for forwarding multicast traffic in bidirectional PIM. It has to forward multicast traffic received on a link for which it is the DF via the RPF interface towards the RP (in addition to forwarding the traffic via the interfaces in the OIL, excluding the interface on which the traffic was received).
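The forwarding rule described in these notes can be written down as a small Python function. This is a sketch of the rule as stated here, not router code: the function name `bidir_forward`, the interface names, and the `df_links` argument (the set of interfaces on which this router won the DF election) are assumptions for illustration.

```python
def bidir_forward(in_iface, rpf_iface, oil, df_links):
    """Return the set of interfaces a bidir (*,G) packet is sent out of.

    in_iface  -- interface the packet arrived on
    rpf_iface -- interface towards the RP
    oil       -- outgoing interface list of the (*,G) entry
    df_links  -- interfaces on which this router is the DF
    """
    if in_iface == rpf_iface:
        # Downstream traffic arriving from the RP direction: send down the tree.
        return set(oil) - {in_iface}
    if in_iface not in df_links:
        # Only the DF picks upstream-travelling packets off a link.
        return set()
    # Upstream traffic: towards the RP, plus every other OIL interface.
    return (set(oil) | {rpf_iface}) - {in_iface}


print(sorted(bidir_forward("e1", "e0", {"e1", "e2"}, {"e1", "e2"})))
```

Here a packet arriving on e1 goes out of e0 (towards the RP) and e2 (another branch of the tree), but never back out of e1.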


Forwarding / Tree Building


Downstream routers with receivers Join towards the DF.
DF adds link to (*,G) olist and Joins towards the RP.
Shared Tree is now built.
The DF forwards all traffic from the link upstream towards the RP.
At the same time, traffic flows down the tree.
Downstream traffic is forwarded through the DF.

[Figure: the RP, routers N1 to N4, receivers R1 and R2, and sources S1 and S2; traffic flows both up and down the shared tree through the DF.]

Forwarding / Tree Building


The branch of the tree built via (*,G) Joins is bidirectional, which means that:
the traffic from upstream sources follows the same (downstream) path that was built with (*,G) Joins and is forwarded to the link by the same DF
a single path through the DF is enforced for traffic travelling upstream to the RP


Designated Forwarder Election


Performed as soon as a bidir RP is learned via Auto-RP or BSR.
Elects the router on the link with the best path to the RP.
Ensures all routers on link have a consistent view of the winner identity and metrics.
Uses assert-like metric comparison rules to pick best path.

DF Election
The election of a Designated Forwarder on each link follows principles similar to those of the Assert process in PIM Dense Mode. The mechanism ensures that all the routers on the link have a consistent view of the same RP. To perform the election of the DF for a particular RP, routers on a link need to exchange their unicast routing metric information for reaching the RP.

Note: The election of a DF is per RP and not per individual group.

The election process happens only once, when information on a new RP becomes available. There are, however, some conditions under which an update to the election is needed:
A change in the unicast metric to reach the RP for any of the routers on the link
The interface on which the RP is reachable changes to an interface for which the router was previously the DF
A new PIM neighbor appears on a link
The elected DF dies
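As a rough illustration of an assert-like comparison, the Python sketch below ranks offers by routing-protocol preference first, then by metric, and breaks ties in favour of the higher IP address. That ordering is the usual PIM Assert rule and is assumed here rather than taken from these notes; the tuple layout and the names `offer_key` and `elect_df` are hypothetical.

```python
import ipaddress


def offer_key(offer):
    """Sort key: lower preference, then lower metric, then HIGHER address wins."""
    preference, metric, address = offer
    return (preference, metric, -int(ipaddress.IPv4Address(address)))


def elect_df(offers):
    """Return the best (preference, metric, address) offer on the link."""
    return min(offers, key=offer_key)


offers = [
    (110, 10, "10.0.0.2"),  # e.g. a router advertising metric 10
    (110, 8, "10.0.0.1"),   # e.g. a router advertising metric 8
]
print(elect_df(offers))
```

Since the election is per RP, one such comparison is made per link for each RP, not per group.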


DF Election Messages

Offer: Used to advertise local metrics to reach the RP.
Winner: Used by a DF announcing or re-asserting its status.
Backoff: Used by a DF to acknowledge receipt of a better Offer.
Pass: Used by an acting DF to pass the DF responsibility to a better candidate.

DF Election Messages
The DF election mechanism is based on four control messages exchanged between the routers on the link.

The Offer message is used to advertise a router's unicast metric to reach the RP and is used for comparison with other routers participating in the DF election.

The Winner message allows the winning router to declare to every other router on the link the identity of the winner and the metrics it is using. The message is also used by the DF to reassert its status.

The Backoff message is used by the DF on receipt of an Offer that is better than its own metric. The DF records the received information and responds with a Backoff message. This instructs the offering router to hold off for a short period of time while the unicast routing stabilizes.

The Pass message is used by the acting DF to pass its role to another router offering a better metric. The old DF stops its tasks as soon as the transmission is made.


Initial Election
On RP discovery send Offer with metric to RP.

[Figure: routers N1 to N4 on a shared link below the RP; N2 sends Offer 10.]


Initial Election
When a router learns of a new RP and no DF exists yet, it sends an Offer message. The message contains the router's metric to reach the RP and the router's identity. The Offer message is retransmitted periodically (every Offer-Interval). If the router learns about a better metric from a neighbor, it stops sending Offer messages for a period of three times the Offer-Interval. If no winner has been elected after this period, the router restarts the election. The same happens if an Offer with a worse metric is received. A router takes the role of the DF after sending three Offers without receiving any Offer from any other neighbor.


Initial Election
On RP discovery send Offer with metric to RP.
Neighbors compare with own metric and send Offer only if better.

[Figure: routers N1 to N4 on a shared link below the RP; N2 has sent Offer 10, and N1 decides "We are better" and sends Offer 8.]


Initial Election
When neighbors hear the Offer message they compare the offered metric with their own one. If their metric is worse they back off (remain silent for three times the Offer-Interval) and thus allow the offering router to win. A timer is still running to restart offering in case election fails. If the neighbor that heard the Offer has better metric it actively starts participating in the election by sending its own Offer messages including its metric to the RP and its identity.


Initial Election
On RP discovery send Offer with metric to RP.
Neighbors compare with own metric and send Offer only if better.

[Figure: routers N1 to N4 on a shared link below the RP; N1 ("We are better") has sent Offer 8, and N2, which sent Offer 10, concludes "We lose".]


Initial Election
If the offering router hears an Offer with a better metric it assumes it lost and stops sending Offer messages for the period of three times the Offer-Interval. If after that interval the situation is not yet resolved, the election process will restart.


Initial Election
On RP discovery send Offer with metric to RP.
Neighbors compare with own metric and send Offer only if better.
After repeating 3 uncontested Offers, send a Winner and assume DF role.

[Figure: routers N1 to N4 on a shared link below the RP; N1 concludes "We Win!" and sends Winner 8.]

Initial Election
The router that sent the better Offer three times (and has heard either no better Offer or no Offer at all) assumes the DF role and transmits a Winner message, which declares to every router on the link the identity of the winner and the metric it is using. Routers hearing a Winner message stop participating in the election and record the identity and metrics of the winner.
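The initial election can be traced with a short Python simulation. It is a simplified sketch under stated assumptions: metrics are single numbers where lower is better, no two routers share a metric, timers are reduced to rounds, and only the best challenger's Offer is logged in each round. The function name `initial_election` and the `first` argument (the router that happens to send the first Offer) are hypothetical.

```python
def initial_election(metrics, first, offers_needed=3):
    """Simulate the Offer/Winner exchange on one link.

    metrics -- dict mapping router name -> metric to the RP (lower is better)
    first   -- router that sends the first Offer
    Returns (winner, log of messages seen on the link).
    """
    best = first                      # best offering router heard so far
    log = [f"{best}: Offer {metrics[best]}"]
    streak = 1                        # consecutive uncontested Offers by `best`
    while streak < offers_needed:
        challengers = [r for r in metrics if metrics[r] < metrics[best]]
        if challengers:
            # A neighbor with a better metric answers; the old offerer backs off.
            best = min(challengers, key=metrics.get)
            streak = 1
        else:
            streak += 1
        log.append(f"{best}: Offer {metrics[best]}")
    log.append(f"{best}: Winner {metrics[best]}")
    return best, log


winner, log = initial_election({"N1": 8, "N2": 10, "N3": 12, "N4": 15}, first="N2")
for line in log:
    print(line)
```

With the metrics used on the slides (N1 at 8, N2 at 10; the values for N3 and N4 are invented for the example), N2's Offer is answered by N1, which then sends three uncontested Offers followed by a Winner.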


Initial Election
Winner message informs the other routers who is DF.

[Figure: routers N1 to N4 on a shared link below the RP; N1 ("We Win!") sends Winner 8, and N2, N3 and N4 each record "DF is N1".]

DF Preemption
New candidate sends improved Offer.

[Figure: routers N1 to N4 on a shared link below the RP; N1 is the DF, and the candidate N2 sends Offer 6.]


DF Preemption
Once the DF is elected, the process does not restart as long as there are no changes in metrics, PIM neighbors, DF reachability or interfaces towards the RP. If the unicast metric to an RP changes for a non-DF router to a value that is better than that previously advertised by the DF, the router sends a new Offer. The new Offer includes the improved metric and the candidate's identity.


DF Preemption
New candidate sends improved Offer.
DF responds with Backoff instructing candidate to wait.

[Figure: routers N1 to N4 on a shared link below the RP; the DF answers the candidate's Offer with Backoff 6.]


DF Preemption
Upon receipt of an Offer that is better than its current metric, the DF records the identity and metrics of the offering router and responds with a Backoff message (including the metric of the candidate that just sent the Offer). The offering router will hold off for a period of time (defined in the Backoff message) while the unicast routing stabilises. All routers on the link who have pending offers with metrics equal or worse than those in the backoff message (including the original offering router) will hold further offers for the defined period. If during the period someone else sends a new better Offer, the Backoff message is repeated for the new Offer and the backoff period restarted.


DF Preemption
New candidate sends improved Offer.
DF responds with Backoff instructing candidate to wait.
Before backoff period expires, old DF stops forwarding and sends Pass.

[Figure: routers N1 to N4 on a shared link below the RP; the old DF sends Pass N2 to the candidate.]


DF Preemption
Just before the backoff period expires, the current DF declares the candidate router with the best Offer as the new DF. This is done via a Pass message which includes the IDs and metrics of both the old and new DFs. The current DF stops acting as a DF soon after the Pass is transmitted. The new DF assumes the role of the DF as soon as it receives the Pass message. All other routers on the link record the identity and the metric of the newly elected DF.
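
The handoff described above can be sketched as follows. This is an illustrative model only: the `PassMessage` and `LinkRouter` types are hypothetical, and message delivery is reduced to a single function call.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PassMessage:
    """Hypothetical Pass message: IDs and metrics of the old and new DF."""
    old_df: str
    old_metric: int
    new_df: str
    new_metric: int


@dataclass
class LinkRouter:
    name: str
    acting_df: bool = False
    known_df: Optional[str] = None
    known_df_metric: Optional[int] = None


def apply_pass(routers: Dict[str, LinkRouter], msg: PassMessage) -> None:
    """Apply the effects of a Pass to every router on the link."""
    # The old DF stops acting as DF once the Pass has been transmitted.
    routers[msg.old_df].acting_df = False
    # The new DF takes over as soon as it receives the Pass.
    routers[msg.new_df].acting_df = True
    # Every router on the link records the newly elected DF and its metric.
    for router in routers.values():
        router.known_df = msg.new_df
        router.known_df_metric = msg.new_metric
```

After `apply_pass` runs, exactly one router on the link is acting as DF and all routers agree on its identity.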


DF Preemption
New candidate sends improved Offer.
DF responds with Backoff instructing candidate to wait.
Before backoff period expires, old DF stops forwarding and sends Pass.
On receipt candidate becomes DF.

[Figure: routers N1, N2, N3 and N4 on a shared link towards the RP. Labels: Candidate, DF, Pass N2.]



DF Preemption
New candidate sends improved Offer.
DF responds with Backoff instructing candidate to wait.
Before backoff period expires, old DF stops forwarding and sends Pass.
On receipt candidate becomes DF.
Other routers hear Pass, learn N2 is now DF.

[Figure: routers N1, N2, N3 and N4 on a shared link towards the RP. Labels: DF, Pass N2, "DF is N2" (shown twice).]


PIM Neighbor Startup


Router N1 restarts and has no knowledge of DF.
On RP discovery it sends an Offer.

[Figure: routers N1 (restarted), N2, N3 and N4 on a shared link towards the RP. Labels: DF, Offer 8.]


PIM Neighbor Startup


A router that starts after the DF election has completed, or a router that restarts in the meantime, has no knowledge of the previously elected DF. It starts advertising its metric in Offer messages as soon as it discovers the RP.


PIM Neighbor Startup


Router N1 restarts and has no knowledge of DF.
On RP discovery it sends an Offer.
Acting DF responds with Winner or Backoff depending on metric comparison.

[Figure: routers N1, N2, N3 and N4 on a shared link towards the RP. Labels: DF, Winner 6.]


PIM Neighbor Startup


As soon as the current DF hears the Offer from the PIM neighbor that just (re)started it will respond either with a Winner or with a Backoff message depending on the metric in the Offer message. The rest of the procedure is the same as in every reelection.


DF Loses Path to the RP


Current DF loses only path to the RP.
Stops acting as the DF.
Sends Offer with infinite metric to trigger new DF election.

[Figure: routers N1, N2, N3 and N4 on a shared link towards the RP. Labels: DF, Offer ∞ (infinite metric).]


DF Loses Path to the RP


When the DF's path to the RP switches so that it runs through the very link for which it is the DF, the router can no longer provide forwarding services for that link. Recall that the DF forwards traffic (from a source) received on an interface towards the RP, but never back out of the interface on which the traffic was received. In this case the DF therefore immediately stops being the DF and restarts the election by sending an Offer with an infinite metric. If no better Offer is received, the infinite Offer is repeated periodically.
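
The check described above can be sketched as a single function. The function name, the interface names and the use of `math.inf` to stand in for the protocol's infinite metric are all assumptions made for illustration; treating a completely unreachable RP (`None`) the same way is also an assumption.

```python
import math
from typing import Optional, Tuple

INFINITE_METRIC = math.inf  # stand-in for the protocol's "infinite" metric


def check_rpf_change(
    df_interface: str,
    rpf_interface: Optional[str],
    is_df: bool,
) -> Tuple[bool, Optional[tuple]]:
    """Re-evaluate the DF role on df_interface after a unicast route change.

    rpf_interface is the interface now used to reach the RP, or None if the
    RP is unreachable. Returns (still_df, message_to_send).
    """
    if is_df and (rpf_interface is None or rpf_interface == df_interface):
        # The path to the RP now leads back through the DF's own link (or is
        # gone): give up the role and restart the election.
        return False, ("Offer", INFINITE_METRIC)
    return is_df, None
```

A DF on `Ethernet0` whose RPF interface becomes `Ethernet0` gives up the role and sends the infinite Offer; if the RPF interface moves to some other interface, nothing changes.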


DF Loses Path to the RP


Current DF loses only path to the RP.
Stops acting as the DF.
Sends Offer with infinite metric to trigger new DF election.
Other candidates respond with real Offers and eventually best candidate takes over with a Winner message.

[Figure: routers N1, N2, N3 and N4 on a shared link towards the RP. Labels: DF, Winner 8.]


DF Loses Path to the RP


After the router acting as DF loses its path to the RP, so that its RPF interface becomes the same interface for which it was the DF, the procedure is similar to the standard DF election. Routers that hear the infinite Offer respond with their own Offers, and the router with the best Offer takes over as the new DF by sending a Winner message.


DF Failures

Detecting DF Failures
Downstream routers: RP RPF info no longer points to DF.
Non-downstream routers: PIM neighbor timeout of DF.

Router response to a DF Failure
Routers resend their Offer messages, which triggers a new DF election.


DF Failures
The speed at which a new DF is elected after the original DF fails depends on whether there are any downstream routers on the link. Downstream routers see their RPF neighbor (which is also the DF) change, and they initiate the reelection by sending Offer messages. If they now reach the RP through the link via another upstream router, they use an infinite metric in those Offers. If there are no downstream routers, the only way for the other upstream routers to detect the DF failure is the timeout of the PIM neighbor information, which takes longer.


Other Metric Changes


DF metric changes:
Better metric: May send Winner message with new metric. Updates other routers.
Worse metric: Sends 3 Winner messages with new metric. Other routers can respond with a better Offer.

Non-DF metric changes:
Better metric than DF: Send new Offer to trigger DF re-election.
Worse metric than DF: No action is taken.

Other Metric Changes


There are other situations in which the metric to the RP changes. When the metric of a non-DF router changes to a value that is still worse than that of the current DF, no action is taken. The metric of the current DF can change as well. If it becomes worse than before (assuming the DF still has a path to the RP), the DF sends a set of 3 randomly spaced Winner messages with the new metric. Routers that receive these messages and have a better metric may respond with an Offer message, which triggers the same procedure that is followed when a non-DF metric becomes better than the current DF metric. All routers assume the DF has not changed until they see a Pass or Winner message indicating the change. If the routing metric at the DF changes to a better value, a single Winner message is sent advertising the new metric.
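
The four metric-change cases can be summarised as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, a lower value is assumed to be a better metric, the message types follow the "Other Metric Changes" slide, and the random spacing of the three messages is not modelled.

```python
from typing import List, Tuple

WINNER_REPEATS = 3  # number of Winner messages sent when the DF's metric worsens


def react_to_metric_change(
    is_df: bool, new_metric: int, df_metric: int
) -> List[Tuple[str, int]]:
    """Return the messages a router sends after its metric to the RP changes.

    df_metric is the metric last advertised by the DF; for the DF itself
    this is its own previous metric. Assumption: lower is better.
    """
    if is_df:
        if new_metric < df_metric:
            # Better metric at the DF: one Winner updates the other routers.
            return [("Winner", new_metric)]
        if new_metric > df_metric:
            # Worse metric at the DF: repeated Winners let routers with a
            # better metric respond with an Offer.
            return [("Winner", new_metric)] * WINNER_REPEATS
        return []
    if new_metric < df_metric:
        # Non-DF router now beats the DF: trigger a re-election.
        return [("Offer", new_metric)]
    # Non-DF router is still worse than the DF: no action.
    return []
```

For instance, a non-DF router whose metric improves from 9 to 5 while the DF advertises 6 sends a single Offer, whereas an improvement to 7 produces no message at all.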


Additional Robustness

DF re-announces itself by sending a Winner message when new PIM neighbors are discovered.
Periodic Winner messages can be sent for RPs with active groups.


Additional Robustness
For additional robustness in the DF election, the current DF re-announces a Winner message whenever it discovers a new PIM neighbor. The proposal also allows the DF to send periodic Winner messages for RPs that serve currently active groups.


DF Advantages
DF election enforces a single forwarder for traffic in both directions between a link and the RP.
DF is responsible for originating Joins for local receivers, thus eliminating loops that were previously possible due to DR placement.
Customized unicast routes in downstream routers do not affect the choice of the forwarding router. This eliminates loops due to misconfiguration.

DF Advantages
In PIM bidirectional mode all forwarding on a link is centered around the Designated Forwarder, which makes PIM SM multicast networks highly robust and eliminates possible loops. All multicast traffic from the link towards the RP, and in the opposite direction, passes through the DF. Since the role of the Designated Router (DR) is handed to the Designated Forwarder (DF), the placement of the DR is no longer an issue. All (*,G) Joins are originated (forwarded) via the DF, which again eliminates the possibility of forwarding loops. Even if downstream routers on the link use customized unicast routes, the DF election ensures that all of those routers know who the DF is and forward their (*,G) Joins via it. This again eliminates the multicast forwarding loops that were possible in regular PIM SM due to misconfiguration.


Configuring Bidir PIM (BSR Example)

Define Candidate RP and groups / modes it is willing to serve

ip pim rp-candidate Loopback0 group-list 45 bidir
ip pim rp-candidate Loopback1 group-list 46
! Two loopbacks needed due to the nature of ACLs (permit, deny)
ip pim bsr-candidate Loopback2 4
access-list 45 permit 224.0.0.0 0.255.255.255
access-list 45 permit 227.0.0.0 0.255.255.255
! Those two groups will be PIM SM bidirectional
access-list 45 deny 225.0.0.0 0.255.255.255
! This group will be PIM DM
access-list 46 permit 226.0.0.0 0.255.255.255
! This group will be PIM SM


Configuring Bidir PIM (BSR Example)

A bidirectional PIM capable router can run in bidirectional mode, sparse mode, dense mode or any combination of them. If a router is configured for bidirectional mode but does not learn of a bidirectional capable RP, it operates in sparse mode. If a bidirectional capable router learns of a bidirectional RP, the group range advertised by that RP operates in bidirectional mode. If the RP advertises any groups with a negative prefix, those groups operate in dense mode.

By default a bidirectional RP advertises all groups as bidirectional. An access list on the RP can be used to specify the groups to be advertised as bidirectional. Groups matching a "deny" clause operate in dense mode. A different (non-bidirectional) RP address needs to be specified for groups that need to operate in sparse mode, because each access-list entry offers only a "permit" or a "deny" clause.

The example shows how to configure a bidirectional RP to run all 3 modes. 224/8 and 227/8 are bidirectional groups, 225/8 is dense mode and 226/8 is sparse mode. Both the bidirectional RP and the sparse mode RP are configured on one router using two different loopback interfaces.
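
To make the permit/deny logic of the example concrete, the sketch below maps a group address to the mode it would end up in under the two access lists. It is an illustration only: the list encoding and function names are hypothetical, the wildcard mask 0.255.255.255 is written as a /8 prefix, and groups not matched by either list are simply reported as `None` because the example does not cover them.

```python
import ipaddress
from typing import List, Optional, Tuple

Acl = List[Tuple[str, str]]  # (action, prefix) entries, evaluated in order

# Hypothetical encoding of access-list 45 (bidir RP) and 46 (sparse mode RP).
ACL_45: Acl = [("permit", "224.0.0.0/8"), ("permit", "227.0.0.0/8"),
               ("deny", "225.0.0.0/8")]
ACL_46: Acl = [("permit", "226.0.0.0/8")]


def match(acl: Acl, group: str) -> Optional[str]:
    """Return the action of the first entry matching the group, if any."""
    addr = ipaddress.ip_address(group)
    for action, prefix in acl:
        if addr in ipaddress.ip_network(prefix):
            return action
    return None


def group_mode(group: str) -> Optional[str]:
    """Mode for a group: 'bidir', 'sparse', 'dense', or None if unmatched."""
    for acl, permitted_mode in ((ACL_45, "bidir"), (ACL_46, "sparse")):
        action = match(acl, group)
        if action == "permit":
            return permitted_mode
        if action == "deny":
            # Groups advertised with a negative prefix run in dense mode.
            return "dense"
    return None
```

With these lists, `group_mode("224.1.1.1")` gives `"bidir"`, `group_mode("225.1.2.3")` gives `"dense"` and `group_mode("226.9.9.9")` gives `"sparse"`, matching the description of the example.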


Bidir PIM Summary

Drastically reduces network mroute state.
Eliminates ALL (S,G) state in the network.
SPTs between sources and RP eliminated.
Source traffic flows both up and down Shared Tree.

Allows Many-to-Any applications to scale.
Permits virtually an unlimited number of sources.
