Professional Documents
Culture Documents
V2.1
BRKIPM-2265
Cisco Public
Housekeeping
BRKIPM-2265
Cisco Public
Agenda
Introduction / Motivation
BRKIPM-2265
Cisco Public
Link and/or Node Failures cause packet loss until routing has converged
Milliseconds?
How often is LoC acceptable?
BRKIPM-2265
Cisco Public
Routing
Protocols
VoIP
call
Signalling
Video
Tunnels
go
down
L1/L2
Transport
Convergence
50 msec
BRKIPM-2265
TCP
Sessio
n dies
500 msec
1 sec
5 secs
30 secs
1 Minute
Cisco Public
Detection
(link or node aliveness, routing updates received)
Update topology database
Compute optimal path
State propagation
(routing updates
sent)
Download to HW FIB
BRKIPM-2265
Cisco Public
pr60-ipcs50-bgp160-ip2ip-remote-nolb
Agilent measurements
700
600
500
400
300
200
100
0
0%
50%
100%
average
90%
0
500
1000
1500
2000
2500
3000
3500
prefix nr
BRKIPM-2265
Cisco Public
10
BRKIPM-2265
Cisco Public
11
Internet-Facing Datacenter/
Hosting Facility using Full
Internet Routing table
Service
Provider A
> Couple
thousand
routes
Service
Provider B
MPLS
Network
BRKIPM-2265
n * 350000 (!!)
routes received
Cisco Public
12
3/8, NH 1.1.1.1
NH <PE2>
PE1
1.1.1.1
3/8, NH <PE1>
NH
NH<PE2>
<PE2>
3.0.0.0/8
1.1.1.5
PE3
PE2
3/8, NH 1.1.1.5
NH <PE1>
BRKIPM-2265
Cisco Public
13
3/8, NH 1.1.1.1
NH <PE2>
PE1
1.1.1.1
3/8, NH <PE1>
NH <PE2>
3.0.0.0/8
1.1.1.5
PE3
PE2
3/8, NH 1.1.1.5
NH <PE1>
BRKIPM-2265
Cisco Public
14
3/8, NH 1.1.1.1
NH <PE2>
PE1
1.1.1.1
3/8, NH <PE1>
NH <PE2>
3.0.0.0/8
1.1.1.5
PE3
PE2
3/8, NH 1.1.1.5
NH <PE1>
BRKIPM-2265
Cisco Public
15
Two out of the three scenarios depend on IGP convergence, including IGP failure
detection
Can react to this in much less than a second
Fast ReRoute techniques can even recover core failures in sub-50 milliseconds
BGP is notified of the change, and needs to identify affected paths by scanning
the BGP table(s)
Scanning implementation can be improved by only scanning affected
table(s) or parts of tables (ex: scoped walks in IOS-XR), but scanning
effort will still grow with table size
Hence: We need to find another approach to achieve predictable fast
BGP convergence!
BRKIPM-2265
Cisco Public
!!
16
BRKIPM-2265
Cisco Public
17
t, Loss Of
Connectivity
no PIC
PIC
2n
3n
4n
5n
6n
PIC is the ability to restore forwarding without resorting to per prefix operations.
Loss Of Connectivity does not increase as my network grows (one problem less to
worry about)
BRKIPM-2265
Cisco Public
19
PE1 (10.0.0.3)
Group of prefixes
110.0.0.0/24
110.1.0.0/24
110.2.0.0/24
...
GigE1/0
BGP Entry
110.0.0.0/24
Path List
BGP Entry
110.1.0.0/24
Path 1
Via 10.0.0.3
BGP Entry
110.2.0.0/24
Path List
IGB Entry
10.0.0.3/32
BRKIPM-2265
Path 1
GigE0/0
Cisco Public
20
BGP Net
110.1.0.0/24
BGP Net
110.5.0.0/24
IGP Net
10.0.0.3/32
OIF
OIF
OIF
OIF
Each BGP FIB entry has its own local Outgoing Interface (OIF) information (ex: Gig0/0)
Cisco Public
21
P1
PE1
PE3
PE2
P2
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
110.1.0.0/24
BGP pathlist
PE1
PE2
BGP nexthop(s)
BGP Net
110.5.0.0/24
PE1 via P1
PE1 via P2
Gig1, dmac=x
Gig2, dmac=y
IGP pathlist
PE2 via P2
IGP nexthop(s)
Output Interface
Pointer Indirection between BGP and IGP entries allow for immediate update of the multipath BGP
pathlist at IGP convergence
Only the parts of FIB actually affected by a change needs to be touched
Used in newer IOS and IOS-XR (all platforms), enables Prefix Independent Convergence
BRKIPM-2265
Cisco Public
22
PIC-acting router
P1
PE1
PE3
PE2
P2
Cisco Public
23
P1
Characterization
BGP PIC Core
PE1
PE3
PE2
P2
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
110.1.0.0/24
BGP Net
110.5.0.0/24
BGP pathlist
PE1 via P1
PE1 via P2
PE1
PE2
IGP pathlist
BGP nexthop(s)
Gig1, dmac=x
Gig2, dmac=y
Core
PE2 via P2
IGP nexthop(s)
100000
Output Interface
PIC
no PIC
1000
100
10
1
1
25
00
0
50
00
0
75
00
10 0
00
0
12 0
50
0
15 0
00
0
17 0
50
0
20 0
00
0
22 0
50
0
25 0
00
0
27 0
50
0
30 0
00
0
32 0
50
0
35 0
00
00
LoC (ms)
10000
Prefix
BRKIPM-2265
Cisco Public
24
P1
PE1
PE1
PE3
PE3
PE2
PE2
P2
P2
Cisco Public
25
P1
Characterization
BGP PIC Edge
PE1
PE3
PE2
P2
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
BGP nexthop(s)
PE2 via P2
Gig2, dmac=y
Output Interface
msec
1000000
250k PIC
10000
250k no PIC
500k PIC
1000
500k no PIC
100
10
500000
450000
400000
350000
1
300000
100000
250000
IGP nexthop(s)
200000
110.5.0.0/24
IGP pathlist
150000
BGP Net
PE1
PE2
Gig1, dmac=x
100000
BGP pathlist
PE1 via P1
PE1 via P2
50000
110.1.0.0/24
Prefix
BRKIPM-2265
Cisco Public
26
PE3
PE2
P2
With the edge device setting next-hop to its loopback (next-hop-self), edge link going
down does not change next-hop as seen on other routers
Failure goes unnoticed by others!
However: Next-hop on edge device with failing link goes away, so this device can react
in PIC-Edge fashion
Traffic re-routed via core to alternate edge
BRKIPM-2265
Cisco Public
27
Key Take-Aways
All BGP prefixes (no matter how many!!) converge as quickly as their next-hop
Generally, IGP is responsible for next-hop convergence BGP convergence depends
on IGP convergence
So? What do I need to do to speed up my BGP convergence with BGP PIC???
BRKIPM-2265
Cisco Public
29
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
110.1.0.0/24
PE1 via X4
PE1 via P2
Ge1, dmac=x
BGP pathlist
PE1
PE2
Ge2, dmac=y
IGP pathlist
BGP nexthop(s)
PE2 via P2
BGP Net
110.5.0.0/24
IGP nexthop(s)
Output Interface
IGP pathlist
BGP Net
110.1.0.0/24
PE1
PE1 via X4
PE1 via P2
Ge1, dmac=x
PE2
Ge2, dmac=y
IGP pathlist
BGP nexthop(s)
PE2 via P2
BGP Net
110.5.0.0/24
IGP nexthop(s)
BRKIPM-2265
Cisco Public
Output Interface
31
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
110.1.0.0/24
PE1 via X4
PE1 via P2
Ge1, dmac=x
BGP pathlist
PE1
PE2
Ge2, dmac=y
IGP pathlist
BGP nexthop(s)
PE2 via P2
BGP Net
110.5.0.0/24
IGP nexthop(s)
Output Interface
Please see past Fast Convergence sessions/techtorials for details, visit us in the Design Clinics or meet using
Meet the Engineer
BRKIPM-2265
Cisco Public
32
Loop!
BRKIPM-2265
Cisco Public
33
BGP Net
110.0.0.0/24
IGP pathlist
BGP Net
110.1.0.0/24
PE1
PE1 via X4
PE1 via P2
Ge1, dmac=x
PE2
Ge2, dmac=y
IGP pathlist
BGP nexthop(s)
PE2 via P2
BGP Net
110.5.0.0/24
IGP nexthop(s)
Output Interface
PE3
PE1
1/8 via PE1
RR
1/8
PE2
PE4
BRKIPM-2265
Cisco Public
34
Update:
NetA, via PE1, LP: 200
PE1 (active)
e0
Update:
NetA, via PE2 LP:
100
CE
s0
PE2 (standby)
NetA, via s0 LP: 100
BRKIPM-2265
Cisco Public
35
Update:
NetA, via PE1, LP: 200
PE1 (active)
e0
CE
s0
Update:
NetA, via PE2 LP:
100
PE2 (standby)
NetA, via s0 LP: 100
via PE1 LP: 200
BRKIPM-2265
Cisco Public
36
PE1 (active)
e0
Withdrawal:
NetA, via PE2 LP:
100
CE
s0
PE2 (standby)
NetA, via s0 LP: 100
via PE1 LP: 200
BRKIPM-2265
Cisco Public
37
Update:
NetA, via PE1, LP: 200
PE1 (active)
PE3
e0
CE
s0
Update:
NetA, via PE2 LP:
100
PE2 (standby)
NetA, via s0 LP: 100
via PE1 LP: 200
router bgp
Cisco Public
38
Scenario1:
iBGP Path Propagation and Reflection
Regular BGP BestPath algorithm leads to an RR only reflecting one path, namely its
best path
Without explicit policy either hot-potato policy prevails (rule 8) or the lowest neighbor
address (rule 13) is used as tiebreaker
What can we do to propagate both paths to PE3/4?
PE3
PE1
1/8 via PE1
RR
1/8
PE4
BRKIPM-2265
PE2
Cisco Public
39
PE1
RD1
RR
1/8
RD2
PE4
BRKIPM-2265
PE2
RD2:1/8 via PE2,
Label L2
2013 Cisco and/or its affiliates. All rights reserved.
Cisco Public
40
PE1
RD1
RR
1/8
RD2
PE4
BRKIPM-2265
PE2
RD2:1/8 via PE2,
Label L2
2013 Cisco and/or its affiliates. All rights reserved.
Cisco Public
41
Solution 2: No RR
Full iBGP mesh
Yes, I am serious , at least for reasonably sized and static
environments
PE3
PE1
1/8
PE4
BRKIPM-2265
PE2
Cisco Public
42
PE3
PE1
1/8 via PE1
RR
1/8
PE4
BRKIPM-2265
PE2
Cisco Public
43
Solution 4: Engineered RR
Some operators place their RRs in the topology to ensure they select different
paths. Their PEs are clients to multiple RRs.
the top RR is closer to PE1 and selects the black path
the bottom RR is closer to PE2 and selects the blue path
Above behaviour can also be achieved using specific BGP policies
PE3
RR
PE1
1/8
RR
PE4
BRKIPM-2265
PE2
Cisco Public
44
Solution 5: AddPath
New Capability to allow a BGP speaker to advertise more than one path (The holy
grail)
Available in IOS-XR 4.0, IOS-XE 3.7, 15.2(4)S, 15.3T
Requires support for this functionality on RR and PEs
PE3
PE1
1/8 via PE1
RR
1/8
PE4
PE2
http://www.cisco.com/en/US/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/asr1000/irg-additional-paths.html
BRKIPM-2265
Cisco Public
45
PE1
1/8 via PE1
RR
1/8
PE4
PE2
http://www.cisco.com/en/US/docs/ios/ios_xe/iproute_bgp/configuration/guide/irg_diverse_path_xe.html
BRKIPM-2265
Cisco Public
46
BRKIPM-2265
Cisco Public
47
PE1 (active)
PE3
e0
RR reflects:
NetA, via PE1, LP: 200
BRKIPM-2265
CE
s0
PE2 (standby)
Update:
NetA, via PE2 LP:
100
2013 Cisco and/or its affiliates. All rights reserved.
48
BGP PIC leverages IGP convergence Make sure IGP converges quickly
IOS-XR: IGP Timers pretty-much tuned by default
IOS: Sample OSPF config:
process-max-time 50
ip routing protocol purge interface
interface
carrier-delay msec 0
negotiation auto
ip ospf network point-to-point
bfd interval 100 min_rx 100 mul 3
router ospf 1
timers throttle spf 50 100 5000
timers throttle lsa all 0 20 1000
timers lsa arrival 20
timers pacing flood 15
passive-interface Loopback 0
bfd all-interfaces
BRKIPM-2265
Cisco Public
50
PIC Core is a pure data plane feature, leveraging a hierarchical forwarding/CEF plane
On IOS devices (C7600, ASR1k), hierarchy needs to be enabled via:
cef table output-chain build favor convergence-speed
All other platforms supporting PIC core (CRS, XR12k ASR9k, NX-OS) do this natively,
there is nothing to configure here
BRKIPM-2265
Cisco Public
51
Two BGP-PIC Edge Flavors: BGP PIC Edge Multipath and Unipath
Multipath: Re-routing router load-balances across multiple next-hops, backup nexthops are actively taking traffic, are active in the routing/forwarding plane,
commonly found in active/active redundancy scenarios.
No configuration, apart from enabling BGP multipath (maximum-paths ... )
Unipath: Backup path(s) are NOT taking traffic, as found in active/standby scenarios
router bgp ...
address-family ipv4 [vrf ...]
or
address-family vpnv4
bgp additional-paths install
Cisco Public
52
route-policy BACKUP
! Currently, only a single backup path is supported
BRKIPM-2265
Cisco Public
53
Backup/Repair Path
BRKIPM-2265
Cisco Public
54
PE1
PE3
PE2
PIC-Edge Enabled
PE3#sh bgp vpnv4 unicast vrf red 110.0.0.11/32
BGP routing table entry for 1:3:110.0.0.11/32, version 43
Paths: (2 available, best #2, table red)
Additional-path-install
Advertised to update-groups:
1
Refresh Epoch 1
110, imported path from 1:1:110.0.0.11/32
10.0.0.2 (metric 21) from 10.0.0.6 (10.0.0.6)
Origin IGP, metric 0, localpref 100, valid, internal, backup/repair
Extended Community: RT:1:100
Originator: 10.0.0.2, Cluster list: 10.0.0.6 , recursive-via-host
mpls labels in/out nolabel/16
Refresh Epoch 1
110, imported path from 1:2:110.0.0.11/32
10.0.0.1 (metric 21) from 10.0.0.6 (10.0.0.6)
Origin IGP, metric 0, localpref 2000, valid, internal, best
Extended Community: RT:1:100
Originator: 10.0.0.1, Cluster list: 10.0.0.6 , recursive-via-host
mpls labels in/out nolabel/17
PE3#
Backup/Repair Path
PE2
Cisco Public
55
PE1
PE3
PE2
List
Cisco Public
56
PE1
PE3
PE2
BRKIPM-2265
Cisco Public
57
PE1
PE3
PE2
BRKIPM-2265
Cisco Public
58
PE1
PE3
PE2
BRKIPM-2265
Cisco Public
59
Next Hops:
10.0.0.3
b: 10.0.0.4
10.0.0.3/32
10.0.0.4/32
10.0.0.0/24
0.0.0.0/0
IP Routing Table
10.0.0.3/32
10.0.0.4/32
10.0.0.0/24
...
Default behaviour is for the FIB to recurse through the longest match prefix
This could include a summary or a default route, which could be a valid path to the
destination
However: In most designs, a /32 next-hop going away next-hop router is gone for
good, there is no point to find a longer match
Similar problem happens on the control plane when tracking the BGP next-hop
BRKIPM-2265
Cisco Public
61
recurse-through-connected flag
IOS-XR
No issue for labeled paths (i.e. vpnv4/6/rfc3107 next-hops), always only recursed via
/32s
New in 4.2: For unlabelled paths, including VRF prefixes: Does not recurse via
default route or via BGP summaries, or when candidate prefix length is less than
configured:
router bgp ...
address-family ...
nexthop resolution prefix-length minimum {32|128}
BRKIPM-2265
Cisco Public
62
p1, L=20
p2, L=61
p3, L=22
p4, L=34
p5, L=51
p6, L=51
p7, L=67
p8, L=67
p10, L=88(a)
p11, L=88(a)
p12, L=88(a)
...
p1
p2
CE1
p3
p4
CE2
p5
p6
CE1
p7
p8
CE2
p10
p11
CE1
p12
CE2
Cisco Public
63
FIB:
PE1
(active)
p1
CE1
p1
PE2
(standby)
Cisco Public
64
FIB:
PE1
(active)
p1
CE1
p1
PE2
(standby)
Cisco Public
65
PE1 allocates
per-vrf label 50
p1 via CE1,
backup: PE2, L=30
p2 via CE1,
backup: PE2, L=30
PE1
(active)
CE1
p1
p2
Cisco Public
66
PE1
(active)
p1 via CE1
backup: PE2, L=30
p2 via CE1
backup: PE2, L=30
Loop!
CE1
p1
p2
PE2
(standby)
BRKIPM-2265
Cisco Public
67
p1 via CE1
backup: PE2, L=30
p2 via CE1
backup: PE2, L=30
PE1
(active)
CE1
p1
p2
PE2
(standby)
BRKIPM-2265
Cisco Public
68
PE1
(active)
p1
p2
CE1
p1
p2
p1 via CE1
p2 via CE1
PE2
(standby)
BRKIPM-2265
Cisco Public
69
b)
Example: In below situation: which label (123 or 456) should PE1 swap for label 50 when PE1CE1 link goes down? PE1 cant be sure!
p1 via CE1, per-CE label=50
alternate: PE2, L=123
p2 via CE1, per-CE label=50
alternate: PE2, L=456
PE1
(per-CE label) L 50: untagged, via <ce1>
backup: swap with <pe2>,????
p1
p2
CE1
p1
70
PE1
(active)
p1
p2
p2
PE2
(standby)
2013 Cisco and/or its affiliates. All rights reserved.
CE1
p1
BRKIPM-2265
CE2
Cisco Public
71
Per-Prefix
Per-CE
Per-VRF
*)
Possible transient
loop
Fully supported
Not supported
BGP Best-External
Yes
Yes
Possible transient
loop
eiBGP Multipath
Yes
Yes
No
ECMP PE-CE
Yes
No
Yes
Cisco Public
72
eBGP
eBGP
No special
PIC considerations
2013 Cisco and/or its affiliates. All rights reserved.
eBGP
Cisco Public
73
p1
PE2
ASBR2
Cisco Public
74
Wrapping Up
Summary
Routing Fast Convergence essential to deliver todays critical applications
IGP conservatively meet sub-second convergence requirements
Timer tuning required in IOS, IOS-XR and NX-OS optimized by default
IGP needs to be kept slim
BGP PIC enables BGP to achieve the same level of convergence, no
matter how many prefixes we deal with
Availability of alternate path needs to be kept in mind whenever designing
BGP services
BRKIPM-2265
Cisco Public
76
ASR1k: XE2.5.0
NX-OS: 5.2
PIC-Edge
7600, 7200: 12.2(33)SRE
ASR1k: XE3.2.0 (v4), 3.3.0(v6)
NX-OS: Radar
BRKIPM-2265
Cisco Public
77
BRKIPM-2265
Cisco Public
78
BRKIPM-2265
Cisco Public
79
Related Sessions
BRKIPM-2265
Cisco Public
80
Call to Action
BRKIPM-2265
Cisco Public
BRKIPM-2265
Cisco Public
82