networking n stuff

Thursday, August 31, 2017

The Curious Case Of OSPF NSSA LSA on GRE Tunnel

Today I ran into a case that is really easy once you know exactly what to look for, but seems puzzling at first glance.

This is what our topology looks like:



We have two routers connecting two locations, with OSPF configured on each. The primary connection is a GRE tunnel, while the backup is a direct link (say, dark fiber or an L2VPN between the locations). OSPF adjacencies are established over both links.

The routers exchange routes and everything works just fine... until the primary connection goes down. Then some of the routes R2 sends to R1 show up on R1 via the backup link, while others are still seen via the tunnel interface, even though the OSPF neighbor on that link has already been declared dead.

Sounds intriguing?




Here's what I haven't told you so far: both R1 and R2 sit in an NSSA (area 1). Let's take a closer look at the configuration:

R1:

interface Loopback0
 ip address 1.1.1.1 255.255.255.255
 ip ospf 1 area 1
!         
interface Tunnel0
 ip address 192.168.0.1 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 1
 tunnel source 10.0.100.1
 tunnel destination 10.0.200.2
!
interface Ethernet0/0
 ip address 10.0.100.1 255.255.255.0
!
interface Ethernet0/1
 ip address 172.16.0.1 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 200
!
interface Ethernet0/2
 no ip address
 shutdown
!
interface Ethernet0/3
 no ip address
 shutdown
!         
router ospf 1
 area 1 nssa
!
ip route 10.0.200.0 255.255.255.0 10.0.100.2

R2:

interface Loopback1
 ip address 3.3.3.3 255.255.255.255
!
interface Tunnel0
 ip address 192.168.0.2 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 1
 tunnel source 10.0.200.2
 tunnel destination 10.0.100.1
!
interface Ethernet0/0
 ip address 10.0.200.2 255.255.255.0
!
interface Ethernet0/1
 ip address 172.16.0.2 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 100
!
interface Ethernet0/2
 ip address 192.168.100.2 255.255.255.0
!
interface Ethernet0/3
 ip address 192.168.200.2 255.255.255.0
 ip ospf 1 area 1
!
router ospf 1
 area 1 nssa
 redistribute connected subnets route-map REDIS_LOOP
 passive-interface Ethernet0/2
!
ip route 10.0.100.0 255.255.255.0 10.0.200.1
!
!
route-map REDIS_LOOP permit 1
 match ip address 1
!
!
access-list 1 permit 192.168.100.0
!
access-list 1 deny   any

Let's check that our neighbor relations are established and see what R1's routing table contains while the primary connection is still up:

R1#sh ip os nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.16.0.2        0   FULL/  -        00:00:32    192.168.0.2     Tunnel0
172.16.0.2        1   FULL/BDR        00:00:33    172.16.0.2      Ethernet0/1

R1#sh ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override

Gateway of last resort is not set

O N2  192.168.100.0/24 [110/20] via 192.168.0.2, 00:07:05, Tunnel0
O     192.168.200.0/24 [110/11] via 192.168.0.2, 00:05:32, Tunnel0

Seems legit. Now let's break something in the cloud so the primary link fails and see how R1 reacts:

R1#sh ip os nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.16.0.2        1   FULL/BDR        00:00:38    172.16.0.2      Ethernet0/1
R1#sh ip route os
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override

Gateway of last resort is not set

O N2  192.168.100.0/24 [110/20] via 192.168.0.2, 00:01:10, Tunnel0
O     192.168.200.0/24 [110/210] via 172.16.0.2, 00:00:15, Ethernet0/1


Well, ain't it cool, really? Now one of our routes is available via the redundant path, but the other one is still seen via the tunnel interface. You have probably already noticed that the route still pointing at the tunnel is the NSSA external route. And that's exactly where the problem lies. Let's take a closer look:

R1#sh ip os data nssa

            OSPF Router with ID (1.1.1.1) (Process ID 1)

                Type-7 AS External Link States (Area 1)

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 400
  Options: (No TOS-capability, Type 7/5 translation, DC, Upward)
  LS Type: AS External Link
  Link State ID: 192.168.100.0 (External Network Number )
  Advertising Router: 172.16.0.2
  LS Seq Number: 80000007
  Checksum: 0xF2A5
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        MTID: 0 
        Metric: 20 
        Forward Address: 192.168.0.2
        External Route Tag: 0

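Note the Forward Address field. Yes, that's the IP address of R2's tunnel interface, and the tunnel interface is still considered up, because we have no keepalives configured and the failure was indirect. So R2 keeps advertising 192.168.0.2 as the forwarding address, and R1, whose own Tunnel0 is also still up, keeps resolving that address through the tunnel.

But how is this address chosen? This is what Cisco says about it: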
Forwarding address is selected on ASBR using the following rules:
If there is a loopback configured in the area then IP address of loopback is selected as forwarding address.
If first condition is not met then IP address of first interface on the OSPF interface list is selected as forwarding address. You can see OSPF interface list by using "show ip ospf interface brief" command. The interface on top will be the last interface which was attached to OSPF.

Well, we do not have a loopback configured in the area (R2's Loopback1 is not running OSPF). And Tunnel0 is indeed the top interface:

R2#sh ip os int b
Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
Tu0          1     1               192.168.0.2/24     1     P2P   0/0
Et0/3        1     1               192.168.200.2/24   10    DR    0/0
Et0/1        1     1               172.16.0.2/24      100   BDR   1/1

And how exactly is the "top" interface chosen? Well, it's just... the last interface you enabled OSPF on. So it's a matter of luck.

The workaround is, of course, to simply configure OSPF on a loopback interface: it will then always be chosen as the forwarding address. You can also configure keepalives on the tunnel interface, so that a new Forward Address is chosen as soon as the tunnel interface fails. However, I encountered this problem on a non-Cisco device which does not support keepalives and considers the tunnel to be always up (even when there is a direct physical failure).
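
Here is a rough sketch of both workarounds on R2. It assumes we reuse R2's existing Loopback1 as the loopback placed into area 1, and the keepalive timers (5-second period, 3 retries) are example values of mine, not something taken from the lab above:

```
! Workaround 1: run OSPF on a loopback in area 1 so it is picked as the forwarding address
! (assumption: reusing the existing Loopback1; this also advertises 3.3.3.3/32 into area 1)
interface Loopback1
 ip ospf 1 area 1
!
! Workaround 2: GRE keepalives, so Tunnel0 goes down when the far end stops answering
! (assumption: 5-second period and 3 retries are illustrative values only)
interface Tunnel0
 keepalive 5 3
```

With the first change, the Forward Address shown by sh ip os data nssa on R1 should become the loopback address; with the second, it should move off 192.168.0.2 once the keepalives time out and Tunnel0 goes down.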
