Val:~$ whoami

I am Val Glinskiy, network engineer specializing in data center networks. TIME magazine selected me as Person of the Year in 2006.

Friday, September 28, 2012

Nexus: peer-switch and STP Bridge ID

With the release of NX-OS 5.2, Cisco started supporting the peer-switch feature on the Nexus 5K. When peer-switch is enabled, both the vPC primary and secondary switches originate STP BPDUs on vPC ports and use the same designated bridge ID on those ports. This got me wondering what bridge ID the vPC primary switch uses when peer-switch is not enabled. I set up a vPC switch pair with a downstream switch connected via a vPC port-channel. The switches are running MST. Here is a partial BPDU captured on the downstream Nexus switch with the command:
ethanalyzer local interface inbound-hi display-filter "stp" limit-captured-frames 20

Spanning Tree Protocol
    Protocol Identifier: Spanning Tree Protocol (0x0000)
    Protocol Version Identifier: Multiple Spanning Tree (3)
    BPDU Type: Rapid/Multiple Spanning Tree (0x02)
    BPDU flags: 0x7c (Agreement, Forwarding, Learning, Port Role: Designated)
    Root Identifier: 8192 / 0 / 54:7f:ee:01:15:81
    Root Path Cost: 0
    Bridge Identifier: 8192 / 0 / 54:7f:ee:01:15:81
    Port identifier: 0x9063
    Message Age: 0
    Max Age: 20
    Hello Time: 2
    Forward Delay: 15
    Version 1 Length: 0
    Version 3 Length: 96
    MST Extension
        MST Config ID format selector: 0
        MST Config name: blp-mst-Region-1
        MST Config revision: 2
        MST Config digest: d7e7e4984e26acd301b955c5289031ad
        CIST Internal Root Path Cost: 0
        CIST Bridge Identifier: 8192 / 0 / 00:23:04:ee:be:01
            CIST Bridge Priority: 8192
            CIST Bridge Identifier System ID Extension: 0
            CIST Bridge Identifier System ID: 00:23:04:ee:be:01
        CIST Remaining hops: 20
        MSTID 1, Regional Root Identifier 8192 / 54:7f:ee:01:15:81
        MSTID 2, Regional Root Identifier 8192 / 54:7f:ee:01:15:81
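
The "BPDU flags: 0x7c" line in the capture can be decoded by hand. Below is a minimal Python sketch, assuming the standard RST/MST BPDU flag bit layout; the function and dictionary names are my own and are not part of any Cisco tool.

```python
# Flag bit layout of RST/MST BPDUs (assumed: standard IEEE layout).
# Bits 2-3 carry the port role; the other bits are single flags.
PORT_ROLES = {0: "Unknown", 1: "Alternate/Backup", 2: "Root", 3: "Designated"}


def decode_bpdu_flags(flags: int) -> dict:
    """Split the one-byte BPDU flags field into its named bits."""
    return {
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "topology_change_ack": bool(flags & 0x80),
    }


decoded = decode_bpdu_flags(0x7C)
for name, value in decoded.items():
    print(f"{name}: {value}")
```

For 0x7c this yields Agreement, Forwarding, Learning and port role Designated, the same set the capture lists.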

Note "Bridge Identifier" and "CIST Bridge Identifier": they are different. The former uses the "vPC local system-mac" and the latter uses the "vPC system-mac". Both can be found in the "show vpc role" output:

nexus-primary# show vpc role

vPC Role status
----------------------------------------------------
vPC role                        : primary                      
Dual Active Detection Status    : 0
vPC system-mac                  : 00:23:04:ee:be:01            
vPC system-priority             : 32667
vPC local system-mac            : 54:7f:ee:01:15:81            
vPC local role-priority         : 8192

Here we can see that, without peer-switch enabled, the Nexus switch uses two different bridge IDs in the same BPDU. Why does it do that? I have reached out to Cisco and will update this post when I hear anything.
When peer-switch is enabled, both the vPC primary and secondary switches originate BPDUs on vPC ports, and "Bridge Identifier" and "CIST Bridge Identifier" are the same: both equal the "vPC system-mac".
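
To make the comparison concrete, here is a small Python sketch that splits an 8-byte bridge ID into the three parts ethanalyzer prints (priority / system ID extension / MAC). The hex strings are my reconstruction from the decoded fields above, not raw bytes taken from the capture, and parse_bridge_id is a hypothetical helper name.

```python
def parse_bridge_id(raw: bytes) -> dict:
    """Split an 8-byte bridge ID: 4-bit priority, 12-bit system ID extension, 6-byte MAC."""
    if len(raw) != 8:
        raise ValueError("a bridge ID is exactly 8 bytes")
    prio_and_ext = int.from_bytes(raw[:2], "big")
    return {
        "priority": prio_and_ext & 0xF000,    # top 4 bits, in steps of 4096
        "sys_id_ext": prio_and_ext & 0x0FFF,  # low 12 bits
        "mac": ":".join(f"{b:02x}" for b in raw[2:]),
    }


# Assumed encodings of the two identifiers shown in the BPDU above.
bridge = parse_bridge_id(bytes.fromhex("2000547fee011581"))
cist_bridge = parse_bridge_id(bytes.fromhex("2000002304eebe01"))

print(bridge)
print(cist_bridge)
print("same bridge ID:", bridge == cist_bridge)
```

Both IDs share priority 8192 and extension 0; only the MAC differs, which is why the two identifiers are easy to miss in a long decode.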

Tuesday, February 21, 2012

IPExpert, it's 2012.

After reading the excellent sample chapter from IPExpert's "IPv4/IPv6 Multicast Operation and Troubleshooting" book, I decided to pre-order it. Today I got the PDF file and was very disappointed. The content is still great, but it can only be read on a PC or Mac: no iPad, Kindle or smartphone. My commute is relatively long and I do most of my reading on the bus, so not being able to read documentation on a mobile device is a major problem for me. If greedy Hollywood studios found a way to provide their content on mobile platforms, so should IPExpert, especially given that INE provides its PDF files DRM-free. This was my first and last purchase of an IPExpert product.
Update: FileOpen released an iPad/iPhone app, so now I can read that PDF on my iPad.

Saturday, February 04, 2012

ip accounting-list

Cisco has the interface-level command "ip accounting", which records the number of bytes and packets passing through the router. If you want to count traffic only for a specific IP address, you need to use the "ip accounting-list" command. There seems to be a tiny bug in the context help in 12.4(15)T14:

R2(config)#ip accounting-list 1.1.1.1 ?
  A.B.C.D  IP address mask

Note that the context help says you need to enter an address mask.
R2(config)#ip accounting-list 1.1.1.1 255.255.255.255

After checking
R2(config)#do sho run | i accounting
ip accounting-list 0.0.0.0 255.255.255.255

That's not what I entered. In reality, you are supposed to enter a wildcard mask, which makes more sense. Specifying an ACL number or name would have made even more sense. So, let's fix it:

R2(config)#no ip accounting-list 0.0.0.0 255.255.255.255
R2(config)#ip accounting-list 1.1.1.1 0.0.0.0
R2(config)#interface fa 0/0
R2(config-if)#ip accounting

Now let's test it. Traffic from R3 to 192.168.12.1 and 1.1.1.1 must pass through R2's Fa0/0 interface:

R3#ping 192.168.12.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.12.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/58/120 ms
R3#ping 1.1.1.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/47/88 ms

R2#sho ip account
   Source           Destination              Packets               Bytes
 192.168.23.3     1.1.1.1                          5                 500

As you can see from the output above, only pings to 1.1.1.1 got counted.
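
The way the first attempt turned into "0.0.0.0 255.255.255.255" is consistent with ordinary wildcard-mask handling: every bit set in the wildcard is "don't care", so those bits are cleared in the stored address. Below is a minimal Python sketch of that model. It is my assumption about the behaviour, not IOS source code, and normalize and matches are hypothetical names.

```python
import ipaddress


def _to_int(dotted: str) -> int:
    return int(ipaddress.IPv4Address(dotted))


def normalize(address: str, wildcard: str) -> str:
    """Clear every address bit that the wildcard mask marks as don't-care."""
    care = ~_to_int(wildcard) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(_to_int(address) & care))


def matches(ip: str, address: str, wildcard: str) -> bool:
    """True if ip equals address on all bits the wildcard does not mask out."""
    care = ~_to_int(wildcard) & 0xFFFFFFFF
    return (_to_int(ip) & care) == (_to_int(address) & care)


# First attempt: a subnet-style mask entered where a wildcard was expected.
print(normalize("1.1.1.1", "255.255.255.255"))
# Corrected entry: all 32 bits must match.
print(normalize("1.1.1.1", "0.0.0.0"))
print(matches("192.168.12.1", "1.1.1.1", "0.0.0.0"))
print(matches("1.1.1.1", "1.1.1.1", "0.0.0.0"))
```

Under this model the first entry matches every address, while the corrected one matches only 1.1.1.1, which agrees with the accounting output above.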

Saturday, January 14, 2012

IGP: administrative distance per prefix

A routing protocol's administrative distance (AD) determines which protocol's route is placed in the RIB when several protocols offer the same prefix: lower is better. The AD can be changed with the "distance" command on Cisco routers. The full syntax is:

distance distance-value ip-address wildcard-mask [ip-standard-acl | ip-extended-acl | access-list-name]

The access-list option implies that the AD can be changed per IP subnet. Let's see how it works in RIPv2, EIGRP and OSPF.

I have a very simple topology here:

R1-------------------R2

1. RIPv2
Router R1 advertises two networks into RIP, which we can see on R2:

R2#show ip route rip
R    192.168.200.0/24 [120/1] via 192.168.12.1, 00:00:11, FastEthernet0/0
R    192.168.100.0/24 [120/1] via 192.168.12.1, 00:00:11, FastEthernet0/0


Both routes have an administrative distance of 120, the default for RIP. Let's change the AD for 192.168.100.0/24:

R2#conf t
R2(config)#access-list 10 permit 192.168.100.0 0.0.0.255
R2(config)#router rip
R2(config-router)#distance 150 192.168.12.1 0.0.0.0 10
R2(config-router)#end

Now we'll give it some time, since RIP is a notoriously slow protocol to converge, and check:

R2#show ip route rip
R    192.168.200.0/24 [120/1] via 192.168.12.1, 00:00:02, FastEthernet0/0
R    192.168.100.0/24 [150/1] via 192.168.12.1, 00:00:02, FastEthernet0/0

As you can see, 192.168.100.0/24 now has administrative distance 150
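
The result above fits a simple two-step model of the "distance" command: first match the route's source against the address and wildcard mask, then match the prefix against the ACL. Here is a minimal Python sketch of that model. It is my simplification (permit-only standard ACLs, first matching rule wins), and DistanceRule and effective_ad are hypothetical names.

```python
import ipaddress
from dataclasses import dataclass, field


def wildcard_match(ip: str, base: str, wildcard: str) -> bool:
    """Compare ip and base on the bits the wildcard mask does not mask out."""
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (int(ipaddress.IPv4Address(ip)) & care) == (
        int(ipaddress.IPv4Address(base)) & care
    )


@dataclass
class DistanceRule:
    distance: int
    source: str
    wildcard: str
    # Permit entries of a standard ACL as (network, wildcard) pairs.
    acl: list = field(default_factory=list)


def effective_ad(default_ad: int, rules: list, route_source: str, network: str) -> int:
    """Return the AD of the first rule matching both source and prefix, else the default."""
    for rule in rules:
        if not wildcard_match(route_source, rule.source, rule.wildcard):
            continue
        if any(wildcard_match(network, net, wc) for net, wc in rule.acl):
            return rule.distance
    return default_ad


# distance 150 192.168.12.1 0.0.0.0 10, with access-list 10 as configured above.
rule = DistanceRule(150, "192.168.12.1", "0.0.0.0", [("192.168.100.0", "0.0.0.255")])
print(effective_ad(120, [rule], "192.168.12.1", "192.168.100.0"))
print(effective_ad(120, [rule], "192.168.12.1", "192.168.200.0"))
```

With the RIP default of 120, the sketch gives 150 for 192.168.100.0 and 120 for 192.168.200.0, matching the routing table on R2.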

2. EIGRP
Now I configure EIGRP between my two routers

R2#show ip route eigrp
D    192.168.200.0/24 [90/156160] via 192.168.12.1, 00:00:13, FastEthernet0/0
D    192.168.100.0/24 [90/156160] via 192.168.12.1, 00:00:13, FastEthernet0/0

And repeat:

R2(config)#router eigrp 1
R2(config-router)#distance 150 192.168.12.1 0.0.0.0 10
R2(config-router)#end


Unlike RIP, EIGRP converges almost instantly:
R2#show ip route eigrp
D    192.168.200.0/24 [90/156160] via 192.168.12.1, 00:00:02, FastEthernet0/0
D    192.168.100.0/24 [150/156160] via 192.168.12.1, 00:00:02, FastEthernet0/0

3. OSPF
R2#show ip route ospf
O    192.168.200.0/24 [110/2] via 192.168.12.1, 00:00:17, FastEthernet0/0
O    192.168.100.0/24 [110/2] via 192.168.12.1, 00:00:17, FastEthernet0/0

In the case of OSPF, the IP address in the distance command should be the router-id of the OSPF neighbor from which the route is learned.

R2#conf t
R2(config)#router ospf 1
R2(config-router)#distance 150 1.1.1.1 0.0.0.0 10
R2(config-router)#end

R2#sho ip route ospf
O    192.168.200.0/24 [110/2] via 192.168.12.1, 00:02:55, FastEthernet0/0
O    192.168.100.0/24 [150/2] via 192.168.12.1, 00:02:55, FastEthernet0/0

Once again, the AD has changed to 150 for 192.168.100.0/24.

Let's consider a more complex OSPF scenario:


R2 and R3 advertise 192.168.100.0/24 and 192.168.200.0/24 to R4.
R4#sho ip route ospf | begin 192.168.200.0
O    192.168.200.0/24 [110/2] via 192.168.34.3, 00:00:10, FastEthernet0/0
                                [110/2] via 192.168.24.2, 00:00:10, FastEthernet0/1
O    192.168.100.0/24 [110/2] via 192.168.34.3, 00:00:10, FastEthernet0/0
                                [110/2] via 192.168.24.2, 00:00:10, FastEthernet0/1

Both paths are equal and R4 will use both of them by default. Now, for some hard-to-explain reason, we want to use R3 as our primary path to 192.168.100.0/24. It should be easy: all we need to do is apply our access-list 10 from above to the routes we receive from R2 (OSPF router-id 2.2.2.2):
R4#conf t
R4(config)#router ospf 1
R4(config-router)#distance 150 2.2.2.2 0.0.0.0 10
R4(config-router)#end

We cannot use the "ip ospf cost" command, since it affects all routes coming via that interface. Routing check:

R4#sho ip route ospf | begin 192.168.100.0
O    192.168.100.0/24 [150/2] via 192.168.34.3, 00:15:07, FastEthernet0/0
                                [150/2] via 192.168.24.2, 00:15:07, FastEthernet0/1

Hmm, 192.168.100.0/24 now has an AD of 150 for both next hops, not just for the one via R2. What happened? After doing a lot of digging I found this post from Mike Timm. Cisco bug CSCeh44993 prevents modifying the administrative distance per route per neighbor in OSPF. Alas, Cisco decided not to fix it and to call it a feature instead.

Wednesday, January 04, 2012

IPexpert puzzle

IPexpert posted an interesting puzzle today. Here is my solution:


R2:
router ospf 1
 router-id 192.168.0.2
 log-adjacency-changes
 network 192.168.0.0 0.0.255.255 area 0
 default-information originate
R5:
router bgp 5
 no synchronization
 bgp router-id 192.168.0.5
 bgp log-neighbor-changes
 redistribute ospf 1
 neighbor 172.16.45.4 remote-as 4
 neighbor 172.16.45.4 default-originate route-map DEFAULT
 no auto-summary
!
ip prefix-list DEFAULT seq 5 permit 0.0.0.0/0
!
route-map DEFAULT permit 10
 match ip address prefix-list DEFAULT
Now let's head to R4 and check BGP routes:
R4#sho ip route bgp
B    192.168.25.0/24 [20/0] via 172.16.45.5, 00:55:28
     192.168.0.0/32 is subnetted, 2 subnets
B       192.168.0.2 [20/2] via 172.16.45.5, 00:55:28
B       192.168.0.5 [20/0] via 172.16.45.5, 00:55:28
B*   0.0.0.0/0 [20/0] via 172.16.45.5, 00:37:36
I am still trying to find out why OSPF would not redistribute a static default route. Likewise, BGP will not redistribute a default route even if it is in the source protocol's routing table. It must be a loop prevention mechanism, but I cannot come up with a scenario in which redistributing a default route, as opposed to originating it, can cause a routing loop. Especially in OSPF, where "default-information originate" creates a Type 5 LSA, the same type the "redistribute" command would have created:
R2#sho ip ospf database | begin Type-5
                Type-5 AS External Link States
Link ID         ADV Router      Age         Seq#       Checksum Tag
0.0.0.0         192.168.0.2     391         0x80000003 0x001F26 1
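
One detail of R5's configuration is worth spelling out: the prefix-list DEFAULT has no "ge" or "le" keyword, so it matches only the default route itself and not every prefix that falls inside 0.0.0.0/0. Below is a minimal Python sketch of how a single prefix-list entry is matched. The function name prefix_list_permits is hypothetical, and the sketch models one permit entry only.

```python
import ipaddress
from typing import Optional


def prefix_list_permits(
    entry: str, candidate: str, ge: Optional[int] = None, le: Optional[int] = None
) -> bool:
    """Model one 'ip prefix-list ... permit <entry> [ge N] [le M]' line."""
    entry_net = ipaddress.ip_network(entry)
    cand_net = ipaddress.ip_network(candidate)
    # The candidate must agree with the entry on the entry's leading bits.
    if not cand_net.subnet_of(entry_net):
        return False
    # Without ge/le the prefix length must match exactly.
    if ge is None and le is None:
        return cand_net.prefixlen == entry_net.prefixlen
    low = ge if ge is not None else entry_net.prefixlen
    high = le if le is not None else 32
    return low <= cand_net.prefixlen <= high


# ip prefix-list DEFAULT seq 5 permit 0.0.0.0/0
print(prefix_list_permits("0.0.0.0/0", "0.0.0.0/0"))
print(prefix_list_permits("0.0.0.0/0", "192.168.0.2/32"))
# For contrast, a "permit 0.0.0.0/0 le 32" entry would match any IPv4 prefix.
print(prefix_list_permits("0.0.0.0/0", "192.168.0.2/32", le=32))
```

So the route-map DEFAULT attached to "default-originate" is satisfied only by 0.0.0.0/0 itself.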