Val:~$ whoami

I am Val Glinskiy, a network engineer specializing in data center networks. TIME magazine selected me as Person of the Year in 2006.

Tuesday, September 10, 2013

Stop using "ships in the night"

It seems that every month brings news of yet another overlay network: STT, OTV, VXLAN, NVGRE and maybe many more I have not heard of. People often use the words "ships in the night" in the same sentence, referring to the fact that these overlay networks know nothing about the underlying physical network infrastructure. However, nowadays even the smallest seagoing or oceangoing vessel not only knows about any ship nearby, but is also acutely aware of a few satellites, thanks to radar and GPS systems.
So, to keep up with the times and advances in technology (after all, we are in the tech business), I propose a new analogy for overlay and physical networks: polar bear and penguin. Those 2 definitely do not meet in real life. I am not considering corner cases like a zoo.

Wednesday, September 04, 2013

Rant

Any vendor requiring registration in order to access online documentation should be condemned to untangle network cables for the rest of their product life.

Wednesday, August 07, 2013

A very interesting and popular explanation of the Shannon limit: Part 1 and Part 2.

Friday, July 19, 2013

LACP timer and what it means

LACP (IEEE 802.3ad) is a protocol used to bundle several physical interfaces into a single logical channel. It has a timer which defines how often the devices interconnected via this bundle exchange LACP PDUs, or control messages. Currently, this timer can be set to either "rate fast" (1 second) or "rate normal" (30 seconds). What is not always clear is that when you configure "lacp rate" on Cisco or "set interfaces ae1 aggregated-ether-options lacp periodic fast" on Juniper, you do not configure how often this switch will send LACP PDUs. The command means that the switch where it is applied will expect to receive LACP PDUs at this frequency from the partner on the other side of the logical channel.
Here is a quick test. I have a Nexus5500 connected to a Cat6500. Let's configure a port-channel between them with one physical member interface.

Cat6500#show run interface TenGigabitEthernet 1/5
!
interface TenGigabitEthernet1/5
 switchport
 switchport trunk encapsulation dot1q
 switchport mode trunk
 lacp rate fast
 channel-group 5 mode active

Nexus5500# show running-config interface Ethernet 1/1
!
interface Ethernet1/1
  switchport mode trunk
  channel-group 1 mode active

"lacp rate normal" is the default setting on Nexus, so this command does not show up in the output, but we can confirm it:

Nexus5500# show running-config interface Ethernet 1/1 all | include lacp
  lacp port-priority 32768
  lacp rate normal


The Cat6500 is configured with rate fast and the Nexus5500 with rate normal. Let's see what's going on behind the scenes.
On the Catalyst:
Cat6500#show lacp internal
Flags:  S - Device is requesting Slow LACPDUs
           F - Device is requesting Fast LACPDUs
           A - Device is in Active mode       P - Device is in Passive mode    

Channel group 3
                            LACP port     Admin     Oper    Port        Port
Port      Flags   State     Priority      Key       Key     Number      State
Te1/5     FA      bndl      32768         0x3       0x3     0x106       0x3F

The F flag says that the Cat6500 is requesting fast LACP PDUs from its partner.

On Nexus it's a little bit backwards: the "show" command tells you the partner's status, not its own.

Nexus5500# show lacp neighbor interface port-channel 1
Flags:  S - Device is sending Slow LACPDUs F - Device is sending Fast LACPDUs
        A - Device is in Active mode       P - Device is in Passive mode
port-channel1 neighbors
Partner's information
            Partner                Partner                     Partner
Port        System ID              Port Number     Age         Flags
Eth1/1      20,0-13-5f-20-63-80    0x106           910         SA

            LACP Partner           Partner                     Partner
            Port Priority          Oper Key                    Port State
            32768                  0x3                         0x3f

The Nexus5500 says that its partner, the Cat6500, is sending LACP PDUs every 30 seconds.
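
The rule demonstrated by this test can be written down as a small Python sketch. It is only a toy model of the behaviour described in this post, not vendor code; the names PERIODIC_SECONDS and tx_interval are made up for illustration, and the intervals are the 1-second and 30-second values quoted above.

```python
# Toy model of the LACP timer rule: each side transmits at the rate
# its PARTNER asked for, not at the rate configured locally.
PERIODIC_SECONDS = {"fast": 1, "normal": 30}  # values quoted in the post


def tx_interval(partner_rate):
    """Seconds between LACPDUs a device sends, given the partner's 'lacp rate'."""
    return PERIODIC_SECONDS[partner_rate]


# Lab setup from this post: Cat6500 = "fast", Nexus5500 = "normal".
configured = {"Cat6500": "fast", "Nexus5500": "normal"}
partner_of = {"Cat6500": "Nexus5500", "Nexus5500": "Cat6500"}

for device, rate in configured.items():
    sends_every = tx_interval(configured[partner_of[device]])
    print(f"{device}: configured '{rate}', sends an LACPDU every {sends_every}s")
```

In this model the Cat6500, although configured with "rate fast", sends slowly because its partner asked for slow PDUs, which is what the S flag in the Nexus output reports.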



Monday, July 15, 2013

Find MAC address for IPv4 Multicast group

When troubleshooting a multicast problem, I often find myself checking whether IGMP snooping works as intended. "show mac address-table multicast" on Cisco switches shows the MAC addresses of multicast groups. Tired of converting multicast IPs into MACs with pencil and paper, I wrote my first ever script in Python which does just that. It takes the IP address of a multicast group as a parameter. Although I did some testing, there might be bugs, so beware.
See Cisco's white paper for an explanation of how the conversion is done.
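
For readers who just want the mapping, here is a minimal sketch of the conversion. It is not the script mentioned above; the function name multicast_ip_to_mac is made up for illustration. It relies on the standard rule that the low 23 bits of the group address are copied under the 01:00:5e prefix, and it prints the result in the dotted format that Cisco switches display.

```python
import ipaddress


def multicast_ip_to_mac(group):
    """Map an IPv4 multicast group to its MAC address in Cisco dotted format."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Keep the low 23 bits of the IP and place them under the 01:00:5e prefix.
    mac = 0x01005E000000 | (int(ip) & 0x7FFFFF)
    digits = f"{mac:012x}"
    # Group the 12 hex digits as xxxx.xxxx.xxxx
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


print(multicast_ip_to_mac("239.1.1.1"))
```

Because only 23 of the 28 variable bits survive, 32 different groups share each MAC address. For example, 239.1.1.1 and 224.129.1.1 map to the same entry in the MAC table.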
 

Saturday, July 13, 2013

The wait is over

Finally, in NX-OS 6.0(2) for the Nexus 5000 platform, Cisco implemented the "default interface" command, which lets you return an interface to its factory default configuration. It is a very-very-very useful feature in a lab environment, where one has to do a lot of re-configuration and something does not work as expected simply because of left-over configuration from the previous test.
This command has been available in IOS since 11.1 and in NX-OS for Nexus 7K since 5.1(1).

Wednesday, May 29, 2013

Juniper haiku

No wonder Juniper software is so bloated :).
admin@switch> show version and haiku
Hostname: switch
Model: ex4200-48t
JUNOS Base OS boot [10.4R12.1]
JUNOS Base OS Software Suite [10.4R12.1]
JUNOS Kernel Software Suite [10.4R12.1]
JUNOS Crypto Software Suite [10.4R12.1]
JUNOS Online Documentation [10.4R12.1]
JUNOS Enterprise Software Suite [10.4R12.1]
JUNOS Packet Forwarding Engine Enterprise Software Suite [10.4R12.1]
JUNOS Routing Software Suite [10.4R12.1]
JUNOS Web Management [10.4R12.1]


        Now that Zion's safe,
        can they find a better place
        to get some sweaters?

admin@switch> show version and haiku

        Juniper babies
        The next generation starts
        Gotta get more sleep

admin@switch> show version and haiku

        3am; darkness;
        Maintenance window closing.
        Safety net: rollback.


admin@switch> show version and haiku

        Shiny leather pants
        Why don't they squeak when they kick?
        Is _that_ the secret?

Saturday, April 20, 2013

PACL and MAC address learning

I have the unenviable task of dragging a legacy application into the 21st century. I am talking about 1980s legacy, and some of its functions do not even use IP protocols. One of the proposed solutions included 2 servers with the same IP and MAC addresses (I know, but splitting the network into 2 separate VLANs was not an option) connected to different switches, but in the same VLAN. ClientA and ClientB should be able to talk to each other and to the server connected to the same switch as the client. ServerA and ServerB should not even know about each other's existence, so they won't complain about a duplicate IP address. The switches are Cisco 6500s. One of the obvious solutions is to put a Port Access Control List on either side of the inter-switch link. The PACL successfully blocked the traffic between the servers, but the switches still learned the MAC address of the blocked traffic's source and placed it in the MAC address table. This would cause MAC address flapping on the switch every time clients send an ARP query for the server's MAC or when both servers need to send traffic to their clients. Why would a switch need to keep the MAC address of discarded traffic? Oh, well. Another network mystery.
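
The flapping described above can be reproduced with a toy model in Python. It is a sketch under one assumption drawn from the lab result: the source MAC is learned on ingress before the port ACL drops the frame. It does not describe the real 6500 forwarding pipeline, and the class name, port names and MAC address are all made up.

```python
# Toy model of the observed behaviour: learn first, filter second.
class ToySwitch:
    def __init__(self, blocked_ports):
        self.blocked_ports = set(blocked_ports)  # ports with a deny PACL
        self.mac_table = {}                      # MAC -> port
        self.moves = 0                           # count of MAC moves (flaps)

    def receive(self, port, src_mac):
        """Return True if the frame is forwarded, False if the PACL drops it."""
        old_port = self.mac_table.get(src_mac)
        if old_port is not None and old_port != port:
            self.moves += 1
        self.mac_table[src_mac] = port           # learning happens first
        return port not in self.blocked_ports    # filtering happens second


SERVER_MAC = "abcd.ef12.3456"  # made-up MAC shared by both servers
sw = ToySwitch(blocked_ports={"uplink"})

for _ in range(3):
    sw.receive("server-port", SERVER_MAC)  # local server talks to its client
    sw.receive("uplink", SERVER_MAC)       # remote server's frame: dropped, yet learned

print("MAC moves seen:", sw.moves)
```

Every frame from the remote server is dropped, yet each one still moves the MAC entry to the inter-switch link, and the next frame from the local server moves it back.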

Friday, April 12, 2013

"private-vlan synchronize" pitfall

Let's say you need to configure a Private VLAN on a Cisco Nexus switch with MSTP. You create a new VLAN, make it isolated and map it to the primary VLAN. Your newly created VLAN is automatically mapped to MST0, and if your primary happens to be in any other MST instance, there is a possibility that your primary and secondary VLANs end up with different L2 paths. As a bonus, you get the annoying message "These secondary vlans are not mapped to the same instance as their primary" every time you run "show spanning-tree configuration". Not a big deal, but things like these irritate me. Not to worry, "private-vlan synchronize" under "spanning-tree mst configuration" will automatically map all secondary VLANs to the same MST instance as the primary VLAN. The moment it's done, your MST digest changes and boom, you have a brand new MST region and STP convergence on top of it. So, either pick an existing VLAN already mapped to the same instance and convert it to a secondary community or isolated VLAN, or have all your VLANs mapped to MST0 and use other methods to load-share traffic between transit links.
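
To see why moving a single VLAN is enough, here is a rough Python sketch of how the MST configuration digest is derived: an HMAC-MD5 over a 4096-entry table holding the instance number of every VLAN. The key is the fixed signature key from IEEE 802.1Q as far as I know, so verify it against the standard before relying on it. The VLAN and instance numbers are hypothetical, and the region name and revision (which also have to match) are left out.

```python
import hashlib
import hmac

# Signature key for the MST Configuration Digest, as given in IEEE 802.1Q
# to the best of my knowledge; check the standard before relying on it.
MST_KEY = bytes.fromhex("13AC06A62E47FD51F95D2BA243CD0346")


def mst_digest(vlan_to_instance):
    """HMAC-MD5 over a 4096-entry table of 2-byte MST instance IDs."""
    table = bytearray(4096 * 2)  # unlisted VLANs default to instance 0
    for vlan, instance in vlan_to_instance.items():
        if not 1 <= vlan <= 4094:
            raise ValueError(f"invalid VLAN {vlan}")
        table[vlan * 2:vlan * 2 + 2] = instance.to_bytes(2, "big")
    return hmac.new(MST_KEY, bytes(table), hashlib.md5).hexdigest()


# Hypothetical numbering: primary VLAN 100 in MST1, new isolated VLAN 101.
before = mst_digest({100: 1})          # VLAN 101 still sits in MST0
after = mst_digest({100: 1, 101: 1})   # after the secondary is moved to MST1

print("before:", before)
print("after: ", after)
print("same region?", before == after)
```

Switches compare this digest in their BPDUs, so the switch where the mapping changed stops matching its neighbours until every switch in the region carries the same mapping.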
I know what you are thinking. No, I did it in the lab, so you won't have to.

Tuesday, March 05, 2013

How complex systems fail

A must-read paper for every system architect. There is also a recording of the talk given at the Velocity conference by Richard Cook. The main take-away for me is: don't build a reliable system, build a resilient system.

Monday, February 11, 2013

Port-security side effect

I discovered an interesting side effect of configuring port-security which can prevent its deployment in certain circumstances. Let's have a look. Here is the relevant part of the interface configuration:

 switchport port-security maximum 20
 switchport port-security
 switchport port-security aging time 1440
 switchport port-security violation restrict
 switchport port-security aging type inactivity

You can find explanations of what each command does here.

switch# show port-security interface gi 0/45
Port Security                      : Enabled
Port Status                          : Secure-up
Violation Mode                  : Restrict
Aging Time                        : 1440 mins
Aging Type                        : Inactivity
SecureStatic Address Aging : Disabled
Maximum MAC Addresses      : 20
Total MAC Addresses              : 1
Configured MAC Addresses    : 0
Sticky MAC Addresses            : 0
Last Source Address:Vlan        : abcd.ef12.3456:1234
Security Violation Count          : 0

In case you were wondering, the MAC address above is completely made up, but it is associated with the "floating" IP assigned to the active device in a cluster. Let's see what happens when the active IP has to be moved to the other device in the cluster due to fail-over:

%PORT_SECURITY-2-PSECURE_VIOLATION: Security violation occurred, caused by MAC address abcd.ef12.3456 on port GigabitEthernet0/46.
Oops. Even though the port-security configuration allows the port to learn up to 20 MAC addresses and there are no MAC addresses on Gi0/46, we got a port-security violation. Why? Let's see debug port-security:


 PSECURE: psecure_add_addr_check: Found duplicate mac-address abcd.ef12.3456, It is already secured on Gi0/45
%PORT_SECURITY-2-PSECURE_VIOLATION: Security violation occurred, caused by MAC address abcd.ef12.3456 on port GigabitEthernet0/46.
PSECURE: Security violation, TrapCount:346
PSECURE: Read:2830, Write:2831
PSECURE: swidb = GigabitEthernet0/46 mac_addr = abcd.ef12.3456 vlanid = 1234
PSECURE: Adding abcd.ef12.3456 as dynamic on port Gi0/46 for vlan 1234
PSECURE: Violation/duplicate detected upon receiving abcd.ef12.3456 on vlan 1234: port_num_addrs 1 port_max_addrs 20 vlan_addr_ct 1: vlan_addr_max 20 total_addrs 4: max_total_addrs 6144

The port-security violation happened because the MAC address has not been deleted from the original port yet, hence the "duplicate mac-address" message. To mitigate, but not completely eliminate, the problem, we can reduce the aging timer to its 1-minute minimum. It still means that in case of fail-over, the floating IP address will not be accessible for another minute, which could be 1 minute too long.
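
The effect of the aging timer can be illustrated with a toy model in Python. It is only a sketch of the duplicate check seen in the debug output, with time counted in whole minutes. The class and method names are illustrative and are not Cisco internals.

```python
# Toy model: a MAC that is still secured on one port is rejected on any
# other port until its entry ages out through inactivity.
class ToyPortSecurity:
    def __init__(self, aging_minutes, max_per_port=20):
        self.aging_minutes = aging_minutes
        self.max_per_port = max_per_port
        self.secured = {}  # MAC -> (port, minute last seen)

    def _expire(self, now):
        """Drop entries that have been inactive for the whole aging time."""
        self.secured = {
            mac: (port, seen)
            for mac, (port, seen) in self.secured.items()
            if now - seen < self.aging_minutes
        }

    def frame(self, port, mac, now):
        """Return 'ok' or 'violation' for a frame seen at minute `now`."""
        self._expire(now)
        owner = self.secured.get(mac)
        if owner is not None and owner[0] != port:
            return "violation"  # duplicate: already secured elsewhere
        on_port = sum(1 for p, _ in self.secured.values() if p == port)
        if owner is None and on_port >= self.max_per_port:
            return "violation"  # port is full
        self.secured[mac] = (port, now)
        return "ok"


MAC = "abcd.ef12.3456"
for aging in (1440, 1):
    ps = ToyPortSecurity(aging_minutes=aging)
    ps.frame("Gi0/45", MAC, now=0)               # active node before fail-over
    at_failover = ps.frame("Gi0/46", MAC, now=0)
    a_minute_later = ps.frame("Gi0/46", MAC, now=1)
    print(f"aging {aging} min: fail-over -> {at_failover}, +1 min -> {a_minute_later}")
```

With the 1440-minute timer the new port is still rejected a minute after fail-over; with the 1-minute timer the entry on Gi0/45 has expired by then and Gi0/46 can secure the address.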